More utf-8 woes

The second installment of the utf-8 saga I started last week.

Of course, per all the advice that’s out there, I needed to run some nasty-ass updates directly in MySQL. Buh. Even in my utf-8 Terminal.app I could not get some of the mangled characters to display properly. So it came down to hex() and unhex()’ing in MySQL. Joy!

Even if it’s only for myself, here is the code that was used:

update articles set body = replace(body, unhex('C3A23F3F'), "'") where body regexp unhex('C3A2');

and let’s not forget this beauty:

update articles set body = replace(body, unhex('C3A23FC29D'), "'") where body regexp unhex('C3A2');

both of which turned various combinations of ???, รข?? and Japanese characters back into a simple apostrophe.
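
For what it’s worth, here is a guess at where those byte sequences come from. This is a minimal Python sketch, not something that was part of the actual cleanup. It assumes the text started out as correct utf-8, got misread as MySQL’s latin1 (which is really cp1252 with the five undefined bytes passed straight through), and then lost every character outside ISO-8859-1 to a literal ?. The function names are made up for illustration.

```python
def misread_as_mysql_latin1(raw: bytes) -> str:
    """Decode bytes the way MySQL's latin1 is assumed to: cp1252, with the
    bytes cp1252 leaves undefined (e.g. 0x9D) mapped to the same code point."""
    chars = []
    for b in raw:
        try:
            chars.append(bytes([b]).decode("cp1252"))
        except UnicodeDecodeError:
            chars.append(chr(b))
    return "".join(chars)


def mangle(text: str) -> bytes:
    """Hypothetical pipeline: utf-8 bytes misread as latin1, squeezed through
    ISO-8859-1 (unrepresentable characters become '?'), stored as utf-8."""
    misread = misread_as_mysql_latin1(text.encode("utf-8"))
    lossy = misread.encode("latin-1", errors="replace").decode("latin-1")
    return lossy.encode("utf-8")


if __name__ == "__main__":
    for ch in ("\u2019", "\u201c", "\u201d"):
        print(f"U+{ord(ch):04X} -> {mangle(ch).hex().upper()}")
    # U+2019 -> C3A23F3F
    # U+201C -> C3A23F3F
    # U+201D -> C3A23FC29D
```

If that guess is right, C3A23FC29D started life as a closing double quote rather than an apostrophe, and C3A23F3F is ambiguous, since the curly apostrophe and the opening double quote both collapse to it. So some of the replaced apostrophes may originally have been double quotes.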

And no, the character_set_server still isn’t utf-8. Most data is fine now though.
