In the apache error log, the first error to confront me was this:
WordPress database error Illegal mix of collations (latin1_swedish_ci,IMPLICIT) and (utf8_general_ci,COERCIBLE) for operation '=' for query SELECT comment_ID FROM wp_comments WHERE comment_post_ID = '2933' AND comment_approved != 'trash' AND .... etc
Collation conflict …. I looked at the database structure, sure enough, there’s a mix of
utf8_general_ci and latin1_swedish_ci
. Most of the default WordPress database tables are Latin1 and it looks like the newer tables and all of my plugins are UTF 8.
Looking at WordPress installs that I haven’t upgraded yet, all of them are set to UTF 8. But these are newer installs. The WordPress installation that has this problem has been in place since version 1.* and WordPress used to default to Latin 1 encoding. The UTF 8 tables are newer.
So, in other words, if you’re maintaining a WordPress install that’s been around for a while, you probably also have this issue in your database.
Now, to get this particular database fixed up …
One suggestion I came across was to export the schema and data separately then recreate the database with UTF 8 charset and reimport the data. However, that will likely break the existing content because the content has Latin 1 characters which will cause the newly minted UTF tables to choke a little.
… When converting the character sets, all TEXT (and similar) fields are converted to UTF-8, but that conversion will BREAK existing TEXT because the conversion expects the data to be in latin1, but WordPress may have stored unicode characters in a latin1 database, and as a result, data could end up as garbage after a conversion!
ref
The better way is to run an ALTER tables query. The steps are:
- backup everything up
- ALTER all TEXT and related fields to their binary counterparts using the SQL statements generated below
- alter the character set
- change the binary data type fields back to TEXT
- Add DB_CHARSET and DB_COLLATE definitions to wp-config.php
Using MySQL’s information_schema to generate the actual ALTER tables statements needed, I wanted to pipe out the statements to a reusable text document so I could do a test run on a copy of the problem database first. Generating the ALTER tables statements using information_schema will then include whatever extra tables or rows have been added by plugins.
The following 4 queries are lifted directly from Haidong Ji via the WordPress Codex. I added the pipe out to text file. Note that the path to mysql executable is on Ubuntu. Your path might be different, for example, another common location might be /usr/local/mysql/bin.
1. ALTER all TEXT and related string field types to binary field types counterparts using the SQL statements generated below. The list of conversions being made is as follows:
CHAR -> BINARY
VARCHAR -> VARBINARY
TINYTEXT -> TINYBLOB
TEXT -> BLOB
MEDIUMTEXT -> MEDIUMBLOB
LONGTEXT -> LONGBLOB
echo "SELECT CONCAT('ALTER TABLE ', table_name, ' MODIFY ', column_name, ' ', REPLACE(column_type, 'char', 'binary'),';') FROM columns WHERE table_schema = 'testblog' and data_type LIKE '%char%';" | /usr/bin/mysql -u<em>user</em> -p<em>password</em> information_schema > /home/<em>userdir</em>/char2binary.txtecho "SELECT CONCAT('ALTER TABLE ', table_name, ' MODIFY ', column_name, ' ', REPLACE(column_type, 'text', 'blob'),';') FROM columns WHERE table_schema = 'testblog' and data_type LIKE '%text%';" | /usr/bin/mysql -u<em>user</em> -p<em>password</em> information_schema > /home/<em>userdir</em>/text2blob.txtecho "SELECT CONCAT('ALTER TABLE ', table_name, ' MODIFY ', column_name, ' ', column_type, 'CHARACTER SET utf8;') FROM columns WHERE table_schema = 'testblog' and data_type LIKE '%char%';" | /usr/bin/mysql -u<em>user</em> -p<em>password</em> information_schema > /home/<em>userdir</em>/char2utf.txt[/code]</li> <li>1echo "SELECT CONCAT('ALTER TABLE ', table_name, ' MODIFY ', column_name, ' ', column_type, 'CHARACTER SET utf8;') FROM columns WHERE table_schema = 'testblog' and data_type LIKE '%text%';" | /usr/bin/mysql -u<em>user</em> -p<em>password</em> information_schema > /home/<em>userdir</em>/text2utf.txt
2. Change the default character set of the database from Latin 1 to UTF 8. This will ensure all new tables created are UTF 8, but doesn't affect existing tables (which is fine since we've already changed those above, right?).
This can be done either using phpmyadmin (select database -> click the Operations tab -> select UTF 8 from dropdown menu at bottom) or in your shell
ALTER DATABASE MyDb CHARACTER SET utf8;
3. change the binary data type fields back to TEXT
This can get sticky if your original structure had datatypes that ... were supposed to be binary or blobs. I looked through my database before altering it and did not find any blobs or binaries. But this could vary depending upon plugins used, etc. Changing the binaries back to TEXT essentially involves running the above sql scripts in reverse.
echo "SELECT CONCAT('ALTER TABLE ', table_name, ' MODIFY ', column_name, ' ', REPLACE(column_type, 'binary', 'char'), ';') FROM columns WHERE table_schema = 'testblog' and data_type LIKE '%binary%';" | /usr/bin/mysql -u<em>user</em> -p<em>password</em> information_schema > /home/<em>userdir</em>/rev-binary2char.txtecho "SELECT CONCAT('ALTER TABLE ', table_name, ' MODIFY ', column_name, ' ', REPLACE(column_type, 'blob', 'text'), ';') FROM columns WHERE table_schema = 'testblog' and data_type LIKE '%blob%';" | /usr/bin/mysql -u<em>user</em> -p<em>password</em> information_schema > /home/<em>userdir</em>/rev-blob2text.txt
4. Add DB_CHARSET and DB_COLLATE definitions to wp-config.php
references
wordpress article on changing collation