Bugzilla: Upgrading MySQL Database from Latin1 to UTF8

IMPORTANT Update 2009-07-09: This information is already several years old! You should not use this information for modern versions of Bugzilla (3.2 and above), which will allow you to convert to UTF-8 using checksetup.pl.

In the present case, we have a Bugzilla database created with a charset of latin1. Unfortunately, now after updating MySQL to 4.1, error occured when trying to assign a new developer to a bug, indicating mismatch of collations (UTF-8 vs. Latin1). This is caused by the fact that there is now latin1-data stored in an UTF-8 table.

The following procedure can be used to upgrade the database to UTF-8, eliminating the problem:

  1. mysqldump -p --default_character-set=latin1 --skip-set-charset bugs > dump.sql
  2. mysql -p --execute="DROP DATABASE bugs; CREATE DATABASE bugs CHARACTER SET utf8 COLLATE utf8_general_ci;"
  3. mysql -p --default-character-set=utf8 bugs < dump.sql
  4. perl -pe 's/latin1_bin/utf8_general_ci/g; s/latin1/utf8/g' dump.sql > dump-utf8.sql
  5. mysql -p --default-character-set=utf8 bugs < dump-utf8.sql

You should of course always check if the Pearl-RegEx only replaced charset declarations and not some matches within the data.

Thanks to TextSnippets for the script.