-
Notifications
You must be signed in to change notification settings - Fork 336
MySQL UTF 8 Encoding issues
utf8mb4 is a MySQL encoding that supports all of Unicode. It stores characters as 1-4 bytes. utf8, on the other hand, only does 1-3 bytes, and so can't encode anything on the Supplementary Multilingual Plane.
In InnoDB, indexes have a maximum length of 767 bytes. Because MySQL assumes each character will use the maximum number of bytes it can, that means an index on a utf8mb4 column can only be floor(767/4)=191 bytes long.
Every polymorphic column in the public_activity migration creates a _type column, which is 255 bytes long. In a InnoDB/utf8mb4 DB, when the migration tries to add these columns to indexes, it fails with
Mysql2::Error: Specified key was too long; max key length is 767 bytes.
To fix this, we need the _type columns or the indexes to shorter. What I did is add this before the the add_indexes:
change_column :activities, :trackable_type, :string, :limit => 100
change_column :activities, :owner_type, :string, :limit => 100
change_column :activities, :recipient_type, :string, :limit => 100
Source: https://github.com/pokonski/public_activity/issues/218
Don't just rely on the above for guidance. There's now a lot of reference material regarding converting to mb4
. If you use MySQL 5.7 or greater you don't have to muddle with the index lengths anymore as you can convert to the DYNAMIC
row format that allows you to store longer keys without trouble. Here's some gists for the migration process:
https://gist.github.com/tjh/1711329
and a handy guide:
https://railsmachine.com/articles/2017/05/19/converting-a-rails-database-to-utf8mb4.html
NOTE! Read everything thoroughly before you start out on your adventure! Do backups, and all that :)