Exporting from MySQL to H2

When it comes to testing, you can't beat the convenience of an embedded database. It can be created, populated and cleaned up without any dependencies on outside infrastructure, making it ideal for self contained test suites.

However, there is a very good chance that you don't use an embedded database in your production environment. In my case, I have a MySQL database in production, and want to use H2 in testing.

Getting data from MySQL into H2 is not as straight forward as you might think. You might have some luck with CopyDB, Scriptella, or OpenDBCopy. But in the end I simply used mysqldump and a couple of hacks to get data from MySQL to H2.

Dump the data

Dump the data from MySQL with the following command:

mysqldump --compatible=ansi,no_table_options,no_field_options,no_key_options --hex-blob --skip-opt -uroot -p Database > importansi.sql

This will strip out most of the MySQL specific syntax and give you something that you can clean up for H2 to import.

Fix up single quotes

MySQL will escape a single quote with a backslash, like \'. H2 expects to have single quotes escaped with another single quote, like ''. There has been a request to have this kind of escaping included as an option in MySQL, but it has not been implemented. So you'll have to do a find and replace for yourself.

Fix up hex numbers

H2 doesn't import hex numbers properly. To fix that up, you'll need to find any hex numbers and replace them with strings.

If you have a text editor that can find and replace with regular expressions, this can be done by finding 0x([A-F0-9]*) and replacing it with '$1'.

Fix up bits

The default value for a bit in MySQL will be represented as b'0'. Replace this with a plain 0 for H2.

Don't include ranges in keys

MySQL has the ability to define a range on a key, like KEY "ATextFieldKey" ("ATextField"(255)). This doesn't work in H2, so replace it with KEY "ATextFieldKey" ("ATextField").

Remove character sets

Remove any character set information on a field like CHARACTER SET latin1.

Remove COLLATE settings

Remove any collate information on a field like COLLATE utf8_unicode_ci.

Remove indexes on BLOBS, CLOBS and TEXT fields

H2 does not support indexes on these fields. Queries on these fields will be slower, but you should be able to take that into account in your tests.

Make all index names unique

MySQL only requires that an index name be unique within a table. H2 will require that it be unique within the database. So change any duplicated index names before importing into H2.

Use the MySQL compatibility mode

Even though we have dumped the MySQL database using the ANSI compatibility mode, we still need to enable the MySQL compatibility mode in H2. This is done through a connection URL that looks like jdbc:h2:~/test;MODE=MySQL.

Reorder the table creation sequence 

MySQL will dump tables in alphabetical order. When it imports a database dumps, some commands in the comments instruct MySQL to ignore relationships, allowing MySQL to create tables with foreign keys to tables that have not yet been created.

H2 does not have the same ability to ignore relationships (I tried SET REFERENTIAL_INTEGRITY FALSE, but it didn't work), and so you'll have to do some copying and pasting to make sure that any table that is pointed to in a relationship exists before the relationship is created.

If you have circular dependencies, you'll have to remove the foreign key creation from the CREATE TABLE command, and create the foreign keys only after both tables have been created.

For those that want to automate this process, fork the repo hold this script. It has saved us quite a few hours worth of work.
Post a Comment

Popular posts from this blog

Fixing OpenVPN "Authenticate/Decrypt packet error: cipher final failed"

MinHash for dummies