I've got a table with around 2.7 million records, 63 fields, and several indexes. I need to be able to get this table from mysql to postgresql quickly (say within an hour). I know mysql can export the data quickly, but postgresql seems to be very slow when importing the data. Any ideas? I've been importing a CSV produced from mysqldump for about 3 hours now.
3 Answers
The fastest way I can think of would be to drop all the indexes and triggers on the table in PostgreSQL (or at least ALTER TABLE foo DISABLE TRIGGER ALL;), add the \. end-of-input line to the end of your CSV file, and then put the file on the server (preferably on a drive other than the one being used for the DB, or somewhere on the network the server can reach via NFS) and run COPY foo FROM '/some/file.csv' WITH CSV; (that's 8.x syntax, which is still supported in 9.0; depending on your CSV file format you may need to adjust the options). Afterwards, recreate the indexes and re-enable the triggers.
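As a rough sketch of that sequence (the table foo, index foo_bar_idx, and column bar below are placeholders; adjust the names and the CSV options to match your own schema and export):

DROP INDEX foo_bar_idx;                   -- repeat for each index on the table
ALTER TABLE foo DISABLE TRIGGER ALL;
COPY foo FROM '/some/file.csv' WITH CSV;  -- server-side COPY FROM a file must be run as a superuser,
                                          -- and the file must be readable by the server process
ALTER TABLE foo ENABLE TRIGGER ALL;
CREATE INDEX foo_bar_idx ON foo (bar);    -- rebuild whatever indexes you dropped
ANALYZE foo;                              -- refresh planner statistics after the bulk load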
Some of PostgreSQL's configuration options can also affect load times. Disabling autovacuum and increasing the interval between checkpoints help with huge loads, but you'll need to reload the PostgreSQL configuration (or restart the server) for the checkpoint_segments and autovacuum changes to take effect, and you'll want to turn autovacuum back on when you're done.
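For example, a temporary postgresql.conf tweak for the load might look like this (the values are only illustrative starting points, not tuned recommendations):

autovacuum = off            # re-enable once the load is finished
checkpoint_segments = 64    # default is 3; a larger value means fewer checkpoints during the bulk load

Reload (or restart) the server for these to take effect, and revert them after the import.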
-  +1 for disabling triggers/indexing and using COPY (it's much faster to create an index once the table is populated than to update it as you insert). Also definitely crank up checkpoint_segments if your data set is large (many bytes, not just many rows) to avoid rolling WAL segments constantly (the extra disk I/O rolling WAL segments will really slow your load down). – voretaq7, Mar 31, 2011 at 5:55
mysqldump --compatible=postgresql -h... -u... -p... dbname tablename > PostgresqlData.sql
This option exists:
--compatible=name   Change the dump to be compatible with a given mode. By default, tables are dumped in a format optimized for MySQL. Legal modes are: ansi, mysql323, mysql40, postgresql, oracle, mssql, db2, maxdb, no_key_options, no_table_options, no_field_options. One can use several modes separated by commas. Note: requires MySQL server version 4.1.0 or higher; this option is ignored with earlier server versions.
Give it a try!
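If the dump loads cleanly, feeding it into PostgreSQL would just be a matter of the following (targetdb is a placeholder; in practice the --compatible output may still need some hand editing, e.g. of the CREATE TABLE data types, before psql accepts it):

psql -d targetdb -f PostgresqlData.sql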
-  I'm not sure if the --compatible=postgresql feature generates COPY or INSERT statements (the former being substantially faster than the latter for huge data sets) -- do you happen to know offhand? – voretaq7, Mar 31, 2011 at 5:53
-  @Rolando, how can this command be used to convert a whole database instead of a single table? – alfish, Jan 1, 2012 at 18:01
There are any number of professional ETL tools that can migrate between databases at fairly high speeds - for example, I have used Pentaho's open-source PDI (née Kettle) to hit rates in the range of 100K rows/second. These tools also bring several other advantages, such as letting you massage the data into new structures, a graphical interface for the mapping process, and so forth.

