Migrate from PostgreSQL

On this page

This page has instructions for migrating data from PostgreSQL to CockroachDB using IMPORT's support for reading pg_dump files.

The examples below pull real data from Amazon S3. They use the employees data set that is also used in the MySQL docs. The data was imported to PostgreSQL using pgloader, and then modified for use here as explained below.

Tip:

For best practices for optimizing import performance in CockroachDB, see Import Performance Best Practices.

Step 1. Dump the PostgreSQL database

There are several ways to dump data from PostgreSQL to be imported into CockroachDB:

Dump the entire database
Dump one table at a time

The import will fail if the dump file contains functions or type definitions. In addition to calling pg_dump as shown below, you may need to edit the dump file to remove functions and data types.

Also, note that CockroachDB's IMPORT does not support automatically importing data from PostgreSQL' non-public schemas. As a workaround, you can edit the dump file to change the table and schema names in the CREATE TABLE statements.

Dump the entire database

Most users will want to import their entire PostgreSQL database all at once, as shown below in Import a full database dump.

To dump the entire database, run the pg_dump command shown below.

$ pg_dump employees > /tmp/employees-full.sql

For this data set, the PostgreSQL dump file required the following edits, which have already been performed on the files used in the examples below:

The type of the employees.gender column in the CREATE TABLE statement had to be changed from employees.employees_gender to STRING since PostgreSQL represented the employee's gender using a CREATE TYPE statement that is not supported by CockroachDB.
A CREATE TYPE employee ... statement needed to be removed.

If you only want to import one table from a database dump, see Import a table from a full database dump below.

Dump one table at a time

To dump the employees table from a PostgreSQL database also named employees, run the pg_dump command shown below. You can import this table using the instructions in Import a table from a table dump below.

$ pg_dump -t employees  employees > /tmp/employees.sql

For this data set, the PostgreSQL dump file required the following edits, which have already been performed on the files used in the examples below.

The type of the employees.gender column in the CREATE TABLE statement had to be changed from employees.employees_gender to STRING since PostgreSQL represented the employee's gender using a CREATE TYPE statement that is not supported by CockroachDB.

Step 2. Host the files where the cluster can access them

Each node in the CockroachDB cluster needs to have access to the files being imported. There are several ways for the cluster to access the data; for more information on the types of storage IMPORT can pull from, see the following:

Tip:

We strongly recommend using cloud storage such as Amazon S3 or Google Cloud to host the data files you want to import.

Step 3. Import the PostgreSQL dump file

You can choose from several variants of the IMPORT statement, depending on whether you want to import a full database or a single table:

Import a full database dump
Import a table from a full database dump
Import a table from a table dump

Note that all of the IMPORT statements in this section pull real data from Amazon S3 and will kick off background import jobs that you can monitor with SHOW JOBS.

Note:

As of v21.2 IMPORT TABLE will be deprecated. We recommend using CREATE TABLE followed by IMPORT INTO to import data into a new table. For an example, read Import into a new table from a CSV file.

To import data into an existing table, use IMPORT INTO.

Import a full database dump

This example assumes you dumped the entire database.

The IMPORT statement below reads the data and DDL statements (including existing foreign key relationships) from the full database dump file.

> IMPORT PGDUMP 'https://s3-us-west-1.amazonaws.com/cockroachdb-movr/datasets/employees-db/pg_dump/employees-full.sql.gz' WITH ignore_unsupported_statements;

       job_id       |  status   | fraction_completed |  rows  | index_entries | system_records |  bytes
--------------------+-----------+--------------------+--------+---------------+----------------+----------
 381845110403104769 | succeeded |                  1 | 300024 |             0 |              0 | 11534293
(1 row)

Import a table from a full database dump

This example assumes you dumped the entire database.

IMPORT can import one table's data from a full database dump. It reads the data and applies any CREATE TABLE statements from the file.

> CREATE DATABASE IF NOT EXISTS employees;
> USE employees;
> IMPORT TABLE employees FROM PGDUMP 'https://s3-us-west-1.amazonaws.com/cockroachdb-movr/datasets/employees-db/pg_dump/employees-full.sql.gz' WITH ignore_unsupported_statements;

       job_id       |  status   | fraction_completed |  rows  | index_entries | system_records |  bytes
--------------------+-----------+--------------------+--------+---------------+----------------+----------
 383839294913871873 | succeeded |                  1 | 300024 |             0 |              0 | 11534293
(1 row)

Import a table from a table dump

The examples below assume you dumped one table.

The simplest way to import a table dump is to run IMPORT TABLE as shown below. It reads the table data and any CREATE TABLE statements from the file.

> CREATE DATABASE IF NOT EXISTS employees;
> USE employees;
> IMPORT PGDUMP 'https://s3-us-west-1.amazonaws.com/cockroachdb-movr/datasets/employees-db/pg_dump/employees.sql.gz' WITH ignore_unsupported_statements;

       job_id       |  status   | fraction_completed |  rows  | index_entries | system_records |  bytes   
--------------------+-----------+--------------------+--------+---------------+----------------+----------
 383855569817436161 | succeeded |                  1 | 300024 |             0 |              0 | 11534293
(1 row)

If you need to specify the table's columns for some reason, you can use an IMPORT TABLE statement like the one below, which will import data but ignore any CREATE TABLE statements in the file, instead relying on the columns you specify.

> IMPORT TABLE employees (
    emp_no INT PRIMARY KEY,
    birth_date DATE NOT NULL,
    first_name STRING NOT NULL,
    last_name STRING NOT NULL,
    gender STRING NOT NULL,
    hire_date DATE NOT NULL
  )
  PGDUMP DATA ('https://s3-us-west-1.amazonaws.com/cockroachdb-movr/datasets/employees-db/pg_dump/employees.sql.gz') WITH ignore_unsupported_statements;

Configuration Options

The following options are available to IMPORT ... PGDUMP:

Max row size
Row limit
Ignore unsupported statements
Log unsupported statements
Skip foreign keys

Max row size

The max_row_size option is used to override limits on line size. Default: 0.5MB. This setting may need to be tweaked if your PostgreSQL dump file has extremely long lines, for example as part of a COPY statement.

Example usage:

> IMPORT TABLE employees (
    emp_no INT PRIMARY KEY,
    birth_date DATE NOT NULL,
    first_name STRING NOT NULL,
    last_name STRING NOT NULL,
    gender STRING NOT NULL,
    hire_date DATE NOT NULL
  )
  PGDUMP DATA ('s3://your-external-storage/employees.sql?AWS_ACCESS_KEY_ID=123&AWS_SECRET_ACCESS_KEY=456') WITH max_row_size = '5MB';

Row limit

The row_limit option determines the number of rows to import. If you are importing one table, setting row_limit = 'n' will import the first n rows of the table. If you are importing an entire database, this option will import the first n rows from each table in the dump file. It is useful for finding errors quickly before executing a more time- and resource-consuming import.

Example usage:

> IMPORT PGDUMP 's3://your-external-storage/employees.sql?AWS_ACCESS_KEY_ID=123&AWS_SECRET_ACCESS_KEY=456' WITH row_limit = '10';

Ignore unsupported statements

The ignore_unsupported_statements option specifies whether the import will ignore unsupported statements in the PGDUMP file. Default: false.

If ignore_unsupported_statements is omitted, the import will fail if it encounters a statement that is unsupported by CockroachDB. Use ignore_unsupported_statements with log_ignored_statements to log unsupported statements.

Example usage:

> IMPORT PGDUMP 's3://your-external-storage/employees.sql?AWS_ACCESS_KEY_ID=123&AWS_SECRET_ACCESS_KEY=456' WITH ignore_unsupported_statements;

Log unsupported statements

The log_ignored_statements option is used with the ignore_unsupported_statements option to log unsupported statements in the PGDUMP file to specified a destination file.

Example usage:

> IMPORT PGDUMP 's3://your-external-storage/employees.sql?AWS_ACCESS_KEY_ID=123&AWS_SECRET_ACCESS_KEY=456' WITH ignore_unsupported_statements, log_ignored_statements='userfile://defaultdb.public.userfiles_root/unsupported-statements.log';

Skip foreign keys

By default, IMPORT ... PGDUMP supports foreign keys. Default: false. Add the skip_foreign_keys option to speed up data import by ignoring foreign key constraints in the dump file's DDL. It will also enable you to import individual tables that would otherwise fail due to dependencies on other tables.

Note:

The most common dependency issues are caused by unsatisfied foreign key relationships. You can avoid these issues by adding the skip_foreign_keys option to your IMPORT statement as needed. For more information, see the list of import options.

For example, if you get the error message pq: there is no unique constraint matching given keys for referenced table tablename, use IMPORT ... WITH skip_foreign_keys.

Example usage:

> IMPORT TABLE employees (
    emp_no INTEGER PRIMARY KEY,
    birth_date DATE NOT NULL,
    first_name STRING NOT NULL,
    last_name STRING NOT NULL,
    gender STRING NOT NULL,
    hire_date DATE NOT NULL
  ) PGDUMP DATA ('s3://your-external-storage/employees.sql?AWS_ACCESS_KEY_ID=123&AWS_SECRET_ACCESS_KEY=456') WITH skip_foreign_keys;

Foreign key constraints can be added by using ALTER TABLE ... ADD CONSTRAINT commands after importing the data.

Pricing

Contact us

Sign In

Migrate from PostgreSQL

Step 1. Dump the PostgreSQL database

Dump the entire database

Dump one table at a time

Step 2. Host the files where the cluster can access them

Step 3. Import the PostgreSQL dump file

Import a full database dump

Import a table from a full database dump

Import a table from a table dump

Configuration Options

Max row size

Row limit

Ignore unsupported statements

Log unsupported statements

Skip foreign keys

See also

Tell us about your experience

Thank you for your feedback!

Explore More Documentation:

Migrate from PostgreSQL

Step 1. Dump the PostgreSQL database

Dump the entire database

Dump one table at a time

Step 2. Host the files where the cluster can access them

Step 3. Import the PostgreSQL dump file

Import a full database dump

Import a table from a full database dump

Import a table from a table dump

Configuration Options

Max row size

Row limit

Ignore unsupported statements

Log unsupported statements

Skip foreign keys

See also

Tell us about your experience

Select the problem area

Thank you for your feedback!

Explore More Documentation: