A migration involves transferring data from a pre-existing source database onto a target CockroachDB cluster. Migrating data is a complex, multi-step process, and a data migration can take many different forms depending on your specific business and technical constraints.
Cockroach Labs provides a MOLT (Migrate Off Legacy Technology) toolkit to aid in migrations.
This page provides an overview of the following:
- The generic migration sequence
- MOLT tools
- Variables to consider when choosing a migration approach
- Common migration approaches
Migration sequence
A migration to CockroachDB generally follows a variant of this sequence:
- Assess and discover: Inventory the source database, flag unsupported features, make a migration plan.
- Prepare the environment: Configure networking, users and permissions, bucket locations, and replication settings.
- Convert the source schema: Generate CockroachDB-compatible DDL. Apply the converted schema to the target database. Drop constraints and indexes to facilitate data load.
- Load data into CockroachDB: Bulk load the source data into the CockroachDB cluster.
- Finalize target schema: Recreate indexes or constraints on CockroachDB that you previously dropped to facilitate data load.
- Replicate ongoing changes (optional): Keep CockroachDB in sync with the source. This may be necessary for migrations that minimize downtime.
- Stop application traffic: Limit user read/write traffic to the source database. This begins application downtime.
- Verify data consistency: Confirm that the CockroachDB data is consistent with the source.
- Enable failback (optional): Replicate data from the target back to the source, enabling a reversion to the source database in the event of migration failure.
- Cut over application traffic: Resume normal application use, with the CockroachDB cluster as the target database. This ends application downtime.
This sequence can vary depending on the needs on how your organization considers the migration variables. The common migration approaches describe some standard use cases, but even these may need to be modified to suit the needs of your migration.
The following diagram shows how the MOLT (Migrate Off Legacy Technology) toolkit is used at various stages of the migration sequence.
MOLT tools
MOLT (Migrate Off Legacy Technology) is a set of tools for schema conversion, data load, replication, and validation. Migrations with MOLT are resilient, restartable, and scalable to large data sets.
MOLT Fetch, Replicator, and Verify are CLI-based to maximize control, automation, and visibility during the data load and replication stages.
| Tool | Usage | Tested and supported sources | Release status |
|---|---|---|---|
| Schema Conversion Tool | Schema conversion | PostgreSQL, MySQL, Oracle, SQL Server | GA (Cloud only) |
| Fetch | Initial data load | PostgreSQL 11-16, MySQL 5.7-8.4, Oracle Database 19c (Enterprise Edition) and 21c (Express Edition) | GA |
| Replicator | Continuous replication | PostgreSQL 11-16, MySQL 5.7-8.4, Oracle Database 19c+, CockroachDB | GA |
| Verify | Schema and data validation | PostgreSQL 12-16, MySQL 5.7-8.4, Oracle Database 19c (Enterprise Edition) and 21c (Express Edition), CockroachDB | Preview |
For CockroachDB-to-CockroachDB migrations, contact your account team for guidance.
Schema Conversion Tool
The MOLT Schema Conversion Tool converts a source database schema to a CockroachDB-compatible schema. The tool performs the following actions:
- Identifies unimplemented features.
- Rewrites unsupported DDL syntax.
- Applies CockroachDB schema best practices.
Fetch
MOLT Fetch performs the initial data load to CockroachDB. It supports:
- Multiple migration flows via
IMPORT INTOorCOPY FROM. - Data movement via cloud storage, local file servers, or direct copy.
- Concurrent data export from multiple source tables and shards.
- Schema transformation rules.
- After exporting data with
IMPORT INTO, safe continuation to retry failed or interrupted tasks from specific checkpoints.
Replicator
MOLT Replicator provides continuous replication capabilities for minimal-downtime migrations. It supports:
- Continuous replication from source databases to CockroachDB.
- Multiple consistency modes for balancing throughput and transactional guarantees.
- Failback replication from CockroachDB back to source databases.
- Performance tuning for high-throughput workloads.
- Userscripts for defining data transformations.
Verify
MOLT Verify checks for data and schema discrepancies between the source database and CockroachDB. It performs the following verifications:
- Table structure.
- Column definition.
- Row-level data.
Migration variables
You must decide how you want your migration to handle each of the following variables. These decisions will depend on your specific business and technical considerations. The MOLT toolkit supports any set of decisions made for the supported source databases.
| Variable | Description |
|---|---|
Migration granularity Migration granularity |
Do you want to migrate all of your data at once, or do you want to split your data up into phases and migrate one phase at a time? |
Continuous replication Continuous replication |
After the initial data load (or after the initial load of each phase), do you want to stream further changes to that data from the source to the target? |
Data transformation strategy Data transformation strategy |
If there are discrepancies between the source and target schema, how will you define those data transformations, and when will those transformations occur? |
Validation strategy Validation strategy |
How and when will you verify that the data in CockroachDB matches the source database? |
Rollback plan Rollback plan |
What approach will you use to roll back the migration if issues arise during or after cutover? |
Learn more about the different migration variables, how you should consider the different options for each variable, and how to use the MOLT toolkit for each variable.
Common migration approaches
MOLT supports various migration flows using MOLT Fetch for data loading and MOLT Replicator for ongoing replication. These common approaches are variants of the general migration sequence.
| Migration approach | Description | Best for |
|---|---|---|
| Classic Bulk Load Migration | Perform a one-time bulk load of source data into CockroachDB. | Simple migrations with planned downtime. |
| Phased Bulk Load Migration | Divide your data into separate phases and bulk load each phase. | Larger migrations with planned downtime per phase. |
| Delta Migration | Perform an initial data load, then replicate ongoing changes continuously. | Minimal-downtime migrations. |
| Phased Delta Migration with Failback Replication | Divide your data into separate phases. For each phase, perform an initial data load, then replicate ongoing changes continuously. Enable failback replication. | Risk-averse migrations with minimal downtime per phase. |
Each of these approaches has a detailed walkthrough guide for performing these migrations using the MOLT toolkit. While these approaches are among the most common, you may need to modify these instructions to suit the specific needs of your migration.