MOLT Replicator

MOLT Replicator continuously replicates changes from a source database to CockroachDB as part of a database migration, enabling migrations with minimal downtime. It also supports backfill from CockroachDB to your source database for failback scenarios, preserving a rollback option during the migration window.

MOLT Replicator consumes change data from PostgreSQL logical replication streams, MySQL GTID-based replication, Oracle LogMiner, and CockroachDB changefeeds (for failback).

Terminology

  • Checkpoint: The position in the source database's transaction log from which replication begins or resumes: LSN (PostgreSQL), GTID (MySQL), or SCN (Oracle).
  • Staging database: A CockroachDB database used by Replicator to store replication metadata, checkpoints, and buffered mutations. Specified with --stagingSchema and automatically created with --stagingCreateSchema. For details, refer to Staging database.
  • Forward replication: Replicate changes from a source database (PostgreSQL, MySQL, or Oracle) to CockroachDB during a migration. For usage details, refer to Forward replication (after initial load).
  • Failback: Replicate changes from CockroachDB back to the source database. Used for migration rollback or to maintain data consistency on the source during migration. For usage details, refer to Failback replication.

Prerequisites

Supported databases

MOLT Replicator supports the following source and target databases:

  • PostgreSQL 11-16
  • MySQL 5.7-8.4
  • Oracle Database 19c (Enterprise Edition) and 21c (Express Edition)
  • CockroachDB (all currently supported versions)

Database configuration

The source database must be configured for replication:

PostgreSQL source:
  • Enable logical replication by setting wal_level = logical.
  • Example: Configure PostgreSQL for replication

MySQL source:
  • Enable global transaction identifiers (GTID) and configure binary logging. Set binlog-row-metadata or binlog-row-image to full.
  • Configure sufficient binlog retention for the migration duration.
  • Example: Configure MySQL for replication

Oracle source:
  • Example: Configure Oracle for replication

CockroachDB source (failback):
  • Enable rangefeeds (kv.rangefeed.enabled = true) (CockroachDB self-hosted clusters only).
  • Example: Configure CockroachDB for replication

User permissions

The SQL user running MOLT Replicator requires specific privileges on both the source and target databases:

PostgreSQL source:
  • SUPERUSER role (recommended), or the following granular permissions:
    • CREATE and SELECT on the database and tables to replicate.
    • Table ownership for adding tables to publications.
    • LOGIN and REPLICATION privileges to create replication slots and access replication data.
  • Example: Create PostgreSQL migration user

MySQL source:
  • SELECT on tables to replicate.
  • REPLICATION SLAVE and REPLICATION CLIENT privileges for binlog access.
  • For --fetchMetadata, either SELECT on the source database or PROCESS globally.
  • Example: Create MySQL migration user

Oracle source:
  • SELECT, INSERT, UPDATE on the REPLICATOR_SENTINEL table.
  • SELECT on V$ views (V$LOG, V$LOGFILE, V$LOGMNR_CONTENTS, V$ARCHIVED_LOG, V$LOG_HISTORY).
  • SELECT on SYS.V$LOGMNR_* views (SYS.V$LOGMNR_DICTIONARY, SYS.V$LOGMNR_LOGS, SYS.V$LOGMNR_PARAMETERS, SYS.V$LOGMNR_SESSION).
  • LOGMINING privilege.
  • EXECUTE on DBMS_LOGMNR.
  • For Oracle Multitenant, the user must be a common user (prefixed with C##) with privileges granted on both the CDB and PDB.
  • Examples: Create Oracle migration user, Create sentinel table, Grant LogMiner privileges

CockroachDB target (forward replication):
  • ALL on the target database.
  • CREATE on the schema.
  • SELECT, INSERT, UPDATE, DELETE on target tables.
  • CREATEDB privilege for creating the staging schema.
  • Example: Create CockroachDB user

PostgreSQL, MySQL, or Oracle target (failback):
  • SELECT, INSERT, UPDATE on tables to fail back to.
  • For Oracle, FLASHBACK is also required.
  • Examples: Grant PostgreSQL user permissions, Grant MySQL user permissions, Grant Oracle user permissions
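
As an illustrative sketch of the PostgreSQL source permissions listed above — the user name, password, database, and schema are placeholders; adjust them to your environment:

```sql
-- Create a migration user with the LOGIN and REPLICATION privileges.
CREATE USER migration_user WITH LOGIN REPLICATION PASSWORD 'changeme';
-- Allow the user to create objects in, and read from, the source database.
GRANT CREATE ON DATABASE migration_db TO migration_user;
GRANT SELECT ON ALL TABLES IN SCHEMA public TO migration_user;
```

For tables that must be added to a publication, also transfer table ownership to the migration user, per the ownership requirement above.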

How it works

MOLT Replicator supports forward replication from PostgreSQL, MySQL, and Oracle, and failback from CockroachDB:

  • PostgreSQL source (pglogical): MOLT Replicator uses PostgreSQL logical replication, which is based on publications and replication slots. You create a publication for the target tables, and a slot marks consistent replication points. MOLT Replicator consumes this logical feed directly and applies the data in sorted batches to the target.

  • MySQL source (mylogical): MOLT Replicator relies on MySQL GTID-based replication to read change data from MySQL binlogs. It works with MySQL versions that support GTID-based replication and applies transactionally consistent feeds to the target. Binlog features that do not use GTIDs are not supported.

  • Oracle source (oraclelogminer): MOLT Replicator uses Oracle LogMiner to capture change data from Oracle redo logs. Both Oracle Multitenant (CDB/PDB) and single-tenant Oracle architectures are supported. Replicator periodically queries LogMiner-populated views and processes transactional data in ascending SCN windows for reliable throughput while maintaining consistency.

  • Failback from CockroachDB (start): MOLT Replicator acts as an HTTPS webhook sink for a single CockroachDB changefeed. Replicator receives mutations from source cluster nodes, can optionally buffer them in a CockroachDB staging cluster, and then applies time-ordered transactional batches to the target database. Mutations are applied as UPSERT or DELETE statements while respecting foreign-key and table dependencies.

Replicator commands

MOLT Replicator provides four commands for different replication scenarios. For example commands, refer to Common uses.

Use pglogical to replicate from PostgreSQL to CockroachDB:

replicator pglogical

Use mylogical to replicate from MySQL to CockroachDB:

replicator mylogical

Use oraclelogminer to replicate from Oracle to CockroachDB:

replicator oraclelogminer

Use start to replicate from CockroachDB to PostgreSQL, MySQL, or Oracle (failback):

replicator start

Source connection strings

Tip:

Follow the security recommendations in Connection security and credentials.

--sourceConn specifies the connection string of the source database for forward replication.

Note:

The source connection string must point to the primary instance of the source database. Replicas cannot provide the necessary replication checkpoints and transaction metadata required for ongoing replication.

PostgreSQL connection string:

--sourceConn 'postgresql://{username}:{password}@{host}:{port}/{database}'

MySQL connection string:

--sourceConn 'mysql://{username}:{password}@{protocol}({host}:{port})/{database}'

Oracle connection string:

--sourceConn 'oracle://{username}:{password}@{host}:{port}/{service_name}'

For Oracle Multitenant databases, --sourcePDBConn specifies the pluggable database (PDB) connection. --sourceConn specifies the container database (CDB):

--sourceConn 'oracle://{username}:{password}@{host}:{port}/{cdb_service_name}'
--sourcePDBConn 'oracle://{username}:{password}@{host}:{port}/{pdb_service_name}'

For failback, --stagingConn specifies the CockroachDB connection string:

--stagingConn 'postgresql://{username}:{password}@{host}:{port}/{database}'

Target connection strings

--targetConn specifies the connection string of the target CockroachDB database for forward replication:

--targetConn 'postgresql://{username}:{password}@{host}:{port}/{database}'

Note:

For failback, --targetConn specifies the original source database (PostgreSQL, MySQL, or Oracle). For details, refer to Failback replication.

Replication checkpoints

MOLT Replicator requires a checkpoint value to start replication from the correct position in the source database's transaction log.

For PostgreSQL, use --slotName to specify the replication slot created during the data load. The slot automatically tracks the LSN (Log Sequence Number):

--slotName molt_slot

For MySQL, set --defaultGTIDSet to the cdc_cursor value from the MOLT Fetch output:

--defaultGTIDSet '4c658ae6-e8ad-11ef-8449-0242ac140006:1-29'
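
The cdc_cursor value from MOLT Fetch is the normal source for this flag. If you need to inspect the GTID state on the MySQL primary directly, gtid_executed is a standard MySQL system variable:

```sql
-- Show the GTID set that has been executed on the MySQL primary.
SELECT @@GLOBAL.gtid_executed;
```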

For Oracle, set --scn and --backfillFromSCN to the cdc_cursor values from the MOLT Fetch output:

--scn 26685786
--backfillFromSCN 26685444
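
As with the GTID case, these values normally come from the MOLT Fetch output. If you need to read the current SCN on the Oracle primary directly:

```sql
-- Read the current System Change Number on the Oracle primary.
SELECT CURRENT_SCN FROM V$DATABASE;
```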

Staging database

The staging database stores replication metadata, checkpoints, and buffered mutations. Specify the staging database with --stagingSchema in fully-qualified database.schema format and create it automatically with --stagingCreateSchema:

--stagingSchema defaultdb._replicator
--stagingCreateSchema

The staging database is used to:

  • Store checkpoints that enable resuming from the correct point after interruptions.
  • Buffer mutations before applying them to the target in transaction order.
  • Maintain consistency for time-ordered transactional batches while respecting table dependencies.
  • Provide restart capabilities after failures.
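
You can inspect the staging database directly with SQL. This sketch assumes the defaultdb._replicator name shown above; the memo checkpoint table is discussed in Resume after an interruption:

```sql
-- List the metadata tables Replicator created in the staging database.
SHOW TABLES FROM defaultdb._replicator;

-- Inspect stored checkpoint state.
SELECT * FROM defaultdb._replicator.memo;
```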

Consistency modes

Consistency modes control how MOLT Replicator balances throughput and transactional guarantees.

Failback mode (CockroachDB source)

When using the start command to replicate from CockroachDB to another database, you can configure one of the following consistency modes:

  1. Consistent (default): Preserves per-row order and source transaction atomicity. Mutations are buffered in memory and are flushed out to the target database when --flushSize is reached or once --flushPeriod has passed since the previous flush. Concurrent transactions are controlled by --parallelism.

  2. BestEffort: Relaxes atomicity across tables that have no foreign key constraints between them (coherence is maintained within FK-connected groups). Mutations are buffered in memory and are flushed out to the target database when --flushSize is reached or once --flushPeriod has passed since the previous flush. Enable this mode with --bestEffortOnly, or let Replicator enter it automatically by setting --bestEffortWindow to a positive duration (such as 1s).

    Note:

    For independent tables (with no foreign key constraints), BestEffort mode applies changes immediately as they arrive, without waiting for the resolved timestamp. This provides higher throughput for tables that have no relationships with other tables.

  3. Immediate: Applies updates as they arrive to Replicator, with no buffering or waiting for resolved timestamps. Enable with --immediate. This provides the highest throughput, but requires that the target schema have no foreign key constraints.
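
The modes above map to flags on the start command. The following sketch shows only the mode-related flags; the numeric values are illustrative placeholders, and `...` stands in for the required connection and staging flags described elsewhere on this page:

```shell
# Consistent (default): control batching and concurrency (placeholder values).
replicator start --flushSize 1000 --flushPeriod 1s --parallelism 16 ...

# BestEffort: enabled outright, or entered automatically under lag.
replicator start --bestEffortOnly ...
replicator start --bestEffortWindow 1s ...

# Immediate: apply updates as they arrive (no foreign keys on the target).
replicator start --immediate ...
```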

Forward replication (PostgreSQL, MySQL, Oracle sources)

When using pglogical, mylogical, or oraclelogminer commands to replicate from PostgreSQL, MySQL, or Oracle sources to CockroachDB, Replicator always preserves per-row order and source transaction atomicity for any changes. This behavior cannot be configured. The consistency mode flags (--immediate, --bestEffortOnly, --bestEffortWindow) do not apply to these commands and will have no effect if specified.

Userscripts

MOLT Replicator can apply userscripts, specified with the --userscript flag, to customize how data is processed and transformed as it moves through the live replication pipeline. Userscripts are customized TypeScript files that apply transformation logic to rows of data on a per-schema and per-table basis.

Userscripts are intended to address unique business or data transformation needs. They perform operations that cannot be handled by the source change data capture (CDC) stream, such as filtering out specific tables, rows, or columns; routing data from a single source table to multiple target tables; transforming column values or adding computed columns; and implementing custom error handling. These transformations occur in flight, between the source and target databases.

To have MOLT Replicator apply a userscript, include the --userscript flag with any Replicator command. The flag accepts the path to a TypeScript file:

--userscript 'path/to/script.ts'

For more information, refer to the userscript documentation, which covers the userscript API and includes cookbook examples.
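
As a rough sketch of what a userscript looks like: the replicator@v1 module name and configureTable entry point follow the patterns in the userscript documentation, but verify the exact API against the docs for your Replicator version. The schema, table, and column names below are hypothetical.

```typescript
// Hypothetical userscript: per-table, in-flight transformation logic.
// Verify the module name and API against the userscript documentation.
import * as api from "replicator@v1";

api.configureTable("defaultdb.migration_schema.employees", {
  // map runs for each replicated row; return the (possibly modified)
  // document, or null to filter the row out.
  map: (doc) => {
    if (doc.is_internal === "true") {
      return null; // hypothetical filter condition
    }
    doc.full_name = `${doc.first_name} ${doc.last_name}`; // computed column
    return doc;
  },
});
```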

Monitoring

Metrics

MOLT Replicator metrics are not enabled by default. Enable Replicator metrics by specifying the --metricsAddr flag with a port (or host:port) when you start Replicator. This exposes Replicator metrics at http://{host}:{port}/_/varz. For example, the following flag exposes metrics on port 30005:

--metricsAddr :30005
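
To verify that metrics are being served, request the endpoint directly. This assumes Replicator is running locally with the flag above:

```shell
# Fetch the Prometheus-formatted metrics exposed by Replicator.
curl -s http://localhost:30005/_/varz | head
```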

Metrics can also be written to snapshot files at regular intervals. Snapshotting is disabled by default; once metrics are enabled, enable snapshotting with the --metricsSnapshotPeriod flag. For example, the following flag takes a metrics snapshot every 15 seconds:

--metricsSnapshotPeriod 15s

Metrics snapshots enable access to metrics when the Prometheus server is unavailable, and they can be sent to CockroachDB support to help quickly resolve an issue.

For guidelines on using and interpreting replication metrics, refer to Replicator Metrics.

Logging

By default, MOLT Replicator writes two streams of logs: operational logs to stdout (including warning, info, trace, and some errors) and final errors to stderr.

Redirect both streams to ensure all logs are captured for troubleshooting:

# Merge both streams to console
./replicator ... 2>&1

# Redirect both streams to a file
./replicator ... > output.log 2>&1

# Merge streams to console while saving to file
./replicator > >(tee replicator.log) 2>&1

# Use logDestination flag to write all logs to a file
./replicator --logDestination replicator.log ...

Enable debug logging with -v. For more granularity and system insights, enable trace logging with -vv. Pay close attention to warning- and error-level logs, as these indicate when Replicator is misbehaving.

Docker usage

Local connection strings

When testing locally, specify the host as follows:

  • For macOS, use host.docker.internal. For example:

    --sourceConn 'postgres://postgres:postgres@host.docker.internal:5432/migration_db?sslmode=disable'
    --targetConn "postgres://root@host.docker.internal:26257/defaultdb?sslmode=disable"
    
  • For Linux and Windows, use 172.17.0.1. For example:

    --sourceConn 'postgres://postgres:postgres@172.17.0.1:5432/migration_db?sslmode=disable'
    --targetConn "postgres://root@172.17.0.1:26257/defaultdb?sslmode=disable"
    
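
Putting the pieces together, a local Docker test run might look like the following. The cockroachdb/replicator image name is an assumption — substitute the image or binary distribution you actually use — and the flags are the same ones described on this page:

```shell
# Hypothetical local test on macOS: replicate from PostgreSQL on the host to
# a local CockroachDB cluster. Image name and credentials are placeholders.
docker run --rm -p 30005:30005 cockroachdb/replicator \
  pglogical \
  --sourceConn 'postgres://postgres:postgres@host.docker.internal:5432/migration_db?sslmode=disable' \
  --targetConn 'postgres://root@host.docker.internal:26257/defaultdb?sslmode=disable' \
  --targetSchema defaultdb.migration_schema \
  --slotName molt_slot \
  --stagingSchema defaultdb._replicator \
  --stagingCreateSchema \
  --metricsAddr :30005
```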

Common uses

Forward replication (after initial load)

In a migration that uses continuous replication, run replicator after performing the initial data load with MOLT Fetch. Use the command that corresponds to your source database, along with the required flags described below.

For PostgreSQL:

replicator pglogical

For MySQL:

replicator mylogical

For Oracle:

replicator oraclelogminer

Specify the source and target database connections. For connection string formats, refer to Source connection strings and Target connection strings:

--sourceConn $SOURCE
--targetConn $TARGET

For Oracle Multitenant databases, also specify the PDB connection:

--sourcePDBConn $SOURCE_PDB

Specify the Oracle user that owns the tables to replicate. Oracle stores unquoted identifiers in uppercase, so specify the schema name in uppercase:

--sourceSchema MIGRATION_USER

Specify the target schema on CockroachDB with --targetSchema in fully-qualified database.schema format:

--targetSchema defaultdb.migration_schema

To replicate from the correct position, specify the appropriate checkpoint value for your source database.

For PostgreSQL, use --slotName to specify the slot created during the data load, which automatically tracks the LSN (Log Sequence Number) checkpoint:

--slotName molt_slot

For MySQL, use --defaultGTIDSet with the cdc_cursor value from the MOLT Fetch output:

--defaultGTIDSet '4c658ae6-e8ad-11ef-8449-0242ac140006:1-29'

For Oracle, use the --scn and --backfillFromSCN values from the cdc_cursor field in the MOLT Fetch output:

--scn 26685786
--backfillFromSCN 26685444

Use --stagingSchema to specify the staging database in fully-qualified database.schema format. Use --stagingCreateSchema to create it automatically on first run:

--stagingSchema defaultdb._replicator
--stagingCreateSchema

At minimum, the replicator command should include the following flags.

For PostgreSQL:

replicator pglogical \
--sourceConn $SOURCE \
--targetConn $TARGET \
--targetSchema defaultdb.migration_schema \
--slotName molt_slot \
--stagingSchema defaultdb._replicator \
--stagingCreateSchema

For MySQL:

replicator mylogical \
--sourceConn $SOURCE \
--targetConn $TARGET \
--targetSchema defaultdb.public \
--defaultGTIDSet '4c658ae6-e8ad-11ef-8449-0242ac140006:1-29' \
--stagingSchema defaultdb._replicator \
--stagingCreateSchema

For Oracle:

replicator oraclelogminer \
--sourceConn $SOURCE \
--sourcePDBConn $SOURCE_PDB \
--targetConn $TARGET \
--sourceSchema MIGRATION_USER \
--targetSchema defaultdb.migration_schema \
--scn 26685786 \
--backfillFromSCN 26685444 \
--stagingSchema defaultdb._replicator \
--stagingCreateSchema

For detailed walkthroughs of migrations that use replicator in this way, refer to the common migration approaches.

Failback replication

A migration that utilizes failback replication replicates data from the CockroachDB cluster back to the source database. In this case, MOLT Replicator acts as an HTTPS webhook sink for a CockroachDB changefeed.

Use the start command for failback:

replicator start

Specify the target database connection (the database you originally migrated from) with --targetConn. For connection string formats, refer to Target connection strings:

--targetConn $TARGET

Specify the CockroachDB connection string with --stagingConn. For details, refer to Connect using a URL.

--stagingConn $STAGING

Specify the staging database name with --stagingSchema in fully-qualified database.schema format. This should be the same staging database created during Forward replication (after initial load):

--stagingSchema defaultdb._replicator

Specify a webhook endpoint address for the changefeed to send changes to with --bindAddr. For example:

--bindAddr :30004

Specify TLS certificate and private key file paths for secure webhook connections with --tlsCertificate and --tlsPrivateKey:

--tlsCertificate ./certs/server.crt
--tlsPrivateKey ./certs/server.key

At minimum, the replicator command should include the following flags:

replicator start \
--targetConn $TARGET \
--stagingConn $STAGING \
--stagingSchema defaultdb._replicator \
--bindAddr :30004 \
--tlsCertificate ./certs/server.crt \
--tlsPrivateKey ./certs/server.key

After starting replicator, create a CockroachDB changefeed to send changes to MOLT Replicator. For a detailed example, refer to Phased Delta Migration with Failback Replication.

Note:

When creating the CockroachDB changefeed, you specify the target database and schema in the webhook URL path. For PostgreSQL targets, use the fully-qualified format /database/schema (/migration_db/migration_schema). For MySQL targets, use the database name (/migration_db). For Oracle targets, use the uppercase schema name (/MIGRATION_SCHEMA).

Explicitly set webhook_client_timeout to its default value of 10s in the CREATE CHANGEFEED statement. This ensures that the webhook can report failures in inconsistent networking situations and makes crash loops more visible.
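
A changefeed pointed at the webhook endpoint above might look like the following sketch. The hostname and table name are placeholders, TLS verification options are omitted, and the URL path uses the PostgreSQL-style /database/schema convention described in the note:

```sql
-- Hypothetical changefeed that sends changes to the Replicator webhook.
CREATE CHANGEFEED FOR TABLE migration_db.employees
  INTO 'webhook-https://replicator-host:30004/migration_db/migration_schema'
  WITH updated, resolved = '1s', webhook_client_timeout = '10s';
```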


Resume after an interruption

Whether you're using Replicator to perform forward replication or failback replication, an unexpected issue may cause replication to stop.

To resume replication, run the MOLT Replicator pglogical command using the same --stagingSchema value from your initial replication command.

Be sure to specify the same --slotName value that you used during your initial replication command. The replication slot on the source PostgreSQL database automatically tracks the LSN (Log Sequence Number) checkpoint, so replication will resume from where it left off.

replicator pglogical \
--sourceConn $SOURCE \
--targetConn $TARGET \
--targetSchema defaultdb.migration_schema \
--slotName molt_slot \
--stagingSchema defaultdb._replicator \
--metricsAddr :30005 \
-v

To resume replication, run the MOLT Replicator mylogical command using the same --stagingSchema value from your initial replication command.

Replicator will automatically use the saved GTID (Global Transaction Identifier) from the memo table in the staging schema (in this example, defaultdb._replicator.memo) and track advancing GTID checkpoints there. To have Replicator start from a different GTID instead of resuming from the checkpoint, clear the memo table with DELETE FROM defaultdb._replicator.memo; and run the replicator command with a new --defaultGTIDSet value.
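
For example, to discard the saved checkpoint before restarting from a new GTID, assuming the staging schema shown earlier on this page:

```sql
-- Clear the saved GTID checkpoint; the next replicator run must then
-- supply a new --defaultGTIDSet value.
DELETE FROM defaultdb._replicator.memo;
```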

Tip:

For MySQL versions that do not support binlog_row_metadata, include --fetchMetadata to explicitly fetch column metadata. This requires additional permissions on the source MySQL database. Grant SELECT permissions with GRANT SELECT ON migration_db.* TO 'migration_user'@'localhost';. If that is insufficient for your deployment, use GRANT PROCESS ON *.* TO 'migration_user'@'localhost';, though this is more permissive and allows seeing processes and server status.

replicator mylogical \
--sourceConn $SOURCE \
--targetConn $TARGET \
--targetSchema defaultdb.public \
--stagingSchema defaultdb._replicator \
--metricsAddr :30005 \
--userscript table_filter.ts \
-v

To resume replication, run the MOLT Replicator oraclelogminer command using the same --stagingSchema value from your initial replication command.

Replicator will automatically find the correct restart SCN (System Change Number) from the _oracle_checkpoint table in the staging schema. The restart point is determined by the non-committed row with the smallest startscn column value.

replicator oraclelogminer \
--sourceConn $SOURCE \
--sourcePDBConn $SOURCE_PDB \
--sourceSchema MIGRATION_USER \
--targetSchema defaultdb.migration_schema \
--targetConn $TARGET \
--stagingSchema defaultdb._replicator \
--metricsAddr :30005 \
--userscript table_filter.ts \
-v

Note:

When filtering out tables in a schema with a userscript, replication performance may decrease because filtered tables are still included in LogMiner queries and processed before being discarded.

Replication resumes from the last checkpoint without performing a fresh load. Monitor the metrics endpoint at http://localhost:30005/_/varz to track replication progress.

Known limitations

  • Replication modes require connection to the primary instance (PostgreSQL primary, MySQL primary/master, or Oracle primary). MOLT cannot obtain replication checkpoints or transaction metadata from replicas.
  • Running DDL on the source or target while replication is in progress can cause replication failures.
  • TRUNCATE operations on the source are not captured. Only INSERT, UPDATE, UPSERT, and DELETE events are replicated.
  • Changes to virtual columns are not replicated automatically. To migrate these columns, you must define them explicitly with transformation rules.

The following limitation is specific to MySQL sources:

  • MySQL replication is supported only with GTID-based configurations. Binlog-based features that do not use GTID are not supported.

The following limitations are specific to Oracle sources:

  • Replication does not work for table or column names exceeding 30 characters. This is a limitation of Oracle LogMiner.
  • The following data types are not supported for replication:
    • User-defined types (UDTs)
    • Nested tables
    • VARRAY
    • LONGBLOB/CLOB columns (over 4000 characters)
  • If your Oracle workload executes UPDATE statements that modify only LOB columns, these UPDATE statements are not supported by Oracle LogMiner and will not be replicated.
  • If you are using Oracle 11 and execute UPDATE statements on XMLTYPE or LOB columns, those changes are not supported by Oracle LogMiner and will be excluded from ongoing replication.
  • If you are migrating LOB columns from Oracle 12c, use AWS DMS Binary Reader instead of LogMiner. Oracle LogMiner does not support LOB replication in 12c.

See also
