Migrate from Oracle

On this page Carat arrow pointing down

This page has instructions for migrating data from Oracle into CockroachDB by importing CSV files. Note that IMPORT only works for creating new tables. For information on how to add CSV data to existing tables, see IMPORT INTO.

To illustrate this process, we use the following sample data and tools:

  • Swingbench OrderEntry data set, which is based on the oe schema that ships with Oracle Database 11g and Oracle Database 12c.
  • Oracle Data Pump, which enables the movement of data and metadata from one database to another, and comes with all Oracle installations.
  • SQL*Plus, the interactive and batch query tool that comes with every Oracle Database installation.
Tip:

For best practices for optimizing import performance in CockroachDB, see Import Performance Best Practices.

Step 1. Export the Oracle schema

Using Oracle's Data Pump Export utility, export the schema:

icon/buttons/copy
$ expdp user/password directory=datapump dumpfile=oracle_example.dmp content=metadata_only logfile=example.log

The schema is stored in an Oracle-specific format (e.g., oracle_example.dmp).

Step 2. Convert the Oracle schema to SQL

Using Oracle's Data Pump Import utility, load the exported DMP file to convert it to a SQL file:

icon/buttons/copy
$ impdp user/password directory=datapump dumpfile=oracle_example.dmp sqlfile=example_sql.sql TRANSFORM=SEGMENT_ATTRIBUTES:N:table PARTITION_OPTIONS=MERGE

This SQL output will be used later, in Step 7.

Step 3. Export table data

You need to extract each table's data into a data list file (.lst). We wrote a simple SQL script (spool.sql) to do this:

$ cat spool.sql

SET ECHO OFF
SET TERMOUT OFF
SET FEEDBACK OFF
SET PAGESIZE 0
SET TRIMSPOOL ON
SET WRAP OFF
set linesize 30000
SET RECSEP OFF
SET VERIFY OFF
SET ARRAYSIZE 10000
SET COLSEP '|'

SPOOL '&1'

ALTER SESSION SET nls_date_format = 'YYYY-MM-DD HH24:MI:SS';
  # Used to set a properly formatted date for CockroachDB

SELECT * from &1;

SPOOL OFF

SET PAGESIZE 24
SET FEEDBACK ON
SET TERMOUT ON
Note:

In the example SQL script, | is used as a delimiter. Choose a delimiter that will not also occur in the rows themselves. For more information, see IMPORT.

To extract the data, we ran the script for each table in SQL*Plus:

icon/buttons/copy
$ sqlplus user/password
icon/buttons/copy
> @spool CUSTOMERS
  @spool ADDRESSES
  @spool CARD_DETAILS
  @spool WAREHOUSES
  @spool ORDER_ITEMS
  @spool ORDERS
  @spool INVENTORIES
  @spool PRODUCT_INFORMATION
  @spool LOGON
  @spool PRODUCT_DESCRIPTIONS
  @spool ORDERENTRY_METADATA

A data list file (.lst) with leading and trailing spaces is created for each table.

Exit SQL*Plus:

icon/buttons/copy
> EXIT

Step 4. Configure and convert the table data to CSV

Each table's data list file needs to be converted to CSV and formatted for CockroachDB. We wrote a simple Python script (fix-example.py) to do this:

$ cat fix-example.py

import csv
import string
import sys

for lstfile in sys.argv[1:]:
  filename = lstfile.split(".")[0]

  with open(sys.argv[1]) as f:
    reader = csv.reader(f, delimiter="|")
    with open(filename+".csv", "w") as fo:
      writer = csv.writer(fo)
      for rec in reader:
        writer.writerow(map(string.strip, rec))
icon/buttons/copy
$ python2 fix.py CUSTOMERS.lst ADDRESSES.lst CARD_DETAILS.lst WAREHOUSES.lst ORDER_ITEMS.lst ORDERS.lst INVENTORIES.lst PRODUCT_INFORMATION.lst LOGON.lst PRODUCT_DESCRIPTIONS.lst ORDERENTRY_METADATA.lst

Format the generated CSV files to meet the CockroachDB's CSV requirements.

CSV requirements

You will need to export one CSV file per table, with the following requirements:

  • Files must be in valid CSV format.
  • Files must be UTF-8 encoded.
  • If one of the following characters appears in a field, the field must be enclosed by double quotes:
    • delimiter (, by default)
    • double quote (")
    • newline (\n)
    • carriage return (\r)
  • If double quotes are used to enclose fields, then a double quote appearing inside a field must be escaped by preceding it with another double quote. For example: "aaa","b""bb","ccc"
  • If a column is of type BYTES, it can either be a valid UTF-8 string or a hex-encoded byte literal beginning with \x. For example, a field whose value should be the bytes 1, 2 would be written as \x0102.

CSV configuration options

The following options are available to IMPORT ... CSV:

For usage examples, see Migrate from CSV - Configuration Options.

Step 5. Compress the CSV files

Compress the CSV files for a faster import:

icon/buttons/copy
$ gzip CUSTOMERS.csv ADDRESSES.csv CARD_DETAILS.csv WAREHOUSES.csv ORDER_ITEMS.csv ORDERS.csv INVENTORIES.csv PRODUCT_INFORMATION.csv LOGON.csv PRODUCT_DESCRIPTIONS.csv ORDERENTRY_METADATA.csv

These compressed CSV files will be used to import your data into CockroachDB.

Step 6. Host the files where the cluster can access them

Each node in the CockroachDB cluster needs to have access to the files being imported. There are several ways for the cluster to access the data; for a complete list of the types of storage IMPORT can pull from, see Import File URLs.

Tip:

We strongly recommend using cloud storage such as Amazon S3 or Google Cloud to host the data files you want to import.

Step 7. Map Oracle to CockroachDB data types

Using the SQL file created in Step 2, write IMPORT TABLE statements that match the schemas of the table data you're importing.

Remove all Oracle-specific attributes, remap all Oracle data types, refactor all CREATE TABLE statements to include primary keys.

Data type mapping

Use the table below for data type mappings:

Oracle Data Type CockroachDB Data Type
BLOB BYTES 1
CHAR(n), CHARACTER(n)
n < 256
CHAR(n), CHARACTER(n)
CLOB STRING 1
DATE DATE
FLOAT(n) DECIMAL(n)
INTERVAL YEAR(p) TO MONTH VARCHAR, INTERVAL
INTERVAL DAY(p) TO SECOND(s) VARCHAR, INTERVAL
JSON JSON 2
LONG STRING
LONG RAW BYTES
NCHAR(n)
n < 256
CHAR(n)
NCHAR(n)
n > 255
VARCHAR, STRING
NCLOB STRING
NUMBER(p,0), NUMBER(p)
1 <= p < 5
INT2 3
NUMBER(p,0), NUMBER(p)
5 <= p < 9
INT4 3
NUMBER(p,0), NUMBER(p)
9 <= p < 19
INT8 3
NUMBER(p,0), NUMBER(p)
19 <= p <= 38
DECIMAL(p)
NUMBER(p,s)
s > 0
DECIMAL(p,s)
NUMBER, NUMBER(\*) DECIMAL
NVARCHAR2(n) VARCHAR(n)
RAW(n) BYTES
TIMESTAMP(p) TIMESTAMP
TIMESTAMP(p) WITH TIME ZONE TIMESTAMP WITH TIMEZONE
VARCHAR(n), VARCHAR2(n) VARCHAR(n)
XML JSON 2

  • 1 BLOBS and CLOBS should be converted to BYTES, or STRING where the size is variable, but it's recommended to keep values under 1 MB to ensure performance. Anything above 1 MB would require refactoring into an object store with a pointer embedded in the table in place of the object.
  • 2 JSON and XML types can be converted to JSONB using any XML to JSON conversion. XML must be converted to JSONB before importing into CockroachDB.
  • 3 When converting NUMBER(p,0), consider NUMBER types with Base-10 limits map to the Base-10 Limits for CockroachDB INT types. Optionally, NUMBERS can be converted to DECIMAL.

When moving from Oracle to CockroachDB data types, consider the following:

For more information, see Known Limitations, Online Schema Changes, and Transactions.

NULLs

For information on how CockroachDB handles NULLs, see NULL Handling and NOT NULL Constraint.

Primary key, constraints, and secondary indexes

Cockroach distributes a table by the primary key or by a default ROWID when a primary key is not provided. This also requires the primary key creation to be part of the table creation. Using the above data type mapping, refactor each table DDL to include the primary key, constraints, and secondary indexes.

For more information and examples, refer to the following:

Privileges for users and roles

The Oracle privileges for users and roles must be rewritten for CockroachDB. Once the CockroachDB cluster is secured, CockroachDB follows the same role-based access control methodology as Oracle.

Step 8. Import the CSV

For example, to import the data from CUSTOMERS.csv.gz into a new CUSTOMERS table, issue the following statement in the CockroachDB SQL shell:

icon/buttons/copy
> IMPORT TABLE customers (
        customer_id       DECIMAL
                          NOT NULL
                          PRIMARY KEY,
        cust_first_name   VARCHAR(40) NOT NULL,
        cust_last_name    VARCHAR(40) NOT NULL,
        nls_language      VARCHAR(3),
        nls_territory     VARCHAR(30),
        credit_limit      DECIMAL(9,2),
        cust_email        VARCHAR(100),
        account_mgr_id    DECIMAL,
        customer_since    DATE,
        customer_class    VARCHAR(40),
        suggestions       VARCHAR(40),
        dob               DATE,
        mailshot          VARCHAR(1),
        partner_mailshot  VARCHAR(1),
        preferred_address DECIMAL,
        preferred_card    DECIMAL,
        INDEX cust_email_ix (cust_email),
        INDEX cust_dob_ix (dob),
        INDEX cust_account_manager_ix (
            account_mgr_id
        )
       )
   CSV DATA (
        'https://your-bucket-name.s3.us-east-2.amazonaws.com/CUSTOMERS.csv.gz'
       )
  WITH delimiter = e'\t',
       "nullif" = '',
       decompress = 'gzip';
       job_id       |  status   | fraction_completed |  rows  | index_entries | system_records |  bytes   
--------------------+-----------+--------------------+--------+---------------+----------------+----------
 381866942129111041 | succeeded |                  1 | 300024 |             0 |              0 | 13258389
(1 row)
Note:

To import data into an existing table, use IMPORT INTO.

Then add the computed columns, constraints, and function-based indexes. For example:

icon/buttons/copy
> UPDATE CUSTOMERS SET credit_limit = 50000 WHERE credit_limit > 50000;
  ALTER TABLE CUSTOMERS ADD CONSTRAINT CUSTOMER_CREDIT_LIMIT_MAX CHECK (credit_limit <= 50000);
  ALTER TABLE CUSTOMERS ADD COLUMN LOW_CUST_LAST_NAME STRING AS (lower(CUST_LAST_NAME)) STORED;
  ALTER TABLE CUSTOMERS ADD COLUMN LOW_CUST_FIRST_NAME STRING AS (lower(CUST_FIRST_NAME)) STORED;
  CREATE INDEX CUST_FUNC_LOWER_NAME_IX on CUSTOMERS (LOW_CUST_LAST_NAME,CUST_FIRST_NAME);

Repeat the above for each CSV file you want to import.

Step 9. Refactor application SQL

The last phase of the migration process is to change the transactional behavior and SQL dialect of your application.

Transactions, locking, and concurrency control

Both Oracle and CockroachDB support multi-statement transactions, which are atomic and guarantee ACID semantics. However, CockroachDB operates in a serializable isolation mode while Oracle defaults to read committed, which can create both non-repeatable reads and phantom reads when a transaction reads data twice. It is typical that Oracle developers will use SELECT FOR UPDATE to work around read committed issues. In CockroachDB v20.1 and later, the SELECT FOR UPDATE statement is also supported.

Regarding locks, Cockroach utilizes a lightweight latch to serialize access to common keys across concurrent transactions. Oracle and CockroachDB transaction control flows only have a few minor differences; for more details, refer to Transactions - SQL statements.

As CockroachDB does not allow serializable anomalies, transactions may experience deadlocks or read/write contention. This is expected during concurrency on the same keys. These can be addressed with either automatic retries or client-side intervention techniques.

SQL dialect

Cockroach is ANSI SQL compliant with a Postgres dialect, which allows you to use native drivers to connect applications and ORMs to CockroachDB. CockroachDB’s SQL Layer supports full relational schema and SQL (similar to Oracle).

You will have to refactor Oracle SQL and functions that do not comply with ANSI SQL-92 in order to work with CockroachDB. For more information about the Cockroach SQL Grammar and a SQL comparison, see below:

  • SQL best practices
  • Common table expressions (CTE)
  • DUAL table

    Oracle requires use of the DUAL table, as Oracle requires a SELECT ... FROM. In CockroachDB, all reference to the DUAL table should be eliminated.

  • Function call syntax

  • Hints

    See also: Table Expressions - Force index selection

  • Joins

    CockroachDB supports HASH, MERGE, and LOOKUP joins. Oracle uses the + operator for LEFT and RIGHT joins, but CockroachDB uses the ANSI join syntax.

  • Sequences

    Sequences in CockroachDB do not require a trigger to self-increment; place the sequence in the table DDL:

    > CREATE TABLE customer_list (
        id INT PRIMARY KEY DEFAULT nextval('customer_seq'),
        customer string,
        address string
      );
    
  • Subqueries

  • SYSDATE

    CockroachDB does not support SYSDATE; however, it does support date and time with the following:

    > SELECT Transaction_timestamp(), clock_timestamp();
    
    > SELECT current_timestamp
    
    > SELECT now();
    
  • Views

    CockroachDB does not have a materialized view implementation.

  • Window functions

See also


Yes No
On this page

Yes No