CockroachDB supports bundling multiple SQL statements into a single all-or-nothing transaction. Each transaction guarantees ACID semantics spanning arbitrary tables and rows, even when data is distributed. If a transaction succeeds, all mutations are applied together with virtual simultaneity. If any part of a transaction fails, the entire transaction is aborted, and the database is left unchanged. CockroachDB guarantees that while a transaction is pending, it is isolated from other concurrent transactions with serializable isolation.
For a detailed discussion of CockroachDB transaction semantics, see How CockroachDB Does Distributed Atomic Transactions and Serializable, Lockless, Distributed: Isolation in CockroachDB. The explanation of the transaction model described in this blog post is slightly out of date. See the Transaction Retries section for more details.
SQL statements
The following SQL statements control transactions.
Statement | Description |
---|---|
BEGIN |
Initiate a transaction, as well as control its priority. |
SET TRANSACTION |
Control a transaction's priority. |
COMMIT |
Commit a regular transaction, or clear the connection after committing a transaction using the advanced retry protocol. |
ROLLBACK |
Abort a transaction and roll the database back to its state before the transaction began. |
SHOW |
Display the current transaction settings. |
SAVEPOINT |
Used for nested transactions; also used to implement advanced client-side transaction retries. |
RELEASE SAVEPOINT |
Commit a nested transaction; also used for retryable transactions. |
ROLLBACK TO SAVEPOINT |
Roll back a nested transaction; also used to handle retryable transaction errors. |
If you are using a framework or library that does not have advanced retry logic built in, you should implement an application-level retry loop with exponential backoff. See Client-side intervention.
Syntax
In CockroachDB, a transaction is set up by surrounding SQL statements with the BEGIN
and COMMIT
statements.
To use advanced client-side transaction retries, you should also include the SAVEPOINT
, ROLLBACK TO SAVEPOINT
and RELEASE SAVEPOINT
statements.
BEGIN;
SAVEPOINT cockroach_restart;
<transaction statements>
RELEASE SAVEPOINT cockroach_restart;
COMMIT;
At any time before it's committed, you can abort the transaction by executing the ROLLBACK
statement.
Clients using transactions must also include logic to handle retries.
Error handling
To handle errors in transactions, you should check for the following types of server-side errors:
Type | Description |
---|---|
Retry Errors | Errors with the code 40001 or string restart transaction , which indicate that a transaction failed because it could not be placed in a serializable ordering of transactions by CockroachDB. This is often due to contention: conflicts with another concurrent or recent transaction accessing the same data. In such cases, the transaction needs to be retried by the client as described in client-side intervention. For a reference listing all of the retry error codes emitted by CockroachDB, see the Transaction Retry Error Reference. |
Ambiguous Errors | Errors with the code 40003 which indicate that the state of the transaction is ambiguous, i.e., you cannot assume it either committed or failed. How you handle these errors depends on how you want to resolve the ambiguity. For information about how to handle ambiguous errors, see here. |
SQL Errors | All other errors, which indicate that a statement in the transaction failed. For example, violating the UNIQUE constraint generates a 23505 error. After encountering these errors, you can either issue a COMMIT or ROLLBACK to abort the transaction and revert the database to its state before the transaction began.If you want to attempt the same set of statements again, you must begin a completely new transaction. |
Transaction retries
Transactions may require retries if they experience deadlock or read/write contention with other concurrent transactions which cannot be resolved without allowing potential serializable anomalies.
To mitigate read-write contention and reduce the need for transaction retries, use the following techniques:
- Perform reads using
AS OF SYSTEM TIME
. - Use
SELECT FOR UPDATE
to order transactions by controlling concurrent access to one or more rows of a table. This reduces retries in scenarios where a transaction performs a read and then updates the same row it just read.
There are two cases in which transaction retries can occur:
- Automatic retries, which CockroachDB processes for you.
- Client-side intervention, which your application must handle.
Automatic retries
CockroachDB automatically retries individual statements (implicit transactions) and transactions sent from the client as a single batch, as long as the size of the results being produced for the client, including protocol overhead, is less than 16KiB by default. Once that buffer overflows, CockroachDB starts streaming results back to the client, at which point automatic retries cannot be performed any more. As long as the results of a single statement or batch of statements are known to stay clear of this limit, the client does not need to worry about transaction retries.
You can change the results buffer size for all new sessions using the sql.defaults.results_buffer.size
cluster setting, or for a specific session using the results_buffer_size
session variable. Decreasing the buffer size can increase the number of transaction retry errors a client receives, whereas increasing the buffer size can increase the delay until the client receives the first result row.
Individual statements
Individual statements are treated as implicit transactions, and so they fall
under the rules described above. If the results are small enough, they will be
automatically retried. In particular, INSERT/UPDATE/DELETE
statements without
a RETURNING
clause are guaranteed to have minuscule result sizes.
For example, the following statement would be automatically retried by CockroachDB:
DELETE FROM customers WHERE id = 1;
Batched statements
Transactions can be sent from the client as a single batch. Batching implies that CockroachDB receives multiple statements without being asked to return results in between them; instead, CockroachDB returns results after executing all of the statements, except when the accumulated results overflow the buffer mentioned above, in which case they are returned sooner and automatic retries can no longer be performed.
Batching is generally controlled by your driver or client's behavior. Technically, it can be achieved in two ways, both supporting automatic retries:
When the client/driver is using the PostgreSQL Extended Query protocol, a batch is made up of all queries sent in between two
Sync
messages. Many drivers support such batches through explicit batching constructs.When the client/driver is using the PostgreSQL Simple Query protocol, a batch is made up of semicolon-separated strings sent as a unit to CockroachDB. For example, in Go, this code would send a single batch (which would be automatically retried):
db.Exec( "BEGIN; DELETE FROM customers WHERE id = 1; DELETE orders WHERE customer = 1; COMMIT;" )
Within a batch of statements, CockroachDB infers that the statements are not conditional on the results of previous statements, so it can retry all of them. Of course, if the transaction relies on conditional logic (e.g., statement 2 is executed only for some results of statement 1), then the transaction cannot be all sent to CockroachDB as a single batch. In these common cases, CockroachDB cannot retry, say, statement 2 in isolation. Since results for statement 1 have already been delivered to the client by the time statement 2 is forcing the transaction to retry, the client needs to be involved in retrying the whole transaction and so you should write your transactions to use client-side intervention.
Bounded staleness reads
In the event bounded staleness reads are used along with either the with_min_timestamp
function or the with_max_staleness
function and the nearest_only
parameter is set to true
, the query will throw an error if it can't be served by a nearby replica.
Client-side intervention
Your application should include client-side retry handling when the statements are sent individually, such as:
BEGIN;
UPDATE products SET inventory = 0 WHERE sku = '8675309';
INSERT INTO orders (customer, status) VALUES (1, 'new');
COMMIT;
To indicate that a transaction must be retried, CockroachDB signals an error with the SQLSTATE
error code 40001
(serialization error) and an error message that begins with the string "restart transaction"
. For a complete list of transaction retry error codes, see Transaction retry error reference.
To handle these types of errors, you have the following options:
- If your database library or framework provides a method for retryable transactions (it will often be documented as a tool for handling deadlocks), use it.
- If you're building an application in the following languages, Cockroach Labs has created adapters that include automatic retry handling:
- Go developers using GORM or pgx can use the
github.com/cockroachdb/cockroach-go/crdb
package. For an example, see Build a Go App with CockroachDB. - Python developers using SQLAlchemy can use the
sqlalchemy-cockroachdb
adapter. For an example, see Build a Python App with CockroachDB and SQLAlchemy. - Ruby (Active Record) developers can use the
activerecord-cockroachdb-adapter
. For an example, see Build a Ruby App with CockroachDB and Active Record.
- Go developers using GORM or pgx can use the
- If you're building an application with another driver or data access framework that is supported by CockroachDB, we recommend reusing the retry logic in our "Simple CRUD" Example Apps. For example, Java developers accessing the database with JDBC can reuse the example code implementing retry logic shown in Build a Java app with CockroachDB.
- If you're building an application with a language and framework for which we do not provide example retry logic, you might need to write your own retry logic. For an example, see the Client-side intervention example.
- Advanced users, such as library authors: See Advanced Client-Side Transaction Retries.
Client-side intervention example
The Python-like pseudocode below shows how to implement an application-level retry loop; it does not require your driver or ORM to implement advanced retry handling logic, so it can be used from any programming language or environment. In particular, your retry loop must:
- Raise an error if the
max_retries
limit is reached - Retry on
40001
error codes COMMIT
at the end of thetry
block- Implement exponential backoff logic as shown below for best performance
while true:
n++
if n == max_retries:
throw Error("did not succeed within N retries")
try:
# add logic here to run all your statements
conn.exec('COMMIT')
break
catch error:
if error.code != "40001":
throw error
else:
# This is a retry error, so we roll back the current transaction
# and sleep for a bit before retrying. The sleep time increases
# for each failed transaction. Adapted from
# https://colintemple.com/2017/03/java-exponential-backoff/
conn.exec('ROLLBACK');
sleep_ms = int(((2**n) * 100) + rand( 100 - 1 ) + 1)
sleep(sleep_ms) # Assumes your sleep() takes milliseconds
Testing transaction retry logic
New in v22.1:
To test your transaction retry logic, use the inject_retry_errors_enabled
session variable. When inject_retry_errors_enabled
is set to true
, any statement (with the exception of SET
statements) executed in the session inside of an explicit transaction will return a transaction retry error with the message restart transaction: TransactionRetryWithProtoRefreshError: injected by `inject_retry_errors_enabled` session variable
.
If the client retries the transaction using the special cockroach_restart
SAVEPOINT
name, after the 3rd retry, the transaction will proceed as normal. Otherwise, the errors will continue until the client issues a SET inject_retry_errors_enabled=false
statement.
For example, suppose you've written a wrapper function with some retry logic that you want to use to execute statements across a psycopg2
connection:
def run_transaction(conn, op, max_retries=3):
"""
Execute the operation *op(conn)* retrying serialization failures.
If the database returns an error asking to retry the transaction, retry it
*max_retries* times before giving up (and propagate it).
"""
# leaving this block the transaction will commit or rollback
# (if leaving with an exception)
with conn:
for retry in range(1, max_retries + 1):
try:
op(conn)
# If we reach this point, we were able to commit, so we break
# from the retry loop.
return
except SerializationFailure as e:
# This is a retry error, so we roll back the current
# transaction and sleep for a bit before retrying. The
# sleep time increases for each failed transaction.
logging.debug("got error: %s", e)
conn.rollback()
sleep_ms = (2 ** retry) * 0.1 * (random.random() + 0.5)
logging.debug("Sleeping %s seconds", sleep_ms)
time.sleep(sleep_ms)
except psycopg2.Error as e:
logging.debug("got error: %s", e)
raise e
raise ValueError(f"Transaction did not succeed after {max_retries} retries")
run_transaction
takes a SQL-executing function op(conn)
and attempts to run the function, retrying on serialization failures (exposed in pscyopg2 as the SerializationFailure
exception class) with exponential backoff, until reaching a maximum number of tries.
You can add a quick test to this function using the inject_retry_errors_enabled
session variable.
def run_transaction(conn, op=None, max_retries=3):
"""
Execute the operation *op(conn)* retrying serialization failure.
If no op is specified, the function runs a test, using the
inject_retry_errors_enabled session variable to inject errors.
If the database returns an error asking to retry the transaction, retry it
*max_retries* times before giving up (and propagate it).
"""
# leaving this block the transaction will commit or rollback
# (if leaving with an exception)
with conn:
for retry in range(1, max_retries + 1):
try:
if not op:
with conn.cursor() as cur:
if retry == 1:
cur.execute("SET inject_retry_errors_enabled = 'true'")
if retry == max_retries:
cur.execute("SET inject_retry_errors_enabled = 'false'")
cur.execute("SELECT now()")
logging.debug("status message: %s", cur.statusmessage)
else:
op(conn)
# If we reach this point, we were able to commit, so we break
# from the retry loop.
return
except SerializationFailure as e:
# This is a retry error, so we roll back the current
# transaction and sleep for a bit before retrying. The
# sleep time increases for each failed transaction.
logging.debug("got error: %s", e)
conn.rollback()
sleep_ms = (2 ** retry) * 0.1 * (random.random() + 0.5)
logging.debug("Sleeping %s seconds", sleep_ms)
time.sleep(sleep_ms)
except psycopg2.Error as e:
logging.debug("got error: %s", e)
raise e
raise ValueError(f"Transaction did not succeed after {max_retries} retries")
Calling run_transaction
without an op
input sets inject_retry_errors_enabled
as true
until the final retry attempt, before which the inject_retry_errors_enabled
is set back to false
. For all attempts except the last one, CockroachDB will inject a retryable serialization error for the client to handle. If the client cannot handle the error properly, the retry logic isn't working properly.
Transaction contention
Transactions in CockroachDB lock data resources that are written during their execution. When a pending write from one transaction conflicts with a write of a concurrent transaction, the concurrent transaction must wait for the earlier transaction to complete before proceeding. When a dependency cycle is detected between transactions, the transaction with the higher priority aborts the dependent transaction to avoid deadlock, which must be retried.
For more details about transaction contention and best practices for avoiding contention, see Transaction Contention.
Nested transactions
CockroachDB supports the nesting of transactions using savepoints. These nested transactions are also known as sub-transactions. Nested transactions can be rolled back without discarding the state of the entire surrounding transaction.
This can be useful in applications that abstract database access using an application development framework or ORM. Different components of the application can operate on different sub-transactions without having to know about each others' internal operations, while trusting that the database will maintain isolation between sub-transactions and preserve data integrity.
Just as COMMIT
and ROLLBACK
are used to commit and discard entire transactions, respectively, RELEASE SAVEPOINT
and ROLLBACK TO SAVEPOINT
are used to commit and discard nested transactions. This relationship is shown in the following table:
Statement | Effect |
---|---|
COMMIT |
Commit an entire transaction. |
ROLLBACK |
Discard an entire transaction. |
RELEASE SAVEPOINT |
Commit (really, forget) the named nested transaction. |
ROLLBACK TO SAVEPOINT |
Discard the changes in the named nested transaction. |
For more information, including examples showing how to use savepoints to create nested transactions, see the savepoints documentation.
Transaction priorities
Every transaction in CockroachDB is assigned an initial priority. By default, the transaction priority is NORMAL
.
Set transaction priority
For transactions that should be given higher or lower preference in high-contention scenarios, you can set the priority in the BEGIN
statement:
BEGIN PRIORITY <LOW | NORMAL | HIGH>;
You can also set the priority immediately after a transaction is started:
SET TRANSACTION PRIORITY <LOW | NORMAL | HIGH>;
To set the default transaction priority for all transactions in a session, use the default_transaction_priority
session variable. For example:
SET default_transaction_priority = 'high';
When two transactions contend for the same resources indirectly, they may create a dependency cycle leading to a deadlock situation, where both transactions are waiting on the other to finish. In these cases, CockroachDB allows the transaction with higher priority to abort the other, which must then retry. On retry, the transaction inherits the higher priority. This means that each retry makes a transaction more likely to succeed in the event it again experiences deadlock.
View transaction priority
transaction_priority
is a read-only session variable.
To view the current priority of a transaction, use SHOW transaction_priority
or SHOW TRANSACTION PRIORITY
:
SHOW transaction_priority;
transaction_priority
------------------------
high
SHOW TRANSACTION PRIORITY;
transaction_priority
------------------------
high
Isolation levels
CockroachDB executes all transactions at the strongest ANSI transaction isolation level:
SERIALIZABLE
. All other ANSI transaction isolation levels (e.g., SNAPSHOT
, READ UNCOMMITTED
,
READ COMMITTED
, and REPEATABLE READ
) are automatically upgraded to SERIALIZABLE
. Weaker
isolation levels have historically been used to maximize transaction throughput. However,
ACIDRain: Concurrency-Related Attacks on Database-Backed Web Applications has demonstrated that the use
of weak isolation levels results in substantial vulnerability to concurrency-based attacks.
For a detailed discussion of isolation in CockroachDB transactions, see Serializable, Lockless, Distributed: Isolation in CockroachDB.
Serializable isolation
With SERIALIZABLE
isolation, a transaction behaves as though it has the entire database all to itself for the duration of its execution. This means that no concurrent writers can affect the transaction unless they commit before it starts, and no concurrent readers can be affected by the transaction until it has successfully committed. This is the strongest level of isolation provided by CockroachDB and it's the default.
SERIALIZABLE
isolation permits no anomalies. To prevent write skew anomalies, SERIALIZABLE
isolation may require transaction restarts. For a demonstration of SERIALIZABLE
preventing write skew, see Serializable Transactions.
Comparison to ANSI SQL isolation levels
CockroachDB uses slightly different isolation levels than ANSI SQL isolation levels.
Aliases
SNAPSHOT
, READ UNCOMMITTED
, READ COMMITTED
, and REPEATABLE READ
are aliases for SERIALIZABLE
.
Comparison
The CockroachDB SERIALIZABLE
level is stronger than the ANSI SQL READ UNCOMMITTED
, READ COMMITTED
, and REPEATABLE READ
levels and equivalent to the ANSI SQL SERIALIZABLE
level.
For more information about the relationship between these levels, see A Critique of ANSI SQL Isolation Levels.
Limit the number of rows written or read in a transaction
You can limit the number of rows written or read in a transaction at the cluster or session level. This allows you configure CockroachDB to log or reject statements that could destabilize a cluster or violate application best practices.
Use the cluster settings sql.defaults.transaction_rows_written_log
,
sql.defaults.transaction_rows_written_err
, sql.defaults.transaction_rows_read_log
, and
sql.defaults.transaction_rows_read_err
and session settings transaction_rows_written_log
,
transaction_rows_written_err
, transaction_rows_read_log
, and
transaction_rows_read_err
to limit the number of rows written or read in a
transaction. When the log
limit is reached, the transaction is logged to the SQL_PERF
channel.
When the err
limit is reached, the transaction is rejected. The limits are enforced after each
statement of a transaction has been fully executed.
The "write" limits apply to INSERT
, INSERT INTO SELECT FROM
, INSERT ON CONFLICT
, UPSERT
, UPDATE
,
and DELETE
SQL statements. The "read" limits apply to the SELECT
statement in addition to the statements subject to the "write" limits. The limits do not
apply to CREATE TABLE AS
, SELECT
, IMPORT
, TRUNCATE
, DROP
, ALTER TABLE
, BACKUP
,
RESTORE
, or CREATE STATISTICS
statements.
Enabling transaction_rows_read_err
disables a performance optimization for mutation statements in implicit transactions where CockroachDB can auto-commit without additional network round trips.