Batch Delete Expired Data with Row-Level TTL

On this page Carat arrow pointing down

CockroachDB has support for Time to Live ("TTL") expiration on table rows, also known as Row-Level TTL. Row-Level TTL is a mechanism whereby rows from a table are considered "expired" and can be automatically deleted once those rows have been stored longer than a specified expiration time.

By using Row-Level TTL, you can avoid the complexity of writing and managing scheduled jobs from the application layer to mark rows as expired and perform the necessary deletions. Doing it yourself can become complicated due to the need to balance the timeliness of the deletions vs. the potentially negative performance impact of those deletions on foreground traffic from your application.

Use cases for Row-Level TTL include:

  • Delete inactive data events to manage data size and performance: For example, you may want to delete order records from an online store after 90 days.

  • Delete data no longer needed for compliance: For example, a banking application may need to keep some subset of data for a period of time due to financial regulations. Row-Level TTL can be used to remove data older than that period on a rolling, continuous basis.

  • Outbox pattern: When events are written to an outbox table and published to an external system like Kafka using CockroachDB's Change Data Capture (CDC) feature (also known as "changefeeds"), those events must be deleted to prevent unbounded growth in the size of the outbox table.

How it works

At a high level, Row-Level TTL works by:

  • Issuing a selection query at a historical timestamp, yielding a set of rows that are eligible for deletion (also known as "expired").
  • Issuing batched DELETE statements for the expired rows.
  • As part of the above process, deciding how many rows to SELECT and DELETE at once in each of the above queries.
  • Running the SQL queries described above in parallel as background jobs.
  • To minimize the performance impact on foreground application queries, the background deletion queries are rate limited; they are also submitted at a lower priority level using the admission control system. When foreground traffic increases, CockroachDB will reduce the resources allocated to TTL deletes to handle the foreground traffic. When foreground traffic decreases, CockroachDB will increase the resources allocated to TTL deletes.

The process above is conceptually similar to the process described by Batch delete on an indexed column, except that Row-Level TTL is built into CockroachDB, so it saves you from having to write code to manage the process from your application and/or external job processing framework, including tuning the rate and performance of your background queries so they don't affect foreground application query performance.

When are rows deleted?

Once rows are expired (that is, are older than the specified TTL interval), they are eligible to be deleted. However, eligible rows may not be deleted right away. Instead, they are scheduled for deletion using a background job that is run at the interval defined by the ttl_job_cron storage parameter.

Syntax overview

TTLs are defined using either the ttl_expiration_expression or ttl_expire_after storage parameters.

  • Using ttl_expiration_expression is useful for customizing expiration logic by providing an expression. For example, you could get the same behavior as ttl_expire_after by creating a TIMESTAMPTZ column with a default value and having the ttl_expiration_expression reference that column.
  • Using ttl_expire_after is a convenient way of setting rows to expire a fixed amount of time after they are created or updated.

Starting with CockroachDB v22.2, we recommend that most users use ttl_expiration_expression over ttl_expire_after for the following reasons:

  • If you use ttl_expiration_expression, you can use an existing TIMESTAMPTZ column called e.g. updated_at.
  • If you use ttl_expire_after, it will cause a full table rewrite, which can affect performance. Also, you can't use it with an existing TIMESTAMPTZ column.

Using ttl_expiration_expression

Use ttl_expiration_expression for customizing the expiration logic by providing a SQL expression. For example, you could get the same behavior as ttl_expire_after by creating a column with a default value and having the ttl_expiration_expression reference that column.

To add custom expiration logic using ttl_expiration_expression, issue the following SQL statement that uses the ttl_expiration_expression parameter, which defines a TIMESTAMPTZ after which the row is considered expired:

icon/buttons/copy
CREATE TABLE ttl_test_per_row (
  id UUID PRIMARY KEY default gen_random_uuid(),
  description TEXT,
  expired_at TIMESTAMPTZ
) WITH (ttl_expiration_expression = 'expired_at', ttl_job_cron = '@daily');

The statement has the following effects:

  1. Creates a repeating scheduled job for the table and sets it to run once per day.
  2. Implicitly adds the ttl and ttl_cron storage parameters.

To see the storage parameters, enter the SHOW CREATE TABLE statement:

icon/buttons/copy
SHOW CREATE TABLE ttl_test_per_row;
     table_name    |                                    create_statement
-------------------+------------------------------------------------------------------------------------------
  ttl_test_per_row | CREATE TABLE public.ttl_test_per_row (
                   |     id UUID NOT NULL DEFAULT gen_random_uuid(),
                   |     description STRING NULL,
                   |     expired_at TIMESTAMPTZ NULL,
                   |     CONSTRAINT ttl_test_per_row_pkey PRIMARY KEY (id ASC)
                   | ) WITH (ttl = 'on', ttl_expiration_expression = 'expired_at', ttl_job_cron = '@daily')
(1 row)

Starting with CockroachDB v22.2, we recommend that most users use ttl_expiration_expression over ttl_expire_after for the following reasons:

  • If you use ttl_expiration_expression, you can use an existing TIMESTAMPTZ column called e.g. updated_at.
  • If you use ttl_expire_after, it will cause a full table rewrite, which can affect performance. Also, you can't use it with an existing TIMESTAMPTZ column.

Using ttl_expire_after

To set rows to expire a fixed amount of time after they are created or updated, issue the following SQL statement using the ttl_expire_after storage parameter:

icon/buttons/copy
CREATE TABLE ttl_test_per_table (
  id UUID PRIMARY KEY default gen_random_uuid(),
  description TEXT,
  inserted_at TIMESTAMPTZ default current_timestamp()
) WITH (ttl_expire_after = '3 months', ttl_job_cron = '@daily');

The statement has the following effects:

  1. Creates a repeating scheduled job for the table and sets it to run once per day.
  2. Adds a NOT VISIBLE column called crdb_internal_expiration of type TIMESTAMPTZ to represent the TTL.
  3. Implicitly adds the ttl and ttl_cron storage parameters.

To see the hidden column and the storage parameters, enter the SHOW CREATE TABLE statement:

icon/buttons/copy
SHOW CREATE TABLE ttl_test_per_table;
      table_name     |                                                                                         create_statement
---------------------+---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
  ttl_test_per_table | CREATE TABLE public.ttl_test_per_table (
                     |     id UUID NOT NULL DEFAULT gen_random_uuid(),
                     |     description STRING NULL,
                     |     inserted_at TIMESTAMPTZ NULL DEFAULT current_timestamp():::TIMESTAMPTZ,
                     |     crdb_internal_expiration TIMESTAMPTZ NOT VISIBLE NOT NULL DEFAULT current_timestamp():::TIMESTAMPTZ + '3 mons':::INTERVAL ON UPDATE current_timestamp():::TIMESTAMPTZ + '3 mons':::INTERVAL,
                     |     CONSTRAINT ttl_test_per_table_pkey PRIMARY KEY (id ASC)
                     | ) WITH (ttl = 'on', ttl_expire_after = '3 mons':::INTERVAL, ttl_job_cron = '@daily')
(1 row)

Starting with CockroachDB v22.2, we recommend that most users use ttl_expiration_expression over ttl_expire_after for the following reasons:

  • If you use ttl_expiration_expression, you can use an existing TIMESTAMPTZ column called e.g. updated_at.
  • If you use ttl_expire_after, it will cause a full table rewrite, which can affect performance. Also, you can't use it with an existing TIMESTAMPTZ column.

TTL storage parameters

The settings that control the behavior of Row-Level TTL are provided using storage parameters. These parameters can be set during table creation using CREATE TABLE, added to an existing table using the ALTER TABLE statement, or reset to default values.

Description Option Associated cluster setting
ttl_expiration_expression Recommended in v22.2+. SQL expression that defines the TTL expiration. Must evaluate to a TIMESTAMPTZ. This and/or ttl_expire_after are required to enable TTL. This parameter is useful when you want to set the TTL for individual rows in the table. For an example, see Create a table with a ttl_expiration_expression. N/A
ttl_expire_after The interval when a TTL will expire. This and/or ttl_expiration_expression are required to enable TTL. Minimum value: '1 microsecond'. N/A
ttl Signifies if a TTL is active. Automatically set. N/A
ttl_select_batch_size How many rows to select at one time during the row expiration check. Default: 500. Minimum: 1. sql.ttl.default_select_batch_size
ttl_delete_batch_size How many rows to delete at a time. Default: 100. Minimum: 1. sql.ttl.default_delete_batch_size
ttl_delete_rate_limit Maximum number of rows to be deleted per second (rate limit). Default: 0 (no limit).

Note: The rate limit is applied per leaseholder per table. In practice, it will vary based on the number of ranges and leaseholders.
For example, if this variable is set to 500, and a table public.foo has its ranges spread across 3 leaseholders, then the cluster-wide rate limit for public.foo is 500 x 3 = 1500.
sql.ttl.default_delete_rate_limit
ttl_row_stats_poll_interval If set, counts rows and expired rows on the table to report as Prometheus metrics while the TTL job is running. Unset by default, meaning no stats are fetched and reported. N/A
ttl_pause If set, stops the TTL job from executing. N/A
ttl_job_cron Frequency at which the TTL job runs, specified using CRON syntax. Default: '@hourly'. N/A

For more information about TTL-related cluster settings, see View TTL-related cluster settings.

TTL metrics

The table below lists the metrics you can use to monitor the effectiveness of your TTL settings. These metrics are visible on the Advanced Debug Page, as well as at the _status/vars endpoint which can be scraped by Prometheus.

Name Description Measurement Type
jobs.row_level_ttl.range_total_duration Duration for processing a range during row level TTL. nanoseconds Histogram
jobs.row_level_ttl.select_duration Duration for select requests during row level TTL. nanoseconds Histogram
jobs.row_level_ttl.delete_duration Duration for delete requests during row level TTL. nanoseconds Histogram
jobs.row_level_ttl.rows_selected Number of rows selected for deletion by the row level TTL job. num_rows Counter
jobs.row_level_ttl.rows_deleted Number of rows deleted by the row level TTL job. num_rows Counter
jobs.row_level_ttl.num_active_ranges Number of active workers attempting to delete for row level TTL. num_active_workers Count
jobs.row_level_ttl.total_rows Approximate number of rows on the TTL table. total_rows Count
jobs.row_level_ttl.total_expired_rows Approximate number of expired rows on the TTL table. total_expired_rows Count

By default, these metrics are aggregated, meaning that all TTL tables will report the metrics under the same label. If you want to have metrics labelled by table name (at the risk of added cardinality), you must take the following steps:

  • Set the server.child_metrics.enabled cluster setting to true.
  • Set the ttl_label_metrics storage parameter to true.
Note:

For more information about the issues (including negative performance impacts) that can arise when you add cardinality, see the considerations listed in Using changefeed metrics labels.

Examples

Create a table with a ttl_expiration_expression

Use the SQL syntax shown below, which uses the ttl_expiration_expression parameter to refer to an expire_at column that determines each row's expiration:

icon/buttons/copy
CREATE TABLE ttl_test_ttl_expiration_expression (
  id INT PRIMARY KEY,
  expire_at TIMESTAMPTZ
) WITH (ttl_expiration_expression = 'expire_at');

The ttl_expiration_expression" parameter takes a SQL expression (often a column name) that defines the TTL expiration. It is used when you want to set the TTL for individual rows in a table.

The ttl_expiration_expression parameter has the following requirements:

  • It must evaluate to a TIMESTAMPTZ.
  • It must not reference any columns outside the table to which it is applied.
  • Any column it references cannot be dropped or have its type altered.
  • Finally, if the column is renamed, the value of ttl_expiration_expression is automatically updated.

Create a table with ttl_expire_after

Use the SQL syntax shown below to create a new table with rows that expire after a 3 month interval, execute a statement like the following:

icon/buttons/copy
CREATE TABLE events (
  id UUID PRIMARY KEY default gen_random_uuid(),
  description TEXT,
  inserted_at TIMESTAMP default current_timestamp()
) WITH (ttl_expire_after = '3 months', ttl_job_cron = '@daily');
CREATE TABLE

Insert some data; it should work as expected:

icon/buttons/copy
INSERT INTO events (description) VALUES ('a thing'), ('another thing'), ('yet another thing');
INSERT 3

To see the rows and their expirations, enter the following query:

icon/buttons/copy
SELECT *, crdb_internal_expiration FROM events;
                   id                  |    description    |        inserted_at         |   crdb_internal_expiration
---------------------------------------+-------------------+----------------------------+--------------------------------
  117c35fe-97f6-43bc-919f-fcd2ea13779e | a thing           | 2022-04-19 18:29:53.846697 | 2022-07-19 18:29:53.846697+00
  c294890f-2f14-4e18-8001-5f806ed9bfd1 | yet another thing | 2022-04-19 18:29:53.846697 | 2022-07-19 18:29:53.846697+00
  ea72189c-2f17-4a8e-b479-6b050a87e3bb | another thing     | 2022-04-19 18:29:53.846697 | 2022-07-19 18:29:53.846697+00
(3 rows)

Add or update the row-level TTL for an existing table

To add or change the row-level TTL expiration for an existing table, use the SQL syntax shown below.

icon/buttons/copy
ALTER TABLE events SET (ttl_expire_after = '1 year');
ALTER TABLE

Warning:

Adding or changing the Row-Level TTL settings for an existing table with a table-wide TTL will result in a schema change that performs the following changes:

Depending on the table size, this can negatively affect performance.

View scheduled TTL jobs

You can use SHOW SCHEDULES to view all TTL-related scheduled jobs by executing the following query:

icon/buttons/copy
SHOW SCHEDULES;

          id         |        label         | schedule_status |        next_run        |  state  | recurrence | jobsrunning | owner |            created            |     command
---------------------+----------------------+-----------------+------------------------+---------+------------+-------------+-------+-------------------------------+-------------------
  747608117920104449 | sql-stats-compaction | ACTIVE          | 2022-03-25 16:00:00+00 | pending | @hourly    |           0 | node  | 2022-03-25 15:31:31.444067+00 | {}
  747609229470433281 | row-level-ttl-112    | ACTIVE          | 2022-03-25 16:00:00+00 | NULL    | @daily    |           0 | root  | 2022-03-25 15:37:10.613056+00 | {"tableId": 112}
(2 rows)

View running TTL jobs

You can use SHOW JOBS to see any running TTL jobs by executing the following query:

icon/buttons/copy
WITH x AS (SHOW JOBS) SELECT * from x WHERE job_type = 'ROW LEVEL TTL';
        job_id       |   job_type    | description | statement | user_name | status | running_status |          created           |          started           |          finished          |          modified          | fraction_completed |                                    error                                    | coordinator_id |      trace_id       |          last_run          |          next_run          | num_runs | execution_errors
---------------------+---------------+-------------+-----------+-----------+--------+----------------+----------------------------+----------------------------+----------------------------+----------------------------+--------------------+-----------------------------------------------------------------------------+----------------+---------------------+----------------------------+----------------------------+----------+-------------------
  751553547665211401 | ROW LEVEL TTL | ttl         |           | node      | failed | NULL           | 2022-04-08 13:59:00.008994 | 2022-04-08 13:59:02.730252 | 2022-04-08 13:59:03.367008 | 2022-04-08 13:59:02.587079 |                  0 | found a recent schema change on the table at 2022-04-08T13:58:35Z, aborting |              5 | 6643876482632317647 | 2022-04-08 13:59:03.224766 | 2022-04-08 13:59:33.224766 |        1 | {}
(1 row)
Tip:

You can also view running TTL jobs using the Jobs page in the DB Console

Reset a storage parameter to its default value

To reset a TTL storage parameter to its default value, use the ALTER TABLE statement:

icon/buttons/copy
ALTER TABLE events RESET (ttl_job_cron);
ALTER TABLE

View TTL storage parameters on a table

To view TTL storage parameters on a table, you can use SHOW CREATE TABLE:

icon/buttons/copy
SHOW CREATE TABLE events;
  table_name |                                                                                         create_statement
-------------+---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
  events     | CREATE TABLE public.events (
             |     id UUID NOT NULL DEFAULT gen_random_uuid(),
             |     description STRING NULL,
             |     inserted_at TIMESTAMP NULL DEFAULT current_timestamp():::TIMESTAMP,
             |     crdb_internal_expiration TIMESTAMPTZ NOT VISIBLE NOT NULL DEFAULT current_timestamp():::TIMESTAMPTZ + '3 mons':::INTERVAL ON UPDATE current_timestamp():::TIMESTAMPTZ + '3 mons':::INTERVAL,
             |     CONSTRAINT events_pkey PRIMARY KEY (id ASC)
             | ) WITH (ttl = 'on', ttl_expire_after = '3 mons':::INTERVAL, ttl_job_cron = '@daily')
(1 row)

You can also use the following query:

icon/buttons/copy
SELECT relname, reloptions FROM pg_class WHERE relname = 'events';
  relname |                                reloptions
----------+---------------------------------------------------------------------------
  events  | NULL
  events  | {ttl='on',"ttl_expire_after='3 mons':::INTERVAL",ttl_job_cron='@daily'}
(2 rows)

Control how often the TTL job runs

Setting a TTL on a table controls when the rows therein are considered expired, but it only says that such rows may be deleted at any time after the expiration. To control how often the TTL deletion job runs, use the ttl_job_cron storage parameter, which supports CRON syntax. Cockroach Labs recommends setting ttl_job_cron to be equal to or longer than the gc.ttlseconds setting, which is the garbage collection interval for the cluster. The default value of gc.ttlseconds is 14400, or 4 hours. The CRON pattern for every four hours is '0 */4 * * *'.

To control the job interval at CREATE TABLE time, add the storage parameter as shown below:

icon/buttons/copy
CREATE TABLE tbl (
  id UUID PRIMARY KEY default gen_random_uuid(),
  value TEXT
) WITH (ttl_expire_after = '3 weeks', ttl_job_cron = '0 */4 * * *');
CREATE TABLE
Note:

To set the ttl_job_cron storage parameter when creating a table with Row-Level TTL, you must also set either the ttl_expire_after parameter or the ttl_expiration_expression parameter.

To update the TTL deletion job interval on a table that already has Row-Level TTL enabled, use ALTER TABLE:

icon/buttons/copy
ALTER TABLE tbl SET (ttl_job_cron = '@weekly');
ALTER TABLE

Pause the TTL job from running

To pause the TTL job from running on a table, use the ttl_pause storage parameter:

icon/buttons/copy
ALTER TABLE events SET (ttl_pause = 'on');
ALTER TABLE

If you run the TTL pausing statement above against a table that does not have TTL enabled, you will get the following error:

ERROR: "ttl_expire_after" must be set
SQLSTATE: 22023

Filter out expired rows from a selection query

To fetch only those rows from a table with table-wide TTL that have not yet expired their TTL, use the hidden crdb_internal_expiration column:

icon/buttons/copy
SELECT * FROM events WHERE crdb_internal_expiration > now();
icon/buttons/copy
                   id                  |    description    |        inserted_at
---------------------------------------+-------------------+-----------------------------
  6d25862e-2e48-4993-ac3a-a2abbebebf32 | yet another thing | 2022-03-17 20:01:56.138216
  a9404386-c4da-415f-b0b0-0dfad0f13c80 | a thing           | 2022-03-17 20:01:56.138216
  d4ebf8cd-e482-4abb-8968-2ba39c9197d9 | another thing     | 2022-03-17 20:01:56.138216
(3 rows)

To fetch only those rows from a table with a ttl_expiration_expression that have not yet expired their TTL, use the expired_at column you created earlier:

icon/buttons/copy
SELECT * FROM ttl_test_per_row WHERE expired_at > now();

Remove Row-Level TTL from a table

To drop the TTL on an existing table, reset the ttl storage parameter.

icon/buttons/copy
ALTER TABLE events RESET (ttl);

If both ttl_expire_after and ttl_expiration_expression are set, and you want to remove one or the other, you can use either of:

icon/buttons/copy
ALTER TABLE events RESET (ttl_expire_after);
icon/buttons/copy
ALTER TABLE events RESET (ttl_expiration_expression);

Disable TTL jobs for the whole cluster

To disable TTL jobs for the whole cluster, set the sql.ttl.job.enabled cluster setting to false:

icon/buttons/copy
SET CLUSTER SETTING sql.ttl.job.enabled = false;
SET CLUSTER SETTING

To view the cluster settings that control how Row-Level TTL works, issue the following query:

icon/buttons/copy
WITH x AS (SHOW CLUSTER SETTINGS) SELECT * FROM x WHERE variable LIKE 'sql.ttl.%';
              variable              | value | setting_type |                                 description
------------------------------------+-------+--------------+------------------------------------------------------------------------------
  sql.ttl.default_delete_batch_size | 100   | i            | default amount of rows to delete in a single query during a TTL job
  sql.ttl.default_delete_rate_limit | 0     | i            | default delete rate limit for all TTL jobs. Use 0 to signify no rate limit.
  sql.ttl.default_select_batch_size | 500   | i            | default amount of rows to select in a single query during a TTL job
  sql.ttl.job.enabled               | false | b            | whether the TTL job is enabled
(5 rows)

Common errors

If you attempt to update a TTL storage parameter on a table that does not have TTL enabled, you will get an error as shown below:

icon/buttons/copy
ALTER TABLE events SET (ttl_job_cron = '@weekly');
ERROR: "ttl_expire_after" and/or "ttl_expiration_expression" must be set
SQLSTATE: 22023

If you try to reset a TTL storage parameter but resetting that parameter would result in an invalid state of the TTL subsystem, CockroachDB will signal an error. For example, there is only one way to remove Row-Level TTL from a table. If you try to remove the TTL from a table by resetting the ttl_expire_after storage parameter you set earlier, you will get the following error:

icon/buttons/copy
ALTER TABLE tbl RESET (ttl_expire_after);
ERROR: "ttl_expire_after" and/or "ttl_expiration_expression" must be set
SQLSTATE: 22023

Changefeeds

Row-level TTL interacts with changefeeds in the following ways:

Backup and restore

Row-level TTL interacts with backup and restore in the following ways:

  • When you run a BACKUP, all row-level TTL information associated with the tables being backed up (including TTL expiration times) is also backed up.

  • When you RESTORE from a backup, all row-level TTL information associated with the tables being restored (including TTL expiration times) is also restored. Any expired rows in the restored tables are eligible to be deleted by the TTL job.

Required Privileges

To add or update Row-Level TTL settings on a table, you must have one of the following:

Migrating TTL usage from earlier versions of CockroachDB

If you are migrating your TTL usage from an earlier version of CockroachDB, the ttl_expire_after and ttl_expiration_expression storage parameters can co-exist where the ttl_expire_after creates the crdb_internal_expiration column and ttl_expiration_expression overrides the default value of crdb_internal_expiration.

Limitations

  • You cannot use foreign keys to create references to or from a table that uses Row-Level TTL. cockroachdb/cockroach#76407
  • Any queries you run against tables with Row-Level TTL enabled do not filter out expired rows from the result set (this includes UPDATEs and DELETEs). This feature may be added in a future release. For now, follow the instructions in Filter out expired rows from a selection query.
  • Enabling Row-Level TTL on a table with multiple secondary indexes can have negative performance impacts on a cluster, including increased latency and contention. This is particularly true for large tables with terabytes of data and billions of rows that are split up into multiple ranges across separate nodes.
    • Increased latency may occur because secondary indexes aren't necessarily stored on the same underlying ranges as a table's primary indexes. Further, the secondary indexes' ranges may have leaseholders located on different nodes than the primary index.
    • Increased contention may occur because intents must be written as part of performing the deletions.
    • Finally, secondary indexes can also have a negative impact on the overall performance of TTL jobs. According to internal testing, the TTL job processing rate is worse on tables with secondary indexes. If you encounter this situation, decreasing the ttl_delete_batch_size storage parameter may help by decreasing the number of ranges that need to be accessed by the job.

See also


Yes No
On this page

Yes No