CREATE SCHEDULE FOR CHANGEFEED
allows you to create a scheduled changefeed to export data out of CockroachDB. Scheduled changefeeds have the scale, observability, and endpoint sink options that changefeed jobs include, with the convenience of setting a regular schedule. A changefeed job created with CREATE SCHEDULE FOR CHANGEFEED
performs a one-time table scan using the initial scan functionality to create an export of your table data.
For more detail on using changefeeds to create an export of your table data, see Export Data with Changefeeds.
Required privileges
Starting in v22.2, CockroachDB introduces a new system-level privilege model that provides finer control over a user's privilege to work with the database, including creating and managing changefeeds.
There is continued support for the legacy privilege model for changefeeds in v23.1, however it will be removed in a future release of CockroachDB. We recommend implementing the new privilege model that follows in this section for all changefeeds.
You can grant a user the CHANGEFEED
privilege to allow them to create changefeeds on a specific table:
GRANT CHANGEFEED ON TABLE example_table TO user;
When you grant a user the CHANGEFEED
privilege on a set of tables, they can:
- Create changefeeds on the target tables even if the user does not have the
CONTROLCHANGEFEED
role option or theSELECT
privilege on the tables. - Manage the changefeed jobs running on the tables using the
SHOW CHANGEFEED JOB
,PAUSE JOB
,RESUME JOB
, andCANCEL JOB
commands.
These users will be able to create changefeeds, but they will not be able to run a SELECT
query on that data directly. However, they could still read this data indirectly if they have read access to the sink.
To restrict a user's access to changefeed data and sink credentials, enable the changefeed.permissions.require_external_connection_sink.enabled
cluster setting. When you enable this setting, users with the CHANGEFEED
privilege on a set of tables can only create changefeeds into external connections.
Privilege model
The following summarizes the operations users can run when they have changefeed privileges on a table:
Granted privileges | Usage |
---|---|
CHANGEFEED |
Create changefeeds on tables. Manage changefeed jobs on tables. |
CHANGEFEED + USAGE on external connection |
Create changefeeds on tables to an external connection URI. Manage changefeed jobs on tables. Note: If you need to manage access to changefeed sink URIs, set the changefeed.permissions.require_external_connection_sink.enabled=true cluster setting. This will mean that users with these privileges can only create changefeeds on external connections. |
SELECT |
Create a sinkless changefeed that emits messages to a SQL client. |
Deprecated CONTROLCHANGEFEED role option + SELECT |
Create changefeeds on tables. |
You can add CHANGEFEED
to the user or role's default privileges with ALTER DEFAULT PRIVILEGES
:
ALTER DEFAULT PRIVILEGES GRANT CHANGEFEED ON TABLES TO user;
Users with the CONTROLCHANGEFEED
role option must have SELECT
on each table, even if they are also granted the CHANGEFEED
privilege. The CONTROLCHANGEFEED
role option will be deprecated in a future release.
Legacy privilege model
To create a changefeed, the user must be a member of the admin
role or have the CREATECHANGEFEED
parameter set.
Synopsis
Parameters
Parameter | Description |
---|---|
IF NOT EXISTS |
A scheduled changefeed should not be created if the schedule_label already exists. You will receive an error if the schedule label already exists, or if schedule_label is not defined when using IF NOT EXISTS . |
schedule_label |
The name for the scheduled changefeed. This is optional and does not need to be unique. If you do not define a name, the label will default to CHANGEFEED with the timestamp of when you created the schedule. |
changefeed_targets |
The tables to target with the changefeed. For example, movr.users, movr.rides . |
changefeed_sink |
The changefeed sink URI. |
changefeed_option |
The options to control the behavior of your changefeed. For example, WITH format = csv, full_table_name . See Changefeed options for a list of available options. |
target_list |
The columns to emit data from if you're using a CDC query expression. |
insert_target |
The target tables for the changefeed if you're using a CDC query expression. |
where_clause |
An optional WHERE clause to apply filters to the table if you're using a CDC query expression. |
crontab |
The frequency of the changefeed. The schedule is specified as a STRING in crontab format. All times in UTC. For example, '@daily' , '@hourly' , '1 0 * * *' . |
schedule_option |
The schedule options to control the schedule's behavior. For example, first_run = now . See Schedule options. |
Changefeed options
You can include the changefeed options listed on the CREATE CHANGEFEED
page to modify the behavior of your changefeed. The following options are not compatible with scheduled changefeeds:
diff
end_time
mvcc_timestamp
resolved
updated
Scheduled changefeeds have the initial_scan = 'only'
option included implicitly. You cannot specify initial_scan
as 'yes'
or 'no'
.
Schedule options
Option | Value | Description |
---|---|---|
first_run |
TIMESTAMP / now |
Execute the first run of the schedule at this time. If you do not specify first_run , the schedule will execute based on the next RECURRING time set by the crontab. |
on_execution_failure |
retry / reschedule / pause |
Determine how the schedule handles an error. retry : Retry the changefeed immediately.reschedule : Reschedule the changefeed based on the RECURRING expression.pause : Pause the schedule. This requires that you resume the schedule manually.Default: reschedule |
on_previous_running |
start / skip / wait |
Control whether the changefeed schedule should start a changefeed if the previous scheduled changefeed is still running.start : Start the new changefeed anyway, even if the previous one is running.skip : Skip the new changefeed and run the next changefeed based on the RECURRING expression.wait : Wait for the previous changefeed to complete.Default: wait |
New in v24.2: To avoid multiple clusters running the same schedule concurrently, changefeed schedules will pause when restored onto a different cluster or after physical cluster replication has completed.
Examples
Before running any of the examples in this section, it is necessary to enable the kv.rangefeed.enabled
cluster setting. If you are working on a CockroachDB Standard or Basic cluster, this cluster setting is enabled by default.
The Changefeed Sinks page provides detail on the available sinks for your change data messages and connection URIs. We recommend using external connections to interact with external sinks. The examples in this section use an external connection URI for the changefeed sink.
Create a scheduled changefeed
The following statement sets up a scheduled changefeed named users_rides_nightly
that will send changefeed messages in CSV
format 1 minute past midnight every night. As soon as the statement is run, the first changefeed run will execute immediately:
CREATE SCHEDULE users_rides_nightly FOR CHANGEFEED users, rides INTO 'external://kafka-sink' WITH format=csv RECURRING '1 0 * * *' WITH SCHEDULE OPTIONS first_run=now, on_execution_failure=retry, on_previous_running=skip;
The schedule options control the schedule's behavior:
- If it runs into an error,
on_execution_failure=retry
will ensure that the schedule retries the changefeed immediately. - If the previous scheduled changefeed is still running,
on_previous_running=skip
will skip a new changefeed at the next scheduled time.
The output will confirm that the changefeed has added the initial_scan = 'only'
option implicitly:
schedule_id | label | status | first_run | schedule | changefeed_stmt
---------------------+---------------+--------+-------------------------------+-----------+---------------------------------------------------------------------------------------------------------------------------------------------------------
947257854259855361 | users_nightly | ACTIVE | 2024-02-28 20:02:35.716699+00 | 1 0 * * * | CREATE CHANGEFEED FOR TABLE movr.public.users, TABLE movr.public.rides INTO 'external://kafka-sink' WITH OPTIONS (format = 'csv', initial_scan = 'only')
(1 row)
NOTICE: added missing initial_scan='only' option to schedule changefeed
Create a scheduled changefeed with CDC queries
You can use CDC queries with scheduled changefeeds to define expression syntax that selects columns and applies filters to further restrict or transform the data in your changefeed messages. When you add this expression syntax to your changefeed statement, you can only target one table.
For guidance on syntax and more example use cases, see Change Data Capture Queries.
This scheduled changefeed filters for the usage of promotion codes in the movr
database and sends the changefeed messages on a daily basis:
CREATE SCHEDULE promo_code FOR CHANGEFEED INTO 'external://kafka-sink' AS SELECT user_id, usage_count FROM movr.user_promo_codes WHERE usage_count > 1 RECURRING '@daily' WITH SCHEDULE OPTIONS first_run=now, on_execution_failure=reschedule, on_previous_running=skip;
View scheduled changefeed details
To show all scheduled changefeeds:
SHOW SCHEDULES FOR CHANGEFEED;
To view the details of only running scheduled changefeeds:
SHOW RUNNING SCHEDULES FOR CHANGEFEED;
To view the details of only paused scheduled changefeeds:
SHOW PAUSED SCHEDULES FOR CHANGEFEED;
To view the details of a specific scheduled changefeed:
SHOW SCHEDULE {schedule ID};
To pause a scheduled changefeed:
PAUSE SCHEDULE {schedule ID};
To resume a scheduled changefeed:
RESUME SCHEDULE {schedule ID};
To delete a scheduled changefeed:
DROP SCHEDULE {schedule ID};
To see the full CREATE SCHEDULE
statement for the scheduled changefeed:
SHOW CREATE SCHEDULE {schedule ID};