Hash-sharded Indexes

On this page Carat arrow pointing down

If you are working with a table that must be indexed on sequential keys, you should use hash-sharded indexes. Hash-sharded indexes distribute sequential traffic uniformly across ranges, eliminating single-range hotspots and improving write performance on sequentially-keyed indexes at a small cost to read performance.

Warning:

This is an experimental feature. The interface and output are subject to change.

How hash-sharded indexes work

CockroachDB automatically splits ranges of data in the key-value store based on the size of the range, and on the load streaming to the range. To split a range based on load, the system looks for a point in the range that evenly divides incoming traffic. If the range is indexed on a column of data that is sequential in nature (e.g., an ordered sequence, or a series of increasing, non-repeating TIMESTAMPs), then all incoming writes to the range will be the last (or first) item in the index and appended to the end of the range. As a result, the system cannot find a point in the range that evenly divides the traffic, and the range cannot benefit from load-based splitting, creating a hotspot on the single range.

For details about the mechanics and performance improvements of hash-sharded indexes in CockroachDB, see our Hash Sharded Indexes Unlock Linear Scaling for Sequential Workloads blog post.

How to create a hash-sharded index

To create a hash-sharded index, set the experimental_enable_hash_sharded_indexes session variable to on. Then, add the optional USING HASH WITH BUCKET_COUNT = n_buckets clause to a CREATE INDEX statement, to an INDEX definition in a CREATE TABLE statement, or to an ALTER PRIMARY KEY statement. When this clause is used, CockroachDB creates n_buckets computed columns, shards the index into n_buckets shards, and then stores each index shard in the underlying key-value store with one of the computed column's hash as its prefix.

To change the bucket size of an existing hash-sharded primary key index, use an ALTER PRIMARY KEY statement with a USING HASH WITH BUCKET_COUNT = n_buckets clause that specifies the new bucket size and the existing primary key columns.

Note:

Hash-sharded indexes cannot be interleaved.

Examples

For an example of a hash-sharded index, see Create a hash-sharded secondary index.

See also


Yes No
On this page

Yes No