cockroach debug zip

The cockroach debug zip command connects to your cluster and gathers information from each active node into a single .zip file (inactive nodes are not included). For details on the .zip contents, see Files.

You can use the cockroach debug merge-logs command in conjunction with cockroach debug zip to merge the collected logs into one file, making them easier to parse.
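
As a rough sketch of that workflow, the helper below merges the logs from an archive that has already been extracted. It assumes the archive unpacks to a top-level debug/ directory with per-node logs under debug/nodes/{node ID}/logs/, so check the layout of your own archive first. The function name merge_extracted_logs and the COCKROACH_BIN variable are hypothetical and exist only for this illustration.

```shell
# Hypothetical helper: merge the per-node logs of an extracted debug zip.
# Assumed layout (verify against your archive): <dir>/debug/nodes/<node ID>/logs/
merge_extracted_logs() {
  extract_dir=$1
  out_file=$2
  # The glob is deliberately left unquoted so the shell expands it to the log files.
  "${COCKROACH_BIN:-cockroach}" debug merge-logs \
    "$extract_dir"/debug/nodes/*/logs/* > "$out_file"
}

# Example (not run here): merge_extracted_logs ./extracted ./merged.log
```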

Warning:

The files produced by cockroach debug zip can contain highly sensitive, personally identifiable information (PII), such as usernames, hashed passwords, and possibly table data. If you intend to share the .zip file with Cockroach Labs, use the --redact flag to redact sensitive data (excluding range keys) when generating it.

Details

Use cases

Warning:

cockroach debug zip is an expensive operation and impacts cluster performance.

Only use this command as an emergency measure under the guidance of Cockroach Labs.

In particular, fetching stack traces for all goroutines is a "stop-the-world" operation, which can momentarily but significantly increase SQL service latency. Exclude these goroutine stacks by using the --include-goroutine-stacks=false flag.
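
A minimal sketch of a wrapper that always skips the goroutine stack collection is shown below. The function name zip_without_stacks and the COCKROACH_BIN variable are hypothetical. Only the --include-goroutine-stacks=false flag comes from this page, and any further flags are passed through unchanged.

```shell
# Hypothetical wrapper: collect a debug zip without the "stop-the-world"
# goroutine stack fetch. Extra arguments are forwarded as given.
zip_without_stacks() {
  dest=$1
  shift
  "${COCKROACH_BIN:-cockroach}" debug zip "$dest" \
    --include-goroutine-stacks=false "$@"
}

# Example (not run here):
#   zip_without_stacks ./debug.zip --insecure --host=200.100.50.25
```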

There are two scenarios in which debug zip is useful:

  • If you experience severe or difficult-to-reproduce issues with your cluster, Cockroach Labs might ask you to send us your cluster's debugging information using cockroach debug zip. We recommend reducing the .zip file size by retrieving debugging information only for the time range relevant to the issue, using the --files-from and/or --files-until flags.

  • To collect all of your nodes' logs, which you can then parse to locate issues. You can optionally use the file-filtering flags (for example, --include-files) to retrieve only the log files. For more information about logs, see Logging. Also note:

    • Nodes that are currently down cannot deliver their logs over the network. For these nodes, you must log on to the machine where the cockroach process would otherwise be running, and gather the files manually.
    • Nodes that are currently up but disconnected from other nodes (e.g., because of a network partition) may not be able to respond to debug zip requests forwarded by other nodes, but can still respond to requests for data when asked directly. In such situations, we recommend using the --host flag to point debug zip at each of the disconnected nodes until data has been gathered for the entire cluster.
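
The per-node approach described for partitioned clusters could be scripted roughly as follows. This is a sketch under assumptions: the function name zip_each_host and the COCKROACH_BIN variable are hypothetical, the node addresses are supplied by you, and connection security is taken from the default certificate directory or from environment variables such as COCKROACH_INSECURE.

```shell
# Hypothetical helper: point debug zip at each node address in turn and
# write one archive per address into out_dir.
zip_each_host() {
  out_dir=$1
  shift
  mkdir -p "$out_dir" || return 1
  for host in "$@"; do
    # Replace ':' in host:port so the archive name is safe on any filesystem.
    dest="$out_dir/debug-$(printf '%s' "$host" | tr ':' '_').zip"
    "${COCKROACH_BIN:-cockroach}" debug zip "$dest" --host="$host" \
      || echo "debug zip failed for $host" >&2
  done
}

# Example (not run here): zip_each_host ./zips 10.0.0.1:26257 10.0.0.2:26257
```

A failure on one address is reported on stderr and the loop continues, so a single unreachable node does not stop collection from the others.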

Files

cockroach debug zip collects log files, heap profiles, CPU profiles, and goroutine dumps from the last 48 hours, by default.

Tip:

These files can greatly increase the size of the cockroach debug zip output. To limit the .zip file size for a large cluster, we recommend first experimenting with cockroach debug list-files and then using flags to filter the files.
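
One way to follow this advice is to run list-files and zip with the same filter flags, so the preview matches what is collected. The sketch below assumes both subcommands accept the same filtering and connection flags. The function name preview_then_zip and the COCKROACH_BIN variable are hypothetical.

```shell
# Hypothetical helper: show which files a set of filter flags selects,
# then ask for confirmation before generating the archive.
preview_then_zip() {
  dest=$1
  shift
  # The same flags go to both subcommands so the preview matches the result.
  "${COCKROACH_BIN:-cockroach}" debug list-files "$@" || return 1
  printf 'Proceed with debug zip? [y/N] ' >&2
  read -r answer
  [ "$answer" = "y" ] || return 0
  "${COCKROACH_BIN:-cockroach}" debug zip "$dest" "$@"
}

# Example (not run here):
#   preview_then_zip ./debug.zip --include-files='*.log' --insecure
```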

The following files collected by cockroach debug zip, which are found in the individual node directories, can be filtered using the --exclude-files, --include-files, --files-from, and/or --files-until flags:

Information | Filename
Log files | cockroach-{log-file-group}.{host}.{user}.{start timestamp in UTC}.{process ID}.log
Goroutine dumps | goroutine_dump.{date-and-time}.{metadata}.double_since_last_dump.{metadata}.txt.gz
Heap profiles | memprof.{date-and-time}.{heapsize}.pprof
Memory statistics | memstats.{date-and-time}.{heapsize}.txt
CPU profiles | cpuprof.{date-and-time}
Active query dumps | activequeryprof.{date-and-time}.csv

The following information is also contained in the .zip file, and cannot be filtered:

  • System tables. The following system tables are not included:
    • system.users
    • system.web_sessions
    • system.join_tokens
    • system.comments
    • system.ui
    • system.zones
    • system.statement_bundle_chunks
    • system.statement_statistics
    • system.transaction_statistics
  • Cluster events
  • Database details
  • Schema change events
  • Database, table, node, and range lists
  • Node details
  • Node liveness
  • Gossip data
  • Stack traces
  • Range details
  • Jobs
  • Cluster Settings
  • Metrics
  • Replication Reports
  • CPU profiles
  • A script (hot-ranges.sh) that summarizes the hottest ranges (ranges receiving a high number of reads or writes)

Subcommands

While the cockroach debug command has a few subcommands, users are expected to use only the zip, encryption-active-key, merge-logs, list-files, tsdump, and ballast subcommands.

We recommend using the encryption-decrypt and job-trace subcommands only when directed by the Cockroach Labs support team.

The other debug subcommands are useful only to Cockroach Labs. Output of debug commands may contain sensitive or secret information.

Synopsis

$ cockroach debug zip {ZIP file destination} {flags}
Note:

The following flags must apply to an active CockroachDB node. If no nodes are live, you must start at least one node.

Flags

The debug zip subcommand supports the following general-use, client connection, and logging flags.

General

Flag Description
--cpu-profile-duration Fetch CPU profiles from the cluster with the specified sample duration in seconds. The debug zip command will block for the duration specified. A value of 0 disables this feature.

Default: 5s
--concurrency The maximum number of nodes to concurrently poll for data. This can be any value between 1 and 15.
--exclude-files Files to exclude from the generated .zip. This can be used to limit the size of the generated .zip, and affects logs, heap profiles, goroutine dumps, and/or CPU profiles. The files are specified as a comma-separated list of glob patterns. For example:

--exclude-files=*.log

Note that this flag is applied after --include-files. Use cockroach debug list-files with this flag to see a list of files that will be contained in the .zip.
--exclude-nodes Specify nodes to exclude from inspection as a comma-separated list or range of node IDs. For example:

--exclude-nodes=1,10,13-15
--files-from Start timestamp for log file, goroutine dump, and heap profile collection. This can be used to limit the size of the generated .zip, which is increased by these files. The timestamp uses the format YYYY-MM-DD, followed optionally by HH:MM:SS or HH:MM. For example:

--files-from='2021-07-01 15:00'

When specifying a narrow time window, we recommend adding extra seconds/minutes to account for uncertainties such as clock drift.

Default: 48 hours before now
--files-until End timestamp for log file, goroutine dump, and heap profile collection. This can be used to limit the size of the generated .zip, which is increased by these files. The timestamp uses the format YYYY-MM-DD, followed optionally by HH:MM:SS or HH:MM. For example:

--files-until='2021-07-01 16:00'

When specifying a narrow time window, we recommend adding extra seconds/minutes to account for uncertainties such as clock drift.

Default: 24 hours beyond now (to include files created during .zip creation)
--include-files Files to include in the generated .zip. This can be used to limit the size of the generated .zip, and affects logs, heap profiles, goroutine dumps, and/or CPU profiles. The files are specified as a comma-separated list of glob patterns. For example:

--include-files=*.pprof

Note that this flag is applied before --exclude-files. Use cockroach debug list-files with this flag to see a list of files that will be contained in the .zip.
--include-goroutine-stacks Fetch stack traces for all goroutines running on each targeted node in nodes/*/stacks.txt and nodes/*/stacks_with_labels.txt files. Note that fetching stack traces for all goroutines is a "stop-the-world" operation, which can momentarily have negative impacts on SQL service latency. Exclude these goroutine stacks by using the --include-goroutine-stacks=false flag. Note that any periodic goroutine dumps previously taken on the node will still be included in nodes/*/goroutines/*.txt.gz, as these would have already been generated and don't require any additional stop-the-world operations to be collected.

Default: true
--include-range-info Include one file per node with information about the KV ranges stored on that node, in nodes/{node ID}/ranges.json.

This information can be vital when debugging issues that involve the KV layer (which includes everything below the SQL layer), such as data placement, load balancing, performance or other behaviors. In certain situations, on large clusters with large numbers of ranges, these files can be omitted if and only if the issue being investigated is already known to be in another layer of the system (for example, an error message about an unsupported feature or incompatible value in a SQL schema change or statement). However, many higher-level issues are ultimately related to the underlying KV layer described by these files. Only set this to false if directed to do so by Cockroach Labs support.

In addition, include problem ranges information in reports/problemranges.json.

Default: true
--include-running-job-traces Include information about each traceable job that is running or reverting (such as backup, restore, import, physical cluster replication) in jobs/*/*/trace.zip files. This involves collecting cluster-wide traces for each running job in the cluster.

Default: true
--nodes Specify nodes to inspect as a comma-separated list or range of node IDs. For example:

--nodes=1,10,13-15
--redact Redact sensitive data from the generated .zip, with the exception of range keys, which must remain unredacted because they are essential to support CockroachDB. This flag replaces the deprecated --redact-logs flag, which only applied to log messages contained within the .zip.

To redact hostnames and IP addresses in .json files, such as status.json, details.json, and ranges.json, you will also need to enable the cluster setting debug.zip.redact_addresses.enabled. Note that enabling this cluster setting will not redact all hostnames and IP addresses in the nodes.json and gossip.json files.

For examples, refer to Redact sensitive information.
--redact-logs Deprecated. Redact sensitive data from collected log files only. Use the --redact flag instead, which redacts sensitive data across the entire generated .zip as well as the collected log files. Passing the --redact-logs flag will be interpreted as the --redact flag.
--timeout In the process of generating a debug zip, many internal requests are made. Each request is allowed the maximum duration specified by the timeout. If an internal request does not complete within the timeout duration, an error is displayed for that request and its artifact is not included in the zip file.

The timeout is suffixed with s (seconds), m (minutes), or h (hours).

Default: 60s

Client connection

Flag Description
--cert-principal-map A comma-separated list of <cert-principal>:<db-principal> mappings. This allows mapping the principal in a cert to a DB principal such as node or root or any SQL user. This is intended for use in situations where the certificate management system places restrictions on the Subject.CommonName or SubjectAlternateName fields in the certificate (e.g., disallowing a CommonName like node or root). If multiple mappings are provided for the same <cert-principal>, the last one specified in the list takes precedence. A principal not specified in the map is passed through as-is via the identity function. A cert is allowed to authenticate a DB principal if the DB principal name is contained in the mapped CommonName or DNS-type SubjectAlternateName fields.
--certs-dir The path to the certificate directory containing the CA and client certificates and client key.

Env Variable: COCKROACH_CERTS_DIR
Default: ${HOME}/.cockroach-certs/
--cluster-name The cluster name to use to verify the cluster's identity. If the cluster has a cluster name, you must include this flag. For more information, see cockroach start.
--disable-cluster-name-verification Disables the cluster name check for this command. This flag must be paired with --cluster-name. For more information, see cockroach start.
--host The server host and port number to connect to. This can be the address of any node in the cluster.

Env Variable: COCKROACH_HOST
Default: localhost:26257
--insecure Use an insecure connection.

Env Variable: COCKROACH_INSECURE
Default: false
--url A connection URL to use instead of the other arguments. To convert a connection URL to the syntax that works with your client driver, run cockroach convert-url.

Env Variable: COCKROACH_URL
Default: no URL

Logging

By default, this command logs messages to stderr. This includes events with WARNING severity and higher.

If you need to troubleshoot this command's behavior, you can customize its logging behavior.

Examples

Generate a debug zip file

Generate the debug zip file for an insecure cluster:

$ cockroach debug zip ./cockroach-data/logs/debug.zip --insecure --host=200.100.50.25

Generate the debug zip file for a secure cluster:

$ cockroach debug zip ./cockroach-data/logs/debug.zip --host=200.100.50.25
Note:

Secure examples assume you have the appropriate certificates in the default certificate directory, ${HOME}/.cockroach-certs/.

Generate a debug zip file for a time range

Generate a debug zip file containing only debugging information for a specified time range:

$ cockroach debug zip ./cockroach-data/logs/debug.zip --files-from='2023-10-03 13:30' --files-until='2023-10-03 14:30'
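
When the incident times are known as Unix epoch seconds, a padded window can be computed instead of typed by hand. The sketch below assumes GNU date (for the -d "@epoch" form) and assumes the timestamps should be expressed in UTC. The helper name utc_timestamp, the variable names, and the 5-minute padding are illustrative choices, and the epoch values correspond to the 13:30 to 14:30 window used in this example.

```shell
# Format an epoch time as "YYYY-MM-DD HH:MM:SS" in UTC (GNU date assumed).
utc_timestamp() {
  date -u -d "@$1" '+%Y-%m-%d %H:%M:%S'
}

incident_start=1696339800   # 2023-10-03 13:30:00 UTC
incident_end=1696343400     # 2023-10-03 14:30:00 UTC
pad=300                     # extra seconds on each side for clock drift

files_from=$(utc_timestamp "$((incident_start - pad))")
files_until=$(utc_timestamp "$((incident_end + pad))")

# Print the resulting command rather than running it.
echo "cockroach debug zip ./debug.zip --files-from='$files_from' --files-until='$files_until'"
```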

Generate a debug zip file with logs only

Generate a debug zip file containing only log files:

$ cockroach debug zip ./cockroach-data/logs/debug.zip --include-files='*.log'
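
To reason about how --include-files and --exclude-files combine, the following local model may help. It is not the cockroach implementation: it assumes a file is kept when it matches at least one include pattern (or no include pattern is given) and matches no exclude pattern, and it uses shell pattern matching, which may differ in detail from the glob matching that cockroach performs. The function name would_collect is hypothetical.

```shell
# Hypothetical model of the filter order: includes first, then excludes.
# Usage: would_collect FILE INCLUDE_LIST EXCLUDE_LIST (comma-separated globs)
# The body runs in a subshell, so set -f and IFS do not leak to the caller.
would_collect() (
  file=$1
  set -f     # keep the patterns literal while the lists are split
  IFS=,      # both lists are comma-separated
  verdict=keep
  if [ -n "$2" ]; then
    verdict=skip
    for pat in $2; do
      case "$file" in $pat) verdict=keep ;; esac
    done
  fi
  for pat in $3; do
    case "$file" in $pat) verdict=skip ;; esac
  done
  echo "$verdict"
)

would_collect cockroach.log '*.log,*.pprof' 'cockroach*'
```

In this model the exclude list has the final say, which is consistent with --exclude-files being applied after --include-files. For an authoritative answer, use cockroach debug list-files with the same flags.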

Redact sensitive information

Log redaction

Example of a log string without redaction enabled:

server/server.go:1423 ⋮ password of user ‹admin› was set to ‹"s3cr34?!@x_"›

Enable log redaction:

$ cockroach debug zip ./cockroach-data/logs/debug.zip --redact --insecure --host=200.100.50.25

The same log string with redaction enabled:

server/server.go:1423 ⋮ password of user ‹×› was set to ‹×›

Hostname and IP address redaction

Example of status.json without hostname and IP address redaction enabled:

{
  "node_id": 1,
  "address": {
    "network_field": "tcp",
    "address_field": "200.100.50.25:26257"
  },
  "sql_address": {
    "network_field": "tcp",
    "address_field": "200.100.50.25:26257"
  }
}

First, enable the cluster setting:

SET CLUSTER SETTING debug.zip.redact_addresses.enabled = true;
Note:

Enabling the debug.zip.redact_addresses.enabled cluster setting will not redact all hostnames and IP addresses in the nodes.json and gossip.json files.

Then, generate the .zip file with log redaction as well as hostname and IP address redaction:

$ cockroach debug zip ./cockroach-data/logs/debug.zip --redact --insecure --host=200.100.50.25

status.json with hostname and IP address redaction:

{
  "node_id": 1,
  "address": {
    "network_field": "tcp",
    "address_field": "‹×›"
  },
  "sql_address": {
    "network_field": "tcp",
    "address_field": "‹×›"
  }
}
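
Before sharing an archive, it can be worth checking the extracted files for values that still look like addresses. The following is a minimal sketch, assuming the archive has already been extracted to a directory. The function name find_unredacted_ips is hypothetical, and the pattern only catches IPv4-shaped strings, so it can miss hostnames and can flag unrelated dotted numbers.

```shell
# Hypothetical check: list .json files under a directory that still contain
# something shaped like an IPv4 address.
find_unredacted_ips() {
  extract_dir=$1
  find "$extract_dir" -name '*.json' \
    -exec grep -lE '([0-9]{1,3}\.){3}[0-9]{1,3}' {} +
}

# Example (not run here): find_unredacted_ips ./extracted
```

Because the cluster setting does not redact every address in nodes.json and gossip.json, those files may still appear in the output.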
