What if you could get all the advantages of Postgres, PLUS automated sharding, elastic scaling, and a multi-active setup that allows every node to accept reads AND writes for extreme resilience?
You can.Enforcing strict data types is one of the major advantages of relational databases, and PostgreSQL is one of the most popular open-source relational database options. In this article, we’ll look at many of the most commonly used data types in Postgres, how they’re used, and even how they map to more advanced distributed SQL databases.
(Note that when in doubt, you should always refer to Postgres’s official documentation for the latest information).
Data type | Size | Values allowed |
---|---|---|
boolean |
1 byte | true and false (plus aliases, see below) |
The boolean
data type stores true/false values. Note that this datatype also allows inputs of yes
, on
, and 1
(all equivalent to true
) and no
, off
, and 0
(all equivalent to false
).
Data type | Values allowed |
---|---|
character varying(n) , varchar(n) |
Any characters, maximum length of n characters |
character(n) , char(n) , bpchar(n) |
Any characters, maximum length of n characters, blank-padded |
bpchar |
Any characters, no length limit, blank-trimmed |
text |
Any characters, no length limit |
All of these data types are used for storing strings. Generally speaking, varchar(n)
is the right choice if you need to store strings that you know will remain constrained to a fixed length, and text
is the right choice otherwise.
The difference between char(n)
(and its aliases) and varchar(n)
(and its aliases) is that char(n)
is blank-padded, meaning that any strings with fewer than n
characters will have spaces appended to the end until they reach n
length.
So, for example, if we created two columns in a Postgres table, one with the data type char(3)
and one varchar(3)
, and then we added the string hi
to a row in both columns, the char(3)
column would store that as hi
(with a space appended to the end to make it three characters), whereas the varchar(3)
column would store it as hi
(no spaces appended).
Data type | Size | Values allowed |
---|---|---|
smallint |
2 bytes | -32768 to +32767 |
integer |
4 bytes | -2147483648 to +2147483647 |
bigint |
8 bytes | -9223372036854775808 to +9223372036854775807 |
As you can see, all of these data types store integer values. The difference between them is the range of values they can store, and consequently the amount of storage space they take up.
For most use cases, integer
is probably the right choice, but smallint
will save space if you know your values will all fit in that range, and bigint
is there for those who need to store truly massive numbers.
Data type | Size | Values allowed |
---|---|---|
smallserial |
2 bytes | 1 to 32767 |
serial |
4 bytes | 1 to 2147483647 |
bigserial |
8 bytes | 1 to 9223372036854775807 |
These data types are all used to generate auto-incrementing numbers. Again, the only difference between them is the amount of space they take up, and again, there are specific small
and big
options, but serial
is probably the best choice for most use cases.
Data type | Size | Values allowed |
---|---|---|
decimal |
variable | 131,072 digits before the decimal point; 16,383 after it |
numeric |
variable | 131,072 digits before the decimal point; 16,383 after it |
real |
4 bytes | 6 decimal digits |
double precision |
8 bytes | 15 decimal digits |
These data types all allow for the storage of numbers with decimals (i.e., not integers). decimal
and numeric
are the same, and should be used when precision is required, such as in financial applications. real
and double precision
take up less space, but also allow for fewer decimal places and thus may introduce imprecision, depending on the values you’re storing.
Data type | Size | Values allowed |
---|---|---|
timestamp (and timestamptz ) |
8 bytes | 4713 BCE to 294276 ACE |
date |
4 bytes | 4713 BCE 5874897 ACE |
time |
8 bytes | 00:00:00 to 24:00:00 |
timetz |
12 bytes | 00:00:00+1559 to 24:00:00-1559 |
interval |
16 bytes | -178000000 years to 178000000 years |
These data types store time and duration data. Note that timetz
and timestamptz
can also be written as time with timezone
and timestamp with timezone
, respectively.
All time data types except date
allow for microsecond-level precision, and can accept an optional provision value p
(e.g. timetz [(p)]
) to specify the number of fractional digits to store in the seconds field.
date
, for reasons that are probably obvious, stores values with a precision of one day.
Note also that in addition to traditional dates and times, date
and timestamp
can accept some special (and self-explanatory) values such as now
, yesterday
, tomorrow
, etc. now
will be stored as the current time; yesterday
, today
, and tomorrow
are stored as 0:00 UTC on the relevant date.
The uuid
data type is a special type of ID. We’ve got a whole blog post on UUIDs with details on what they are, how they’re used, the different types of UUIDs, etc., so we won’t repeat all that here. The short version, however, is this: UUIDs provide ID values that are highly likely to be unique, which is useful for distributed systems (see that blog post for lots of details), or any system that is likely to merge new data at any point.
(Imagine, for example, your company has a database with sequential IDs using serial
. You then acquire another company and need to merge the two databases, but discover the other company has also used serial
IDs. Now you have a bit of a mess to sort out since you can’t have the same row ID for two separate rows, and each database will have a 1
row, a 2
row, etc. Conversely, if both databases used UUIDs, you could easily merge the two with essentially no chance that two rows could have the same id value.)
Postgres can store JSON data in two formats: json
and jsonb
.
These two formats are effectively identical, except that jsonb
is stored in a decomposed binary format that makes it slower to write but faster to read. jsonb
also supports indexing, whereas json
does not.
Postgres allows you to store most data types as arrays – in other words, lists of values of that data type. Arrays are declared using []
, so for example, here’s how to create a table of users that allows for multiple phone numbers to be input:
CREATE TABLE users (
user_id uuid PRIMARY KEY,
name TEXT
phones TEXT []
);
Array values would be inserted into this table like so:
INSERT INTO users (name, phones)
VALUES('Ellen Ripley', ARRAY ['(555)-555-5555','(555)-555-5556']);
To declare array values, we can also use{}
like so:
INSERT INTO users (name, phones)
VALUES('Ellen Ripley', '{“(555)-555-5555”,”(555)-555-5556”}’);
Postgres also contains quite a few other data types, although they’re typically only needed in specific circumstances. For example, the tsvector
and tsquery
data types store lexemes (words normalized such that different variants of the word are stored as one word). These data types are useful for efficient text search.
Postgres also provides specific data types for network addresses, geometric shapes, enumerated types, XML types, and more. There are also a number of pseudo-types, which cannot be used as the data type for a column but can be used to define a function’s argument or results.
CockroachDB is a distributed SQL database that’s Postgres compatible – imagine Postgres that was designed for the cloud, that auto-shards and can accept writes on every shard, and that can scale elastically. (Sound interesting? Check it out!)
CockroachDB supports Postgres’s data types, although in some cases it uses different names. For example, varchar
and char
in PostgreSQL are equivalent to STRING
in CockroachDB.
For the full list of data types in CockroachDB, refer to our documentation.
Note also that because CockroachDB is a distributed database, the storage size and performance notes in this article may not be the same. Since CockroachDB is Postgres-compatible, migrating a Postgres database onto CockroachDB is quite straightforward, but some schema adjustment may be required to maximize performance (But don’t worry, we have an automated migration tool to help with this).
High CPU usage can bring your database – and with it, your application – grinding to a halt.
This is, unfortunately, a …
Read moreMost SQL content on the web seems to be written with data analysts in mind. And that’s fine, but developers need …
Read moreIt would be wrong to begin a comparison blog post about PostgreSQL without first acknowledging that it is one of the …
Read more