SQL JOINs and how to use them, with examples

SQL JOINs and how to use them, with examples
[ Blog ]

Make your database faster

Learn SQL performance best practices to speed up your production database.

Hit the NOS!

Relational databases allow you to connect data across tables via enforced relationships, ensuring data accuracy and allowing you to keep queries efficient via small tables. But what happens when you need to look at related data that’s stored in two or more different tables at the same time?

That’s where the SQL JOIN statement comes in.

What are SQL JOINs and how do they work?

In SQL, a JOIN statement is used to combine data from two or more tables based on a column that the two of them share. For example, if we have a table of users that contains a user ID, and a table of orders that contains the user ID for the user who placed each order, we could join those tables using that shared user ID column.

Joins can be executed in a variety of ways, and you’ll hear them described using terms like “inner” and “outer”, “left” and “right.” We’ll cover all of these in more depth shortly, but in a nutshell:

  • Inner joins select for only the values that both tables share
  • Outer joins select for shared values plus all of the other values in one (or more) of the tables
  • Left joins select for all of the records from the first table in your join, combining them with the shared values from the second table. (Think of the tables like words in a sentence; you read from left to right, so the first table in your join is the “left” table).
  • Right joins select for all of the records from the second table in your join, combining them with the shared values from the first table.

If that sounds confusing, don’t worry – we’re going to look at all of these in more detail. But first, let’s establish some example tables so that we can look at something tangible rather than discussing joins in the abstract.

Our example tables

Let’s imagine we’re looking at data from an ecommerce website. We’ve got two very simple tables.

users

Our users table, which we can also think of as the “left” table or table 1 for the purposes of these exercises, describes a few of the website’s users, including their IC, name, and city. It looks like this:

user_id name city
1 beverly copernicus city
2 jean-luc la barre
3 geordi mogadishu

orders

Our orders table, which we can also think of as the “right” table or table 2 for the purposes of these exercises, describes orders from the website, including the order id, the user id of the user who placed the order, and a product name. It looks like this:

order_id user_id product
1001 2 tea (earl grey)
1002 4 risa gift card
1003 1 medical tricorder

Note that our two tables share a related column, user_id. This is the column that links these two tables, and it is the column that we will use to execute our various joins in this article.

Note: in a real-world database, we’d probably want to use a foreign key linking the orders table’s user_id column to the user_id column in users, to ensure orders couldn’t enter the database without a valid user associated with them. However, for our purposes demonstrating how joins work here, we won’t bother.

Want to follow along on your own machine?

If you would like to follow along using a real SQL database, you can copy and paste the following commands to generate these two tables and insert the relevant data. These commands are written in CockroachDB’s SQL syntax, which is very similar to PostgreSQL. You can create a free serverless CockroachDB database here and then follow along using the CockroachDB SQL client or another SQL client of your choice. You can also use these commands with pretty much any other SQL database, although slight adjustments may be required.

CREATE TABLE users (
   user_id INT PRIMARY KEY,
   name STRING,
   city STRING
);

INSERT INTO users (user_id, name, city) VALUES
(1, 'beverly', 'copernicus city'),
(2, 'jean-luc', 'la barre'),
(3, 'geordi', 'mogadishu');

CREATE TABLE orders (
   order_id INT PRIMARY KEY,
   user_id INT,
   product STRING
);

INSERT INTO orders (order_id, user_id, product) VALUES
(1001, 2, 'tea (earl grey)'),
(1002, 4, 'risa gift card'),
(1003, 1, 'medical tricorder');

Types of SQL JOINs

Now, we’ll take a look at each type of join and demonstrate an example of this type of join using the example tables we just established.

INNER JOIN

sql-inner-join

An INNER JOIN selects for only the values that both tables share.

Let’s look at how this works in practice: imagine we want to look at a table that makes it easier to see which users have purchased which products. We can do a join that returns the user ID, and name from the users table, and the product name from the orders table. An inner join will return those values only for rows where the order_id exists in both the users and orders tables.

Here’s the SQL query that will return our desired results:

SELECT users.user_id, users.name, orders.product
FROM users
INNER JOIN orders ON users.user_id = orders.user_id;

If we read this query line by line, we are:

  1. Selecting the column values we want the query to return
  2. Specifying the “left” table, the one we’re pulling these results from
  3. Specifying the join conditions, the table we’re joining with (i.e. orders, the “right” table), and the column to join on. In this case, we’re doing an inner join, so the query should only return the id, name, and city for users whose user ID appears in both tables.

And here’s the output that query returns when we execute it in the database with our example tables:

  user_id |   name   |      product
----------+----------+--------------------
        1 | beverly  | medical tricorder
        2 | jean-luc | tea (earl grey)
(2 rows)

As we can see, the inner join only returned results for the rows with user IDs 1 and 2, because these are the only user ids that occur in both tables.

LEFT (OUTER) JOIN

sql left join, left outer join

A LEFT JOIN selects for all of the values from the “left” (first) table, and matching values from the right table. Note that LEFT OUTER JOIN and LEFT JOIN do the same thing; some SQL databases may require using one or the other. In CockroachDB, either will work.

Let’s look at how that works in practice. We’ll use the same query we just used, except that we’ll specify that we want a LEFT JOIN rather than INNER JOIN. This will return results for all of the user IDs that are in the users table, and match each with the products they purchased in the orders table, if one exists. If not, the missing value will be returned as NULL.

SELECT users.user_id, users.name, orders.product
FROM users
LEFT JOIN orders ON users.user_id = orders.user_id;

Here are the results that query returns when executed against our example tables:

  user_id |   name   |      product
----------+----------+--------------------
        1 | beverly  | medical tricorder
        2 | jean-luc | tea (earl grey)
        3 | geordi   | NULL
(3 rows)

Here, we can see that the query has indeed returned all results from the “left” table (users) and matched them correctly to the products they purchased. Note that now the row including Geordi has been returned, whereas it wasn’t returned as part of the inner join. Since Geordi doesn’t appear in the orders table, a NULL value has been returned for the product column.

RIGHT (OUTER) JOIN

sql right join, right outer join

A RIGHT JOIN selects for all of the values from the “right” (second) table, and matching values from the right table (again, the OUTER part of the command is optional).

To put that into practice, we’ll use the same query again, making it a RIGHT JOIN this time:

SELECT users.user_id, users.name, orders.product
FROM users
RIGHT JOIN orders ON users.user_id = orders.user_id;

And here’s what that query returns when executed against our example tables:

  user_id |   name   |      product
----------+----------+--------------------
        1 | beverly  | medical tricorder
        2 | jean-luc | tea (earl grey)
     NULL | NULL     | risa gift card
(3 rows)

We can see that the query has returned all of the specified results from the “right” (orders) table, and filled in the details from the “left” (users) table where available. Note that again, missing values are indicated with NULL.

FULL (OUTER) JOIN

sql full join, full outer join

A FULL JOIN returns all of the results from both tables.

Again, to see what this looks like, we’ll use the same query, changing only the type of join we’re executing:

SELECT users.user_id, users.name, orders.product
FROM users
FULL JOIN orders ON users.user_id = orders.user_id;

And here are the results that query returns when executed against our example tables:

  user_id |   name   |      product
----------+----------+--------------------
        1 | beverly  | medical tricorder
        2 | jean-luc | tea (earl grey)
        3 | geordi   | NULL
     NULL | NULL     | risa gift card
(4 rows)

Here, we can see we have all of the relevant results from both tables. As with the other join types, NULL values are inserted to represent missing values.

Less common join types

While the four types of joins listed above are most frequently used, there are some other types of joins (and join-like operations) that are worth mentioning.

SELF JOIN

SQL tables can be joined with themselves using a JOIN command, and this can be useful in some situations. For example, imagine we have the following directory table, which lists the people – both teachers and students – in a school directory:

id name teacher_id
1 keiko
2 john 1
3 susan 1
4 alexander 1

Since this table is very small, it’s not difficult to see that Keiko is the teacher, and John, Susan, and Alexander are her students. But what if this was a real table with thousands of rows, and we wanted to output the names of students with their assigned teacher’s name next to them, rather than having to constantly cross-reference the ID numbers? We can use a self-join for that.

First, here’s the SQL required to create the directory table and insert the data, for those following along:

CREATE TABLE directory (
   id INT PRIMARY KEY,
   name STRING,
   teacher_id INT NULL
);

INSERT INTO directory (id, name, teacher_id) VALUES
(1, 'keiko', NULL),
(2, 'john', 1),
(3, 'susan', 1),
(4, 'alexander', 1);

Now that we’ve created our table and inserted the data, we can execute our self-join, which just uses the JOIN command. Note that we’re giving the table two different aliases, s and t, so that we can join it with itself. We’ll also set up some aliases here to make the final output a bit easier to read:

SELECT
   s.id AS student_id,
   s.name AS student_name,
   t.name AS teacher_name
FROM directory AS s
JOIN directory AS t ON s.teacher_id = t.id;

And here’s what that query will output:

  student_id | student_name | teacher_name
-------------+--------------+---------------
           2 | john         | keiko
           3 | susan        | keiko
           4 | alexander    | keiko
(3 rows)

CROSS JOIN

A CROSS JOIN is another special type of SQL. In mathematical terms, cross joins return the Cartesian product of the columns you opt to join.

Let’s go back to our original example table to demonstrate how a cross join functions. Note that this is not a great use case for cross joins; we’re just using this to demonstrate how they work.

Here is the query to execute the join:

SELECT users.name, orders.product
FROM users
CROSS JOIN orders;

And here is what that query outputs when executed against our example tables.

    name   |      product
-----------+--------------------
  beverly  | tea (earl grey)
  beverly  | risa gift card
  beverly  | medical tricorder
  jean-luc | tea (earl grey)
  jean-luc | risa gift card
  jean-luc | medical tricorder
  geordi   | tea (earl grey)
  geordi   | risa gift card
  geordi   | medical tricorder
(9 rows)

As we can see, using a cross join gives us all of the possible combinations of the values in users.name and orders.product. In the case of these two particular tables that’s not very helpful, but it could be useful in other circumstances.

As a simple example, imagine you have one table with a list of staff members, and another table with a list of tasks. If all staff members need to complete all tasks, a CROSS JOIN could be used to quickly generate a checklist similar to the list above, listing each staff member with each task they need to complete.

UNION

While UNION is not a join, it is a method for merging data from two distinct tables into a single output, and thus deserves mention here. UNION is used to return all distinct results from two separate SELECT statements, provided both return the similar data types, and provided both statements have the same number of columns.

For example, if we wanted to return a list of all of the names and products in our example database, we could use UNION like so:

SELECT name FROM users
UNION
SELECT product FROM orders;

That query returns the following output:

        name
---------------------
  beverly
  jean-luc
  geordi
  tea (earl grey)
  risa gift card
  medical tricorder
(6 rows)

SQL JOINs and database performance

We’ve looked at a variety of different JOIN commands here, and while an in-depth discussion of SQL performance best practices is outside the scope of this post, it’s worth noting that joins are not always executed in the same way, and a variety of factors including indexes and the join algorithm used can impact join performance, which can in turn impact the performance of your database as whole.

These factors will vary somewhat depending on the specifics of the relational database you’re using.

You can read more about how to make joins highly performance in CockroachDB here, as well as more about join algorithms and how they’re selected. If you’re using a different relational database, refer to its documentation for guidance on how to optimize the performance of your frequently-used joins.

About the author

Charlie Custer github link linkedin link

Charlie is a former teacher, tech journalist, and filmmaker who’s now combined those three professions into writing and making videos about databases and application development (and occasionally messing with NLP and Python to create weird things in his spare time).

Keep Reading

3 common foreign key mistakes (and how to avoid them)

Foreign keys are an important element of any relational database. But when you’re setting up your database schema, it’s …

Read more
Executing SQL queries from the browser

Over the last couple weeks, you might have noticed a new tab on the cluster view of CockroachDB Cloud! We just released …

Read more
How online schema changes are possible in CockroachDB

I periodically need to make changes to tables. Adding columns is very simple with the ALTER TABLE command… But my tables …

Read more