When you build a database it is more convenient to store information in several tables. Indeed, it avoids redundancy, makes tables easier to maintain, and establishes the right connections between them. Working on real projects, you will often need to use more than one table to get the information you want.
This is where JOINs come in. In theory, SQL join is a SQL (Structured Query Language) instruction to combine records from two tables based on a search condition. Records are joined when the search condition evaluates to true. The resulting row consists of all the columns from both joined tables.
Types of JOINs
In a JOIN, there are two tables: the left table and the right table. The left table and the right table are specified in the JOIN clause, and the result of the joining will contain all the rows from the left table and the matching rows from the right table. If there is no match in the right table, the result will contain the non-matching rows from the left table with NULL values for the columns from the right table.
There are different types of JOINs in SQL:
[INNER] JOIN: for each record from the left table, all records from the right table that satisfy the merge condition will be merged with the original record and added to the resulting table. If no matching records are found in the right table, the original record is also not added to the resulting table (using keyword
INNERis optional).LEFT/RIGHT JOIN: returns all the records from the left/right table and the matching values from the right/left table. If some records in one of the tables (for example, the left one) do not have a match in the other (for example, the right one) table, the corresponding columns in the output of this right table will be assigned NULLs.
FULL [OUTER] JOIN: returns all records when there is a match in either the left or the right table. Records that do not have a match are assigned NULLs (using keyword
OUTERis optional).
CROSS JOIN: joins up every record from the left table and every record in the right table (the algorithm is illustrated in the picture below). Note that we don’t specify the condition on which we join the two tables.
So id1 from table1 gets joined with idA, idB and idC, id2 also gets joined with idA, idB and idC and the same holds true for id3.
INNER JOIN
To see how different types of JOINs work, let's consider the following tables: character, pet, and universe.
The character table has columns named character_id (INT), character_name (VARCHAR(30)), and universe_id (INT):
character_id | character_name | universe_id |
|---|---|---|
1 | Harry Potter | 1 |
2 | Severus Snape | 1 |
3 | Voldemort | 1 |
4 | Hermione Granger | 1 |
5 | Tyrion Lannister | 2 |
6 | Daenerys Targaryen | 2 |
7 | Argus Filch | 1 |
The pet table has columns named character_id (INT)and pet_species (VARCHAR(50)):
character_id | pet_species |
|---|---|
1 | owl |
3 | snake |
4 | cat |
6 | dragon "Drogo" |
6 | dragon "Rhaegal" |
6 | dragon "Viserion" |
8 | puppy |
9 | unicorn |
The universe table has columns named universe_id (INT), universe_name (VARCHAR(30)), and author (VARCHAR(30)):
universe_id | universe_name | author |
|---|---|---|
1 | Harry Potter | Joanne Rowling |
2 | Game of Thrones | George R. R. Martin |
The common syntax for INNER JOIN is the following:
SELECT table1.col_name1, table2.col_name2
FROM table1 [AS a]
INNER JOIN table2 [AS b]
ON table1.id = table2.id;Notice, that the join condition can be some other than equality. The commands in brackets are optional.
Ordering results of INNER JOIN
Returning to our practical example, imagine you would like to know what pet belongs to what character. To do this, we need to execute the following query:
SELECT character_name, pet_species
FROM character
INNER JOIN pet
ON character.character_id = pet.character_id
ORDER BY character.character_name;The resulting table from this query will be the following:
character_name | pet_species |
|---|---|
Daenerys Targaryen | dragon "Drogo" |
Daenerys Targaryen | dragon "Rhaegal" |
Daenerys Targaryen | dragon "Viserion" |
Harry Potter | owl |
Hermione Granger | cat |
Voldemort | snake |
However, as you can see, not all characters were included in the result. This happened because some records in the left table didn’t have matches in the right table. In our example not all characters had pets. To fix this problem and include all the characters in the resulting table, we can use LEFT JOIN. We've combined information from two tables by matching rows where character_id was the same in the two tables.
SELECT character_name, pet_species
FROM character
LEFT JOIN pet
ON character.character_id = pet.character_id
ORDER BY character.character_name;character_name | pet_species |
|---|---|
Argus Filch | NULL |
Daenerys Targaryen | dragon "Drogo" |
Daenerys Targaryen | dragon "Rhaegal" |
Daenerys Targaryen | dragon "Viserion" |
Harry Potter | owl |
Hermione Granger | cat |
Severus Snape | NULL |
Tyrion Lannister | NULL |
Voldemort | snake |
So far we've worked only with two tables, but we can also join together multiple tables. For example, we also want to know which universe our characters come from. As you can see, values that didn’t have matches in the right table were assigned with NULL values.
SELECT c.character_name, p.pet_species, u.universe_name
FROM universe AS u
LEFT JOIN character AS c
ON u.universe_id = c.universe_id
LEFT JOIN pet AS p
ON p.character_id = c.character_idcharacter_name | pet_species | universe_name |
|---|---|---|
Harry Potter | owl | Harry Potter |
Severus Snape | NULL | Harry Potter |
Voldemort | snake | Harry Potter |
Hermione Granger | cat | Harry Potter |
Tyrion Lannister | NULL | Game of Thrones |
Daenerys Targaryen | dragon "Drogo" | Game of Thrones |
Daenerys Targaryen | dragon "Rhaegal" | Game of Thrones |
Daenerys Targaryen | dragon "Viserion" | Game of Thrones |
Argus Filch | NULL | Harry Potter |
To understand how CROSS JOINs may be useful, think of the chess competitions in which organizers of these competitions need to pair up all the participants. Also, note that when selecting columns, it is not mandatory to write table_name1.col_name. If the column name is unique for all tables, you can omit it. However, we still recommend specifying it because the names of the columns become ambiguous when you have multiple tables.
Note that we can use aliases for table names (please, recall that SQL aliases are used to give a table (or a column in a table) a temporary name, for example, SELECT person_name AS name). They are extremely helpful if tables have long names.
We should also mention that it is not mandatory to use JOIN if you want to connect tables. You can use the WHERE condition to get the same result. Please consider the script below. It would give an identical result to the one above:
SELECT table1.col_name1, table2.col_name2
FROM table1 [AS a], table2 [AS b]
WHERE table1.id = table2.id;The commands in brackets are optional.
Conclusion
To join multiple tables, follow this general template:
SELECT val1 [AS name1], ..., valN [AS nameN]
FROM table1
[type_of_join] JOIN table2
ON table1.col_name_table1 = table2.col_name_table2
[type_of_join] JOIN table3
ON table2.col_name_table2 = table3.col_name_table3;Congratulations! Now you know how to query multiple tables. Let's practice, shall we?