You are on page 1of 8

SQL Basics Cheat Sheet

SQL FILTERING THE OUTPUT QUERYING MULTIPLE TABLES


SQL, or Structured Query Language, is a language to talk to COMPARISON OPERATORS INNER JOIN FULL JOIN
databases. It allows you to select specific data and build
Fetch names of cities that have a rating above 3: JOIN(or explicitly INNER JOIN) returns rows that have FULL JOIN (or explicitly FULL OUTER JOIN) returns all rows
complex reports. Today, SQL is a universal language of data. It is
matching values in both tables. from both tables – if there's no matching row in the second
used in practically all technologies that process data. SELECT name table, NULLs are returned.
FROM city
SAMPLE DATA WHERE rating > 3; SELECT city.name, country.name
FROM city
SELECT city.name, country.name
FROM city
COUNTRY [INNER] JOIN country FULL [OUTER] JOIN country
id name population area ON city.country_id = country.id;
Fetch names of cities that are neither Berlin nor Madrid: ON city.country_id = country.id;
1 France 66600000 640680
2 Germany 80700000 357000 CITY COUNTRY
SELECT name CITY COUNTRY id name country_id id name
... ... ... ...
FROM city id name country_id id name 1 Paris 1 1 France
CITY WHERE name != 'Berlin' 1 Paris 1 1 France 2 Berlin 2 2 Germany
id name country_id population rating AND name != 'Madrid'; 2 Berlin 2 2 Germany 3 Warsaw 4 NULL NULL
1 Paris 1 2243000 5 3 Warsaw 4 3 Iceland NULL NULL NULL 3 Iceland
2 Berlin 2 3460000 3
... ... ... ... ...
TEXT OPERATORS
Fetch names of cities that start with a 'P' or end with an 's':
QUERYING SINGLE TABLE
Fetch all columns from the country table: SELECT name LEFT JOIN CROSS JOIN
FROM city LEFT JOINreturns all rows from the left table with CROSS JOIN returns all possible combinations of rows from
SELECT * WHERE name LIKE 'P%' corresponding rows from the right table. If there's no both tables. There are two syntaxes available.
FROM country; OR name LIKE '%s'; matching row, NULLs are returned as values from the second SELECT city.name, country.name
table. FROM city
Fetch id and name columns from the city table: CROSS JOIN country;
SELECT city.name, country.name
SELECT id, name Fetch names of cities that start with any letter followed by FROM city
FROM city; 'ublin' (like Dublin in Ireland or Lublin in Poland): SELECT city.name, country.name
LEFT JOIN country FROM city, country;
SELECT name ON city.country_id = country.id;
Fetch city names sorted by the rating column CITY COUNTRY
in the default ASCending order: FROM city CITY COUNTRY id name country_id id name
WHERE name LIKE '_ublin'; id name country_id id name 1 Paris 1 1 France
SELECT name 1 Paris 1 1 France 1 Paris 1 2 Germany
FROM city 2 Berlin 2 2 Germany 2 Berlin 2 1 France
ORDER BY rating [ASC]; 3 Warsaw 4 NULL NULL 2 Berlin 2 2 Germany
OTHER OPERATORS
Fetch city names sorted by the rating column Fetch names of cities that have a population between
in the DESCending order: 500K and 5M:
SELECT name SELECT name
FROM city FROM city RIGHT JOIN NATURAL JOIN
ORDER BY rating DESC; WHERE population BETWEEN 500000 AND 5000000; NATURAL JOIN will join tables by all columns with the same
RIGHT JOIN returns all rows from the right table with
corresponding rows from the left table. If there's no name.
ALIASES Fetch names of cities that don't miss a rating value:
matching row, NULLs are returned as values from the left
table.
SELECT city.name, country.name
FROM city
COLUMNS SELECT name NATURAL JOIN country;
SELECT city.name, country.name
SELECT name AS city_name FROM city CITY COUNTRY
FROM city
FROM city; WHERE rating IS NOT NULL; country_id id name name id
RIGHT JOIN country 6 6 San Marino San Marino 6
ON city.country_id = country.id;
TABLES 7 7 Vatican City Vatican City 7
5 9 Greece Greece 9
SELECT co.name, ci.name Fetch names of cities that are in countries with IDs 1, 4, 7, or 8: CITY COUNTRY
10 11 Monaco Monaco 10
id name country_id id name
FROM city AS ci SELECT name NATURAL JOIN used these columns to match rows:
1 Paris 1 1 France
JOIN country AS co FROM city 2 Berlin 2 2 Germany city.id, city.name, country.id, country.name
ON ci.country_id = co.id; WHERE country_id IN (1, 4, 7, 8); NULL NULL NULL 3 Iceland NATURAL JOIN is very rarely used in practice.
SQL Basics Cheat Sheet
AGGREGATION AND GROUPING SUBQUERIES SET OPERATIONS
GROUP BY groups together rows that have the same values in specified columns. A subquery is a query that is nested inside another query, or another subquery. There Set operations are used to combine the results of two or more queries into a
It computes summaries (aggregates) for each unique combination of values. are different types of subqueries. single result. The combined queries must return the same number of columns and
compatible data types. The names of the corresponding columns can be different.
CITY SINGLE VALUE
id name country_id
1 Paris 1
The simplest subquery returns exactly one column and exactly one row. It can be CYCLING SKATING
101 Marseille 1
CITY used with comparison operators =, <, <=, >, or >=. id name country id name country
country_id count 1 YK DE 1 YK DE
102 Lyon 1
1 3 This query finds cities with the same rating as Paris: 2 ZG DE 2 DF DE
2 Berlin 2
2 3 SELECT name FROM city 3 WT PL 3 AK PL
103 Hamburg 2
4 2 WHERE rating = ( ... ... ... ... ... ...
104 Munich 2
3 Warsaw 4 SELECT rating
105 Cracow 4 FROM city
WHERE name = 'Paris' UNION
AGGREGATE FUNCTIONS ); UNION combines the results of two result sets and removes duplicates.
UNION ALL doesn't remove duplicate rows.
• avg(expr) − average value for rows within the group MULTIPLE VALUES
This query displays German cyclists together with German skaters:
• count(expr) − count of values for rows within the group A subquery can also return multiple columns or multiple rows. Such subqueries can be
SELECT name
• max(expr) − maximum value within the group used with operators IN, EXISTS, ALL, or ANY. FROM cycling
• min(expr) − minimum value within the group This query finds cities in countries that have a population above 20M: WHERE country = 'DE'
• sum(expr) − sum of values within the group SELECT name UNION / UNION ALL
FROM city SELECT name
FROM skating
EXAMPLE QUERIES WHERE country_id IN (
SELECT country_id WHERE country = 'DE';
Find out the number of cities: FROM country
SELECT COUNT(*) WHERE population > 20000000 INTERSECT
FROM city; );
INTERSECT returns only rows that appear in both result sets.
Find out the number of cities with non-null ratings: CORRELATED This query displays German cyclists who are also German skaters at the same time:
SELECT COUNT(rating) A correlated subquery refers to the tables introduced in the outer query. A correlated SELECT name
FROM city; subquery depends on the outer query. It cannot be run independently from the outer FROM cycling
query. WHERE country = 'DE'
Find out the number of distinctive country values: INTERSECT
This query finds cities with a population greater than the average population in the
SELECT COUNT(DISTINCT country_id) country: SELECT name
FROM city; FROM skating
SELECT *
WHERE country = 'DE';
Find out the smallest and the greatest country populations: FROM city main_city
WHERE population > (
SELECT MIN(population), MAX(population) SELECT AVG(population) EXCEPT
FROM country; FROM city average_city EXCEPT for returns only the rows that appear in the first result set but do not
WHERE average_city.country_id = main_city.country_id appear in the second result set.
Find out the total population of cities in respective countries: );
SELECT country_id, SUM(population) This query displays German cyclists unless they are also German skaters at the
FROM city This query finds countries that have at least one city: same time:
GROUP BY country_id; SELECT name SELECT name
FROM country FROM cycling
Find out the average rating for cities in respective countries if the average is above 3.0: WHERE EXISTS ( WHERE country = 'DE'
SELECT country_id, AVG(rating) SELECT * EXCEPT / MINUS
FROM city FROM city SELECT name
GROUP BY country_id WHERE country_id = country.id FROM skating
HAVING AVG(rating) > 3.0; ); WHERE country = 'DE';
SQL Window Functions Cheat Sheet
WINDOW FUNCTIONS AGGREGATE FUNCTIONS VS. WINDOW FUNCTIONS PARTITION BY ORDER BY
compute their result based on a sliding window unlike aggregate functions, window functions do not collapse rows. divides rows into multiple groups, called partitions, to specifies the order of rows in each partition to which the
frame, a set of rows that are somehow related to which the window function is applied. window function is applied.
the current row. PARTITION BY city PARTITION BY city ORDER BY month
Aggregate Functions Window Functions month city sold month city sold sum sold city month sold city month
1 Rome 200 1 Paris 300 800 200 Rome 1 300 Paris 1

2 Paris 500 2 Paris 500 800 500 Paris 2 500 Paris 2
∑ 1 London 100 1 Rome 200 900 100 London 1 200 Rome 1
current row 1 Paris 300 2 Rome 300 900 300 Paris 1 300 Rome 2
2 Rome 300 3 Rome 400 900 300 Rome 2 400 Rome 3
∑ ∑
2 London 400 1 London 100 500 400 London 2 100 London 1
3 Rome 400 2 London 400 500 400 Rome 3 400 London 2

Default Partition: with no PARTITION BY clause, the entire Default ORDER BY: with no ORDER BY clause, the order of
SYNTAX result set is the partition. rows within each partition is arbitrary.

SELECT city, month, SELECT <column_1>, <column_2>, WINDOW FRAME


sum(sold) OVER ( <window_function>() OVER ( is a set of rows that are somehow related to the current row. The window frame is evaluated separately within each partition.
PARTITION BY city PARTITION BY <...>
ORDER BY month ORDER BY <...> ROWS | RANGE | GROUPS BETWEEN lower_bound AND upper_bound
RANGE UNBOUNDED PRECEDING) total <window_frame>) <window_column_alias> PARTITION UNBOUNDED
FROM sales; FROM <table_name>; PRECEDING The bounds can be any of the five options:
N PRECEDING

N ROWS
∙ UNBOUNDED PRECEDING
Named Window Definition CURRENT
∙ n PRECEDING
ROW ∙ CURRENT ROW
M ROWS ∙ n FOLLOWING
SELECT country, city, SELECT <column_1>, <column_2>, M FOLLOWING ∙ UNBOUNDED FOLLOWING
rank() OVER country_sold_avg <window_function>() OVER <window_name> UNBOUNDED
FROM sales FROM <table_name> FOLLOWING The lower_bound must be BEFORE the upper_bound
WHERE month BETWEEN 1 AND 6 WHERE <...>
GROUP BY country, city GROUP BY <...>
ROWS BETWEEN 1 PRECEDING RANGE BETWEEN 1 PRECEDING GROUPS BETWEEN 1 PRECEDING
HAVING sum(sold) > 10000 HAVING <...> AND 1 FOLLOWING AND 1 FOLLOWING AND 1 FOLLOWING
WINDOW country_sold_avg AS ( WINDOW <window_name> AS ( city sold month city sold month city sold month
PARTITION BY country PARTITION BY <...> Paris 300 1 Paris 300 1 Paris 300 1
ORDER BY avg(sold) DESC) ORDER BY <...> Rome 200 1 Rome 200 1 Rome 200 1
ORDER BY country, city; <window_frame>) Paris 500 2 Paris 500 2 Paris 500 2
ORDER BY <...>; Rome 100 4 Rome 100 4 Rome 100 4
current Paris 200 4 current Paris 200 4 current Paris 200 4
row row row
Paris 300 5 Paris 300 5 Paris 300 5
Rome 200 5 Rome 200 5 Rome 200 5
PARTITION BY, ORDER BY, and window frame definition are all optional. London 200 5 London 200 5 London 200 5
London 100 6 London 100 6 London 100 6
Rome 300 6 Rome 300 6 Rome 300 6
LOGICAL ORDER OF OPERATIONS IN SQL 1 row before the current row and values in the range between 3 and 5 1 group before the current row and 1 group
1 row after the current row ORDER BY must contain a single expression after the current row regardless of the value

1. FROM, JOIN 7. SELECT As of 2020, GROUPS is only supported in PostgreSQL 11 and up.
2. WHERE 8. DISTINCT
3. GROUP BY 9. UNION/INTERSECT/EXCEPT
4. aggregate functions 10. ORDER BY ABBREVIATIONS DEFAULT WINDOW FRAME
5. HAVING 11. OFFSET Abbreviation Meaning
6. window functions If ORDER BY is specified, then the frame is
12. LIMIT/FETCH/TOP UNBOUNDED PRECEDING BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW RANGE BETWEEN UNBOUNDED PRECEDING AND
n PRECEDING BETWEEN n PRECEDING AND CURRENT ROW CURRENT ROW.
You can use window functions in SELECT and ORDER BY. However, you can’t put window functions anywhere in the FROM, CURRENT ROW BETWEEN CURRENT ROW AND CURRENT ROW Without ORDER BY, the frame specification is
WHERE, GROUP BY, or HAVING clauses. n FOLLOWING BETWEEN AND CURRENT ROW AND n FOLLOWING ROWS BETWEEN UNBOUNDED PRECEDING AND
UNBOUNDED FOLLOWING BETWEEN CURRENT ROW AND UNBOUNDED FOLLOWING UNBOUNDED FOLLOWING.
SQL Window Functions Cheat Sheet
LIST OF WINDOW FUNCTIONS RANKING FUNCTIONS DISTRIBUTION FUNCTIONS
∙ row_number() − unique number for each row within a partition, with different ∙ percent_rank() − the percentile ranking number of a row—a value in [0, 1] interval:
Aggregate Functions numbers for tied values (rank - 1) / (total number of rows - 1)
∙ avg() ∙ rank() − ranking within a partition, with gaps and the same ranking for tied values ∙ cume_dist() − the cumulative distribution of a value within a group of values, i.e., the number of
∙ count() ∙ dense_rank() − ranking within a partition, with no gaps and the same ranking for tied rows with values less than or equal to the current row’s value divided by the total number of rows;
∙ max() values a value in (0, 1] interval
row_number rank dense_rank
∙ min() city price over(order by price) percent_rank() OVER(ORDER BY sold) cume_dist() OVER(ORDER BY sold)
∙ sum() Paris 7 1 1 1 city sold percent_rank city sold cume_dist
Rome 7 2 1 1 Paris 100 0 Paris 100 0.2
Ranking Functions London 8.5 3 3 2 Berlin 150 0.4
Berlin 150 0.25
∙ row_number() Berlin 8.5 4 3 2 Rome 200 0.5 Rome 200 0.8
Moscow 9 5 5 3 Moscow 200 0.5 without this row, 50% Moscow 200 0.8 80% of values are
∙ rank()
Madrid 10 6 6 4 London 300 1 of values are less than London 300 1 less than or equal
∙ dense_rank() Oslo 10 7 6 4 this row’s value to this one

Distribution Functions ORDER BY and Window Frame: rank() and dense_rank() require ORDER BY, but ORDER BY and Window Frame: Distribution functions require ORDER BY. They do not accept window frame
∙ percent_rank() row_number() does not require ORDER BY. Ranking functions do not accept window definitions (ROWS, RANGE, GROUPS).
∙ cume_dist() frame definitions (ROWS, RANGE, GROUPS).

Analytic Functions
∙ lead() ANALYTIC FUNCTIONS ∙ first_value(expr) − the value for the first row within the window frame
∙ lead(expr, offset, default) − the value for the row offset rows after the current; offset and ∙ last_value(expr) − the value for the last row within the window frame
∙ lag()
∙ ntile() default are optional; default values: offset = 1, default = NULL
∙ lag(expr, offset, default) − the value for the row offset rows before the current; offset and first_value(sold) OVER last_value(sold) OVER
∙ first_value() default are optional; default values: offset = 1, default = NULL (PARTITION BY city ORDER BY month) (PARTITION BY city ORDER BY month
∙ last_value() RANGE BETWEEN UNBOUNDED PRECEDING
lead(sold) OVER(ORDER BY month) lag(sold) OVER(ORDER BY month) city month sold first_value AND UNBOUNDED FOLLOWING)
∙ nth_value() Paris 1 500 500
month sold month sold Paris 2 300 500 city month sold last_value
order by month

order by month
1 500 300 1 500 NULL Paris 1 500 400
Paris 3 400 500
2 300 400 2 300 500 Paris 2 300 400
Rome 2 200 200
3 400 100 3 400 300 Paris 3 400 400
AGGREGATE FUNCTIONS 4 100 500
Rome 3 300 200
Rome 2 200 500
4 100 400 Rome 4 500 200
5 500 NULL 5 500 100 Rome 3 300 500
∙ avg(expr) − average value for Rome 4 500 500
rows within the window frame lag(sold, 2, 0) OVER(ORDER BY month)
lead(sold, 2, 0) OVER(ORDER BY month)
month sold Note: You usually want to use RANGE BETWEEN
month sold
order by month

∙ count(expr) − count of values


order by month

offset=2
1 500 400 1 500 0 UNBOUNDED PRECEDING AND UNBOUNDED
for rows within the window 2 300 100 2 300 0 FOLLOWING with last_value(). With the default
frame 3 400 500 3 400 500 window frame for ORDER BY, RANGE UNBOUNDED
4 100 300 PRECEDING, last_value() returns the value for
offset=2

4 100 0
∙ max(expr) − maximum value 5 500 0 5 500 400 the current row.
within the window frame

∙ ntile(n) − divide rows within a partition as equally as possible into n groups, and assign each ∙ nth_value(expr, n) − the value for the n-th row within the window frame; n must be an integer
∙ min(expr) − minimum value
row its group number. nth_value(sold, 2) OVER (PARTITION BY city
within the window frame ORDER BY month RANGE BETWEEN UNBOUNDED
ntile(3) PRECEDING AND UNBOUNDED FOLLOWING)
∙ sum(expr) − sum of values within city sold city month sold nth_value
the window frame Rome 100 1 Paris 1 500 300
Paris 100 1 1 1 Paris 2 300 300
London 200 1 Paris 3 400 300
Moscow 200 2 Rome 2 200 300
ORDER BY and Window Frame: Berlin 200 2 2 2 ORDER BY and Window Frame: ntile(), ORDER BY and Window Frame: first_value(),
Rome 3 300 300
Aggregate functions do not require an Madrid 300 2 lead(), and lag() require an ORDER BY. Rome 4 500 300 last_value(), and nth_value() do not
ORDER BY. They accept window frame Oslo 300 3 They do not accept window frame definition Rome 5 300 300 require an ORDER BY. They accept window frame
Dublin 300 3 3 3 (ROWS, RANGE, GROUPS). London 1 100 NULL definitions (ROWS, RANGE, GROUPS).
definition (ROWS, RANGE, GROUPS).
SQL JOINs Cheat Sheet
JOINING TABLES LEFT JOIN
JOIN combines data from two tables. LEFT JOIN returns all rows from the left table with matching rows from the right table. Rows without a match are filled
with NULLs. LEFT JOIN is also called LEFT OUTER JOIN.
TOY CAT
toy_id toy_name cat_id cat_id cat_name SELECT * toy_id toy_name cat_id cat_id cat_name
1 ball 3 1 Kitty FROM toy 5 ball 1 1 Kitty
2 spring NULL 2 Hugo LEFT JOIN cat 3 mouse 1 1 Kitty
3 mouse 1 3 Sam ON toy.cat_id = cat.cat_id; 1 ball 3 3 Sam
4 mouse 4 4 Misty 4 mouse 4 4 Misty
5 ball 1 2 spring NULL NULL NULL
whole left table

JOIN typically combines rows with equal values for the specified columns. Usually, one table contains a primary key,
which is a column or columns that uniquely identify rows in the table (the cat_id column in the cat table).
The other table has a column or columns that refer to the primary key columns in the first table (the cat_id column in RIGHT JOIN
the toy table). Such columns are foreign keys. The JOIN condition is the equality between the primary key columns in
one table and columns referring to them in the other table. RIGHT JOIN returns all rows from the right table with matching rows from the left table. Rows without a match are
filled with NULLs. RIGHT JOIN is also called RIGHT OUTER JOIN.

JOIN SELECT *
FROM toy
toy_id
5
toy_name
ball
cat_id
1
cat_id
1
cat_name
Kitty
JOIN returns all rows that match the ON condition. JOIN is also called INNER JOIN. RIGHT JOIN cat 3 mouse 1 1 Kitty
ON toy.cat_id = cat.cat_id; NULL NULL NULL 2 Hugo
SELECT * toy_id toy_name cat_id cat_id cat_name 1 ball 3 3 Sam
FROM toy 5 ball 1 1 Kitty 4 mouse 4 4 Misty
JOIN cat 3 mouse 1 1 Kitty whole rig ht table
1 ball 3 3 Sam
ON toy.cat_id = cat.cat_id;
4 mouse 4 4 Misty
There is also another, older syntax, but it isn't recommended.
List joined tables in the FROM clause, and place the conditions in the WHERE clause. FULL JOIN
SELECT * FULL JOIN returns all rows from the left table and all rows from the right table. It fills the non-matching rows with
FROM toy, cat NULLs. FULL JOIN is also called FULL OUTER JOIN.
WHERE toy.cat_id = cat.cat_id; SELECT * toy_id toy_name cat_id cat_id cat_name
FROM toy 5 ball 1 1 Kitty
JOIN CONDITIONS FULL JOIN cat 3
NULL
mouse
NULL
1
NULL
1
2
Kitty
Hugo
ON toy.cat_id = cat.cat_id;
The JOIN condition doesn't have to be equality – it can be any condition you want. JOIN doesn't interpret the JOIN 1 ball 3 3 Sam
condition, it only checks if the rows satisfy the given condition. 4 mouse 4 4 Misty
2 spring NULL NULL NULL
whole left table whole right table
To refer to a column in the JOIN query, you have to use the full column name: first the table name, then a dot (.) and the
column name:
ON cat.cat_id = toy.cat_id
You can omit the table name and use just the column name if the name of the column is unique within all columns in the CROSS JOIN
joined tables.
CROSS JOIN returns all possible combinations of rows from the left and right tables.

NATURAL JOIN SELECT *


FROM toy
toy_id
1
toy_name
ball
cat_id
3
cat_id
1
cat_name
Kitty
If the tables have columns with the same name, you can use CROSS JOIN cat; 2 spring NULL 1 Kitty
NATURAL JOIN instead of JOIN. cat_id toy_id toy_name cat_name 3 mouse 1 1 Kitty
1 5 ball Kitty Another syntax: 4 mouse 4 1 Kitty
SELECT * 1 3 mouse Kitty 5 ball 1 1 Kitty
SELECT * 1 ball 3 2 Hugo
FROM toy 3 1 ball Sam
4 4 mouse Misty FROM toy, cat; 2 spring NULL 2 Hugo
NATURAL JOIN cat;
3 mouse 1 2 Hugo
The common column appears only once in the result table. 4 mouse 4 2 Hugo
Note: NATURAL JOIN is rarely used in real life. 5 ball 1 2 Hugo
1 ball 3 3 Sam
··· ··· ··· ··· ···
SQL JOINs Cheat Sheet
COLUMN AND TABLE ALIASES MULTIPLE JOINS
Aliases give a temporary name to a table or a column in a table. You can join more than two tables together. First, two tables are joined, then the third table is joined to the result of the
CAT AS c OWNER AS o previous joining.
cat_id cat_name mom_id owner_id id name
1 Kitty 5 1 1 John Smith
TOY AS t
2 Hugo 1 2 2 Danielle Davis CAT AS c
3 Sam 2 2 toy_id toy_name cat_id OWNER AS o
cat_id cat_name mom_id owner_id
4 Misty 1 NULL 1 ball 3 id name
1 Kitty 5 1
2 spring NULL 1
John
2 Hugo 1 2 Smith
A column alias renames a column in the result. A table alias renames a table within the query. If you define a table alias, 3 mouse 1
3 Sam 2 2 2
Danielle
you must use it instead of the table name everywhere in the query. The AS keyword is optional in defining aliases. 4 mouse 4 Davis
4 Misty 1 NULL
5 ball 1
SELECT
o.name AS owner_name, cat_name owner_name
c.cat_name Kitty John Smith JOIN & JOIN JOIN & LEFT JOIN LEFT JOIN & LEFT JOIN
FROM cat AS c Sam Danielle Davis
Hugo Danielle Davis SELECT SELECT SELECT
JOIN owner AS o t.toy_name, t.toy_name, t.toy_name,
ON c.owner_id = o.id; c.cat_name, c.cat_name, c.cat_name,
o.name AS owner_name o.name AS owner_name o.name AS owner_name
SELF JOIN FROM toy t
JOIN cat c
FROM toy t
JOIN cat c
FROM toy t
LEFT JOIN cat c
You can join a table to itself, for example, to show a parent-child relationship. ON t.cat_id = c.cat_id ON t.cat_id = c.cat_id ON t.cat_id = c.cat_id
CAT AS child CAT AS mom JOIN owner o LEFT JOIN owner o LEFT JOIN owner o
cat_id cat_name owner_id mom_id cat_id cat_name owner_id mom_id ON c.owner_id = o.id; ON c.owner_id = o.id; ON c.owner_id = o.id;
1 Kitty 1 5 1 Kitty 1 5
toy_name cat_name owner_name toy_name cat_name owner_name toy_name cat_name owner_name
2 Hugo 2 1 2 Hugo 2 1
ball Kitty John Smith ball Kitty John Smith ball Kitty John Smith
3 Sam 2 2 3 Sam 2 2 mouse Kitty John Smith mouse Kitty John Smith mouse Kitty John Smith
4 Misty NULL 1 4 Misty NULL 1 ball Sam Danielle Davis ball Sam Danielle Davis ball Sam Danielle Davis
mouse Misty NULL mouse Misty NULL
Each occurrence of the table must be given a different alias. Each column reference must be preceded with an spring NULL NULL
appropriate table alias.
SELECT
child.cat_name AS child_name, child_name mom_name
mom.cat_name AS mom_name Hugo Kitty
FROM cat AS child Sam Hugo JOIN WITH MULTIPLE CONDITIONS
JOIN cat AS mom Misty Kitty
You can use multiple JOIN conditions using the ON keyword once and the AND keywords as many times as you need.
ON child.mom_id = mom.cat_id;

CAT AS c OWNER AS o
NON-EQUI SELF JOIN cat_id cat_name mom_id owner_id age id name age
1 Kitty 5 1 17 1 John Smith 18
You can use a non-equality in the ON condition, for example, to show all different pairs of rows. 2 Hugo 1 2 10 2 Danielle Davis 10
TOY AS a TOY AS b 3 Sam 2 2 5
toy_id toy_name cat_id cat_id toy_id toy_name 4 Misty 1 NULL 11
3 mouse 1 1 3 mouse
5 ball 1 1 5 ball
1 ball 3 3 1 ball SELECT
4 mouse 4 4 4 mouse cat_name,
2 spring NULL NULL 2 spring o.name AS owner_name,
c.age AS cat_age, cat_name owner_name age age
SELECT cat_a_id toy_a cat_b_id toy_b o.age AS owner_age Kitty John Smith 17 18
a.toy_name AS toy_a, 1 mouse 3 ball FROM cat c Sam Danielle Davis 5 10
b.toy_name AS toy_b 1 ball 3 ball
JOIN owner o
FROM toy a 1 mouse 4 mouse
1 ball 4 mouse ON c.owner_id = o.id
JOIN toy b AND c.age < o.age;
3 ball 4 mouse
ON a.cat_id < b.cat_id;
Standard SQL Functions Cheat Sheet
TEXT FUNCTIONS NUMERIC FUNCTIONS NULLs CASE WHEN
CONCATENATION BASIC OPERATIONS To retrieve all rows with a missing value in the price column: The basic version of CASE WHEN checks if the values are equal
WHERE price IS NULL (e.g., if fee is equal to 50, then normal' is returned). If there
Use the || operator to concatenate two strings: Use / to do some basic math. To get the number of
isn't a matching value in the CASE WHEN, then the ELSE value
SELECT 'Hi ' || 'there!'; seconds in a week:
To retrieve all rows with the weight column populated: will be returned (e.g., if fee is equal to 49, then not
-- result: Hi there SELECT 60 * 60 * 24 * 7; -- result: 604800
WHERE weight IS NOT NULL available will show up.
Remember that you can concatenate only character strings using
CASTING SELECT
|. Use this trick for numbers: Why shouldn't you use price = NULL weight != NULL? CASE fee
From time to time, you need to change the type of a number. The
SELECT '' || 4 || 2; WHEN 50 THEN 'normal'
CAST() function is there to help you out. It lets you change the Because databases don't know if those expressions are true or
-- result: 42 false – they are evaluated as NULL WHEN 10 THEN 'reduced
type of value to almost anything (integer numeric double
Some databases implement non-standard solutions for Moreover, if you use a function or concatenation on a column that WHEN 0 THEN 'free
precision varchar, and many more).
concatenating strings like CONCAT() CONCAT_WS(). Check s NULL in some rows, then it will get propagated. Take a look: ELSE 'not available'
Get the number as an integer (without rounding):
the documentation for your specific database. END AS tariff
SELECT CAST(1234.567 AS integer)
domain LENGTH(domain) FROM ticket_types;
LIKE OPERATOR–PATTERN MATCHING -- result: 1234
Change a column type to double precision LearnSQL.com 12 The most popular type is the searched CASE WHEN – it lets you
Use the character to replace any single character. Use the %
SELECT CAST(column AS double precision) LearnPython.com 15 pass conditions (as you'd write them in the WHERE clause),
character to replace any number of characters (including 0
characters). evaluates them in order, then returns the value for the first
USEFUL FUNCTIONS ULL ULL
condition met.
Fetch all names that start with any letter followed by Get the remainder of a division: vertabelo.com 13 SELECT
'atherine SELECT MOD(13, 2) CASE
SELECT -- result: 1 WHEN score >= 90 THEN 'A'
ROM USEFUL FUNCTIONS WHEN score > 60 THEN 'B'
WHERE name LIKE '_atherine' Round a number to its nearest integer: COALESCE(x, y, ...) ELSE 'F
Fetch all names that end with a SELECT ROUND(1234.56789) To replace NULL in a query with something meaningful: ND AS grade
SELECT -- result: 1235 SELECT FROM test_results;
ROM Round a number to three decimal places: domain, Here, all students who scored at least 90 will get an A, those with
WHERE name LIKE '%a SELECT ROUND(1234.56789, 3) COALESCE(domain, 'domain missing') the score above 60 (and below 90) will get a B, and the rest will
FROM contacts; receive an F.
USEFUL FUNCTIONS -- result: 1234.568
PostgreSQL requires the first argument to be of the type domain coalesce
Get the count of characters in a string:
SELECT LENGTH('LearnSQL.com')
numeric – cast the number when needed.
LearnSQL.com LearnSQL.com
TROUBLESHOOTING
-- result: 12 To round the number up Integer division
NULL domain missing When you don't see the decimal places you expect, it means that
Convert all letters to lowercase: SELECT CEIL(13.1) -- result: 14
SELECT CEIL(-13.9); -- result: -13 The COALESCE() function takes any number of arguments and you are dividing between two integers. Cast one to decimal
SELECT LOWER('LEARNSQL.COM )
The CEIL(x) function returns the smallest integer not less than returns the value of the first argument that isn't NULL CAST(123 AS decimal) / 2
-- result: learnsql.com
x. In SQL Server, the function is called CEILING() Division by 0
Convert all letters to uppercase:
SELECT UPPER('LearnSQL.com ) To round the number down: NULLIF(x, y) To avoid this error, make sure that the denominator is not equal
-- result: LEARNSQL.COM SELECT FLOOR(13.8); -- result: 13 To save yourself from division by 0 errors: to 0. You can use the NULLIF() function to replace 0 with a
SELECT FLOOR(-13.2); -- result: -14 SELECT NULL, which will result in a NULL for the whole expression:
Convert all letters to lowercase and all first letters to uppercase
last_month, count / NULLIF(count_all, 0)
(not implemented in MySQL and SQL Server): The FLOOR(x) function returns the greatest integer not the
greater this_month,
SELECT INITCAP('edgar frank ted cODD ) Inexact calculations
than x this_month * 100.0
-- result: Edgar Frank Ted Codd If you do calculations using real (floating point) numbers, you'll
/ NULLIF(last_month, 0)
Get just a part of a string: To round towards 0 irrespective of the sign of a number: end up with some inaccuracies. This is because this type is meant
AS better_by_percent
SELECT SUBSTRING('LearnSQL.com', 9) SELECT TRUNC(13.5); -- result: 13 for scientific calculations such as calculating the velocity.
FROM video_views;
-- result: .com SELECT TRUNC(-13.5); -- result: -13 Whenever you need accuracy (such as dealing with monetary
last_month this_month better_by_percent values), use the decimal / numeric type (or money if
SELECT SUBSTRING('LearnSQL.com', 0, 6) TRUNC(x) works the same way as CAST(x AS integer). In
-- result: Learn MySQL, the function is called TRUNCATE() available).
723786 1085679 150.0
Replace part of a string: To get the absolute value of a number: Errors when rounding with a specified precision
0 178123 ULL
SELECT REPLACE('LearnSQL.com', 'SQL', Most databases won't complain, but do check the documentation
SELECT ABS(-12); -- result: 12
'Python') The NULLIF(x, y) function will return NULL if is the same as if they do. For example, if you want to specify the rounding
-- result: LearnPython.com To get the square root of a number: y, else it will return the value. precision in PostgreSQL, the value must be of the numeric type.
SELECT SQRT(9); -- result: 3
Standard SQL Functions Cheat Sheet
AGGREGATION AND GROUPING DATE AND TIME INTERVALs TIME ZONES
COUNT(expr) − the count of values for the rows within the There are 3 main time-related types: date time, and Note: In SQL Server, intervals aren't implemented – use the In the SQL Standard, the date type can't have an associated time
group timestamp. Time is expressed using a 24-hour clock, and it can DATEADD() and DATEDIFF() functions. zone, but the time and timestamp types can. In the real world,
SUM(expr) − the sum of values within the group be as vague as just hour and minutes (e.g., 15:30 – 3:30 p.m.) or time zones have little meaning without the date, as the offset can
AVG(expr) − the average value for the rows within the group as precise as microseconds and time zone (as shown below): To get the simplest interval, subtract one-time value from vary through the year because of daylight saving time. So, it's
MIN(expr) − the minimum value within the group another: best to work with the timestamp values.
2021-12-31 14:39:53.662522-05
MAX(expr) − the maximum value within the group
date time SELECT CAST('2021-12-31 23:59:59' AS When working with the type timestamp with time zone
To get the number of rows in the table: timestamp
timestamp) CAST( 2021-06-01 12:00:00' AS (abbr. timestamptz), you can type in the value in your local
SELECT COUNT( ) timestamp) time zone, and it'll get converted to the UTC time zone as it is
YYYY-mm-dd HH:MM:SS.ssssss±TZ -- result: 213 days 11:59:59 inserted into the table. Later when you select from the table it
FROM city;
4:39:53.662522-05 is almost 2:40 p.m. CDT (e.g., in gets converted back to your local time zone. This is immune to
To get the number of non-NULL values in a column: time zone changes.
SELECT COUNT(rating) Chicago; in UTC it'd be 7:40 p.m.). The letters in the above To define an interval: INTERVAL '1' DAY
ROM city; example represent: This syntax consists of three elements: the INTERVAL keyword, a
In the date part: In the time part: quoted value, and a time part keyword (in a singular form.) You AT TIME ZONE
To get the count of unique values in a column: canuse the following time parts: YEAR MONTH WEEK DAY To operate between different time zones, use the AT TIME
YYYY – the 4-digit HH – the zero-padded hour in a 24-
SELECT COUNT(DISTINCT country_id) HOUR MINUTE, and SECOND. In MySQL, omit the quotes. You ZONE keyword.
year. hour clock.
FROM city; can join many different INTERVALs using the + operator:
mm – the zero-padded MM – the minutes. If you use this format: {timestamp without time zone}
month (01—January SS – the seconds. Omissible INTERVAL '1' YEAR + INTERVAL '3' MONTH
GROUP BY AT TIME ZONE {time zone}, then the database will read
through 12 ssssss – the smaller parts of a the time stamp in the specified time zone and convert it to the
CITY December). second –can be expressed using In some databases, there's an easier way to get the above value. time zone local to the display. It returns the time in the
country_id dd – the zero-padded 1 to 6 digits. Omissible And it accepts plural forms! INTERVAL '1 year 3 formatted timestamp with the time zone
day. ±TZ – the timezone. It must start months
Paris 1 If you use this format: {timestamp with time zone} AT
CITY with either or, and use two There are two more syntaxes in the Standard SQL:
Marseille 1 TIME ZONE {time zone}, then the database will convert the
digits relative to UTC. Omissible
country_id count Syntax What it does time in one time zone to the target time zone specified by AT
Lyon 1
1 3 TIME ZONE. It returns the time in the formatted timestamp
INTERVAL 'x-y' YEAR TO INTERVAL 'x year y
Berlin 2 What time is it? without the time zone, in the target time zone.
2 3 MONTH month
To answer that question in SQL, you can use:
Hamburg 2 You can define the time zone with popular shortcuts like UTC
4 2 CURRENT_TIME – to find what time it is. INTERVAL 'x-y' DAY TO INTERVAL 'x day y
Munich 2 MST, or GMT, or by continent/city such as
CURRENT_DATE – to get today's date. (GETDATE() in SQL SECOND second
America/New_York Europe/London, and Asia/Tokyo
Warsaw 4 Server.)
CURRENT_TIMESTAMP – to get the timestamp with the two In MySQL, write year_month instead of YEAR TO MONTH and
Cracow 4 day_second instead of DAY TO SECOND
above. Examples
The example above – the count of cities in each country:
We set the local time zone to 'America/New_York'
SELECT name, COUNT(country_id) To get the last day of a month, add one month and subtract one
FROM city Creating values SELECT TIMESTAMP '2021-07-16 21:00:00' AT
day:
GROUP BY name; To create a date time or timestamp, simply write the value as TIME ZONE 'America/Los_Angeles';
a string and cast it to the proper type. SELECT CAST('2021-02-01' AS date
-- result: 2021-07-17 00:00:00-04
The average rating for the city: INTERVAL '1' MONTH
SELECT CAST( 2021-12-31' AS date)
SELECT city_id, AVG(rating) INTERVAL '1' DAY Here, the database takes a timestamp without a time zone and it's
SELECT CAST( 15:31' AS time)
ROM ratings told it's in Los Angeles time, which is then converted to the local
SELECT CAST( 2021-12-31 23:59:29+02' AS
GROUP BY city_id; time – New York for display. This answers the question "At what
timestamp) To get all events for the next three months from today:
time should I turn on the TV if the show starts at 9 PM in Los
Common mistake: COUNT(*) and LEFT JOIN SELECT CAST( 15:31.124769' AS time) SELECT event_date, event_name
Angeles?"
When you join the tables like this: client LEFT JOIN Be careful with the last example – it will be interpreted as 15 FROM calendar
project, and you want to get the number of projects for every minutes 31 seconds and 124769 microseconds! It is always a good WHERE event_date BETWEEN CURRENT_DATE AND
idea to write 00 explicitly for hours: '00:15:31.124769 CURRENT_DATE + INTERVAL '3' MONTH SELECT TIMESTAMP WITH TIME ZONE '2021-06-20
client you know, COUNT(*) will return 1 for each client even if
19:30:00' AT TIME ZONE 'Australia/Sydney';
you've never worked for them. This is because they're still
You might skip casting in simple conditions – the database will -- result: 2021-06-21 09:30:00
present in the list but with the NULL in the fields related to the To get part of the date:
project after the JOIN. To get the correct count (0 for the clients know what you mean. SELECT EXTRACT(YEAR FROM birthday) Here, the database gets a timestamp specified in the local time
you've never worked for), count the values in a column of the SELECT airline, flight_number, departure_time FROM artists; zone and converts it to the time in Sydney (note that it didn't
other table, e.g., COUNT(project_name). Check out this FROM airport_schedule One of the possible returned values: is 1946. In SQL Server, use the return a time zone.) This answers the question "What time is it in
exercise to see an example. WHERE departure_time < '12:00'; DATEPART(part, date) function. Sydney if it's 7:30 PM here?"

You might also like