SQL Interview Preparation Guide


A resource covering four common types of SQL problems

Priyanka Meena 1 day ago · 3 min read

Photo by Caspar Camille Rubin on Unsplash

The SQL round is often one of the deciding parts of an analytics interview,
whether the role is in data, product, or business analytics. Major tech firms
such as Amazon, Uber, and Facebook, to name a few, rely heavily on this round.

If you are preparing for one, revising every possible variation of SQL
question might seem a bit daunting. To help you in the process, here is a
guide with some sample questions and queries that you should practice before
sitting for your next SQL interview. I have prepared it based on my own
experience on both sides of the table.

SQL problems can be divided into 4 levels. As a part of this guide, we will go
through each of these levels with some standard examples that you can
practice along with (I would suggest attempting each one before reading the
solution).

Level I : Problems based on Aggregate Functions


SQL is excellent when it comes to aggregations. There are many functions
such as SUM(), AVG(), MAX(), MIN(), COUNT() etc. Knowledge of
aggregate functions is the basic level of competency that is expected from
an interviewee.

Consider the following employees table. Each row holds the details of one
employee: their department, salary (the amount column), and manager.

-- Table: employees
-- | dept_id | employee_id | amount | manager_id |
-- |---------|-------------|--------|------------|
-- | 1       | 1           | 8000   | 3          |
-- | 1       | 2           | 5000   | 3          |
-- | 1       | 3           | 10000  | null       |
-- | 2       | 4           | 15000  | null       |
-- | 2       | 5           | 16000  | 4          |
-- | 3       | 6           | 8000   | null       |

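To see the aggregate functions mentioned above at work on this table, here is a minimal sketch that loads the rows into an in-memory SQLite database through Python's sqlite3 module. The choice of SQLite and the output column aliases (headcount, with_manager, and so on) are assumptions made for illustration; they are not part of the original problem.

```python
import sqlite3

# In-memory copy of the employees table (SQLite is an assumption for this demo).
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE employees (
        dept_id INTEGER, employee_id INTEGER, amount INTEGER, manager_id INTEGER);
    INSERT INTO employees VALUES
        (1, 1, 8000, 3), (1, 2, 5000, 3), (1, 3, 10000, NULL),
        (2, 4, 15000, NULL), (2, 5, 16000, 4), (3, 6, 8000, NULL);
""")

# One output row per department; every column except dept_id is an aggregate.
# COUNT(*) counts rows, while COUNT(manager_id) skips NULL manager ids.
dept_summary = conn.execute("""
    SELECT dept_id,
           COUNT(*)          AS headcount,
           COUNT(manager_id) AS with_manager,
           SUM(amount)       AS total_amount,
           AVG(amount)       AS avg_amount,
           MAX(amount)       AS max_amount,
           MIN(amount)       AS min_amount
    FROM employees
    GROUP BY dept_id
    ORDER BY dept_id
""").fetchall()

for row in dept_summary:
    print(row)
```

Department 2, for example, comes back as (2, 2, 1, 31000, 15500.0, 16000, 15000): two employees, one of whom has a manager.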

Based on this table, write a SQL query to find the id of employees who earn
the highest amount in each department.

The best way to solve any problem is to map it into step-by-step logic.
First, understand what we are solving for: in this case, the highest amount
in each department. Then, figure out the output format; here we just need
employee_id.

-- Part 1: Get the highest salary in each department

SELECT dept_id, MAX(amount) AS salary
FROM employees
GROUP BY dept_id;

-- Part 2: Get the desired output format employee_id
-- Since employee_id cannot be directly used in the group by aggregation, we might have to resort t…

SELECT e1.employee_id
FROM employees AS e1
WHERE e1.amount = (
    -- highest amount within the same department as e1
    SELECT MAX(e2.amount)
    FROM employees AS e2
    WHERE e2.dept_id = e1.dept_id
);

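The solution can be checked against the sample data. The sketch below assumes an in-memory SQLite database and writes the subquery with a correlated WHERE, a common equivalent form; it also cross-checks the result with a join against the per-department maximum. The variable names are illustrative only.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE employees (
        dept_id INTEGER, employee_id INTEGER, amount INTEGER, manager_id INTEGER);
    INSERT INTO employees VALUES
        (1, 1, 8000, 3), (1, 2, 5000, 3), (1, 3, 10000, NULL),
        (2, 4, 15000, NULL), (2, 5, 16000, 4), (3, 6, 8000, NULL);
""")

# Correlated subquery: compare each employee with the maximum of their own department.
top_earners = [row[0] for row in conn.execute("""
    SELECT e1.employee_id
    FROM employees AS e1
    WHERE e1.amount = (
        SELECT MAX(e2.amount)
        FROM employees AS e2
        WHERE e2.dept_id = e1.dept_id)
    ORDER BY e1.employee_id
""")]

# Alternative: compute the maximum per department once, then join back to it.
top_earners_join = [row[0] for row in conn.execute("""
    SELECT e.employee_id
    FROM employees AS e
    JOIN (SELECT dept_id, MAX(amount) AS max_amount
          FROM employees
          GROUP BY dept_id) AS m
      ON m.dept_id = e.dept_id AND m.max_amount = e.amount
    ORDER BY e.employee_id
""")]

print(top_earners)       # [3, 5, 6]
print(top_earners_join)  # [3, 5, 6]
```

Both forms return every employee tied for the top amount in a department, which is usually what an interviewer expects unless told otherwise.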

Level II : Problems based on JOINs and SET operations


SQL lets its users combine results from two or more tables with the help of
joins and set operations. Some of the popular joins are inner join, left join,
right join and cross join, while the most commonly used set operators are
UNION, UNION ALL, EXCEPT and INTERSECT.
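As a rough illustration of how the set operators differ, the sketch below combines two single-column queries over the employees table from Level I: the ids that appear as someone's manager, and the ids of employees earning at least 10000. The threshold, the helper function, and the use of an in-memory SQLite database are assumptions chosen for the demo.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE employees (
        dept_id INTEGER, employee_id INTEGER, amount INTEGER, manager_id INTEGER);
    INSERT INTO employees VALUES
        (1, 1, 8000, 3), (1, 2, 5000, 3), (1, 3, 10000, NULL),
        (2, 4, 15000, NULL), (2, 5, 16000, 4), (3, 6, 8000, NULL);
""")

MANAGERS = "SELECT manager_id FROM employees WHERE manager_id IS NOT NULL"  # 3, 3, 4
HIGH_EARNERS = "SELECT employee_id FROM employees WHERE amount >= 10000"    # 3, 4, 5


def combine(left, operator, right):
    """Run `left <operator> right` and return the single column as a sorted list."""
    rows = conn.execute(f"{left} {operator} {right}").fetchall()
    return sorted(row[0] for row in rows)


union_rows = combine(MANAGERS, "UNION", HIGH_EARNERS)          # duplicates removed
union_all_rows = combine(MANAGERS, "UNION ALL", HIGH_EARNERS)  # duplicates kept
intersect_rows = combine(MANAGERS, "INTERSECT", HIGH_EARNERS)  # in both results
except_rows = combine(HIGH_EARNERS, "EXCEPT", MANAGERS)        # in first, not in second

print(union_rows)      # [3, 4, 5]
print(union_all_rows)  # [3, 3, 3, 4, 4, 5]
print(intersect_rows)  # [3, 4]
print(except_rows)     # [5]
```

Note that EXCEPT is not symmetric: swapping the two queries would return the managers who are not high earners, which is an empty list for this data.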

Consider the above-mentioned employees table. Write a SQL query to find
the employees who earn more than their manager.

-- Part 1: Bring the manager's salary alongside the employee's salary using a self join

SELECT e1.employee_id, e1.amount AS employee_amount, e2.amount AS manager_amount
FROM employees AS e1
LEFT JOIN employees AS e2 ON e1.manager_id = e2.employee_id;

-- Part 2: Keep only the employees who earn more than their manager
-- The comparison goes in WHERE: inside the ON clause of a LEFT JOIN it would not remove any employee

SELECT e1.employee_id
FROM employees AS e1
JOIN employees AS e2 ON e1.manager_id = e2.employee_id
WHERE e1.amount > e2.amount;

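Where the salary comparison is placed matters in this problem. A LEFT JOIN keeps every row of the left table even when the ON condition fails, so a comparison written in the ON clause does not filter anything out; it has to go in WHERE (or the join has to be an inner join). The sketch below, which assumes an in-memory SQLite copy of the employees table, runs both variants side by side.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE employees (
        dept_id INTEGER, employee_id INTEGER, amount INTEGER, manager_id INTEGER);
    INSERT INTO employees VALUES
        (1, 1, 8000, 3), (1, 2, 5000, 3), (1, 3, 10000, NULL),
        (2, 4, 15000, NULL), (2, 5, 16000, 4), (3, 6, 8000, NULL);
""")

# Comparison inside ON of a LEFT JOIN: unmatched employees are still returned.
on_clause_filter = [row[0] for row in conn.execute("""
    SELECT e1.employee_id
    FROM employees AS e1
    LEFT JOIN employees AS e2
           ON e1.manager_id = e2.employee_id
          AND e1.amount > e2.amount
    ORDER BY e1.employee_id
""")]

# Comparison in WHERE after an inner self join: only true matches survive.
where_filter = [row[0] for row in conn.execute("""
    SELECT e1.employee_id
    FROM employees AS e1
    JOIN employees AS e2 ON e1.manager_id = e2.employee_id
    WHERE e1.amount > e2.amount
    ORDER BY e1.employee_id
""")]

print(on_clause_filter)  # [1, 2, 3, 4, 5, 6]
print(where_filter)      # [5]
```

Only employee 5 (16000) earns more than their manager, employee 4 (15000).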

Level III : Problems based on Window Functions


Window functions, also known as analytic functions, are among the most
powerful features that SQL provides. Some of the popular ones are RANK(),
DENSE_RANK(), LEAD() and LAG().
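A quick way to get a feel for these functions is to run them side by side on the employees table from Level I. The sketch below assumes an in-memory SQLite database (window functions need SQLite 3.25 or newer) and ranks all six employees by amount; the column aliases are illustrative.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE employees (
        dept_id INTEGER, employee_id INTEGER, amount INTEGER, manager_id INTEGER);
    INSERT INTO employees VALUES
        (1, 1, 8000, 3), (1, 2, 5000, 3), (1, 3, 10000, NULL),
        (2, 4, 15000, NULL), (2, 5, 16000, 4), (3, 6, 8000, NULL);
""")

# RANK leaves a gap after ties, DENSE_RANK does not.
# LAG and LEAD read the previous and next row in the window ordering.
ranked = conn.execute("""
    SELECT employee_id,
           amount,
           RANK()       OVER (ORDER BY amount DESC)              AS rnk,
           DENSE_RANK() OVER (ORDER BY amount DESC)              AS dense_rnk,
           LAG(amount)  OVER (ORDER BY amount DESC, employee_id) AS prev_amount,
           LEAD(amount) OVER (ORDER BY amount DESC, employee_id) AS next_amount
    FROM employees
    ORDER BY amount DESC, employee_id
""").fetchall()

for row in ranked:
    print(row)
```

Employees 1 and 6 both earn 8000, so both get rank 4. The next employee (5000) is ranked 6 by RANK() but 5 by DENSE_RANK().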

Let’s go back to the first problem. We used a subquery to find the employees
who earn the highest salary in each department. We can do it easily using a
window function as well. Try without looking at the solution.

-- Part 1: Rank the employee_ids by highest salary for each department using DENSE_RANK()

SELECT employee_id,
       DENSE_RANK() OVER (PARTITION BY dept_id ORDER BY amount DESC) AS rnk
FROM employees;

-- Part 2: Filter the rows where rnk = 1

SELECT employee_id
FROM
    (SELECT employee_id,
            DENSE_RANK() OVER (PARTITION BY dept_id ORDER BY amount DESC) AS rnk
     FROM employees) a
WHERE rnk = 1;

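The choice of ranking function matters once two employees tie for the top amount. The sketch below runs the DENSE_RANK() solution on an in-memory SQLite copy of the table, then adds a hypothetical employee 7 in department 3 (not part of the original data) who also earns 8000, and compares the result with a ROW_NUMBER() variant.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE employees (
        dept_id INTEGER, employee_id INTEGER, amount INTEGER, manager_id INTEGER);
    INSERT INTO employees VALUES
        (1, 1, 8000, 3), (1, 2, 5000, 3), (1, 3, 10000, NULL),
        (2, 4, 15000, NULL), (2, 5, 16000, 4), (3, 6, 8000, NULL);
""")

TOP_BY_DENSE_RANK = """
    SELECT employee_id
    FROM (SELECT employee_id,
                 DENSE_RANK() OVER (PARTITION BY dept_id ORDER BY amount DESC) AS rnk
          FROM employees) AS a
    WHERE rnk = 1
    ORDER BY employee_id
"""

# ROW_NUMBER() gives every row a distinct number, so only one row per department is 1.
TOP_BY_ROW_NUMBER = """
    SELECT employee_id
    FROM (SELECT employee_id,
                 ROW_NUMBER() OVER (PARTITION BY dept_id
                                    ORDER BY amount DESC, employee_id) AS rn
          FROM employees) AS a
    WHERE rn = 1
    ORDER BY employee_id
"""


def ids(sql):
    return [row[0] for row in conn.execute(sql)]


original_result = ids(TOP_BY_DENSE_RANK)  # [3, 5, 6]

# Hypothetical extra row: employee 7 ties with employee 6 in department 3.
conn.execute("INSERT INTO employees VALUES (3, 7, 8000, NULL)")

with_ties_dense = ids(TOP_BY_DENSE_RANK)       # [3, 5, 6, 7]
with_ties_row_number = ids(TOP_BY_ROW_NUMBER)  # [3, 5, 6]

print(original_result, with_ties_dense, with_ties_row_number)
```

DENSE_RANK() (or RANK()) keeps both tied employees, while ROW_NUMBER() silently picks one of them, so it is worth asking the interviewer how ties should be handled.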

Level IV : Problems based on a combination of the above-mentioned levels

Sometimes you will come across problems that might seem a bit difficult at
first. The best strategy for solving such problems is to follow a stepwise,
logical approach: break the problem down into smaller problems. As the saying
goes, practice makes perfect. The more such problems you solve, the better
you become at breaking them down logically and finally solving them. Try a
few on LeetCode.

Consider the following attendance table. Each row in the table contains the
employee_id and the date on which they visited the office. A second table,
employees, maps each employee_id to a name.

Write a SQL query to find the longest streak of consecutive days in the office
for each employee. The output should contain the employee’s name and their
longest streak.

-- Table: attendance
-- | employee_id | attend_dt   |
-- |-------------|-------------|
-- | 1           | 2022-01-01  |
-- | 1           | 2022-01-02  |
-- | 1           | 2022-01-05  |
-- | 2           | 2022-01-01  |
-- | 2           | 2022-01-02  |
-- | 2           | 2022-01-04  |
-- | 2           | 2022-01-05  |
-- | 2           | 2022-01-06  |
-- | 3           | 2022-01-02  |
-- | 3           | 2022-01-04  |

-- Table: employees
-- | employee_id | name        |
-- |-------------|-------------|
-- | 1           | samuel      |
-- | 2           | karthik     |
-- | 3           | casey       |


-- Part 1: Give an id to each row in the table

SELECT *, ROW_NUMBER() OVER (PARTITION BY employee_id ORDER BY attend_dt ASC) AS rn
FROM attendance;

-- Part 2: Find the day from the date field and find the difference between day and rn
-- This will help us create groups of continuous streaks
-- (DAY() as in MySQL or SQL Server; day - rn only works while all dates fall in the same month)

SELECT *, DAY(attend_dt) - rn AS group_name
FROM
    (SELECT *, ROW_NUMBER() OVER (PARTITION BY employee_id ORDER BY attend_dt ASC) AS rn
     FROM attendance) a;

-- Our table will look something like this now
-- | rn | employee_id | attend_dt  | day | group_name (day - rn) |
-- |----|-------------|------------|-----|-----------------------|
-- | 1  | 1           | 2022-01-01 | 1   | 0                     |
-- | 2  | 1           | 2022-01-02 | 2   | 0                     |
-- | 3  | 1           | 2022-01-05 | 5   | 2                     |
-- | 1  | 2           | 2022-01-01 | 1   | 0                     |
-- | 2  | 2           | 2022-01-02 | 2   | 0                     |
-- | 3  | 2           | 2022-01-04 | 4   | 1                     |
-- | 4  | 2           | 2022-01-05 | 5   | 1                     |
-- | 5  | 2           | 2022-01-06 | 6   | 1                     |
-- | 1  | 3           | 2022-01-02 | 2   | 1                     |
-- | 2  | 3           | 2022-01-04 | 4   | 2                     |

-- Part 3: Find the count for each group_name and each employee

SELECT employee_id, group_name, COUNT(*) AS streak
FROM
    (SELECT *, (DAY(attend_dt) - rn) AS group_name
     FROM
        (SELECT *, ROW_NUMBER() OVER (PARTITION BY employee_id ORDER BY attend_dt ASC) AS rn
         FROM attendance) a ) b
GROUP BY employee_id, group_name;

-- Part 4: Find the longest streak for each employee

SELECT employee_id, MAX(streak) AS longest_streak
FROM
    (SELECT employee_id, group_name, COUNT(*) AS streak
     FROM
        (SELECT *, (DAY(attend_dt) - rn) AS group_name
         FROM
            (SELECT *, ROW_NUMBER() OVER (PARTITION BY employee_id ORDER BY attend_dt ASC) AS rn
             FROM attendance) a ) b
     GROUP BY employee_id, group_name ) c
GROUP BY employee_id;

-- Part 5: Arrange the data in the desired output format

SELECT e.name, d.longest_streak
FROM
    (SELECT employee_id, MAX(streak) AS longest_streak
     FROM
        (SELECT employee_id, group_name, COUNT(*) AS streak
         FROM
            (SELECT *, (DAY(attend_dt) - rn) AS group_name
             FROM
                (SELECT *, ROW_NUMBER() OVER (PARTITION BY employee_id ORDER BY attend_dt ASC) AS rn
                 FROM attendance) a ) b
         GROUP BY employee_id, group_name ) c
     GROUP BY employee_id ) d
JOIN employees e ON d.employee_id = e.employee_id;

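The same gaps-and-islands idea can be run end to end on the sample data. The sketch below is one possible variant, not the article's exact query: it assumes an in-memory SQLite database, uses common table expressions instead of nested subqueries, and subtracts rn days from the date itself (SQLite's date() function with a day modifier) instead of using the day of the month, so the grouping key stays valid across month boundaries. The CTE names are illustrative.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE attendance (employee_id INTEGER, attend_dt TEXT);
    CREATE TABLE employees (employee_id INTEGER, name TEXT);
    INSERT INTO attendance VALUES
        (1, '2022-01-01'), (1, '2022-01-02'), (1, '2022-01-05'),
        (2, '2022-01-01'), (2, '2022-01-02'), (2, '2022-01-04'),
        (2, '2022-01-05'), (2, '2022-01-06'),
        (3, '2022-01-02'), (3, '2022-01-04');
    INSERT INTO employees VALUES (1, 'samuel'), (2, 'karthik'), (3, 'casey');
""")

longest_streaks = conn.execute("""
    WITH numbered AS (
        -- Part 1: number each employee's visits in date order
        SELECT employee_id, attend_dt,
               ROW_NUMBER() OVER (PARTITION BY employee_id ORDER BY attend_dt) AS rn
        FROM attendance
    ),
    grouped AS (
        -- Part 2: consecutive dates minus their row number land on the same date
        SELECT employee_id,
               date(attend_dt, '-' || rn || ' days') AS group_name
        FROM numbered
    ),
    streaks AS (
        -- Part 3: the size of each group is the length of one streak
        SELECT employee_id, group_name, COUNT(*) AS streak
        FROM grouped
        GROUP BY employee_id, group_name
    )
    -- Parts 4 and 5: longest streak per employee, reported by name
    SELECT e.name, MAX(s.streak) AS longest_streak
    FROM streaks AS s
    JOIN employees AS e ON e.employee_id = s.employee_id
    GROUP BY e.employee_id, e.name
    ORDER BY e.employee_id
""").fetchall()

print(longest_streaks)  # [('samuel', 2), ('karthik', 3), ('casey', 1)]
```

Writing each part as its own CTE keeps the stepwise breakdown visible in the final query, which also makes it easier to talk through in an interview.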
It’s a Wrap!
In this article, we have discussed some standard SQL problems that you
might come across in your next SQL tech interview. These problems are
based on my personal experience. Share more such problems that you have
faced in the comments below; it will help the community ace these interviews
together.
