CAPSTONE PROJECT

GRADED PROJECT

NAME: PRIYANKA D S

Problem Statement:
This project uses store-wise sales and inventory datasets for a retail store in
the United States of America. The objective is to analyse sales at various
levels (such as product, store, city and state) and to answer a few key
questions that will help in pricing and product placement decisions.
Domain: Retail Analytics
Data Description: The data contains three tables related to orders placed in a
superstore.
• OrderDetails: The table contains 5000 different Order IDs, the order
date, Property ID, Product ID and the quantities.
• Products: The table contains 94 different products, the category they
belong to and the price at which they are sold.
• PropertyInfo: The table contains 20 cities where the stores are based,
along with the state names.

1. What is the maximum quantity of any order ID in the data? Also,
determine the number of orders placed which have this maximum
quantity. (2 marks)

-- Maximum quantity in any single order
select max(quantity) as maximum_quantity
from tr_orderdetails;

-- Number of orders placed with that maximum quantity
select count(*) as maximum_order
from tr_orderdetails
where quantity = (select max(quantity) from tr_orderdetails);
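
The idea of finding a maximum and then counting the rows that reach it can be tried on a small scale. The sketch below is only an illustration: it uses Python's built-in sqlite3 module rather than the database used for the project, and the four rows are made-up sample data, not values from tr_orderdetails.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE tr_orderdetails (OrderID INTEGER, ProductID INTEGER, Quantity INTEGER)"
)
# Hypothetical sample rows, not taken from the project data.
conn.executemany(
    "INSERT INTO tr_orderdetails VALUES (?, ?, ?)",
    [(1, 10, 2), (2, 11, 5), (3, 10, 5), (4, 12, 1)],
)

# Step 1: the largest quantity found in any order.
max_quantity = conn.execute(
    "SELECT MAX(Quantity) FROM tr_orderdetails"
).fetchone()[0]

# Step 2: how many orders carry exactly that quantity (subquery in WHERE).
orders_at_max = conn.execute(
    "SELECT COUNT(*) FROM tr_orderdetails "
    "WHERE Quantity = (SELECT MAX(Quantity) FROM tr_orderdetails)"
).fetchone()[0]

print(max_quantity, orders_at_max)  # 5 2
```

The subquery keeps the count correct even when the maximum changes, because the value is never hard-coded.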

2. Find the number of unique products that are sold.

select count(distinct ProductName) as unique_product
from tr_products;

3. List the different types of "Chair" that are sold by using product
table (Hint: TR_Products) (2 marks)

select distinct(productName)
from tr_products
where productName like '%chair%';

4. What is the average price of each of the chairs listed in the output of
the previous question? (2 marks)

select productName, avg(price) as average_price
from tr_products
where productName like '%chair%'
group by productName;

5. Find the details of the Properties where the state names are more
than 10 characters in length? (2 marks)

select *
from tr_propertyinfo
where length(propertystate)>10;

6. Find the details of the Properties where the second character of the
city name is "e". (2 marks)

select *
from tr_propertyinfo
where propertycity like '_e%';
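
The two filters used in questions 5 and 6 (a length test and a LIKE pattern in which "_" stands for exactly one character and "%" for any run of characters) can be checked with a small sketch. It uses Python's built-in sqlite3 module as a stand-in database, and the five rows are hypothetical, not the contents of tr_propertyinfo.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE tr_propertyinfo (PropertyID INTEGER, PropertyCity TEXT, PropertyState TEXT)"
)
# Hypothetical sample rows used only to illustrate the filters.
conn.executemany(
    "INSERT INTO tr_propertyinfo VALUES (?, ?, ?)",
    [(1, "New York", "New York"), (2, "Seattle", "Washington"),
     (3, "Boston", "Massachusetts"), (4, "Denver", "Colorado"),
     (5, "Charlotte", "North Carolina")],
)

# State names longer than 10 characters ("Washington" has exactly 10, so it is excluded).
long_state_cities = [row[0] for row in conn.execute(
    "SELECT PropertyCity FROM tr_propertyinfo "
    "WHERE LENGTH(PropertyState) > 10 ORDER BY PropertyCity"
)]

# '_e%': any single first character, then 'e', then anything.
second_char_e = [row[0] for row in conn.execute(
    "SELECT PropertyCity FROM tr_propertyinfo "
    "WHERE PropertyCity LIKE '_e%' ORDER BY PropertyCity"
)]

print(long_state_cities)  # ['Boston', 'Charlotte']
print(second_char_e)      # ['Denver', 'New York', 'Seattle']
```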

7. Find the minimum and maximum prices for products in the “Office
Supplies” category (2 marks)

select min(price) as minimum_price, max(price) as maximum_price
from tr_products
where productcategory = 'Office Supplies';

8. What is the purpose of using GROUP BY in SQL? (Hint: This is a
theoretical question and needs to be explained with a clear example
other than the application given in this project) (2 marks)

GROUP BY

The GROUP BY clause is used to obtain summary data for one or more groups of
rows. One or more columns may be used to form the groups.
For instance, a GROUP BY query can calculate the total salary for each
department or count the number of employees in each department.
• The groups of records are created using the GROUP BY clause.
• If a WHERE clause is present, the GROUP BY clause must come after it, and
any HAVING clause must come after the GROUP BY clause.
• One or more columns may be included in the GROUP BY clause to
create one or more groups depending on those columns.
• The SELECT list may contain only the columns named in the GROUP BY
clause and aggregate functions.

Syntax:
SELECT column1, column2,...columnN FROM table_name
[WHERE]
[GROUP BY column1, column2...columnN]
[HAVING]
[ORDER BY]

Example:

Employee Table
EmpId  FirstName  LastName  Email              Salary  DeptId
1      John       King      john.king@abc.com  33000   1
2      James      Bond                                 1
3      Neena      Kochhar   neena@test.com     17000   2
4      Lex        De Haan   lex@test.com       15000   1
5      Amit       Patel                        18000   1
6      Abdul      Kalam     abdul@test.com     25000   2

Department Table
DeptId  Name
1       Finance
2       HR

Query and result

select DeptId, count(EmpId)
from employee
group by DeptId;

DeptId  count(EmpId)
1       4
2       2
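
The employee example can be run end to end as a quick check. This is a minimal sketch using Python's built-in sqlite3 module; the rows follow the Employee table above, and the one missing salary is stored as NULL, which is an assumption about what the blank cell means.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE employee (EmpId INTEGER, FirstName TEXT, Salary INTEGER, DeptId INTEGER)"
)
# The blank salary in the table is stored as NULL here (an assumption).
conn.executemany(
    "INSERT INTO employee VALUES (?, ?, ?, ?)",
    [(1, "John", 33000, 1), (2, "James", None, 1), (3, "Neena", 17000, 2),
     (4, "Lex", 15000, 1), (5, "Amit", 18000, 1), (6, "Abdul", 25000, 2)],
)

# One output row per department: head count and total salary.
dept_summary = conn.execute(
    "SELECT DeptId, COUNT(EmpId), SUM(Salary) "
    "FROM employee GROUP BY DeptId ORDER BY DeptId"
).fetchall()

print(dept_summary)  # [(1, 4, 66000), (2, 2, 42000)]
```

COUNT(EmpId) counts every employee, while SUM(Salary) skips the NULL salary, so the two aggregates can be based on different numbers of rows within the same group.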

9. List the different states in which sales are made and count how many
orders are there in each of the states? (Hint: Consider order details
as the primary table) (2 marks)

select b.propertystate, count(a.orderid) as orders_count
from tr_orderdetails a
left join tr_propertyinfo b on a.propertyid = b.propertyid
group by b.propertystate;

10. Find the average price of items sold in each Product Category and
sort it in a decreasing order. (2 marks)

select productcategory, avg(price) as average_price
from tr_products
group by productcategory
order by average_price desc;

11. Find the Product Category that sells the least number of products?
Something for the management to focus on. (2 marks)
select productcategory,count(productid)
from tr_products
group by productcategory
order by count(productid)
limit 1;

12. What is the difference between a WHERE and a HAVING clause in
SQL? (Hint: This is a theoretical question and needs to be explained
with a clear example other than the application given in this
project) (2 marks)

WHERE clause

The WHERE clause filters individual records from a table, or from the result
of joining several tables, before any grouping takes place.
Only the records that meet the condition stated in the WHERE clause are
returned. It can be used with SELECT, UPDATE and DELETE statements.
Employee Table
EmpId  FirstName  LastName  Email              Salary  DeptId
1      John       King      john.king@abc.com  33000   1
2      James      Bond                                 1
3      Neena      Kochhar   neena@test.com     17000   2
4      Lex        De Haan   lex@test.com       15000   1
5      Amit       Patel                        18000   1
6      Abdul      Kalam     abdul@test.com     25000   2

Department Table
DeptId  Name
1       Finance
2       HR

Select FirstName, LastName, Salary
From employee
Where Salary > 25000;

FirstName  LastName  Salary
John       King      33000

HAVING clause
The HAVING clause filters the groups produced by GROUP BY according to the
specified condition, which usually involves an aggregate function.
Only the groups that meet the condition are included in the result. The HAVING
clause can only be used with a SELECT statement.
Example:
Select DeptId, avg(Salary)
From employee
Group by DeptId
Having avg(Salary) > 20000;

DeptId  avg(Salary)
2       21000

Difference between WHERE and HAVING clause

1) The WHERE clause filters individual records of a table according to the
given condition.
The HAVING clause filters groups of records according to the given condition.
2) The WHERE clause can be used without the GROUP BY clause.
The HAVING clause is normally used together with GROUP BY; without it, the
whole result is treated as a single group.
3) The WHERE clause is applied to rows before they are grouped.
The HAVING clause is applied to groups after aggregation.
4) An aggregate function cannot be used in the WHERE clause.
An aggregate function can be used in the HAVING clause.
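
The order in which the two clauses act can be seen on a small table. The sketch below uses Python's built-in sqlite3 module; the sales table, its columns and its six rows are hypothetical and exist only for this illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sales (Region TEXT, Amount INTEGER)")
# Hypothetical sample rows.
conn.executemany(
    "INSERT INTO sales VALUES (?, ?)",
    [("North", 300), ("North", 250), ("South", 80),
     ("South", 120), ("East", 600), ("East", 50)],
)

# WHERE acts on single rows, before any grouping happens.
where_rows = conn.execute(
    "SELECT Region, Amount FROM sales WHERE Amount >= 100 ORDER BY Region, Amount"
).fetchall()

# HAVING acts on whole groups, after SUM() has been computed.
having_groups = conn.execute(
    "SELECT Region, SUM(Amount) FROM sales "
    "GROUP BY Region HAVING SUM(Amount) > 500 ORDER BY Region"
).fetchall()

# Both together: small rows are dropped first, then small groups.
combined = conn.execute(
    "SELECT Region, SUM(Amount) FROM sales WHERE Amount >= 100 "
    "GROUP BY Region HAVING SUM(Amount) > 500 ORDER BY Region"
).fetchall()

print(where_rows)     # [('East', 600), ('North', 250), ('North', 300), ('South', 120)]
print(having_groups)  # [('East', 650), ('North', 550)]
print(combined)       # [('East', 600), ('North', 550)]
```

The total for East changes from 650 to 600 in the combined query because the 50 row is removed by WHERE before the sum is taken.
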
13. Select the Product categories where the average price is more than 25
(2 marks)

select productcategory, avg(price) as average_price
from tr_products
group by productcategory
having avg(price) > 25
order by average_price;

14. Find the top 5 product IDs that sold the maximum quantities? (2
marks)

select productid, sum(quantity) as total_quantity
from tr_orderdetails
group by productid
order by total_quantity desc
limit 5;
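
A "top N" result needs three steps: total the quantity per product, sort the totals in descending order, and only then cut the list off. The sketch below shows this with Python's built-in sqlite3 module on made-up rows (hypothetical data, top 2 instead of top 5 to keep it short).

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE tr_orderdetails (OrderID INTEGER, ProductID INTEGER, Quantity INTEGER)")
# Hypothetical sample rows, not taken from the project data.
conn.executemany(
    "INSERT INTO tr_orderdetails VALUES (?, ?, ?)",
    [(1, 10, 2), (2, 11, 5), (3, 10, 4), (4, 12, 1), (5, 11, 3)],
)

# Total per product, largest total first, keep the first two.
top_products = conn.execute(
    "SELECT ProductID, SUM(Quantity) AS total_quantity "
    "FROM tr_orderdetails "
    "GROUP BY ProductID "
    "ORDER BY total_quantity DESC "
    "LIMIT 2"
).fetchall()

print(top_products)  # [(11, 8), (10, 6)]
```

Without the ORDER BY, LIMIT would return an arbitrary subset of products rather than the best sellers.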

15. For the above question, print the product names instead of Product
IDs. (2 marks)

select b.productname, sum(a.quantity) as total_quantity
from tr_orderdetails a
left join tr_products b on a.productid = b.productid
group by b.productname
order by total_quantity desc
limit 5;

16. Mention the different types of joins in SQL? Give simple examples of
each. Also represent them using Venn diagrams (Hint: This is a
theoretical question, the explanation needs to be in detail along with
an example other than the one given in this project) (2 marks)

The SQL JOIN statement combines rows from two or more tables based on a
shared field.
The main join types are:
• INNER JOIN
• LEFT JOIN
• RIGHT JOIN
• FULL JOIN

Employee table
EmpID  EmpFname  EmpLname  Age  EmailID          PhoneNo     Address
1      Vardhan   Kumar     22   vardy@abc.com    9876543210  Delhi
2      Himani    Sharma    32   himani@abc.com   9977554422  Mumbai
3      Aayushi   Shreshth  24   aayushi@abc.com  9977555121  Kolkata
4      Hemanth   Sharma    25   hemanth@abc.com  9876545666  Bengaluru
5      Swatee    Kapoor    26   swatee@abc.com   9544567777  Hyderabad

Projects table
ProjectID  EmpID  ClientID  ProjectName  ProjectStartDate
111        1      3         Project1     2019-04-21
222        2      1         Project2     2019-02-12
333        3      5         Project3     2019-01-10
444        3      2         Project4     2019-04-16
555        5      4         Project5     2019-05-23
666        9      1         Project6     2019-01-12
777        7      2         Project7     2019-07-25
888        8      3         Project8     2019-08-20

Client table
ClientID  ClientFname  ClientLname  Age  ClientEmailID    PhoneNo     Address    EmpID
1         Susan        Smith        30   susan@adn.com    9765411231  Kolkata    3
2         Mois         Ali          27   mois@jsq.com     9876543561  Kolkata    3
3         Soma         Paul         22   soma@wja.com     9966332211  Delhi      1
4         Zainab       Daginawala   40   zainab@qkq.com   9955884422  Hyderabad  5
5         Bhaskar      Reddy        32   bhaskar@xyz.com  9636963269  Mumbai     2

INNER JOIN

The INNER JOIN keyword returns only the rows for which the join condition is
met in both tables. Rows of either table that have no matching value in the
common field are left out of the result set.
SYNTAX
SELECT table1.column1,table1.column2,table2.column1,....
FROM table1
INNER JOIN table2
ON table1.matching_column = table2.matching_column;

Fig : Venn diagram of inner join

Example:
SELECT Employee.EmpID, Employee.EmpFname, Employee.EmpLname,
Projects.ProjectID, Projects.ProjectName
FROM Employee
INNER JOIN Projects ON Employee.EmpID=Projects.EmpID;
EmpID EmpFname EmpLname ProjectID ProjectName
1 Vardhan Kumar 111 Project1
2 Himani Sharma 222 Project2
3 Aayushi Shreshth 333 Project3
3 Aayushi Shreshth 444 Project4
5 Swatee Kapoor 555 Project5

LEFT JOIN

This join returns all the rows of the table on the left side of the join and matches
rows for the table on the right side of the join. The result-set will include null
for all rows for which there is no matching row on the right side.
LEFT OUTER JOIN is another name for LEFT JOIN.
SYNTAX
SELECT table1.column1,table1.column2,table2.column1,....
FROM table1
LEFT JOIN table2
ON table1.matching_column = table2.matching_column;

Fig : Venn diagram of left join


Example:
SELECT Employee.EmpFname, Employee.EmpLname, Projects.ProjectID,
Projects.ProjectName
FROM Employee
LEFT JOIN Projects
ON Employee.EmpID = Projects.EmpID;
EmpFname EmpLname ProjectID ProjectName
Vardhan Kumar 111 Project1
Himani Sharma 222 Project2
Aayushi Shreshth 333 Project3
Aayushi Shreshth 444 Project4
Swatee Kapoor 555 Project5
Hemanth Sharma NULL NULL

RIGHT JOIN
This join returns all the rows of the table on the right side of the join and
matches rows for the table on the left side of the join. The result-set will include
null for all rows for which there is no matching row on the left side.
RIGHT OUTER JOIN is another name for RIGHT JOIN.
SYNTAX
SELECT table1.column1,table1.column2,table2.column1,....
FROM table1
RIGHT JOIN table2
ON table1.matching_column = table2.matching_column;

Fig : Venn diagram of right join


Example:
SELECT Employee.EmpFname, Employee.EmpLname, Projects.ProjectID,
Projects.ProjectName
FROM Employee
RIGHT JOIN Projects
ON Employee.EmpID = Projects.EmpID;
EmpFname EmpLname ProjectID ProjectName
Vardhan Kumar 111 Project1
Himani Sharma 222 Project2
Aayushi Shreshth 333 Project3
Aayushi Shreshth 444 Project4
Swatee Kapoor 555 Project5
NULL NULL 666 Project6
NULL NULL 777 Project7
NULL NULL 888 Project8

FULL JOIN
FULL JOIN creates the result-set by combining results of both LEFT JOIN and
RIGHT JOIN. The rows from both tables are all included in the result-set.
The result-set will contain NULL values for the rows where there was no match.
SYNTAX
SELECT table1.column1,table1.column2,table2.column1,....
FROM table1
FULL JOIN table2
ON table1.matching_column = table2.matching_column;

Fig : Venn diagram of full join


Example:
SELECT Employee.EmpFname, Employee.EmpLname, Projects.ProjectID
FROM Employee
FULL JOIN Projects
ON Employee.EmpID = Projects.EmpID;
EmpFname EmpLname ProjectID
Vardhan Kumar 111
Himani Sharma 222
Aayushi Shreshth 333
Aayushi Shreshth 444
Hemanth Sharma NULL
Swatee Kapoor 555
NULL NULL 666
NULL NULL 777
NULL NULL 888
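
The row counts of the four join results above can be reproduced with a short sketch. It uses Python's built-in sqlite3 module and the EmpID, EmpFname and ProjectID values from the Employee and Projects tables. Because older SQLite versions lack RIGHT JOIN and FULL JOIN, those two are emulated here (a LEFT JOIN with the tables swapped, and a LEFT JOIN plus the unmatched Projects rows); that emulation is a choice made for this sketch, not part of the queries shown above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Employee (EmpID INTEGER, EmpFname TEXT);
CREATE TABLE Projects (ProjectID INTEGER, EmpID INTEGER);
INSERT INTO Employee VALUES (1,'Vardhan'),(2,'Himani'),(3,'Aayushi'),(4,'Hemanth'),(5,'Swatee');
INSERT INTO Projects VALUES (111,1),(222,2),(333,3),(444,3),(555,5),(666,9),(777,7),(888,8);
""")

# INNER JOIN: only employees that have a project and projects that have an employee.
inner_rows = conn.execute(
    "SELECT e.EmpFname, p.ProjectID FROM Employee e "
    "INNER JOIN Projects p ON e.EmpID = p.EmpID"
).fetchall()

# LEFT JOIN: every employee, with NULL (None) where no project matches.
left_rows = conn.execute(
    "SELECT e.EmpFname, p.ProjectID FROM Employee e "
    "LEFT JOIN Projects p ON e.EmpID = p.EmpID"
).fetchall()

# RIGHT JOIN emulated: every project, with NULL where no employee matches.
right_rows = conn.execute(
    "SELECT e.EmpFname, p.ProjectID FROM Projects p "
    "LEFT JOIN Employee e ON e.EmpID = p.EmpID"
).fetchall()

# FULL JOIN emulated: the LEFT JOIN rows plus projects without an employee.
full_rows = conn.execute(
    "SELECT e.EmpFname, p.ProjectID FROM Employee e "
    "LEFT JOIN Projects p ON e.EmpID = p.EmpID "
    "UNION ALL "
    "SELECT NULL, p.ProjectID FROM Projects p "
    "WHERE NOT EXISTS (SELECT 1 FROM Employee e WHERE e.EmpID = p.EmpID)"
).fetchall()

print(len(inner_rows), len(left_rows), len(right_rows), len(full_rows))  # 5 6 8 9
```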

17. Determine the 5 products that give the overall minimum sales? (Hint:
Sales = Quantity * Price) (2 marks)

select a.ProductID, b.ProductName, sum(a.Quantity * b.Price) as sales
from tr_orderdetails a
left join tr_products b on a.ProductID = b.ProductID
group by a.ProductID, b.ProductName
order by sales
limit 5;
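
Overall sales per product means adding up Quantity * Price over all of a product's order lines before ranking. The sketch below illustrates this with Python's built-in sqlite3 module; the three products and four order lines are hypothetical, and the list is cut to the 2 lowest products to keep the example small.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE tr_products (ProductID INTEGER, ProductName TEXT, Price REAL);
CREATE TABLE tr_orderdetails (OrderID INTEGER, ProductID INTEGER, Quantity INTEGER);
-- Hypothetical sample rows, not taken from the project data.
INSERT INTO tr_products VALUES (1,'Pen',2.0),(2,'Desk',100.0),(3,'Lamp',20.0);
INSERT INTO tr_orderdetails VALUES (1,1,3),(2,2,1),(3,1,2),(4,3,1);
""")

# Sum Quantity * Price per product, then take the smallest totals.
lowest_sales = conn.execute(
    "SELECT a.ProductID, b.ProductName, SUM(a.Quantity * b.Price) AS sales "
    "FROM tr_orderdetails a "
    "LEFT JOIN tr_products b ON a.ProductID = b.ProductID "
    "GROUP BY a.ProductID, b.ProductName "
    "ORDER BY sales "
    "LIMIT 2"
).fetchall()

print(lowest_sales)  # [(1, 'Pen', 10.0), (3, 'Lamp', 20.0)]
```

The pen appears in two order lines (3 and 2 units), and its sales figure of 10.0 is the sum over both, which a per-row Quantity * Price would not show.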

18. Repeat the above query for the City of "Orlando". (2 marks)

select a.ProductID, b.ProductName, sum(a.Quantity * b.Price) as sales
from tr_orderdetails a
left join tr_products b on a.ProductID = b.ProductID
inner join tr_propertyinfo c on a.PropertyID = c.PropertyID
where c.PropertyCity = 'Orlando'
group by a.ProductID, b.ProductName
order by sales
limit 5;

19. What is the difference between Drop, Truncate and Delete? Explain
with examples. (2 marks)

TRUNCATE

TRUNCATE is a SQL statement that removes all rows from a table without logging
the deletion of each individual row.
Example:
Truncate table employee;
• TRUNCATE is a DDL command.
• To remove all records, TRUNCATE takes a table lock, locking the entire
table.
• TRUNCATE does not support the WHERE clause.
• TRUNCATE removes all rows from a table but keeps the table definition.
DELETE
The DELETE command belongs to the Data Manipulation Language (DML), the subset
of SQL used to manipulate the data stored in a database.
It deletes existing records from a table.
We can use it either to delete all the records from a table or to delete
selected records based on a condition.
Example:
Delete from employee
Where empid = 2;
• DELETE is a DML command.
• We can select and delete particular records by using the WHERE clause
with DELETE; without a WHERE clause, all rows are deleted.
• Rows are deleted one at a time, and each removed row is recorded in the
transaction log.
• Because of this logging, DELETE usually takes longer than TRUNCATE.
DROP
The DROP TABLE statement removes one or more table definitions together with
all data, indexes, triggers, constraints and permission specifications for
those tables.
Example:
Drop table department;
• A table is deleted from the database using the DROP command.
• The privileges, rows and indexes of each table are also deleted.
• In many database systems, such as MySQL, the operation cannot be rolled
back.
• While DELETE is a DML operation, DROP and TRUNCATE are DDL
commands.
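
The practical difference (some rows removed, all rows removed, table removed) can be observed with a short sketch using Python's built-in sqlite3 module. One caveat is an important assumption here: SQLite has no TRUNCATE statement, so an unconditional DELETE stands in for it to show the "empty table, definition kept" state. The three employee rows are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE employee (empid INTEGER, name TEXT)")
conn.executemany(
    "INSERT INTO employee VALUES (?, ?)",
    [(1, "John"), (2, "James"), (3, "Neena")],  # hypothetical rows
)

def row_count():
    """Number of rows currently in the employee table."""
    return conn.execute("SELECT COUNT(*) FROM employee").fetchone()[0]

def table_exists():
    """True while the employee table definition is still in the database."""
    found = conn.execute(
        "SELECT name FROM sqlite_master WHERE type = 'table' AND name = 'employee'"
    ).fetchone()
    return found is not None

# DELETE with WHERE removes only the selected rows.
conn.execute("DELETE FROM employee WHERE empid = 2")
after_delete = row_count()

# Stand-in for TRUNCATE (not available in SQLite): all rows go, the table stays.
conn.execute("DELETE FROM employee")
after_clear = row_count()
exists_after_clear = table_exists()

# DROP removes the table definition itself.
conn.execute("DROP TABLE employee")
exists_after_drop = table_exists()

print(after_delete, after_clear, exists_after_clear, exists_after_drop)  # 2 0 True False
```
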
20. Which are the cities that belong to the same states? (2 marks)
select distinct a.PropertyState, a.PropertyCity
from tr_propertyinfo as a
inner join tr_propertyinfo as b
on a.PropertyState = b.PropertyState and a.PropertyCity <> b.PropertyCity
order by a.PropertyState, a.PropertyCity;
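
A self join pairs each row of a table with other rows of the same table, which is how cities sharing a state are found. The sketch below uses Python's built-in sqlite3 module and five hypothetical cities; DISTINCT is included so that a city in a state with several other cities is listed only once.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE tr_propertyinfo (PropertyID INTEGER, PropertyCity TEXT, PropertyState TEXT)"
)
# Hypothetical sample rows, not taken from the project data.
conn.executemany(
    "INSERT INTO tr_propertyinfo VALUES (?, ?, ?)",
    [(1, "Orlando", "Florida"), (2, "Miami", "Florida"), (3, "Dallas", "Texas"),
     (4, "Austin", "Texas"), (5, "Boston", "Massachusetts")],
)

# Match each city with a different city in the same state.
same_state_cities = conn.execute(
    "SELECT DISTINCT a.PropertyState, a.PropertyCity "
    "FROM tr_propertyinfo AS a "
    "INNER JOIN tr_propertyinfo AS b "
    "ON a.PropertyState = b.PropertyState AND a.PropertyCity <> b.PropertyCity "
    "ORDER BY a.PropertyState, a.PropertyCity"
).fetchall()

print(same_state_cities)
# [('Florida', 'Miami'), ('Florida', 'Orlando'), ('Texas', 'Austin'), ('Texas', 'Dallas')]
```

Boston does not appear because no other city in the sample shares its state.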
