
THE LOOK E-COMMERCE
INTERMEDIATE AND ADVANCED
ASSIGNMENT

The Look
About The
Project

DATA DEFINITION
Public Dataset in BigQuery: TheLook E-commerce

TheLook is a fictitious eCommerce clothing site developed by the Looker team. The
dataset contains information about customers, products, orders, logistics, web
events, and digital marketing campaigns. The contents of this dataset are synthetic
and are provided to industry practitioners for the purpose of product discovery,
testing, and evaluation.

ENTITY RELATIONSHIP DIAGRAM (ERD)
[Figure: ERD of the TheLook tables. Legend: PK = Primary Key, FK = Foreign Key]

QUESTION #1
Create a query to get the number of unique users, number of orders, and total sale price
per status and month with a time frame from Jan 2019 until Aug 2022

QUESTION #1
SELECT
  DATE_TRUNC(DATE(orders.created_at), MONTH) AS month_year,
  orders.status AS order_status,
  COUNT(DISTINCT users.id) AS total_of_unique_users,
  -- DISTINCT: the join to order_items repeats each order once per item
  COUNT(DISTINCT orders.order_id) AS total_orders,
  ROUND(SUM(order_items.sale_price), 2) AS total_sale_price
FROM
  `bigquery-public-data.thelook_ecommerce.users` users
INNER JOIN
  `bigquery-public-data.thelook_ecommerce.orders` orders
ON
  users.id = orders.user_id
INNER JOIN
  `bigquery-public-data.thelook_ecommerce.order_items` order_items
ON
  -- join items to their order (the original compared users.id with order_id)
  orders.order_id = order_items.order_id
WHERE
  DATE(orders.created_at) BETWEEN '2019-01-01' AND '2022-08-31'
GROUP BY
  1,
  2
ORDER BY
  1
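Joining orders to order items produces one row per item, so a plain row count and a distinct order count can differ. The following is a minimal Python sketch of that effect, using made-up rows (the values are illustrative and not taken from the dataset):

```python
# Made-up rows standing in for the orders and order_items tables.
orders = [
    {"order_id": 1, "user_id": 10},
    {"order_id": 2, "user_id": 11},
]
order_items = [
    {"order_id": 1, "sale_price": 20.0},
    {"order_id": 1, "sale_price": 35.5},
    {"order_id": 2, "sale_price": 12.0},
]

# Inner join on order_id: one output row per (order, item) pair.
joined = [
    {**o, **i}
    for o in orders
    for i in order_items
    if o["order_id"] == i["order_id"]
]

row_count = len(joined)                                  # like COUNT(order_id)
distinct_orders = len({r["order_id"] for r in joined})   # like COUNT(DISTINCT order_id)
total_sale_price = round(sum(r["sale_price"] for r in joined), 2)

print(row_count, distinct_orders, total_sale_price)
```

Here order 1 has two items, so the joined result has three rows for only two orders.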

QUESTION #2
Create a query to get frequencies, average order value, and total number of unique users
where the order status is Complete, grouped by month, with a time frame from Jan 2019 until
Aug 2022

QUESTION #2
SELECT
  DATE_TRUNC(DATE(orders.created_at), MONTH) AS month_year,
  -- DISTINCT: the join to order_items repeats each order once per item
  COUNT(DISTINCT orders.order_id) AS frequencies,
  ROUND(SUM(order_items.sale_price) / COUNT(DISTINCT orders.order_id), 2) AS AOV,
  COUNT(DISTINCT users.id) AS unique_buyers
FROM
  `bigquery-public-data.thelook_ecommerce.users` users
LEFT JOIN
  `bigquery-public-data.thelook_ecommerce.orders` orders
ON
  users.id = orders.user_id
LEFT JOIN
  `bigquery-public-data.thelook_ecommerce.order_items` order_items
ON
  -- join items to their order (the original compared users.id with order_id)
  orders.order_id = order_items.order_id
WHERE
  DATE(orders.created_at) BETWEEN '2019-01-01' AND '2022-08-31'
  AND LOWER(orders.status) = 'complete'
GROUP BY
  1
ORDER BY
  1
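Average order value (AOV) is total sale price divided by the number of orders. As a rough sketch of that arithmetic in Python, assuming hypothetical item-level rows of the form (month, order_id, sale_price):

```python
from collections import defaultdict

# Hypothetical (month, order_id, sale_price) rows, one per order item.
items = [
    ("2022-07", 1, 20.0),
    ("2022-07", 1, 30.0),
    ("2022-07", 2, 10.0),
    ("2022-08", 3, 45.0),
]

revenue = defaultdict(float)   # month -> summed sale price
order_ids = defaultdict(set)   # month -> distinct order ids

for month, order_id, sale_price in items:
    revenue[month] += sale_price
    order_ids[month].add(order_id)

# AOV per month = revenue / number of distinct orders in that month.
aov = {m: round(revenue[m] / len(order_ids[m]), 2) for m in revenue}
print(aov)
```

Dividing by the number of item rows instead of distinct orders would give the average item price, not the average order value.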

QUESTION #3
Find the user ID, email, first name, and last name of users whose orders were refunded (status Returned) in Aug 2022

QUESTION #3
WITH
users AS (
  SELECT
    *
  FROM
    `bigquery-public-data.thelook_ecommerce.users`),
orders AS (
  SELECT
    *
  FROM
    `bigquery-public-data.thelook_ecommerce.orders`
  WHERE
    status = 'Returned'
    AND DATE(returned_at) BETWEEN '2022-08-01' AND '2022-08-31'),
main AS (
  SELECT
    DISTINCT users.id AS user_id,
    users.email AS user_email,
    users.first_name AS user_first_name,
    users.last_name AS user_last_name
  FROM
    users
  INNER JOIN
    orders
  ON
    users.id = orders.user_id)
SELECT
  user_id,
  user_email,
  user_first_name,
  user_last_name
FROM
  main
ORDER BY
  3

QUESTION #4
Get the top 5 least and most profitable products over all time

QUESTION #4
WITH
orders AS (
  SELECT * FROM `bigquery-public-data.thelook_ecommerce.orders`),
order_items AS (
  SELECT * FROM `bigquery-public-data.thelook_ecommerce.order_items`),
products AS (
  SELECT * FROM `bigquery-public-data.thelook_ecommerce.products`),
a AS (
  SELECT
    DISTINCT products.id AS product_id,
    products.name AS product_name,
    ROUND(products.retail_price,2) AS retail_price,
    ROUND(products.cost,2) AS cost,
    ROUND((SUM(products.retail_price)-SUM(products.cost)),2) AS profit,
    'most profitable' AS product_conclusion
  FROM
    products
  INNER JOIN
    order_items
  ON
    products.id = order_items.product_id
  INNER JOIN
    orders
  ON
    order_items.order_id = orders.order_id
  GROUP BY
    1,2,3,4
  ORDER BY
    5 DESC
  LIMIT
    5),
b AS (
  SELECT
    DISTINCT products.id AS product_id,
    products.name AS product_name,
    ROUND(products.retail_price,2) AS retail_price,
    ROUND(products.cost,2) AS cost,
    ROUND((SUM(products.retail_price)-SUM(products.cost)),2) AS profit,
    'least profitable' AS product_conclusion
  FROM
    products
  INNER JOIN
    order_items
  ON
    products.id = order_items.product_id
  INNER JOIN
    orders
  ON
    order_items.order_id = orders.order_id
  GROUP BY
    1,2,3,4
  ORDER BY
    5
  LIMIT
    5)
(
  SELECT
    *,
    RANK() OVER (ORDER BY a.profit DESC) ranking
  FROM
    a)
UNION ALL (
  SELECT
    *,
    RANK() OVER (ORDER BY b.profit) ranking
  FROM
    b)
ORDER BY 6 DESC, 7
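The query sorts products by profit in both directions, keeps five rows from each end, and ranks them. A small Python sketch of the same idea, assuming a hypothetical mapping of product name to total profit (the names and figures are invented):

```python
# Hypothetical total profit per product; not real dataset values.
profits = {"A": 50.0, "B": 5.0, "C": 120.0, "D": 18.5, "E": 75.0}


def top_n(profit_by_product, n, most=True):
    """Return (rank, product, profit) for the n most or least profitable products.

    Note: enumerate() assigns consecutive positions, whereas SQL RANK()
    gives tied rows the same rank. The two agree only when there are no ties.
    """
    ranked = sorted(profit_by_product.items(), key=lambda kv: kv[1], reverse=most)
    return [(rank, name, profit) for rank, (name, profit) in enumerate(ranked[:n], start=1)]


most_profitable = top_n(profits, 2, most=True)
least_profitable = top_n(profits, 2, most=False)
print(most_profitable)
print(least_profitable)
```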

QUESTION #5
Create a query to get the month-to-date (MTD) total profit for each product category over the
past 3 months (current date: 15 Aug 2022), broken down by month and category

Within each month, profit accumulates day by day over the first 15 days. When a new
month starts, the cumulative sum starts over from the beginning.

With this table, we can compare profit per day across different months.
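The reset behaviour described above can be sketched in Python as a running total that restarts for each month. This is only an illustration with made-up daily profits for a single category; the rows are assumed to be (date string, profit) pairs:

```python
from itertools import accumulate, groupby

# Made-up (date, daily profit) rows for one category.
daily = [
    ("2022-06-01", 10.0),
    ("2022-06-02", 5.0),
    ("2022-07-01", 7.0),
    ("2022-07-02", 3.0),
]


def month_to_date(rows):
    """Return (date, profit, cumulative profit) with the sum restarting each month."""
    out = []
    # Group by the "YYYY-MM" prefix after sorting by date.
    for _, group in groupby(sorted(rows), key=lambda r: r[0][:7]):
        group = list(group)
        running = accumulate(p for _, p in group)
        out.extend((d, p, round(total, 2)) for (d, p), total in zip(group, running))
    return out


mtd = month_to_date(daily)
for row in mtd:
    print(row)
```

The cumulative value on 2022-07-01 is 7.0 rather than 22.0 because the running total does not carry over from June.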

QUESTION #5
WITH
products AS (
  SELECT * FROM `bigquery-public-data.thelook_ecommerce.products`),
order_items AS (
  SELECT * FROM `bigquery-public-data.thelook_ecommerce.order_items`),
mtd_june AS (
  SELECT
    DATE(order_items.created_at) AS date1,
    products.category AS category,
    ROUND(SUM(products.retail_price)-SUM(products.cost),2) AS profit
  FROM
    products
  INNER JOIN
    order_items
  ON
    products.id = order_items.product_id
  WHERE
    DATE(order_items.created_at) BETWEEN '2022-06-01' AND '2022-06-15'
  GROUP BY
    1,
    2
  ORDER BY
    1),
mtd_july AS (
  SELECT
    DATE(order_items.created_at) AS date1,
    products.category AS category,
    ROUND(SUM(products.retail_price)-SUM(products.cost),2) AS profit
  FROM
    products
  INNER JOIN
    order_items
  ON
    products.id = order_items.product_id
  WHERE
    DATE(order_items.created_at) BETWEEN '2022-07-01' AND '2022-07-15'
  GROUP BY
    1,
    2
  ORDER BY
    1),
mtd_august AS (
  SELECT
    DATE(order_items.created_at) AS date1,
    products.category AS category,
    ROUND(SUM(products.retail_price)-SUM(products.cost),2) AS profit
  FROM
    products
  INNER JOIN
    order_items
  ON
    products.id = order_items.product_id
  WHERE
    DATE(order_items.created_at) BETWEEN '2022-08-01' AND '2022-08-15'
  GROUP BY
    1,2
  ORDER BY
    1),
union_1 AS ((
  SELECT
    *,
    ROUND(SUM(profit) OVER(PARTITION BY category ORDER BY date1), 2) AS cumulative_profit_per_month
  FROM
    mtd_june)
  UNION ALL (
  SELECT
    *,
    ROUND(SUM(profit) OVER(PARTITION BY category ORDER BY date1), 2) AS cumulative_profit_per_month
  FROM
    mtd_july)
  UNION ALL (
  SELECT
    *,
    ROUND(SUM(profit) OVER(PARTITION BY category ORDER BY date1), 2) AS cumulative_profit_per_month
  FROM
    mtd_august)
  ORDER BY
    2,1)
SELECT
  *
FROM
  union_1

QUESTION #6
Find the monthly inventory growth in percent, broken down by product category, with a time frame from Jan
2019 until Apr 2022. After analyzing the monthly growth, is there any interesting insight that we can get?
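Monthly growth in percent is usually computed as (current - previous) / previous * 100. The sketch below shows that formula in Python for one category. The inventory counts are invented, and treating a zero previous month as undefined (None) is an assumption of this sketch rather than something stated in the assignment:

```python
def monthly_growth(counts):
    """counts: list of (month, value) sorted by month.

    Returns (month, growth in percent) for every month after the first.
    Growth is None when the previous month's value is 0 (undefined ratio).
    """
    growth = []
    for (_, prev), (month, cur) in zip(counts, counts[1:]):
        pct = None if prev == 0 else round((cur - prev) / prev * 100, 2)
        growth.append((month, pct))
    return growth


# Invented inventory counts for a single category.
inventory = [("2019-01", 200), ("2019-02", 250), ("2019-03", 225)]
result = monthly_growth(inventory)
print(result)
```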

QUESTION #6

INSIGHTS
Let’s take a look at the growth line chart for all categories.

● There are 26 categories of clothing that TheLook sold.
● Profit was very low throughout 2019. From 2020 until now, there has been a
significant increase in profit. I assume this occurred because of the pandemic,
which started at the beginning of 2020, when consumers began to buy clothing online.
● There is a significant decrease from Mar 2022 to Apr 2022. I assume that either the
data is not complete yet or there was a real decrease in profit during this month.
For now I can’t conclude anything, so it needs further investigation.

MORE
INSIGHTS
Top 3 categories that
contribute the most profit

Outerwear & Coats


Jeans

Sweaters

MORE
INSIGHTS

Top 3 categories that


contribute the least profit

Clothing Sets

Jumpsuits & Rompers

Leggings

QUESTION #7 (Cohort Analysis)


Create monthly retention cohorts (the cohorts can be defined by the month in which a user first
completed a purchase) and then show how many of them (%) came back in the following months of
2022

QUESTION #7 (Cohort Analysis)


WITH
a AS (
  SELECT
    user_id,
    MIN(DATE(DATE_TRUNC(created_at, month))) AS cohort_month
  FROM
    `bigquery-public-data.thelook_ecommerce.order_items`
  WHERE
    DATE(DATE_TRUNC(created_at, month)) between '2022-01-01' and '2022-09-01'
    AND status='Complete'
  GROUP BY
    1 ),
b AS (
  SELECT
    orders.user_id AS user_id,
    DATE_DIFF(DATE(DATE_TRUNC(orders.created_at, month)), a.cohort_month,MONTH) AS month_number
  FROM
    `bigquery-public-data.thelook_ecommerce.order_items` orders
  LEFT JOIN
    a
  ON
    orders.user_id=a.user_id
  WHERE
    DATE(DATE_TRUNC(created_at, month)) between '2022-01-01' and '2022-09-01'
    AND orders.status='Complete'
  GROUP BY
    1,
    2 ),
c AS (
  SELECT
    cohort_month,
    COUNT(*) AS num_orders
  FROM
    a
  GROUP BY
    1
  ORDER BY
    1 ),
d AS (
  SELECT
    a.cohort_month,
    b.month_number,
    COUNT(*) AS num_orders
  FROM
    b
  LEFT JOIN
    a
  ON
    a.user_id=b.user_id
  GROUP BY
    1,
    2)
SELECT
  d.cohort_month,
  c.num_orders AS cohort_size,
  d.month_number,
  d.num_orders AS total_orders,
  CONCAT(ROUND((d.num_orders/c.num_orders)*100, 2),'%') AS percentage
FROM
  d
LEFT JOIN
  c
ON
  d.cohort_month=c.cohort_month
WHERE
  d.cohort_month IS NOT NULL
ORDER BY
  1,
  3

Cohort Retention Table


I chose the time frame Jan 2022 to Sept 2022. Oct 2022 is not included because the month is not over yet,
so the data might be incomplete.

The number of users who purchase again in the following months is very
small, only around 0% to 6% of the initial cohort. This needs further
investigation.
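The cohort logic can be summarised as: find each user's first purchase month, measure every later purchase as a month offset from it, and divide the active users at each offset by the cohort size. A minimal Python sketch under those assumptions, with invented (user_id, "YYYY-MM") purchase rows:

```python
from collections import defaultdict

# Invented (user_id, purchase month) rows for completed purchases.
purchases = [
    (1, "2022-01"), (1, "2022-02"),
    (2, "2022-01"), (2, "2022-01"),
    (3, "2022-02"), (3, "2022-04"),
]


def month_index(ym):
    """Convert 'YYYY-MM' to a running month count so offsets can be subtracted."""
    year, month = ym.split("-")
    return int(year) * 12 + int(month)


def retention(rows):
    """Return {(cohort_month, month_number): percent of the cohort that was active}."""
    first = {}
    for user, month in rows:
        # 'YYYY-MM' strings sort chronologically, so min-by-string works.
        if user not in first or month < first[user]:
            first[user] = month

    active = defaultdict(set)  # (cohort_month, month_number) -> distinct users
    for user, month in rows:
        offset = month_index(month) - month_index(first[user])
        active[(first[user], offset)].add(user)

    return {
        (cohort, offset): round(len(users) / len(active[(cohort, 0)]) * 100, 2)
        for (cohort, offset), users in active.items()
    }


table = retention(purchases)
for key in sorted(table):
    print(key, table[key])
```

Month number 0 is always 100% because every user is active in the month that defines their cohort.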

QUESTION #7 (Funnel Analysis)
Funnel of User Orders
User registered -> User purchased 1x -> User purchased > 1x

Row 1 indicates the number of registered users
Row 2 indicates the number of users who purchased 1x
Row 3 indicates the number of users who purchased > 1x (there is a repeat order)

QUESTION #7 (Funnel Analysis)
WITH
a AS (
  SELECT
    COUNT(DISTINCT users.id) AS registered_users
  FROM
    `bigquery-public-data.thelook_ecommerce.users` users ),
b AS (
  SELECT
    users.id,
    COUNT(orders.order_id) AS users_purchase_1time
  FROM
    `bigquery-public-data.thelook_ecommerce.users` users
  INNER JOIN
    `bigquery-public-data.thelook_ecommerce.orders` orders
  ON
    users.id = orders.user_id
  WHERE
    orders.status ='Complete'
  GROUP BY
    1
  HAVING
    users_purchase_1time = 1),
c AS (
  SELECT
    users.id,
    COUNT(orders.order_id) AS users_purchase_1time
  FROM
    `bigquery-public-data.thelook_ecommerce.users` users
  INNER JOIN
    `bigquery-public-data.thelook_ecommerce.orders` orders
  ON
    users.id = orders.user_id
  WHERE
    orders.status ='Complete'
  GROUP BY
    1
  HAVING
    users_purchase_1time > 1 ),
d AS ((
  SELECT
    a.registered_users AS num_of_users
  FROM
    a)
  UNION ALL (
  SELECT
    COUNT(*) users_purchase_1time
  FROM
    b)
  UNION ALL (
  SELECT
    COUNT(*) users_purchase_more_than_1time
  FROM
    c)
  ORDER BY
    1 DESC)
-- 100000 is hard-coded here; presumably it is the registered-user count returned by a
SELECT *, CONCAT(ROUND((num_of_users/100000)*100,2),'%') percentage
FROM d
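The three funnel rows can be illustrated with a short Python sketch. The user IDs and orders below are invented. Unlike the query, which divides by a fixed 100000, this sketch divides by the counted number of registered users, which is a design choice of the example only:

```python
from collections import Counter

# Invented data: registered user ids and the user id of each completed order.
registered = [1, 2, 3, 4, 5]
completed_orders = [1, 2, 2, 4, 4, 4]

orders_per_user = Counter(completed_orders)
purchased_once = sum(1 for n in orders_per_user.values() if n == 1)
purchased_repeat = sum(1 for n in orders_per_user.values() if n > 1)

funnel = [
    ("registered", len(registered)),
    ("purchased 1x", purchased_once),
    ("purchased >1x", purchased_repeat),
]

# Share of each stage relative to all registered users, in percent.
shares = [
    (stage, count, round(count / len(registered) * 100, 2))
    for stage, count in funnel
]
for row in shares:
    print(row)
```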

QUESTION #7 (Funnel Analysis)

HYPOTHESES
After the cohort and funnel analyses, some hypotheses need to be tested and proven so that the company
can find the best metrics to work on to increase profit and improve its management. For now, I do not
offer the answer right away.

Hypothesis 1
More than 50% of registered users never make any purchase because they are not
engaged enough with the content on the TheLook website.

Hypothesis 2
TheLook E-commerce needs more activity in all marketing channels, focused
on reaching out to existing users so that they come back after their first purchase.

THANK YOU
If you have any questions, feel free to contact me!
EMAIL ADDRESS
ramadhandwiyanuar@gmail.com
