Olympic Dataset Analysis
With AWS S3, Databricks & Snowflake

Data Quality Check

--Check the Count of Rows in athlete_events

--Check the Count of Rows in athletes
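A minimal sketch of the two row-count checks. It uses Python's built-in sqlite3 with a tiny in-memory copy of the tables so that it runs locally; the column names and sample rows are assumptions, not the real dataset, and the same COUNT(*) statements should work unchanged in Snowflake.

```python
import sqlite3

# Hypothetical miniature tables; column names and rows are assumptions.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE athletes (id INTEGER, name TEXT, team TEXT);
CREATE TABLE athlete_events (athlete_id INTEGER, games TEXT, year INTEGER,
                             season TEXT, sport TEXT, event TEXT, medal TEXT);
INSERT INTO athletes VALUES (1, 'Ann', 'USA'), (2, 'Bob', 'India');
INSERT INTO athlete_events VALUES
  (1, '2000 Summer', 2000, 'Summer', 'Swimming', '100m Freestyle', 'Gold'),
  (2, '2000 Summer', 2000, 'Summer', 'Hockey', 'Hockey Men', 'NA'),
  (2, '2004 Summer', 2004, 'Summer', 'Hockey', 'Hockey Men', 'Silver');
""")

# Check the count of rows in athlete_events
event_rows = conn.execute("SELECT COUNT(*) FROM athlete_events").fetchone()[0]

# Check the count of rows in athletes
athlete_rows = conn.execute("SELECT COUNT(*) FROM athletes").fetchone()[0]

print(event_rows, athlete_rows)
```

Comparing these counts with the row counts of the source files in S3 is a quick way to confirm that nothing was dropped during ingestion.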


1. Which team has won the most gold medals over the years?
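One possible query for this question, sketched with Python's sqlite3 on made-up rows. It assumes athlete_events joins to athletes on athlete_id = id, and that a team gold should be counted once per event per Games even when several athletes receive it; both the schema and that counting rule are assumptions.

```python
import sqlite3

# Hypothetical miniature tables; column names and rows are assumptions.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE athletes (id INTEGER, name TEXT, team TEXT);
CREATE TABLE athlete_events (athlete_id INTEGER, games TEXT, year INTEGER,
                             season TEXT, sport TEXT, event TEXT, medal TEXT);
INSERT INTO athletes VALUES (1, 'Ann', 'USA'), (2, 'Bob', 'USA'), (3, 'Cy', 'India');
INSERT INTO athlete_events VALUES
  (1, '2000 Summer', 2000, 'Summer', 'Athletics', '4x100m Relay', 'Gold'),
  (2, '2000 Summer', 2000, 'Summer', 'Athletics', '4x100m Relay', 'Gold'),
  (1, '2004 Summer', 2004, 'Summer', 'Athletics', '4x100m Relay', 'Gold'),
  (3, '2000 Summer', 2000, 'Summer', 'Hockey', 'Hockey Men', 'Gold'),
  (3, '2004 Summer', 2004, 'Summer', 'Hockey', 'Hockey Men', 'Silver');
""")

# Count each (games, event) pair once per team so relay golds are not double counted.
top_team = conn.execute("""
SELECT a.team,
       COUNT(DISTINCT ae.games || '|' || ae.event) AS golds
FROM athlete_events ae
JOIN athletes a ON a.id = ae.athlete_id
WHERE ae.medal = 'Gold'
GROUP BY a.team
ORDER BY golds DESC
LIMIT 1
""").fetchone()

print(top_team)
```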

Output:
2. For each team, print the total number of silver medals and the year in which
the team won the most silver medals. Output 3 columns: team, total_silver_medals, year_of_max_silver
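A sketch of one way to produce the three columns, using window functions over a per-team, per-year silver count. It runs on made-up rows in sqlite3 (window functions need SQLite 3.25 or later); the schema is an assumption, silver medals are counted once per event per year, and ties for the best year are broken by the earliest year.

```python
import sqlite3

# Hypothetical miniature tables; column names and rows are assumptions.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE athletes (id INTEGER, name TEXT, team TEXT);
CREATE TABLE athlete_events (athlete_id INTEGER, games TEXT, year INTEGER,
                             season TEXT, sport TEXT, event TEXT, medal TEXT);
INSERT INTO athletes VALUES (1, 'Ann', 'USA'), (2, 'Bob', 'USA'), (3, 'Cy', 'India');
INSERT INTO athlete_events VALUES
  (1, '2000 Summer', 2000, 'Summer', 'Swimming', '100m Freestyle', 'Silver'),
  (2, '2000 Summer', 2000, 'Summer', 'Swimming', '200m Freestyle', 'Silver'),
  (1, '2004 Summer', 2004, 'Summer', 'Swimming', '100m Freestyle', 'Silver'),
  (3, '2004 Summer', 2004, 'Summer', 'Hockey', 'Hockey Men', 'Silver'),
  (3, '2000 Summer', 2000, 'Summer', 'Hockey', 'Hockey Men', 'Gold');
""")

silver_summary = conn.execute("""
WITH silver AS (            -- silver medals per team and year
  SELECT a.team, ae.year, COUNT(DISTINCT ae.event) AS silvers
  FROM athlete_events ae
  JOIN athletes a ON a.id = ae.athlete_id
  WHERE ae.medal = 'Silver'
  GROUP BY a.team, ae.year
),
ranked AS (                 -- team total plus rank of each year within the team
  SELECT team, year,
         SUM(silvers) OVER (PARTITION BY team) AS total_silver_medals,
         ROW_NUMBER() OVER (PARTITION BY team
                            ORDER BY silvers DESC, year) AS rn
  FROM silver
)
SELECT team, total_silver_medals, year AS year_of_max_silver
FROM ranked
WHERE rn = 1
ORDER BY team
""").fetchall()

print(silver_summary)
```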

Output:
3. Which player has won the most gold medals among the players who have won
only gold medals (never silver or bronze) over the years?
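A sketch of one approach: exclude every athlete who appears with a silver or bronze medal, then count golds for the rest. The schema and rows are made up for illustration, and the query assumes one row per medal won.

```python
import sqlite3

# Hypothetical miniature tables; column names and rows are assumptions.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE athletes (id INTEGER, name TEXT, team TEXT);
CREATE TABLE athlete_events (athlete_id INTEGER, games TEXT, year INTEGER,
                             season TEXT, sport TEXT, event TEXT, medal TEXT);
INSERT INTO athletes VALUES (1, 'Ann', 'USA'), (2, 'Bob', 'USA'), (3, 'Cy', 'India');
INSERT INTO athlete_events VALUES
  (1, '2000 Summer', 2000, 'Summer', 'Swimming', '100m', 'Gold'),
  (1, '2000 Summer', 2000, 'Summer', 'Swimming', '200m', 'Gold'),
  (1, '2004 Summer', 2004, 'Summer', 'Swimming', '100m', 'Gold'),
  (1, '2004 Summer', 2004, 'Summer', 'Swimming', '200m', 'Silver'),
  (2, '2000 Summer', 2000, 'Summer', 'Boxing', 'Middleweight', 'Gold'),
  (2, '2004 Summer', 2004, 'Summer', 'Boxing', 'Middleweight', 'Gold'),
  (3, '2000 Summer', 2000, 'Summer', 'Shooting', 'Air Rifle', 'Gold');
""")

# Ann has the most golds overall but also a silver, so she is filtered out.
gold_only_winner = conn.execute("""
SELECT a.name, COUNT(*) AS golds
FROM athlete_events ae
JOIN athletes a ON a.id = ae.athlete_id
WHERE ae.medal = 'Gold'
  AND ae.athlete_id NOT IN (
        SELECT athlete_id
        FROM athlete_events
        WHERE medal IN ('Silver', 'Bronze'))
GROUP BY a.id, a.name
ORDER BY golds DESC
LIMIT 1
""").fetchone()

print(gold_only_winner)
```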

Output:
4. In each year, which player won the most gold medals? Write a query to print the
year, player name and number of golds won in that year. In case of a tie, print
comma-separated player names.
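A sketch using RANK() to keep every player tied for first place in a year, then concatenating the names. It runs in sqlite3 on made-up rows with an assumed schema; SQLite's GROUP_CONCAT stands in for Snowflake's LISTAGG, and the order of names inside the concatenated string is not guaranteed here.

```python
import sqlite3

# Hypothetical miniature tables; column names and rows are assumptions.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE athletes (id INTEGER, name TEXT, team TEXT);
CREATE TABLE athlete_events (athlete_id INTEGER, games TEXT, year INTEGER,
                             season TEXT, sport TEXT, event TEXT, medal TEXT);
INSERT INTO athletes VALUES (1, 'Ann', 'USA'), (2, 'Bob', 'USA'), (3, 'Cy', 'India');
INSERT INTO athlete_events VALUES
  (1, '2000 Summer', 2000, 'Summer', 'Swimming', '100m', 'Gold'),
  (1, '2000 Summer', 2000, 'Summer', 'Swimming', '200m', 'Gold'),
  (2, '2000 Summer', 2000, 'Summer', 'Athletics', '100m', 'Gold'),
  (2, '2000 Summer', 2000, 'Summer', 'Athletics', '200m', 'Gold'),
  (3, '2000 Summer', 2000, 'Summer', 'Shooting', 'Air Rifle', 'Gold'),
  (3, '2004 Summer', 2004, 'Summer', 'Shooting', 'Air Rifle', 'Gold'),
  (1, '2004 Summer', 2004, 'Summer', 'Swimming', '100m', 'Silver');
""")

yearly_leaders = conn.execute("""
WITH golds AS (             -- golds per player per year
  SELECT ae.year, a.name, COUNT(*) AS no_of_golds
  FROM athlete_events ae
  JOIN athletes a ON a.id = ae.athlete_id
  WHERE ae.medal = 'Gold'
  GROUP BY ae.year, a.id, a.name
),
ranked AS (                 -- RANK keeps all players tied for first
  SELECT year, name, no_of_golds,
         RANK() OVER (PARTITION BY year ORDER BY no_of_golds DESC) AS rnk
  FROM golds
)
SELECT year, GROUP_CONCAT(name, ',') AS players, no_of_golds
FROM ranked
WHERE rnk = 1
GROUP BY year, no_of_golds
ORDER BY year
""").fetchall()

for row in yearly_leaders:
    print(row)
```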

Output:
5. In which event and year did India win its first gold medal, first silver medal and
first bronze medal? Print 3 columns: medal, year, sport
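A sketch that ranks India's medals by year within each medal type and keeps the earliest. The schema is an assumption and the rows below are made up, so the output says nothing about India's actual medal history.

```python
import sqlite3

# Hypothetical miniature tables; column names and rows are assumptions.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE athletes (id INTEGER, name TEXT, team TEXT);
CREATE TABLE athlete_events (athlete_id INTEGER, games TEXT, year INTEGER,
                             season TEXT, sport TEXT, event TEXT, medal TEXT);
INSERT INTO athletes VALUES (1, 'Ann', 'India'), (2, 'Bob', 'India'), (3, 'Cy', 'USA');
INSERT INTO athlete_events VALUES
  (1, '2000 Summer', 2000, 'Summer', 'Hockey', 'Hockey Men', 'Gold'),
  (2, '2000 Summer', 2000, 'Summer', 'Hockey', 'Hockey Men', 'Gold'),
  (1, '2004 Summer', 2004, 'Summer', 'Hockey', 'Hockey Men', 'Gold'),
  (2, '2004 Summer', 2004, 'Summer', 'Wrestling', 'Freestyle', 'Bronze'),
  (1, '2008 Summer', 2008, 'Summer', 'Shooting', 'Air Rifle', 'Silver'),
  (2, '2008 Summer', 2008, 'Summer', 'Boxing', 'Middleweight', 'Bronze'),
  (3, '1996 Summer', 1996, 'Summer', 'Swimming', '100m', 'Gold');
""")

# DISTINCT collapses the duplicate rows a team event produces for each athlete.
first_medals = conn.execute("""
SELECT DISTINCT medal, year, sport
FROM (
  SELECT ae.medal, ae.year, ae.sport,
         RANK() OVER (PARTITION BY ae.medal ORDER BY ae.year) AS rnk
  FROM athlete_events ae
  JOIN athletes a ON a.id = ae.athlete_id
  WHERE a.team = 'India'
    AND ae.medal IN ('Gold', 'Silver', 'Bronze')
) AS firsts
WHERE rnk = 1
ORDER BY medal
""").fetchall()

print(first_medals)
```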

Output:
6. Find players who won gold medals in both the Summer and Winter Olympics.
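A sketch that groups gold medals by player and keeps players whose golds span two distinct seasons. It assumes a season column holding 'Summer' or 'Winter'; the schema and rows are made up for illustration.

```python
import sqlite3

# Hypothetical miniature tables; column names and rows are assumptions.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE athletes (id INTEGER, name TEXT, team TEXT);
CREATE TABLE athlete_events (athlete_id INTEGER, games TEXT, year INTEGER,
                             season TEXT, sport TEXT, event TEXT, medal TEXT);
INSERT INTO athletes VALUES (1, 'Ann', 'USA'), (2, 'Bob', 'USA');
INSERT INTO athlete_events VALUES
  (1, '2000 Summer', 2000, 'Summer', 'Athletics', '100m', 'Gold'),
  (1, '2002 Winter', 2002, 'Winter', 'Bobsleigh', 'Two-Woman', 'Gold'),
  (2, '2000 Summer', 2000, 'Summer', 'Athletics', '200m', 'Gold'),
  (2, '2002 Winter', 2002, 'Winter', 'Bobsleigh', 'Two-Man', 'Silver');
""")

# Only gold rows are considered, so Bob's winter silver does not count.
rows = conn.execute("""
SELECT a.name
FROM athlete_events ae
JOIN athletes a ON a.id = ae.athlete_id
WHERE ae.medal = 'Gold'
GROUP BY a.id, a.name
HAVING COUNT(DISTINCT ae.season) = 2
ORDER BY a.name
""").fetchall()

both_seasons = [name for (name,) in rows]
print(both_seasons)
```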

Output:
7. Find players who won gold, silver and bronze medals in a single Olympics. Print
the player name along with the year.
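A sketch that groups medal rows by player and Games and keeps the groups containing all three medal types. Grouping by the games column as well as year is an assumption made so that Summer and Winter Games held in the same year are treated separately; the schema and rows are made up.

```python
import sqlite3

# Hypothetical miniature tables; column names and rows are assumptions.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE athletes (id INTEGER, name TEXT, team TEXT);
CREATE TABLE athlete_events (athlete_id INTEGER, games TEXT, year INTEGER,
                             season TEXT, sport TEXT, event TEXT, medal TEXT);
INSERT INTO athletes VALUES (1, 'Ann', 'USA'), (2, 'Bob', 'USA');
INSERT INTO athlete_events VALUES
  (1, '2000 Summer', 2000, 'Summer', 'Swimming', '100m', 'Gold'),
  (1, '2000 Summer', 2000, 'Summer', 'Swimming', '200m', 'Silver'),
  (1, '2000 Summer', 2000, 'Summer', 'Swimming', '400m', 'Bronze'),
  (2, '2000 Summer', 2000, 'Summer', 'Gymnastics', 'Rings', 'Gold'),
  (2, '2000 Summer', 2000, 'Summer', 'Gymnastics', 'Vault', 'Silver'),
  (2, '2004 Summer', 2004, 'Summer', 'Gymnastics', 'Rings', 'Bronze');
""")

# Bob has all three medal types, but spread over two Games, so he is excluded.
all_three = conn.execute("""
SELECT a.name, ae.year
FROM athlete_events ae
JOIN athletes a ON a.id = ae.athlete_id
WHERE ae.medal IN ('Gold', 'Silver', 'Bronze')
GROUP BY a.id, a.name, ae.games, ae.year
HAVING COUNT(DISTINCT ae.medal) = 3
ORDER BY a.name, ae.year
""").fetchall()

print(all_three)
```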

Output:
8. Find players who have won gold medals in 3 consecutive Summer Olympics in
the same event. Consider only Olympics from 2000 onwards. Assume the Summer
Olympics happen every 4 years starting in 2000. Print the player name and event
name.
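A sketch using LAG and LEAD: for each player and event, a gold year that has a gold 4 years earlier and another 4 years later is the middle of a run of three consecutive Summer Games. The schema and rows are assumptions made up for illustration, and the query relies on the 4-year spacing stated in the question.

```python
import sqlite3

# Hypothetical miniature tables; column names and rows are assumptions.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE athletes (id INTEGER, name TEXT, team TEXT);
CREATE TABLE athlete_events (athlete_id INTEGER, games TEXT, year INTEGER,
                             season TEXT, sport TEXT, event TEXT, medal TEXT);
INSERT INTO athletes VALUES (1, 'Ann', 'USA'), (2, 'Bob', 'USA'), (3, 'Cy', 'India');
INSERT INTO athlete_events VALUES
  (1, '2000 Summer', 2000, 'Summer', 'Athletics', '100m', 'Gold'),
  (1, '2004 Summer', 2004, 'Summer', 'Athletics', '100m', 'Gold'),
  (1, '2008 Summer', 2008, 'Summer', 'Athletics', '100m', 'Gold'),
  (2, '2000 Summer', 2000, 'Summer', 'Athletics', '200m', 'Gold'),
  (2, '2008 Summer', 2008, 'Summer', 'Athletics', '200m', 'Gold'),
  (2, '2012 Summer', 2012, 'Summer', 'Athletics', '200m', 'Gold'),
  (3, '2000 Summer', 2000, 'Summer', 'Athletics', 'Long Jump', 'Gold'),
  (3, '2004 Summer', 2004, 'Summer', 'Athletics', 'Long Jump', 'Gold'),
  (3, '2008 Summer', 2008, 'Summer', 'Athletics', 'High Jump', 'Gold');
""")

three_in_a_row = conn.execute("""
WITH golds AS (             -- one row per player, event and gold-winning year
  SELECT DISTINCT ae.athlete_id, a.name, ae.event, ae.year
  FROM athlete_events ae
  JOIN athletes a ON a.id = ae.athlete_id
  WHERE ae.medal = 'Gold' AND ae.season = 'Summer' AND ae.year >= 2000
),
neighbours AS (             -- previous and next gold year for the same event
  SELECT name, event, year,
         LAG(year)  OVER (PARTITION BY athlete_id, event ORDER BY year) AS prev_year,
         LEAD(year) OVER (PARTITION BY athlete_id, event ORDER BY year) AS next_year
  FROM golds
)
SELECT DISTINCT name, event
FROM neighbours
WHERE prev_year = year - 4 AND next_year = year + 4
ORDER BY name, event
""").fetchall()

print(three_in_a_row)
```

In the sample rows, Bob misses 2004 and Cy switches events in 2008, so only Ann qualifies.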

Output:
THANK YOU
You can find the code to ingest the dataset from AWS S3 into Databricks and
establish a connection to Snowflake at the link below:

Olympic_Dataset_Analysis_Github
