60+ Python Projects for All Levels of Expertise
Beginner Python Projects​
As a beginner, you should leverage Python projects to retain what you learned and acquire new
skills. This set of projects mostly revolves around exploratory data analysis tasks, alongside
simple modeling and forecasting tasks on relevant real-world datasets.​
1. Diamond Prices Data Analysis​
Diamonds are divided into five impurity types based on the structure of their carbon atoms. The
Diamonds dataset from Kaggle gives you even more info — cut, clarity, color, and price. Develop
your data visualization skills on it with some exploratory data analysis. ​
2. Age of Abalone Shells Data Analysis​
This is a unique dataset from zoology. Abalone shells are miracles of nature, and you can
determine their age by counting the rings inside their shells. Can you determine the age of
abalone shells with Python data analysis skills?
3. Premier League Data Analysis​
A football (or soccer) dataset where you can explore, analyze, and visualize events from the
2018-2019 season of the English Premier League. ​
4. Telecom Churn Prediction​
Customer churn is one of the most foundational machine learning problems. In this customer
dataset, you’ll be able to predict churn for a telecom provider based on usage data from their
customers. ​
5. Stock Prices Analysis and Prediction​
Do you want to find out the reason behind the 100% spike in Tesla's stock price two years ago? If so,
the 2010–2021 tech stocks dataset is the first place to start.
6. NBA Shooting Data​
At which range are basketball players most likely to score a shot? In this NBA shooting
dataset captured from the 2021 NBA playoffs, you’ll be able to answer just that question.
7. Forecast E-commerce Sales​
Using this e-commerce dataset from an online retailer, leverage data visualization and
forecasting techniques to predict future sales.​
8. Analyze Airbnb Listings​
This is an excellent dataset for understanding the dynamics behind Airbnb rental listings. With
exploratory data analysis and visualization, you’ll be able to understand which neighborhoods
have the most popular listings, understand the relationship between price and room type, and
more. ​
9. Analyze GDP Data ​
Gross domestic product is one of the strongest indicators of a region or nation’s economic
health. In this dataset, analyze how GDP has evolved for countries over the past 50 years. ​
10. Olympics Data Analysis ​
Which country is the winningest in judo? How does athlete height impact success in a sport? With
exploratory analysis of the Olympics dataset, you’ll be able to answer questions like these.

Intermediate Python Projects​


Going beyond beginner tasks and datasets, this set of Python projects will challenge you by
working with non-tabular data sets (e.g., images, audio) and test your machine learning chops
on various problems.​
1. Classify Song Genres from Audio Data​
Are you a genuine music lover? Then, you will enjoy predicting music genres with machine
learning on a music dataset in this audio recognition project.​
2. Analyze and Visualize Uber Pickups in New York​
Datasets with geolocations are always fun to analyze and visualize on a map. This Uber pickup
dataset of more than 20 million ride-hails in New York City is no exception.
3. Handwritten Character Recognition​
MNIST digits recognition is a great starting point for practicing deep learning. However, this
dataset adds another layer of challenge because you are predicting English handwritten letters.​
4. Credit Card Fraud Detection​
Credit card fraud is always a challenge — mainly because there will be a severe class imbalance
in the data. See if you can get around that in this credit card fraud dataset. ​
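One common way to work around class imbalance is to weight each class inversely to its frequency, so that rare fraud cases count for more during training. The sketch below is a minimal plain-Python illustration, not the approach the project prescribes: the labels are made up rather than drawn from the fraud dataset, and the formula n_samples / (n_classes * class_count) is just one widely used heuristic.

```python
from collections import Counter


def balanced_class_weights(labels):
    """Return a weight per class, inversely proportional to its frequency.

    Uses the heuristic n_samples / (n_classes * class_count), so rare
    classes receive larger weights than common ones.
    """
    counts = Counter(labels)
    n_samples = len(labels)
    n_classes = len(counts)
    return {cls: n_samples / (n_classes * count) for cls, count in counts.items()}


# Hypothetical labels: 0 = legitimate, 1 = fraud (made-up data for illustration)
example_labels = [0] * 98 + [1] * 2
weights = balanced_class_weights(example_labels)
```

Weights like these could then be passed to any model that accepts per-class weights. Resampling the minority class is another option worth comparing.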
5. Gender Prediction Using Sound​
In this audio data project, you will use the fuzzy package to categorize the gender of names
based on phonemes and how they sound.
6. Hotel Booking Cancellation Rates​
If you are into real estate, this is an excellent dataset to play around with to understand hotel
booking cancellation rates. With simple machine learning techniques, you can try to predict the
likelihood of hotel cancellations based on historical data.​
7. Face Detection in Images​
Ever wonder how your iPhone puts little boxes around your face? That's because it performs
face detection under the hood. You can create similar functionality using this small dataset of
annotated images with faces.​
8. Predict the Species of Bees from Images​
Can a machine learning algorithm detect the species of bees based on an image? In this image
recognition project, you’ll do just that. ​
9. Analyze and Predict Bike Sharing Demand​
This bike-sharing dataset contains a wealth of information on bike rides for a bike-sharing
startup. With this dataset, you can analyze the drivers behind fluctuating demand and even
predict future demand with time series analysis and machine learning. ​
10. Build a Tweet Classifier​
Different personalities have distinct tweeting styles. In this social media analysis project,
you’ll use machine learning and natural language processing to classify whether tweets are
authored by Donald Trump or Justin Trudeau.​

Advanced Python Projects​


These advanced projects go beyond complex datasets and challenge you to apply creative
solutions to interesting problems. Whether it is creating movie recommender systems, network
analysis between characters in books, or interpreting sign language with machine learning,
these projects will provide you with enough complexity to learn new skills on the go.​
1. Build a Movie Recommender System​
Streaming platforms provide granular recommendations based on how you and others like you
interact with content. In this project, you’ll learn how to build a movie recommender system.​
2. American Sign Language Recognition
American Sign Language is the primary language used by many deaf individuals in North
America. In this image recognition project, you’ll use deep learning to recognize ASL letters.
3. Real-Time License Plate Recognition
An awesome project on recognizing license plate numbers in real-time using deep learning on
video datasets. Check out the GitHub project containing the dataset and the code. ​
4. Sentiment Analysis in Stock News Headlines​
Investor sentiment is an incredibly important indicator when looking for clues on the future
performance of a stock. With natural language processing and machine learning, you can extract
sentiment from news headlines automatically in this natural language processing project. ​
5. SMS Spam Detection​
Spam detection is a cornerstone of data science and requires a combination of natural language
processing and machine learning techniques. Create a spam detection tool with this SMS
dataset.​
6. Network Analysis of Game of Thrones​
While a bit dated at this point, Game of Thrones captured the world’s imagination unlike any
other show. With such a vast set of characters and lore, who was the most important of them
all? In this network analysis project, you’ll answer just this question.
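One simple way to make "most important" concrete is degree centrality: the share of other characters a given character interacts with. The sketch below assumes the network is available as a list of undirected pairs. The node names are invented placeholders, and degree centrality is only one of several importance measures the project could use.

```python
from collections import defaultdict


def degree_centrality(edges):
    """Compute degree centrality for an undirected graph given as (u, v) pairs.

    Degree centrality is the number of distinct neighbors a node has,
    divided by the maximum possible (number of nodes minus one).
    """
    neighbors = defaultdict(set)
    for u, v in edges:
        if u == v:
            continue  # ignore self-loops
        neighbors[u].add(v)
        neighbors[v].add(u)
    n = len(neighbors)
    if n < 2:
        return {node: 0.0 for node in neighbors}
    return {node: len(adj) / (n - 1) for node, adj in neighbors.items()}


# Hypothetical interaction pairs, invented purely for illustration
example_edges = [("A", "B"), ("A", "C"), ("A", "D"), ("B", "C")]
centrality = degree_centrality(example_edges)
most_central = max(centrality, key=centrality.get)
```

Other measures, such as betweenness centrality or PageRank, can rank characters differently, so it is worth comparing several.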
7. Reducing Traffic Mortality with Machine Learning​
In this machine learning project, you’ll dig through historical data on traffic mortality in the
USA by state and apply machine learning to find similarities and differences between states and
provide granular policy recommendations.​
8. Movie Similarity in Plot Summaries ​
With so many movies available, it’s easy to think of movies that are similar to each other. What
if you could use natural language processing and machine learning to categorize movies based on
their plot summaries? In this Python project, you’ll do exactly that.
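A basic way to compare plot summaries is to turn each one into a bag-of-words vector and measure the cosine similarity between vectors. The sketch below uses only the standard library; the film titles and summaries are made up, and a real project would likely use a weighting scheme such as TF-IDF rather than raw counts.

```python
import math
import re
from collections import Counter


def bag_of_words(text):
    """Lowercase the text and count its alphabetic tokens."""
    return Counter(re.findall(r"[a-z']+", text.lower()))


def cosine_similarity(a, b):
    """Cosine similarity between two token-count vectors (Counters)."""
    dot = sum(a[token] * b[token] for token in a.keys() & b.keys())
    norm_a = math.sqrt(sum(c * c for c in a.values()))
    norm_b = math.sqrt(sum(c * c for c in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


# Made-up plot summaries, for illustration only
plots = {
    "film_x": "a detective hunts a killer in the city",
    "film_y": "a detective chases a killer across the city",
    "film_z": "two friends open a bakery in a small town",
}
vectors = {title: bag_of_words(text) for title, text in plots.items()}
sim_xy = cosine_similarity(vectors["film_x"], vectors["film_y"])
sim_xz = cosine_similarity(vectors["film_x"], vectors["film_z"])
```

Here the two detective stories score higher with each other than either does with the bakery story. Common words such as "a" and "the" still inflate the scores, which is one reason to remove stop words or reweight terms.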
9. Movie Genre Classification with Multi-Label Output​
A movie can combine genres. With this Netflix movie dataset, you can apply multi-label
classification to predict the many genres a movie may have based on its description, rating, and
more.​
10. Build and Deploy a Machine Learning Pipeline ​
While this is not a specific project, deploying and maintaining the other projects on this list is an
incredibly useful skill to showcase to employers. In this tutorial, you’ll learn exactly how to do
that. ​

Fun Python Projects to Build Your Python Skills​


While not the most complex, these projects provide interesting and engaging datasets to explore
and get started with to accelerate your Python learning journey. ​
1. Spooky Author Identification​
Classify the works of mystery writers. Find out whether an excerpt belongs to Edgar Allan Poe,
H. P. Lovecraft, or Mary Shelley.
2. Video Game Sales Prediction
Are you waiting for an upcoming game from Activision or EA? Try predicting how well it would
sell using the data from 16k+ past video games. ​
3. Myers-Briggs (MBTI) Personality Type Prediction
There are 16 personality types according to the MBTI indicator. Instead of Googling it, try
predicting your personality using this personality type dataset.​
4. Explore Bitcoin Price Data​
Cryptocurrency prices have enamored the world with their extreme volatility. In this project,
you’ll apply time series analysis and data visualization techniques to Bitcoin prices. ​
5. Song Popularity Prediction​
In this great dataset of songs from the 50s, you can predict a song's popularity based on several
attributes.​
6. Analyze Fitness Tracker Data​
With the rise of fitness trackers comes an abundance of data that you can analyze. In this data
analysis project, you’ll analyze and visualize Runkeeper fitness tracker data. ​
7. Bust Myths with Data​
A 1991 study found that left-handed people die nine years earlier than right-handed people on
average. Is this actually true? Find out in this statistical analysis project. ​
8. Analyze Breathalyzer Data
Using data collected from breathalyzers in the state of Iowa, you’ll be able to visualize and
analyze drunkenness in Iowa and find patterns that can lead to better policy decisions.
9. Get on Top of the Music Billboards​
With this Spotify dataset of ~600 songs from 2010 to 2019, you’ll be able to explore and
analyze how popular genres have evolved over the past decade, predict a song’s genre based
on key attributes, and more. ​
10. Analyze a Lego Database​
While this project also requires some SQL skills, this Lego database allows you to dig through
thousands of Lego sales throughout the year and understand which Lego sets are driving the
most sales.
Additional Guided & Unguided Python Projects for Practice
Guided Python Projects for Practice
1. Predicting Credit Card Approvals​
Automated credit card approvals are a huge machine learning use case in banking. In this
project, you will learn how to predict whether a credit card application gets accepted or rejected
by banks.​
2. Uncover Trending Topics in Machine Learning Research​
In this project, you will apply machine learning to discover the future of machine learning
research trends by analyzing the past decade's Neural Information Processing Systems papers.​
3. Blood Donor Classification​
Blood donations are lifesavers. In this project, analyze the patterns in blood donations and
predict if a person will donate again in the future.​
4. Comparing Cosmetics by Ingredients​
Choosing a cosmetic product that won't jeopardize your skin health is hard. In this guided
project, you’ll learn to process the ingredients of cosmetics to make a more informed decision
about whether a new cosmetic is good for you.​
5. A Visual History of Nobel Prize Winners​
Almost everyone in research dreams of getting a Nobel once in their lives. But does your age,
race, and gender affect your chances? Find out by analyzing the data on the winners since 1901.​
6. The GitHub History of the Scala Language​
Scala ranks as the 34th most popular programming language according to the TIOBE index.
Learn how it came to be so by analyzing the history of its GitHub repository in this guided
project.​
7. Exploring the Evolution of Linux​
Version control systems like Git store rich information about a software project’s evolution. In
this project, you will analyze and transform the real Git repository of the Linux Kernel and
understand how 700K+ commits created one of the most widely used operating systems on
earth. ​
8. Recreating John Snow’s Ghost Map​
Doctor John Snow (not the Game of Thrones character) mapped cholera cases by hand and
deduced the origins of outbreaks in his area, giving birth to modern epidemiology. In this
project, you’ll recreate his work and his famous map. ​
9. A New Era of Data Analysis in Baseball​
Moneyball ushered in the era of sports analytics. In this project, you’ll analyze MLB Statcast
data to compare different baseball players and understand what drives home runs. ​
10. Generating Keywords for Google Ads​
Generating keywords for search ads is an incredibly meticulous and cumbersome process. What
if you could automate this task with Python? In this project, you’ll learn how to do exactly that.
11. Mobile Games A/B Testing ​
A/B testing fuels the success of so many digital products and services, and mobile games are a
great testament to that. In this project, you’ll understand the impact of an experiment run in
the popular Cookie Cats game on user retention. ​
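To judge whether a retention difference between two experiment groups is more than noise, one standard tool is a two-proportion z-test. The sketch below is a standard-library illustration under stated assumptions: the counts are hypothetical, not real Cookie Cats figures, and the guided project may well use a different method such as bootstrapping.

```python
import math


def two_proportion_z_test(successes_a, n_a, successes_b, n_b):
    """Two-sided z-test for the difference between two proportions.

    Returns (z, p_value) using the pooled-proportion standard error.
    """
    p_a = successes_a / n_a
    p_b = successes_b / n_b
    pooled = (successes_a + successes_b) / (n_a + n_b)
    se = math.sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    if se == 0:
        raise ValueError("standard error is zero; proportions are degenerate")
    z = (p_a - p_b) / se
    # Two-sided p-value from the standard normal distribution
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return z, p_value


# Hypothetical counts of retained players per group (not real Cookie Cats data)
z, p_value = two_proportion_z_test(successes_a=450, n_a=1000, successes_b=400, n_b=1000)
```

A small p-value suggests the difference in retention is unlikely to be due to chance alone, though the practical size of the effect still deserves a look.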
12. Prioritize Debt Collection with Machine Learning​
Debt delinquency is a big problem for banks and financial institutions. In this project, you’ll
use machine learning and regressions to understand how to prioritize debt collection for a bank. ​
13. Book Recommender System from Charles Darwin​
Charles Darwin was an avid reader and had an extensive bibliography. In this project, you’ll
use Charles Darwin’s favorite books to create a recommender system that provides book
recommendations based on his tastes. ​

Unguided Python Projects for Practice​


1. Investigating Netflix Movies and Guest Stars in The Office
In this project, you’ll manipulate and visualize the performance of Netflix movies and the
guest stars in the cultural phenomenon series “The Office.”
2. Exploring the History of Lego​
About 1140 pieces of Lego are produced every second. Find out how the most popular toy brand
in the world became so dominant by analyzing its historical sales data. ​
3. The Discovery of Handwashing​
Washing hands is second nature to all of us, but it has not always been so in the past. In fact,
Hungarian physician Ignaz Semmelweis discovered the benefits of hand washing by analyzing
the mortality data of patients in hospitals. Recreate his data analysis here.​
4. The Android App Market in Google Play​
The Android app market is vast and competitive. Analyze and visualize this dataset scraped
from the Google Play Store to find out what makes a great app.​
5. Word Frequency in Classic Novels​
In this project, you’ll scrape a novel from the Project Gutenberg website and then analyze the
distribution of words in a large corpus of books. ​
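Once the text is downloaded, the word-distribution step can be sketched with the standard library alone. The example below skips the scraping part entirely: a short made-up passage stands in for the novel, and the stop-word set is a tiny hypothetical one.

```python
import re
from collections import Counter


def word_frequencies(text, stop_words=frozenset()):
    """Count lowercase word tokens in text, skipping any stop words."""
    tokens = re.findall(r"[a-z]+(?:'[a-z]+)?", text.lower())
    return Counter(token for token in tokens if token not in stop_words)


# A short made-up passage stands in for the text of a downloaded novel
sample_text = "The whale surfaced. The crew watched the whale, and the whale watched back."
frequencies = word_frequencies(sample_text, stop_words={"the", "and"})
top_two = frequencies.most_common(2)
```

On a full novel, the same counter can feed a bar chart of the most frequent words.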
6. Bad Passwords and the NIST Guidelines​
Almost every site requires a password, so how do you know if you’re using the best one? In this
project, you will create a system that automatically checks if your password conforms to the
National Institute of Standards and Technology (NIST) guidelines.
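A password checker of this kind might look like the sketch below. It covers only a small subset of ideas from NIST's password guidance, assuming an 8-character minimum and a blocklist of common passwords. The function name, the blocklist contents, and the repeated-character rule are illustrative assumptions, not the project's actual implementation.

```python
def check_password(password, blocklist, min_length=8):
    """Return a list of problems found; an empty list means no issue detected.

    Checks a small subset of ideas from NIST's password guidance:
    minimum length, a blocklist of common passwords, and passwords
    made of a single repeated character.
    """
    problems = []
    if len(password) < min_length:
        problems.append(f"shorter than {min_length} characters")
    if password.lower() in blocklist:
        problems.append("appears in the list of common passwords")
    if password and len(set(password)) == 1:
        problems.append("consists of one repeated character")
    return problems


# Tiny hypothetical blocklist; a real check would load a much larger list
COMMON_PASSWORDS = {"password", "12345678", "qwertyuiop"}

weak = check_password("Password", COMMON_PASSWORDS)
short = check_password("abc", COMMON_PASSWORDS)
ok = check_password("correct horse battery staple", COMMON_PASSWORDS)
```

Returning a list of problems rather than a single boolean makes it easier to tell the user what to fix.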
7. Comparing Search Interest with Google Trends​
Google exposes its Trends API in Python so that users can find out the search interest of any
keyword. It is an excellent source of time series data with records dating back to 2004. In this
project, you’ll explore worldwide search interest in five major internet browsers.​
8. Exploring the NYC Airbnb Market
Leverage data cleaning and manipulation to uncover insights into the Airbnb market of New
York City.​

How to Choose Which Python Projects to Add to Your Resumé​


With this long list of Python projects, how do you choose one to add to your resumé? According
to Nick Singh, author of the best-selling book "Ace the Data Science Interview," here are four key
principles to think of when you’re pursuing Python projects.​
1. Projects Should Come Out of Genuine Interest​
Doing a project on a topic you care about will make the whole process more engaging to you
and increase your chances of completion. Moreover, this enthusiasm will carry over when
speaking to a hiring manager about your project. ​
2. Simplicity Trumps Complexity​
Today, it is easy to get distracted by fancy tools and cutting-edge techniques. However, data
science in the real world requires a simple, pragmatic approach to solution building. One of
the goals of a project is to showcase your ability to develop useful data science solutions with
relatively simple techniques. ​
3. Always Complete Your Project​
It’s easy to fall into scope creep when doing a project. As a rule of thumb, always scope out a
project that you know you can complete from A to Z — even if it means just a simple data
analysis exercise. ​
4. The Project Should Have a Quantifiable Impact​
Once a project is complete, make sure to share your work and gain feedback from the
community in a quantifiable manner. Whether it is GitHub stars, LinkedIn shares, or Reddit
mentions—sharing your work is the best way to showcase the quantifiable impact of your
project to potential hiring managers.​
