
Photo by Scott Graham on Unsplash

3 must-have projects for your data science portfolio
and how to build them

Aakash N S Jan 14 · 3 min read

Before you start applying for data science jobs, make sure to complete at
least one project in each of these three important domains:

1. Exploratory Data Analysis and Visualization

2. Classical Machine Learning on Tabular Data

3. Deep Learning (Computer Vision/NLP)

You can host these projects on your GitHub or Jovian profile. Here’s mine:

aakashns's notebooks on Jovian


aakashns is using Jovian to collaborate on 163 Jupyter notebooks.
jovian.ai

Project 1: Exploratory Data Analysis and Visualization (EDA)

Check out these projects for inspiration:

1. Analyzing your WhatsApp messages by Michael Chia Yin

2. Understanding your Browsing Patterns using Pandas by Kartik Godawat

3. What Makes a Student Prefer a University by Daniela Cruz

Here are the steps for building a project on EDA & visualization:

1. Find a real-world dataset of your choice online

2. Use Numpy & Pandas to parse, clean & analyze data

3. Use Matplotlib & Seaborn to create visualizations

4. Ask and answer interesting questions about the data

5. Document & publish your work in a Jupyter notebook or blog post
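
The steps above can be sketched in a few lines of Pandas and Matplotlib. This is only an illustrative outline, not a full project: the tiny inline CSV, its column names, and the helper functions `load_and_clean`, `mean_temperature_by_city`, and `plot_means` are all made up for the example and stand in for a real-world dataset you would download yourself.

```python
import io

import matplotlib
matplotlib.use("Agg")  # render off-screen so the sketch also runs without a display
import matplotlib.pyplot as plt
import pandas as pd

# Hypothetical raw data standing in for a real-world CSV file.
RAW_CSV = """date,city,temperature_c
2021-01-01,Delhi,14.5
2021-01-01,Mumbai,25.1
2021-01-02,Delhi,
2021-01-02,Mumbai,26.0
2021-01-03,Delhi,15.5
2021-01-03,Mumbai,not recorded
"""


def load_and_clean(csv_text):
    """Parse the CSV and drop rows without a usable temperature reading."""
    df = pd.read_csv(io.StringIO(csv_text), parse_dates=["date"])
    # Turn unparseable readings (e.g. "not recorded") into NaN.
    df["temperature_c"] = pd.to_numeric(df["temperature_c"], errors="coerce")
    return df.dropna(subset=["temperature_c"]).reset_index(drop=True)


def mean_temperature_by_city(df):
    """Answer one simple question: which city is warmer on average?"""
    return df.groupby("city")["temperature_c"].mean()


def plot_means(means, path):
    """Save a bar chart of the per-city means to an image file."""
    fig, ax = plt.subplots(figsize=(4, 3))
    means.plot(kind="bar", ax=ax)
    ax.set_ylabel("Mean temperature (°C)")
    ax.set_title("Mean temperature by city")
    fig.tight_layout()
    fig.savefig(path)
    plt.close(fig)


df = load_and_clean(RAW_CSV)
means = mean_temperature_by_city(df)
print(means)
```

Calling `plot_means(means, "mean_temperature.png")` would write the chart to disk. In a real project, each question you ask of the data would get its own short analysis and chart like this, with a sentence or two explaining what you found.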

Take our course on Data Analysis with Python: Zero to Pandas to learn the
skills required for building projects on Exploratory Data Analysis and
Visualization.

Project 2: Classical Machine Learning on Structured Data

Check out these projects for inspiration:

1. New York Taxi Fare Prediction by Allen Kong

2. Predicting the Auction Price of Bulldozers by Ankur Singh

3. Building the Hogwarts Sorting Hat using Logistic Regression by Ekaterina Derevyanka

Here are the steps for building a classical machine learning project:

1. Find an interesting tabular dataset online (typically in CSV/JSON format)

2. Identify the type of problem: regression, classification, unsupervised learning, etc.

3. Clean the data if required and perform exploratory data analysis

4. Do some feature engineering, i.e., create new & useful features from existing ones

5. Identify the right modeling approaches, e.g., decision trees, regression, gradient boosting, etc.

6. Train a model and evaluate its performance using K-fold cross-validation

7. Experiment with different modeling approaches & hyperparameters

8. Document & publish your work in a Jupyter notebook or blog post
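
As a rough sketch of steps 4 to 7, the snippet below compares two modeling approaches with 5-fold cross-validation in scikit-learn. The data is synthetic and invented purely for illustration (a made-up taxi-fare table with hypothetical columns `distance_km`, `passengers`, and `hour`), and the choice of models and of the R² metric is an assumption, not a recommendation for any particular dataset.

```python
import numpy as np
import pandas as pd
from sklearn.ensemble import GradientBoostingRegressor
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import KFold, cross_val_score

# Synthetic stand-in for a real tabular dataset (all columns are made up).
rng = np.random.default_rng(42)
n_rows = 200
df = pd.DataFrame({
    "distance_km": rng.uniform(1, 20, n_rows),
    "passengers": rng.integers(1, 5, n_rows),
    "hour": rng.integers(0, 24, n_rows),
})
# The target depends mostly on distance, plus some random noise.
df["fare"] = 2.5 + 1.8 * df["distance_km"] + rng.normal(0, 1.0, n_rows)

# Feature engineering: derive a new feature from an existing one.
df["is_night"] = ((df["hour"] >= 22) | (df["hour"] < 6)).astype(int)

X = df[["distance_km", "passengers", "hour", "is_night"]]
y = df["fare"]

# K-fold cross-validation: every row is used for validation exactly once.
cv = KFold(n_splits=5, shuffle=True, random_state=42)

models = {
    "linear_regression": LinearRegression(),
    "gradient_boosting": GradientBoostingRegressor(random_state=42),
}

scores = {}
for name, model in models.items():
    fold_scores = cross_val_score(model, X, y, cv=cv, scoring="r2")
    scores[name] = fold_scores.mean()
    print(f"{name}: mean R^2 over {len(fold_scores)} folds = {scores[name]:.3f}")
```

Experimenting with hyperparameters then amounts to adding more entries to `models` (for example, gradient boosting with a different `n_estimators`) and comparing the cross-validated scores.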

Check out these courses on Coursera and Udemy to learn the skills
required to build a classical machine learning project.

Project 3: Deep Learning on Unstructured Data


Check out these projects for inspiration:

1. Blindness Detection using Image Classification

2. Generating New Artworks using GANs

3. Bounding Box Prediction using PyTorch

4. Classifying Environment Audio Recordings

Here are the steps for building a deep learning project:

1. Find an interesting unstructured dataset online (images, text, audio, etc.)

2. Identify the type of problem: regression, classification, generative modeling, etc.

3. Identify the type of neural network you need: fully-connected, convolutional, recurrent, etc.

4. Prepare the dataset for training (set up batches, apply augmentations & transforms)

5. Define a network architecture and set up a training loop

6. Train the model and evaluate its performance using a validation/test set

7. Experiment with different network architectures, hyperparameters & regularization techniques

8. Document and publish your work in a Jupyter notebook or blog post
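
To make steps 4 to 6 concrete, here is a minimal PyTorch sketch of batching, a training loop, and evaluation on a validation set. It is deliberately simplified: the data is two synthetic clusters of 2-D points rather than real images, text, or audio, and the small fully-connected network, batch size, learning rate, and epoch count are arbitrary assumptions chosen only so the example runs quickly.

```python
import torch
from torch import nn
from torch.utils.data import DataLoader, TensorDataset, random_split

torch.manual_seed(0)

# Synthetic stand-in for a real dataset: two well-separated clusters of points.
n_per_class = 200
class0 = torch.randn(n_per_class, 2) + torch.tensor([-2.0, -2.0])
class1 = torch.randn(n_per_class, 2) + torch.tensor([2.0, 2.0])
inputs = torch.cat([class0, class1])
labels = torch.cat([
    torch.zeros(n_per_class, dtype=torch.long),
    torch.ones(n_per_class, dtype=torch.long),
])

# Prepare the data: hold out a validation set and set up batches.
dataset = TensorDataset(inputs, labels)
train_ds, val_ds = random_split(dataset, [320, 80])
train_loader = DataLoader(train_ds, batch_size=32, shuffle=True)
val_loader = DataLoader(val_ds, batch_size=80)

# Define a small fully-connected network for 2-class classification.
model = nn.Sequential(nn.Linear(2, 16), nn.ReLU(), nn.Linear(16, 2))
loss_fn = nn.CrossEntropyLoss()
optimizer = torch.optim.SGD(model.parameters(), lr=0.1)


def evaluate(model, loader):
    """Return the fraction of correctly classified examples in `loader`."""
    model.eval()
    correct = 0
    total = 0
    with torch.no_grad():
        for xb, yb in loader:
            preds = model(xb).argmax(dim=1)
            correct += (preds == yb).sum().item()
            total += len(yb)
    return correct / total


# Training loop: one pass over all training batches per epoch.
for epoch in range(10):
    model.train()
    for xb, yb in train_loader:
        optimizer.zero_grad()
        loss = loss_fn(model(xb), yb)
        loss.backward()
        optimizer.step()

val_accuracy = evaluate(model, val_loader)
print(f"Validation accuracy: {val_accuracy:.2f}")
```

For an image or text project, the main changes would be the dataset class (with augmentations and transforms applied there) and the network architecture; the shape of the training and evaluation loops stays much the same.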

Take our course on Deep Learning with PyTorch: Zero to GANs to learn
the skills required to build a deep learning project.

Where to Find Datasets for Your Projects?


Here are some sources for finding interesting and unique datasets:

1. Kaggle Datasets (use the opendatasets library for downloading datasets)

2. Past Kaggle Competitions (check the “Completed” tab)

3. awesome-public-datasets on GitHub

4. FastAI Course Datasets

5. Curated Deep Learning Datasets

You can also export your personal data from applications like Google
Chrome, WhatsApp, Facebook, Instagram, Apple, Fitbit, etc. to analyze
and predict your own behavior!

What are you building? Tweet at us and let us know! We’d love to feature
your project on our Community Medium Publication.

