
Self-Study Plan - Data Science

Topic 1: Learn a Programming Language


Programming is a must-have skill for a data scientist. Several programming languages are used for
data science, including Python, R, and Julia.
As a beginner, you can start with Python, for which many free learning resources are available.

Free Resources to Learn Python Programming


1. Introduction to Python Programming (Udacity Free Course)
2. The Python Tutorial (python.org)
3. CS Dojo (YouTube)
4. Python 3 Tutorial (Sololearn)
5. Python For Data Science (Udemy Free Course)
6. Programming with Mosh (YouTube)
7. Corey Schafer (YouTube)

Free Resources to Learn R


1. R Basics – R Programming Language Introduction (Udemy Free Course)
2. R Programming (Coursera Free to Audit Course)
3. Learn R Quickly (Udemy Free Course)
4. R, ggplot, and Simple Linear Regression (Udemy Free Course)
5. R Programming Tutorial (YouTube Tutorial)
6. R Programming Full Course In 7 Hours (YouTube Tutorial)

Topic 2: Learn Math & Statistics


To learn data science, you need a good understanding of statistics and mathematics. Statistics helps
you decide which algorithm suits a given problem. Mathematics helps you recognize under-fitting
(a model too simple to capture the pattern in the data) and over-fitting (a model that memorizes
the training data, noise included) through the bias-variance tradeoff.
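
The idea of under-fitting and over-fitting can be sketched in a few lines of plain Python. This is only an illustration under stated assumptions: the data is synthetic (a made-up straight-line signal plus random noise), and the two models are deliberately extreme. One ignores the input entirely, the other memorizes every training point.

```python
import random
from statistics import mean

random.seed(0)  # fixed seed so the run is repeatable


def make_data(n):
    """Synthetic data (an assumption for illustration): y = 2x + 1 plus noise."""
    xs = [random.uniform(0.0, 10.0) for _ in range(n)]
    ys = [2.0 * x + 1.0 + random.gauss(0.0, 2.0) for x in xs]
    return xs, ys


train_x, train_y = make_data(40)
test_x, test_y = make_data(40)


def mse(predict, xs, ys):
    """Mean squared error of a prediction function on a dataset."""
    return mean((predict(x) - y) ** 2 for x, y in zip(xs, ys))


# Under-fitting (high bias): ignore x and always predict the training mean.
train_mean = mean(train_y)


def constant_model(x):
    return train_mean


# Over-fitting (high variance): return the y of the closest training point,
# which reproduces the training set exactly, noise included.
def nearest_neighbor_model(x):
    nearest = min(range(len(train_x)), key=lambda i: abs(train_x[i] - x))
    return train_y[nearest]


results = {
    "constant": (mse(constant_model, train_x, train_y),
                 mse(constant_model, test_x, test_y)),
    "nearest_neighbor": (mse(nearest_neighbor_model, train_x, train_y),
                         mse(nearest_neighbor_model, test_x, test_y)),
}

for name, (train_error, test_error) in results.items():
    print(f"{name}: train MSE = {train_error:.2f}, test MSE = {test_error:.2f}")
```

The memorizing model has zero error on the data it was trained on but a nonzero error on new data, which is the signature of over-fitting. The constant model cannot follow the trend at all, so its error stays large even on the training data, which is the signature of under-fitting.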
Free Resources to Learn Math & Statistics
1. Intro to Statistics (Udacity Free Course)
2. Introduction to Statistics (Coursera Free to Audit Course)
3. Intro to Inferential Statistics (Udacity Free Course)
4. Intro to Descriptive Statistics (Udacity Free Course)
5. Statistics and probability (Khan Academy)
6. Mathematics for Machine Learning: Linear Algebra (Coursera Free to Audit Course)
7. Mathematics for Machine Learning: Multivariate Calculus (Coursera Free to Audit Course)
8. Linear Algebra Refresher Course (Udacity Free Course)
9. Multivariable calculus (Khan Academy)
10. Learn Linear Algebra (Khan Academy)
11. A Survey of Optimization Methods from a Machine Learning Perspective (Research Paper)
12. Optimization Methods for Large-Scale Machine Learning (Research Paper)
13. How optimization for machine learning works (YouTube Video)

Compiled By: Puspanjali Sarma


Topic 3: Learn Data Science Libraries
Libraries are collections of pre-existing functions and objects. You can import these libraries into
your script to save time.
If you learn Python, you need to learn the following Python libraries for data science:
• NumPy - NumPy lets you perform fast numerical operations on arrays of data. Data that is not
already in numeric form (for example, text categories) usually has to be encoded as numbers
before NumPy can compute on it.
• pandas - pandas is an open-source data analysis and manipulation tool. With pandas, you work
with DataFrames, which are labeled tables of rows and columns, similar to a spreadsheet in Excel.
• Matplotlib - Matplotlib lets you draw graphs and charts of your findings. Results are sometimes
hard to understand in tabular form, so converting them into a graph is important.
• Scikit-Learn - Scikit-Learn is one of the most popular machine learning libraries in Python. It
provides various machine learning algorithms along with modules for pre-processing,
cross-validation, and more.
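
As a small sketch of how NumPy and pandas are typically used together, assuming both are installed: the student names, groups, and scores below are made up for illustration and do not come from any real dataset.

```python
import numpy as np
import pandas as pd

# Made-up exam scores (illustration only).
scores = np.array([72.0, 85.0, 90.0, 64.0])

# NumPy: arithmetic and statistics applied to the whole array at once.
mean_score = scores.mean()  # 77.75
scaled = (scores - scores.min()) / (scores.max() - scores.min())  # rescaled to 0..1

# pandas: a labeled, spreadsheet-like table built on top of the array.
df = pd.DataFrame({
    "student": ["Ana", "Ben", "Cy", "Dee"],
    "group": ["A", "B", "A", "B"],
    "score": scores,
})

# Average score per group, similar to a pivot table in Excel.
group_means = df.groupby("group")["score"].mean()

# Encoding a text column as integer codes so it can be used in numeric work.
codes, categories = pd.factorize(df["group"])

print(mean_score)
print(group_means)
print(codes.tolist(), list(categories))
```

Here `pd.factorize` is one of several ways to turn categories into numbers; which encoding is appropriate depends on the model you feed the data to.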
Free Resources to Learn Python Libraries
1. Learn NumPy Fundamentals (Python Library for Data Science) (Udemy Free Course)
2. NumPy for Data Science Beginners: 2022 (Udemy Free Course)
3. NumPy Tutorial by freeCodeCamp
4. Pandas (Kaggle)
5. NumPy user guide
6. pandas documentation
7. Matplotlib Guide
8. scikit-learn Tutorial

Topic 4: Learn SQL Skills


You should know how to store and manage your data in a database, which is why you need a working
understanding of SQL. You can manipulate data with both SQL and pandas, but certain data
manipulation tasks are easier to perform in SQL.
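
A minimal sketch of such a task, using Python's built-in `sqlite3` module with an in-memory database so nothing needs to be installed. The `orders` table, its columns, and its rows are hypothetical and exist only for this example.

```python
import sqlite3

# An in-memory database: it disappears when the connection is closed.
conn = sqlite3.connect(":memory:")

# Hypothetical table of customer orders (illustration only).
conn.execute("CREATE TABLE orders (customer TEXT, amount REAL)")
conn.executemany(
    "INSERT INTO orders VALUES (?, ?)",
    [("alice", 30.0), ("bob", 12.5), ("alice", 20.0)],
)

# Total spend per customer, largest first: one declarative query.
rows = conn.execute(
    """
    SELECT customer, SUM(amount) AS total
    FROM orders
    GROUP BY customer
    ORDER BY total DESC
    """
).fetchall()
conn.close()

print(rows)  # [('alice', 50.0), ('bob', 12.5)]
```

Filtering, grouping, and sorting are expressed in a single statement, and the database does the work before any data reaches your script.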
Free Resources to Learn SQL
1. W3Schools
2. SQL for Data Analysis (Udacity Free Course)
3. SQL for Data Science (edX Free to Audit Course)
4. SQL for Data Analysis: Solving real-world problems with data (Udemy Free Course)
5. SQL Crash Course for Aspiring Data Scientist (Udemy Free Course)
6. SQL Tutorial

Topic 5: Learn Data Visualization


As a data scientist, you have to present your findings in visual form so that stakeholders can
understand them properly. That’s why knowledge of data visualization is important. You should be
familiar with data visualization tools like ggplot2, Matplotlib, Seaborn, and D3.js.

You should also know reporting tools like Tableau and Power BI.
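
As a rough sketch of turning a small table into a chart with Matplotlib, assuming the library is installed: the region names and sales figures are invented for illustration, and the non-interactive `Agg` backend is chosen so the example also runs on machines without a display.

```python
import io

import matplotlib
matplotlib.use("Agg")  # render off-screen; no window or display needed
import matplotlib.pyplot as plt

# Invented figures (illustration only).
regions = ["North", "South", "East", "West"]
sales = [120, 95, 140, 80]

fig, ax = plt.subplots(figsize=(6, 4))
bars = ax.bar(regions, sales)
ax.set_title("Sales by region (made-up data)")
ax.set_xlabel("Region")
ax.set_ylabel("Units sold")

# Save the chart as PNG bytes; fig.savefig("sales.png") would write a file instead.
buffer = io.BytesIO()
fig.savefig(buffer, format="png")
plt.close(fig)
png_bytes = buffer.getvalue()
```

A title and labeled axes are what make a chart readable for stakeholders who have not seen the underlying table.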

Free Resources to Learn Data Visualization
1. Data Visualization in Tableau (Udacity Free Course)
2. Fundamentals of Visualization with Tableau (Coursera Free to Audit Course)
3. Complete Tableau Training for Absolute Beginners (Udemy Free Course)
4. Data Analysis and Visualization (Udacity Free Course)
5. Data Visualization (Kaggle)
6. Data Visualization and D3.js (Udacity Free Course)
7. Data Visualization in Python Masterclass™ for Data Scientist (Udemy Free Course)
8. Free Training Videos (Tableau)
9. Creating Dashboards and Storytelling with Tableau (Coursera Free to Audit Course)
10. Tableau | A Quick Start Guide (Udemy Free Course)

Topic 6: Learn Machine Learning Algorithms


Algorithms are the backbone of any ML problem. You need to learn the basics of machine learning and
the types of machine learning algorithms (supervised, unsupervised, semi-supervised, and
reinforcement learning).
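
The difference between the first two types can be sketched in plain Python. This is a toy illustration, not how a library such as Scikit-Learn implements these methods: the measurements and the labels "small" and "large" are made up, the supervised part is a simple nearest-centroid classifier, and the unsupervised part is a two-cluster k-means on the same numbers with the labels withheld.

```python
from statistics import mean

# Made-up 1-D measurements and their labels (illustration only).
values = [1.0, 1.2, 0.8, 5.0, 5.3, 4.9]
labels = ["small", "small", "small", "large", "large", "large"]


# Supervised learning: labels are given, so learn one centroid per label.
def fit_centroids(xs, ys):
    groups = {}
    for x, y in zip(xs, ys):
        groups.setdefault(y, []).append(x)
    return {label: mean(members) for label, members in groups.items()}


def predict(centroids, x):
    """Return the label whose centroid is closest to x."""
    return min(centroids, key=lambda label: abs(centroids[label] - x))


centroids = fit_centroids(values, labels)
prediction = predict(centroids, 4.2)  # 'large'


# Unsupervised learning: no labels, so discover two groups from the data alone.
def two_means(xs, iterations=10):
    centers = [min(xs), max(xs)]  # simple starting guess
    clusters = [[], []]
    for _ in range(iterations):
        clusters = [[], []]
        for x in xs:
            nearest = 0 if abs(x - centers[0]) <= abs(x - centers[1]) else 1
            clusters[nearest].append(x)
        # Move each center to the mean of its cluster (keep it if the cluster is empty).
        centers = [mean(c) if c else centers[i] for i, c in enumerate(clusters)]
    return centers, clusters


centers, clusters = two_means(values)

print(prediction)
print(clusters)
```

The supervised model can name its answer because it was shown named examples. The unsupervised one only reports that the data falls into two groups; giving those groups a meaning is left to you.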
You can watch Andrew Ng's Machine Learning course to understand the basics. You can also
check these machine learning resources.
Free Resources to Learn Machine Learning
1. Machine Learning by Georgia Tech (Udacity Free Course)
2. Introduction to Machine Learning Course (Udacity Free Course)
3. Machine Learning: Unsupervised Learning (Udacity Free Course)
4. Machine Learning by Stanford University (Coursera Free to Audit Course)
5. Machine Learning for All by University of London (Coursera Free to Audit Course)
6. What is Machine Learning? (Udemy Free Course)
7. Machine Learning Fundamentals (edX Free to Audit Course)

Topic 7: Take Part in Data Science Competitions


Now it’s time to practice and test your command of data science. The best way to practice is to take
part in competitions, which will make you more proficient.
Kaggle is one of the most popular platforms for data science competitions.
You can also check these platforms for data science competitions:

1. DrivenData
2. CodaLab
3. Iron Viz
4. Topcoder
5. CrowdANALYTIX Community
6. Bitgrit

For all of these topics, the Edureka YouTube channel is another good platform to learn from.
