
Ashutosh Tripathi

Sr. Data Scientist


Infineon Technologies
Agenda

1 Demand of Data Scientists

2 Why You should choose Data Science as a Career Option

3 What is AI, ML, DL & DS – Terminology Know How

4 Applications of Data Science

5 Different Job Roles in Data Science domain

6 Learning Path to Become a Data Scientist

7 Where to look for resources & practice problems

8 Resume Building

9 How to increase chances for getting shortlisted


Demand of Data Scientists

2012

https://www.datarobot.com/blog/one-billion-models-built-on-the-datarobot-cloud/
Reason behind the high demand
Why you should select Data Science as a Career Option
High Paying Jobs

https://data-flair.training/blogs/data-scientist-salary/
What is AI, ML, DL and DS

Artificial Intelligence (AI)
Programs with the ability to learn and reason like humans

Machine Learning (ML)
Algorithms with the ability to learn without being explicitly programmed

Deep Learning (DL)
Subset of ML in which artificial neural networks adapt and learn from vast amounts of data

Data Science (DS)
DS is a field of study which combines Statistics and Math, Programming skills (Python / R) and, most importantly, Domain expertise to extract meaningful insights from data.

Data Science integrates all the above (AI, ML & DL) to extract insights from data (EDA) and make predictions from large datasets (Predictive Analysis).

• ~80% of World’s Data generated in last 10 Years
• ~80% of Resources improvement in last 10 Years

[Diagram: Statistics, Domain Knowledge, Programming Skills]

https://ashutoshtripathi.com/
Machine Learning Model Flow
Gathering Data
• Collecting the data is the first and an essential step in any data science and Machine Learning project.
• It can come from business units or from public datasets for building ML models.

Data Analysis
Answer questions such as:
• What variables are available?
• How are they related?
• What are the characteristics of those variables (numerical or categorical)?
• Are there missing values, outliers, etc.?

Data Pre-processing (Feature Engineering)
Our goal here is to make the data ready for building ML models. To this end, many things can be done, including but not limited to:
• Filling missing values in the data
• Dealing with (e.g., removing) outliers
• Transforming categorical values

Variable Selection (Feature Selection)
Select the subset of features, out of all features, that is critical to ML model performance. Many features could just be noise, so removing them is important for reducing overfitting and complexity.

ML Model Building
Select the ML model family for training that is best suited to our need:
• Linear Regression
• Logistic Regression
• KNN
• DT
• RF
• Ensembles
• ….

Model Tuning
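The pre-processing, feature selection and model building steps of this flow can be sketched in a few lines of Python. This is only a minimal illustration under stated assumptions: the toy dataset, the "3x the median" outlier rule, and the names `pearson`, `best_feature` and `predict` are all hypothetical, and the sketch uses only the standard library so it runs anywhere. In practice, libraries such as Pandas and sklearn would typically handle these steps.

```python
from statistics import mean, median

# Hypothetical toy data: (area_sqft, city, price). None marks a missing value.
rows = [
    (1000.0, "A", 200.0),
    (1500.0, "B", 300.0),
    (None, "A", 300.0),
    (2000.0, "B", 400.0),
    (1200.0, "A", 240.0),
    (9000.0, "B", 300.0),  # implausible area, planted as an outlier
]

# Pre-processing 1: fill missing areas with the median of the observed ones.
fill = median(r[0] for r in rows if r[0] is not None)
filled = [(fill if a is None else a, c, p) for a, c, p in rows]

# Pre-processing 2: drop outliers. The "more than 3x the median" rule is an
# assumption made for this sketch, not a standard threshold.
limit = 3 * median(r[0] for r in filled)
clean = [r for r in filled if r[0] <= limit]

# Pre-processing 3: transform the categorical column into a 0/1 code.
features = {
    "area": [a for a, _, _ in clean],
    "city_is_B": [1.0 if c == "B" else 0.0 for _, c, _ in clean],
}
target = [p for _, _, p in clean]


def pearson(xs, ys):
    """Pearson correlation between two equal-length numeric lists."""
    mx, my = mean(xs), mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    var_x = sum((x - mx) ** 2 for x in xs)
    var_y = sum((y - my) ** 2 for y in ys)
    return cov / (var_x * var_y) ** 0.5


# Feature selection: keep the feature most correlated with the target.
best_feature = max(features, key=lambda name: abs(pearson(features[name], target)))

# Model building: one-variable linear regression by ordinary least squares.
xs = features[best_feature]
mx, my = mean(xs), mean(target)
slope = sum((x - mx) * (y - my) for x, y in zip(xs, target)) / sum(
    (x - mx) ** 2 for x in xs
)
intercept = my - slope * mx


def predict(x):
    """Predict the target for a new value of the selected feature."""
    return intercept + slope * x
```

Calling `predict(1800.0)` then gives a price estimate for an unseen area. Model tuning would repeat these steps with different choices (fill strategy, outlier rule, model family) and compare the results on held-out data.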
Applications of Data Science


Different Job Roles in Data Science domain

Data Scientist
• Identify Trends & Patterns
• Predictive Analysis
• Statistical Modelling
• Forecasting

Data Analyst
• Visualization
• Transformation
• Processing
• DB Queries
• Data Optimization

Data Engineer
• Build and test scalable Big Data ecosystems
• DB Architecture
• ETL Operations

ML Engineer
• Building ML Pipeline
• Infrastructure Setup
• Model Deployment
• API Creation
• A/B Testing

Business Analyst
• Insight Presentation
• Business Understanding
• Storytelling
• Data Visualization

Statistician
• Research Oriented
• New Methodologies Creation
• Algorithm building
Learning Path to become a Data Scientist

1 Start with Statistics

2 How to code in Python or R

3 Understand Data Pre-processing

4 Data Visualization

5 Machine Learning Algorithms

6 Solve Practice Problems

7 Participate in Hackathons

https://ashutoshtripathi.com/2019/11/17/how-to-start-career-in-data-science-and-machine-learning/
Learning Path to become a Data Scientist (cont.)

Statistics
• Central Tendencies
• Measures of Variability
• Basics of Probability
• Probability distribution
• Central Limit Theorem
• Hypothesis Testing
• Confusion Matrix

Programming in Python (or R)
• Python code Editor - Jupyter
• How to code in Python
• Pandas
• Numpy
• sklearn
• Data Transformation using Pandas

Visualization
• Insights from Data
• Tableau
• Matplotlib
• Seaborn
• HighCharts in JS
• Google Charts
• Etc.

ML Algorithms
• Regression Analysis
• K-Nearest Neighbors
• Decision Tree
• Ensemble Learning
• Random forest
• Principal Component Analysis
• Etc.
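As a small illustration of three of the statistics topics listed here (central tendencies, measures of variability and the confusion matrix), the following sketch uses only Python's standard library. The sample scores and the actual/predicted labels are made-up values chosen for the example, not data from this talk.

```python
from collections import Counter
from statistics import mean, median, mode, pstdev, pvariance

# Hypothetical sample of exam scores.
scores = [2, 4, 4, 4, 5, 5, 7, 9]

# Central tendencies: where the data is centred.
central = {"mean": mean(scores), "median": median(scores), "mode": mode(scores)}

# Measures of variability: how spread out the data is
# (population variance and standard deviation).
spread = {
    "range": max(scores) - min(scores),
    "variance": pvariance(scores),
    "std_dev": pstdev(scores),
}

# Confusion matrix for a binary classifier (1 = positive, 0 = negative).
# Both label lists are hypothetical.
actual = [1, 0, 1, 1, 0, 0, 1, 0]
predicted = [1, 0, 0, 1, 0, 1, 1, 0]

counts = Counter(zip(actual, predicted))
tp = counts[(1, 1)]  # true positives
tn = counts[(0, 0)]  # true negatives
fp = counts[(0, 1)]  # false positives
fn = counts[(1, 0)]  # false negatives

accuracy = (tp + tn) / len(actual)
precision = tp / (tp + fp)
recall = tp / (tp + fn)
```

The same quantities are available in Pandas (`describe`) and sklearn (`confusion_matrix`), but computing them by hand once makes it clear what those functions report.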
Where to look for Practice Problems

https://www.kaggle.com/ https://archive.ics.uci.edu/ml/index.php

https://www.google.com/
Where to look for Hackathons

https://www.hackerearth.com/hackathon/explore/field/machine-learning/

https://datahack.analyticsvidhya.com/contest/all/

https://www.kdnuggets.com/tag/hackathon

https://www.hackathon.com/

Many more; search on Google.


Resume Building for Data Science Roles

Important Sections
• Header with formal photograph.
• Brief Technical Summary on your past work and skills. (Max. 10 lines)
• Skill Set.
• Professional Experience. (if any)
• Publications. (if any)
• Technical Contributions on Public Forums like GitHub, Stack Overflow, Quora etc. (a must in today’s competitive world)
• Certifications and Courses including Projects details.
• Achievements
• Extra-curricular Activities
• Declaration

• Most importantly, modify/update your resume based on the specific job description

Don’ts
• Unnecessary styling
• Over-explanation of projects
• Projects which you have not done

https://ashutoshtripathi.com/2019/11/21/how-to-create-an-impressive-resume/
How to increase chances of getting shortlisted

• Create an impressive LinkedIn profile and be active
• Create a GitHub profile to showcase all your technical project work
• Participate in Hackathons
• Kaggle competitions
• Use social media to build a good professional network and showcase your talent
• Update yourself with the latest technology in your field
• Utilize free courses available on sites such as Udemy, Coursera etc. to upgrade your knowledge base
• Mock interview practice


Personal Branding
Hiring Structure

Written Test
• Aptitude
• Reasoning
• Communication
• Programming (C, R, Python etc.)
• Domain Specific (Electronics VLSI, SW, HW etc.)

Technical Interviews
• Panel 1 OR Panel 2 OR Panel 3 OR Panel 4

HR Interviews
• Non-Tech: Attitude/Behavior, Communication, psychometric test
• Tech

***
New Trend: The interviewer asks you to solve a problem within a given time and then asks questions about that solution.
Contact Details

Website: https://ashutoshtripathi.com
Website: https://enetwork.ai
LinkedIn: https://www.linkedin.com/in/ashutoshtripathi1/
Instagram: https://www.instagram.com/ashutosh_ai/
