
Welcome to DataVader's 7-step starter pack
“Data Science is the sexiest job of the 21st century.”

“Data is the new oil.”

“Be a part of the data revolution.”

You are probably here after hearing statements like these in the
media. Or you have seen one of your colleagues build a successful
career in data science.

Either way, you are in the right place. Welcome to the 7-step
starter pack. The pack will help you get up close and personal with
Data Science.

Data Science is a vast ocean, and making a successful career in it
is like finding a pearl in the sea. If left unguided, you might
never find it. This 7-step starter pack is your guide to finding
that pearl. It is about making a successful career in Data Science.

Making a career is different from finding a job. A career is a
long-term professional journey on which you work towards your
professional goals and ambitions.

I have also included a guide to making a successful career in Data
Science.
LIST OF CONTENT

STEP 1: What is Data Science?
Understand how Data Science is the confluence of Mathematics,
Programming, and business understanding.

STEP 2: Different roles in Data Science
Get to know the different roles in Data Science and choose the
right one for you.

STEP 3: Gain the skills required
Now that you know what role you want to work in, get a road map of
how to acquire the required skills.

STEP 4: Create your portfolio
One thing that gets your foot in the door is your portfolio. With
my definitive guide, learn how to make and maintain your portfolio.

STEP 5: Starting the job hunt
Once you are set to show your Data Science expertise, it is time to
start the job search. I have it covered for you!

STEP 6: Acing the interview
Now that you have a couple of replies to your job applications,
it's time to prepare yourself and crack the interview.

STEP 7: Advance your career
Getting a job is a subset of making a career. In step 7, learn how
to advance your career like a pro.
Chapter 1
What is Data Science?
Before you make a career in Data Science, you need to understand
what Data Science is.

Have you ever seen your mother predicting the monthly budget
with great accuracy?

Have you noticed how your father designs an investment portfolio
that fetches him handsome returns?

Do you remember daydreaming of becoming a millionaire by selling
your art and craft, and working out how many units you would need
to sell?

The common link between these examples is Data Science. Your mom,
your dad, and you are all acting as data scientists when you
predict and analyze the data at hand.

But is this enough to make you a Data Scientist?

Ummm, yes and no.

While these practices give you enough intuition to play with data,
you become a data scientist when you crunch and process data to
solve business problems, and when doing so is your job. Then you
are called a data scientist.
How do experts define Data Science?
The definition I have given above is very basic and just for your
understanding. Let us understand how experts describe Data
Science:

According to IBM,

Data science is a multidisciplinary approach to extracting
actionable insights from the large and ever-increasing volumes of
data collected and created by today’s organizations.

Data science encompasses preparing data for analysis and
processing, performing advanced data analysis, and presenting the
results to reveal patterns and enable stakeholders to draw
informed conclusions.

Let's break it down:

Multidisciplinary: It combines a number of disciplines like
Mathematics and its subfield Statistics, Computer Science, and
your subject matter. The subject matter will be your domain
expertise like finance, marketing, supply chain, healthcare, etc.
For example, Amazon makes your 1-day Prime delivery possible by
combining supply chain management and data analysis.

Actionable insights: Data Science helps businesses derive insights
that can be acted upon. The insights help to support business
decisions, increase revenue and sales, and improve other metrics
like employee and customer retention.

Large and ever-increasing volumes of data: In the example of your
mom and dad, they were working on a smaller dataset. As a data
scientist, you will be working on large volumes of data to derive
business value.

Analysis and processing: Real-world data is messy. You need to
process it to turn it into information that is of value. After
this comes analysis, which is about finding patterns and deriving
conclusions.

Enable stakeholders to draw informed conclusions: Stakeholders
are people who are interested in the company’s operations and
performance. All of them have different expectations of the
company. Data Science helps to satisfy the expectations of the
stakeholders.
Some Data Science Use Cases
Time to know some use cases of Data Science and how the brands
you know are using Data Science to gain an edge.

Facebook uses Data Science to perform social analytics and
recommend people you might want to be friends with online.

Amazon recommends the best products using historical data to
improve your shopping experience.

Spotify improves your music experience by recommending just the
right music with its recommendation system built on Data Science.

Netflix has Data Science systems in place that use your taste in
movies to suggest the next movie or TV series. This is what makes
Netflix so addictive!

Gmail classifies your inbox into sections such as Social,
Promotions, and Spam using Data Science.
Why do we hear Data Science everywhere these days?
Have advertisements been telling you that millions of jobs are
being created in Data Science and how Data Science could be your
dream career?

Has a similar question come to your mind? Why do I see so many ads
about a Data Science career?

Well, it is primarily because of three reasons:

Digital Footprint: Due to internet penetration, by a widely quoted
estimate 90% of the data available on earth was created in the
last 2-3 years! Data is being created at a rate that translates
into great demand for Data Science professionals.

Processing power: With advancements in microchips, our capacity to
process large amounts of data has increased manifold in recent
years. This demands more data professionals who can get insights
from data.

Libraries and frameworks: Recent developments in various libraries
in Python and R, along with no-code tools like Power BI, Tableau,
etc., have made processing and using data much easier.

So now you know why “Data Science is the sexiest job of the 21st
century!”
Is Data Science something new?
It is natural to think that Data Science is a new discipline
altogether. Everybody keeps repeating “Data Science is an up and
coming field!”

But Data Science has existed for centuries!

People have been analyzing data, visualizing it as bar graphs and
histograms, and performing optimization for ages. The only
difference is in terms of scale and tools.

Data Science today is more like combining various existing fields
with new technological advancements and carving out a niche.

Components of Data Science


Time to understand everything in detail! By now we understand that
Data Science is an interdisciplinary field requiring expertise in
three particular domains:

Mathematics and Statistics
Programming
Business Understanding

None of these skills is more important than the others or requires
more focus. All of them work together to provide a Data Science
solution, as visible from the Venn Diagram.

Let us outline each of these requirements in detail:

Programming: Data is like a gold mine waiting to be explored. As
with gold, it only becomes valuable when miners are able to take
out the precious metal. Similarly, miners of data need to have
hacking skills to be able to get insights from data.

The hacking skills do not per se require a computer science
degree, but you need to be someone who writes clean, concise, and
maintainable code. This requires a basic understanding of a
programming language and a hell of a lot of practice!

Data scientists are expected to write code for open-ended analysis
of data, whereas software engineers write code for predefined
outputs.

As a data scientist, you need to code your way to getting data from
the database and then clean, manipulate, visualize, and analyze it
before sharing insights from the data.
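To make that workflow concrete, here is a minimal sketch in Python
using only the standard library. The orders table, its columns, and
the sample rows are made up for illustration. A real project would
connect to your company's database and would probably lean on
libraries such as Pandas.

```python
import sqlite3
from statistics import mean

# Hypothetical in-memory database standing in for a company database.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (region TEXT, amount TEXT)")
conn.executemany(
    "INSERT INTO orders VALUES (?, ?)",
    [("North", "120.5"), ("north ", "80"), ("South", ""),
     ("South", "200"), ("East", "n/a")],
)

# 1. Get the data out of the database.
rows = conn.execute("SELECT region, amount FROM orders").fetchall()
conn.close()


def to_float(text):
    """Return the amount as a float, or None if it is not a number."""
    try:
        return float(text)
    except ValueError:
        return None


# 2. Clean: normalise region names and drop rows with unusable amounts.
clean = [
    (region.strip().title(), to_float(amount))
    for region, amount in rows
    if to_float(amount) is not None
]

# 3. Manipulate: group the amounts by region.
by_region = {}
for region, amount in clean:
    by_region.setdefault(region, []).append(amount)

# 4. Analyze: average order value per region.
summary = {region: mean(amounts) for region, amounts in by_region.items()}

# 5. Share: print a small report.
for region, avg in sorted(summary.items()):
    count = len(by_region[region])
    print(f"{region}: average order value {avg:.2f} from {count} orders")
```

Notice that most of the code is cleaning and reshaping. That is
quite typical of real-world data work.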
Now comes the million-dollar question:

R or Python?

Usually, when I have an introductory call with someone, the first
question that I am asked is: R or Python?

Both are amazing! Let me point out some differences:

R has its roots in Statistics. It is good for statistical analysis
and modeling, visualization, etc. It is mostly used in academia.

Python is primarily a general-purpose programming language. It
works better with large data, helps in machine learning, and
powers heavy algorithms. It is used a lot in the industry.

Other essential hacking skills:

SQL: Most databases use SQL to manipulate data within them or to
extract it.

Version control: You might have heard about GitHub. The awesome
technology behind it is Git, the most popular version control
system. It helps teams collaborate and keep a history of changes.
Trust me, learning Git is an essential skill that is often
downplayed in favor of Python or R.

The next question I get asked a lot is: are no-code tools like
Tableau, Power BI, QlikView, or AutoML enough to become a data
scientist?

While you might initially be okay with such tools, as soon as you
want more control over your data, you will have to learn how to
code. So better start now!

Mathematics/Statistics: Statistics is a subfield of mathematics,
and studying it is like learning the language of data. Similar to
how we understand humans through psychology, statistics helps us
understand data.

In Mathematics, you need to revisit or read up on concepts like:

Linear algebra
Matrices
Calculus

In statistics, the following topics are the prerequisites for data
science:

Descriptive statistics, distributions, hypothesis testing, and
regression
Bayesian thinking: conditional probability, priors, posteriors,
and maximum likelihood
Statistical Machine Learning
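As a small taste of Bayesian thinking, here is a hedged sketch of
Bayes' theorem in Python. The spam-filter numbers (prior,
likelihood, and false positive rate) are invented for illustration,
and the function name posterior is my own.

```python
def posterior(prior, likelihood, false_positive_rate):
    """P(hypothesis | evidence) using Bayes' theorem.

    prior:               P(hypothesis)
    likelihood:          P(evidence | hypothesis)
    false_positive_rate: P(evidence | not hypothesis)
    """
    evidence = likelihood * prior + false_positive_rate * (1 - prior)
    return likelihood * prior / evidence


# Hypothetical numbers: 20% of mail is spam (the prior). The word
# "offer" appears in 60% of spam and in 5% of legitimate mail.
p_spam_given_offer = posterior(
    prior=0.20, likelihood=0.60, false_positive_rate=0.05
)
print(f"P(spam | 'offer') = {p_spam_given_offer:.2f}")
```

Seeing the word moves our belief that the mail is spam from the
20% prior to a much higher posterior. Updating beliefs with
evidence is the core idea.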
On top of these concepts, you also need to learn how to implement
them in Python. This includes learning how to use Python libraries
like Pandas, NumPy, scikit-learn, etc.

Being proficient in these libraries, along with concepts like
vectorization, can take you very far in your Data Science journey.
You need to know the techniques, learn how to apply them with
Python/R, and also know how to optimize them.

For example, if your company comes to you with the business problem
of setting up a new plant at location A vs location B, you should
be able to recommend a location with 95% confidence! Read
statistics to know how.
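Here is one rough way such a comparison could look in Python. This
is a sketch under assumptions: the monthly cost figures are made
up, and I use a normal approximation for the confidence interval.
With samples this small, a t-distribution would be the more careful
choice.

```python
from math import sqrt
from statistics import NormalDist, mean, stdev

# Hypothetical monthly operating-cost estimates for each location.
location_a = [52, 49, 51, 50, 53, 48, 52, 50, 51, 49, 50, 52]
location_b = [55, 57, 54, 56, 58, 55, 56, 54, 57, 55, 56, 57]


def diff_confidence_interval(a, b, confidence=0.95):
    """Confidence interval for mean(b) - mean(a), normal approximation."""
    diff = mean(b) - mean(a)
    std_error = sqrt(stdev(a) ** 2 / len(a) + stdev(b) ** 2 / len(b))
    z = NormalDist().inv_cdf(0.5 + confidence / 2)  # about 1.96 for 95%
    return diff - z * std_error, diff + z * std_error


low, high = diff_confidence_interval(location_a, location_b)
print(f"B costs between {low:.2f} and {high:.2f} more than A (95% CI)")
if low > 0:
    print("The whole interval is above zero, so A looks cheaper.")
```

If the interval had included zero, the data would not let you
prefer either location at this confidence level.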

Business understanding: One of the most important skills in data
science is translating business challenges into data science
problems.

Let's say your manager comes to you and asks, why are our
employees leaving? The employee turnover rate at the company is
very high at 23%!

There is no package that finds that out automatically.

You need to relate the turnover rate to the employee's manager,
years of experience, salary, hike, promotion, job role, etc. to
really understand why they are leaving the company.
It all boils down to: you have a business problem and you have to
solve it with data.
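As a rough illustration of relating turnover to possible drivers,
the sketch below computes the turnover rate per manager and per
promotion status. The HR records and field names are hypothetical,
and a real analysis would look at many more factors and far more
employees.

```python
from collections import defaultdict

# Hypothetical HR records.
employees = [
    {"manager": "Asha", "promoted": True, "left": False},
    {"manager": "Asha", "promoted": False, "left": False},
    {"manager": "Asha", "promoted": True, "left": False},
    {"manager": "Ravi", "promoted": False, "left": True},
    {"manager": "Ravi", "promoted": False, "left": True},
    {"manager": "Ravi", "promoted": True, "left": False},
    {"manager": "Meera", "promoted": False, "left": True},
    {"manager": "Meera", "promoted": True, "left": False},
    {"manager": "Meera", "promoted": False, "left": False},
]


def turnover_by(records, field):
    """Share of employees who left, grouped by the given field."""
    left = defaultdict(int)
    total = defaultdict(int)
    for record in records:
        key = record[field]
        total[key] += 1
        left[key] += record["left"]  # True counts as 1, False as 0
    return {key: left[key] / total[key] for key in total}


by_manager = turnover_by(employees, "manager")
by_promotion = turnover_by(employees, "promoted")

for manager, rate in by_manager.items():
    print(f"Manager {manager}: {rate:.0%} turnover")
for promoted, rate in by_promotion.items():
    label = "promoted" if promoted else "not promoted"
    print(f"Employees {label}: {rate:.0%} turnover")
```

A breakdown like this does not prove a cause, but it tells you
which questions to ask next.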

It also means asking the right questions. For example, let's say
you are working in Netflix's marketing team and your company asks
you to bring in more customers.

You need to ask questions whose answers can help you increase your
subscriber base.

Where do most subscribers come from? Google ads or YouTube ads?
Does celebrity endorsement help? Does a discount or a free trial
help in increasing subscribers?

After the analysis, you also need to communicate your findings to
the stakeholders involved. As you become more senior, it will be
your job to evaluate which business problems can be solved with
data.

In the next chapter, we will understand different roles in Data
Science.

Summary
Data Science is the art of gaining insights from data.
You now know why you are seeing Data Science everywhere.
Data Science is not a new field; it has recently come into vogue
due to an increase in data generation, computation capacity, and
the development of packages and libraries in R/Python.
Data Science is the confluence of three major components:
Mathematics/Statistics, Programming, and Business Understanding.
Chapter 2
What are the different roles in Data Science?
In chapter 1, we understood that Data Science lies at the
confluence of three components:

Mathematics/Statistics
Programming
Business Understanding

The different roles discussed in this chapter are basically
combinations of those components. You can think of them as
offshoots of the umbrella title of "Data Scientist".

They are all quite different from each other, require more
emphasis on one of the core skills, and also differ in terms of
value provided to the company. I will be elaborating on Data
Analyst, Business Analyst, Machine Learning Engineer, Data
Engineer, and MLOps Engineer.

Rather than going by job titles, it is necessary that you read
and follow the job description meticulously. Job titles might vary
across companies.

That's why I am going to attach the job description of every role
and analyze it in detail.
Data Analyst
Broadly speaking, a Data Analyst is a person who takes the data,
analyzes it, generates insights from it, and then communicates
them to the right people.

Let's say your company has set a yearly sales target of 1,00,000
units. The sales and marketing team wants input from you on the
following things:

What are the weekly and monthly sales?
Are the targets set for each month being met?
Which regions have the most leads and sales?
Are promotional offers leading to more sales?
Can you help managers track the sales in real-time?

You can answer these questions by streamlining the flow of data in
the company, cleaning it, and preparing it. This would entail
getting the right data from the company, cleaning and transforming
it, and then creating meaningful and interactive dashboards. These
dashboards should update on their own with minimal errors and help
the decision-makers and stakeholders.
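Before any dashboard, the underlying calculation is simple
aggregation. Below is a minimal sketch in Python that answers two
of the questions above. The sales records are invented, and
splitting the yearly target evenly across months is my own
simplifying assumption.

```python
from collections import defaultdict
from datetime import date

YEARLY_TARGET = 100_000             # units, as in the example above
MONTHLY_TARGET = YEARLY_TARGET / 12  # assumption: an even monthly split

# Hypothetical sales records: (date, region, units sold)
sales = [
    (date(2024, 1, 5), "North", 3000),
    (date(2024, 1, 19), "South", 5500),
    (date(2024, 2, 2), "North", 4000),
    (date(2024, 2, 20), "West", 3000),
    (date(2024, 3, 8), "South", 6000),
    (date(2024, 3, 25), "North", 3500),
]

monthly = defaultdict(int)
by_region = defaultdict(int)
for day, region, units in sales:
    monthly[(day.year, day.month)] += units
    by_region[region] += units

# Are the targets set for each month being met?
for (year, month), units in sorted(monthly.items()):
    status = "met" if units >= MONTHLY_TARGET else "missed"
    print(f"{year}-{month:02d}: {units} units, target {status}")

# Which region has the most sales?
top_region = max(by_region, key=by_region.get)
print(f"Region with the most sales: {top_region}")
```

Weekly numbers work the same way if you group by
day.isocalendar() instead of by month.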

A data analyst's job doesn't involve a lot of machine learning, but
basic knowledge of algorithms like linear regression will be
helpful.
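If you have never seen linear regression in code, here is a small
sketch written from scratch so that nothing is hidden. The ad spend
and sales numbers are made up and chosen to lie exactly on a line,
which makes the result easy to check by hand. In practice you
would use a library such as scikit-learn.

```python
def fit_line(xs, ys):
    """Ordinary least squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    covariance = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    variance = sum((x - mean_x) ** 2 for x in xs)
    slope = covariance / variance
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Hypothetical data: monthly ad spend (in thousands) vs units sold.
ad_spend = [1, 2, 3, 4, 5]
units_sold = [120, 190, 260, 330, 400]

slope, intercept = fit_line(ad_spend, units_sold)
forecast = slope * 6 + intercept  # predicted units for an ad spend of 6

print(f"Each extra thousand in ads adds about {slope:.0f} units")
print(f"Forecast for an ad spend of 6: {forecast:.0f} units")
```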
Let us analyze the job description of an analyst posted by
McKinsey:

They expect you to know Excel and a visualization tool like
Tableau, Power BI, or QlikView. If you look at the next line, they
also want you to have some experience in relational databases like
MySQL, which help in collating, storing, and querying data.

An often overlooked aspect is the ability to communicate and share
insights with the stakeholders.

In a nutshell, for the role of a data analyst, you need to master:

Database knowledge: SQL and NoSQL
Knowledge of dashboards and visualization tools like Tableau and
Power BI
Foundational knowledge of Statistics and some basic algorithms
Good soft skills to communicate better
Business Analyst
Their job is very similar to that of a data analyst, but they
need less expertise in Statistics and Programming. They might use
Excel instead of R or Python for data handling and processing.

They will also not be expected to perform statistical modeling. In
simple words, the role is a simplified version of a Data Analyst.

If you are starting your career, you can orient yourself as a
business analyst or data analyst. These roles are in demand and
have a good number of openings. Also, the skills associated with
them are easier to acquire.

As a business analyst, you need to focus on business understanding,
as you will need to align strategies with business goals and
outcomes.

This job is a good entry point, but if you want to code more or do
machine learning, then this role might not be the right fit for
you. Also, it pays less than the role of a data analyst or machine
learning engineer.

But if you come from a non-engineering background, you can think
of getting into this role. Then gradually, you can become better at
programming, learn more machine learning, and slowly move to other,
more advanced roles.

Alongside Excel and tools like Tableau, you will also need to know
SQL and databases. In almost every data science role, database
knowledge, and especially SQL, is pretty much mandatory.

Let us understand more with a job posted by Amazon for a business
analyst role:

As you can see, the job requires you to create, support, and
monitor metrics to support the business. It is also about
delivering metrics reports. If you are unsure what a metric is, it
is a measurable value that shows progress towards a company's
business goals.
Machine Learning Engineer
The primary job of a machine learning engineer is to develop
machine learning models which run continuously.

We discussed a few use cases of Machine Learning in the first
chapter with examples of Facebook, Spotify, Netflix, Gmail, etc.
The ML engineers in these companies help to create models and
optimize them.

Let's take the example of Amazon. The ML Engineer at Amazon will
help to create a recommendation system and then keep monitoring
the model in production so that it stays maintainable over time.

Compared to the role of a data analyst, which requires you to
create dashboards and visualize data, as a machine learning
engineer you will be expected to code a lot. You are also supposed
to understand statistics and Machine Learning algorithms in depth.

Unlike a data analyst's, the output of your work will be consumed
by machines rather than by stakeholders and decision-makers.

Apart from the models which you create, you are expected to write
clean, maintainable, and comprehensive code. Your role will be
closer to that of a software developer.

Going back to the Amazon example, let's say you need to predict in
real time whether a customer on Amazon's website will buy the
product.

As an ML Engineer, you will:

Find historical data at Amazon
Train an ML model on that
Create an API (basically an interface between the website and your
model)
Deploy the model
Make sure that the model keeps running in the future
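The first three steps can be sketched in a few lines of plain
Python. Treat this as a toy under assumptions: the browsing history
is invented, the model is a tiny logistic regression that I train
by hand, and handle_request is a hypothetical stand-in for a real
web endpoint. Deployment and monitoring are not shown.

```python
import json
from math import exp

# 1. Historical data (hypothetical):
#    (minutes on product page, items in cart) -> bought (1) or not (0)
history = [
    ((0.5, 0), 0), ((1.0, 0), 0), ((1.5, 1), 0), ((2.0, 0), 0),
    ((4.0, 1), 1), ((5.0, 2), 1), ((6.0, 1), 1), ((7.0, 3), 1),
]


def sigmoid(z):
    return 1.0 / (1.0 + exp(-z))


def score(weights, bias, features):
    """Predicted probability of a purchase."""
    return sigmoid(sum(w * x for w, x in zip(weights, features)) + bias)


def train(data, epochs=2000, learning_rate=0.1):
    """2. Train a logistic regression with batch gradient descent."""
    weights, bias = [0.0, 0.0], 0.0
    for _ in range(epochs):
        grad_w, grad_b = [0.0, 0.0], 0.0
        for features, label in data:
            error = score(weights, bias, features) - label
            grad_w = [g + error * x for g, x in zip(grad_w, features)]
            grad_b += error
        weights = [
            w - learning_rate * g / len(data)
            for w, g in zip(weights, grad_w)
        ]
        bias -= learning_rate * grad_b / len(data)
    return weights, bias


MODEL = train(history)


def handle_request(body):
    """3. API layer: JSON in, JSON out, like a web endpoint would do."""
    payload = json.loads(body)
    features = (payload["minutes_on_page"], payload["items_in_cart"])
    weights, bias = MODEL
    probability = score(weights, bias, features)
    return json.dumps(
        {"will_buy": probability >= 0.5, "probability": round(probability, 3)}
    )


print(handle_request('{"minutes_on_page": 6.5, "items_in_cart": 2}'))
print(handle_request('{"minutes_on_page": 0.8, "items_in_cart": 0}'))
```

The website never sees the model itself. It only sends a request
and reads the answer, which is exactly what the API is for.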

Let us see the job description of an ML Engineer role at IBM, and
you will be able to confirm what I have talked about above.
Data Engineer
Functionally, a data engineer works on maintaining data in
databases.

This helps people across the company get the data they need.
Usually, data engineers are not expected to make dashboards or
visualize data, as data analysts or business analysts do. Their
main responsibility is to structure and format data.

A data engineer working at Spotify might be required to store all
the information about their users: the songs that they listen to,
how long they listen, the time of day they listen to a particular
song, the next song they listen to, etc.

This information needs to be stored in a structured form so that
the team members can access it conveniently. You might also be
involved in developing and monitoring the entire pipeline from
data collection to processing and storing it.
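To give you a feel for what "structured form" means, here is a
small sketch using SQLite from Python's standard library. The
table layout, the load_events helper, and the sample events are
all hypothetical. I have no knowledge of how Spotify actually
stores its data.

```python
import sqlite3

# A real pipeline would target a production database, not memory.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE users (user_id INTEGER PRIMARY KEY, name TEXT NOT NULL);
    CREATE TABLE songs (song_id INTEGER PRIMARY KEY, title TEXT NOT NULL);
    CREATE TABLE listens (
        listen_id      INTEGER PRIMARY KEY,
        user_id        INTEGER NOT NULL REFERENCES users(user_id),
        song_id        INTEGER NOT NULL REFERENCES songs(song_id),
        started_at     TEXT NOT NULL,      -- ISO 8601 timestamp
        seconds_played INTEGER NOT NULL
    );
""")


def load_events(connection, events):
    """Validate raw listening events and store the usable ones."""
    good = [event for event in events if event["seconds_played"] >= 0]
    connection.executemany(
        "INSERT INTO listens (user_id, song_id, started_at, seconds_played) "
        "VALUES (:user_id, :song_id, :started_at, :seconds_played)",
        good,
    )
    connection.commit()
    return len(good)


conn.execute("INSERT INTO users VALUES (1, 'Priya')")
conn.executemany(
    "INSERT INTO songs VALUES (?, ?)", [(10, "Song A"), (11, "Song B")]
)

raw_events = [
    {"user_id": 1, "song_id": 10,
     "started_at": "2024-03-01T08:15:00", "seconds_played": 180},
    {"user_id": 1, "song_id": 11,
     "started_at": "2024-03-01T08:18:05", "seconds_played": 95},
    {"user_id": 1, "song_id": 10,  # corrupt event, will be rejected
     "started_at": "2024-03-01T21:40:00", "seconds_played": -5},
]
stored = load_events(conn, raw_events)

# A teammate can now query the structured data conveniently.
totals = conn.execute(
    "SELECT s.title, SUM(l.seconds_played) FROM listens l "
    "JOIN songs s ON s.song_id = l.song_id "
    "GROUP BY s.title ORDER BY s.title"
).fetchall()
print(stored, totals)
```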

I feel like the role of Data Engineer is in a sweet spot right now.
The demand is high and the supply of people with data engineering
skills is low. If you enjoy coding and managing data, this role
could be a good fit, as it is also high paying.
Time to analyze the job description of a posting for a data
engineer role at Google.

MLOps Engineer
Professionals working in MLOps help in the deployment of Machine
Learning or Deep Learning models and help to maintain them. It
means acquiring skills like creating web endpoints using Flask or
FastAPI and working with cloud platforms like AWS or GCP.

This role is crucial in protecting the business from risks due to
models that drift over time or that are deployed but unmaintained
or unmonitored.
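Monitoring for drift can start very simply. The sketch below flags
a feature whose live average has moved far away from what the
model saw during training. The has_drifted function, the threshold
of 3 standard errors, and the basket values are my own illustrative
choices. Production systems use more robust checks.

```python
from math import sqrt
from statistics import mean, stdev


def has_drifted(training_values, live_values, threshold=3.0):
    """Flag drift when the live mean is more than `threshold` standard
    errors away from the training mean (a deliberately simple check)."""
    std_error = stdev(training_values) / sqrt(len(live_values))
    z_score = abs(mean(live_values) - mean(training_values)) / std_error
    return z_score > threshold


# Hypothetical feature: basket value seen by a deployed model.
training = [50, 52, 48, 51, 49, 50, 53, 47, 50, 51]
live_ok = [49, 51, 50, 52, 48]
live_shifted = [60, 62, 59, 61, 63]

print("Stable week drifted?", has_drifted(training, live_ok))
print("Shifted week drifted?", has_drifted(training, live_shifted))
```

A check like this could run on a schedule and alert the team, so
that a model is retrained before it quietly starts making bad
predictions.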

To conclude, I would like to say that a lot of content can still be
added to this chapter as the different data science roles become
more clearly defined.

I have tried to be as concise as possible so that you are neither
overwhelmed nor underinformed.
Chapter 3
How do I gain the skills required?
After you have understood the terms and definitions related to
data science and figured out the role for you, it is now time to
understand how to acquire the skills required to land your dream
career.

The tools and techniques required for each role are specific and
may differ, so I am going to talk in general about how to acquire
the skills. This chapter is more about the mediums and methods
through which you can gain the skills.

There are mainly three ways in which you can do that:

Get an advanced degree
Join a certification program
Upskill online

In this chapter, we will explore each of these options, with their
pros and cons discussed in detail.

No matter which one you go for, these are the must-haves that
you should be looking for:

Quality content
Real-world examples integrated into the content
In-depth analysis that encourages free thinking rather than just
following the steps
High-quality projects close to industry practices
Guidance in building portfolios and cracking interviews
Option 1: Get an advanced degree
Doing a master's degree or a Ph.D. is one of the options on the
cards. Doing a Ph.D. is a controversial topic, so I will focus on
doing a master's degree.

For a master's degree, I will also compare the options of doing it
in India and abroad.

Pros of a master's degree

Getting a master's degree is the obvious choice when you are
looking to switch job roles or your discipline altogether.

If you are from a mechanical engineering or commerce background,
your undergraduate degree does not say much about your data
science skills.

Getting a master's degree is a mix of a safe and an explorative way
of getting the skills and proving your credibility as a data
scientist. A lot of the job descriptions that you see from top
companies prefer candidates with a master's degree.

There are a lot of opinions on whether a degree is worth it or not.
Is it relevant or not? Well, it is subject to debate, and the
answer depends a lot on personal opinion.

Personally, I feel it is a good investment. It might require a loan
and some investment of time, but it has long-term benefits. If you
do it at a top university, it gets you easy access to a lot of
opportunities.

The best part about joining a master's program is the network that
you build. The alumni network that you gain access to will help you
grow in your data science journey. You will study with peers who
are going to teach you a lot about teamwork.

My college, where I did my bachelor's, is 100 years old. It has
provided me with a rich alumni network that I have approached
every time I have needed some help. They have been very kind and
welcoming.

The teachers will be like a guiding light on your journey and go
that extra mile to help you if you are in their good books. During
my applications, my teachers went the extra mile in writing letters
of recommendation (LORs) and guiding me through the process.

Master's: India or abroad?

When I thought of transitioning into Data Science, doing a master's
degree was high on my priority list because I had a rough patch
due to UPSC preparations. To make up for that and restart my
career, I felt that doing an MS would be a great choice.

Now, I was confronted with two options: study in India or go
abroad.

For doing a master's in India, I was required to qualify in the
GATE exam in Computer Science with a good rank. Another option for
me was to target Operations Research courses in India, which would
get me access to the Data Science world.

However, after investing 3 years in preparing for competitive
exams, I was not up for it. Rather, I wanted to invest that time in
actually studying Data Science. Therefore, despite it being the
economical option, I refrained from doing it in India.

Education abroad is better, with a lot of opportunities. However,
it is costly. Especially if you are going to a private university,
it will be an uphill task to manage your finances.

You can target public universities if finance is a matter of
concern. That being said, you will be able to pay off your loans
within 2-3 years. If you get a scholarship, then it is an even
better deal!

Cons of a master's degree

Let’s have a look at the downsides of doing a master's.

Well, it costs you money and a considerable time investment of 1-2
years. You will come out with a loan.

If you are not doing it at a good school, working a job in the
meantime would be a better choice. Some schools have an outdated
curriculum with little connection to the industry, which will do
you more harm than good.

So while choosing a master's degree, be very aware of:

Cost
Time investment
The school and its infrastructure
Faculty
Curriculum
Location (the closer to the job market, the better)
Option 2: Join a certification program
There are several boot camps and certification programs offered by
different upskilling platforms. Some come with a job assurance
program, and some come with a job guarantee program that asks you
to pay the fee only once you get a job.

Then there are some online micro-masters offered by various
companies, which you can consider.

Pros of certification courses

In terms of investment, they will cost you a lot less. They are
cheaper than master's programs and can be finished within 6-12
months. The job guarantee programs enable you to take the course,
get a job, and pay later, which is a good deal.

You can also do these alongside your job, which is not possible
with a full-time master's program.

Which program should I choose?

I get this question a lot when I discuss the options with people.
The person on the call throws random courses at me and asks me to
pick one.

I will let you be the judge. I have listed all the good qualities
that a course must have. You can add your personal preferences to
them and then decide.
Cons of certification courses

Although cheaper than a master’s program, they still require a
substantial investment of money from you.

You will not have peers to learn from and study along with, which
can be demotivating sometimes. On top of that, the courses are
self-paced, which makes them tough to schedule and complete.

You will not have an alumni network to rely on. The quality of the
content of these courses can be questionable, as they provide tons
of material and expect you to master it.

The assessments also do not motivate you enough to push your
limits.

Also, there is a lack of personal attention, as the instructors
work with a lot of candidates at the same time. It divides their
attention and priorities.

So, if you are a very self-motivated person, then this is a good
choice for you. It is also very economical and less
time-intensive.
Option 3: Online Courses
Data Science is a new field. Most professionals are self-taught,
and most of them have upskilled through online courses across
different platforms.

So, obviously, this is a go-to option for someone wanting to get
into Data Science.

Pros of Online Courses

I honestly feel that they are great resources. I also started my
career after taking some online courses. Their certificates used
to hold a lot of importance back in the day. They were the standard
for judging someone's seriousness about Data Science.

They are very economical!

Anyone can take these courses in the comfort of their home. As
they are self-paced, you study as per your interest and what you
feel like learning.

Distinguished professors from top universities come together to
design world-class courses. Industry experts with 10+ years of
experience impart their knowledge through these courses.

All the latest technologies and tools are taught on these platforms
at minimum cost to you.
Cons of Online Courses

Online courses are a very good starting point, no doubt. But there
are some things which need to be fixed.

First, they do not impart in-depth knowledge. I remember taking
some courses in the beginning, only to miss out on the complete
picture. Only when I started reading books could I make sense of
most of the things that were said in the videos.

On monthly subscription-based platforms, you feel like hurrying up
and completing the course instead of gaining deep insights. Then,
the assessment in these courses is a joke! Anyone can pass without
even attempting to code.

The lack of personalized guidance and mentorship is even more
acute here. There are no cohorts and no dedicated mentor who stays
with you at every moment.

The focus of online courses on syntax rather than problem-solving
is also problematic. Their capstone projects are usually
sub-standard, with no end-to-end approach.

In a nutshell, they are good for starting your learning journey but
not enough to advance your career.
How is DataVader trying to fix the issue?
I mentioned above that online courses need to be fixed. For that,
I have come up with DataVader, which solves a lot of pain points.

DataVader helps you build real-world Data Science projects with
1:1 mentorship.

Doing end-to-end projects is the key to converting business
challenges into Data Science problems. So give your recruiters
exactly what they are looking for with my personalized
project-based learning.

With DataVader, you get hands-on experience and a steep learning
curve in Data Analytics and Machine Learning. These projects are
designed by me and tailored for you, with my focused mentorship.

Projects have all the content that you will need. Even if you don’t
know the “D” of Data Science, you are most welcome. Everything
will cover the basics.

I am also launching an online community that you can join to
network and learn. It is exclusive and closely knit.

With that, I would like to conclude this chapter. I hope you are in
a better place to decide which route you would like to take for
yourself. If you need any help, I am here.
Chapter 4
How Do I Build My Portfolio?
If you have reached this page, you are halfway done,
congratulations!

This chapter is your guiding light if you are looking to transition
into Data Science or to grow your career in Data Science. It is a
guide to building a portfolio.

When I was starting my career in Data Science, I used to hear
"build a portfolio, do projects" all the time, but the advice
alone did not work. I had to explore. I am writing down the things
I have learned over the years.

So you are pursuing a master's, doing a certification program, or
have recently finished some online courses. You are searching for
jobs, but you are seen as an inexperienced data professional.

This is a paradox. You need a job to get experience, and you need
experience to get a job.

How do you break the loop then?

By building a portfolio!
What is a portfolio?
In most simple terms, a portfolio is a collection of your works that
showcase your expertise in a particular domain.

It is your chance to impress your recruiter with the skills you


possess and make them believe that you know what you are
getting into.

It would be a one-stop destination where someone would get all


information about you as a data professional. It is an extended
online resume.

In certain professions, like graphic design and architecture, it is
more or less mandatory to have a portfolio. Last month, when I was
hiring a graphic designer, the first thing I looked for was their
portfolio. I wanted to know what they were good at.

I had to hire two graphic designers, one for website UI and
another for logo and graphics, depending on their expertise. I did
this by looking at their portfolio alone!

This might happen to you as well. If you have a lot of projects on
NLP in your portfolio, chances are that startups or companies
requiring someone in NLP might contact you.
Your portfolio can be made out of various components:

GitHub profile
Kaggle competitions/notebooks
Blogs on Medium, Substack or WordPress
Personal website

The format in which you deliver your expertise can be different on
different platforms.

Your GitHub profile will be code-heavy, with a README file
and a few dropdowns here and there.

Your blog will have in-depth explanations of topics with code.
The posts have to be detailed, with the technical material logically
structured. I can promise you that you will learn the most when
you write a blog. You will be giving it out to the world, so you will
be attentive to the details.

Kaggle is one of the best places on the internet to practice data
science. Your consistency there will help grab the recruiter's
attention.

Then comes your personal website, which has everything in one
place: all the projects that you have coded, the blogs that you have
written, and your Kaggle titles.

You can actually create a personal website out of a GitHub profile,
so there is no need to spend a lot on building a website. If you are
a web design fan like me, please create one (and share the template
with me :p).
The intention is the same: To keep everything in one place.

Now comes the disclaimer!

It is absolutely not mandatory to create 100+ projects on GitHub
to get hired. Nor will doing so make you the best data scientist
out there.

This is far from true!

A lot of awesome data scientists that I know don't have
repositories on GitHub. So, you know that it is not mandatory.

But I will give you some reasons to work on your portfolio:

It will help you break into data science as a fresher
It will make you stand out as a data scientist
It enhances your understanding of topics
It motivates you to pursue some passion projects which could
actually be lifesavers.

If you are in a job and don't want to continue doing data science in
your free time, that's okay. That is your day job already :)
What is a project?
A project is basically a sequence of tasks done to accomplish a
certain outcome. Data Science projects are similar.

They use different tools and techniques to get results and solve a
problem statement. With data science projects, you can
demonstrate:

Problem-solving skills
Proficiency with certain data science tools
Mathematical and statistical understanding
Ability to communicate

This gives you even more reasons to do a project.

After you have done a project, you can upload the code to GitHub
with an elaborate README file. Create a blog post around your
solution, the challenges you faced, and how you overcame them.

One step further, you can even create a YouTube video to explain
it all. As we know, the best way to learn something is to teach it,
so teach, and reach out to people.

You can also do a fun project like a meme-generating bot and
share it with the world. Learning Data Science while having fun:
double treat. Are you up for it?
How to proceed with a
project?
While doing a Data Science project, you will encounter these steps:

Data Collection

You can collect data from various sources. You can run your own
survey, download some publicly available datasets, or you can
scrape data from the web. Also, there are APIs.

Scraping raises some legal and ethical concerns, so before you do
that, be mindful and read all the details.

Data Cleaning

This is a very important step in each and every data science
project, including the ones that you will do after coming into a job.

Sorting, segregating, making data consistent, and imputing missing
values are some of the things which you will do in this step.
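
As a rough illustration of the cleaning tasks above, here is a minimal sketch in plain Python. The records, the field names, and the choice of the median as the fill value are all assumptions made for this example, not a prescribed method.

```python
from statistics import median

# Hypothetical raw records: inconsistent city spelling and a missing age.
raw_rows = [
    {"city": " Delhi", "age": 31},
    {"city": "delhi ", "age": None},
    {"city": "Mumbai", "age": 25},
    {"city": "MUMBAI", "age": 40},
]


def clean_rows(rows):
    """Make text fields consistent, impute missing ages, and sort the result."""
    known_ages = [row["age"] for row in rows if row["age"] is not None]
    fill_value = median(known_ages)  # assumption: median is a reasonable fill

    cleaned = []
    for row in rows:
        cleaned.append({
            # Strip stray spaces and normalize capitalization.
            "city": row["city"].strip().title(),
            # Replace a missing age with the median of the known ages.
            "age": row["age"] if row["age"] is not None else fill_value,
        })
    return sorted(cleaned, key=lambda row: (row["city"], row["age"]))


cleaned_rows = clean_rows(raw_rows)
```

In a real project you would likely do the same with a library such as pandas, but the steps stay the same: normalize, impute, sort.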

Feature Engineering

This step is about finding feature importance, feature selection,
and merging features. Trust me, if you do this step correctly, half of
the issues during modeling or analysis are resolved. Again, this
is an important yet often ignored step, just like data cleaning.
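
To make these three ideas concrete, here is a small sketch in plain Python. The housing numbers are made up, and using absolute Pearson correlation as an "importance" score with a 0.5 cut-off is only one simple assumption among many possible approaches.

```python
from math import sqrt

# Hypothetical housing data; every value here is made up for illustration.
length_m = [10.0, 12.0, 8.0, 15.0]
width_m = [5.0, 6.0, 5.0, 8.0]
door_color_code = [1.0, 3.0, 2.0, 1.0]
price = [52.0, 75.0, 41.0, 123.0]


def pearson(xs, ys):
    """Pearson correlation between two equal-length numeric lists."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


# Merging features: combine length and width into a single area feature.
area_m2 = [length * width for length, width in zip(length_m, width_m)]

# A crude importance score: absolute correlation with the target.
features = {"area_m2": area_m2, "door_color_code": door_color_code}
scores = {name: abs(pearson(values, price)) for name, values in features.items()}

# Feature selection: keep features above an assumed threshold of 0.5.
selected = [name for name, score in scores.items() if score > 0.5]
```

Here the merged area feature tracks the price closely while the door color does not, so only the area survives the selection.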
Data Analysis

This is the step where you generate insights from the data and use
the data to derive actions that can be taken to improve metrics.
I have already discussed this in earlier chapters.

Modeling

If you want to stop at analysis, you could end your project by
creating a dashboard. However, if you are looking forward
to performing Machine Learning or Deep Learning, then this is the
step for you.

Deployment

It is essential to serve your model. You can do this by creating
web endpoints and then deploying them on a cloud service.
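
A web endpoint can be sketched with nothing but the Python standard library. In this sketch the `predict` function is a stand-in for a trained model, and the `/predict` route, the feature names `x1` and `x2`, and port 8000 are all hypothetical choices for illustration; a real deployment would usually sit behind a proper web framework.

```python
import json
from http.server import BaseHTTPRequestHandler, HTTPServer


def predict(features):
    """Stand-in for a trained model: a hypothetical linear scoring rule."""
    return 2.0 * features["x1"] + 0.5 * features["x2"]


def handle_predict(body: bytes) -> bytes:
    """Turn a JSON request body into a JSON prediction response."""
    features = json.loads(body)
    return json.dumps({"prediction": predict(features)}).encode("utf-8")


class PredictHandler(BaseHTTPRequestHandler):
    def do_POST(self):
        # Only the hypothetical /predict route is supported.
        if self.path != "/predict":
            self.send_error(404)
            return
        length = int(self.headers.get("Content-Length", 0))
        payload = handle_predict(self.rfile.read(length))
        self.send_response(200)
        self.send_header("Content-Type", "application/json")
        self.send_header("Content-Length", str(len(payload)))
        self.end_headers()
        self.wfile.write(payload)


def serve(port: int = 8000):
    """Start the endpoint; call this manually when you want to try it."""
    HTTPServer(("0.0.0.0", port), PredictHandler).serve_forever()
```

Calling `serve()` starts the server, after which you can POST a JSON body such as `{"x1": 1.0, "x2": 4.0}` to `/predict`. Keeping `handle_predict` separate from the handler class makes the logic easy to test without a running server.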

That's it! You are done.

Now go and document your work as a blog and show the world!!

You can find a lot of tutorials and datasets on Kaggle, Medium,
and other blogs. YouTube is a great source. If you need
personalized learning with mentorship, DataVader is here for you!
How to choose a project?
It is up to you to choose a project. But, make sure that your
projects are able to demonstrate these things:

Your competency with mathematical and statistical concepts
Your familiarity with tools required
Your ability to solve complex problems. It could either be a
novel problem that involves saving the world or some industry-
grade problem.
Your domain expertise. Let's say you are analyzing stocks then
you should be able to decipher the P/E ratio, NAV, Dividend
payout ratio, etc.
Your ability to live with the project. Keep updating and
integrating best practices, as it happens in companies.
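
For the stock-analysis example in the list above, the ratios reduce to simple arithmetic. The sketch below uses the standard textbook definitions; the function names and all company figures are hypothetical.

```python
def pe_ratio(share_price, earnings_per_share):
    """Price-to-earnings ratio: price paid per unit of annual earnings."""
    return share_price / earnings_per_share


def nav_per_unit(total_assets, total_liabilities, units_outstanding):
    """Net asset value per unit: net assets divided by units outstanding."""
    return (total_assets - total_liabilities) / units_outstanding


def dividend_payout_ratio(dividends_paid, net_income):
    """Share of net income that is returned to shareholders as dividends."""
    return dividends_paid / net_income


# Hypothetical figures, chosen only to keep the arithmetic easy to follow.
pe = pe_ratio(share_price=150.0, earnings_per_share=6.0)
nav = nav_per_unit(total_assets=1000.0, total_liabilities=200.0,
                   units_outstanding=40.0)
payout = dividend_payout_ratio(dividends_paid=20.0, net_income=80.0)
```

Being able to explain what each number means for an investor matters more than the calculation itself.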

Now, there are a couple more things which I would like to discuss
before we end this chapter.

Explore as much as you can. Find a domain that interests you and
start doing projects. Don't focus only on state-of-the-art models
and the highest accuracy; also focus on documenting your work
and exploring the data properly.

Doing projects on the Titanic dataset or the Iris flower dataset will
not add much value, as it does not help you distinguish yourself
from others. Pick an interesting dataset.
How can DataVader help
you build your portfolio?
As we discussed, the portfolio is a very important aspect of being a
data scientist.

At DataVader, I help you create real-world projects with real-world
data. We train you in the entire end-to-end process, including:

Data collection
Data cleaning
Data analysis
Creating dashboards
Modeling
Deployment

These come with my 1:1 mentorship. I work closely with all the
participants and have a small group.

I encourage you to solve business problems rather than follow
along with code, as is usually done in online courses.

With DataVader, you get hands-on experience and a steep learning
curve in Data Analytics and Machine Learning.

With that, I would like to conclude this chapter. I hope you have
gotten an idea of how to build your portfolio and land your dream
job. If you need any help, I am here.
Chapter 5
How Do I Start The
Job Hunt?
So you have learned the basic skills and decided which role you
would like to focus on. You have done a couple of projects and posted
a few blogs. The next step is to start your job hunt.

The chapter is divided into the following subsections:

Drafting your resume
Optimizing your LinkedIn profile
Reading the job description
Using portals to your advantage
Company websites and cover letters

These points will be elaborated one by one to get a deeper
understanding.

Before you start with the chapter, I just want to let you know that a
job search will not be easy initially. Gradually, you will build up
your network and land where you want to be.

Remember, the process is not only about helping ourselves but
also taking care of others. If you are going to be nudgy and pushy,
it will push your potential recruiters away. Spend 80% of your time
upskilling and 20% on the job search. Not the other way around.

You have to trust in your capabilities and the goodwill of others.

With that, let's get started!


Drafting your resume
Your resume is a very important document for your job search.

It is like a passport that gets you entry into the place of
opportunities in the job market.

Having a very well-drafted resume can do wonders. Think of
yourself as a recruiter and come up with bullet points describing
how you'd want the ideal candidate's resume to look.

I will tell you what I have understood so far:

Keep the resume to one page; make it two pages only if you have
5+ years of relevant work experience
Keep the format simple, do not include flashy colors or images
Have a bold headline as your name which makes it easier to
remember your name
Keep your sections sorted and easy to find. Take care of the
alignment and font size.
Mention everything chronologically
When you mention your experience, try to follow this
approach:
Start with the problem statement
Write about how you solved it
Mention the impact you were able to make, for example,
an increase in sales/revenue or a reduction in the customer
churn. Quantify it.
Lastly, you may omit hobbies and unrelated information from
the resume.
A good resume example
Optimizing your LinkedIn
profile
LinkedIn is where you will find a number of professionals hanging
out. Hiring managers, CEOs, and CTOs of a majority of companies
are there. Getting noticed by them can get you easy access.

You need a professional photo and a good, impressive headline.
Headlines like "Open to opportunities" or "Aspiring data scientist"
might not go a long way. Have a confident headline and activate the
"Open to work" feature. This is the best way to let recruiters know
that you are open to opportunities.

Write your "About" in the first person, and explain your headline in
detail here.

Add relevant skills and take some skill assessments on LinkedIn.
Mention your projects and certifications. Link your GitHub and
Kaggle profiles in the featured section so that they can be
accessed. Also, attach your updated resume there.

Be polite, be kind.

The biggest mistake that people make is to send random
connection requests to anyone on the platform with a generic
message of "Can you get me a referral?"
This is not how referrals work: to refer you, someone has to take
personal responsibility for you, and they can't risk it for someone
randomly sending messages.

I have forwarded some resumes of my students to some friends
working in good companies. But this has happened only when I
was 100% sure of their abilities. Trust matters.

Next, it is very irritating when someone texts you without
even knowing where you work or what you do and then drops a
generic text of "Can you get me a referral?"

Sometimes, I get messages and connection requests starting
with "Hello Sir". This puts off the person concerned. Know who
you are writing to before you text them.

Try to get noticed by them first. Engage in conversations, comment
thoughtfully on posts. Posting "Interested" everywhere does not
count here. Engage meaningfully.

Increase your visibility on LinkedIn and post your progress there.
Someone might notice and take a chance on you.

Follow relevant hashtags like "data science jobs", "data science
internship" so that you can stay updated whenever there is a new
opening.
Reading the job description
When you come across a job posting, glance through the job
description and look at the requirements.

As I mentioned in an earlier chapter, different companies might have
different titles for the same kind of job opening. So it is necessary
that you go through the job descriptions before you apply.

You should understand the requirements properly through the job
description. If you get shortlisted by the company, go back to
the job description and try to prepare for some of the requirements
before the interview.

You can tailor your interview answers to the requirements they have
listed, which will help you gain an edge over other applicants.

This exercise will also help you advance your skills by pointing
out the gaps in your interview preparation. You can use these
gaps as guidelines to further upskill yourself.

Now comes the tricky part. Reading job descriptions, you might
get overwhelmed at times. Companies expect too much from
applicants and list a number of skills that is almost
impossible for a fresher or moderately experienced professional
to have.

In that case, you need to focus on your strengths. Some HR staff do
not know the requirements in detail and write whatever they find
online. Be patient and focused!
Using portals to your
advantage
There are several online portals to search for jobs, like
Monster, Naukri, Indeed, Timessearch, Cutshort, etc.

But I feel the lowest-input, highest-return platform is Naukri.com!

If you are consistent on the platform and keep updating your profile,
you will get a lot of recruiter attention. Personally, I updated my
resume in January and I still get recruiter InMails. Combined with
LinkedIn, Naukri has been my go-to platform for job search.

Ways to optimize your Naukri profile:

Constantly update your Naukri profile, about once every two
days, even if it is only some minor words here and there
Add relevant keywords to your profile through which recruiters
can find you.
Fill out the sections in the profile in detail
Always have an updated resume there
Reply to all the inbox messages you get from recruiters
Do not apply randomly. As everywhere else, spend 80% of your time
on preparation and 20% on applying and following up.
Lastly, be consistent and patient
Company websites and
cover letters
Another good resource for you to find jobs is company websites.
You can spot some companies of your interest and go through
their job openings. If there is a suitable opening, great, apply
there.

Now, I would not call them a high-ROI resource, but you can still
apply. Some of the companies might ask you to attach a cover
letter with your resume.

While this might seem like a draining and futile exercise, it gets
your foot in the door if written properly. To write a cover
letter, you can follow the I, YOU, WE approach.

I: Write about yourself. Introduce yourself, the field you work in,
and your experience. Be very concise and to the point.
You: You can write a bit about the role and requirements as
mentioned in the job description. This will give you a space to
elaborate on how you can be a good fit for the role.
We: In this paragraph, you can be creative and write how you
will fit in the role and how you and the company can progress
towards achieving the vision of the company

Again, although it might seem like a long and tedious process, it is
vital to take care in writing a good cover letter.
These are my thoughts and opinions on starting your job hunt and
the ways in which you can ace it.

Remember, it might not be easy but it will be worth it.

When applying to companies, make sure that you are the right fit
for the job and the job is right for you.

Getting a job is not the end goal; you need to be aware of the
company's work culture and environment before joining. You also
need to check whether you will get opportunities to grow in the
company.

In the next chapter, you will get a comprehensive guide on how to
approach interviews, prepare for them, and ace them.

Let me know if you need more information or if I can help you
with something.
Chapter 6
How To Ace The
Interview
If you have followed this guide meticulously so far, you will
have received some interview calls. This chapter will help you ace
the interview whenever you get the chance.

I read a motivational post years ago which said that instead of
spending time bemoaning the lack of opportunities, you should
start preparing yourself for the right opportunity. Even if you get
only one, you should be able to seize it.

The same goes for the Data Science interview. As I said in the last
chapter, spend 80% of your time upskilling and 20% applying for
jobs. The 80% should go into becoming so good that when the other
20% fetches you an opportunity, you grab it!

This means that you should be prepared for the interviews inside
and out.

I find interviews very fascinating. Within a short span of
time, you have to convince the recruiter that:

You have a relevant skillset
You have business understanding
You have good communication skills
You can work in teams

This might seem daunting, but you can ease it with practice! I am
going to include a couple of my personal experiences with
interviews.
Know what recruiters want
In the last section, I gave you some pointers on which a recruiter
might assess you in the time you spend with them.

Let's break down the expectations one by one to understand the
requirements in depth. You may feel like you have prepared enough,
but if your preparation is not aligned with what the recruiter wants,
your efforts will not result in a tangible outcome.

So, know before you go.

Relevant skillset
There are two words to focus on here:
1. Relevant
2. Skillset

Skillset can be technical and non-technical, but in this section, we
will focus on technical skills.

So, the recruiter is interested in your data skills and problem-solving
skills. You can demonstrate these in coding rounds, which might
be in two formats:

1. Take-home assignments
2. Live assessment
The other word to focus on is "relevant". It means that the skillset
which you possess should align with what the recruiter wants.

I will tell you something from my personal experience. In January, I
interviewed with a startup in the agritech domain. We had a
discussion over a number of things and everything seemed fine.

Until they wanted someone with knowledge of the Firebase
database. Everything other than this went smoothly.

So, despite experience with other databases, I was ultimately not
selected for the job because I did not have a "relevant" skillset.

This does not mean that you have to go out and acquire each and
every type of skillset available. It means that you have to stay
focused on openings that demand the skillset you possess.

Business understanding

We discussed this at length in the first chapter. The company
you are applying to will be working in a particular domain or
industry. Knowing the basics will get you far.

I interviewed for a biotechnology firm that wanted the candidates
to have an understanding of genomics. That's why, while building
your portfolio, you should try to learn the basics of the domain.
Good communication skills
This is one of the most underrated skills, and I believe it can take
you far. Organizations want people with good communication
skills who can explain actionable insights to stakeholders with
absolute clarity and simplicity.

This skill can be enhanced manifold by writing blogs and posting on
social media, as you will get instant feedback.

You should focus on Data Storytelling. I recently came across an
article by Harvard Business Review which stated that the final
implementation step of Data Science projects is hampered by a
lack of communication.

Managers have high expectations of data teams, and when data
teams come back with solutions, those solutions are very complex,
which creates a wide gap between them.

That is why communicating well is also one of the top priorities of
the recruiters.

Teamwork

This usually forms a part of HR rounds, where they want to assess
your teamwork skills so that they can ensure cohesion in
teams and get the best outcomes.
The interview process
There is no set pattern as such; different companies prefer doing it
differently. Some of the elements might be frequent, some not.

I am going to list all of them with some details as to what these
steps mean:

Aptitude test: For some entry-level positions, some
companies run an aptitude test that covers reasoning,
quantitative aptitude, and English comprehension skills.

You can prepare for these using a lot of online sources.

Take-home assignment: Your problem-solving skills will be
assessed through a take-home assignment, which can be in either
Python or SQL.

You can practice these on HackerRank; I believe they have the
best set of questions. As far as Python coding is concerned,
solve HackerRank problems and try to finish basic-level Data
Structures and Algorithms (DSA), which will help you solve a lot of
problems. DSA is not mandatory, but if you can study it, well and
good.

Case study: This section is to check your data skills. You will be
provided with a dataset and will be asked to manipulate data,
visualize it, or create a machine learning model, depending on the
job you have applied for.
You have to be really good with your data manipulation skills using
different Python libraries like pandas, matplotlib, scikit-learn, etc.
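
A typical case-study task is a group-and-aggregate question. The sketch below answers "which region brings in the most revenue?" using only the standard library so that it runs without installing anything; in a real case study you would more likely reach for pandas. The dataset and column names are hypothetical.

```python
import csv
import io
from collections import defaultdict

# Hypothetical case-study data, inlined so the example is self-contained.
raw_csv = """region,units,unit_price
North,10,2.5
South,4,3.0
North,6,2.5
"""

# Sum revenue (units * unit_price) per region.
revenue_by_region = defaultdict(float)
for row in csv.DictReader(io.StringIO(raw_csv)):
    revenue_by_region[row["region"]] += int(row["units"]) * float(row["unit_price"])

# Rank regions by revenue, highest first.
ranking = sorted(revenue_by_region.items(), key=lambda item: item[1], reverse=True)
```

Whatever tool you use, be ready to explain each step out loud, because the interviewer is assessing your reasoning as much as the result.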

Interview: Different companies can have different stages of
interviews. In the technical interviews, you can be asked
questions related to theory. You will also be asked questions
related to projects you have done or worked on in your job.

This will also help the recruiters to assess your problem-solving
skills. Try to structure your projects in this order:

Problem statement
Your approach
Implementation
Tools used and why
The impact
There can be more interview rounds, depending on the company
and its management.

One of the best tips that I received for interview preparation is to
have answers ready.

You can do that by looking at the most frequent questions asked
in Data Science interviews and preparing answers for them. Later
on, as you start giving more interviews, you will have your own
interview question bank. You can use this to prepare answers.

The prepared answers will give you confidence and reduce the
randomness of the process.

This guide could go on, but I will pause here.
Chapter 7
Advance Your
Career
Once you get the job, now comes the final step: advancing your
career.

The first thing you might want to do after you join the job is to
meet the manager and set the expectations right.

You may feel that you should already know the job expectations
from the job posting and interview process. Although this is
sometimes true, a lot can change between the interview process
and the start of the job.

The interviewers may not be in the same time frame as you, or the
organization may have changed before you joined. By talking to
your manager as early as possible, you’ll get the most up-to-date
information and have time to spend discussing it.

Ideally, your manager has a vision of what you’ll be doing but is
open to your priorities and strengths. Together, you want to define
what success means in your job.

Generally, your success is tied to making your team and/or
manager successful; if the members of the data science team
aren’t all working broadly toward the same objective, it can be
difficult to support one another.
To define your own success, you need to understand what
problems the team is trying to solve and how performance is
evaluated.

Will you be helping to generate more revenue by working on
experiments to increase conversion, or will you be making a
machine learning model to help customer service agents predict a
customer’s concerns, with the goal of decreasing the average time
spent per request?

You can’t know when you start a new job what the expectations
are in terms of job responsibilities.

Some companies value teamwork; you may be expected to work
on several projects at the same time but drop your work at a
moment’s notice to help a colleague.

Other companies ask that you have deliverables on a regular
basis, and it’s OK to ignore emails or Slack messages to finish your
project.

The way to find out whether you’re meeting expectations is to
have regular meetings with your direct supervisor.

So walk the talk as soon as you join!


Know the data
You do need to learn about the data science part as well, of
course.

If your company has been doing data science for a while, a great
place to start is by reading reports that employees have written.

Reports will tell you not only what types of data your company
keeps (and give you key insights), but also the tone and style of
how you should communicate your results.

Much of a data scientist’s job is conveying information to
nontechnical peers, and by reading reports, you’ll have a sense of
just how nontechnical those peers are.

See how simplified or complex the writers make certain concepts,
and you’ll be less likely to over- or underexplain when it comes
time to write your own reports.

Then you’ll need to learn where the data lives and get access to it.

Getting this access includes knowing what table contains the data
you want and maybe also what data system has it. Perhaps the
most frequently accessed data lives in a SQL database, but the
event data from two years ago lives in HDFS (Hadoop Distributed
File System), which you need to use another language to access.
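
Finding out which table holds the data you want can be sketched as follows. This uses an in-memory SQLite database as a stand-in for a company database; the `orders` table and its contents are hypothetical, and the catalog queries shown here are SQLite-specific, so other database systems will use different commands.

```python
import sqlite3

# In-memory database standing in for a company database (hypothetical schema).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (order_id INTEGER, customer TEXT, amount REAL)")
conn.executemany(
    "INSERT INTO orders VALUES (?, ?, ?)",
    [(1, "asha", 120.0), (2, "ravi", 80.0), (3, "asha", 40.0)],
)

# Step 1: discover which tables exist.
tables = [row[0] for row in conn.execute(
    "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name"
)]

# Step 2: inspect the columns of the table you care about.
columns = [row[1] for row in conn.execute("PRAGMA table_info(orders)")]

# Step 3: pull a small aggregate to sanity-check the data.
totals = conn.execute(
    "SELECT customer, SUM(amount) FROM orders GROUP BY customer ORDER BY customer"
).fetchall()
conn.close()
```

Running small exploratory queries like these in your first weeks is a quick way to build a mental map of where the data lives.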
Making the job change
You can think of applying to different companies once you have
spent a bit of time in your first company.

If you have had some complications in your first job, like a toxic
work environment, unsupportive managers, or anything else, you
can think of shifting jobs before that as well.

While choosing jobs, don't make salary the only criterion. Think
of fit: whether you are a fit for the role or not.

With this, we come to the end of our eBook. I hope this eBook was
helpful. If you'd like to know more or join DataVader, you can
visit datavader.io or mail me at datavaderio@gmail.com.

Wish you a growing and fulfilling career in Data Science.
