
12/12/2018 What tools should a data scientist know? - Quora

What tools should a data scientist know?

Sudalai Rajkumar S, Lead Data Scientist


Answered Nov 15, 2017

From a tools perspective, a data scientist (DS) should know the following:

Data Fetching Tools: In most cases, the data is captured and stored in databases, and the DS should be able to pull the data to do her analysis. So knowledge of SQL is needed.

Data Analysis Tools: Once the data is fetched, the DS should be able to do all the necessary preprocessing / cleaning on the data and then carry out her analysis / modeling. So knowledge of at least one data analysis tool, such as Python or R, is a must.

Data Visualization Tools: This is often overlooked, but it is as important as the previous two. Once the analysis / modeling is done and the results are ready, they need to be communicated effectively to the business stakeholders. I have seen plots / visualizations be more effective at conveying results than words / writing. So knowledge of visualization tools like Tableau and Qlik, or of plotting libraries like ggplot and matplotlib, is also important.
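
As an illustration of how the three categories above fit together, here is a minimal Python sketch that uses only the standard library. The `orders` table, its columns, and the sample rows are hypothetical, and an in-memory SQLite database stands in for whatever database holds the real data. The last step prints a plain-text summary where a plotting library such as matplotlib would normally be used.

```python
import sqlite3
from statistics import mean

# Hypothetical in-memory table standing in for a production database.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (region TEXT, amount REAL)")
conn.executemany(
    "INSERT INTO orders VALUES (?, ?)",
    [("north", 120.0), ("south", 80.0), ("north", 60.0), ("south", None)],
)

# Fetch: pull only the rows needed, dropping missing amounts in SQL.
rows = conn.execute(
    "SELECT region, amount FROM orders WHERE amount IS NOT NULL"
).fetchall()
conn.close()

# Analyze: group amounts by region and compute the mean in Python.
by_region = {}
for region, amount in rows:
    by_region.setdefault(region, []).append(amount)
summary = {region: mean(values) for region, values in by_region.items()}

# Present: a plain-text table; a chart would replace this step in practice.
for region, avg in sorted(summary.items()):
    print(f"{region:<6} {avg:>8.2f}")
```

In a real project the same three steps would usually be done with a database driver, a data-frame library, and a plotting tool, but the division of labor is the same.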


In addition to these, some other good-to-have tools are:

If the DS has to deal with huge datasets, then tools like Hadoop and Spark will be very useful.

If the DS is doing analysis / modeling on unstructured data like images / text / voice, then knowledge of deep learning tools like TensorFlow and Torch will be useful.


Also, there will be times when the DS has to fetch data from the web or from APIs, so knowledge of some basic programming / scripting will be a plus.
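
To give a sense of the kind of scripting this involves, here is a small Python sketch that parses a JSON response into rows ready for analysis. The payload, its field names (`results`, `city`, `temp_c`), and the helper `extract_records` are all hypothetical; a real script would obtain the response body from an HTTP call (for example with `urllib.request.urlopen`) instead of a hard-coded string.

```python
import json

# Hypothetical API response body, hard-coded so the example runs offline.
payload = (
    '{"results": ['
    '{"city": "Pune", "temp_c": 31.5}, '
    '{"city": "Bengaluru", "temp_c": 27.0}'
    "]}"
)


def extract_records(body):
    """Parse a JSON body and flatten it into (city, temp_c) tuples."""
    data = json.loads(body)
    # Fall back to an empty list if the response has no "results" key.
    return [(item["city"], item["temp_c"]) for item in data.get("results", [])]


records = extract_records(payload)
for city, temp_c in records:
    print(city, temp_c)
```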

One other important tool, without which this answer would not be complete, is spreadsheets.

Kaggle recently conducted an ML and DS survey, and the results for the tools most commonly used at work can be seen below.

https://www.quora.com/What-tools-should-a-data-scientist-know

So, according to the Kaggle survey, most people use Python in their work, followed by R and SQL. A more detailed look at the survey results can be found in this interactive Kaggle kernel.

Sakina Mirza, Leverage Data Science for Strategic Differentiation


Answered Dec 15, 2017

Hi Guys,

I am finally eligible to answer this question, as I was selected by MuSigma as a data scientist after a year of experience in a big data developer role. Before that, I was working in the software testing domain. So, from my experience, I will guide you on which technologies are required to clear data scientist interviews.

First, I felt that most companies look for prior experience in the big data field before hiring you as a data scientist. As I had this experience, I became eligible for data scientist interviews.

Below are the further skills that interviewers look for in a data scientist position.

A data scientist needs both technical and non-technical skills to do the job effectively.

Technical skills are involved at three stages in data science, so three categories of tools are needed: tools for pulling data, tools for analyzing the data, and tools for presenting the results.

Here are the different tools available for each stage:

Tools for data pulling & pre-processing

a. SQL

This is a must-have skill for all data scientists, regardless of whether you are using structured or unstructured data. Companies are using the latest SQL engines like Apache Hive, Spark-SQL, Flink-SQL, Impala, etc.

b. Big Data Technologies

This is a must among the skills needed to become a data scientist. A data scientist needs to know about different big data technologies like Hadoop and its ecosystem, Spark, and, if possible, Flink.

c. UNIX

Most raw data is stored on a UNIX or Linux server before it is put in a data store, so it is useful to be able to access the raw data without depending on a database. Unix knowledge is therefore good for data scientists.

d. Python

Python is one of the most popular languages among data scientists. Python is an interpreted, object-oriented programming language with dynamic semantics. It is a high-level language with dynamic binding and typing.
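
For readers new to these terms, here is a short sketch of what dynamic typing and dynamic binding look like in practice. The class and function names (`Celsius`, `Fahrenheit`, `report`) are made up for illustration only.

```python
# Dynamic typing: a name is not tied to one type and can be rebound at runtime.
value = 42
kind_before = type(value).__name__
value = "forty-two"
kind_after = type(value).__name__


class Celsius:
    def __init__(self, degrees):
        self.degrees = degrees

    def describe(self):
        return f"{self.degrees} C"


class Fahrenheit:
    def __init__(self, degrees):
        self.degrees = degrees

    def describe(self):
        return f"{self.degrees} F"


def report(reading):
    # Dynamic binding: which describe() runs is decided when the call happens,
    # based on the object passed in, not on a declared type.
    return reading.describe()


print(kind_before, kind_after)
for reading in (Celsius(20), Fahrenheit(68)):
    print(report(reading))
```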

Tools for Data Analysis & pattern matching

This depends on your level of statistical knowledge. Some tools are used for
more advanced statistics and some for more basic statistics.

a. SAS

Lots of companies use SAS, so some basic SAS understanding is good. You can
manipulate equations easily.

b. R

R is the most popular tool in the statistical world. R is an open-source tool and language that is object oriented, so you can use it anywhere. It is the first choice of many data scientists, as most statistical methods are implemented in R.

c. Machine Learning

Machine learning is the most in-demand and most useful tool a data scientist must have. Lots of machine learning tools are available in the market, like Weka, NLTK, etc., but machine learning tools built on top of big data technologies, like Mahout, MLlib, and FlinkML, are grabbing industry attention.

Tools for Visualization

a. Tableau

It is a popular tool, especially in Silicon Valley. Apart from the above-mentioned tools, the following tools are also popular: JasperSoft, SAP BI, QlikView, MicroStrategy, etc.

Non-Technical Skills

a. Business Acumen

b. Communication Skills

Companies are searching for data scientists who can clearly and confidently
translate their insights on the data to other teammates. A data scientist arms
them with quantified insights.

c. Analytical Problem-Solving

Analytical problem-solving skills are in high demand for data scientists, so that the right approach can be used to get the maximum output from the available time and resources.

You can connect with me at sakina.mirza46@gmail.com if you need any help related to this domain. I hope I can be helpful to others, as others have helped me in becoming a data scientist.

Avinash Navlani, Lead Data Scientist at Northout Solutions


Answered Nov 30, 2017

Here are the tools a data scientist should know:

1. Relational Database: At least one relational database tool such as MySQL or PostgreSQL. SQL is a must for working with relational databases.

2. NoSQL Database: At least one NoSQL database such as MongoDB, Neo4j

3. Programming Language: Python, R

4. Big data Technology: Hadoop, Spark (MLlib for model generation in Spark)

5. Visualization Tools: Tableau or QlikView

6. Scraping packages/tools for extracting data from various sources.

7. MS Excel

8. Packages for text mining, for example spaCy and NLTK in Python.

9. Deep Learning Tools: TensorFlow and Keras

10. A web application framework such as Flask for building a POC.

11. An IDE such as Jupyter Notebook, Spyder, or Rodeo.

12. Scala for large data computation on Spark.



Deepak Mahtani, Community Manager/Data Scientist at Pivigo (2017-present)
Answered Jun 19 · Upvoted by Frederick T. Williams, M.S. Data Science, Northcentral
University (2019)

Hi, this is a great question. In terms of technical tools, there are 3 main ones that
I recommend:

1. Programming. The two main programming languages used in data science are Python and R. I would advise learning one of these two, as this will form the basis of your work as a data scientist.

2. Machine Learning. One of the coolest and most useful tools for building
predictive tools, e.g. recommendation systems.

3. Databases. Some of the data you will be using will be stored in databases. I recommend starting with relational databases; SQL is used to access data within a relational database.

A soft skill that is often overlooked is communication. This skill can be built up
over time. I recommend going to data science Meetups. Be very picky about
which ones you go to. Go to ones that interest you. You will quickly see what
works well and what does not.

I hope this helps :)



Eshika Roy, works at DexLab Analytics


Answered Nov 16, 2017

Data is everywhere, and the demand for data scientists has never been as high as it is now.

Here is a set of tools that every data scientist should know of:

1. You should come from a strong quantitative background – Earning a post-graduate degree in an in-demand field like mathematics, statistics or software engineering helps in becoming a capable data scientist.

2. You should possess analytical and programming skills – Apart from holding a good degree, having analytical acumen is important. Excel at any two analytical tools, such as R, SAS, Python, or Hadoop, and step into this dynamic industry.

3. You should have a knack for data – Immerse yourself in data. You have to be fully proficient with data, both structured and unstructured. You also have to slice and dice the raw data to produce intuitive reports.

If you are confused about which data science certification to consider, DexLab Analytics is your key to a better analytics career. As a premier data science training institute in Pune, DexLab offers courses that live up to its name.

Saanvi Priya
Answered Dec 15, 2017

The tools a data scientist must know are discussed below.

1. Basic Tools:

Data scientists are in great demand, and a data scientist must know how to use the tools of the trade. This simply means being well versed in statistical programming languages like R or Python, and in SQL. You need to confirm that your data science team includes such a skilled scientist to make your datasets meaningful and productive.

Data visualization tools:

Data visualization tools are used to draw interactive charts. Here, QlikView and Qlik Sense play an important role in visualization. Tableau is another data visualization tool.

Having lack of data

Text mining

To learn more : BEPEC | Why Data Science?| Bengaluru



Vidita Mehta, Data Science, Big Data Analytics & FinTech Advisor at
Imarticus (2018-present)
Answered Mar 31

The silver lining behind this list is that most job postings have the following
phrase: "know at least one of the following...". Which means that you don’t
actually have to go out and learn all of the tools. It just means that you should
know at least one of them really well and have a passing familiarity with some of
the other ones. You don’t need to know them intimately, you just need to know
what they do.

So, if you are looking for a data science job, based on this data, the best way to
get started is to learn R, SQL, and Hadoop. Then have a passing understanding
of Python and the tools that work with Hadoop like Hive, Pig, and others. This
will make it so that you know at least one of the tools that data science positions
are looking for and you’ll have a good start to becoming a data scientist.

If you have not taken any courses, I would suggest going to Imarticus Learning. Imarticus helps aspirants like you upgrade your skills and kick-start a career in big data analytics. Extensive projects, case studies, and mentorship are a few highlights of our courses, since we believe in learning by doing, which has won us various esteemed awards in the industry.

Imarticus Learning, an award-winning institute, offers certification courses for various big data analytics tools such as R, SAS, Python, Big Data and Hadoop. We offer courses in both classroom and online modes. If you want to be part of the sexiest industry of the 21st century, feel free to consider any of our big data analytics courses. We provide 100% career assistance for these programs, which includes resume building, extensive interview prep, etc.

Our courses are as follows:

Post Graduate Program in Data Analytics: This program helps you understand foundational concepts and gives you hands-on learning with leading analytical tools, such as SAS, R, Python, Hive, Spark and Tableau, as well as functional analytics across many domains.

Data Science Prodegree: The program is co-created with Genpact as the ‘Knowledge Partner’, and comes with a cutting-edge, industry-aligned curriculum. This program helps you gain a deep understanding of data analysis and statistics, along with business perspectives and cutting-edge practices using SAS, R, Python, Hive, Spark and Tableau.

To know more about our programs, visit our website (http://imarticus.org/?id=Website... ).

Hope this will help you.

