
Data Science

Revision
Data Science Lifecycle Phases –
1. Discovery – Understanding business domain, identifying potential data sources,
identifying the key stakeholders
2. Data Preparation – Preparing Analytic Sandbox, Data Cleansing, Data
Transformation
3. Model Planning – Data Exploration, Model Selection
4. Model Building – Finalizing a model
5. Communicate Results – Design for aesthetics, choosing an effective medium and
channel
6. Operationalize – Small scope pilot deployment, Full scale deployment
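
The Data Preparation phase above (data cleansing followed by data transformation) can be illustrated with a minimal sketch. It is only one possible approach, not a prescribed method: the records, the field names (`customer`, `monthly_spend`) and the helper functions `cleanse` and `transform` are all hypothetical, and mean imputation with min-max scaling is just one common choice of transformation.

```python
from statistics import mean

# Hypothetical raw records; field names and values are invented for illustration.
raw_records = [
    {"customer": " Alice ", "monthly_spend": "120.50"},
    {"customer": "bob", "monthly_spend": ""},            # missing value
    {"customer": "Carol", "monthly_spend": "80"},
    {"customer": " Alice ", "monthly_spend": "120.50"},  # exact duplicate
]

def cleanse(records):
    """Data cleansing: trim and normalize names, parse numbers, drop duplicates."""
    seen = set()
    cleaned = []
    for row in records:
        name = row["customer"].strip().title()
        spend_text = row["monthly_spend"].strip()
        spend = float(spend_text) if spend_text else None
        key = (name, spend)
        if key in seen:
            continue  # skip rows already seen after normalization
        seen.add(key)
        cleaned.append({"customer": name, "monthly_spend": spend})
    return cleaned

def transform(records):
    """Data transformation: fill missing spend with the mean, then min-max scale."""
    known = [r["monthly_spend"] for r in records if r["monthly_spend"] is not None]
    fill = mean(known)
    filled = [
        r["monthly_spend"] if r["monthly_spend"] is not None else fill
        for r in records
    ]
    low, high = min(filled), max(filled)
    span = (high - low) or 1.0  # avoid division by zero when all values are equal
    return [
        {**r, "monthly_spend": value, "spend_scaled": (value - low) / span}
        for r, value in zip(records, filled)
    ]

prepared = transform(cleanse(raw_records))
for row in prepared:
    print(row)
```

In a real project this work would typically happen inside the analytic sandbox, so that the original source data stays untouched.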


Business Intelligence vs Data Science
Business Intelligence is a means of
performing descriptive analysis of data using
technology and skills to make informed
business decisions.
BI facilitates decision making by enabling the
sharing of data between internal and external
stakeholders.
Some advantages of BI include –
• Gaining a better understanding of the
market.
• Uncovering new revenue opportunities.
• Improving business processes.
• Staying ahead of competitors.
Perspective
Business Intelligence:
• Designed to look backward.
• Looks at real data from real events.
Data Science:
• Designed to look forward.
• Interprets the information to predict what might happen in future.

Transform
Business Intelligence:
• Helps businesses answer the questions they know.
• Enables companies to apply insights to existing data.
Data Science:
• Helps businesses discover new questions.
• Encourages companies to apply insights to new data.

Common Questions
Business Intelligence:
• What happened last quarter?
• How many units sold in last month? How did my daily sales perform?
• Where is the problem? In which situation?
Data Science:
• What if …?
• What's the optimal scenario for our business?
• What will happen next? What if these trends continue? Why is this happening?

Analytics
Business Intelligence:
• Descriptive.
• Retrospective.
• Standard and ad-hoc reporting, dashboards, alerts, queries, details on demand.
Data Science:
• Predictive.
• Prescriptive.
• Optimization, predictive modelling, forecasting, statistical analysis.

Focus
Business Intelligence:
• Focuses on detailed reports.
• Looks at KPIs and trends.
Data Science:
• Focuses on predicting future trends.
• Looks at patterns and observations.

Data Sources
Business Intelligence:
• Rigid approach.
• Data sources tend to be pre-planned and added slowly.
Data Science:
• Flexible approach.
• Data sources can be added on the go as needed.

Decisions
Business Intelligence:
• Human directed decisions.
• Users make strategic decisions based on their interpretation of data.
• Manual and subject to bias.
Data Science:
• Machine or AI directed decisions.
• AI will make strategic decisions for the business consistently and without bias.

Storage
Business Intelligence:
• BI systems tend to be warehoused and siloed.
• Difficult to deploy data across the business.
Data Science:
• Data Science tends to be integrated and well governed.
• Data can be distributed in real-time across business.

Ownership
Business Intelligence:
• IT Dept owns it and sends data to Business Analysts who interpret it.
• More time is consumed in IT Warehousing.
Data Science:
• Big Data Solutions are owned by analysts.
• More time is dedicated in analysis and prediction.

Tools
Business Intelligence:
• MS Excel, SAS BI, Sisense, MicroStrategy etc.
Data Science:
• Python, R, Hadoop/Spark, SAS, TensorFlow etc.
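
The contrast between descriptive (BI) and predictive (Data Science) analytics in the comparison above can be sketched on one small data set. This is an illustrative sketch only: the sales figures are invented, `describe` and `forecast_next` are hypothetical helper names, and a straight-line least-squares fit stands in for whatever forecasting model a real project would choose.

```python
from statistics import mean

# Hypothetical monthly unit sales; the figures are invented for illustration.
monthly_units = [100, 110, 120, 130, 140, 150]

def describe(units):
    """BI-style descriptive view: summarize what has already happened."""
    return {
        "last_month": units[-1],
        "last_quarter_total": sum(units[-3:]),
        "average": mean(units),
    }

def forecast_next(units):
    """Data-science-style predictive view: fit a least-squares line to the
    history and extrapolate one month ahead."""
    n = len(units)
    xs = range(n)
    x_bar, y_bar = mean(xs), mean(units)
    slope = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, units)) / sum(
        (x - x_bar) ** 2 for x in xs
    )
    intercept = y_bar - slope * x_bar
    return intercept + slope * n  # predicted value for the next month

summary = describe(monthly_units)      # answers "What happened last quarter?"
prediction = forecast_next(monthly_units)  # answers "What if these trends continue?"
print(summary)
print(round(prediction, 1))
```

The first function only reports on recorded data, while the second makes a claim about a month that has not happened yet, which is the distinction the Perspective and Analytics rows draw.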
Similarities between BI & Data Science

1. Share a love for data analysis
2. Principle: garbage in, garbage out
3. Collaboration is essential
4. Cloud is the great enabler
5. Self-service access to the data
Role of Data Scientist
The Data Scientist is responsible for designing and creating processes and layouts for complex,
large-scale data sets used for modelling, data mining and research purposes. Data Scientists play an
active role in the 4 A's of data –
• Data Architecture
• Data Acquisition
1. Collection of data from multiple sources
2. Data Cleansing
• Data Analysis
1. Analyzing and exploring data to predict and determine trends and patterns
2. Producing data-driven solutions
3. Inventing new algorithms
• Data Archiving
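
As a rough illustration of the Data Acquisition step listed above (collecting data from multiple sources), the sketch below combines a CSV source with a JSON source. Everything in it is assumed for the example: the two inline data sources, the `customer_id` join key and the `collect` helper are hypothetical, and real sources would more likely be files, APIs or databases.

```python
import csv
import io
import json

# Two hypothetical sources, inlined as strings so the sketch is self-contained.
crm_csv = "customer_id,name\n1,Alice\n2,Bob\n3,Carol\n"
orders_json = (
    '[{"customer_id": 1, "amount": 40.0},'
    ' {"customer_id": 1, "amount": 60.0},'
    ' {"customer_id": 3, "amount": 25.0}]'
)

def collect(csv_text, json_text):
    """Combine customer records (CSV) with order totals (JSON) by customer_id."""
    # Sum order amounts per customer from the JSON source.
    totals = {}
    for order in json.loads(json_text):
        cid = order["customer_id"]
        totals[cid] = totals.get(cid, 0.0) + order["amount"]

    # Attach each total to the matching customer row from the CSV source.
    combined = []
    for row in csv.DictReader(io.StringIO(csv_text)):
        cid = int(row["customer_id"])
        combined.append(
            {
                "customer_id": cid,
                "name": row["name"],
                "total_spend": totals.get(cid, 0.0),  # 0.0 when no orders exist
            }
        )
    return combined

dataset = collect(crm_csv, orders_json)
for record in dataset:
    print(record)
```

The combined records would then feed the Data Analysis step, where trends and patterns are explored.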
