Beyond The Numbers - Discovering The Fascinating History of Data Science
This file is meant for personal use by askokoad@gmail.com only.
Sharing or publishing the contents in part or full is liable for legal action.
Proprietary content. © Great Learning. All Rights Reserved. Unauthorized use or distribution prohibited.
Meet Your Speaker
Learning Objectives
Learning Outcomes
Illustrate how data science has evolved over the past century.
Agenda
Origin of Decisions
Origin of Decisions
Let’s consider a few situations that early civilizations might have faced:
Inputs: Temperature, Humidity
Decision: When to start cultivating a crop?
Decisions are made today by businesses in the same way - but the methods
have become more accurate and faster owing to the evolution of
statistical techniques & computing capabilities.
Paradigms in Data Science
Inferential
Limitation: Representativeness of data
Computational
Limitation: Complexity of algorithms and cost of training large models
Evolution of Data Science
Timeline: < 1940s | 1940s - 50s | 1960s - 70s | 1980s - 90s | 2000s - 10s | 2020s+
Before the 1940s
Inferential: Central Tendency, Expected Values, Probability Theory
Computational: Telegraph, Difference Engine, Audio Tapes
Evolution of Data Science (1940s to 1950s)
Inferential
Industrial Statistics
Use prior knowledge to predict future uncertainties
Sampling Theory
Make inferences about a population, using a sample
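A minimal sketch of sampling theory in practice, assuming a hypothetical population of 10,000 synthetic measurements (illustrative values, not data from this module). The function `estimate_mean` is an illustrative helper: it estimates the population mean from a random sample and attaches an approximate 95% interval based on the standard error.

```python
import random
import statistics

# Hypothetical population: 10,000 synthetic measurements (illustrative only).
rng = random.Random(42)
population = [rng.gauss(50, 10) for _ in range(10_000)]


def estimate_mean(sample):
    """Return the sample mean and an approximate 95% confidence interval."""
    n = len(sample)
    mean = statistics.mean(sample)
    std_err = statistics.stdev(sample) / n ** 0.5  # standard error of the mean
    margin = 1.96 * std_err  # normal approximation
    return mean, (mean - margin, mean + margin)


# Draw 200 members at random instead of measuring all 10,000.
sample = rng.sample(population, 200)
mean, (low, high) = estimate_mean(sample)

print(f"sample mean = {mean:.2f}, approx. 95% CI = ({low:.2f}, {high:.2f})")
print(f"population mean = {statistics.mean(population):.2f}")
```

The interval narrows as the sample grows, which is why a well-drawn sample can stand in for the whole population.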
ANOVA
Compare the means of 3+ groups - Evidence of difference
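As a rough illustration of ANOVA, the sketch below computes the one-way F statistic by hand: the variation between group means divided by the variation within groups. The crop-yield numbers and the function name `one_way_anova_f` are assumptions made for this example. Turning F into a p-value needs the F distribution, which is left out here.

```python
import statistics


def one_way_anova_f(groups):
    """Return the F statistic for a one-way ANOVA over a list of groups."""
    k = len(groups)
    n_total = sum(len(g) for g in groups)
    grand_mean = sum(sum(g) for g in groups) / n_total

    # Variation of the group means around the grand mean.
    ss_between = sum(
        len(g) * (statistics.mean(g) - grand_mean) ** 2 for g in groups
    )
    # Variation of the observations around their own group mean.
    ss_within = sum(
        sum((x - statistics.mean(g)) ** 2 for x in g) for g in groups
    )

    ms_between = ss_between / (k - 1)
    ms_within = ss_within / (n_total - k)
    return ms_between / ms_within


# Hypothetical crop yields under three fertilizers (illustrative only).
yields = [[20, 21, 19, 22], [25, 27, 26, 24], [30, 29, 31, 32]]
f_stat = one_way_anova_f(yields)
print(f"F = {f_stat:.1f}")
```

A large F means the group means differ by more than the within-group scatter would explain, which is the "evidence of difference" the slide refers to.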
Computational
Digital Computers
Programming Languages
To have a computer understand instructions & execute them
Monte Carlo Methods
Run simulations with random inputs to arrive at conclusions
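A classic small example of a Monte Carlo method, sketched under simple assumptions: random points are thrown into the unit square, and the fraction that lands inside the quarter circle gives an estimate of pi. The function name `estimate_pi` and the fixed seed are choices made for this illustration.

```python
import random


def estimate_pi(num_points, seed=0):
    """Estimate pi from the share of random points inside a quarter circle."""
    rng = random.Random(seed)
    inside = 0
    for _ in range(num_points):
        x, y = rng.random(), rng.random()
        if x * x + y * y <= 1.0:
            inside += 1
    # Area of quarter circle / area of unit square = pi / 4.
    return 4 * inside / num_points


print(estimate_pi(100_000))
```

More points give a tighter estimate, at the cost of more computation, which is why these methods grew alongside digital computers.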
Evolution of Data Science (1960s to 1970s)
Inferential
Non-Parametric Methods
Rely on ranking/ordering of data rather than the distribution
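One way to see the rank-based idea is Spearman's rank correlation, sketched below. It uses only the ordering of the values, so an extreme value such as 1000 has no extra pull. The helper names `ranks` and `spearman_rho` are illustrative, and the shortcut formula assumes there are no tied values.

```python
def ranks(values):
    """Rank values from 1 (smallest) to n (largest); assumes no ties."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0] * len(values)
    for rank, index in enumerate(order, start=1):
        result[index] = rank
    return result


def spearman_rho(x, y):
    """Spearman rank correlation via the no-ties shortcut formula."""
    n = len(x)
    d_squared = sum((rx - ry) ** 2 for rx, ry in zip(ranks(x), ranks(y)))
    return 1 - 6 * d_squared / (n * (n ** 2 - 1))


# Hypothetical data: the last x value is extreme, but the ordering is unchanged.
x = [10, 20, 30, 40, 1000]
y = [1, 2, 3, 4, 5]
print(spearman_rho(x, y))
```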
Decision Theory
Assign probabilities to different outcomes to make a decision
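A minimal sketch of the decision-theory idea, reusing the crop question from earlier in the deck. All probabilities and payoffs below are hypothetical numbers chosen for illustration: each action is scored by its expected payoff, and the action with the highest score is chosen.

```python
# Hypothetical probabilities for each outcome (they must sum to 1).
probabilities = {"good_rain": 0.6, "poor_rain": 0.4}

# Hypothetical payoff of each action under each outcome.
payoffs = {
    "plant_now": {"good_rain": 100, "poor_rain": -40},
    "wait": {"good_rain": 60, "poor_rain": 10},
}


def expected_payoff(action):
    """Probability-weighted average payoff of an action."""
    return sum(
        probabilities[outcome] * payoffs[action][outcome]
        for outcome in probabilities
    )


best_action = max(payoffs, key=expected_payoff)

for action in payoffs:
    print(action, round(expected_payoff(action), 2))
print("choose:", best_action)
```

Maximizing expected payoff is only one possible decision rule; a risk-averse decision maker might prefer the action with the better worst case.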
Robust Statistics
Provide accurate results despite outliers/extreme values
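To illustrate robustness, the sketch below compares the mean with the median on a hypothetical set of readings that contains one extreme value. The data and the helper `median_absolute_deviation` are assumptions made for this example.

```python
import statistics

# Hypothetical readings; the last value stands in for a faulty measurement.
readings = [21, 22, 20, 23, 22, 21, 250]


def median_absolute_deviation(values):
    """Median distance from the median: a spread measure that resists outliers."""
    center = statistics.median(values)
    return statistics.median(abs(v - center) for v in values)


mean = statistics.mean(readings)
median = statistics.median(readings)
mad = median_absolute_deviation(readings)

print(f"mean = {mean:.1f}")   # pulled far upward by the single outlier
print(f"median = {median}")   # stays with the bulk of the data
print(f"MAD = {mad}")
```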
Computational
Databases
Store, organize & query large amounts of data quickly
● Charles Bachman - General Electric - Integrated Data Store (IDS), widely regarded as the 1st DBMS ever created
● Businesses wanted databases to be standardized
● Common Business Oriented Language - COBOL
● Laid the roots for the creation of MySQL in 1995
Evolution of Data Science (1980s to 1990s)
Resampling Methods
Simulate multiple datasets from original data for analysis
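The bootstrap is one widely used resampling method. The sketch below, with hypothetical data and an illustrative function name `bootstrap_ci`, resamples the original data with replacement many times and reads a percentile interval for the mean off the resulting estimates.

```python
import random
import statistics


def bootstrap_ci(data, stat=statistics.mean, n_resamples=2000, alpha=0.05, seed=0):
    """Percentile bootstrap interval for a statistic of the data."""
    rng = random.Random(seed)
    n = len(data)
    # Each resample draws n values from the original data, with replacement.
    estimates = sorted(
        stat(rng.choices(data, k=n)) for _ in range(n_resamples)
    )
    lower = estimates[int((alpha / 2) * n_resamples)]
    upper = estimates[int((1 - alpha / 2) * n_resamples) - 1]
    return lower, upper


# Hypothetical observations (illustrative only).
data = [12, 15, 11, 19, 14, 13, 17, 16, 12, 18]
low, high = bootstrap_ci(data)
print(f"approx. 95% interval for the mean: ({low:.2f}, {high:.2f})")
```

No distributional formula is needed, only repeated computation, which is why resampling became practical once computing power was cheap.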
Computational
Object-Oriented Programming
Model programs around objects: abstract entities, each with its own set of properties & functions
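A small sketch of the object-oriented idea, using a hypothetical `Sensor` class invented for this example. Each object carries its own properties (name, unit, readings) and its own functions (methods that act on those properties).

```python
class Sensor:
    """A hypothetical object with its own properties and functions."""

    def __init__(self, name, unit):
        # Properties: data that belongs to this particular object.
        self.name = name
        self.unit = unit
        self.readings = []

    def record(self, value):
        """Function (method) that updates this object's own state."""
        self.readings.append(value)

    def average(self):
        """Function (method) that summarizes this object's readings."""
        return sum(self.readings) / len(self.readings)


# Two independent objects created from the same class.
temperature = Sensor("temperature", "C")
humidity = Sensor("humidity", "%")

temperature.record(21.5)
temperature.record(22.5)
print(temperature.name, temperature.average(), temperature.unit)
```

Recording a temperature reading leaves the `humidity` object untouched, since each object keeps its own state.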
Evolution of Data Science (2000s to 2010s)
Inferential
Bayesian Networks & Graphical Models
Represent relationships between variables in a dataset using graphs
Causal Inference
Is change in one variable changing the other?
Computational
Big Data
Massive digital information generated every second
Cloud Computing
Computing power & resources for everyone, on-demand
Evolution of Data Science (2020s+)
Interdisciplinary Approaches
Knowledge from multiple disciplines for problem solving
● Tesla - Advances in battery + electric motor tech
● Model S - Range of 400 km in a single charge
● Accelerated transition from fossil fuels
● Innovative solutions to complex problems
Natural Experiments
Observe naturally occurring events without manipulating factors
● Journal of Public Economics - Study of policy impact
● Effectiveness of public health interventions
● Impact of business closure due to the pandemic on jobs
● Investigate complex phenomena + precise conclusions
Evolution of Data Science (2020s+)
Computational
Blockchain
Share information - secure, transparent, & tamper-proof
Edge Computing
Compute directly at the source of data, instead of remotely
Quantum Computing
Use the principles of quantum physics to compute
● 1st built in 1998 - Los Alamos National Laboratory, New Mexico
● Impact areas: Cryptography, Chemistry & Optimization
● In early stages, a lot of opportunities are still theoretical and under experimentation
Evolution of Data Science
Inferential
< 1940s: Central Tendency, Expected Values, Probability Theory
1940s - 50s: Industrial Statistics, Sampling Theory, ANOVA
1960s - 70s: Non-Parametric Methods, Decision Theory, Robust Statistics
1980s - 90s: Resampling Methods, Generalized Linear Models, Model Selection
2000s - 10s: Bayesian Networks, Causal Inference, Open Science Movement
2020s+: Interdisciplinary Approaches, Newer Causal Inference, Natural Experiments
Let’s conclude by defining data science
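[Venn diagram] Data Science sits at the intersection of three areas:
● Computer Science & Technology
● Math & Statistics
● Domain/Business Knowledge
Pairwise overlaps:
● Computer Science & Technology + Math & Statistics: Machine Learning & Deep Learning
● Computer Science & Technology + Domain/Business Knowledge: Software Development
● Math & Statistics + Domain/Business Knowledge: Research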
Summary
Key takeaways from this module:
● Early civilizations relied on decisions, based on inputs they could observe, to solve critical problems that affected the growth and sustenance of the community.
● Data Science combines two paradigms: the inferential and the computational.
● The inferential paradigm focuses on statistical methods for analysis, while the computational paradigm focuses on computational methods and algorithms.
● Data Science has evolved from simple probabilistic models and rudimentary computers in the early 1950s to highly advanced inference methods and computing in the 2020s.
● Developments in the inferential and computational paradigms act as catalysts that propel further advances in both as the evolution continues.
Happy Learning!