TDT4259 – Applied Data Science

Lecture 1: Introduction

Nisha Dalal
Adj. Associate Professor

nisha.dalal@ntnu.no

Warm up question

Hurricane Frances, 2004



Where have all the criminals gone?


• In the early 1990s the crime rate was very high: it had risen by 80% over the previous 15 years.

• It was predicted to increase by 50% over the next decade, or by 15% in the best-case scenario.

• Instead, the crime rate began falling in the early 1990s, and it did so with such speed and suddenness that it surprised everyone.

Freakonomics, Steven Levitt and John Donohue

Where have all the criminals gone?


• Crime-drop explanations, ranked by number of citations:

1. Innovative policing strategies
2. Increased reliance on prisons
3. Changes in crack and other drug markets
4. Aging of the population
5. Tougher gun-control laws
6. Strong economy
7. Increased number of police

Freakonomics, Steven Levitt and John Donohue



Where have all the criminals gone?


Roe vs Wade?

Freakonomics, Steven Levitt and John Donohue



Agenda
1. Who I am and what I do

2. What this course is NOT and what it is

3. What you will be doing in this course

4. Course outline

5. Why do we even care about applied data science?

6. General Q&A session and other important things


Who am I and what do I do?

Information Systems and Software Engineering – ISSE Group

IT Startups and Established Organizations

Aneo AS – AI Team

What this course is NOT and what it is

This course is NOT about…


• Statistics

• Machine learning

• Data warehousing/mining

• Programming

…and what the course is about


• Approaching business problems data-analytically

• Interacting competently on the topic of applied data science for business intelligence

• Communication, visualization and storytelling

• Gaining an understanding of the different important facets of using data science in organizations

• Hands-on experience of working on a data-driven problem


What you will be doing in this course

Learning Methods and Activities


What are the grading criteria for the Applied Data Science course?

• Lectures, projects and assignments are the main teaching methods

• The result of the portfolio as a whole is given as a letter grade

• There is no re-sit. If you fail, you need to take the course again

• You need both the individual assignment and the group project to pass

Individual assignment (Pass/Fail)
The individual assignment asks you to think like a data scientist who needs to develop a plan for
addressing business problems through a data-driven strategy

Group project (100%)
The group project focuses on a realistic analysis of data
where you are requested to provide insight and direction to
a specific problem

Project (Group)
You will work in groups of 5-6 students. The final deliverable will be a report and a presentation that describes a problem or
business opportunity you want to explore, what data you use to tackle it, the analysis you conduct, the presentation of
results, their interpretation and your recommendations:

1. Introduction and problem definition – Describe the context and the problem you wish to address (max 3 pages).
2. Background – Present specific objectives you want to achieve and describe how you approach the problem, how you will design your data strategy and what goals it is intended to achieve etc. (approx. 2-5 pages).
3. Method – Describe in detail the methods you are applying to analyze the data and the dataset you have selected (3-6 pages).
4. Analysis – Describe the data analysis you conducted and present the results. It is important that the results are described in detail and visualized appropriately (3-10 pages).
5. Interpretation and recommendations – Describe an implementation plan based on the insights you extracted. You can set specific actions that need to be implemented, a time-plan for deployment, and ideas for future data collection and improving the analysis and results (3-5 pages).

Deliverables: A group project report and presentation. Both submitted online.

The page indication is just to give you a hint about the size. A detailed template will be given after the second lecture with grading criteria.

Project (Group)
Find the link to register for the group assignment in Blackboard: Course work -> Group Project

IMPORTANT! Only one entry is needed per team for the team registration.
No individual registrations for those who already registered for a team!

Assignment (Individual)
The individual assignment is a conceptual one where you will be asked to think like a data
strategist.

The approach you use will be based on the material from the 4th lecture.

A template will be provided during that lecture.


Course outline

Lecture Plan
Unpacking the course syllabus

1 23/8 Lecture 1: Introduction [Nisha Dalal]

2 30/8 Lecture 2: Presentation of datasets [Nisha Dalal]

3 6/9 Lecture 3: Crash course in machine learning [Kshitij Sharma]

4 13/9 Lecture 4: Data analysis with low or no-code tools [Nisha Dalal]

5 20/9 No lecture

6 27/9 Lecture 5: Lifecycle of a Data Science project I [Nisha Dalal]

7 4/10 Lecture 6: Lifecycle of a Data Science project II [Nisha Dalal]

8 11/10 Lecture 7: Data Visualization & Storytelling [Manos Papagiannidis]

9 18/10 Lecture 8: Data Science in the time of ChatGPT [Pikakshi Manchanda]

10 25/10 No lecture

11 1/11 Lecture 9: Experiences from Industry [Thomas Thorensen]

12 8/11 Lecture 10: Decision making with data science [Nisha Dalal]

13 15/11 Course finish

Important Deadlines
When you will need to deliver or complete a task

1 20/9 Register yourself/group and the company/dataset for group assignment

2 30/10 Deliver individual assignment

3 27/11 Deliver presentation and report for group assignment



Important to note
There is enough time to form groups. Take advantage of that.

Important: There could be remote students in your group. Coordinate accordingly.

Each group will be assigned a TA to help you.

If you are unhappy with the group/member, please communicate.


Why do we even care about applied data science?

The Age of Data


How data is leading the next digital transformation

Suggested Talk

Philip Evans: How data will transform business

79% of enterprise executives agree that companies that do not embrace Big Data will
lose their competitive position and could face extinction (Accenture, 2018)

Information Systems
A Timeline of major breakthroughs in the last 30 years

Knowledge Management Systems (KMS)
1990s-2000s
The rapid growth of intranets, extranets, the internet and other
interconnected global networks dramatically changed the capabilities
of IS in business. It became possible to circulate knowledge to
different parts of the world irrespective of time and space.

Enterprise Resource Planning (ERP)
1990s-2000s
ERP is an organization-specific form of a strategic information system that
incorporates all components of an organization including manufacturing, sales,
resource management, human resource planning and marketing.

E-Business
1990s-today
The Internet and related technologies and
applications changed the way businesses
operate and people work.

Web 2.0
2000s-today
Social networking and other collective
websites define a mechanism for
collectively assembling information by
and about people.

Service-Oriented Architectures
2000s-today
A service-oriented architecture (SOA) is a style of
software design where services are provided to the other
components by application components over a network.

Social Media
2000s-today
Social media are computer-mediated technologies
that facilitate the creation and sharing of information
and ideas via virtual communities and networks.

Internet of Things
2010s-today
The Internet of Things is the network of physical devices,
vehicles, home appliances and other items embedded with
electronics, software, sensors, actuators, and connectivity
which enables these objects to connect and exchange data

Big Data Analytics & AI
2010s-today
Big data analytics is the process of examining large and varied data sets to
uncover unknown correlations, market trends, customer preferences and
other information that can help organizations make more-informed decisions.

Suggested Reading
Bryant, A., Black, A., Land, F., & Porra, J. (2013). Information
Systems history: What is history? What is IS history? What IS
history?… and why even bother with history? Journal of
Information Technology, 28(1), 1-17.

Digital Transformation

By the end of 2021, 45% of all industry leaders will be disrupted by digitally
enabled competitors.
Source: IDC

Every business is a Digital Business


“Digital is the main reason more than
half of the companies on the Fortune
500 have disappeared since the year
2000.”
Pierre Nanterme, CEO of Accenture

A case of digital transformation

www.amazon.com

The Technology
On August 6, 1991, Berners-Lee posted a
short summary of the World Wide Web
project on the alt.hypertext newsgroup,
inviting collaborators.

The Opportunity
In 1994 Jeff Bezos, a former Wall Street
hedge fund executive, founded Amazon.
On the basis of research he had conducted,
Bezos concluded that books would be the
most logical product initially to sell online.

The Transformation
Amazon was one of the first online book
retailers, enabling customers to order books
from any location in the world.

A case of digital transformation

Continuous Renewal
Amazon has become one of the leading
technology companies simply because it
was so effective in transforming its
business model in the face of novel digital
technologies.

Transformation is about adaptation
Digital transformation is about how
technology changes the conditions under
which business is done, in ways that change
the expectations of customers, partners,
and employees.

Digital Transformations
Some Key Points

1. Digital transformation is not about technology, it is about how technological developments change the
way organizations operate and do business.

2. Digital transformation is not about transformation, it is about adaptation.

3. Digital transformation is about the ability of organizations, their leaders and employees, to adapt to rapid
changes wrought by evolving digital technologies.

Suggested Reading
Hess, T., Matt, C., Benlian, A., & Wiesböck, F.
(2016). Options for Formulating a Digital
Transformation Strategy. MIS Quarterly
Executive, 15(2), 123-139.


Data-driven transformation

Data-driven Transformation
Beyond the Big Data Hype

“Everywhere you look, the quantity of information in the
world is soaring. Merely keeping up with this flood, and
storing the bits that might be useful, is difficult enough.
Analyzing it, to spot patterns and extract useful
information, is harder still.”
The Economist; “The Data Deluge”

“The Big Data Challenge Involves More Than Just
Managing Volumes of Data”

“The real issue is making sense out of the data and (…)
helping organizations make better decisions.”

How Mature are Organizations?
A large proportion of companies are facing problems
utilizing big data analytics in their operations.

Suggested Reading
Ransbotham, S., Kiron, D., & Prentice, P. K. (2016).
Beyond the hype: the hard work behind analytics
success. MIT Sloan Management Review, 57(3), 3-16.

Barriers of Big Data Value


Challenges are not technology-related

• Lacking technical expertise: 32%
• Data silos: 31%
• Corporate culture: 30%
• Training: 26%
• Unclear organizational roles: 24%
• No data governance: 22%
• Business and IT are not aligned: 22%

Suggested Reading
Capgemini and Informatica (2016). The Big Data
Payoff: Turning Big Data into Business Value.

Barriers for Data-Driven Change


Why can’t companies leverage Big Data?

• Don’t know how to use analytics to improve business: 38%
• Lack of management bandwidth: 34%
• Lack of skills: 28%
• Non-supportive culture: 23%
• Lack of data governance: 23%
• Perceived costs outweigh benefits: 21%
• Don’t know where to start: 9%

Suggested Reading
Chai, S., & Shih, W. (2017). Why big data isn't
enough. MIT Sloan Management Review,
58(2), 57-62.


Need of the hour

What’s in the name?

• Data Scientists
• Data Analysts
• Data Engineers
• Data Architects
• Data Modelers
• Business Analyst
• Data Curator
• DataOps Engineer
• Data mining Engineer
• Data Science Specialist
• Data and Applied Scientist
• Machine learning Engineer
• Data Visualization Analysts
• DevOps Engineer
• MLOps Engineer
• Chief Data Officer
• And many more…

Data Science is changing the industry


• Rapid developments of data science tools.
• Helps organizations operate more efficiently and improve their strategies based on factual data,
statistics and predictions.
• Every aspect of the business process is now open to data analytics, including general operations,
manufacturing, supply-chain management, customer services, customer behavior, marketing, workflow, etc.
• Increasing automation (in data cleaning and experiments).

…and the needed skillsets

• Ability to learn fast
• Less focus on techniques than on questions
• Communication (both within the team and to nontechnical stakeholders)
• Critical and quantitative thinking


Other important things and Q&A

Communication

Lectures
Every Wednesday 10:15-12:00 at EL5, physical classes.

The next lecture will be online. A Zoom link will be posted on Blackboard (BB) a few days before the lecture.

All lectures (physical and remote) will be uploaded to Blackboard shortly afterwards.

TAs will be available to answer questions on Slack, and will be assigned to groups for the group
project.

Questions & Discussion

Nisha Dalal
nisha.dalal@ntnu.no
