
AACS1573

Introduction to
Data Science
Chapter 1
Introduction (Part A)
Course Learning Outcomes

CLO 1: Explain the concepts and applications of data science in various aspects.

CLO 2: Relate data into actionable insights that are found through data science processes.

CLO 3: Practice Exploratory Data Analysis (EDA) to analyse the main characteristics of datasets using a programming language.
Marks
Test = 40 marks

Assignment:
Report = 48 marks
Presentation = 12 marks

Continuous Assessment
Assignment 60% + Test 40%

Final Online Assessment (FOA)
Examination 50%
Main Reference

Data Science Using Python and R
Author(s): Chantal D. Larose, Daniel T. Larose
First published: 29 March 2019

Data Science: Concepts and Practice
Author(s): Vijay Kotu and Bala Deshpande
Second Edition, 2019
Course Content Outline

Lecture: 2 hours per week
Tutorial: 1 hour per week
Practical: 1 hour per week

Please accept the invitation through TAR UC Gmail to join the Google Classroom.
Lecturer’s and Tutor’s Info

Nikalus
Lim Khai Yin (limky@tarc.edu.my)
Swee Shu Luing (sweesl@tarc.edu.my)
Course Content Outline: Lecture

• Chapter 1: Introduction to Data Science


• Chapter 2: The data science process
• Chapter 3: Visualization and descriptive analysis
• Chapter 4: Machine learning and R platform
• Chapter 5: Mining data stream
• Chapter 6: Case study: link analysis
• Chapter 7: Data quality

Course Content Outline: Tutorial

• Online tutorial
• Google Docs
• Students take turns to present tutorial answers, and tutors give feedback on the presented answers
• Please complete your answers before the tutorial class

Course Content Outline: Practical

• Online Practical
• Software: R platform
• Please refer to the practical folder in Google Classroom and follow the instructions in the installation guide to install the software on your desktop/laptop

Table of Contents

01 History
02 Importance of Data Science
03 Big Data
04 Types of Data Scientists
1.1 History (Data Science & Big Data)

44,000,000,000,000 GB

Data = Potential!
Internet
- 4.39 billion people connected to the Internet

Social Media (every 1 min)
- Around 1 million logins to Facebook
- Google processes 4.5 million searches
- 4,500,000 videos are streamed on YouTube
- 55,140 photos are shared on Instagram

Communication (every 1 min)
- Around 188,000,000 emails are sent
- 231,840 calls are made through Skype
- 4.8 million GIFs are sent
Data Science?

Data Science is the science that uses computer science, statistics, machine learning, visualization and human-computer interaction to collect, clean, integrate, analyze, visualize and interact with data in order to create data products.
Goal of Data Science

Turn data into data products.
Turn Data into Data Product (1)

https://towardsdatascience.com/understanding-data-analysis-step-by-step
Turn Data into Data Product (2)
Turn Data into Data Product (3)
Turn Data into Data Product (4)

Forecasting
In a nutshell…
“Data Science is about extraction, preparation, analysis, visualization, and maintenance of information. It is a cross-disciplinary field which uses scientific methods and processes to draw insights from data.”
DATA?

Text Image/Video Audio


DATA

Data is a collection of factual information based on numbers, words, observations and measurements, which can be used for calculation, discussion and reasoning.
The 6Vs of Big Data
Data Science Process
1.2 Importance of DS

Why Data Science?

● Fuel of the 21st century
● Problem of demand & supply
● A lucrative career
● Data Science is changing the world
● Our future
Purpose of Data Science
● The principal purpose of Data Science is to find patterns within data.

● It uses various statistical techniques to analyse and draw insights from the data.

● A Data Scientist then has the responsibility of making predictions from the data.

● The goal of a Data Scientist is to derive conclusions from the data.
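
The points above (find a pattern, analyse it with a statistical technique, then predict) can be illustrated with a minimal sketch. It assumes a hypothetical toy dataset of advertising spend and units sold, and it uses an ordinary least-squares line as one possible statistical technique; the data, the function names `fit_line` and `predict`, and the choice of technique are illustrative assumptions, not part of the course material.

```python
from statistics import mean

# Hypothetical toy data (assumption for illustration only):
# advertising spend and the units sold in the same period.
ad_spend = [1.0, 2.0, 3.0, 4.0, 5.0]
units_sold = [12.0, 14.0, 16.0, 18.0, 20.0]


def fit_line(xs, ys):
    """Find the pattern: least-squares fit of y = slope * x + intercept."""
    x_bar, y_bar = mean(xs), mean(ys)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    sxx = sum((x - x_bar) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    return slope, intercept


def predict(x, slope, intercept):
    """Make a prediction from the fitted pattern."""
    return slope * x + intercept


slope, intercept = fit_line(ad_spend, units_sold)
forecast = predict(6.0, slope, intercept)
print(f"pattern: units = {slope:.1f} * spend + {intercept:.1f}")
print(f"predicted units at spend 6.0: {forecast:.1f}")
```

The conclusion a Data Scientist might draw here is the direction and size of the relationship; with real data the fit would be noisy and would need checking before anyone acts on it.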
Data Science is needed for…

● Better Marketing

● Customer Acquisition

● Innovation

● Enriching lives
Importance of DS in Business
Business Intelligence for Making Smarter Decisions

● Understanding the problem
● Quantifying data
● Implementing tools
● Translating insights
Making Better Products
● Companies should be able to attract their customers towards products.

● They should develop products that suit the requirements of customers.

● The process involves the analysis of customer reviews to find the best fit for the products.
Managing Businesses Efficiently

● Businesses can predict the success rate of their strategies.

● For example: monitoring the performance of employees.

● Using this, managers can analyse the contributions made by employees and decide on promotions, perks, etc.
Predictive Analytics to Predict
Outcomes

● Detecting fraud
● Optimizing marketing campaigns
● Improving operations
● Reducing risk
Leveraging Data for Business
Decisions

● With a plethora of data and the necessary data tools, it is now possible for data industries to make calculated, data-driven decisions.

● Predictions are necessary for businesses to learn about future outcomes.
Assessing Business Decisions

● Businesses should understand how these decisions affect their performance and growth.

● If a decision leads to any negative factor, they should analyse it and eliminate the problem that is slowing down their performance.
Automating Recruitment Processes
● Major businesses can attract thousands of resumes for a single position.

● Data science technologies such as image recognition can convert the visual information in a resume into a digital format.

● The data is then processed using analytical algorithms such as clustering and classification to identify the right candidate for the job.
1.3 BD & Data Analytics
Big Data
● Big Data is the extraction, analysis, management and processing of a large volume of data.

● It revolves around the data type.

● Can you give one type of data that is available around us?
Example of Big Data
• Descriptive: What is happening?
• Diagnostic: Why is it happening?
• Predictive: What is likely to happen?
• Prescriptive: What do I need to do?
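
The four questions above can be made concrete with a small sketch. It assumes a hypothetical table of monthly sales with a discount flag. The records, the function names and the `min_uplift` threshold are illustrative assumptions, and each function is a deliberately naive stand-in for the kind of analysis that question calls for.

```python
from statistics import mean

# Hypothetical monthly records (assumption for illustration only):
# (month, units_sold, discount_offered)
records = [
    (1, 100, False),
    (2, 140, True),
    (3, 110, False),
    (4, 150, True),
    (5, 120, False),
    (6, 160, True),
]


def descriptive(rows):
    """What is happening? Average units sold per month."""
    return mean(units for _, units, _ in rows)


def diagnostic(rows):
    """Why is it happening? Sales uplift in discount months vs. other months."""
    with_discount = mean(u for _, u, d in rows if d)
    without_discount = mean(u for _, u, d in rows if not d)
    return with_discount - without_discount


def predictive(rows):
    """What is likely to happen? Last value plus the average monthly change."""
    units = [u for _, u, _ in rows]
    changes = [later - earlier for earlier, later in zip(units, units[1:])]
    return units[-1] + mean(changes)


def prescriptive(rows, min_uplift=20):
    """What do I need to do? Recommend a discount if the uplift is large enough."""
    if diagnostic(rows) >= min_uplift:
        return "offer discount"
    return "no discount"


print("Descriptive :", descriptive(records))
print("Diagnostic  :", diagnostic(records))
print("Predictive  :", predictive(records))
print("Prescriptive:", prescriptive(records))
```

Each step builds on the previous one: the prescriptive rule reuses the diagnostic result, which mirrors how recommendations in practice depend on first understanding what happened and why.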
Data Analyst
1.4 Types of Data Scientists
Who can be the Data Scientist?
In a nutshell, we have learned about…

01 History
02 Importance of Data Science
03 Big Data
04 Types of Data Scientists
Coming up next…
1.5 Types of analytics
1.6 Analytics process model
1.7 Related Software/Tools
1.8 Data science applications
