
Data Science

IQ Trainings
OUTLINE

Topics Covered:
Introduction to Data Science
Core Components of Data Science
Types of Data Scientists
What is Big Data?
Challenges in Big Data
What is Hadoop?
What is Machine Learning?
Tools for Data Science
What is Data Science?
Data science is the study of data. It is an interdisciplinary field that uses scientific methods to develop, store, and analyze data in order to extract knowledge and insights from unstructured data. Data science is closely related to data mining and big data.
What is Data?
Data is the individual units of information used for processing.

How it works?
Data science is a combination of computer science, mathematics, and statistics. A data scientist must know these components, analyze the statistics, and extract information from unstructured data.

How is Data used in Science?
To extract meaningful data, data science uses artificial intelligence and machine learning to predict future patterns and behavior.

Data Science Components

Core Components of Data Science
1. Data Architecture
2. Machine Learning
3. Analytics

Types of Data Scientists
1. Data Businesspeople
2. Data Creatives
3. Data Researchers
4. Data Developers
What is Big Data?
Big data is a technology designed to analyze, process, and extract information from large data sets. Systems and enterprises generate huge amounts of data, from terabytes up to petabytes of information.

TYPES OF BIG DATA

STRUCTURED DATA:
In structured data, the data that is processed, stored, and retrieved is in a fixed format.

UNSTRUCTURED DATA:
Unstructured data cannot be fitted cleanly into the rows and columns of a relational database.

SEMI-STRUCTURED DATA:
Semi-structured data lies between structured and unstructured data. It does not obey the formal structure associated with relational databases.
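The three data types can be illustrated with a minimal Python sketch. The sample records, field names, and the `field_names` helper are hypothetical and exist only for illustration: CSV stands in for structured data, JSON for semi-structured data, and a plain sentence for unstructured data.

```python
import csv
import io
import json

# Structured: every row has the same fixed columns, like a relational table.
structured_source = "id,name,age\n1,Asha,31\n2,Ben,45\n"
rows = list(csv.DictReader(io.StringIO(structured_source)))

# Semi-structured: keys describe the data, but records need not share a schema.
semi_structured_source = (
    '[{"id": 1, "name": "Asha", "tags": ["new"]},'
    ' {"id": 2, "email": "ben@example.com"}]'
)
records = json.loads(semi_structured_source)

# Unstructured: free text with no predefined fields at all.
unstructured_source = "Asha called on Monday to ask about her order."
words = unstructured_source.split()


def field_names(items):
    """Return the sorted union of keys found across a list of dict records."""
    return sorted({key for item in items for key in item})


print(field_names(rows))     # same three columns on every row
print(field_names(records))  # union of keys that differ per record
print(len(words))            # only raw tokens are available
```

Every CSV row exposes the same columns, while the two JSON records carry different keys, which is why semi-structured data does not map directly onto a relational table.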
Challenges in Big Data
The most important challenges in big data are:

1. Data Capturing
2. Data Storage
3. Data Analysis
4. Data Transfer
5. Visualization
6. Data Privacy
7. Data Source
8. Information Privacy, etc.
What is Hadoop?
Apache Hadoop is an open-source data management framework used for processing large data sets. In 2004, Google published a paper on MapReduce, a parallel processing model with an associated implementation for processing large amounts of data. Later, the Apache open-source project adopted this model in a framework named Hadoop.
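The MapReduce model described above can be sketched in plain Python as a word count. This is a single-process illustration of the map, shuffle, and reduce steps, not Hadoop's API; the function names and sample documents are hypothetical.

```python
from collections import defaultdict


def map_phase(document):
    """Map: emit a (word, 1) pair for every word in one document."""
    return [(word.lower(), 1) for word in document.split()]


def shuffle_phase(pairs):
    """Shuffle: group all emitted values by their key."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped


def reduce_phase(key, values):
    """Reduce: combine all values for one key into a single result."""
    return key, sum(values)


def word_count(documents):
    # On a cluster, each document (or split) would be mapped on a separate
    # node in parallel; here the phases simply run one after another.
    mapped = [pair for doc in documents for pair in map_phase(doc)]
    grouped = shuffle_phase(mapped)
    return dict(reduce_phase(key, values) for key, values in grouped.items())


documents = ["big data is big", "data science uses data"]
counts = word_count(documents)
print(counts)
```

Because each map call looks at only one document and each reduce call looks at only one key, both phases can be spread across many machines, which is what makes the model suitable for large data sets.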
Key Characteristics of Hadoop
1. Scalable
2. Flexible
3. Reliable
4. Economical
5. Robust Ecosystem
Machine Learning in Data Science
Machine learning is an application of data science that focuses on developing computer programs that can access data and learn from it automatically.

Apache Mahout algorithms are implemented on top of Apache Hadoop using MapReduce.
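As a rough illustration of a program that learns from data, the sketch below fits a straight line to a few example points with ordinary least squares and then predicts an unseen value. It is plain Python and is not how Mahout works; the `fit_line` helper and the hours/scores data are hypothetical.

```python
def fit_line(xs, ys):
    """Learn slope and intercept from examples using ordinary least squares."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    covariance = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    variance = sum((x - mean_x) ** 2 for x in xs)
    slope = covariance / variance
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Hypothetical training data that follows score = 2 * hours + 1.
hours = [1.0, 2.0, 3.0, 4.0]
scores = [3.0, 5.0, 7.0, 9.0]

slope, intercept = fit_line(hours, scores)

# Use the learned parameters to predict a value not seen during training.
prediction = slope * 5.0 + intercept
print(slope, intercept, prediction)
```

The relationship between hours and scores is never written into the program; it is recovered from the examples, which is the sense in which the program learns automatically.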
Applications of Machine Learning
1. Medical Diagnosis
2. Image Processing
3. Prediction
4. Virtual Personal Assistants (Siri, Alexa, Google)
5. Video Surveillance
6. Social Media Services
7. Online Customer Support
8. Refining Search Engine Results, etc.
Tools for Data Science
1. SAS
2. Apache Spark
3. BigML
4. D3.js
5. MATLAB
6. Excel
7. ggplot2
8. Tableau
9. Jupyter
10. Matplotlib
11. TensorFlow
12. Weka
13. NLTK
Thank You
MAILING ADDRESS
411 Walnut Street
Suite # 8295
Green Cove Springs
FL-32043-3443

IQ Trainings
PHONE NUMBER
+1 904-304-2519
732-593-8450

E-MAIL ADDRESS
info@iqtrainings.com
