
PYTHON FOR DATA SCIENCE
By Cognitive Class

DHILANI E (16PGM09)
What does the course provide?
• What Python is and why it is useful.
• Python-based tools for retrieving and working with data.
• Basics of data science with Python.
• Methods and applications of Python.
• Dealing with data.
"Data Scientist: the coolest job of the 21st century" (Harvard Business Review)
Why Python?
• It is a widely used, general-purpose, high-level programming language.
• It is a good beginner's language because it is easy to learn and to maintain.
• It supports GUI programming, and its library is portable.
• It can be used to create applications that are portable across Mac, Windows, and Unix systems.
Computer Science + Mathematics/Statistics + Visualization = Data Science
• Example: web companies like Facebook, Amazon, Google, and LinkedIn use this.
DATA outline
• Harvesting
• Cleaning
• Analyzing
• Visualizing
• Publishing
Data Harvesting
• Also called web scraping.
• It is the process in which a small script automatically extracts large amounts of data from websites.
• It is a cheap and easy way to collect online data.
• Data sources: own system, other services, locally available data, data dumps from the web, web documents.
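The "small script" idea above can be sketched with Python's standard library alone. This is a minimal illustration, not a full scraper: the `SAMPLE_HTML` string and its example.com links are made-up stand-ins for a downloaded page, and `LinkCollector` and `extract_links` are hypothetical names chosen for this example.

```python
from html.parser import HTMLParser

# Hypothetical page content; a real script would download this from a website.
SAMPLE_HTML = """
<html><body>
  <a href="https://example.com/data1.csv">Dataset 1</a>
  <a href="https://example.com/data2.csv">Dataset 2</a>
  <p>No link here</p>
</body></html>
"""


class LinkCollector(HTMLParser):
    """Collects the href attribute of every <a> tag it sees."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        # attrs is a list of (name, value) pairs for the current tag.
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.links.append(value)


def extract_links(html):
    """Return all link targets found in an HTML string, in document order."""
    parser = LinkCollector()
    parser.feed(html)
    return parser.links


links = extract_links(SAMPLE_HTML)
print(links)
```

In practice the HTML would come from a web request, and a site's terms of use should be checked before harvesting from it.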
Data Cleansing (Preprocessing)
• Harvested data may come with a lot of noise, which has to be detected first. Ex: scatter plot.
• Data preprocessing: provides a structured representation for analysis. Ex: network, graph.
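Besides looking at a scatter plot, noise can also be flagged numerically. The sketch below is one possible approach, not the method from the course: it assumes a modified z-score rule based on the median absolute deviation, with a threshold of 3.5 and the usual 0.6745 scaling constant. The `readings` values and the `remove_outliers` name are invented for illustration.

```python
from statistics import median


def remove_outliers(values, threshold=3.5):
    """Drop values whose modified z-score exceeds the threshold (assumed rule)."""
    med = median(values)
    # Median absolute deviation: a spread measure that outliers barely affect.
    mad = median(abs(v - med) for v in values)
    if mad == 0:
        # All values are (almost) identical; nothing can be flagged.
        return list(values)
    return [v for v in values if 0.6745 * abs(v - med) / mad <= threshold]


# Hypothetical sensor readings with one obviously noisy entry (55.0).
readings = [10.1, 9.8, 10.3, 55.0, 10.0, 9.9]
cleaned = remove_outliers(readings)
print(cleaned)
```

A median-based rule is used here because a single extreme value would distort a mean and standard deviation computed from so few points.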
Data Analyzing
• Analyzing the data.
• NumPy (offers an efficient multidimensional array) and SciPy (scipy.org, builds on top of NumPy) are used.
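As a small illustration of the multidimensional array mentioned above, the following sketch builds a 2-D NumPy array and computes per-column statistics. The numbers are made up, and the example assumes NumPy is installed.

```python
import numpy as np

# A 2-D array: two rows (samples) and three columns (features); values are invented.
data = np.array([[1.0, 2.0, 3.0],
                 [4.0, 5.0, 6.0]])

print(data.shape)

# Mean of each column, computed along axis 0 (down the rows).
col_means = data.mean(axis=0)

# Broadcasting subtracts the column means from every row without an explicit loop.
centered = data - col_means

print(col_means)
print(centered)
```

Operating on whole arrays at once, rather than looping over elements in Python, is what makes this kind of analysis efficient.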
Data Visualization
• Python interface for the Graphviz layout engine.
• Graphviz is a collection of graph layout programs.
• Matplotlib
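Graphviz layout programs such as `dot` read graph descriptions written in the DOT language. The sketch below only builds such a description as plain text in Python; it does not call Graphviz or any Python interface to it. The `to_dot` helper and the graph name `pipeline` are hypothetical, and the nodes are simply the stages from the data outline.

```python
def to_dot(edges, name="pipeline"):
    """Build a DOT description of a directed graph from (source, target) pairs."""
    lines = [f"digraph {name} {{"]
    for src, dst in edges:
        # Quote node names so that labels containing spaces stay valid.
        lines.append(f'    "{src}" -> "{dst}";')
    lines.append("}")
    return "\n".join(lines)


stages = ["Harvesting", "Cleaning", "Analyzing", "Visualizing", "Publishing"]
# Connect each stage to the one that follows it.
edges = list(zip(stages, stages[1:]))

dot_text = to_dot(edges)
print(dot_text)
```

Saved to a file, this text could be rendered by the Graphviz command-line tools, assuming they are installed.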
Data Publishing
• Open data should be available for use (for the benefit of most people).
• Examples of open dataset types:
1. Government data
2. Life science
3. Commerce
4. Social media
5. Culture
Thank you
