
INTERNSHIP

PRESENTATION
DATA SCIENCE USING PYTHON
GAYATRI VIDYA PARISHAD COLLEGE OF ENGINEERING (AUTONOMOUS)
ELECTRICAL AND ELECTRONICS ENGINEERING
PANTECH E LEARNING IN ASSOCIATION WITH APSSDC

           
Course in the Internship: Data Science using Python

Name of Course Coordinator and Course Mentor

K. RAVI KUMAR (Associate Professor)
KAKKALA DEEPAK (20131A0218)

2
                          DATA SCIENCE INTERNSHIP CERTIFICATE

3
TOPICS LEARNT IN INTERNSHIP

TOPICS LEARNT IN DATA SCIENCE:

1. NumPy
2. Pandas
3. Matplotlib
4. Groupby using Matplotlib
5. Seaborn

4
ABSTRACT:

Data science is the scientific study of data. It is a multidisciplinary
approach that combines principles and practices from the fields of
mathematics, statistics, and artificial intelligence to analyze large
amounts of data. Data science encompasses a set of principles, problem
definitions, algorithms, and processes for extracting non-obvious and
useful patterns from large data sets. It helps to represent huge amounts
of data in a simple manner. The aim is to present the intern with the main
concepts used in data science using tools developed in Python, such as
scikit-learn, Pandas, NumPy, and others.

5
PYTHON LIBRARIES FOR DATA SCIENCE

MOST POPULAR PYTHON LIBRARIES
• NUMPY
• PANDAS
• SCIPY

DATA VISUALISATION PYTHON LIBRARIES
• MATPLOTLIB
• SEABORN

6
HOW TO INSTALL THESE LIBRARIES
• There are two ways of installing them, depending on whether you have Anaconda Navigator
• If you have Anaconda Navigator: go to cmd and type conda install numpy
• If you don't have Anaconda Navigator: go to cmd and type pip install numpy
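
After installing, it can help to confirm which libraries Python can actually find. The following is a minimal sketch using only the standard library; the helper name missing_libraries is hypothetical, and the list of names simply repeats the libraries mentioned in these slides.

```python
import importlib.util

# Library names taken from the slides; extend the list as needed.
LIBRARIES = ["numpy", "pandas", "scipy", "matplotlib", "seaborn"]


def missing_libraries(names):
    """Return the names that cannot be found in the current environment."""
    return [name for name in names if importlib.util.find_spec(name) is None]


missing = missing_libraries(LIBRARIES)
if missing:
    print("Not installed:", ", ".join(missing))
else:
    print("All libraries are installed.")
```

Any name reported as not installed can then be added with conda install or pip install as described above.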

7
NUMPY

• It is the core library for scientific computing in Python
• Its central object is the NumPy array
• An array is a multidimensional grid of values, all of the same type, indexed with non-negative integers
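
The array properties described above can be checked directly. This is a small sketch with invented values; the variable names grid and value are chosen only for illustration.

```python
import numpy as np

# A 2-D array: every element has the same type (here, integers).
grid = np.array([[1, 2, 3],
                 [4, 5, 6]])

print(grid.ndim)    # number of dimensions: 2
print(grid.shape)   # (rows, columns): (2, 3)
print(grid.dtype)   # one integer dtype shared by all elements

# Indexing uses integers starting at 0: row 1, column 2.
value = grid[1, 2]
print(value)        # 6
```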

8
How to create a NumPy array
Creating an array of random numbers
Creating an array of zeros
Creating an array of ones
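
A possible version of these creation steps is sketched below. The shapes, the seed, and the variable names are assumptions made for the example, and the seeded generator is just one of several ways NumPy offers to produce random numbers.

```python
import numpy as np

from_list = np.array([1, 2, 3, 4])         # array built from a Python list
zeros = np.zeros((2, 3))                   # 2x3 array filled with 0.0
ones = np.ones((2, 3))                     # 2x3 array filled with 1.0

# Random numbers: a seeded generator makes the run repeatable.
rng = np.random.default_rng(seed=0)
random_floats = rng.random((2, 3))         # uniform floats in [0, 1)
random_ints = rng.integers(0, 10, size=5)  # integers from 0 to 9

print(from_list, zeros, ones, random_floats, random_ints, sep="\n")
```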

9
10
THE MAIN DIFFERENCE BETWEEN A NUMPY ARRAY AND A LIST
• THE ABILITY TO BROADCAST
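
To make the difference concrete, the sketch below applies the same operation to a list and to an array. The values are invented for illustration.

```python
import numpy as np

prices = [10, 20, 30]

# With a list, * repeats the list instead of multiplying the values.
repeated = prices * 2                     # [10, 20, 30, 10, 20, 30]

# With an array, the scalar 2 is broadcast to every element.
doubled = np.array(prices) * 2            # array([20, 40, 60])

# Broadcasting also stretches a 1-D row across each row of a 2-D array.
table = np.array([[1, 2, 3],
                  [4, 5, 6]])
shifted = table + np.array([10, 20, 30])  # [[11, 22, 33], [14, 25, 36]]

print(repeated)
print(doubled)
print(shifted)
```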

11
PANDAS

• It is an open-source library built on top of NumPy
• It is used for fast analysis, data cleaning, and data preparation
• There are two types of pandas data structures: Series and DataFrame

12
Creating a pandas Series

Creating a pandas DataFrame

Basic operations to perform on DataFrames
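
The sketch below shows one way these steps might look. All names and values are invented for illustration, and the operations shown (head, describe, filtering, mean) are only a small sample of what is available on a DataFrame.

```python
import pandas as pd

# A Series is a one-dimensional labelled array.
marks = pd.Series([85, 92, 78], index=["maths", "physics", "chemistry"])

# A DataFrame is a two-dimensional table of labelled columns.
students = pd.DataFrame({
    "name": ["Asha", "Bala", "Kiran"],
    "branch": ["EEE", "CSE", "EEE"],
    "score": [81, 74, 90],
})

first_rows = students.head(2)                     # first two rows
summary = students["score"].describe()            # count, mean, min, max, ...
eee_only = students[students["branch"] == "EEE"]  # keep rows matching a condition
mean_score = students["score"].mean()             # average of one column

print(marks["physics"])
print(first_rows)
print(summary)
print(eee_only)
print(mean_score)
```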

13
MATPLOTLIB LIBRARY

• Used for making 2D plots in Python
• A simple plot can be created with just a few commands
• It is mainly used for data visualization

14
DRAWING A SIMPLE GRAPH
DRAWING A PIE CHART
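
Both kinds of drawing can be sketched in a few lines. The data and labels below are invented, and the "Agg" backend is an assumption made so the script runs without a display; in an interactive session that line can be dropped and plt.show() called at the end.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: non-interactive backend, no window is opened
import matplotlib.pyplot as plt

# Invented sample data.
x = [1, 2, 3, 4, 5]
y = [1, 4, 9, 16, 25]

fig, (ax_line, ax_pie) = plt.subplots(1, 2, figsize=(10, 4))

# Simple line graph.
ax_line.plot(x, y, marker="o")
ax_line.set_xlabel("x")
ax_line.set_ylabel("x squared")
ax_line.set_title("Simple graph")

# Pie chart: each value becomes a wedge sized by its share of the total.
shares = [40, 35, 25]
labels = ["NumPy", "Pandas", "Matplotlib"]
ax_pie.pie(shares, labels=labels, autopct="%1.0f%%")
ax_pie.set_title("Pie chart")
```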

15
PROJECT
MANIPULATING DATA USING PANDAS
PLOTTING USING MATPLOTLIB

• Reading a CSV file
• Finding how many rows and columns the file has
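
A minimal sketch of these two steps follows. The project's actual file is not shown in the slides, so invented CSV text stands in for it; with a real file on disk, its path would be passed to pd.read_csv instead.

```python
import io

import pandas as pd

# Invented CSV text standing in for a real file.
csv_text = """name,branch,score,city
Asha,EEE,81,Vizag
Bala,CSE,,Guntur
Kiran,EEE,90,
"""

df = pd.read_csv(io.StringIO(csv_text))

n_rows, n_cols = df.shape   # (number of rows, number of columns)
print(n_rows, n_cols)       # 3 4
print(df.columns.tolist())  # column names read from the header line
```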

16
• Checking whether the file has null values or not
• Taking only particular columns
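
One possible way to carry out these checks is sketched below. The DataFrame is invented for illustration and contains two deliberately missing values.

```python
import numpy as np
import pandas as pd

# Invented data with two missing values.
df = pd.DataFrame({
    "name": ["Asha", "Bala", "Kiran"],
    "branch": ["EEE", "CSE", "EEE"],
    "score": [81, np.nan, 90],
    "city": ["Vizag", "Guntur", None],
})

has_nulls = df.isnull().values.any()  # True if any cell is missing
nulls_per_column = df.isnull().sum()  # count of missing cells per column

# Keep only particular columns by passing a list of column names.
subset = df[["name", "score"]]

print(has_nulls)
print(nulls_per_column)
print(subset)
```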

17
• Saving options
• The saved file will be in the same folder as the Python file
• Code required for saving in a particular format
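
The saving step might look like the sketch below. The file names are hypothetical and the data is invented. A bare file name such as "selected.csv" would be written to the folder Python is run from; a temporary folder is used here only so the example cleans up after itself.

```python
import os
import tempfile

import pandas as pd

df = pd.DataFrame({"name": ["Asha", "Bala"], "score": [81, 74]})

with tempfile.TemporaryDirectory() as folder:
    # CSV format: index=False leaves out the row labels.
    csv_path = os.path.join(folder, "selected.csv")
    df.to_csv(csv_path, index=False)
    reloaded = pd.read_csv(csv_path)  # read back to confirm the contents

    # JSON format: the output format follows the method that is called.
    json_path = os.path.join(folder, "selected.json")
    df.to_json(json_path, orient="records")
    json_written = os.path.exists(json_path)

print(reloaded)
```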

18
PLOTTING THE DATA
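
As an illustration of plotting data held in a DataFrame, the sketch below groups invented rows by one column and draws the group means as a bar chart. The column names and values are assumptions, not the project's data, and the "Agg" backend is chosen only so the script runs without a display.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: non-interactive backend, no window is opened
import matplotlib.pyplot as plt
import pandas as pd

# Invented data.
df = pd.DataFrame({
    "branch": ["EEE", "CSE", "EEE", "CSE", "MECH"],
    "score": [80, 70, 90, 60, 75],
})

# Group rows by branch and take the mean score of each group.
mean_by_branch = df.groupby("branch")["score"].mean()

fig, ax = plt.subplots()
ax.bar(list(mean_by_branch.index), list(mean_by_branch.values))
ax.set_xlabel("Branch")
ax.set_ylabel("Mean score")
ax.set_title("Mean score by branch")
```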

19
Thank you