
Chapter - 1

Data Handling Using Pandas - I


SERIES
Creating a series
1. Creating an empty series.
2. Creating a series from a list.
3. Creating a series using range.
4. Creating a scalar series.
5. Creating a series from a dictionary.
6. Creating a series using NaN values.

1. Creating an empty series.


Syntax:
import pandas as pd
s = pd.Series()
print(s)

Result: Empty series

2. Creating a series from a list.


Syntax:
import pandas as pd
s = pd.Series([Data], index=[ ])
print(s)

Example:
import pandas as pd
s = pd.Series([1, 2, 3, 4], index=['a', 'b', 'c', 'd'])
print(s)

Output:
a 1
b 2
c 3
d 4

3. Creating a series using range.

Syntax:
import pandas as pd
s = pd.Series(range(start value, end value, skip), index=[ ])
print(s)

Example:
import pandas as pd
s = pd.Series(range(1, 15, 3), index=['a', 'b', 'c', 'd', 'e'])
print(s)

Output:
a 1
b 4
c 7
d 10
e 13

Ps: the end value is not included (if it is 15, the range only goes up to 14).
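
The note about the end value can be checked by turning the range into a list before building the series. This is only a small sketch; the variable names are made up for illustration.

```python
import pandas as pd

# range(1, 15, 3) stops before 15, so it produces five values.
values = list(range(1, 15, 3))
print(values)  # [1, 4, 7, 10, 13]

# The index needs exactly one label per value.
s = pd.Series(values, index=['a', 'b', 'c', 'd', 'e'])
print(len(s))  # 5
```

If the number of index labels does not match the number of values, pandas raises an error instead of creating the series.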

4. Creating a scalar series.


Syntax:
import pandas as pd
s = pd.Series(number, index=[ ])
print(s)

Example:
import pandas as pd
s = pd.Series(100, index=['a', 'b', 'c', 'd'])
print(s)

Output:
a 100
b 100
c 100
d 100

5. Creating a series from a dictionary.
Syntax:
import pandas as pd
s = pd.Series({key : value})
print(s)

Example:
import pandas as pd
s = pd.Series({'a': 100, 'b': 200, 'c': 300, 'd': 400})
print(s)

Output:
a 100
b 200
c 300
d 400

Ps: a dictionary is a set of key : value pairs written inside curly brackets, with a colon (:) between each key and its value. The keys become the index and the values become the data.
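
To see which part of the dictionary ends up where, the index and the data can be printed separately. The sketch below reuses the dictionary from the example; the variable name marks is a hypothetical one chosen for illustration.

```python
import pandas as pd

marks = {'a': 100, 'b': 200, 'c': 300, 'd': 400}
s = pd.Series(marks)

# Keys of the dictionary become the index labels.
print(list(s.index))  # ['a', 'b', 'c', 'd']

# Values of the dictionary become the data.
print(s.tolist())     # [100, 200, 300, 400]
```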

6. Creating a series using NaN values.


Syntax:
import pandas as pd
import numpy as np
s = pd.Series([Data], index=[ ])
print(s)

Example:
import pandas as pd
import numpy as np
s = pd.Series([10, np.nan, 100], index=['a', 'b', 'c'])
print(s)

Output:
a 10.0
b NaN
c 100.0

Ps: use np.nan for NaN values (the older spelling np.NaN no longer exists in NumPy 2.0 and later).
Ps: if even one value is NaN, all the values are stored as floats.
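
The effect of a single NaN on the data type can be checked directly. The following is a minimal sketch, assuming NumPy and pandas are installed; the series without_nan is added only for comparison.

```python
import numpy as np
import pandas as pd

with_nan = pd.Series([10, np.nan, 100], index=['a', 'b', 'c'])
print(with_nan.dtype)         # float64
print(with_nan.hasnans)       # True
print(with_nan.isna().sum())  # 1 (number of NaN values)

# Without NaN the same numbers stay integers.
without_nan = pd.Series([10, 100])
print(without_nan.dtype)      # an integer dtype (typically int64)
```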

Numpy
1) Arange

Syntax: np.arange(start value, end value, skip)

Example:
import pandas as pd
import numpy as np
data = np.arange(1, 13, 3)
s = pd.Series(data, index=['a', 'b', 'c', 'd'])
print(s)

Output:
a 1
b 4
c 7
d 10
2) Array

Syntax: np.array([data])

Example:
import pandas as pd
import numpy as np
data = np.array(['a', 'b', 'c', 'd'])
s = pd.Series(data)
print(s)

Output:
0 a
1 b
2 c
3 d

head( ) and tail( )

Syntax:
print(s.head())
print(s.tail())

Example:
import pandas as pd
s = pd.Series([1, 2, 3, 4, 5], index=['a', 'b', 'c', 'd', 'e'])
print(s.head(3))
print(s.tail(3))

Output:
(for head)
a 1
b 2
c 3

(for tail)
c 3
d 4
e 5
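
When no number is given, head() and tail() return five elements. The sketch below shows this on a ten-element series made up for illustration.

```python
import pandas as pd

s = pd.Series(range(1, 11))  # values 1 to 10

print(s.head().tolist())   # [1, 2, 3, 4, 5]   first five by default
print(s.tail().tolist())   # [6, 7, 8, 9, 10]  last five by default
print(s.head(2).tolist())  # [1, 2]
```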

Mathematical operations on Series.


Note: both series should have the same index. Any index label that is present in only one of them gives NaN in the output.

Syntax: print(s1 + s2)

print(s1 - s2)

etc

Example:
import pandas as pd
s = pd.Series([1, 2, 3])
t = pd.Series([1, 2, 4])
print(s + t)
print(s * t)

Output:

s+t      s*t
0 2      0 1
1 4      1 4
2 7      2 12
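
The note about matching indexes is easier to see with two series whose labels only partly overlap. This is an illustrative sketch with made-up data; the add() call with fill_value is not part of the notes above and is shown only as one possible way to avoid the NaN results.

```python
import pandas as pd

s1 = pd.Series([1, 2, 3], index=['a', 'b', 'c'])
s2 = pd.Series([10, 20, 30], index=['b', 'c', 'd'])

# Labels 'a' and 'd' exist in only one series, so their sum is NaN.
total = s1 + s2
print(total)
# a     NaN
# b    12.0
# c    23.0
# d     NaN

# add() with fill_value treats a missing value as 0 instead.
filled = s1.add(s2, fill_value=0)
print(filled.tolist())  # [1.0, 12.0, 23.0, 30.0]
```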

Series Attributes
s.ndim      Number of dimensions (always 1 for a series)
s.nbytes    Number of elements x bytes per element (8 bytes each for int64 or float64 data)
s.size      Number of elements
s.dtype     Returns the datatype
s.empty     Checks whether the series is empty. If yes, gives True. If no, gives False
s.hasnans   Checks whether the series contains any NaN value. If yes, gives True. If no, gives False
s.index     Returns the index. A default index is shown as a range, e.g. RangeIndex(start=0, stop=5, step=1)
s.values    Gives back the data as an array
s.shape     Number of rows and columns in tuple format: (number of rows, ), with the column part left blank for a series
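
The attributes above can be tried on one small series. The data here is made up for illustration, and the comments show what each attribute returns for this particular series.

```python
import numpy as np
import pandas as pd

s = pd.Series([10.0, np.nan, 30.0, 40.0])

print(s.ndim)     # 1
print(s.size)     # 4
print(s.shape)    # (4,)
print(s.dtype)    # float64
print(s.nbytes)   # 32 (4 elements x 8 bytes each for float64)
print(s.empty)    # False
print(s.hasnans)  # True
print(s.index)    # RangeIndex(start=0, stop=4, step=1)
print(s.values)   # the data as a NumPy array
```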

Naming a series and index.
Syntax:
For series
s.name = 'new name'

For index
s.index.name = 'shop1'
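
A short sketch of both names in use follows. The item labels and the name 'sales' are hypothetical values chosen for illustration.

```python
import pandas as pd

s = pd.Series([250, 300, 150], index=['pen', 'book', 'eraser'])

s.name = 'sales'        # name of the series itself
s.index.name = 'shop1'  # name shown above the index labels

print(s)  # the printout now carries both names
print(s.name, s.index.name)
```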

Vector operations on series


Syntax:
print(s + 2)
print(s > 4)
print(s == 2) etc

Example:
import pandas as pd
s = pd.Series([1, 2, 3, 4, 5], index=['a', 'b', 'c', 'd', 'e'])
print(s * 3)
print(s > 2)

Output:

s*3      s>2
a 3      a False
b 6      b False
c 9      c True
d 12     d True
e 15     e True

Updation and Filtration
Syntax:

Filtration: print(s[s (>, <, ==) value])

Updation: s[index label] = new value
print(s)

Example:
import pandas as pd
s = pd.Series([1, 2, 3, 4, 5], index=['a', 'b', 'c', 'd', 'e'])
print(s[s < 3])
s['e'] = 6
print(s)

Output:

s[s < 3]     s after s['e'] = 6
a 1          a 1
b 2          b 2
             c 3
             d 4
             e 6
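
Filtration works in two steps: the comparison first produces a series of True/False values, and that series is then used to pick the matching elements. The sketch below separates the two steps; the names mask and small are made up for illustration.

```python
import pandas as pd

s = pd.Series([1, 2, 3, 4, 5], index=['a', 'b', 'c', 'd', 'e'])

mask = s < 3          # True where the value is below 3
print(mask.tolist())  # [True, True, False, False, False]

small = s[mask]       # keeps only the elements marked True
print(small.tolist()) # [1, 2]

s['e'] = 6            # updation by index label
print(s['e'])         # 6
```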

Deleting an element from the series


Syntax: print(s.drop(index))

Example:
import pandas as pd
s = pd.Series([10, 20, 30, 40, 50])
print(s.drop(2))       # the value 30 is left out of the result
print(s.drop([1, 3]))  # multiple values are left out
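
One point worth checking is that drop() returns a new series and leaves the original unchanged unless the result is assigned back. A minimal sketch, with the variable name shorter chosen for illustration:

```python
import pandas as pd

s = pd.Series([10, 20, 30, 40, 50])

shorter = s.drop([1, 3])  # remove the elements at index 1 and 3
print(shorter.tolist())   # [10, 30, 50]
print(len(s))             # 5, the original still has every element

s = s.drop(2)             # assign back to keep the change
print(s.tolist())         # [10, 20, 40, 50]
```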

Indexing and Slicing
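Syntax: print(s[index])

print(s[start value : end value])

Example:

import pandas as pd
s = pd.Series([1, 2, 3, 4, 5], index=['a', 'b', 'c', 'd', 'e'])
print(s[0])
print(s['b'])
print(s[:3])
print(s[3:])
print(s[-3:])

Output:
s[0]   s['b']   s[:3]   s[3:]   s[-3:]
1      2        a 1     d 4     c 3
                b 2     e 5     d 4
                c 3             e 5

Note: when the index is made of text labels, s[0] is read as a position. Newer pandas versions warn about this use and recommend s.iloc[0] instead.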

dropna() and drop_duplicates()


Syntax:

print(s.dropna()) -> to delete NaN values

print(s.drop_duplicates()) -> to delete duplicate values
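
A small example of both methods on the same data may help. The series below is made up for illustration; note that both methods return a new series and keep the original index labels of the elements that remain.

```python
import numpy as np
import pandas as pd

s = pd.Series([10, 20, np.nan, 20, 10, 30])

no_nan = s.dropna()           # the NaN at index 2 is removed
print(no_nan.tolist())        # [10.0, 20.0, 20.0, 10.0, 30.0]

unique = s.drop_duplicates()  # only the first copy of each value is kept
print(list(unique.index))     # [0, 1, 2, 5]
```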

loc and iloc
loc: label based
Syntax: print(s.loc[label])
iloc: integer position based
Syntax: print(s.iloc[position])

Example:

import pandas as pd
s = pd.Series([10, 20, 30, 40, 50], index=['a', 'b', 'c', 'd', 'e'])
print(s.loc['a':'c'])   # note: in loc the end value is included
print(s.iloc[2:5])      # in iloc the end position is not included

Output:

s.loc['a':'c']    s.iloc[2:5]
a 10              c 30
b 20              d 40
c 30              e 50
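
The difference in how the end of a slice is treated can be compared side by side. This is a minimal sketch using the same data as the example; the variable names are made up for illustration.

```python
import pandas as pd

s = pd.Series([10, 20, 30, 40, 50], index=['a', 'b', 'c', 'd', 'e'])

by_label = s.loc['a':'c']    # label slice: the end label 'c' is included
by_position = s.iloc[0:2]    # position slice: position 2 is not included

print(by_label.tolist())     # [10, 20, 30]
print(by_position.tolist())  # [10, 20]

print(s.loc['d'])            # 40, single element by label
print(s.iloc[-1])            # 50, last element by position
```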

Multiple selection
Syntax:
print(s[[1st value, 2nd value]])

Example:

import pandas as pd
s = pd.Series([10, 20, 30, 40, 50], index=['a', 'b', 'c', 'd', 'e'])
print(s[['a', 'e']])

Output:
a 10
e 50

Sorting the data in a series
Syntax:

s.sort_values(ascending=True) -> ascending

s.sort_values(ascending=False) -> descending
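
The two calls above can be tried on a small series. The data is made up for illustration; the sketch also shows that sort_values() returns a new series, that each value keeps its own index label, and that the original order is left untouched.

```python
import pandas as pd

s = pd.Series([30, 10, 50, 20], index=['a', 'b', 'c', 'd'])

ascending = s.sort_values(ascending=True)
descending = s.sort_values(ascending=False)

print(ascending.tolist())     # [10, 20, 30, 50]
print(list(ascending.index))  # ['b', 'd', 'a', 'c']
print(descending.tolist())    # [50, 30, 20, 10]
print(s.tolist())             # [30, 10, 50, 20], original unchanged
```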
