
Data Frames

Data Frame
• A Data Frame is a two-dimensional data structure.
• Data is organised in tabular form, in rows and columns.
• Columns can be of different data types.
• Size-mutable: rows and columns can be added or deleted.
• Both axes (rows as well as columns) are labeled.
• Arithmetic operations can be performed on rows as well as columns.
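The properties listed above can be seen in a few lines. This is only a minimal sketch; the names and values (including the "Bonus" column) are made up for illustration.

```python
import pandas as pd

# Hypothetical sample data: one text column and one numeric column.
df = pd.DataFrame({"Name": ["Aman", "Raj"], "Marks": [20, 25]},
                  index=["R1", "R2"])

# Columns can hold different data types.
print(df.dtypes)

# Size-mutable: a new column can be added after creation.
df["Bonus"] = [2, 3]

# Arithmetic on columns: add two columns element by element.
df["Total"] = df["Marks"] + df["Bonus"]

# Arithmetic on rows: sum the numeric columns of each row.
row_sums = df[["Marks", "Bonus"]].sum(axis=1)

print(df)
print(row_sums)
```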
Syntax
pandas.DataFrame(data, index, columns, dtype, copy)
Data:- Data content such as a tuple, list, dict or ndarray.
Index:- Row labels; must be the same length as the data. If no index is specified, the default is 0 to n-1.
Columns:- Column labels. If no column labels are specified, the default is 0 to n-1, i.e. np.arange(n).
Dtype:- Data type to force. If None, the data type is inferred automatically.
Copy:- Whether to copy the input data. Default False (recent pandas versions default to None, which copies dict input but not DataFrame or 2-D ndarray input).
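As a rough illustration of the parameters described above, the sketch below passes all five explicitly and then shows the defaults. The values and labels are hypothetical.

```python
import numpy as np
import pandas as pd

data = np.array([[10, 20], [30, 40]])   # hypothetical values

df = pd.DataFrame(data=data,
                  index=["R1", "R2"],       # row labels
                  columns=["C1", "C2"],     # column labels
                  dtype="float64",          # force a data type
                  copy=True)                # do not share memory with `data`
print(df)
print(df.dtypes)

# Because the data was copied, changing the array does not change the frame.
data[0, 0] = 99
print(df.loc["R1", "C1"])

# With index and columns omitted, both default to 0 .. n-1.
default_df = pd.DataFrame(data)
print(list(default_df.index), list(default_df.columns))
```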
Examples
From a list of lists:
import pandas as pd
D=pd.DataFrame([[10,20],[30,40]])
print(D)
From a tuple of tuples (same output):
import pandas as pd
D=pd.DataFrame(((10,20),(30,40)))
print(D)
Output:-
    0   1
0  10  20
1  30  40
Naming Columns
import pandas as pd
dt=[[10,20],[30,40]]
Cl=["C1","C2"]
D=pd.DataFrame(dt,columns=Cl)
print(D)
Output
   C1  C2
0  10  20
1  30  40
#Creating an Empty data frame
import pandas as pd
df1=pd.DataFrame()
print(df1)
Output
Empty DataFrame
Columns: []
Index: []
#Broadcasting using 1-D array

import pandas as pd
a=[[2,5,6],[5,8,9]]
df=pd.DataFrame(a)
print(df)
Output
   0  1  2
0  2  5  6
1  5  8  9
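The example above only builds the data frame. A possible sketch of the broadcast itself, assuming a hypothetical 1-D array with one value per column, could look like this:

```python
import numpy as np
import pandas as pd

a = [[2, 5, 6], [5, 8, 9]]
df = pd.DataFrame(a)

# Hypothetical 1-D array: one value for each of the three columns.
row = np.array([10, 20, 30])

# The 1-D array is lined up with the columns and added to every row.
result = df + row
print(result)
```

Each row of df receives the same three values, so the first row becomes 12, 25, 36 and the second 15, 28, 39.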
#Broadcasting on two data frames of same sizes
import pandas as pd
a=[2,5,6,7,8]
b=[5,8,9,4,10]
df1=pd.DataFrame(a)
df2=pd.DataFrame(b)
print(df1)
print(df2)
Output
   0
0  2
1  5
2  6
3  7
4  8
    0
0   5
1   8
2   9
3   4
4  10
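The two frames above are only created and printed. As a minimal sketch of the arithmetic step, two data frames of the same size can be combined element by element:

```python
import pandas as pd

df1 = pd.DataFrame([2, 5, 6, 7, 8])
df2 = pd.DataFrame([5, 8, 9, 4, 10])

# Values with the same row and column labels are paired up.
total = df1 + df2
product = df1 * df2

print(total)
print(product)
```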
Naming Rows
import pandas as pd
dt=[[10,20],[30,40]]
Cl=["C1","C2"]
RW=["R1","R2"]
D=pd.DataFrame(dt,columns=Cl,index=RW)
print(D)
Output
    C1  C2      (non-default column labels)
R1  10  20
R2  30  40
R1 and R2 are the non-default index (row labels).
Example
import pandas as pd
Dt={"Name":["Aman","Raj","Jai","Karan"],"Marks":[20,25,30,32]}
STU=pd.DataFrame(Dt,index=[1,2,3,4])
print(STU)
print("Avg Marks:", STU["Marks"].mean())
Output
    Name  Marks
1   Aman     20
2    Raj     25
3    Jai     30
4  Karan     32
Avg Marks: 26.75
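Accessing Rows and Columns
   SName  Marks
5   Aman     20
6    Raj     25
7    Jai     30
8  Karan     32
STU.loc[7] selects a row by its index label; STU.iloc[2] selects a row by its position (counting from 0). Here both return the same row:
SName    Jai
Marks     30
Name: 7, dtype: object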
Accessing Rows and Columns
   SName  Marks
5   Aman     20
6    Raj     25
7    Jai     30
8  Karan     32
STU["SName"] selects a column:
5     Aman
6      Raj
7      Jai
8    Karan
Name: SName, dtype: object
import pandas as pd
Dt={"SName":["Aman","Raj","Jai","Karan"],"Marks":[20,25,30,32]}
STU=pd.DataFrame(Dt,index=[5,6,7,8])
print(STU["SName"])
print(STU["Marks"])
print(STU.loc[6])
print(STU.iloc[2])
Output
5     Aman
6      Raj
7      Jai
8    Karan
Name: SName, dtype: object
5    20
6    25
7    30
8    32
Name: Marks, dtype: int64
SName    Raj
Marks     25
Name: 6, dtype: object
SName    Jai
Marks     30
Name: 7, dtype: object
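The difference between label-based and position-based access is easy to mix up, so here is a small sketch on the same STU frame. The slicing lines go slightly beyond the slides and are included only as an illustration.

```python
import pandas as pd

Dt = {"SName": ["Aman", "Raj", "Jai", "Karan"], "Marks": [20, 25, 30, 32]}
STU = pd.DataFrame(Dt, index=[5, 6, 7, 8])

by_label = STU.loc[7]       # row whose index label is 7
by_position = STU.iloc[2]   # third row (positions start at 0)
print(by_label.equals(by_position))

# A single cell: row label 6, column "Marks".
print(STU.loc[6, "Marks"])

# Slices: loc includes the end label, iloc excludes the end position.
print(STU.loc[5:6])
print(STU.iloc[0:2])
```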
Adding and Deleting rows and columns
import pandas as pd
Dt={"SName":["Aman","Raj","Jai","Sam"],"Mks":[10,9,8,7]}
STU=pd.DataFrame(Dt,index=[5,6,7,8])
STU["Gds"]=['A','A','B','B']
print(STU)
STU=STU.drop(7)
print(STU)
STU=STU.drop(columns="Mks")
print(STU)
Output
  SName  Mks  Gds
5  Aman   10    A
6   Raj    9    A
7   Jai    8    B
8   Sam    7    B
  SName  Mks  Gds
5  Aman   10    A
6   Raj    9    A
8   Sam    7    B
  SName  Gds
5  Aman    A
6   Raj    A
8   Sam    B
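The example above adds a column and deletes a row and a column. Adding a row is not shown; one possible way, sketched here with a hypothetical new student, is to assign to a new index label with loc. Several rows can also be dropped at once by passing a list of labels.

```python
import pandas as pd

Dt = {"SName": ["Aman", "Raj", "Jai", "Sam"], "Mks": [10, 9, 8, 7]}
STU = pd.DataFrame(Dt, index=[5, 6, 7, 8])

# Add a row: assign a list of values to an index label that does not exist yet.
# "Ravi" and 6 are made-up values for illustration.
STU.loc[9] = ["Ravi", 6]
print(STU)

# Delete several rows in one call by passing a list of labels.
smaller = STU.drop([5, 6])
print(smaller)
```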
Combine content of two Data Frames
import pandas as pd
D1={"SName":["Ajay","Ravi"],"Mks":[10,9]}
S1=pd.DataFrame(D1,index=[5,6])
print(S1)
D2={"SName":["Sagar","Rahul"],"Mks":[15,12]}
S2=pd.DataFrame(D2,index=[7,8])
print(S2)
S=pd.concat([S1,S2])   # S1.append(S2) works only in pandas versions before 2.0
print(S)
Output
  SName  Mks
5  Ajay   10
6  Ravi    9
   SName  Mks
7  Sagar   15
8  Rahul   12
   SName  Mks
5   Ajay   10
6   Ravi    9
7  Sagar   15
8  Rahul   12
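When the original row labels are not needed in the combined frame, pd.concat can renumber the rows. A minimal sketch on the same two frames, using the ignore_index parameter:

```python
import pandas as pd

S1 = pd.DataFrame({"SName": ["Ajay", "Ravi"], "Mks": [10, 9]}, index=[5, 6])
S2 = pd.DataFrame({"SName": ["Sagar", "Rahul"], "Mks": [15, 12]}, index=[7, 8])

# Default: the row labels of both frames are kept (5, 6, 7, 8).
keep_labels = pd.concat([S1, S2])

# ignore_index=True: the rows are renumbered from 0.
renumbered = pd.concat([S1, S2], ignore_index=True)

print(keep_labels)
print(renumbered)
```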
Renaming Rows/Columns
• DataFrame.rename()
import pandas as pd
Dt={"SName":["Aman","Raj","Jai","Sam"],"Mks":[10,9,8,7]}
STU=pd.DataFrame(Dt,index=[5,6,7,8])
STU["Gds"]=['A','A','B','B']
print(STU)
a1=STU.rename(index={5:11,6:12,7:13,8:14})   # returns a new data frame
print(STU)   # STU itself is unchanged
print(a1)    # a1 has the new row labels
inplace=True
import pandas as pd
Dt={"SName":["Aman","Raj","Jai","Sam"],"Mks":[10,9,8,7]}
STU=pd.DataFrame(Dt,index=[5,6,7,8])
STU["Gds"]=['A','A','B','B']
print(STU)
STU.rename(index={5:11,6:12,7:13,8:14},inplace=True)   # changes STU itself
print(STU)
Renaming Columns
import pandas as pd
Dt={"SName":["Aman","Raj","Jai","Sam"],"Mks":[10,9,8,7]}
STU=pd.DataFrame(Dt,index=[5,6,7,8])
print(STU)
a2=STU.rename(columns={'SName':'StuName'})   # returns a new data frame
print(STU)   # STU itself is unchanged
print(a2)
import pandas as pd
Dt={"SName":["Aman","Raj","Jai","Sam"],"Mks":[10,9,8,7]}
STU=pd.DataFrame(Dt,index=[5,6,7,8])
print(STU)
STU.rename(columns={'SName':'StuName'},inplace=True)   # changes STU itself
print(STU)
