Datascience Set A
Out[26]:
    name  age Percentage
0   Ajay   21        76%
1  Vijay   20        80%
2   Riya   19        75%
3  Priya   20        88%
4    Ram   21        67%
5   Ajay   21       None
8  Priya   20        88%
9    Ram   21        NaN
In [27]: df['Remark']=None
In [28]: df
Out[28]:
name age Percentage Remark
1 of 7 07/08/23, 14:02
datascience Set A - Jupyter Notebook http://localhost:8888/notebooks/datascience%20Set%...
In [29]: df.describe()
Out[29]:
        name  age  Percentage  Remark
count     10   10          10       0
unique     5    4           7       0
freq       2    4           2     NaN
In [25]: df.isnull()
Out[25]:
name age Percentage Remark
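The boolean frame returned by `df.isnull()` is usually summarised per column. A minimal sketch, assuming a frame rebuilt from only the rows visible in the output above (rows 6 and 7 are not shown in the source, so they are left out here):

```python
import numpy as np
import pandas as pd

# Rows copied from the visible output; rows 6 and 7 are not shown
# in the source, so this is a partial reconstruction.
df = pd.DataFrame(
    {
        "name": ["Ajay", "Vijay", "Riya", "Priya", "Ram", "Ajay", "Priya", "Ram"],
        "age": [21, 20, 19, 20, 21, 21, 20, 21],
        "Percentage": ["76%", "80%", "75%", "88%", "67%", None, "88%", np.nan],
    },
    index=[0, 1, 2, 3, 4, 5, 8, 9],
)
df["Remark"] = None  # same step as In [27]

# isnull() gives True/False per cell; summing counts the True values per column.
missing_per_column = df.isnull().sum()
print(missing_per_column)
```

Both `None` and `NaN` count as missing, so the two empty `Percentage` cells are reported together.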
In [32]: df.duplicated()
Out[32]: 0 False
1 False
2 False
3 False
4 False
5 False
6 False
7 False
8 True
9 False
dtype: bool
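Row 8 is flagged because it repeats row 3 exactly. One possible follow-up, sketched on a frame rebuilt from the visible rows only (rows 6 and 7 are missing from the source and are omitted), is to inspect both copies and then drop the repeat:

```python
import numpy as np
import pandas as pd

# Partial reconstruction from the visible rows; rows 6 and 7 are not in the source.
df = pd.DataFrame(
    {
        "name": ["Ajay", "Vijay", "Riya", "Priya", "Ram", "Ajay", "Priya", "Ram"],
        "age": [21, 20, 19, 20, 21, 21, 20, 21],
        "Percentage": ["76%", "80%", "75%", "88%", "67%", None, "88%", np.nan],
    },
    index=[0, 1, 2, 3, 4, 5, 8, 9],
)

# keep=False marks every member of a duplicate group, not only the later copies.
all_copies = df[df.duplicated(keep=False)]
print(all_copies)

# drop_duplicates() keeps the first occurrence by default and returns a new frame.
deduped = df.drop_duplicates()
print(deduped)
```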
In [33]: df.drop(columns='Remark', inplace=True)
In [34]: df
Out[34]:
    name  age Percentage
0   Ajay   21        76%
1  Vijay   20        80%
2   Riya   19        75%
3  Priya   20        88%
4    Ram   21        67%
5   Ajay   21       None
8  Priya   20        88%
9    Ram   21        NaN
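The `Percentage` column holds strings such as `76%` alongside `None` and `NaN`, so it cannot be averaged as it stands. A hedged sketch of one way to convert it, using only the `Percentage` values visible above (the new names `pct` and `mean_pct` are illustrative, not from the notebook):

```python
import numpy as np
import pandas as pd

# Percentage values copied from the visible rows (indices 6 and 7 are not shown).
percentage = pd.Series(
    ["76%", "80%", "75%", "88%", "67%", None, "88%", np.nan],
    index=[0, 1, 2, 3, 4, 5, 8, 9],
    name="Percentage",
)

# Strip the trailing '%' and convert to float; missing entries become NaN.
pct = pd.to_numeric(percentage.str.rstrip("%"), errors="coerce")

# mean() skips NaN by default, so only the six known values are averaged.
mean_pct = pct.mean()
print(pct)
print(mean_pct)
```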
Out[54]:
Id SepalLengthCm SepalWidthCm PetalLengthCm PetalWidthCm Species
In [55]: df.shape
Out[55]: (150, 6)
In [56]: df.describe()
Out[56]:
Id SepalLengthCm SepalWidthCm PetalLengthCm PetalWidthCm
In [57]: df.info()
<class 'pandas.core.frame.DataFrame'>
RangeIndex: 150 entries, 0 to 149
Data columns (total 6 columns):
# Column Non-Null Count Dtype
--- ------ -------------- -----
0 Id 150 non-null int64
1 SepalLengthCm 150 non-null float64
2 SepalWidthCm 150 non-null float64
3 PetalLengthCm 150 non-null float64
4 PetalWidthCm 150 non-null float64
5 Species 150 non-null object
dtypes: float64(4), int64(1), object(1)
memory usage: 7.2+ KB
In [58]: df.dtypes
Out[58]: Id int64
SepalLengthCm float64
SepalWidthCm float64
PetalLengthCm float64
PetalWidthCm float64
Species object
dtype: object
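The dtype listing can also drive column selection. The sketch below uses a stand-in frame with the same column names; the four measurement columns are filled with `NaN` placeholders because their values are not shown in the source:

```python
import numpy as np
import pandas as pd

# Stand-in for the Iris frame: Id and Species follow the source,
# measurement columns are NaN placeholders (values not shown in the source).
df = pd.DataFrame(
    {
        "Id": np.arange(1, 151),
        "SepalLengthCm": np.nan,
        "SepalWidthCm": np.nan,
        "PetalLengthCm": np.nan,
        "PetalWidthCm": np.nan,
        "Species": np.repeat(["Iris-setosa", "Iris-versicolor", "Iris-virginica"], 50),
    }
)

# Split the columns by dtype: integer and float columns versus everything else.
numeric_cols = df.select_dtypes(include="number").columns.tolist()
other_cols = df.select_dtypes(exclude="number").columns.tolist()
print(numeric_cols)
print(other_cols)
```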
In [61]: df.head(20)
Out[61]:
Id SepalLengthCm SepalWidthCm PetalLengthCm PetalWidthCm Species
In [60]: df.tail()
Out[60]:
Id SepalLengthCm SepalWidthCm PetalLengthCm PetalWidthCm Species
In [62]: df.sample(10)
Out[62]:
Id SepalLengthCm SepalWidthCm PetalLengthCm PetalWidthCm Species
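`df.sample(10)` draws different rows on every run. If a repeatable sample is wanted, passing `random_state` is one option. A small sketch on a two-column stand-in frame (the seed value 42 is an arbitrary choice, not from the notebook):

```python
import numpy as np
import pandas as pd

# Stand-in with only the Id and Species columns, built from the counts in the source.
df = pd.DataFrame(
    {
        "Id": np.arange(1, 151),
        "Species": np.repeat(["Iris-setosa", "Iris-versicolor", "Iris-virginica"], 50),
    }
)

# The same seed yields the same ten rows each time.
first_draw = df.sample(n=10, random_state=42)
second_draw = df.sample(n=10, random_state=42)
print(first_draw)
```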
In [63]: df.size
Out[63]: 900
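The value 900 is simply rows times columns, that is 150 × 6 from `df.shape`. A quick check on a stand-in frame of the same shape (measurement columns are `NaN` placeholders, since their values are not shown in the source):

```python
import numpy as np
import pandas as pd

# Stand-in with the same shape as the Iris frame; measurements are placeholders.
df = pd.DataFrame(
    {
        "Id": np.arange(1, 151),
        "SepalLengthCm": np.nan,
        "SepalWidthCm": np.nan,
        "PetalLengthCm": np.nan,
        "PetalWidthCm": np.nan,
        "Species": np.repeat(["Iris-setosa", "Iris-versicolor", "Iris-virginica"], 50),
    }
)

rows, cols = df.shape      # (number of rows, number of columns)
total_cells = df.size      # total number of cells in the frame
print(rows, cols, total_cells)
```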
In [64]: df.columns
In [65]: df['Species'].value_counts()
Out[65]: Iris-setosa 50
Iris-versicolor 50
Iris-virginica 50
Name: Species, dtype: int64
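When proportions are more useful than raw counts, `value_counts` accepts `normalize=True`. A sketch on a `Species` series rebuilt from the counts shown above:

```python
import numpy as np
import pandas as pd

# Species column rebuilt from the counts in Out[65] (50 of each class).
species = pd.Series(
    np.repeat(["Iris-setosa", "Iris-versicolor", "Iris-virginica"], 50),
    name="Species",
)

counts = species.value_counts()                 # absolute frequencies
shares = species.value_counts(normalize=True)   # relative frequencies summing to 1
print(counts)
print(shares)
```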
In [67]: sliced_data=df[10:20]
print(sliced_data)
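A plain slice such as `df[10:20]` selects rows by position, with the end excluded, so it returns rows 10 to 19. The same selection can be written more explicitly with `iloc`. A sketch on a two-column stand-in frame:

```python
import numpy as np
import pandas as pd

# Stand-in with only the Id and Species columns, built from the counts in the source.
df = pd.DataFrame(
    {
        "Id": np.arange(1, 151),
        "Species": np.repeat(["Iris-setosa", "Iris-versicolor", "Iris-virginica"], 50),
    }
)

sliced_data = df[10:20]       # positional row slice, end exclusive
same_rows = df.iloc[10:20]    # explicit positional indexing, same result
print(sliced_data)
```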
In [69]: specific_data=df[["Id","Species"]]
print(specific_data)
Id Species
0 1 Iris-setosa
1 2 Iris-setosa
2 3 Iris-setosa
3 4 Iris-setosa
4 5 Iris-setosa
.. ... ...
145 146 Iris-virginica
146 147 Iris-virginica
147 148 Iris-virginica
148 149 Iris-virginica
149 150 Iris-virginica
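Column selection can be combined with a row filter through `loc`. The sketch below assumes the rows are ordered in three blocks of 50 per species, which matches the first and last rows printed above but is not confirmed for the middle of the frame:

```python
import numpy as np
import pandas as pd

# Stand-in frame; the block ordering of species is an assumption.
df = pd.DataFrame(
    {
        "Id": np.arange(1, 151),
        "Species": np.repeat(["Iris-setosa", "Iris-versicolor", "Iris-virginica"], 50),
    }
)

# Boolean mask picks the rows, the list picks the columns.
virginica = df.loc[df["Species"] == "Iris-virginica", ["Id", "Species"]]
print(virginica.head())
```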