In [12]:
import pandas as pd

df=pd.read_csv("weather_data.csv")

df

Out[12]:
          day  temperature  windspeed   event
0  01-01-2017         32.0        6.0    Rain
1  01-04-2017          NaN        9.0   Sunny
2  01-05-2017         28.0        NaN    Snow
3  01-06-2017          NaN        7.0     NaN
4  01-07-2017         32.0        NaN    Rain
5  01-08-2017          NaN        NaN   Sunny
6  01-09-2017          NaN        NaN     NaN
7  01-10-2017         34.0        8.0  Cloudy
8  01-11-2017         40.0       12.0   Sunny

isnull()
The isnull() method returns a DataFrame of the same shape in which every value is replaced by a Boolean: True where the original value is missing (NaN or None), and False otherwise.

In [17]:
import pandas as pd

df=pd.read_csv("weather_data.csv")

df1=df.isnull()

print(df1.to_string())

     day  temperature  windspeed  event
0  False        False      False  False
1  False         True      False  False
2  False        False       True  False
3  False         True      False   True
4  False        False       True  False
5  False         True       True  False
6  False         True       True   True
7  False        False      False  False
8  False        False      False  False
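
Because True counts as 1 when summed, the Boolean result of isnull() can be reduced to a count of missing values per column. The following is a minimal sketch, assuming the table shown earlier is rebuilt inline so that it runs without weather_data.csv; the names missing_per_column and total_missing are illustrative.

```python
import numpy as np
import pandas as pd

# Inline copy of the weather table (assumption: mirrors weather_data.csv)
df = pd.DataFrame({
    "day": ["01-01-2017", "01-04-2017", "01-05-2017", "01-06-2017", "01-07-2017",
            "01-08-2017", "01-09-2017", "01-10-2017", "01-11-2017"],
    "temperature": [32.0, np.nan, 28.0, np.nan, 32.0, np.nan, np.nan, 34.0, 40.0],
    "windspeed": [6.0, 9.0, np.nan, 7.0, np.nan, np.nan, np.nan, 8.0, 12.0],
    "event": ["Rain", "Sunny", "Snow", np.nan, "Rain", "Sunny", np.nan, "Cloudy", "Sunny"],
})

# Sum the Boolean mask column by column: each True adds 1
missing_per_column = df.isnull().sum()

# Grand total of missing cells in the whole DataFrame
total_missing = int(missing_per_column.sum())

print(missing_per_column)
print("Total missing cells:", total_missing)
```

A per-column count like this is often a quick first check before deciding whether to drop or fill the missing values.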

notnull()
The notnull() method is the inverse of isnull(): it returns a DataFrame in which every value is replaced with True where the original value is not missing, and False otherwise.

In [21]:
import pandas as pd

df=pd.read_csv("weather_data.csv")

df1=df.notnull()

print(df1.to_string())

    day  temperature  windspeed  event
0  True         True       True   True
1  True        False       True   True
2  True         True      False   True
3  True        False       True  False
4  True         True      False   True
5  True        False      False   True
6  True        False      False  False
7  True         True       True   True
8  True         True       True   True
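
A common use of notnull() is as a Boolean mask that selects only the rows where a given column has a value. The sketch below assumes the same table rebuilt inline (so no CSV file is needed); the name has_temperature is illustrative.

```python
import numpy as np
import pandas as pd

# Inline copy of the weather table (assumption: mirrors weather_data.csv)
df = pd.DataFrame({
    "day": ["01-01-2017", "01-04-2017", "01-05-2017", "01-06-2017", "01-07-2017",
            "01-08-2017", "01-09-2017", "01-10-2017", "01-11-2017"],
    "temperature": [32.0, np.nan, 28.0, np.nan, 32.0, np.nan, np.nan, 34.0, 40.0],
    "windspeed": [6.0, 9.0, np.nan, 7.0, np.nan, np.nan, np.nan, 8.0, 12.0],
    "event": ["Rain", "Sunny", "Snow", np.nan, "Rain", "Sunny", np.nan, "Cloudy", "Sunny"],
})

# Keep only the rows where a temperature reading is present
has_temperature = df[df["temperature"].notnull()]
print(has_temperature)

# notnull() is the element-wise negation of isnull()
same_as_negated = df.notnull().equals(~df.isnull())
print(same_as_negated)
```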

dropna()
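One way to deal with empty cells is to remove the rows that contain them. By default, dropna() returns a new DataFrame without any row that has at least one missing value; the original DataFrame is left unchanged.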

In [22]:
import pandas as pd

df=pd.read_csv("weather_data.csv")

df1=df.dropna()

print(df1.to_string())

          day  temperature  windspeed   event
0  01-01-2017         32.0        6.0    Rain
7  01-10-2017         34.0        8.0  Cloudy
8  01-11-2017         40.0       12.0   Sunny
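
Dropping every row with any missing value discards six of the nine rows here. As a sketch of a less aggressive approach, dropna() also accepts subset (which columns to inspect) and how ("any" or "all"). The table is rebuilt inline as an assumption, and the variable names are illustrative.

```python
import numpy as np
import pandas as pd

# Inline copy of the weather table (assumption: mirrors weather_data.csv)
df = pd.DataFrame({
    "day": ["01-01-2017", "01-04-2017", "01-05-2017", "01-06-2017", "01-07-2017",
            "01-08-2017", "01-09-2017", "01-10-2017", "01-11-2017"],
    "temperature": [32.0, np.nan, 28.0, np.nan, 32.0, np.nan, np.nan, 34.0, 40.0],
    "windspeed": [6.0, 9.0, np.nan, 7.0, np.nan, np.nan, np.nan, 8.0, 12.0],
    "event": ["Rain", "Sunny", "Snow", np.nan, "Rain", "Sunny", np.nan, "Cloudy", "Sunny"],
})

# Drop a row only if its temperature is missing; other columns are ignored
with_temperature = df.dropna(subset=["temperature"])

# Drop a row only if BOTH temperature and windspeed are missing
with_any_reading = df.dropna(subset=["temperature", "windspeed"], how="all")

print(with_temperature)
print(with_any_reading)
```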

Passing thresh=3 keeps only the rows that have at least 3 non-missing values:

In [23]:
import pandas as pd

df=pd.read_csv("weather_data.csv")

df1=df.dropna(thresh=3)

print(df1.to_string())

          day  temperature  windspeed   event
0  01-01-2017         32.0        6.0    Rain
1  01-04-2017          NaN        9.0   Sunny
2  01-05-2017         28.0        NaN    Snow
4  01-07-2017         32.0        NaN    Rain
7  01-10-2017         34.0        8.0  Cloudy
8  01-11-2017         40.0       12.0   Sunny
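
To make the thresh rule concrete, the sketch below checks that dropna(thresh=3) selects the same rows as a manual filter on the per-row count of non-missing values. The table is rebuilt inline as an assumption, and valid_counts is an illustrative name.

```python
import numpy as np
import pandas as pd

# Inline copy of the weather table (assumption: mirrors weather_data.csv)
df = pd.DataFrame({
    "day": ["01-01-2017", "01-04-2017", "01-05-2017", "01-06-2017", "01-07-2017",
            "01-08-2017", "01-09-2017", "01-10-2017", "01-11-2017"],
    "temperature": [32.0, np.nan, 28.0, np.nan, 32.0, np.nan, np.nan, 34.0, 40.0],
    "windspeed": [6.0, 9.0, np.nan, 7.0, np.nan, np.nan, np.nan, 8.0, 12.0],
    "event": ["Rain", "Sunny", "Snow", np.nan, "Rain", "Sunny", np.nan, "Cloudy", "Sunny"],
})

# Number of non-missing values in each row (axis=1 sums across columns)
valid_counts = df.notnull().sum(axis=1)

# Manual equivalent of thresh=3: keep rows with at least 3 valid values
manual = df[valid_counts >= 3]
by_thresh = df.dropna(thresh=3)

print(valid_counts.tolist())
print(by_thresh.equals(manual))
```

With four columns, thresh=4 requires every value to be present, so it gives the same result as calling dropna() with no arguments.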

fillna()
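The fillna() method replaces empty cells with a specified value. Passing a single value, such as 0, fills every missing cell in every column, including the text column event: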

In [24]:
import pandas as pd

df=pd.read_csv("weather_data.csv")

df1 = df.fillna(0)

df1

Out[24]:
          day  temperature  windspeed   event
0  01-01-2017         32.0        6.0    Rain
1  01-04-2017          0.0        9.0   Sunny
2  01-05-2017         28.0        0.0    Snow
3  01-06-2017          0.0        7.0       0
4  01-07-2017         32.0        0.0    Rain
5  01-08-2017          0.0        0.0   Sunny
6  01-09-2017          0.0        0.0       0
7  01-10-2017         34.0        8.0  Cloudy
8  01-11-2017         40.0       12.0   Sunny

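Passing a dictionary fills each listed column with its own value; columns that are not listed, such as event, keep their NaN values.

In [25]: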
import pandas as pd

df=pd.read_csv("weather_data.csv")

df1 = df.fillna({'temperature': 0,'windspeed':0})

print(df1)

          day  temperature  windspeed   event
0  01-01-2017         32.0        6.0    Rain
1  01-04-2017          0.0        9.0   Sunny
2  01-05-2017         28.0        0.0    Snow
3  01-06-2017          0.0        7.0     NaN
4  01-07-2017         32.0        0.0    Rain
5  01-08-2017          0.0        0.0   Sunny
6  01-09-2017          0.0        0.0     NaN
7  01-10-2017         34.0        8.0  Cloudy
8  01-11-2017         40.0       12.0   Sunny
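
A zero temperature or windspeed is rarely a realistic stand-in for a missing reading. As one possible alternative, the sketch below fills each numeric column with its own mean and the text column with a placeholder label. The inline table and the label "No Event" are assumptions made for illustration, not part of the original data.

```python
import numpy as np
import pandas as pd

# Inline copy of the weather table (assumption: mirrors weather_data.csv)
df = pd.DataFrame({
    "day": ["01-01-2017", "01-04-2017", "01-05-2017", "01-06-2017", "01-07-2017",
            "01-08-2017", "01-09-2017", "01-10-2017", "01-11-2017"],
    "temperature": [32.0, np.nan, 28.0, np.nan, 32.0, np.nan, np.nan, 34.0, 40.0],
    "windspeed": [6.0, 9.0, np.nan, 7.0, np.nan, np.nan, np.nan, 8.0, 12.0],
    "event": ["Rain", "Sunny", "Snow", np.nan, "Rain", "Sunny", np.nan, "Cloudy", "Sunny"],
})

# mean() skips NaN by default, so each mean uses only the observed readings
filled = df.fillna({
    "temperature": df["temperature"].mean(),
    "windspeed": df["windspeed"].mean(),
    "event": "No Event",  # hypothetical placeholder label
})

print(filled)
```

Whether a mean, a constant, or a neighbouring value is the right replacement depends on the data; this only shows the mechanics of per-column filling.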

ffill()
‘ffill’ stands for ‘forward fill’: each missing value is replaced with the last valid observation above it in the same column.

In [26]:
import pandas as pd

df=pd.read_csv("weather_data.csv")

df1 = df.ffill()  # same result as fillna(method='ffill'), which recent pandas versions deprecate

print(df1)

          day  temperature  windspeed   event
0  01-01-2017         32.0        6.0    Rain
1  01-04-2017         32.0        9.0   Sunny
2  01-05-2017         28.0        9.0    Snow
3  01-06-2017         28.0        7.0    Snow
4  01-07-2017         32.0        7.0    Rain
5  01-08-2017         32.0        7.0   Sunny
6  01-09-2017         32.0        7.0   Sunny
7  01-10-2017         34.0        8.0  Cloudy
8  01-11-2017         40.0       12.0   Sunny
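
In the output above, a single windspeed reading of 7.0 is carried across three consecutive gaps. The limit parameter of ffill() caps how many consecutive missing values are filled from one observation. This is a minimal sketch on an inline copy of the table (an assumption, so it runs without the CSV file); the name limited is illustrative.

```python
import numpy as np
import pandas as pd

# Inline copy of the weather table (assumption: mirrors weather_data.csv)
df = pd.DataFrame({
    "day": ["01-01-2017", "01-04-2017", "01-05-2017", "01-06-2017", "01-07-2017",
            "01-08-2017", "01-09-2017", "01-10-2017", "01-11-2017"],
    "temperature": [32.0, np.nan, 28.0, np.nan, 32.0, np.nan, np.nan, 34.0, 40.0],
    "windspeed": [6.0, 9.0, np.nan, 7.0, np.nan, np.nan, np.nan, 8.0, 12.0],
    "event": ["Rain", "Sunny", "Snow", np.nan, "Rain", "Sunny", np.nan, "Cloudy", "Sunny"],
})

# Fill at most one consecutive missing value after each valid observation
limited = df.ffill(limit=1)

# Gaps longer than the limit are only partly filled, so some NaN remain
remaining = limited.isnull().sum()

print(limited)
print(remaining)
```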

bfill()
‘bfill’ stands for ‘backward fill’: each missing value is replaced with the next valid observation below it in the same column.

In [27]:
import pandas as pd

df=pd.read_csv("weather_data.csv")

df1 = df.bfill()  # same result as fillna(method='bfill'), which recent pandas versions deprecate

print(df1)

          day  temperature  windspeed   event
0  01-01-2017         32.0        6.0    Rain
1  01-04-2017         28.0        9.0   Sunny
2  01-05-2017         28.0        7.0    Snow
3  01-06-2017         32.0        7.0    Rain
4  01-07-2017         32.0        8.0    Rain
5  01-08-2017         34.0        8.0   Sunny
6  01-09-2017         34.0        8.0  Cloudy
7  01-10-2017         34.0        8.0  Cloudy
8  01-11-2017         40.0       12.0   Sunny
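
The weather table has valid values in its first and last rows, so neither fill direction leaves a gap. That is not always the case: bfill() cannot fill a missing value at the end of a column, and ffill() cannot fill one at the start. The sketch below uses a small made-up Series (hypothetical data, not taken from weather_data.csv) to show this, along with chaining the two as one possible way to cover both ends.

```python
import numpy as np
import pandas as pd

# Hypothetical readings with gaps at the start, middle, and end
s = pd.Series([np.nan, 1.0, np.nan, 3.0, np.nan])

# Backward fill: the trailing NaN has no later value to copy from
bfilled = s.bfill()

# Forward fill: the leading NaN has no earlier value to copy from
ffilled = s.ffill()

# Forward fill first, then backward fill whatever is still missing
both = s.ffill().bfill()

print(bfilled.tolist())
print(ffilled.tolist())
print(both.tolist())
```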
