Missingvaluetreatment-Ex 1 Code
Vishwesh Singbal
3. If you are provided a file with .ipynb extension (as you will be in subsequent classes), then go to File →
Upload Notebook, locate the .ipynb file and upload it.
4. To import a .csv or .xlsx data file in Colab, follow the instructions below.
Because Colab runs on a temporary virtual machine, an uploaded file is available only for the duration of the session.
The next time you open the notebook, you will have to upload the data file again.
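Once a data file is in the session, it still has to be read into a DataFrame. The sketch below is one possible way to do that and is not necessarily the route these notes intend: `load_table` is a hypothetical helper, and `sample.csv` is a throwaway file created only so the example runs on its own. In Colab, a file can be uploaded through the Files pane or with `files.upload()` from `google.colab`, and its name is then passed to the reader.

```python
import os
import tempfile

import pandas as pd


def load_table(path):
    """Read a .csv or .xlsx file into a DataFrame, chosen by file extension."""
    extension = os.path.splitext(path)[1].lower()
    if extension == ".csv":
        return pd.read_csv(path)
    if extension == ".xlsx":
        return pd.read_excel(path)  # needs an Excel engine such as openpyxl
    raise ValueError(f"Unsupported file type: {extension}")


# Demonstration with a throwaway CSV so the sketch runs anywhere.
# In Colab you would instead pass the name of the file you uploaded.
with tempfile.TemporaryDirectory() as folder:
    sample_path = os.path.join(folder, "sample.csv")  # hypothetical file name
    with open(sample_path, "w") as handle:
        handle.write("Name,Age\nA,25\nB,-\n")
    df_demo = load_table(sample_path)

print(df_demo)
```

Note that the "-" in the sample is read as an ordinary string, which is why a column like this arrives as text rather than numbers.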
Data analytics and Decision making Dr. Vishwesh Singbal
# Display information about the DataFrame, including data types and missing values
>>df1.info()
# Replace specific values ('-', '@@', '#') in the "Age" column with NaN and convert the column to float data
# type
>>df1["Age"] = df1["Age"].replace(['-', '@@', '#'], np.nan).astype('float')
# Print the updated DataFrame
>>print(df1)
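The effect of the "Age" step can be checked on a small frame. This is a minimal sketch on made-up values (`toy` is a hypothetical stand-in for `df1`, whose data file is not reproduced here); it uses the same `replace(...).astype('float')` pattern.

```python
import numpy as np
import pandas as pd

# Made-up sample standing in for df1; the real data file is not reproduced here.
toy = pd.DataFrame({"Age": ["25", "-", "31", "@@", "#"]})

# Swap each placeholder for NaN, then convert the remaining strings to float.
toy["Age"] = toy["Age"].replace(["-", "@@", "#"], np.nan).astype("float")

print(toy["Age"].dtype)         # float64
print(toy["Age"].isna().sum())  # 3
```

The three placeholder cells become NaN, and the column changes from text to float64, which is what `info()` reports after the conversion.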
# Replace specific values ('?', 'nuLL') in the "Own_house" column with NaN and convert the column to float
# data type
>>df1.Own_house = df1.Own_house.replace(["?", 'nuLL'], np.nan).astype('float')
>>print(df1)
>>df1.info()
# Replace specific values ('nAN', '###') in the "Income_2020" column with NaN and convert the column to
# float data type
>>df1.Income_2020 = df1.Income_2020.replace(["nAN", '###'], np.nan).astype('float')
>>print(df1)
>>df1.info()
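The three conversions so far follow one pattern, so they could also be driven from a single mapping. The sketch below is an alternative arrangement, not the method used in these notes: the `placeholders` dictionary and the `toy` frame are hypothetical, and only the column names and placeholder strings are taken from the steps above.

```python
import numpy as np
import pandas as pd

# Made-up rows; the column names and placeholder strings are the ones used above.
toy = pd.DataFrame({
    "Age": ["25", "-", "@@", "40"],
    "Own_house": ["1", "?", "0", "nuLL"],
    "Income_2020": ["50000", "nAN", "###", "72000"],
})

# Map each column to the placeholder strings that mark a missing value in it.
placeholders = {
    "Age": ["-", "@@", "#"],
    "Own_house": ["?", "nuLL"],
    "Income_2020": ["nAN", "###"],
}

# Apply the same replace-then-convert step to every listed column.
for column, markers in placeholders.items():
    toy[column] = toy[column].replace(markers, np.nan).astype("float")

missing_counts = toy.isna().sum()
print(toy.dtypes)
print(missing_counts)
```

`isna().sum()` gives the number of missing values per column, which complements the non-null counts shown by `info()`.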
# Replace specific values ('###') in the "Income_2021" column with NaN and convert the column to float data
# type
>>df1.Income_2021 = df1.Income_2021.replace(['###'], np.nan).astype("float")
>>df1.info()
>>print(df1)
# The code above raises a ValueError: the values in "Income_2021" contain commas (","), so pandas
# cannot convert these strings to float.
# Hence first replace specific values ('###') in the "Income_2021" column with NaN
>>df1.Income_2021 = df1.Income_2021.replace(['###'], np.nan)
>>df1.info()
>>print(df1)
# Then remove commas and convert the "Income_2021" column to float data type
>>df1.Income_2021 = df1.Income_2021.str.replace(',', '', regex=False).astype("float")
>>df1.info()
>>print(df1)
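The comma problem can be reproduced on a few made-up values. This sketch assumes the column holds comma-grouped strings plus the '###' placeholder; the `income` series is hypothetical. It also shows that `Series.replace(",", "")` matches only cells that are exactly ",", whereas `.str.replace` removes commas inside each string.

```python
import numpy as np
import pandas as pd

# Made-up values shaped like "Income_2021": comma-grouped strings plus a placeholder.
income = pd.Series(["1,20,000", "###", "85,500"], name="Income_2021")
income = income.replace(["###"], np.nan)

# Direct conversion fails because "1,20,000" is not a valid float literal.
try:
    income.astype("float")
    conversion_failed = False
except ValueError:
    conversion_failed = True

# Series.replace(",", "") only matches cells that are exactly ",", so nothing changes here.
unchanged = income.replace(",", "")

# .str.replace works inside each string and leaves NaN as NaN.
cleaned = income.str.replace(",", "", regex=False).astype("float")

print(conversion_failed)
print(unchanged.tolist())
print(cleaned.tolist())
```

Replacing the placeholder before stripping commas keeps the two concerns separate: missing values stay NaN throughout, and only genuine numbers reach the float conversion.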