Aim:
To perform feature extraction with correlation (bivariate) analysis and categorization using
Python/R.
Steps:
1. Data Preparation:
• Create a correlation matrix for specific columns related to air quality indices
3. Visualize the correlation matrix:
• Plot a heatmap of the correlations between the air quality indices to help identify
relationships.
5. Feature selection based on the correlation matrix:
• Calculate the correlation between every column and the "PM2.5 AQI Value" column.
• Sort the results and select features whose correlation exceeds 0.25 in absolute value.
• Print the selected features to identify which variables correlate most strongly
with the target variable.
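The feature selection step above can be sketched as follows. This is a minimal illustration under stated assumptions, not the code used in the experiment: the data is synthetic, the column names other than "PM2.5 AQI Value" are hypothetical, and select_correlated_features is a helper name introduced only for this example.

```python
import numpy as np
import pandas as pd

# Synthetic stand-in data (assumption): real air quality data would be
# loaded from a file instead. Only "PM2.5 AQI Value" is named in the text;
# the other column names are hypothetical.
rng = np.random.default_rng(0)
n = 500
pm25 = rng.uniform(0, 200, n)
df = pd.DataFrame({
    "PM2.5 AQI Value": pm25,
    "feature_strong": pm25 + rng.normal(0, 10, n),          # strong positive relation
    "feature_negative": -0.1 * pm25 + rng.normal(0, 5, n),  # moderate negative relation
    "feature_noise": rng.uniform(0, 100, n),                # unrelated to the target
})


def select_correlated_features(frame, target, threshold=0.25):
    """Return the features whose |Pearson correlation| with target exceeds threshold."""
    corr_matrix = frame.corr(numeric_only=True)     # pairwise Pearson correlations
    target_corr = corr_matrix[target].drop(target)  # drop the target's self-correlation
    selected = target_corr[target_corr.abs() > threshold]
    # Sort by absolute value so that strong negative correlations rank highly too.
    return selected.sort_values(key=lambda s: s.abs(), ascending=False)


selected = select_correlated_features(df, "PM2.5 AQI Value", threshold=0.25)
print("Correlated Features:")
print(selected)
```

Filtering on the absolute value keeps negatively correlated features, which a plain "greater than 0.25" comparison would discard.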
Importing the dataset and forming the Correlation Matrix
Python Code:
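import pandas as pd
# Load the dataset; the file name is not given in the source, so "job_placement.csv" is a placeholder
job_placement_df = pd.read_csv("job_placement.csv")
# Display the first few rows of the DataFrame to understand its structure
print(job_placement_df.head())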
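Python Code:
import pandas as pd
import seaborn as sns
import matplotlib.pyplot as plt
# Correlation matrix of the numeric columns (job_placement_df is the DataFrame loaded earlier)
corr_matrix = job_placement_df.corr(numeric_only=True)
plt.figure(figsize=(10, 8))
sns.heatmap(corr_matrix, annot=True, cmap='coolwarm', fmt=".2f", linewidths=.5)
plt.title('Correlation Matrix of Job Placement Dataset')
plt.show()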
Output
Python Code:
import pandas as pd
import seaborn as sns
import matplotlib.pyplot as plt
Output
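Python Code:
import pandas as pd
# Assumes job_placement_df is loaded and target_column (a hypothetical name) holds the target column's name
correlations = job_placement_df.corr(numeric_only=True)[target_column].drop(target_column)
correlated_features = correlations[correlations.abs() > 0.25].sort_values(key=abs, ascending=False)
print("Correlated Features:")
print(correlated_features)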
Output
Record ( 5 )
Total ( 25 )
Result:
In this experiment, feature extraction with correlation (bivariate) analysis and
categorization using Python/R was implemented, and the output was verified successfully.