For example, below are the prices of properties in city X, showing the
area of each house and its total price.
A totally different view of the data can reveal interesting and important
features. Consider a time-based dataframe.
● We can extract parts of the date into separate columns such as year,
month, day, week, etc.
● We can find the number of days between two dates.
● We can create new features such as whether the day is a weekend or a
weekday.
● We can create features such as whether the day is a holiday.
Below is an example where we have extracted the month and week of the
year. Similarly, we can create more features like day, year, weekend, etc.
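import pandas as pd

# Read the file (assumes it contains a 'date' column)
df = pd.read_csv('housing_price.csv')
df['date'] = pd.to_datetime(df['date'])

# Extract the month and the ISO week of the year.
# The .week attribute of DatetimeIndex was removed in pandas 2.0;
# .dt.isocalendar().week is its replacement.
df['month'] = df['date'].dt.month
df['week'] = df['date'].dt.isocalendar().week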
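The other date-based features listed above (days between two dates, a weekend flag, a holiday flag) can be sketched in the same way. This is a minimal illustration, not the original author's code: the sample dates, the reference date, the holiday list, and the column names are all assumptions made up for the example.

```python
import pandas as pd

# Hypothetical sample data; dates, prices, and column names are assumptions.
df = pd.DataFrame({
    "date": pd.to_datetime(["2021-01-01", "2021-01-02", "2021-01-04"]),
    "price": [250000, 310000, 275000],
})

# Day of week: Monday=0 ... Sunday=6, so 5 and 6 are the weekend.
df["day_of_week"] = df["date"].dt.dayofweek
df["is_weekend"] = df["day_of_week"] >= 5

# Number of days between each date and a reference date (assumed here).
reference_date = pd.Timestamp("2021-01-10")
df["days_to_reference"] = (reference_date - df["date"]).dt.days

# Holiday flag from a hand-written list (an assumption; a real project
# would use a proper holiday calendar for the relevant region).
holidays = pd.to_datetime(["2021-01-01"])
df["is_holiday"] = df["date"].isin(holidays)

print(df)
```

Whether a hand-written holiday list is good enough depends on the data; it is used here only to keep the sketch self-contained.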
Feature Construction
Density = mass/volume
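As a small sketch of this kind of construction, a new column can be derived from two existing ones. The measurements and column names below are hypothetical and chosen only for illustration.

```python
import pandas as pd

# Hypothetical measurements; values and column names are assumptions.
samples = pd.DataFrame({
    "mass_kg": [2.0, 10.0, 4.5],
    "volume_m3": [1.0, 4.0, 1.5],
})

# Construct the new feature from the two existing columns.
samples["density_kg_m3"] = samples["mass_kg"] / samples["volume_m3"]

print(samples)
```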
Another example:
If we have a patient dataset consisting of the attributes Name, Patient ID,
Height, Weight, and Age, and we are interested in the weight category
(overweight, underweight, normal weight) a patient falls into, then the
feature BMI (Body Mass Index) i.e.