Introduction
● Converting the data into the correct tabular form is one of the first steps in
data preprocessing.
Data Represented in a Table
Data should be arranged in a two-dimensional space made of rows and columns.
A table with m rows and n columns has size (m, n).
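As a small illustration of the (m, n) size, the sketch below builds a toy table with pandas and reads its dimensions. The column names and values are made up for the example.

```python
import pandas as pd

# Hypothetical toy table: 3 rows (m) and 2 columns (n)
table = pd.DataFrame({
    "height_cm": [170, 165, 180],
    "weight_kg": [68, 59, 82],
})

# shape is a (rows, columns) tuple
m, n = table.shape
print(m, n)
```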
• Outliers
• Human error
• Sparse data
• Special characters
Feature Matrix and Target Vector
A single piece of data is called a scalar.
Feature matrix data is made up of independent columns, and the target vector depends
on the feature matrix columns.
Independent variables: Car Model, Car Capacity, Car Brand
Dependent variable: Car Price
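The car example can be sketched in pandas as follows. Only the column roles come from the slide; the rows are invented placeholders, and the names X and y are a common convention rather than a requirement.

```python
import pandas as pd

# Hypothetical rows; only the column names come from the slide
cars = pd.DataFrame({
    "Car Model": ["A1", "B2", "C3"],
    "Car Capacity": [4, 5, 7],
    "Car Brand": ["BrandX", "BrandY", "BrandZ"],
    "Car Price": [15000, 22000, 31000],
})

# Feature matrix: the independent variables (two-dimensional)
X = cars[["Car Model", "Car Capacity", "Car Brand"]]

# Target vector: the dependent variable (one-dimensional)
y = cars["Car Price"]

print(X.shape, y.shape)
```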
Loading a Sample Dataset and Creating Feature Matrix and Target Matrix
1. Import the pandas library
import pandas as pd
2. Load the dataset into a pandas DataFrame
dataset = "filename"
df = pd.read_csv(dataset, header=0)
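To try the loading step without a file on disk, a minimal sketch can feed read_csv an in-memory string. The CSV text here is a made-up stand-in for the real dataset, and header=0 tells pandas that the first line holds the column names.

```python
import io
import pandas as pd

# Made-up CSV content standing in for the real file
csv_text = (
    "Address,Avg. Area Income,Price\n"
    "1 Main St,65000,250000\n"
    "2 Oak Ave,72000,310000\n"
)

# header=0: use the first line as column names
df = pd.read_csv(io.StringIO(csv_text), header=0)
print(df.head())
```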
4. Total number of rows
df.index        # shows the row index; len(df.index) gives the row count
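A short sketch of the row count, using a made-up three-row DataFrame: the index object describes the row labels, and its length is the number of rows.

```python
import pandas as pd

# Hypothetical data with three rows
df = pd.DataFrame({"Price": [250000, 310000, 198000]})

print(df.index)            # the row index object
n_rows = len(df.index)     # number of rows
print(n_rows)
```

df.shape[0] returns the same count.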
5. Set the Address column as the index
Syntax: DataFrame.set_index('column name', inplace=True)
df.set_index('Address', inplace=True)
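The effect of set_index can be seen on a small made-up DataFrame: the chosen column becomes the row labels and no longer appears among the regular columns. With inplace=True the DataFrame is modified directly instead of a new one being returned.

```python
import pandas as pd

# Hypothetical rows for illustration
df = pd.DataFrame({
    "Address": ["1 Main St", "2 Oak Ave"],
    "Price": [250000, 310000],
})

# Move the Address column into the index, modifying df itself
df.set_index("Address", inplace=True)

print(df.index.name)
print(list(df.columns))
```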
7. Retrieve the first five rows and first three columns
df.iloc[0:5, 0:3]
8. Retrieve the data using labels
df.loc[0:4, ["Avg. Area Income", "Avg. Area House Age"]]
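The difference between position-based and label-based selection is easy to miss, so here is a minimal sketch on made-up values. It assumes the default integer index (0, 1, 2, ...); an iloc slice excludes its end position, while a loc slice includes its end label.

```python
import pandas as pd

# Hypothetical values; column names follow the slide
df = pd.DataFrame({
    "Avg. Area Income": [65000.0, 72000.0, 58000.0, 81000.0, 69000.0, 75000.0],
    "Avg. Area House Age": [5.2, 6.1, 4.8, 7.0, 5.9, 6.4],
    "Price": [250000.0, 310000.0, 198000.0, 405000.0, 287000.0, 330000.0],
})

# Positions 0..4 (end excluded) and the first two columns
by_position = df.iloc[0:5, 0:2]

# Labels 0..4 (end included) and two named columns
by_label = df.loc[0:4, ["Avg. Area Income", "Avg. Area House Age"]]

print(by_position.shape, by_label.shape)
```

Both selections return the same five rows here because the labels coincide with the positions.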
10. Drop a column
X = df.drop('Price', axis=1)
11. Shape of the feature matrix
X.shape
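Putting the last two steps together, a minimal sketch on made-up data: dropping the Price column with axis=1 leaves the feature matrix, and the dropped column can serve as the target vector. The name y and the sample values are assumptions for illustration.

```python
import pandas as pd

# Hypothetical values; column names follow the slide
df = pd.DataFrame({
    "Avg. Area Income": [65000.0, 72000.0, 58000.0, 81000.0],
    "Avg. Area House Age": [5.2, 6.1, 4.8, 7.0],
    "Price": [250000.0, 310000.0, 198000.0, 405000.0],
})

# axis=1 drops a column; df itself is left unchanged
X = df.drop("Price", axis=1)

# The dropped column is the target vector
y = df["Price"]

print(X.shape)   # (rows, feature columns)
print(y.shape)   # (rows,)
```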