Welcome to Scribd!

Lab Assignment (Linear Regression)

Uploaded by

0% found this document useful (0 votes)

9 views2 pages

The document describes the tasks for a linear regression lab assignment to predict car prices using a provided dataset. The tasks include importing libraries, loading and exploring the dataset, preprocessing the data by handling missing values, feature scaling, and one-hot encoding categorical features. The preprocessed data is then split into training and testing sets and a linear regression model is trained on the training data. The trained model is evaluated on the testing data using metrics like MAE, MSE, and R-squared score. Finally, the model makes predictions on new data and conclusions are drawn about the model performance and impact of preprocessing steps.

Original Description:

Copyright

Available Formats

PDF, TXT or read online from Scribd

Share this document

Share or Embed Document

Sharing Options

Did you find this document useful?

Is this content inappropriate?

Report this Document

Copyright:

Available Formats

Download as PDF, TXT or read online from Scribd

Flag for inappropriate content

0% found this document useful (0 votes)

9 views2 pages

Lab Assignment (Linear Regression)

Uploaded by

Rana Babar

Copyright:

Available Formats

Download as PDF, TXT or read online from Scribd

Flag for inappropriate content

Jump to Page

You are on page 1of 2

Search inside document

Linear Regression

The objective of this lab assignment is to perform linear regression on the given "Car
Price Prediction" dataset. In this lab, participants will explore the dataset, preprocess
the data, split it into training and testing sets, build a linear regression model, train the
model, make predictions, and evaluate its performance.
Dataset: You have been provided with the "car_price_data.csv" dataset. This dataset
contains various features of cars, such as mileage, horsepower, number of cylinders,
and other specifications, along with their corresponding prices.

Tasks:

1. Data Preparation:
- Import the necessary libraries (e.g., pandas, numpy, matplotlib, and scikit-learn).
- Load the "car_price_data.csv" dataset into a pandas DataFrame.
- Explore the dataset: check the dimensions, data types, summary statistics, and the
presence of any missing values.

2. Data Preprocessing:
- Handle missing values, if any, using appropriate techniques (e.g., mean/median
imputation).
- Perform feature scaling (e.g., Min-Max scaling) to standardize the numerical
features.

3. Handling Categorical Features:

- Identify the categorical features in the dataset. These are the features that represent
non-numeric data, such as car make, model, or fuel type.
- Hint: You can use the `pd.get_dummies()` function from pandas to perform one-hot
encoding on the categorical features. This function will create binary vectors for each
category within a categorical feature.

4. Concatenate Transformed Data:

- After one-hot encoding the categorical features, you will have additional binary
columns in the DataFrame.
- Hint: Use the `pd.concat()` function to concatenate the transformed data with the
original DataFrame.

5. Data Splitting:
- Split the dataset into training and testing sets using an 80-20 or 70-30 ratio.
- Separate the target variable (car prices) from the rest of the features in both the
training and testing sets.

6. Model Building:
- Create an instance of the linear regression model from scikit-learn.
- Train the model using the training data.

7. Model Evaluation:
- Make predictions on the testing data using the trained model.
- Evaluate the model's performance using appropriate metrics (e.g., Mean Absolute
Error (MAE), Mean Squared Error (MSE), and R-squared score).
- Interpret the results and discuss the model's performance.

8. Predictions:
- Use the trained model to predict the prices of cars with new feature values. For
example, predict the price of a car with specific mileage, horsepower, fuel type, etc.

Conclusion:
Summarize your findings from the lab, including insights gained from the data
exploration, preprocessing, model training, and evaluation. Discuss the impact of
one-hot encoding on the model's performance and how it handles categorical features
effectively. Also, mention any challenges faced during the preprocessing or modeling
stages and potential ways to overcome them.

Note to students:
Ensure to include comments in your code for better understanding and readability.
Experiment with different settings, feature engineering techniques, or test/train split ratio
to gain a better understanding of linear regression.

Project On Data Mining: Prepared by Ashish Pavan Kumar K PGP-DSBA at Great Learning
Document50 pages
Project On Data Mining: Prepared by Ashish Pavan Kumar K PGP-DSBA at Great Learning
Ashish Pavan Kumar K
No ratings yet
Procedure of Testing Hypothesis
Document5 pages
Procedure of Testing Hypothesis
mohapatra82
100% (1)
Assignment - Machine Learning
Document3 pages
Assignment - Machine Learning
Le G
No ratings yet
8.2.2 LAP Guidance On The Estimation of Uncertainty of Measurement - R1
Document18 pages
8.2.2 LAP Guidance On The Estimation of Uncertainty of Measurement - R1
Catalina Ciocan
No ratings yet
Group Project 2 Sta108 DAH SIAP Edit
Document15 pages
Group Project 2 Sta108 DAH SIAP Edit
wani
56% (9)
C1 Lesson 1 - Exploring Random Variables
Document12 pages
C1 Lesson 1 - Exploring Random Variables
Teresa Navarrete
67% (3)
Guide
Document210 pages
Guide
José Antonio Neves
No ratings yet
Statistical and Machine Learning
Document17 pages
Statistical and Machine Learning
TatianeDutra
No ratings yet
TIME SERIES FORECASTING. ARIMAX, ARCH AND GARCH MODELS FOR UNIVARIATE TIME SERIES ANALYSIS. Examples with Matlab
From Everand
TIME SERIES FORECASTING. ARIMAX, ARCH AND GARCH MODELS FOR UNIVARIATE TIME SERIES ANALYSIS. Examples with Matlab
B. NORIEGA
No ratings yet
DATA MINING AND MACHINE LEARNING. PREDICTIVE TECHNIQUES: REGRESSION, GENERALIZED LINEAR MODELS, SUPPORT VECTOR MACHINE AND NEURAL NETWORKS
From Everand
DATA MINING AND MACHINE LEARNING. PREDICTIVE TECHNIQUES: REGRESSION, GENERALIZED LINEAR MODELS, SUPPORT VECTOR MACHINE AND NEURAL NETWORKS
César Pérez López
No ratings yet
Title Predicting House Pricing Using AIML (KASHISH)
Document2 pages
Title Predicting House Pricing Using AIML (KASHISH)
Jay Vardhan
No ratings yet
Assignment 2
Document3 pages
Assignment 2
vedantsimp
No ratings yet
Pravesh 6301
Document11 pages
Pravesh 6301
Shreyas Paraj
No ratings yet
Instructions and Guidelines
Document2 pages
Instructions and Guidelines
Dishan Otieno
No ratings yet
File 482621234 482621234 - Assignment 2 - 7378831553794248
Document5 pages
File 482621234 482621234 - Assignment 2 - 7378831553794248
Bob Philip
No ratings yet
Team Alacrity - Amazon ML Challenge 2023 - Text File
Document8 pages
Team Alacrity - Amazon ML Challenge 2023 - Text File
omkar sameer chaubal
No ratings yet
ML0101EN Reg Simple Linear Regression Co2 Py v1
Document4 pages
ML0101EN Reg Simple Linear Regression Co2 Py v1
Muhammad Rafly
No ratings yet
Important Questions
Document4 pages
Important Questions
Adilrabia rsl
No ratings yet
Predicting Stock Values Using A Recurrent Neural Network
Document12 pages
Predicting Stock Values Using A Recurrent Neural Network
Mr SKammer
No ratings yet
Chapter 11
Document19 pages
Chapter 11
ramaraju
No ratings yet
2324 BigData Lab3
Document6 pages
2324 BigData Lab3
Elie Al Howayek
No ratings yet
Assignment 1 Supplementary
Document5 pages
Assignment 1 Supplementary
mughaltalha01249212015
No ratings yet
Academic Analytics Model - Weka Flow
Document3 pages
Academic Analytics Model - Weka Flow
Madalina Beret
No ratings yet
5 Ejercicio - Experimentación Con Los Modelos de Regresión Más Eficaces - Training - Microsoft Learn Ingles
Document9 pages
5 Ejercicio - Experimentación Con Los Modelos de Regresión Más Eficaces - Training - Microsoft Learn Ingles
acxel david castillo casas
No ratings yet
ML0101EN Reg Mulitple Linear Regression Co2 Py v1
Document5 pages
ML0101EN Reg Mulitple Linear Regression Co2 Py v1
Rajat Solanki
No ratings yet
L10 - Keras Regression
Document14 pages
L10 - Keras Regression
Kunapuli Poojitha
No ratings yet
8 Ejercicio - Optimización y Guardado de Modelos - Training - Microsoft Learn Ingles
Document13 pages
8 Ejercicio - Optimización y Guardado de Modelos - Training - Microsoft Learn Ingles
acxel david castillo casas
No ratings yet
Machine Learning Project Car Price Prediction Algorithm
Document4 pages
Machine Learning Project Car Price Prediction Algorithm
Ruqaiya Ali
No ratings yet
Data Science & Machine Learning by Using R Programming
Document6 pages
Data Science & Machine Learning by Using R Programming
Vikram Singh
No ratings yet
Data Science and Machine Learning Essentials: Lab 4A - Working With Regression Models
Document24 pages
Data Science and Machine Learning Essentials: Lab 4A - Working With Regression Models
aussatris
No ratings yet
Project Note
Document8 pages
Project Note
subhradeepnath2.o
No ratings yet
Assignment - CarsData - Descriptive - EDA - Munjal - Exercise - Ipynb - Colaboratory
Document6 pages
Assignment - CarsData - Descriptive - EDA - Munjal - Exercise - Ipynb - Colaboratory
prarabdhasharma98
No ratings yet
Introduction To Logistics Regression.
Document4 pages
Introduction To Logistics Regression.
Vikram Choudhary
No ratings yet
Only Quat
Document8 pages
Only Quat
BI11-286 Nguyễn Xuân Vinh
No ratings yet
Classification in R
Document5 pages
Classification in R
Aman Kansal
No ratings yet
Individual Assignment 2
Document4 pages
Individual Assignment 2
jemal yahyaa
No ratings yet
Scikit Learn
Document17 pages
Scikit Learn
RR
No ratings yet
Engo 645
Document9 pages
Engo 645
sree vishnupriyq
No ratings yet
Assignment Instructions For The Data Analytics Report
Document5 pages
Assignment Instructions For The Data Analytics Report
robert jnr Martin
No ratings yet
Data Preprocessing
Document2 pages
Data Preprocessing
Hemant
No ratings yet
Anomaly Detection in Social Networks Twitter Bot
Document11 pages
Anomaly Detection in Social Networks Twitter Bot
Mallikarjun patil
No ratings yet
Applied Datascience - Phase3
Document8 pages
Applied Datascience - Phase3
Ayina A Ashok
No ratings yet
Unit 7 ML
Document33 pages
Unit 7 ML
Yuvraj Chauhan
No ratings yet
Linear Regression
Document1 page
Linear Regression
AMEER MALIKASAB NADAF
No ratings yet
BDA-A5 (Employee Salaray Data)
Document1 page
BDA-A5 (Employee Salaray Data)
Chaudhary Taha
No ratings yet
AI Phase4
Document11 pages
AI Phase4
techusama4
No ratings yet
CSC 603 - Final Project
Document3 pages
CSC 603 - Final Project
bme.engineer.issa.mansour
No ratings yet
F.E Process
Document3 pages
F.E Process
Anthony J.
No ratings yet
Project Assignment.2024
Document2 pages
Project Assignment.2024
tin nguyen
No ratings yet
Assignment 1:: Intro To Machine Learning
Document6 pages
Assignment 1:: Intro To Machine Learning
Minh Trí
No ratings yet
RANDOM FOREST (Binary Classification)
Document5 pages
RANDOM FOREST (Binary Classification)
Noor Ul Haq
No ratings yet
ML Practical 04
Document19 pages
ML Practical 04
chatgptlogin2001
No ratings yet
Car Price Prediction
Document2 pages
Car Price Prediction
umartanveer113311
No ratings yet
DP-Designing and Implementing
Document10 pages
DP-Designing and Implementing
Steven Doh
No ratings yet
Data Mining Problem 2 Report
Document13 pages
Data Mining Problem 2 Report
Babu Shaikh
No ratings yet
Tensorflow Estimators - Documentation
Document4 pages
Tensorflow Estimators - Documentation
Saransh Bansal
No ratings yet
Methodology of Model Creation: Mgr. Peter Kertys, VÚB A. S
Document11 pages
Methodology of Model Creation: Mgr. Peter Kertys, VÚB A. S
Juan Diego Monasí Córdova
No ratings yet
Regression Linaire Python Tome II
Document10 pages
Regression Linaire Python Tome II
Elisée TEGUE
No ratings yet
Project Cdac
Document4 pages
Project Cdac
abhishekbayas103
No ratings yet
Shubham Mankodiya DS
Document6 pages
Shubham Mankodiya DS
Vidhi Thakkar
No ratings yet
Finance
Document1 page
Finance
ahmadkhalil
No ratings yet
School of Computer Science Engineering and Technology
Document1 page
School of Computer Science Engineering and Technology
PRANAT MANGLA
No ratings yet
Lab-5A: Regression Analysis and Modeling
Document2 pages
Lab-5A: Regression Analysis and Modeling
Vishal Ramina
No ratings yet
Group Assignment 3
Document5 pages
Group Assignment 3
ketan
No ratings yet
Chapter 6.assessment of Learning 1
Document11 pages
Chapter 6.assessment of Learning 1
Emmanuel Molina
No ratings yet
A Study On Imbibition and Syneresis in Four Commercially Available Irreversible Hydrocolloid (Alginate) Impression Materials
Document4 pages
A Study On Imbibition and Syneresis in Four Commercially Available Irreversible Hydrocolloid (Alginate) Impression Materials
Rizki Annisa
No ratings yet
Validity and Internal Consistency of The New Knee Society Knee
Document8 pages
Validity and Internal Consistency of The New Knee Society Knee
Santiago Mondragon Rios
No ratings yet
Minitab17中文版操作说明
Document86 pages
Minitab17中文版操作说明
Jason
100% (1)
Jurnal Selvi Psik Fix
Document8 pages
Jurnal Selvi Psik Fix
Selvi Monika
No ratings yet
Statistical Analysis On Risk Factors of Prevalence of Malaria in Z/ Dugda Woreda, Oromia, East Arsi
Document31 pages
Statistical Analysis On Risk Factors of Prevalence of Malaria in Z/ Dugda Woreda, Oromia, East Arsi
tewodrosmolalign19
No ratings yet
Intro To Hypothesis Testing
Document83 pages
Intro To Hypothesis Testing
rahulrockon
No ratings yet
Box Plot Outliers
Document2 pages
Box Plot Outliers
soumedy curpen
No ratings yet
Johns Hopkins Carey Business School - Data Analytics Syllabus
Document6 pages
Johns Hopkins Carey Business School - Data Analytics Syllabus
Ajit Kumar
No ratings yet
P Charts: NCSS Statistical Software
Document21 pages
P Charts: NCSS Statistical Software
Ak
No ratings yet
Unified Keypoint-Based Action Recognition Framework Via Structured Keypoint Pooling
Document10 pages
Unified Keypoint-Based Action Recognition Framework Via Structured Keypoint Pooling
henrydcl
No ratings yet
Benford Law
Document6 pages
Benford Law
profharish
No ratings yet
SMU MBA 103 - Statistics For Management Free Solved Assignment
Document12 pages
SMU MBA 103 - Statistics For Management Free Solved Assignment
rahulverma2512
0% (2)
6
Document1 page
6
neilonlinedeals
No ratings yet
Nursing Research and Statistics
Document2 pages
Nursing Research and Statistics
Principal. Aacchs
No ratings yet
Effects of A Neurodevelopmental Treatment Based.3
Document13 pages
Effects of A Neurodevelopmental Treatment Based.3
ivairmatias
No ratings yet
ch02 - v1 - Edit
Document84 pages
ch02 - v1 - Edit
Nguyễn Tuấn Hoàng
No ratings yet
Appendix 4: Statistics and Morphometrics in Plant Systematics
Document10 pages
Appendix 4: Statistics and Morphometrics in Plant Systematics
Ivonne Jalca
No ratings yet
Process Variation Enemy and Opportunity
Document3 pages
Process Variation Enemy and Opportunity
Adrian Paredes
No ratings yet
Terapia Clark Biofeedback Zapper
Document8 pages
Terapia Clark Biofeedback Zapper
Gomez Gomez
50% (2)
Means and Variance of The Sampling Distribution of Sample Means
Document19 pages
Means and Variance of The Sampling Distribution of Sample Means
CHARLYN JOY SUMALINOG
No ratings yet
Business Statistics-Unit 2
Document18 pages
Business Statistics-Unit 2
GKREDDY
No ratings yet
Crafting Cases
Document9 pages
Crafting Cases
Nour
No ratings yet
Zaid Raqeeb Alqadhi
Document5 pages
Zaid Raqeeb Alqadhi
zaidalqadhi
No ratings yet
FF654 Final-Syllabus SY Common - Sem 2 AY 2020-21
Document40 pages
FF654 Final-Syllabus SY Common - Sem 2 AY 2020-21
My Name
No ratings yet
Stress Sensor Prototype Determining The Stress Level in Using A Computer Through Validated Self Made Heart Rate HR and Galvanic Skin Response GSR Sensors and Fuzzy Logic Algorithm IJERTV5IS030049
Document10 pages
Stress Sensor Prototype Determining The Stress Level in Using A Computer Through Validated Self Made Heart Rate HR and Galvanic Skin Response GSR Sensors and Fuzzy Logic Algorithm IJERTV5IS030049
Akshaya Srinivasan
No ratings yet