
TSSM’s

BHIVARABAI SAWANT COLLEGE OF ENGINEERING & RESEARCH


PUNE 411041
SAVITRIBAI PHULE PUNE UNIVERSITY
2020-2021

Machine Learning Lab


(Lab Practice 3)

Mini Project Report


On

“White Wine Quality Prediction System”

By
Nandini Kate - 156
Amey Kavathekar - 46
Anuradha Lokhande - 59

Under the Guidance of


Prof. Dr. Harmeet Khanuja
Index

Sr. No Name

1 Problem Statement

2 Introduction

3 Hardware Software Requirements

4 Methodology

5 Architecture Diagram

6 Algorithm

7 Results

8 Conclusion
Problem Statement:
To compare different classification methods and identify which one yields the highest accuracy for the “Wine
Quality Prediction System”, and to determine which features are most indicative of a good-quality wine.

Introduction:

This project predicts the quality of wine from its given features. We use the wine quality dataset
from Kaggle, which contains the fundamental features that affect the quality of a wine. Several
machine learning models are applied to predict the quality. We deal only with white wine and use
classification techniques to decide whether a wine is good or bad.

Hardware & Software Requirements:

● Personal Computer

● 4GB RAM
● Any preferred OS

● Google Colab / Jupyter Notebook

Methodology:
The goal of this project is to determine wine quality based on its chemical properties.
Input variables: results of physicochemical tests.
Output variable: quality score based on sensory data, the median of at least 3 evaluations made by wine experts.
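To illustrate how the output variable can be turned into a good/bad class, the sketch below takes the median of several expert ratings and applies a cut-off. The ratings are made up, and the threshold of 7 is an assumption for illustration only; the report does not state which cut-off was used.

```python
from statistics import median

# Assumed cut-off: scores of 7 or more count as "good". The report does not
# state the threshold, so treat this value as a placeholder.
GOOD_THRESHOLD = 7

def quality_score(expert_scores):
    """Return the median of the expert ratings (at least 3 are required)."""
    if len(expert_scores) < 3:
        raise ValueError("need at least 3 expert evaluations")
    return median(expert_scores)

def quality_label(score, threshold=GOOD_THRESHOLD):
    """Map a numeric quality score to a binary 'good'/'bad' label."""
    return "good" if score >= threshold else "bad"

ratings = [6, 7, 8]             # hypothetical ratings from three experts
score = quality_score(ratings)  # median of 6, 7, 8 is 7
label = quality_label(score)    # 7 >= 7, so "good"
print(score, label)
```

Changing GOOD_THRESHOLD changes the balance between the two classes, so the value should be chosen together with the class distribution of the data.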

Libraries used:
Pandas is used for data analysis, NumPy for working with arrays, and Seaborn and Matplotlib for
data visualization.
Dataset description:

The quality classes in this dataset are ordered but not balanced. The dataset contains both red and
white wine instances in unequal numbers.
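A quick way to see how unbalanced the classes are is to compute each class's share of the samples. The sketch below uses a short, made-up list of quality scores; with the real data the list would come from the quality column of the dataset.

```python
from collections import Counter

def class_shares(labels):
    """Return each class's share of the samples as a fraction of the total."""
    counts = Counter(labels)
    total = sum(counts.values())
    return {cls: n / total for cls, n in counts.items()}

# Hypothetical quality scores; real values would come from the dataset.
quality = [5, 6, 6, 6, 7, 5, 6, 8, 6, 5]
shares = class_shares(quality)
for cls in sorted(shares):
    print(f"quality {cls}: {shares[cls]:.0%}")
```

If one class dominates, plain accuracy can look good even for a weak model, which is one reason to inspect the confusion matrix as well.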
The dataset has the following features:
1. Fixed acidity
2. Volatile acidity
3. Citric acid
4. Residual sugar
5. Chlorides
6. Free sulfur dioxide
7. Total sulfur dioxide
8. Density
9. pH
10. Sulphates
11. Alcohol
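One possible way to find which of these features are most indicative of quality is to rank them by the importance scores of a fitted random forest. The sketch below is not the report's own analysis: it runs on synthetic data in which the label is deliberately tied to the alcohol column, and the lowercase column names are an assumption about how the dataset spells them.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier

FEATURES = [
    "fixed acidity", "volatile acidity", "citric acid", "residual sugar",
    "chlorides", "free sulfur dioxide", "total sulfur dioxide", "density",
    "pH", "sulphates", "alcohol",
]

# Synthetic stand-in data: 300 wines with random feature values.
rng = np.random.default_rng(0)
X = rng.normal(size=(300, len(FEATURES)))
# Assumption for the demo only: the label depends on the "alcohol" column.
y = (X[:, FEATURES.index("alcohol")] > 0).astype(int)

model = RandomForestClassifier(n_estimators=100, random_state=0).fit(X, y)

# Pair each feature with its impurity-based importance, highest first.
ranking = sorted(zip(FEATURES, model.feature_importances_),
                 key=lambda pair: pair[1], reverse=True)
for name, importance in ranking:
    print(f"{name:22s} {importance:.3f}")
```

On the real data the ranking would reflect the actual relationship between the chemical measurements and the quality label rather than the rule built into this demo.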

Architecture Diagram:

Algorithm:

1. Apply preprocessing methods

● Label encoding to convert categorical variables to numerical variables.
● Removing duplicate rows.
● Checking for NaN values.

2. Split the dataset into training and test sets

3. Apply classification algorithms

● SVM classifier
● KNN classifier
● Decision Tree classifier
● Random Forest classifier

4. Predict the target values using the best-performing algorithm.
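The four steps above can be sketched with scikit-learn as follows. This is a minimal illustration under stated assumptions, not the project's actual notebook: the data is synthetic, the column names and the rule that defines a "good" wine are hypothetical, the 80/20 split is assumed, all classifiers use default settings, and "best-performing" is taken to mean highest test accuracy.

```python
import numpy as np
import pandas as pd
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier
from sklearn.preprocessing import LabelEncoder
from sklearn.svm import SVC
from sklearn.tree import DecisionTreeClassifier

# Synthetic stand-in for the wine data (hypothetical columns and values).
rng = np.random.default_rng(42)
n = 400
df = pd.DataFrame({
    "alcohol": rng.normal(10.5, 1.2, n),
    "volatile acidity": rng.normal(0.28, 0.1, n),
    "type": rng.choice(["white", "red"], n),
})
# Assumed rule for the demo: wines with above-average alcohol are "good" (1).
df["good"] = (df["alcohol"] > 10.5).astype(int)

# Step 1: preprocessing.
df["type"] = LabelEncoder().fit_transform(df["type"])  # category -> integer
df = df.drop_duplicates()
assert not df.isna().any().any(), "unexpected NaN values"

# Step 2: train/test split (80/20, stratified on the target).
X, y = df.drop(columns="good"), df["good"]
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42, stratify=y)

# Step 3: fit the four classifiers with default settings.
models = {
    "SVM": SVC(),
    "KNN": KNeighborsClassifier(),
    "Decision Tree": DecisionTreeClassifier(random_state=42),
    "Random Forest": RandomForestClassifier(random_state=42),
}
accuracies = {}
for name, model in models.items():
    model.fit(X_train, y_train)
    accuracies[name] = accuracy_score(y_test, model.predict(X_test))

# Step 4: predict with the model that scored best on the held-out data.
best_name = max(accuracies, key=accuracies.get)
predictions = models[best_name].predict(X_test)
print(accuracies, "best:", best_name)
```

The accuracies printed by this sketch come from the synthetic data and are unrelated to the figures reported in the Conclusion.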

Results:

● Validation accuracy:
● RF vs SVM confusion matrix:
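For reference, accuracy and the confusion matrix are related as in the sketch below: accuracy is the sum of the diagonal divided by the total number of predictions. The labels are hypothetical and are not results from this project; the layout (rows for true classes, columns for predicted classes) is an assumed convention.

```python
def confusion_matrix_2x2(y_true, y_pred):
    """Count outcomes for binary labels (1 = good, 0 = bad).

    Returns [[TN, FP], [FN, TP]]: rows are true classes, columns predicted.
    """
    matrix = [[0, 0], [0, 0]]
    for actual, predicted in zip(y_true, y_pred):
        matrix[actual][predicted] += 1
    return matrix

def accuracy(matrix):
    """Accuracy = correct predictions (the diagonal) / all predictions."""
    correct = matrix[0][0] + matrix[1][1]
    total = sum(sum(row) for row in matrix)
    return correct / total

# Hypothetical labels for eight wines, not results from the report.
y_true = [1, 0, 1, 1, 0, 0, 1, 0]
y_pred = [1, 0, 0, 1, 0, 1, 1, 0]
cm = confusion_matrix_2x2(y_true, y_pred)
print(cm, accuracy(cm))  # [[3, 1], [1, 3]] 0.75
```

Comparing two models' matrices side by side shows not only which one is more accurate but also which kind of error (good predicted as bad, or bad predicted as good) each one makes more often.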

Conclusion:

The project involved analysis of the wine quality prediction dataset with proper data preprocessing. Four
models were then trained and tested, with the following accuracies:

1. SVM Classifier – 82.20%
2. KNN Classifier – 77.16%
3. Decision Tree Classifier – 74.56%
4. Random Forest Classifier – 85.93%

The Random Forest classifier achieved the highest accuracy of the four models.
