
Green University of Bangladesh

Department of Computer Science and Engineering (CSE)
Faculty of Sciences and Engineering
Semester: Fall, Year: 2023, B.Sc. in CSE

LAB REPORT NO - 05

Course Title: Data Mining LAB


Course Code: CSE 424, Section: D3

Lab Experiment Name: Ensemble Learning

Student Details
Name: Nazmul Hossen
ID: 201902140

Lab Date: 23-12-2023


Submission Date: 30-12-2023
Course Teacher's Name: Meherunnesa Tania

[For Teachers use only: Don’t Write Anything inside this box]
Lab Report Status
Marks: ………………………………… Signature:.....................
Comments:.............................................. Date:..............................
1. TITLE OF THE LAB EXPERIMENT
Implement several ensemble methods using the same dataset.

2. OBJECTIVES/AIM
• To learn more about ensemble learning.
• To increase machine learning models' capacity for accurate prediction.
• To produce predictions that are more accurate, with lower error, than those of individual models.
• To outperform single models in robustness and resistance to overfitting.

3. PROCEDURE / ANALYSIS / DESIGN

An ensemble is a group of parts that are seen as a whole rather than as individual parts. An
ensemble technique creates and combines many models to solve a problem. Ensemble techniques
improve the robustness and generalization of the model. In this study, we will discuss a few
strategies and how they are implemented in Python.
Bagging, stacking, and boosting are the three primary classes of ensemble learning techniques. It
is important to understand each technique in depth and to consider it when working on a
predictive modeling project.

Ensemble learning techniques can be divided into the following categories:


Basic Ensemble Techniques
• Max Voting
• Averaging
• Weighted Average
Advanced Ensemble Techniques
• Bagging
• Stacking
• Boosting
Algorithms based on Bagging and Boosting
• Random Forest
• AdaBoost

Working procedure of Max Voting:

First, train multiple models independently on the same dataset. During prediction, each model
makes its own forecast. The class label that receives the most votes across the models is chosen
as the final prediction, as sketched below.
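A minimal sketch of the voting step itself, using hypothetical predictions from three models (the
values are illustrative, not taken from the dataset used in Section 4):

from collections import Counter

preds_model1 = [1, 0, 1, 1]  # hypothetical predictions
preds_model2 = [0, 0, 1, 1]
preds_model3 = [1, 0, 0, 1]
# For each sample, the label with the most votes wins
final_preds = [Counter(votes).most_common(1)[0][0]
               for votes in zip(preds_model1, preds_model2, preds_model3)]
print(final_preds)  # [1, 0, 1, 1]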

Working procedure of Weighted Average:

Train multiple individual models on the same dataset. During prediction, each model makes its
own prediction. Assign a weight to each model's prediction based on its performance or
reliability, then compute the weighted average of the predictions to obtain the final
prediction, as sketched below.
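A minimal sketch of the weighted-average step, assuming two hypothetical models output class-1
probabilities for three samples and the first model is trusted more:

import numpy as np

proba1 = np.array([0.9, 0.4, 0.2])       # hypothetical probabilities from model 1
proba2 = np.array([0.6, 0.7, 0.1])       # hypothetical probabilities from model 2
weighted = 0.7 * proba1 + 0.3 * proba2   # weights sum to 1
final = (weighted >= 0.5).astype(int)    # threshold into class labels
print(weighted)  # [0.81 0.49 0.17]
print(final)     # [1 0 0]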
Working procedure of Bagging (Bootstrap Aggregating):

Create several bootstrap samples by drawing observations from the original dataset at random
with replacement, then train an independent model on each bootstrap sample. During prediction,
every model makes its own forecast. For classification, the final prediction is found by
majority vote; for regression, it is the average of all the models' forecasts. The
bootstrap-sampling step is sketched below.
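A minimal illustration of the bootstrap step, assuming a hypothetical dataset of 10 rows:

import numpy as np

rng = np.random.default_rng(42)
indices = np.arange(10)                                 # indices of the original rows
bootstrap_sample = rng.choice(indices, size=10, replace=True)
print(bootstrap_sample)  # some indices repeat, others are left out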

Working procedure of Stacking:

Use the same dataset to train a variety of different models, known as base models. Then
construct a meta-model (sometimes called a blender or aggregator) that takes the base models'
predictions as input. Train the meta-model on those predictions so that it learns how to
combine them into the final prediction.

4. IMPLEMENTATION
For Max Voting:

import pandas as pd
from sklearn.model_selection import train_test_split
from sklearn.linear_model import LogisticRegression
from sklearn.tree import DecisionTreeClassifier
from sklearn.ensemble import RandomForestClassifier, VotingClassifier, BaggingClassifier
from sklearn.metrics import accuracy_score

# Load the Dataset
data = pd.read_csv('/kaggle/input/diabetes-dataset/diabetes.csv')
X = data.drop('Outcome', axis=1)
y = data['Outcome']
# Split the Dataset
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)
# Implement Base Models
model1 = LogisticRegression(max_iter=1000)  # higher max_iter avoids convergence warnings
model2 = DecisionTreeClassifier()
model3 = RandomForestClassifier()
# Implement Max Voting: each model votes, the majority label wins
max_voting_model = VotingClassifier(
    estimators=[('lr', model1), ('dt', model2), ('rf', model3)],
    voting='hard')
max_voting_model.fit(X_train, y_train)
max_voting_pred = max_voting_model.predict(X_test)
# Evaluate Performance
max_voting_accuracy = accuracy_score(y_test, max_voting_pred)
print("Max Voting Accuracy:", max_voting_accuracy)

For Weighted Average:

# Split the data into training and testing sets
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)
# Train multiple base models with different hyperparameters
model1 = RandomForestClassifier(n_estimators=100, random_state=42)
model2 = RandomForestClassifier(n_estimators=50, max_depth=5, random_state=42)
model1.fit(X_train, y_train)
model2.fit(X_train, y_train)
# Evaluate individual model performances
acc1 = accuracy_score(y_test, model1.predict(X_test))
acc2 = accuracy_score(y_test, model2.predict(X_test))
# Use class-1 probabilities so the weighted average is meaningful
proba1 = model1.predict_proba(X_test)[:, 1]
proba2 = model2.predict_proba(X_test)[:, 1]
# Assign weights to the models (summing to 1)
weight1, weight2 = 0.7, 0.3
# Weighted average of the probabilities, thresholded into 0/1 labels
ensemble_proba = weight1 * proba1 + weight2 * proba2
ensemble_pred = (ensemble_proba >= 0.5).astype(int)
# Evaluate the ensemble model performance
ensemble_acc = accuracy_score(y_test, ensemble_pred)
# Display the results
print(f'Model 1 Accuracy: {acc1:.4f}')
print(f'Model 2 Accuracy: {acc2:.4f}')
print(f'Weighted Average Accuracy: {ensemble_acc:.4f}')
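Averaging, listed under the basic techniques in Section 3 but not implemented separately above,
is simply the equal-weight special case. A minimal sketch reusing proba1 and proba2 from the
weighted-average code:

# Simple averaging: equal weights for both models
avg_proba = (proba1 + proba2) / 2
avg_pred = (avg_proba >= 0.5).astype(int)
print(f'Averaging Accuracy: {accuracy_score(y_test, avg_pred):.4f}')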

For Stacking:

# Split the Dataset
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)
# Implement Base Models
model1 = LogisticRegression(max_iter=1000)
model2 = DecisionTreeClassifier()
model3 = RandomForestClassifier()
# Train the base models and generate their predictions
model1.fit(X_train, y_train)
model2.fit(X_train, y_train)
model3.fit(X_train, y_train)
# Meta-features: base-model predictions on the training set (the meta-model
# must be trained on training data, not on the test set)
stacking_train_pred = pd.DataFrame({'Model 1': model1.predict(X_train),
                                    'Model 2': model2.predict(X_train),
                                    'Model 3': model3.predict(X_train)})
stacking_test_pred = pd.DataFrame({'Model 1': model1.predict(X_test),
                                   'Model 2': model2.predict(X_test),
                                   'Model 3': model3.predict(X_test)})
# Implement Stacking: the meta-model learns to combine the base predictions
meta_model = LogisticRegression()  # example meta-model; another classifier could be used
meta_model.fit(stacking_train_pred, y_train)
stacking_pred = meta_model.predict(stacking_test_pred)
# Evaluate Performance
stacking_accuracy = accuracy_score(y_test, stacking_pred)
print("Stacking Accuracy:", stacking_accuracy)

For Bagging:

# Split the Dataset
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)
# Implement Base Model
base_model = DecisionTreeClassifier()
# Implement Bagging: 10 trees, each trained on a bootstrap sample
bagging_model = BaggingClassifier(base_model, n_estimators=10, random_state=42)
bagging_model.fit(X_train, y_train)
bagging_pred = bagging_model.predict(X_test)
# Evaluate Performance
bagging_accuracy = accuracy_score(y_test, bagging_pred)
print("Bagging Accuracy:", bagging_accuracy)

5. TEST RESULT / OUTPUT

Fig-1: Loading Datasets

Fig-2: Accuracy for Max Voting

Fig-3: Accuracy for Weighted Average


Fig-4: Accuracy for Stacking

Fig-5: Accuracy for Bagging

6. ANALYSIS AND DISCUSSION

This lab report covers, in general, how the nature of the problem, the properties of the dataset, and the
behavior of the base models influence the choice of ensemble learning method. Generally speaking, it is
best to try a few different approaches and assess each one's effectiveness before deciding on the best one.
Any machine learning endeavor aims to identify the model with the highest predictive accuracy for the
desired outcome. Rather than creating a single model and hoping it yields the most accurate forecast
possible, ensemble techniques combine several models and aggregate their outputs into a single final model.
