Practical 2 - Working With Scikit-Learn

Artificial Intelligence and Machine Learning
PRACTICAL 2
Introduction to Machine Learning
Working with Scikit-Learn
Prepared by Nima Dema
1 | Page 15 February 2024

Table of Contents
0. Learning Objectives
1. Introduction to scikit-learn library
2. Scikit-learn datasets
2.1. Import modules
2.2. Load data
2.3. Creating dataframe
2.4. Exploring the data
3. Scikit-learn Pre-processing techniques
3.1. Import Scaler
3.2. Scale the features
3.3. Convert scaled data to DataFrame
TODO: WORKING WITH SKLEARN DIABETES DATASET
0. Learning Objectives
In this week’s practical session, the main focus is on using the scikit-learn (sklearn) library.
The primary aim is to enable students to use scikit-learn effectively
for machine learning tasks.
By the end of the lab, you should be able to:
➔ Use the scikit-learn datasets module to load and read data
➔ Use some common scikit-learn methods to pre-process data
1. Introduction to scikit-learn library

The scikit-learn library, often abbreviated as sklearn, is an open-source machine
learning library for the Python programming language. It provides a wide range of
tools for machine learning tasks such as classification, regression, clustering,
dimensionality reduction, and model selection.

In addition to its machine learning capabilities, scikit-learn provides a rich suite of
data pre-processing methods, including scalers, encoders, and imputers, which
are essential for preparing datasets before training machine learning models.
Moreover, scikit-learn also offers a collection of built-in datasets that serve as
convenient resources for experimentation and learning. These datasets cover a
diverse range of domains and are readily available for users to explore and apply
various machine learning techniques without the need for external data sources.
2. Scikit-learn datasets
The sklearn.datasets module contains several datasets that you can use
when creating machine learning models. In this section we will explore a few of
them.
INSTRUCTIONS:
➔ Load the iris dataset, which is commonly used by beginners to practise machine
learning techniques. For this task use load_iris().
➔ Explore the data returned by the load_iris() function.
➔ Create a dataframe from the loaded dataset. Make sure to include both the features
and the target.
2.1. Import modules

Import the necessary libraries (Already done for you). For this task you may need
pandas as well.

from sklearn.datasets import load_iris

import pandas as pd
2.2. Load data

Load the data using load_iris() and answer the following questions:
➔ What is the type of the data returned by load_iris()?
➔ Explain the different attributes associated with the data returned by load_iris().
#Write your answer here [Expecting 3 lines of code]

…….
…….
…….
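One possible way to approach this step is sketched below. It is not the only valid answer; the variable name iris is just an illustrative choice.

```python
from sklearn.datasets import load_iris

# Load the dataset. By default load_iris() returns a Bunch,
# a dictionary-like object whose keys are also available as attributes.
iris = load_iris()

print(type(iris))          # the container type
print(list(iris.keys()))   # names of the available attributes

# A few of the attributes
print(iris.data.shape)     # feature matrix: one row per flower, one column per feature
print(iris.target.shape)   # integer class label for each row
print(iris.feature_names)  # names of the feature columns
print(iris.target_names)   # names of the iris species
```

Printing iris.DESCR gives a longer text description of the dataset, which may help when explaining the attributes.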
2.3. Creating dataframe

Create a dataframe from the data loaded above. Make sure you include all the features
as well as the target in your dataframe. Use the dataframe head() method to check your
result.
Your expected result should look like:
#Write your solution here [Expecting 3 lines of code]

…….
…….
…….
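A minimal sketch of one way to build the dataframe is shown below. The column name "target" and the variable name df are assumptions made for illustration; the worksheet does not prescribe them.

```python
import pandas as pd
from sklearn.datasets import load_iris

iris = load_iris()

# Build the dataframe from the feature matrix, using the feature names as columns
df = pd.DataFrame(iris.data, columns=iris.feature_names)

# Append the class labels as an extra column (the name "target" is our choice)
df["target"] = iris.target

print(df.head())
```

As an alternative, load_iris(as_frame=True) returns the data as pandas objects directly, and its frame attribute holds the features together with the target.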

2.4. Exploring the data

Use appropriate methods to explore your data and determine which type of data
(categorical or numerical) each feature, and the target, falls under. Justify your
answer.
#write your answer here

…….
…….
…….
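The sketch below shows a few pandas methods that could support your justification. It rebuilds the dataframe so that it runs on its own; the names df, target_counts and unique_per_column are illustrative.

```python
import pandas as pd
from sklearn.datasets import load_iris

iris = load_iris()
df = pd.DataFrame(iris.data, columns=iris.feature_names)
df["target"] = iris.target

print(df.dtypes)      # storage type of each column
print(df.describe())  # summary statistics for every column

# Count how often each target value occurs
target_counts = df["target"].value_counts()
print(target_counts)

# Number of distinct values per column
unique_per_column = df.nunique()
print(unique_per_column)
```

Note that the storage type alone does not settle the question. The target is stored as integers, but it takes only a handful of distinct values that stand for species names, which is worth considering when you decide whether it is categorical or numerical.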
3. Scikit-learn Pre-processing techniques

In this section, we will explore the preprocessing tools provided by scikit-learn.
Among the various techniques essential for machine learning, one commonly used
method is feature scaling, which involves adjusting features to the same range.
Sklearn offers numerous methods for scaling features, but for our purposes we
will use StandardScaler.
INSTRUCTIONS:
➔ First find the module that contains StandardScaler and import it.
➔ Create a StandardScaler object.
➔ Scale the features by calling the fit_transform() method of StandardScaler.
3.1. Import Scaler

You can import StandardScaler from the sklearn.preprocessing module.
#write your solution here
…….
3.2. Scale the features

First instantiate a scaler object, then call its
fit_transform() method to scale the features.
➔ What type of data is returned by the fit_transform() method?

…….
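The import and scaling steps could be sketched as follows. StandardScaler standardizes each feature to zero mean and unit variance, i.e. z = (x - mean) / std. The variable names scaler and X_scaled are illustrative, and the sketch scales only the feature matrix, not the target.

```python
import numpy as np
from sklearn.datasets import load_iris
from sklearn.preprocessing import StandardScaler

iris = load_iris()

# Create the scaler, then fit it to the features and transform them in one call
scaler = StandardScaler()
X_scaled = scaler.fit_transform(iris.data)

print(type(X_scaled))         # container type of the scaled data
print(X_scaled.mean(axis=0))  # per-feature means, very close to 0
print(X_scaled.std(axis=0))   # per-feature standard deviations, very close to 1
```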
3.3. Convert scaled data to DataFrame

To make the data easier to inspect, convert the scaled data back to a
dataframe. Compare your newly created dataframe with the original dataframe. How are
they different?

…….
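One way to do the conversion and a side-by-side comparison is sketched below. The names df_scaled and stats are assumptions for illustration.

```python
import pandas as pd
from sklearn.datasets import load_iris
from sklearn.preprocessing import StandardScaler

iris = load_iris()
df = pd.DataFrame(iris.data, columns=iris.feature_names)

# Scale the features, then wrap the result in a dataframe with the same column names
X_scaled = StandardScaler().fit_transform(df)
df_scaled = pd.DataFrame(X_scaled, columns=df.columns)

# Compare a few summary statistics before and after scaling
stats = ["mean", "std", "min", "max"]
print(df.describe().loc[stats])
print(df_scaled.describe().loc[stats])
```

The std row of the scaled dataframe will be slightly above 1 rather than exactly 1, because pandas reports the sample standard deviation while StandardScaler divides by the population standard deviation.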
TODO: WORKING WITH SKLEARN DIABETES DATASET

INSTRUCTIONS:
➔ Load the sklearn diabetes dataset. Perform the necessary steps to load and view
the data and compare it with the iris dataset. How is the diabetes dataset different from
the iris dataset? Justify your answer.
➔ Explore other scaling techniques available in sklearn and apply one to your
dataset. How is your choice of scaling technique different from
StandardScaler?
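As a starting point only, the sketch below loads the diabetes dataset with load_diabetes() and applies MinMaxScaler, which is one of several alternatives in sklearn.preprocessing. The choice of MinMaxScaler and the names df_diab and df_minmax are assumptions for illustration; you are free to pick a different scaler.

```python
import pandas as pd
from sklearn.datasets import load_diabetes
from sklearn.preprocessing import MinMaxScaler

# Load the dataset and build a dataframe with features and target
diabetes = load_diabetes()
df_diab = pd.DataFrame(diabetes.data, columns=diabetes.feature_names)
df_diab["target"] = diabetes.target

print(df_diab.shape)
print(df_diab.head())
print(df_diab["target"].nunique())  # number of distinct target values

# MinMaxScaler rescales each feature to the range [0, 1] by default
scaled = MinMaxScaler().fit_transform(diabetes.data)
df_minmax = pd.DataFrame(scaled, columns=diabetes.feature_names)
print(df_minmax.describe().loc[["min", "max"]])
```

When comparing with iris, it may help to look at how many distinct target values each dataset has. Also note that load_diabetes() returns features that are already mean-centered and scaled by default, so the raw values look different from the iris measurements.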
THANK YOU
