
Artificial Intelligence and Machine Learning

PRACTICAL 2
Introduction to Machine Learning
Working with Scikit-Learn

Prepared by Nima Dema

15 February 2024



Table of Contents
0. Learning Objectives
1. Introduction to scikit-learn library
2. Scikit-learn datasets
2.1. Import modules
2.2. Load data
2.3. Creating dataframe
2.4. Exploring the data
3. Scikit-learn Pre-processing techniques
3.1. Import Scaler
3.2. Scale the features
3.3. Convert scaled data to DataFrame
TODO: WORKING WITH SKLEARN DIABETES DATASET

0. Learning Objectives
This week's practical session focuses on using the scikit-learn (sklearn) library.
The primary aim is to enable students to use scikit-learn effectively for machine
learning tasks.

By the end of the lab, you should be able to:

➔ Use the scikit-learn datasets module to load and read data
➔ Use some common scikit-learn methods to pre-process data

1. Introduction to scikit-learn library


The scikit-learn library, often abbreviated as sklearn, is an open-source machine
learning library for the Python programming language. It provides a wide range of
tools for machine learning tasks such as classification, regression, clustering,
dimensionality reduction, and model selection.


In addition to its machine learning capabilities, scikit-learn provides a rich suite of
data pre-processing tools, including scalers, encoders and imputers, which
are essential for preparing datasets before training machine learning models.

Moreover, scikit-learn offers a collection of built-in datasets that serve as
convenient resources for experimentation and learning. These datasets cover a
diverse range of domains and are readily available, so users can explore and apply
various machine learning techniques without needing external data sources.

2. Scikit-learn datasets
The sklearn.datasets module contains several datasets that you can use to
build machine learning models. In this section we will explore a few of
them.

INSTRUCTIONS:

➔ Load the iris dataset, one of the datasets most commonly used by beginners to
practise machine learning techniques. For this task use load_iris().
➔ Explore the data returned by the load_iris() function.
➔ Create a dataframe from the loaded dataset. Make sure to include both the features
and the target.

2.1. Import modules


Import the necessary libraries (already done for you). For this task you will need
pandas as well.


from sklearn.datasets import load_iris
import pandas as pd

2.2. Load data


Load the data using load_iris() and answer the following questions:

➔ What is the type of the data returned by load_iris()?
➔ Explain the different attributes associated with the data returned by load_iris().

#Write your answer here [Expecting 3 lines of code]


…….
…….
…….
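
As a point of comparison for your own answer, one possible way to inspect the object returned by load_iris() is sketched below. The variable name iris is only an illustrative choice, not something the handout prescribes.

```python
from sklearn.datasets import load_iris

# Load the dataset; "iris" is an illustrative variable name.
iris = load_iris()

# The returned object is a Bunch, a dictionary-like container.
print(type(iris))

# Its keys are the attributes you are asked to explain.
print(list(iris.keys()))

# Feature matrix and target vector, with their shapes.
print(iris.data.shape, iris.target.shape)

# Human-readable names for the feature columns and the classes.
print(iris.feature_names)
print(iris.target_names)
```

The DESCR attribute (print(iris.DESCR)) holds a longer text description of the dataset, which may help when explaining the attributes.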

2.3. Creating dataframe


Create a dataframe from the data loaded above. Make sure you include all the features
as well as the target in your dataframe. Use the dataframe head() method to check your
result.

Your expected result should look like:

#Write your solution here [Expecting 3 lines of code]


…….
…….
…….
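
A minimal sketch of one way to build such a dataframe follows. The names df and the column label "target" are assumptions made for illustration; your own solution may name them differently.

```python
import pandas as pd
from sklearn.datasets import load_iris

iris = load_iris()

# Features become the columns; feature_names supplies the column labels.
df = pd.DataFrame(iris.data, columns=iris.feature_names)

# Append the target as an extra column ("target" is an illustrative label).
df["target"] = iris.target

# Show the first five rows to check the result.
print(df.head())
```

In recent scikit-learn versions, load_iris(as_frame=True) returns the same information with a ready-made dataframe in its frame attribute, which can serve as a cross-check.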


2.4. Exploring the data


Use the necessary methods to explore your data and determine which type (categorical or
numerical) each feature, including the target, falls under. Justify your
answer.

#write your answer here


…….
…….
…….
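
The following sketch shows a few pandas methods that could support your justification. It rebuilds the dataframe so that it runs on its own; the names df and "target" are illustrative assumptions.

```python
import pandas as pd
from sklearn.datasets import load_iris

iris = load_iris()
df = pd.DataFrame(iris.data, columns=iris.feature_names)
df["target"] = iris.target

# Column dtypes and non-null counts.
df.info()

# Summary statistics for the numerical columns.
print(df.describe())

# The target takes only a few distinct integer codes, one per class.
print(df["target"].unique())
print(df["target"].value_counts())
```

Note that a column stored as integers is not necessarily numerical in meaning: here the target values are class labels, so counting distinct values is often more informative than the dtype alone.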

3. Scikit-learn Pre-processing techniques


In this section, we will explore the preprocessing tools provided by scikit-learn.
Among the various techniques essential for machine learning, one commonly used
method is feature scaling, which involves adjusting features to a comparable scale.
Sklearn offers several methods for scaling features, but for our purposes we
will use StandardScaler.

INSTRUCTIONS:

➔ First find the module that contains StandardScaler and import it.
➔ Create a StandardScaler object.
➔ Scale the features by calling the fit_transform() method of StandardScaler.
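
The three steps above can be sketched as follows. This is a minimal illustration rather than the expected solution: the names X, scaler, X_scaled and manual are assumptions, and the manual computation is included only to show what standardisation does (subtract each column's mean and divide by its standard deviation).

```python
import numpy as np
from sklearn.datasets import load_iris
from sklearn.preprocessing import StandardScaler

# Feature matrix only; the target is not scaled.
X = load_iris().data

# Create the scaler, then learn the column statistics and apply them.
scaler = StandardScaler()
X_scaled = scaler.fit_transform(X)

# Same computation by hand: z = (x - column mean) / column std.
manual = (X - X.mean(axis=0)) / X.std(axis=0)

print(type(X_scaled))                  # inspect the returned type
print(np.allclose(X_scaled, manual))   # compare with the manual result
print(X_scaled.mean(axis=0))           # per-column means after scaling
print(X_scaled.std(axis=0))            # per-column std after scaling
```

After fitting, the learned statistics are available as scaler.mean_ and scaler.scale_.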

3.1. Import Scaler


You can import StandardScaler from the sklearn.preprocessing module.
#write your solution here
…….

3.2. Scale the features


First instantiate a scaler object, then call its fit_transform() method to
scale the features.

➔ What type of data is returned by the fit_transform() method?

#write your solution here


…….

3.3. Convert scaled data to DataFrame


To get a clearer view of the data, let us convert the scaled data back to a
dataframe. Compare your newly created dataframe with the original dataframe. How are
they different?

#write your solution here


…….
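
One possible way to do the conversion and the comparison is sketched below. The names df and df_scaled are illustrative assumptions, and the block rebuilds everything it needs so that it runs on its own.

```python
import pandas as pd
from sklearn.datasets import load_iris
from sklearn.preprocessing import StandardScaler

iris = load_iris()
df = pd.DataFrame(iris.data, columns=iris.feature_names)

# Scale the features, then wrap the result in a dataframe that
# reuses the original column names.
scaled = StandardScaler().fit_transform(df)
df_scaled = pd.DataFrame(scaled, columns=df.columns)

# Compare the first rows and the per-column mean and std.
print(df.head())
print(df_scaled.head())
print(df.describe().loc[["mean", "std"]])
print(df_scaled.describe().loc[["mean", "std"]])
```

The std reported by describe() for the scaled data is slightly above 1 because pandas uses the sample standard deviation (ddof=1), whereas StandardScaler divides by the population standard deviation.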

TODO: WORKING WITH SKLEARN DIABETES DATASET


INSTRUCTIONS:
➔ Load the sklearn diabetes dataset. Perform the necessary steps to load and view
the data, and compare it with the iris dataset. How is the diabetes dataset different
from the iris dataset? Justify your answer.

➔ Explore other scaling techniques available in sklearn and apply one to your
dataset. How is your chosen scaling technique different from
StandardScaler?
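
As a possible starting point only, the sketch below loads the diabetes dataset with load_diabetes() and applies MinMaxScaler. MinMaxScaler is an arbitrary example of an alternative scaler, not a choice required by this task, and the names diabetes, df_d and scaled are illustrative assumptions.

```python
import pandas as pd
from sklearn.datasets import load_diabetes
from sklearn.preprocessing import MinMaxScaler

# Load the dataset and build a dataframe, as was done for iris.
diabetes = load_diabetes()
df_d = pd.DataFrame(diabetes.data, columns=diabetes.feature_names)
df_d["target"] = diabetes.target

print(df_d.shape)
print(df_d.head())

# Unlike iris, the target takes many distinct values.
print(df_d["target"].nunique())

# MinMaxScaler rescales each feature to the interval [0, 1]
# using that feature's minimum and maximum.
scaled = MinMaxScaler().fit_transform(diabetes.data)
print(scaled.min(axis=0))
print(scaled.max(axis=0))
```

Other scalers in sklearn.preprocessing, such as RobustScaler and MaxAbsScaler, can be tried in the same way.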

THANK YOU
