
3/29/24, 3:55 PM Iris Dataset Analysis - Notebook by Swapnil Gupta (swapnilg4u) | Jovian


swapnilg4u / iris-dataset-analysis
3 years ago

Data Analysis on the IRIS Flower Dataset


Download the Iris flower dataset or any other dataset into a DataFrame (e.g.,
https://archive.ics.uci.edu/ml/datasets/Iris).
Use Python or R to perform the following:

1. How many features are there, and what are their types (e.g., numeric, nominal)?
2. Compute and display summary statistics for each feature in the dataset (e.g., minimum value,
maximum value, mean, range, standard deviation, variance, and percentiles).
3. Data visualization: create a histogram for each feature in the dataset to illustrate the feature distributions.
Plot each histogram.
4. Create a boxplot for each feature in the dataset. All of the boxplots should be combined into a single plot.
Compare distributions and identify outliers.
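As a rough illustration of task 2, the statistics listed there can be computed with pandas alone. This is a minimal sketch, not the notebook author's solution: it assumes the five rows shown in the notebook's df.head() output as inline sample data (so the numbers are not those of the full dataset), and the names `sample`, `numeric`, `summary`, and `percentiles` are hypothetical.

```python
import io

import pandas as pd

# Assumption: only the five rows visible in the notebook's df.head() output,
# embedded inline so the sketch runs without the CSV file.
sample = io.StringIO(
    "5.1,3.5,1.4,0.2,Iris-setosa\n"
    "4.9,3.0,1.4,0.2,Iris-setosa\n"
    "4.7,3.2,1.3,0.2,Iris-setosa\n"
    "4.6,3.1,1.5,0.2,Iris-setosa\n"
    "5.0,3.6,1.4,0.2,Iris-setosa\n"
)
df = pd.read_csv(sample, header=None)
df.columns = ["col1", "col2", "col3", "col4", "col5"]

# Summary statistics only make sense for the numeric features.
numeric = df.select_dtypes(include="number")

# One row per feature, one column per statistic.
summary = numeric.agg(["min", "max", "mean", "std", "var"]).T
summary["range"] = summary["max"] - summary["min"]

# 25th, 50th, and 75th percentiles, also one row per feature.
percentiles = numeric.quantile([0.25, 0.5, 0.75]).T
percentiles.columns = ["p25", "p50", "p75"]

summary = summary.join(percentiles)
print(summary)
```

Running the same steps on the full DataFrame loaded from the CSV file should give the statistics for all rows; `df.describe()` is a shorter alternative, though it does not report range or variance.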

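import numpy as np
import pandas as pd

# The CSV file has no header row, so assign column names manually.
df = pd.read_csv("iris-flower-dataset.csv", header=None)
df.columns = ["col1", "col2", "col3", "col4", "col5"]

# Preview the first five rows.
df.head()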

   col1  col2  col3  col4         col5
0   5.1   3.5   1.4   0.2  Iris-setosa
1   4.9   3.0   1.4   0.2  Iris-setosa
2   4.7   3.2   1.3   0.2  Iris-setosa
3   4.6   3.1   1.5   0.2  Iris-setosa
4   5.0   3.6   1.4   0.2  Iris-setosa

Q1 How many features are there and what are their types?
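One way to approach Q1 is to inspect the DataFrame's shape and column dtypes. The sketch below is an illustration under stated assumptions, not the notebook's own answer: it rebuilds a small DataFrame from the five rows shown in the df.head() output above, and the names `sample`, `numeric_cols`, and `nominal_cols` are hypothetical.

```python
import io

import pandas as pd

# Assumption: the five rows from the df.head() output, embedded inline
# so the sketch runs without the CSV file.
sample = io.StringIO(
    "5.1,3.5,1.4,0.2,Iris-setosa\n"
    "4.9,3.0,1.4,0.2,Iris-setosa\n"
    "4.7,3.2,1.3,0.2,Iris-setosa\n"
    "4.6,3.1,1.5,0.2,Iris-setosa\n"
    "5.0,3.6,1.4,0.2,Iris-setosa\n"
)
df = pd.read_csv(sample, header=None)
df.columns = ["col1", "col2", "col3", "col4", "col5"]

# shape is (rows, columns); the column count is the number of features.
n_rows, n_cols = df.shape
print("Number of columns:", n_cols)

# dtypes lists the storage type pandas inferred for each column.
print(df.dtypes)

# Split the columns into numeric and non-numeric (nominal) groups.
numeric_cols = df.select_dtypes(include="number").columns.tolist()
nominal_cols = df.select_dtypes(exclude="number").columns.tolist()
print("Numeric:", numeric_cols)
print("Nominal:", nominal_cols)
```

On this sample, the four measurement columns are numeric and the last column, which holds the species label, is nominal.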


