
import statistics

# Sample data
data = [1, 2, 3, 4, 5]
# Mean
mean = statistics.mean(data)
print("Mean: ", mean)

# Median
median = statistics.median(data)
print("Median: ", median)

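# Mode (every value in this sample is unique, so on Python 3.8+ mode()
# returns the first value, 1; earlier versions raise StatisticsError)
mode = statistics.mode(data)
print("Mode: ", mode)

# Sample variance (divides by n - 1; use pvariance() for population variance)
variance = statistics.variance(data)
print("Variance: ", variance)

# Sample standard deviation (square root of the sample variance)
stdev = statistics.stdev(data)
print("Standard deviation: ", stdev)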

matplotlib library:
import matplotlib.pyplot as plt
# Sample data
data = [1, 2, 2, 3, 3, 3, 4, 4, 4, 4, 5, 5]
# Create histogram
plt.hist(data, bins=5, edgecolor='black')
# Add labels and title
plt.xlabel('Value')
plt.ylabel('Frequency')
plt.title('Histogram of Sample Data')
# Show plot
plt.show()
In this example, we import the matplotlib.pyplot module and
create a list of sample data. We then use the hist() function to
create a histogram of the data, with 5 bins and black edges
around each bin. We add labels to the x and y axes and a title to
the plot using the xlabel(), ylabel(), and title() functions. Finally,
we use the show() function to display the plot.
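To make the meaning of "5 bins" concrete, the sketch below counts the same sample data into equal-width bins by hand, using only the standard library. The helper name bin_counts is hypothetical and not part of matplotlib. It assumes equal-width bins spanning the minimum to the maximum value, with the maximum counted in the last bin, which is how hist() behaves by default when bins is an integer.

```python
def bin_counts(data, bins):
    """Count values into `bins` equal-width bins spanning min(data)..max(data).

    Hypothetical helper for illustration; assumes at least two distinct values.
    """
    lo, hi = min(data), max(data)
    if lo == hi:
        raise ValueError("data must contain at least two distinct values")
    width = (hi - lo) / bins
    counts = [0] * bins
    for x in data:
        # The maximum value would land one past the end, so clamp it
        # into the last bin.
        index = min(int((x - lo) / width), bins - 1)
        counts[index] += 1
    return counts


data = [1, 2, 2, 3, 3, 3, 4, 4, 4, 4, 5, 5]
print(bin_counts(data, 5))  # [1, 2, 3, 4, 2]
```

Each count corresponds to the height of one bar in the histogram above.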
A box plot is a graphical representation of the five-number
summary of a dataset, which includes the minimum value, the
first quartile (Q1), the median, the third quartile (Q3), and the
maximum value. It is a commonly used tool in descriptive
statistics to visualize the distribution of a dataset and identify
potential outliers.
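The five-number summary can be computed with the same statistics module used earlier. This is a minimal sketch: the function name five_number_summary is hypothetical, statistics.quantiles() requires Python 3.8 or later, and the "inclusive" quartile method is an assumption. Other quartile conventions exist and can give slightly different Q1 and Q3 values.

```python
import statistics


def five_number_summary(data):
    """Return (minimum, Q1, median, Q3, maximum).

    Hypothetical helper; assumes data holds at least two values.
    """
    # n=4 splits the data into quartiles and returns the three cut points.
    q1, median, q3 = statistics.quantiles(data, n=4, method="inclusive")
    return min(data), q1, median, q3, max(data)


data = [1, 2, 2, 3, 3, 3, 4, 4, 4, 4, 5, 5]
print(five_number_summary(data))  # (1, 2.75, 3.5, 4.0, 5)
```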
The box in the box plot represents the interquartile range (IQR),
which is the range of the middle 50% of the data, from Q1 to Q3.
The length of the box therefore shows how spread out that middle
half is. The median is represented by a line inside the box. The
whiskers extend from the box to the minimum and maximum values
of the data. When outliers are present, each whisker instead stops
at the most extreme data point that is not an outlier.

Any data points that fall outside of the whiskers are considered
outliers and are plotted as individual points. By the usual
convention, a point is an outlier if it lies more than 1.5 times
the IQR below Q1 or above Q3.
Box plots can also be used to compare the distribution of
multiple datasets. In this case, the box plots are typically
displayed side-by-side to make comparisons easy.
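The following sketch shows one way to apply the 1.5 times IQR rule and to draw box plots side-by-side. The function names iqr_outliers and plot_side_by_side and the sample values are hypothetical, chosen only for illustration. The quartiles use the "inclusive" method of statistics.quantiles() (Python 3.8+), which is an assumption; matplotlib computes its own quartiles, so its boxes may differ slightly.

```python
import statistics


def iqr_outliers(data, k=1.5):
    """Return (lower_fence, upper_fence, outliers) using the k * IQR rule."""
    q1, _, q3 = statistics.quantiles(data, n=4, method="inclusive")
    iqr = q3 - q1
    lower = q1 - k * iqr
    upper = q3 + k * iqr
    # Anything beyond either fence counts as an outlier.
    outliers = [x for x in data if x < lower or x > upper]
    return lower, upper, outliers


def plot_side_by_side(datasets, labels):
    """Draw one box per dataset on a shared axis (requires matplotlib)."""
    import matplotlib.pyplot as plt

    fig, ax = plt.subplots()
    ax.boxplot(datasets)  # boxes are placed at x = 1, 2, ...
    ax.set_xticks(range(1, len(labels) + 1))
    ax.set_xticklabels(labels)
    ax.set_ylabel('Value')
    ax.set_title('Box Plots of Sample Data')
    plt.show()


# Hypothetical sample values; 40 lies far above the rest.
sample = [10, 12, 12, 13, 14, 15, 16, 40]
print(iqr_outliers(sample))  # (7.125, 20.125, [40])
```

Calling plot_side_by_side([sample, [1, 2, 2, 3, 3, 3, 4, 4, 4, 4, 5, 5]], ['A', 'B']) would display the two distributions next to each other, with 40 drawn as an individual point above the first box.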
