Lab #2

January 22

Submission Guidelines: Although group discussions are encouraged, each student must submit
their work individually to Crowdmark by the end of the day.

The concepts introduced in this tutorial will be vital for understanding future lectures and for
completing the upcoming assignment. Make sure you are comfortable with them before moving on
to more complex topics.

1 Introduction to Types of Plots


• Stem and Leaf Plot: A stem and leaf plot is a data visualization technique used to display
quantitative data. The 'stem' represents the leading digits, while the 'leaf' shows the trailing
digits.

• Bar Plot: A bar plot represents categorical data with rectangular bars. The length of each
bar is proportional to the count or frequency of the category it represents.

• Pie Chart: A pie chart displays categorical data in the form of a circle divided into sectors.
Each sector represents a category, and its size is proportional to the count or frequency of
that category.

• Histogram: A histogram is similar to a bar plot but is used for numeric (typically
continuous) data. The data are divided into bins (intervals), and the frequency or count of
data points in each bin is represented by the height of the bar.

• Polygon Plots: A polygon plot (frequency polygon) connects the midpoints of the tops of
the bars in a histogram with straight lines, creating a polygon. It is often used to highlight
the shape of the distribution of the data.
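
As a rough illustration of the stem and leaf idea described above, here is a minimal Python sketch. It assumes non-negative whole numbers, uses the tens digit(s) as the stem and the ones digit as the leaf, and runs on a made-up sample; the function name stem_and_leaf is hypothetical and not part of the lab material.

```python
from collections import defaultdict


def stem_and_leaf(values):
    """Return the rows of a stem and leaf plot as strings.

    Assumes non-negative integers; stem = tens digit(s), leaf = ones digit.
    """
    rows = defaultdict(list)
    for v in sorted(values):
        stem, leaf = divmod(v, 10)
        rows[stem].append(leaf)

    lines = []
    # Walk every stem from smallest to largest so empty stems still appear.
    for stem in range(min(rows), max(rows) + 1):
        leaves = " ".join(str(leaf) for leaf in rows.get(stem, []))
        lines.append(f"{stem} | {leaves}".rstrip())
    return lines


# Hypothetical sample, chosen only for illustration.
sample = [12, 15, 21, 23, 23, 30, 34, 47]
for line in stem_and_leaf(sample):
    print(line)
# 1 | 2 5
# 2 | 1 3 3
# 3 | 0 4
# 4 | 7
```

Reading the rows sideways gives roughly the same picture as a histogram with bins of width 10, which is why the two plots are often compared.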

2 Statistical Measures and Descriptive Statistics


• Sample Mean: The sample mean, often denoted as x̄, is the sum of all values in a sample
divided by the number of values in the sample.

• Standard Deviation: The standard deviation, denoted by σ for a population and s for a
sample, measures the amount of variation or dispersion in a set of values. A low standard
deviation means that the values tend to be close to the mean, while a high standard deviation
means the values are spread out over a wider range.

• Mode: The mode is the value that appears most frequently in a data set. A data set may
have one mode (unimodal), two modes (bimodal), or multiple modes (multimodal).

• Median: The median is the middle number in a sorted list of numbers. If the list has an odd
number of observations, the median is the middle number. If the list has an even number of
observations, the median is the average of the two middle numbers.

• Percentile: The Pth percentile is the value below which P percent of the data fall. For
example, the 20th percentile is the value below which 20% of the data fall.

• Q1 and Q3 (First and Third Quartiles): Q1 (25th percentile) is the middle value in the
first half of the sorted data set (about 25 percent of the data points are below it). Similarly,
Q3 (75th percentile) is the middle value in the second half of the sorted data set (about 75
percent of the data points are below it).

• Suspected Outliers: Outliers are data points that are significantly different from other
observations. They could be due to variability in the data or errors. A common rule of thumb
flags values more than 1.5 times the interquartile range below Q1 or above Q3. In a
box-and-whisker plot, outliers are usually drawn as points that fall outside the "whiskers."

• Box Plot: A box plot is a graphical representation of statistical data based on a five-number
summary (minimum, first quartile (Q1), median, third quartile (Q3), and maximum). It can
also show outliers. The box represents the interquartile range (the range between Q1 and
Q3), the line inside the box is the median, and the lines or "whiskers" extend to the smallest
and largest observations in the data (or, when outliers are plotted as separate points, to the
most extreme observations that are not outliers).
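
The measures above can be sketched in Python with the standard statistics module. This is only an illustration on a made-up sample, not the lab data, and it makes several assumptions: the quartiles are taken as the medians of the lower and upper halves of the sorted data (leaving out the middle value when the count is odd), and suspected outliers are flagged with the 1.5 × IQR rule of thumb. The helper name quartiles is hypothetical. If the lectures use a different convention, follow the lectures.

```python
import statistics

# Hypothetical sample, chosen only for illustration (not the lab data).
data = [2, 4, 4, 5, 7, 8, 9, 11, 12, 30]

mean = statistics.mean(data)        # sum of values / number of values
s = statistics.stdev(data)          # sample standard deviation (n - 1 divisor)
modes = statistics.multimode(data)  # list of the most frequent value(s)
median = statistics.median(data)    # averages the two middle values if n is even


def quartiles(values):
    """Q1 and Q3 as medians of the lower and upper halves (assumed convention)."""
    ordered = sorted(values)
    half = len(ordered) // 2
    lower = ordered[:half]    # first half
    upper = ordered[-half:]   # second half; skips the middle value when n is odd
    return statistics.median(lower), statistics.median(upper)


q1, q3 = quartiles(data)
iqr = q3 - q1

# Assumed 1.5 * IQR rule of thumb for suspected outliers.
lower_fence = q1 - 1.5 * iqr
upper_fence = q3 + 1.5 * iqr
suspected = [x for x in data if x < lower_fence or x > upper_fence]

# Whisker ends when outliers are drawn as separate points.
whisker_low = min(x for x in data if x >= lower_fence)
whisker_high = max(x for x in data if x <= upper_fence)

print("mean:", mean, "s:", round(s, 3), "mode(s):", modes, "median:", median)
print("Q1:", q1, "Q3:", q3, "IQR:", iqr)
print("suspected outliers:", suspected)
print("whiskers:", whisker_low, "to", whisker_high)
```

For this sample, Q1 is 4 and Q3 is 11, so the fences sit at -6.5 and 21.5, and only the value 30 is flagged. Other quartile conventions (for example, interpolation-based percentiles) can give slightly different Q1 and Q3 values for the same data.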

Question: Consider the following data set representing the number of hours studied by
students for a final exam: 10, 12, 12, 15, 15, 15, 15, 17, 18, 20, 20, 20, 23, 23, 26, 26, 26, 27,
27, 40.
You are expected to follow the materials and methods taught in the lectures. You
are not allowed to use Excel.

1. What is the sample mean of the data set?
2. Calculate the standard deviation of the number of hours studied by students.
3. What is the mode of the data set?
4. Compute the median of the data set.
5. Determine the first and third quartiles of the data set.
6. Find the 65th percentile of the data set.
7. Are there any suspected outliers in this data set? If yes, which ones?
8. Construct a box plot (sketched by hand) for the given data set.
