
Name : Md Ujale

Registration No. : 12002485


Faculty Member : Dr. Pooja Kansra
Course Code : QTT509
Course Title : Statistical Analysis for Decision Making
Program : P370- NN2E: MBA (International Business)
Trimester : 1st
Section : Q2050
Batch : 2020
Date of Submission : 13th September 2020
Metropolitan_Area Commute Time
Abilene, TX 37.57
Akron, OH 49.38
Albany, GA 44.52
Albany-Schenectady-Troy, NY 48.37
Albuquerque, NM 48.85
Alexandria, LA 54.63
Allentown-Bethlehem-Easton, PA 53.55
Altoona, PA 43.37
Amarillo, TX 39.68
Ames, IA 35.97
Anchorage, AK 47.98
Anderson, IN 49.92
Anderson, SC 51.22
Ann Arbor, MI 47.00
Anniston-Oxford, AL 50.72
Appleton, WI 38.93
Asheville, NC 46.83
Athens-Clarke County, GA 46.92
Atlanta-Sandy Springs-Marietta, GA 66.33
Atlantic City, NJ 51.07

1. Create a histogram of the daily commute times and interpret the results?
Ans:-
Min   Max   Midpoint   Frequency (f)   Relative Freq. (f/total)
30    40    35         4               0.20
40    50    45         10              0.50
50    60    55         5               0.25
60    70    65         1               0.05
Total = 20
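
The frequency table above can be reproduced with a short script. This is only a sketch: it assumes each class includes its lower bound but not its upper bound (so 40 would count in 40-50), which the assignment does not state, and the variable names are illustrative.

```python
# Commute times copied from the table of 20 metropolitan areas.
commute_times = [
    37.57, 49.38, 44.52, 48.37, 48.85, 54.63, 53.55, 43.37, 39.68, 35.97,
    47.98, 49.92, 51.22, 47.00, 50.72, 38.93, 46.83, 46.92, 66.33, 51.07,
]

# Class boundaries; assumption: a class includes its lower bound only.
edges = [30, 40, 50, 60, 70]

# Count how many values fall in each class.
counts = [0] * (len(edges) - 1)
for t in commute_times:
    for i in range(len(edges) - 1):
        if edges[i] <= t < edges[i + 1]:
            counts[i] += 1
            break

# Relative frequency = class frequency / total number of values.
relative = [c / len(commute_times) for c in counts]

for i, (c, r) in enumerate(zip(counts, relative)):
    midpoint = (edges[i] + edges[i + 1]) / 2
    print(f"{edges[i]}-{edges[i + 1]}  midpoint={midpoint:g}  f={c}  f/total={r:.2f}")
```

The counts come out as 4, 10, 5 and 1, matching the table.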

[Figure: histogram titled "Metropolitan Areas"; x-axis: class midpoint (35, 45, 55, 65); y-axis: frequency (f); bar heights 4, 10, 5 and 1]
2. Find the most representative average daily commute time across this distribution
and reason behind the same?
Ans:- The mean of the 20 given values is 47.64, which I take as the most representative
average daily commute time across these metropolitan areas.

Metropolitan_Area Commute Time


Abilene, TX 37.57
Akron, OH 49.38
Albany, GA 44.52
Albany-Schenectady-Troy, NY 48.37
Albuquerque, NM 48.85
Alexandria, LA 54.63
Allentown-Bethlehem-Easton, PA 53.55
Altoona, PA 43.37
Amarillo, TX 39.68
Ames, IA 35.97
Anchorage, AK 47.98
Anderson, IN 49.92
Anderson, SC 51.22
Ann Arbor, MI 47.00
Anniston-Oxford, AL 50.72
Appleton, WI 38.93
Asheville, NC 46.83
Athens-Clarke County, GA 46.92
Atlanta-Sandy Springs-Marietta, GA 66.33
Atlantic City, NJ 51.07
Average Daily Commute 47.64
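
The average in the last row of the table can be checked with Python's standard library. This is a minimal sketch; the variable names are illustrative, and the values are copied from the table.

```python
from statistics import mean

# Commute times copied from the table of 20 metropolitan areas.
commute_times = [
    37.57, 49.38, 44.52, 48.37, 48.85, 54.63, 53.55, 43.37, 39.68, 35.97,
    47.98, 49.92, 51.22, 47.00, 50.72, 38.93, 46.83, 46.92, 66.33, 51.07,
]

# Arithmetic mean = sum of the values / number of values.
mean_commute = mean(commute_times)
print(f"Average daily commute: {mean_commute:.2f}")
```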

3. Find a useful measure of the variability of these average commute times around
the mean?

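Ans:- A useful measure of the variability of these average commute times around the mean is
the standard deviation, which is 6.64 for this data. The median, by contrast, is a measure of
central tendency rather than variability; because the distribution shape is skewed to the
right, it is a better measure of central tendency than the mean, and the value of the median
from the grouped data is 46.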

C.I.    Midpoint   Frequency (f)   Cumulative Freq.
30-40   35         4               4
40-50   45         10              14
50-60   55         5               19
60-70   65         1               20

l = 40, n = 20, cf = 4, f = 10, h = 10
Median = l + ((n/2 - cf) / f) * h
       = 40 + ((20/2 - 4) / 10) * 10
Median = 46
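
The grouped-data median calculation above can be sketched as a small function. Here l is read as the lower limit of the median class, n as the total frequency, cf as the cumulative frequency before the median class, f as the frequency of the median class and h as the class width; the function name grouped_median is hypothetical and not part of the assignment.

```python
def grouped_median(classes):
    """Median of grouped data; classes is a list of (lower, upper, frequency)."""
    n = sum(f for _, _, f in classes)
    cf = 0  # cumulative frequency of the classes before the current one
    for lower, upper, f in classes:
        # The median class is the first class whose cumulative frequency reaches n/2.
        if f > 0 and cf + f >= n / 2:
            h = upper - lower  # class width
            return lower + (n / 2 - cf) * h / f
        cf += f
    raise ValueError("no non-empty classes given")


# Class intervals and frequencies from the table above.
classes = [(30, 40, 4), (40, 50, 10), (50, 60, 5), (60, 70, 1)]
median = grouped_median(classes)
print(median)
```

The formula interpolates within the median class, so the result (46) is an estimate from the grouped table and need not equal the median of the 20 raw values.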

4. The empirical rule for standard deviations indicates that approximately 95% of
these average travel times will fall between which two values? For this particular
data set, is this empirical rule at least approximately correct?

Ans:- The empirical rule for standard deviations indicates that approximately 95% of these
average travel times will fall between (Mean – 2*SD) and (Mean + 2*SD), that is, between
(47.64 – 2*6.64) = 34.36 and (47.64 + 2*6.64) = 60.92. I got the SD (6.64) by solving in my
notebook. The empirical rule gives a substantial meaning to standard deviation for
symmetrical and bell-shaped distributions, and these average commute times are somewhat
skewed to the right, so the rule cannot be assumed to hold exactly. For this particular data
set, however, 19 of the 20 values (all except Atlanta-Sandy Springs-Marietta, GA at 66.33)
fall between 34.36 and 60.92, which is 95%, so the empirical rule is at least approximately
correct here.
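
One way to check the interval against the data is to count how many of the 20 values fall inside it. The sketch below assumes the population standard deviation (dividing by n) is the one intended, since it comes to about 6.65 and agrees with the 6.64 quoted above up to rounding; the sample standard deviation (dividing by n - 1) would be slightly larger.

```python
from statistics import mean, pstdev

# Commute times copied from the table of 20 metropolitan areas.
commute_times = [
    37.57, 49.38, 44.52, 48.37, 48.85, 54.63, 53.55, 43.37, 39.68, 35.97,
    47.98, 49.92, 51.22, 47.00, 50.72, 38.93, 46.83, 46.92, 66.33, 51.07,
]

m = mean(commute_times)
sd = pstdev(commute_times)  # assumption: population SD, dividing by n

# Empirical rule: about 95% of values lie within two SDs of the mean.
lower, upper = m - 2 * sd, m + 2 * sd
inside = [t for t in commute_times if lower <= t <= upper]
share = len(inside) / len(commute_times)

print(f"interval: {lower:.2f} to {upper:.2f}")
print(f"values inside: {len(inside)} of {len(commute_times)} ({share:.0%})")
```

On this data 19 of the 20 values fall inside the interval; only the largest value, 66.33, lies outside.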

5. A researcher is interested in determining whether there is a relationship between


the number of room air conditioning units sold each week and the time of year.
What type of descriptive chart would be most useful in performing this analysis?
Explain your choice.

Ans:- According to me, a time series chart (a line chart with the weeks of the year on the
horizontal axis and the number of units sold each week on the vertical axis) would be most
useful in performing this analysis. It is a descriptive chart: with descriptive statistics you
are simply describing what the data shows, whereas with inferential statistics you are trying
to reach conclusions that extend beyond the immediate data alone. Plotting the weekly sales
in time order makes any seasonal pattern visible, and so it would help in describing the data
and analyzing the relationship between the number of room air conditioning units sold each
week and the time of year.
6. Explain why the standard deviation would likely not be a reliable measure of
variability for a distribution of data that includes at least one extreme outlier?

Ans:- Standard deviation is a measure that summarizes the amount by which every value
within a dataset varies from the mean. It is usually presented in conjunction with the mean
and is measured in the same units. Mathematically, standard deviation is the square root of
the variance of the data. When the values in a dataset are tightly bunched together the SD is
small, and when the values are spread apart the SD is relatively large. Because each
deviation from the mean is squared, a single extreme outlier contributes disproportionately
to the variance and also pulls the mean toward itself, so the SD is inflated and no longer
reflects the spread of the typical values. That is why the standard deviation would likely not
be a reliable measure of variability for a distribution of data that includes at least one
extreme outlier.
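
The sensitivity of the standard deviation to a single large value can be illustrated with the commute data from the earlier questions. In this sketch the largest value (66.33) is treated as the outlier purely for illustration, and the population standard deviation is assumed; neither choice comes from the assignment.

```python
from statistics import pstdev

# Commute times copied from the table of 20 metropolitan areas.
commute_times = [
    37.57, 49.38, 44.52, 48.37, 48.85, 54.63, 53.55, 43.37, 39.68, 35.97,
    47.98, 49.92, 51.22, 47.00, 50.72, 38.93, 46.83, 46.92, 66.33, 51.07,
]

# SD of all 20 values (population SD, dividing by n).
sd_all = pstdev(commute_times)

# SD after dropping the single largest value, treated here as the outlier.
largest = max(commute_times)
without_max = [t for t in commute_times if t != largest]
sd_without = pstdev(without_max)

print(f"SD with largest value:    {sd_all:.2f}")
print(f"SD without largest value: {sd_without:.2f}")
```

Removing one value out of 20 lowers the SD by more than one unit, which shows how strongly a single far-off observation drives this measure.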
