Chapter 4
Data Management
by Rebecca C. Tolentino
Chapter Outline
4.1 Introduction
4.2 Descriptive Statistics
Learning Objectives
4.1 Introduction
Data management pertains to the “practice of managing data as a valuable resource to unlock
its potential for an organization” (SAS, 2020). This practice is essential in the digital age, when big
data is produced every day. Statistics is one of the tools that aid in the effective management of data.
Several software packages are now equipped with statistical functions. These packages generate
statistical measures that are used in decision making. One of them is Excel, which will be used
extensively in this chapter.
The following are the most commonly used descriptive statistics and their equivalent syntax
in Excel.
Chapter 4. Data Management
Definition 4.3.2. The correlation coefficient is a measure of the relative strength of a linear
relationship between two numerical variables. Its value ranges from -1 (a perfect negative
correlation) to +1 (a perfect positive correlation).
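As an illustration of this definition, the following is a minimal Python sketch that computes the correlation coefficient from deviations about the means. The function name `pearson_r` is hypothetical, and the X and Y values are borrowed from the data listed in Exercise 4.2; Excel's CORREL function is expected to give the same figure for the same data.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation coefficient computed from its definition."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of cross-products and sums of squared deviations from the means
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


# X and Y values borrowed from Exercise 4.2 (used here only as sample input)
X = [12, 11, 13, 5, 19, 14, 17, 6, 17, 14, 18, 7, 8, 18, 14]
Y = [20, 19, 24, 14, 27, 22, 25, 14, 22, 26, 26, 15, 15, 23, 20]

r = pearson_r(X, Y)
print(round(r, 4))
```

A value close to +1 indicates a strong positive linear relationship, and a value close to -1 indicates a strong negative one.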
Definition 4.3.4. The coefficient of determination (r²) is the proportion of the total variation in the
dependent variable (Y) that is explained or accounted for by the variation in the independent variable
(X).
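The idea of "proportion of variation explained" can be sketched in Python as the ratio of the variation of the fitted values to the total variation of Y. This sketch assumes a least-squares line of the usual form Y' = a + bX; the function name `r_squared` is hypothetical, and the sample values are borrowed from Exercise 4.2.

```python
def r_squared(xs, ys):
    """Explained variation divided by total variation for a fitted line."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Least-squares slope (b) and intercept (a) of the assumed line Y' = a + bX
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    b = sxy / sxx
    a = mean_y - b * mean_x
    predicted = [a + b * x for x in xs]
    # Total variation in Y and the part of it reproduced by the fitted values
    total_variation = sum((y - mean_y) ** 2 for y in ys)
    explained_variation = sum((p - mean_y) ** 2 for p in predicted)
    return explained_variation / total_variation


# X and Y values borrowed from Exercise 4.2 (used here only as sample input)
X = [12, 11, 13, 5, 19, 14, 17, 6, 17, 14, 18, 7, 8, 18, 14]
Y = [20, 19, 24, 14, 27, 22, 25, 14, 22, 26, 26, 15, 15, 23, 20]

r2 = r_squared(X, Y)
print(round(r2, 4))
```

For a regression with a single independent variable, this ratio equals the square of the correlation coefficient, which is why it is written r².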
Definition 4.3.5. Regression analysis is carried out to develop a model to predict the values of a
dependent variable (Y), based on the value of the independent variable (X).
The Independent Variable, denoted by X, provides the basis for estimation. It is the predictor
variable.
where
b is the slope of the line, or the average change in Y’ for each change of one unit in X.
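To make the role of the slope concrete, here is a minimal Python sketch of a least-squares fit. It assumes the regression line has the usual form Y' = a + bX, with a as the regression constant (intercept); the function name `fit_line` is hypothetical, and the sample values are borrowed from Exercise 4.2.

```python
def fit_line(xs, ys):
    """Return (a, b) for the least-squares line Y' = a + bX."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    b = sxy / sxx            # slope: average change in Y' per unit change in X
    a = mean_y - b * mean_x  # intercept: value of Y' when X is 0
    return a, b


# X and Y values borrowed from Exercise 4.2 (used here only as sample input)
X = [12, 11, 13, 5, 19, 14, 17, 6, 17, 14, 18, 7, 8, 18, 14]
Y = [20, 19, 24, 14, 27, 22, 25, 14, 22, 26, 26, 15, 15, 23, 20]

a, b = fit_line(X, Y)
print(round(a, 3), round(b, 3))

# Predicted Y' for a chosen X, for example X = 10
print(round(a + b * 10, 2))
```

The fitted line always passes through the point formed by the mean of X and the mean of Y, which is a quick way to check the computed values of a and b.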
Suppose the X values are in cells B2 to B32 and the Y values are in cells C2 to C32.
(Regression Constant)
Pearson’s r    Correlation coefficient for two interval or ratio-scaled variables    =CORREL(B2:B32,C2:C32)
• Calculating The Standard Deviation, Mean, Median, Mode, Range, & Variance
Using Excel. https://www.youtube.com/watch?v=k17_euuiTKw
Exercise 4.1
Descriptive Statistics
Name: ________________________________________________________
Score:
Course-Block: _________________ Schedule: ________________________
Professor: _____________________________________________________
2. A travelling salesman checks the prices of gasoline in gas stations within his area of
assignment. The following are the prices per liter of unleaded gasoline in a sample of 15
gasoline stations in his area:
50.15 51.89 48.84 51.87 46.59 51.61 49.54 47.98
50.96 51.22 51.08 50.88 51.94 46.50 47.90
4. A travelling salesman checks the prices of gasoline in gas stations within his area of
assignment. The following are the prices per liter of unleaded gasoline in a sample of 15
gasoline stations in his area:
50.15 51.89 48.84 51.87 46.59 51.61 49.54 47.98
50.96 51.22 51.08 50.88 51.94 46.50 47.90
5. A commuter from Cavite travels daily to work in Manila each morning. He records his travel
time (in minutes) during the last two weeks as follows:
          Mon   Tue   Wed   Thurs   Fri
Week 1    104    84    62      97    70
Week 2    115    54    74     101   108
b. Compute the range, interquartile range, variance, standard deviation, and coefficient of
variation.
c. What would you tell a person who asks how long it would take to commute from Cavite to
Manila in the morning?
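The measures of variation named in item (b) can be sketched in Python with the standard `statistics` module, using the commuter's travel times above as input. Two choices here are assumptions: the variance and standard deviation are treated as sample measures (divisor n - 1), and the quartiles use the "exclusive" method, which is expected to correspond to Excel's QUARTILE.EXC. A different quartile convention would give a different interquartile range.

```python
import statistics

# Travel times in minutes, Week 1 followed by Week 2
travel_times = [104, 84, 62, 97, 70, 115, 54, 74, 101, 108]

# Range: largest value minus smallest value
data_range = max(travel_times) - min(travel_times)

# Interquartile range: Q3 minus Q1 (exclusive method, an assumption)
q1, _, q3 = statistics.quantiles(travel_times, n=4, method="exclusive")
iqr = q3 - q1

# Sample variance and sample standard deviation (divisor n - 1)
variance = statistics.variance(travel_times)
std_dev = statistics.stdev(travel_times)

# Coefficient of variation: standard deviation as a percentage of the mean
cv = std_dev / statistics.mean(travel_times) * 100

print("Range:", data_range)
print("IQR:", iqr)
print("Variance:", round(variance, 2))
print("Standard deviation:", round(std_dev, 2))
print("Coefficient of variation (%):", round(cv, 2))
```

If the ten travel times were treated as a complete population rather than a sample, `statistics.pvariance` and `statistics.pstdev` would be the matching functions.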
6. One of the major issues in customer service is the speed with which a company responds to
customer complaints. The manager of a telecommunication company aims to establish baseline
data on how long the company takes to respond to customer complaints. The data will
be used as a reference for a new system they want to adopt. The following data from a
random sample of 25 complaints represent the number of days between the receipt of a
complaint and the resolution of the complaint:
b. Compute the range, interquartile range, variance, standard deviation, and coefficient of
variation.
c. On the basis of the results of (a) and (b), if you had to tell the president of the company how
long a customer should expect to wait to have a complaint resolved, what would you say?
Explain.
Exercise 4.2
Linear Regression and Correlation
Name: ________________________________________________________
Score:
Course-Block: _________________ Schedule: ________________________
Professor: _____________________________________________________
X 12 11 13 5 19 14 17 6 17 14 18 7 8 18 14
Y 20 19 24 14 27 22 25 14 22 26 26 15 15 23 20
4. A college faculty member collected data on his students’ general weighted average (GWA) in
the first semester and their high school average grade.
GWA 2.06 2.08 2.11 1.52 1.62 1.47 2.18 1.7 1.85 1.69
HS grade 92 93 85 87 89 89 89 85 88 95
GWA 2.03 2 1.46 1.27 2.06 1.26 2.14 2.2 1.29 1.96
HS grade 85 92 87 92 91 85 91 92 91 85
6. In the study in which a college faculty member collected data on his students’ general weighted
average in the first semester and their high school average grade, if a regression equation is
developed for GWA as a function of high school average,
7. In the study conducted by a Mathematics faculty member on the number of hours a student
spent in the online classroom and his score in the assessment test, if a regression model is
developed for the score in the assessment test based on the number of hours a student spent in
the online classroom,
References
Berenson, M.L., Levine, D.M. & Krehbiel, T.C. (2012). Basic Business Statistics: Concepts and
Applications (12th Edition). Prentice Hall.
Lind, D.A., Marchal, W.G. & Wathen, S.A. (2012). Basic Statistics for Business and Economics (8th Edition).
McGraw Hill.
Mann, P.S. (2010). Introductory Statistics. John Wiley & Sons, Inc.