
Instructor: Shawn Michael Dela Rosa

Data Management

When you take a course in college, it is natural
to wonder how you will do compared to the
other students. Will you finish in the top 10% or
will you be closer to the middle? One statistic
that is often used to measure the position of a
data value with respect to other values is known
as the z-score or the standard score.

• The z-score for a given data value x is the number of
standard deviations that x is above or below the mean of
the data.

• A negative z-score represents a value less than the mean.

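• A positive z-score represents a value greater than the
mean. When z = 0, the data value is equal to the mean.

• Population: $z_x = \dfrac{x - \mu}{\sigma}$    Sample: $z_x = \dfrac{x - \bar{x}}{s}$
Here $\mu$ and $\sigma$ are the population mean and standard deviation,
and $\bar{x}$ and $s$ are the sample mean and standard deviation.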

Raul has taken two tests in his chemistry class. He scored 72 on the
first test, for which the mean of all scores was 65 and the standard
deviation was 8. He received a 60 on a second test, for which the
mean of all scores was 45 and the standard deviation was 12. In
comparison to the other students, did Raul do better on the first
test or the second test?
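
The comparison can be checked with a short sketch. The function name `z_score` and the variable names are illustrative choices, not part of the original notes; the numbers come from the problem statement.

```python
def z_score(x, mean, std_dev):
    """Number of standard deviations that x lies above (+) or below (-) the mean."""
    return (x - mean) / std_dev

# Raul's two chemistry tests: (score, class mean, class standard deviation)
z_test1 = z_score(72, 65, 8)
z_test2 = z_score(60, 45, 12)

# The larger z-score marks the better result relative to the other students.
better = "second" if z_test2 > z_test1 else "first"
print(f"z (test 1) = {z_test1}, z (test 2) = {z_test2}, better test: {better}")
```

Raul's second score is 1.25 standard deviations above its mean, against 0.875 for the first, so he did better on the second test relative to the class even though the raw score is lower.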

• One of the most important statistical distributions of data is known
as a normal distribution.

• This distribution occurs in a variety of applications.

• Types of data that may demonstrate a normal distribution include
the lengths of leaves on a tree, the weights of newborns in a
hospital, the lengths of time of a student’s trip from home to school
over a period of months, the SAT scores of a large group of
students, and the life spans of light bulbs.

Properties of a Normal Distribution

• The graph is symmetric about a vertical line through the mean of
the distribution.

• The mean, median, and mode are equal.

• Areas under the curve that are symmetric about the mean are
equal.

• The total area under the curve is 1.

• It is often helpful to convert data values x to z-scores,
as we did previously using the z-score formula.

• If the original distribution of x values is a normal
distribution, then the corresponding distribution
of z-scores will also be a normal distribution.

• This normal distribution of z-scores is called the
standard normal distribution.

• It has a mean of 0 and a standard deviation of 1.
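
As a small illustration of the last point, the sketch below converts a data set to z-scores and checks that the result has mean 0 and standard deviation 1. The data values are hypothetical and chosen only for the demonstration. It shows the mean and standard deviation property only; it does not test whether the data are normal.

```python
from statistics import mean, pstdev

# Hypothetical data values, used only to illustrate standardization.
data = [4.0, 8.0, 6.0, 5.0, 7.0]

mu = mean(data)        # mean of the data, treated as a population
sigma = pstdev(data)   # population standard deviation

# Convert every data value x to its z-score.
z_scores = [(x - mu) / sigma for x in data]

z_mean = mean(z_scores)
z_std = pstdev(z_scores)
print(f"mean of z-scores = {z_mean:.6f}, standard deviation = {z_std:.6f}")
```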

A soda machine dispenses soda into 12-ounce cups. Tests show that the
actual amount of soda dispensed is normally distributed, with a mean of 11.5
oz and a standard deviation of 0.2 oz.
a. What percent of cups will receive less than 11.25 oz of soda?
b. What percent of cups will receive between 11.0 oz and 11.55 oz of soda?
c. What percent of cups will receive more than 11.25 oz of soda?
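
One way to check the three answers is with `statistics.NormalDist` from the Python standard library (Python 3.8 or later). This is a sketch, not the table-based method the course may expect; answers read from a printed z-table can differ slightly in the last digit because of rounding. The variable names are illustrative.

```python
from statistics import NormalDist

# Amount dispensed: normal with mean 11.5 oz and standard deviation 0.2 oz.
soda = NormalDist(mu=11.5, sigma=0.2)

# a. P(X < 11.25): area under the curve to the left of 11.25
p_less = soda.cdf(11.25)

# b. P(11.0 < X < 11.55): difference of two left-hand areas
p_between = soda.cdf(11.55) - soda.cdf(11.0)

# c. P(X > 11.25): complement of part a, since the total area is 1
p_more = 1 - p_less

print(f"a. {p_less:.1%}   b. {p_between:.1%}   c. {p_more:.1%}")
```

Part c relies on the property that the total area under the curve is 1, so it needs no separate lookup.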

MMW 2024 Page 1


The average speed of a car on NLEX is 90 kph,
with a standard deviation of 8 kph. Assuming the
speeds are normally distributed, find the
probability that the speed is less than 75 kph.
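
A possible check, assuming the speeds follow a normal distribution (the problem needs this assumption to be solvable with z-scores). The sketch first converts 75 kph to a z-score and then looks up the area to its left on the standard normal curve; variable names are illustrative.

```python
from statistics import NormalDist

mean_speed = 90.0   # kph
std_speed = 8.0     # kph

# Step 1: z-score of 75 kph
z = (75.0 - mean_speed) / std_speed

# Step 2: area to the left of z under the standard normal curve
p_slow = NormalDist().cdf(z)

print(f"z = {z}, P(speed < 75 kph) = {p_slow:.4f}")
```

The z-score is -1.875, and the probability comes out to roughly 3%.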

A consumer group tested a sample of 100 light
bulbs. It found that the mean life expectancy of the
bulbs was 842 h, with a standard deviation of 90 h.
One particular light bulb from the DuraBright
Company had a z-score of 1.2. What was the life
span of this light bulb?
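
Here the z-score is known and the data value is not, so the sample formula z = (x - mean) / s is solved for x, giving x = mean + z * s. A minimal sketch, with the helper name `value_from_z` chosen for illustration:

```python
def value_from_z(z, mean, std_dev):
    """Invert the z-score formula: x = mean + z * std_dev."""
    return mean + z * std_dev

# Sample mean 842 h, sample standard deviation 90 h, z-score 1.2
life_span = value_from_z(1.2, 842, 90)
print(f"life span is about {life_span:.0f} hours")
```

The bulb lasted about 950 hours, which is 108 hours (1.2 standard deviations) above the sample mean.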

CORRELATION
• Correlation is a statistical method used to
determine whether a relationship between
variables exists. A variable here is a
characteristic of the population being observed
or measured.

• A simple linear relationship can be positive or
negative. A positive relationship exists when
both variables increase at the same time or
both decrease at the same time. In contrast,
in a negative relationship, as one variable
increases, the other variable decreases, or
vice versa.

• It is the most widely used method in statistics for
measuring the degree of the relationship between
linearly related variables.

• It requires both variables to be normally
distributed.

• Correlation refers to the departure of two
random variables from independence.

• The correlation coefficient (Pearson’s r) is a
measure of the strength of the linear association
between two variables. It was developed by Karl
Pearson.

• The value of the correlation coefficient varies
between -1 and +1.

• Guilford’s suggested interpretation for the value
of r:

Value of r       Interpretation
Less than 0.20   Slight; almost negligible relationship
0.20-0.40        Low correlation; definite but small relationship
0.41-0.70        Moderate correlation; substantial relationship
0.71-0.90        High correlation; marked relationship
0.91-1.00        Very high correlation; very dependable relationship
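
The interpretation table can be turned into a small lookup, sketched below. Two choices here are assumptions rather than part of the notes: the size of r is taken as its absolute value so that negative correlations are classified by strength, and values that fall between two listed bands (for example 0.405) are assigned to the higher band. The function name is hypothetical.

```python
def guilford_interpretation(r):
    """Return Guilford's suggested wording for a correlation coefficient r."""
    size = abs(r)  # assumption: strength is judged on the magnitude of r
    if size > 1:
        raise ValueError("a correlation coefficient must lie between -1 and +1")
    if size < 0.20:
        return "slight; almost negligible relationship"
    if size <= 0.40:
        return "low correlation; definite but small relationship"
    if size <= 0.70:
        return "moderate correlation; substantial relationship"
    if size <= 0.90:
        return "high correlation; marked relationship"
    return "very high correlation; very dependable relationship"

print(guilford_interpretation(0.95))
print(guilford_interpretation(-0.50))
```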


• Example: The owner of a chain of fruit shake stores would like to
study the correlation between atmospheric temperature and
sales during the summer season. A random sample of 6 days is
selected, with the results given as follows. Compute the
coefficient of correlation.

Day           1     2     3     4     5     6
Temperature   79    76    78    84    90    83
Sales         147   143   147   168   206   155
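
The notes do not show the formula used, so the sketch below assumes the standard computational form of Pearson's r: r = (nΣxy - ΣxΣy) / sqrt((nΣx² - (Σx)²)(nΣy² - (Σy)²)). The function and variable names are illustrative; the data are the six days from the table.

```python
from math import sqrt

temperature = [79, 76, 78, 84, 90, 83]
sales = [147, 143, 147, 168, 206, 155]

def pearson_r(x, y):
    """Pearson's r from the sums of x, y, xy, x squared, and y squared."""
    n = len(x)
    sum_x, sum_y = sum(x), sum(y)
    sum_xy = sum(a * b for a, b in zip(x, y))
    sum_x2 = sum(a * a for a in x)
    sum_y2 = sum(b * b for b in y)
    numerator = n * sum_xy - sum_x * sum_y
    denominator = sqrt((n * sum_x2 - sum_x ** 2) * (n * sum_y2 - sum_y ** 2))
    return numerator / denominator

r = pearson_r(temperature, sales)
print(f"r = {r:.3f}")
```

For these data the sums are Σx = 490, Σy = 966, Σxy = 79464, Σx² = 40146, and Σy² = 158352, which gives r of about 0.949. On Guilford's scale that is a very high correlation between temperature and sales.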

REGRESSION

• Regression analysis is a statistical method used
to describe the nature of the relationship
between variables. There are two types of
regression analysis: simple and multiple.

• In simple linear regression, there are two
variables: an independent (predictor) variable
and a dependent (response) variable. In multiple
linear regression, on the other hand, two or more
independent variables are used to predict the
dependent variable.

LINEAR REGRESSION
• It is a simple statistical tool used to model the
dependence of a variable on one (or more)
explanatory variables.

• This functional relationship may then be formally
stated as an equation, with associated values
that describe how well this equation fits the
data.

• Example: The owner of a chain of fruit shake stores would like to
study the correlation between atmospheric temperature and
sales during the summer season. A random sample of 6 days is
selected, with the results given as follows. Find the linear
regression equation.

Day           1     2     3     4     5     6
Temperature   79    76    78    84    90    83
Sales         147   143   147   168   206   155
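
A sketch of one way to find the equation, assuming the usual least-squares line y = a + bx with slope b = (nΣxy - ΣxΣy) / (nΣx² - (Σx)²) and intercept a = (Σy - bΣx) / n. The notes do not state which formulas the course uses, so treat these, and the names below, as assumptions.

```python
temperature = [79, 76, 78, 84, 90, 83]   # independent (predictor) variable x
sales = [147, 143, 147, 168, 206, 155]   # dependent (response) variable y

def least_squares_line(x, y):
    """Return (intercept a, slope b) of the least-squares line y = a + b*x."""
    n = len(x)
    sum_x, sum_y = sum(x), sum(y)
    sum_xy = sum(p * q for p, q in zip(x, y))
    sum_x2 = sum(p * p for p in x)
    slope = (n * sum_xy - sum_x * sum_y) / (n * sum_x2 - sum_x ** 2)
    intercept = (sum_y - slope * sum_x) / n
    return intercept, slope

a, b = least_squares_line(temperature, sales)
print(f"sales = {a:.2f} + {b:.3f} * temperature")

# Using the fitted line to predict sales at a temperature of 80
predicted_sales = a + b * 80
print(f"predicted sales at 80: {predicted_sales:.1f}")
```

With these six days the slope is about 4.438 and the intercept about -201.45, so each additional degree of temperature is associated with roughly 4.4 more units of sales within the observed range of 76 to 90.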
