UNIT 1: INTRODUCTION TO STATISTICAL ANALYSIS

Have you ever asked yourself questions such as: How can I determine whether a process
is out of control? How can I know that a process is improving? How do I improve my
products and services? What consumer values are prevalent? If you have asked yourself
such questions, then statistics will be of great help. Statistical analysis provides
techniques that support decision making.

No one can foretell the outcome of a decision with certainty, so we rely mostly on
probabilities. However, trends can be projected from observed data, and those data are
measured and analyzed using statistical procedures.

1.1 Level of Data Measurement

Around the world, millions of researchers benefit from the numerical data gathered
in business every day. However, these data should not all be treated alike, because they
are not all of the same kind. For example, data on revenues, height, weight, temperatures,
ratings, scores, and cell phone numbers are different from each other.

For this reason, the researcher needs to classify data according to their level of
measurement. This classification is important to avoid confusion. For example, taking
the average of daily revenues is meaningful, but averaging cell phone numbers makes no
sense. Knowing the levels of data therefore helps you decide which statistical analysis
is appropriate.

What are the levels of data measurement?


There are four scales of measurement: nominal, ordinal, interval, and ratio data. The nominal
scale data are of the lowest level, and ratio scale data are of the highest level.

1. Nominal Level. This type of data can be used only to classify or to label, for it has no
quantitative attributes. This is the lowest level of data because it cannot be quantified.
Covid-19 patient numbers (used only to label patients), ID numbers, SSS/GSIS numbers,
and cellphone numbers are examples of nominal data.

Survey results that are for classification only are nominal. For example, occupation
(teacher, front-liner, engineer, doctor), race, color, gender, and civil status are nominal.

Nominal data can be used in chi-square statistics.

2. Ordinal Level. Ordinal data can be used to rank or arrange objects. Like nominal data,
they can also be used to classify or categorize. For example, academic ranking, degree
of illness, satisfaction rating, and socioeconomic status are ordinal data. But unlike
nominal data, ordinal data can be measured to some extent. For example, socioeconomic
status can be measured as low income, middle income, and high income. Satisfaction
ratings can be measured as 1, 2, 3, 4, and 5. However, you cannot use ordinal data to
establish that the interval between ranks 2 and 3 equals the interval between ranks 4
and 5. The distances represented by consecutive numbers are not necessarily equal.

3. Interval Level. The interval level not only classifies and orders the data, but also
specifies that the distances between consecutive points on the scale are equal from the
low end to the high end. The differences between data values are therefore meaningful.
The interval level does not treat zero as a true starting point; on a temperature scale,
for instance, zero is just another point on the scale. Example: temperature in degrees
Celsius or Fahrenheit. Scores, weight, height, costs, allowance and age also have equal
intervals, but because they have a true zero they belong to the ratio level.

4. Ratio Level. This measurement is an interval level modified to include an inherent
zero starting point. Temperature in degrees Celsius or Fahrenheit is not ratio data
because its zero is arbitrary: it does not mean the absence of temperature, and values
below zero are possible. Examples: scores, weight, height, costs, allowance and age.

1.2 Summation Notation


Throughout the course you will encounter the summation symbol $\sum$ as part of
statistical computation. You might find it difficult at first, but you will need these
skills in computing variances later.

The following examples of notation progress from simple to complex.


Example 1 Find the summation $\sum_{i=1}^{5} f_i$.

Solution:
The variable $f_i$ is subject to summation. The subscript $i$ indexes the terms to add
up. The number above the summation symbol, “5”, tells you that there are five
frequencies to be summed, starting from the first, as indicated by the 1 below the
symbol. So this will be:

$$\sum_{i=1}^{5} f_i = f_1 + f_2 + f_3 + f_4 + f_5$$
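The expansion in Example 1 can be checked numerically. The following is a minimal Python sketch; the five frequencies are hypothetical values chosen only for illustration.

```python
# Hypothetical frequencies f_1 ... f_5 (illustrative values, not from the text)
f = [3, 7, 4, 6, 5]

# Sum f_i for i = 1 to 5, written out term by term
total = 0
for i in range(1, 6):      # i runs over 1, 2, 3, 4, 5
    total += f[i - 1]      # Python lists are 0-indexed, so f_i is f[i - 1]

print(total)               # 25, the same value that sum(f) returns
```

The loop mirrors the notation: the lower and upper limits of the summation become the bounds of `range`.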
Example 2 Find the summation $\sum_{i=1}^{4} (X_i - \bar{X})$.

Solution:

$X_i$ is different from $\bar{X}$. Here $\bar{X}$ is treated as a constant, just like a
number. The subscript in $X_i$ runs from 1 to 4, so four terms will be summed. So this
will be:

$$\sum_{i=1}^{4} (X_i - \bar{X}) = (X_1 - \bar{X}) + (X_2 - \bar{X}) + (X_3 - \bar{X}) + (X_4 - \bar{X})$$
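A short Python sketch of the sum in Example 2, assuming four hypothetical values of $X$ and taking $\bar{X}$ to be their mean. Because $\bar{X}$ is a constant, it is computed once and reused in every term.

```python
# Hypothetical values X_1 ... X_4 (illustrative, not from the text)
X = [4, 8, 6, 2]
x_bar = sum(X) / len(X)    # constant: 20 / 4 = 5.0

# Each term (X_i - x_bar) uses a different X_i but the same x_bar
terms = [x_i - x_bar for x_i in X]
total = sum(terms)

print(terms)   # [-1.0, 3.0, 1.0, -3.0]
print(total)   # 0.0
```

When $\bar{X}$ is the mean of the same values, these deviations always sum to zero.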
Example 3 Evaluate the summation $\sum_{i=1}^{4} (X_i - \bar{X})^2$.

Solution:
This is similar to the previous example, except that each term is squared. It follows
the same principle.

$$\sum_{i=1}^{4} (X_i - \bar{X})^2 = (X_1 - \bar{X})^2 + (X_2 - \bar{X})^2 + (X_3 - \bar{X})^2 + (X_4 - \bar{X})^2$$

Example 4 Evaluate the summation $\sum_{i=1}^{5} \frac{(O_i - E_i)^2}{E_i}$.

Solution:
The form remains the same. There are 5 terms in total.

$$\sum_{i=1}^{5} \frac{(O_i - E_i)^2}{E_i} = \frac{(O_1 - E_1)^2}{E_1} + \frac{(O_2 - E_2)^2}{E_2} + \frac{(O_3 - E_3)^2}{E_3} + \frac{(O_4 - E_4)^2}{E_4} + \frac{(O_5 - E_5)^2}{E_5}$$
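The sum in Example 4 can be sketched in Python as follows. The numbers are hypothetical, and reading $O_i$ and $E_i$ as observed and expected counts (as in a chi-square statistic) is an assumption, since the text does not define them.

```python
# Hypothetical values; O and E are assumed to be observed and expected counts
O = [10, 6, 8, 12, 4]
E = [8, 8, 8, 8, 8]

# One term (O_i - E_i)^2 / E_i for each i = 1 ... 5
terms = [(o - e) ** 2 / e for o, e in zip(O, E)]
total = sum(terms)

print(terms)   # [0.5, 0.5, 0.0, 2.0, 2.0]
print(total)   # 5.0
```

`zip` pairs each $O_i$ with the $E_i$ that has the same subscript, which is what the shared index $i$ expresses in the notation.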
Example 5 Find the summation $\sum_{i=1}^{4} X_i^2 Y_i$.

Solution:
Sums of this form will be encountered in regression analysis. Following the same steps,
this will become:

$$\sum_{i=1}^{4} X_i^2 Y_i = X_1^2 Y_1 + X_2^2 Y_2 + X_3^2 Y_3 + X_4^2 Y_4$$
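A minimal Python sketch of the sum in Example 5, using hypothetical paired values. Note that only $X_i$ is squared; $Y_i$ enters each term as it is.

```python
# Hypothetical paired data (X_i, Y_i), i = 1 ... 4 (not from the text)
X = [1, 2, 3, 4]
Y = [2, 1, 3, 2]

# Square each X_i, multiply by the matching Y_i, then add the four products
products = [x ** 2 * y for x, y in zip(X, Y)]
total = sum(products)

print(products)   # [2, 4, 27, 32]
print(total)      # 65
```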

In some computations, the subscript no longer appears. Take note of the following
example:

Example 6. Given the data below and $\bar{X} = 5$, calculate $\sum (X - \bar{X})^2$.

Solution:

You have to subtract $\bar{X}$ from each $X$, square the results, and lastly take the
summation of all the squared items.
Subtract 5 from each value of $X$: $(7 - 5)$, $(6 - 5)$, and so on. Then square each
result and sum up the squares.

The answer will be $\sum (X - \bar{X})^2 = 10$
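The three steps of Example 6 can be traced in Python. The values below are an assumption chosen to agree with the details given in the solution (they start with 7 and 6, have a mean of 5, and give a sum of 10); they are not necessarily the original data.

```python
# Assumed data: consistent with the solution's details, not the original table
X = [7, 6, 5, 4, 3]
x_bar = sum(X) / len(X)                  # 25 / 5 = 5.0

deviations = [x - x_bar for x in X]      # step 1: subtract the mean
squared = [d ** 2 for d in deviations]   # step 2: square each result
total = sum(squared)                     # step 3: sum the squares

print(deviations)   # [2.0, 1.0, 0.0, -1.0, -2.0]
print(squared)      # [4.0, 1.0, 0.0, 1.0, 4.0]
print(total)        # 10.0
```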

Example 7 Calculate the summation $\sum X^2 Y^2$ given the following data.

Solution:
Square the values of $X$ and $Y$, multiply the results, and lastly sum up the products.

Finally, the answer will be



1.3 Mean, Variance and Standard Deviation


Sometimes you need to describe a set of data by a single figure that gives an overall
picture of the data. Such a figure can serve, for example, as a long-term estimate of
price and demand.

1.3.1 Mean (Average)

The symbol for the mean is $\bar{X}$ (x-bar) for sample data, and $\mu$ (the Greek letter
mu) for the population mean. The term mean is synonymous with the average and is
computed by adding all the numbers and dividing by the number of items.

$$\bar{X} = \frac{\sum X}{n} = \frac{X_1 + X_2 + X_3 + \dots + X_n}{n}$$

Where: $\sum X$ = sum of all items, $n$ = number of items.
Example 8 The variable $X$ represents the sample revenue. Calculate the mean.

Solution:
The mean is equal to the total divided by the number of items.

$$\bar{X} = \frac{\sum X}{n} = \frac{240}{6} = 40$$
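The mean calculation can be sketched in Python. The six revenue figures below are hypothetical; they were chosen only so that the total is 240 and the count is 6, matching the figures in Example 8.

```python
import statistics

# Hypothetical revenues: six items totalling 240 (not the original data)
X = [30, 45, 50, 35, 40, 40]

total = sum(X)        # sum of all items: 240
n = len(X)            # number of items: 6
x_bar = total / n     # mean: 240 / 6

print(x_bar)          # 40.0

# The standard library computes the same quantity
assert x_bar == statistics.mean(X)
```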

Example 9 Compute the mean of the following scores.

Solution:

Calculate the total and divide by 5.

$$\bar{X} = \frac{\sum X}{n} = \frac{400}{5} = 80$$

1.3.2 Variance $S^2$

Variance is a measure of how dispersed the data are. If individual observations vary
greatly from the group average or mean, the variance is large. The formula is shown below.

$$S^2 = \frac{\sum (X - \bar{X})^2}{n - 1}$$

Where: $\bar{X}$ = sample mean
$X$ = individual item
$n$ = total number of items in the sample

Example 10 Compute the variance and standard deviation of the following data.

Solution:
In computing the variance, you first need to compute the mean (average). Then subtract
the mean from each item (15, 17, 12, ...), square the results, and sum them up. Finally,
divide the sum by 1 less than the number of items. The result is the variance.

The mean is:

$$\bar{X} = \frac{\sum X}{n} = \frac{90}{6} = 15$$

The mean is 15. Then subtract 15 from each item $X$, say, $15 - 15 = 0$, $17 - 15 = 2$, and
so forth, and square the results, say, $0^2 = 0$, $2^2 = 4$, $(-3)^2 = 9$, and so forth.
The total is 64.

To compute the variance $S^2$, apply the formula and substitute.

$$S^2 = \frac{\sum (X - \bar{X})^2}{n - 1} = \frac{64}{6 - 1} = 12.8$$

To compute the standard deviation $S$, take the square root of the variance.

$$S = \sqrt{S^2} = \sqrt{12.8} \approx 3.58$$
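The whole calculation in Example 10 can be traced in Python. Only the first three items (15, 17, 12) appear in the solution; the remaining three values below are an assumption, chosen so that the total is 90 and the sum of squared deviations is 64, as stated.

```python
import math
import statistics

# First three items are from the solution; 20, 16, 10 are assumed values
X = [15, 17, 12, 20, 16, 10]

n = len(X)
x_bar = sum(X) / n                           # 90 / 6 = 15.0
sum_sq = sum((x - x_bar) ** 2 for x in X)    # sum of squared deviations
variance = sum_sq / (n - 1)                  # divide by n - 1, not n
std_dev = math.sqrt(variance)                # square root of the variance

print(x_bar, sum_sq, variance, round(std_dev, 2))   # 15.0 64.0 12.8 3.58

# Cross-check against the standard library's sample (n - 1) formulas
assert math.isclose(variance, statistics.variance(X))
assert math.isclose(std_dev, statistics.stdev(X))
```

`statistics.variance` and `statistics.stdev` use the same $n - 1$ divisor as the formula for $S^2$, so they agree with the hand calculation.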