Lesson 1: Fundamental Concepts and Summation Notation.

The lesson provides the basic concepts and terms involving Statistics. Summation
notation is presented as it is a convenient and simple form of shorthand used to give a concise
expression for a sum of the values of a variable. 

Lesson Objectives:
1. Describe the meaning and basic concepts of Statistics.
2. Classify data according to the level of measurement.
3. Perform operations involving summation.

Discussion:
A. Definition
In a broader sense, Statistics is concerned with scientific methods of collection,
organization, presentation, analysis, and interpretation of data. Its essential purpose is to
describe and draw inferences about the numerical properties of a population.
The term “data” means factual information or observations that may either be quantitative
or qualitative. The collection of data entails gathering information through interview schedules,
structured questionnaires, observations, experimentations, the use of existing records, and other
methods. The data are then organized in an orderly fashion as a requisite for data presentation.
More often than not, the data gathered are presented in graphs and tables to give readers a quick
picture of the data distribution. Textual presentation is more appropriate when little data is at hand.
Data analysis comes after the processing of data, as guided by statistical principles. This
may involve the use of any statistical method, the choice of which depends upon the nature or
purpose of the statistical problem. Drawing valid conclusions and making reasonable decisions
are based on such analysis. For example, a political analyst can use sample data of the voting
population to predict the political preferences of the entire voting population.
In a narrower sense, the term statistics is used to denote the data themselves or numerical
quantities derived from the data, e.g. averages, population statistics, statistics on enrollment,
employment statistics, birth statistics, etc.

B. Categories of Statistics
1. Descriptive Statistics – is concerned with organizing, summarizing, presenting, and
interpreting data. This branch of Statistics lays the foundation for all statistical
knowledge. For example, if we measure the IQ of the complete population of students
in a particular university and compute the mean IQ, that mean is descriptive because
it describes a characteristic of the complete population.
2. Inferential Statistics – deals with a population of which only a part is examined.
Patterns in the data may be modeled, in a way that accounts for randomness and
uncertainty in the observations, to draw inferences about the process or population
being studied. For example, if we wish to make a statement about the mean IQ in the
complete population of students in a particular university from knowledge of the
mean computed from a sample of 100, and to estimate the error involved in this
statement, we use procedures from inferential statistics.
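The contrast between the two branches can be sketched in Python. The population below is synthetic (hypothetical IQ scores generated with a fixed seed, not real data), and the names `population`, `sample`, and `standard_error` are assumptions made for this illustration. Computing the mean of every score is descriptive; estimating that mean from a sample of 100, together with a standard error (one common way to express the error involved), is inferential.

```python
import random
import statistics

# Hypothetical data: 5,000 synthetic IQ scores (assumed mean 100, SD 15).
rng = random.Random(42)
population = [rng.gauss(100, 15) for _ in range(5000)]

# Descriptive: every unit is measured, so the mean describes the population.
population_mean = statistics.mean(population)

# Inferential: only a sample of 100 units is examined.
sample = rng.sample(population, 100)
sample_mean = statistics.mean(sample)

# Estimated standard error of the sample mean: s / sqrt(n).
standard_error = statistics.stdev(sample) / len(sample) ** 0.5

print(f"Population mean (descriptive): {population_mean:.2f}")
print(f"Sample mean (estimate):        {sample_mean:.2f}")
print(f"Estimated standard error:      {standard_error:.2f}")
```

The sample mean will generally differ from the population mean; the standard error gives a rough idea of how large that difference is likely to be.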
C. Types of Data
1. Primary data – are data that have been acquired directly from the source. They are
also called eye-witness accounts written by people who experienced a particular event
or behavior and are collected especially for the task at hand. (e.g. Student profile,
financial records, office memos, minutes of the meeting, enrollment, etc.)
2. Secondary data – are non-primary data or existing records. (e.g. census, stock
charts)
D. Types of Variable
A variable is a particular attribute of interest that is measurable or observable on
every individual/object.
1. Qualitative – have labels or names, rather than numbers, assigned to their categories,
and are used extensively in observational studies. (e.g. gender, courses enrolled, religion,
etc.)
2. Quantitative – have values that represent counts or quantities and are measured in
numeric units; they can be either discrete or continuous. (e.g. age, height, weight, GPA,
etc.)
* Variables that take on only specific values within an interval, or that represent counts,
are said to be discrete, while variables that can take on an infinite number of possible
values in a given interval are said to be continuous (e.g. temperature in °F or °C).
E. Population, Sample and Distribution
1. Population – is the total of all units or possible values of the variable.
2. Sample – is a subset or portion of the total population. (e.g. A 100% sample would be the
entire population; a 1% sample would consist of only 1 out of every 100 units in the
population)
3. Distribution – is the pattern of variation of a variable. It displays how frequently each
value occurs.
Illustration: The ages of the children in the smallest barangay of a certain town are listed below.

12 1 15 11 22 15 17
12 3 15 11 24 15 17
12 6 15 11 24 17 19
13 6 15 11 13 17 20
13 7 15 12 13 18 21
18 8 15 12 13 19 22
18 8 16 12 14 14 16
18 9 16 10 14 14 17
19 9 10 10 3 9 11
14 9 10 10 3 17 11
14 9 16 10 5 6  

The following chart shows the distribution of age groups 1-5, 6-10, 11-15, 16-20,
and 21-25.

[Figure: "Age Distribution" bar chart; vertical axis from 0 to 35, horizontal axis showing the age groups 1-5, 6-10, 11-15, 16-20, and 21-25.]
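The frequency of each age group can be reproduced with a short Python sketch. The ages are the 76 values listed above; the variable names and the helper `age_group` are illustrative assumptions, not part of the lesson.

```python
from collections import Counter

# The 76 ages from the illustration, entered row by row.
ages = [
    12, 1, 15, 11, 22, 15, 17,
    12, 3, 15, 11, 24, 15, 17,
    12, 6, 15, 11, 24, 17, 19,
    13, 6, 15, 11, 13, 17, 20,
    13, 7, 15, 12, 13, 18, 21,
    18, 8, 15, 12, 13, 19, 22,
    18, 8, 16, 12, 14, 14, 16,
    18, 9, 16, 10, 14, 14, 17,
    19, 9, 10, 10, 3, 9, 11,
    14, 9, 10, 10, 3, 17, 11,
    14, 9, 16, 10, 5, 6,
]


def age_group(age):
    """Return the label of the 5-year group containing age (1-5, 6-10, ...)."""
    lower = ((age - 1) // 5) * 5 + 1
    return f"{lower}-{lower + 4}"


# Count how many ages fall in each group.
distribution = Counter(age_group(age) for age in ages)

for label in ["1-5", "6-10", "11-15", "16-20", "21-25"]:
    print(f"{label:>5}: {distribution[label]}")
```

For the listed ages, the group frequencies are 5, 17, 31, 18, and 5, which add up to the 76 children.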

F. Scales of Measurement
1. Nominal – take values that give names or labels to various categories with no
particular ordering. Information that can be obtained from processing such data is limited
to counts and percentages. This is the lowest level of measurement. (e.g. name, gender,
color, course, educational program, etc.)
2. Ordinal – are labels or classes with an inherent order. The difference between
categories cannot be measured and has no meaning. Information that can be obtained
from processing data on these variables is limited to frequency counts with additional
insight on the rank or order of the categories specified. (e.g. socioeconomic status
classified as low, middle, and high; and student classification with values freshman,
sophomore, etc.)
3. Interval – are quantitative, with the difference between two consecutive values being
constant. Intervals between values can be quantified and have meaning. This scale does
not have a true zero point, which means that a value of zero does not
necessarily mean the absence of the characteristic being measured. An example of
a variable on the interval scale is room temperature in °F or °C.
4. Ratio – is a quantitative scale of measurement. The ratio scale
allows any researcher to compare the intervals or differences between values. The ratio scale is the
4th level of measurement and possesses a true zero point or point of origin. This is a
unique feature of a ratio scale. Examples are age at last birthday (in years), weekly
food allowance (in pesos), height (in cm), and weight (in kg).

**Statistical analysis of a variable varies according to its level of measurement.
Knowledge of the level of measurement serves as a guide in determining the appropriate algebraic operations and,
consequently, the statistical tools that can be used for analysis. Inappropriate analysis leads to
erroneous conclusions, which may lead to dangerous consequences brought about by wrong
decisions.
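One way to see why the level of measurement restricts the operations that make sense is to compare ratios before and after a change of units. The sketch below uses values chosen only for illustration (they are not from the lesson): the ratio of two Celsius temperatures changes when they are converted to Fahrenheit, because an interval scale has no true zero, while the ratio of two heights is the same in centimeters and in meters.

```python
def celsius_to_fahrenheit(celsius):
    """Convert a temperature from degrees Celsius to degrees Fahrenheit."""
    return celsius * 9 / 5 + 32


# Interval scale: hypothetical room temperatures.
warm_c, cool_c = 20.0, 10.0
ratio_celsius = warm_c / cool_c
ratio_fahrenheit = celsius_to_fahrenheit(warm_c) / celsius_to_fahrenheit(cool_c)

# Ratio scale: hypothetical heights in centimeters, then in meters.
tall_cm, short_cm = 180.0, 90.0
ratio_cm = tall_cm / short_cm
ratio_m = (tall_cm / 100) / (short_cm / 100)

print(f"Temperature ratio in Celsius:    {ratio_celsius:.2f}")
print(f"Temperature ratio in Fahrenheit: {ratio_fahrenheit:.2f}")
print(f"Height ratio in cm: {ratio_cm:.2f}")
print(f"Height ratio in m:  {ratio_m:.2f}")
```

Since 20 °C is 68 °F and 10 °C is 50 °F, "twice as warm" holds only in one unit, so ratios of interval-scale values should not be interpreted. The height ratio stays at 2 in both units.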
Summation Notation
Summation notation is used to express the sum of numbers and relationships among
variables in a more concise form.
Suppose a variable, say $x$, is observed. The first observation is denoted as $x_1$, the second
$x_2$, the third $x_3$, and so on. The $i$th observation is customarily denoted by $x_i$.
If a set of $N$ observations corresponds to $x_1, x_2, \dots, x_N$, then their sum can be expressed as
$$\sum_{i=1}^{N} x_i = x_1 + x_2 + x_3 + \dots + x_N$$
where
$\sum$ = summation notation
$i$ = arbitrary symbol to denote the index
$x_i$ = summand or unit value
$N$ = upper limit of the summation (total number of units)
$1$ = lower limit of the summation
Example #1. Find the sum of the scores of the six (6) students in a Statistics class. The
scores are 52, 60, 70, 75, 58, and 80. Assigning the first score as $x_1$, the second as $x_2$, the
third as $x_3$, and so on, we have $x_1 = 52$, $x_2 = 60$, $x_3 = 70$, $x_4 = 75$, $x_5 = 58$, and $x_6 = 80$.
The summation is written as
$$\sum_{i=1}^{6} x_i = x_1 + x_2 + x_3 + x_4 + x_5 + x_6$$
$$\sum_{i=1}^{6} x_i = 52 + 60 + 70 + 75 + 58 + 80$$
$$\sum_{i=1}^{6} x_i = 395$$
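The same sum can be checked with a minimal Python sketch. The list name `scores` and the explicit loop are illustrative choices; the loop mirrors the index $i$ running from the lower limit 1 to the upper limit 6, and the built-in `sum` gives the same total.

```python
# Scores of the six students from Example #1.
scores = [52, 60, 70, 75, 58, 80]

# Explicit loop: i runs from the lower limit 1 to the upper limit N.
N = len(scores)
total = 0
for i in range(1, N + 1):
    total += scores[i - 1]  # x_i is stored at list position i - 1

# The built-in sum gives the same result in one call.
assert total == sum(scores)

print(total)  # 395
```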
Properties:

1. If c is constant
$$\sum_{i=1}^{N} c x_i = c \sum_{i=1}^{N} x_i$$

Example:
$$\sum_{i=1}^{4} 3 x_i = 3 \sum_{i=1}^{4} x_i$$

2. If c is constant and m is any integer
$$\sum_{i=1}^{N} c = Nc$$

Example:
$$\sum_{i=1}^{10} 2 = 10 \times 2 = 20$$
$$\sum_{i=m}^{N} c = c(N - m + 1)$$

Example:
$$\sum_{i=2}^{10} 2 = 2(10 - 2 + 1) = 2 \times 9 = 18$$

3. If a and b are constants
$$\sum_{i=1}^{N} (a x_i \pm b y_i) = a \sum_{i=1}^{N} x_i \pm b \sum_{i=1}^{N} y_i$$

Example: Given two variables $x$ and $y$, with $x = \{4, 3, 7, 8\}$ and $y = \{2, 1, 5, 4\}$, find

1. $\left(\sum_{i=1}^{4} x_i\right)\left(\sum_{i=1}^{4} y_i\right)$
$$\left(\sum_{i=1}^{4} x_i\right)\left(\sum_{i=1}^{4} y_i\right) = (x_1 + x_2 + x_3 + x_4)(y_1 + y_2 + y_3 + y_4)$$
$$\left(\sum_{i=1}^{4} x_i\right)\left(\sum_{i=1}^{4} y_i\right) = (4 + 3 + 7 + 8)(2 + 1 + 5 + 4) = (22)(12) = 264$$
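2. $\sum_{i=1}^{4} x_i^2 y_i^2$
$$\sum_{i=1}^{4} x_i^2 y_i^2 = x_1^2 y_1^2 + x_2^2 y_2^2 + x_3^2 y_3^2 + x_4^2 y_4^2$$
$$\sum_{i=1}^{4} x_i^2 y_i^2 = 4^2 \cdot 2^2 + 3^2 \cdot 1^2 + 7^2 \cdot 5^2 + 8^2 \cdot 4^2 = 64 + 9 + 1225 + 1024 = 2322$$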
3. $\sum_{i=2}^{4} (x_i - y_i)^2$
$$\sum_{i=2}^{4} (x_i - y_i)^2 = (x_2 - y_2)^2 + (x_3 - y_3)^2 + (x_4 - y_4)^2$$
$$\sum_{i=2}^{4} (x_i - y_i)^2 = (3 - 1)^2 + (7 - 5)^2 + (8 - 4)^2 = 4 + 4 + 16 = 24$$
4. $\sum_{i=1}^{4} (x_i^2 - y_i^2)$
$$\sum_{i=1}^{4} (x_i^2 - y_i^2) = (x_1^2 - y_1^2) + (x_2^2 - y_2^2) + (x_3^2 - y_3^2) + (x_4^2 - y_4^2)$$
$$\sum_{i=1}^{4} (x_i^2 - y_i^2) = (4^2 - 2^2) + (3^2 - 1^2) + (7^2 - 5^2) + (8^2 - 4^2) = 92$$
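The three properties and the four worked examples can be verified numerically. The following is a minimal Python sketch using the same $x$ and $y$; the constants `c`, `a`, `b`, and `m` and all variable names are assumptions chosen for illustration. The last quantity, the sum of the products $x_i y_i$, is added only for contrast with the product of the sums in item 1.

```python
x = [4, 3, 7, 8]
y = [2, 1, 5, 4]
N = len(x)

# Property 1: a constant factor can be moved outside the summation.
c = 3
assert sum(c * xi for xi in x) == c * sum(x)

# Property 2: summing a constant c from i = m to N gives c(N - m + 1).
m = 2
assert sum(c for _ in range(m, N + 1)) == c * (N - m + 1)

# Property 3: the summation of a*x_i + b*y_i splits into two summations.
a, b = 2, 5
assert sum(a * xi + b * yi for xi, yi in zip(x, y)) == a * sum(x) + b * sum(y)

# Item 1: product of the two sums, (22)(12).
product_of_sums = sum(x) * sum(y)

# Item 2: sum of squared products, 64 + 9 + 1225 + 1024.
sum_of_squared_products = sum(xi**2 * yi**2 for xi, yi in zip(x, y))

# Item 3: squared differences for i = 2 to 4 (list positions 1 to 3).
sum_of_squared_differences = sum((xi - yi) ** 2 for xi, yi in zip(x[1:], y[1:]))

# Item 4: differences of squares, 12 + 8 + 24 + 48.
sum_of_differences_of_squares = sum(xi**2 - yi**2 for xi, yi in zip(x, y))

# For contrast: the sum of products is not the product of sums.
sum_of_products = sum(xi * yi for xi, yi in zip(x, y))

print(product_of_sums)                # 264
print(sum_of_squared_products)        # 2322
print(sum_of_squared_differences)     # 24
print(sum_of_differences_of_squares)  # 92
print(sum_of_products)                # 78
```

Note that the sum of the products (78) differs from the product of the sums (264), so the position of the summation sign matters.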

Practice Set:
Given the following data:
$x = \{88, 85, 75, 92, 88, 89, 70, 91, 92, 75\}$
$y = \{74, 65, 90, 75, 70, 76, 90, 75, 88, 87\}$
Find
1. $\sum_{i=1}^{10} x_i^2$
2. $\sum_{i=1}^{10} y_i^2$
3. $\sum_{i=1}^{10} x_i y_i$
4. $\sum_{i=1}^{10} (x_i - y_i)^2$
5. $\sum_{i=1}^{10} (x_i^2 - y_i^2)$
Summary of the Lesson:
The two branches of Statistics are Descriptive Statistics and Inferential Statistics.
Classifying data as coming from either a population or a sample is essential in deciding which branch can be
applied.
Statistics deals with data and how they can be interpreted meaningfully. Hence, the scales
of measurement, i.e. nominal, ordinal, interval, and ratio, guide the choice of appropriate analysis.
A variable is a characteristic of the subjects in a study. It can be either qualitative or
quantitative. A quantitative variable can be either discrete or continuous.
Summation notation is a method to express statistical operations concisely.
Practice Set:
Solve the following COMPLETELY and NEATLY. Write your answers on yellow pad paper.
For the problems that need an Excel output, print the output on A4-size bond paper.
I. For each of the following, determine whether the given situation involves the use of
descriptive statistics or inferential statistics.
1. The manager of a department store records the number of buying customers daily for
eight consecutive weeks and then estimates the average number of buying customers for
the following weeks.
2. The market researcher of a manufacturing company constructs a graph showing the
fluctuations in sales for a major product line during the last five years.
3. A company trainer employs one technique/strategy in one production line and another
technique in another line and then gives the same assessment. Using the results, he
determines which technique is more effective.
4. The sales manager ranks his sales team according to their job performance.
II. Which of the following variables are qualitative and which are quantitative? Among the
quantitative variables, which are discrete and which are continuous?
1. ROI
2. Position
3. Performance rating
4. Salary rate
5. Employee ID
III. Classify the following data sets as qualitative or quantitative.
1. Total sales
2. Product quality
3. Liquidity ratio
4. Number of units sold
5. Sales region
IV. Classify the following data sets as continuous or discrete.
1. Dividend payout ratios
2. Ranks of employees
3. Scores of the examinees on Civil Service Exams.
4. The top ten performing employees
5. Variants of the product
V. Identify the measurement scale of each of the following.
1. Classification of product line
2. Temperature reading in the production line
3. Company training
4. Salary rates
5. Employee rank
