
UNIVERSITY OF ST. LA SALLE
College of Business and Accountancy

BSTAT – BUSINESS STATISTICS


First Semester, AY 2020 – 2021

HANDOUTS 1

STATISTICS IN THE RESEARCH PROCESS

"Statistics can be fun or at least they don't need to be feared."

Many folks have trouble believing this premise. Often, individuals walk into their first statistics class
experiencing emotions ranging from slight anxiety to borderline panic. It is important to remember,
however, that the basic mathematical concepts that are required to understand introductory statistics
are not prohibitive for any university student. The key to doing well in any statistics course can be
summarized by two words, "KEEP UP!" If you do not understand a concept--reread the material, do the
practice questions, and do not be afraid to ask your professor for clarification or help. This is important
because the material discussed four weeks from today will be based on material discussed today. If you
keep on top of the material and relax a little bit, you might even find you enjoy this introduction to basic
measurements and statistics.

Why Study Statistics?

"Why do I need to learn statistics?" or "What future benefit can I get from a statistics class?"

There are five primary reasons to study statistics:

The first reason is to be able to effectively conduct research. Without the use of statistics it would be
very difficult to make decisions based on data collected from a research project. Statistics provides us
with a tool with which to make an educated decision.

A second point about research should be made. It is extremely important for a researcher to know what
statistics they want to use before they collect their data. Otherwise data might be collected that is not
interpretable. Unfortunately, when this happens it results in a loss of data, time, and money.

Although you may never plan to be involved in research, research may find its way into your life.
Certainly, if you decide to continue your education and work on a masters or doctoral degree,
involvement in research will result from that decision. Secondly, more and more work places are
conducting internal research or are part of broader research studies. Thus, you may find yourself
assigned to one of these studies.

The second reason to study statistics is to be able to read journals. Most technical journals you will read
contain some form of statistics. Usually, you will find these statistics in something called the results section.
Without an understanding of statistics, the information contained in this section will be meaningless. An
understanding of basic statistics will provide you with the fundamental skills necessary to read and
evaluate most results sections. The ability to extract meaning from journal articles and the ability to
critically evaluate research from a statistical perspective are fundamental skills that will enhance your
knowledge and understanding in related coursework.

S. R. LEONARES, PHD 1
The third reason is to further develop critical and analytic thinking skills. The study of statistics will serve
to enhance and further develop these skills. To do well in statistics one must develop and use formal
logical thinking abilities that are both high level and creative.

The fourth reason to study statistics is to be an informed consumer. Like any other tool, statistics can be
used or misused. Yes, it is true that some individuals do actively lie and mislead with statistics. More
often, however, well-meaning individuals unintentionally report erroneous statistical conclusions. If you
know some of the basic statistical concepts, you will be in a better position to evaluate the information
you have been given.

The fifth reason to have a working knowledge of statistics is to know when you need to hire a
statistician. Conducting research is time consuming and expensive. If you are in over your statistical
head, it does not make sense to risk an entire project by attempting to compute the data analyses
yourself. It is very easy to compute incomplete or inappropriate statistical analysis of one's data. It is
also important to have enough statistical savvy to be able to discuss your project and the data analyses
you want computed with the statistician you hire. In other words, you want to be able to make sure that
your statistician is on the right track.
(https://universalteacher.com/1/reasons-for-conducting-research/)

Statistics are part of our everyday life. Science fiction author H. G. Wells in 1903 stated, "Statistical
thinking will one day be as necessary for efficient citizenship as the ability to read and write." Wells was
quite prophetic as the ability to think and reason about statistical information is not a luxury in today's
information and technological age. Anyone who lacks fundamental statistical literacy, reasoning, and
thinking skills may find they are unprepared to meet the needs of future employers or to navigate
information presented in the news and media. On a most basic level, all one needs to do is open a
newspaper, turn on the TV, examine the baseball box scores, or even just read a bank statement to see
statistics in use on a daily basis.

Statistics in and of themselves are not anxiety producing. The idea of statistics is often anxiety provoking
simply because it is a tool with which we are unfamiliar.

-------------------------------------------------------------------------------------------------------------------------------------------

STATISTICS:
▪ Defined: A branch of science which deals with the collection, organization, presentation, analysis,
and interpretation of data.
▪ A body of techniques and procedures dealing with the collection, organization, analysis,
interpretation, and presentation of information that can be stated numerically.
▪ The backbone of (Quantitative) Research

Two Branches of Statistics

Descriptive statistics are used to organize or summarize a particular set of measurements. These deal
with organizing and summarizing observations so that they are easier to comprehend. The census
of households conducted by the Philippine Statistics Authority every five years represents an
example of how descriptive statistics are generated. The information that is gathered concerning
gender, race, income, etc. is compiled to describe the population of the Philippines at a given point
in time. Collection, Organization, Presentation, and Analysis are part of descriptive statistics.

Inferential statistics use data gathered from a sample to make inferences or generate conclusions about
the larger population from which the sample was drawn. Opinion polls and television ratings
systems represent some uses of inferential statistics. For example, a limited number of people
are polled during an election and then this information is used to describe voters as a whole.
Interpretation falls under Inferential Statistics.

Example:
We wanted to know the level of job satisfaction nurses experience working on various units within
a particular hospital (e.g., psychiatric, cardiac care, obstetrics, etc.). The first thing we would
need to do is collect some data. We might have all the nurses on a particular day complete a job
satisfaction questionnaire. We could ask such questions as "On a scale of 1 (not satisfied) to 10
(highly satisfied), how satisfied are you with your job?". We might examine employee turnover
rates for each unit during the past year. We also could examine absentee records for a two
month period of time as decreased job satisfaction is correlated with higher absenteeism. Once
we have collected the data, we would then organize it. In this case, we would organize it by
nursing unit.
Absenteeism Data by Unit in Days
            Psychiatric    Cardiac Care    Obstetrics
            3              8               4
            6              9               4
            4              10              3
            7              8               5
            5              10              4
Mean =      5              9               4

Thus far, we have collected our data and we have organized it by hospital unit. You will also
notice from the table above that we have performed a simple analysis. We found the mean (you
probably know it by the name "average") absenteeism rate for each unit (descriptive statistics).
Next, we would interpret our data (inferential statistics). We could take the information gained
from our nursing satisfaction study and make inferences to all hospital nurses. We might infer,
and therefore conclude, that cardiac care nurses as a group are less satisfied with their jobs as
indicated by the high absenteeism rate.
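
The descriptive step in this example can be reproduced in a few lines of Python. The sketch below uses only the absenteeism figures from the table above; the variable names are chosen here for illustration.

```python
from statistics import mean

# Absenteeism in days for each nurse, copied from the table above.
absenteeism_days = {
    "Psychiatric": [3, 6, 4, 7, 5],
    "Cardiac Care": [8, 9, 10, 8, 10],
    "Obstetrics": [4, 4, 3, 5, 4],
}

# Descriptive statistics: summarize each unit by its mean.
unit_means = {unit: mean(days) for unit, days in absenteeism_days.items()}

# The unit with the highest mean absenteeism.
highest_unit = max(unit_means, key=unit_means.get)

for unit, average in unit_means.items():
    print(f"{unit}: mean absenteeism = {average} days")
print(f"Highest mean absenteeism: {highest_unit}")
```

The code only describes the nurses who were measured. Whether the result can be extended to all hospital nurses is a question for inferential statistics.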

This course is discussed in light of the role of statistics in the research process, particularly in
quantitative research.

Statistics in the Research Process:

“Research is a procedure for carefully finding accurate solutions to important and relevant questions by
the use of scientific method of gathering and interpreting information. Doing research is a
multi-dimensional skill. Carrying out successful research must exceed the bounds of printed paper, and leap
out to influence opinions and opinion shapers.”
(https://universalteacher.com/1/reasons-for-conducting-research/)

The Research Process (from the standpoint of Statistics):

▪ Formulate the research problem (this could be your general or specific objective)
• S – pecific
• M – easurable
• A – ttainable
• R – ealistic
• T – ime bound

Remarks: A research objective that is SMART sets a very good road map for the conduct of
research:
• The scope/population is delineated, hence it can be determined beforehand
whether a census (gathering data from the whole population) or a survey
(gathering data from a sample) will be conducted
• The subjects (sources of information) are identified, hence the appropriate method
of data collection can be determined
• The kind of information needed to answer the problem/objective is known at the
beginning of the study
• The type of objective is known, hence the appropriate descriptive and/or inferential
statistical tools are anticipated

▪ Define the population of the study


o Population – all subjects under investigation
– the set of all elements of interest in a particular study
o Sample – a subset of the population

Notes: a. In order to identify the population of the study, ask the question, “Who or what will
provide the information needed to answer the research problem?”
b. The population of the study need not consist of a human population.

▪ Identify the variable/s of the study


o Variable – measurable characteristic or attribute of the subject that is the focus of
the study that can take on different values

Notes: a. In order to determine the variable/s of the study, ask the question, “What information
is needed from each subject (element of the population) in order to answer the
research problem?”
b. A research problem or specific objective may involve one or more variables.
c. It would be a good practice to determine the variable/s of each stated specific objective
so as not to miss any information needed from each subject

Example:
Problem: What is the mean weekly household food expense of a USLS BStat student for the first
semester of AY 2020 – 2021?
▪ Population of study:
• All USLS BStat students for the first semester, AY 2020 – 2021

Question: How will the description and scope of the population be affected if each of the
following is omitted?
a. USLS
b. BStat
c. first semester
d. AY 2020-2021
▪ Variable/s:
• weekly household food expense (only one piece of information is needed from each USLS BStat
student for the first semester of AY 2020-2021)

Remarks: 1) Identifying the variable/s early in the study eliminates the possibility of
a. missing it when eventually formulating the instrument or
b. including variables in the instrument that are not necessary in
answering the problem/objectives
2) Identifying the variable/s enables the researcher to determine the type of
variable/s and the level/s of scale of the data that will be collected. These,
in turn, determine the types of analysis and interpretation that will be
applied to generate needed results
:
:
▪ (Anticipated) Conclusion (think ahead as to what the answer to the research problem/specific
objective will look like):
• The mean weekly household food expense of a USLS BStat student for the first semester of AY
2020-2021 is _______.

Notes:
a. This will help you to anticipate that you need to compute for the mean of the
weekly household food expense values that you have collected from all members
of the population
b. More importantly, your conclusion should be consistent with the statement of your
problem/specific objectives; that is, since the problem is about the population under study,
the conclusion should also be about the population under study
▪ This is not a problem if a census is conducted – the conclusion is
straightforward, like in the example above
▪ However, if only a sample was taken from the population for the study, the
conclusion should never be about the sample; it should still be about the
population, hence, its form will be quite different from the anticipated
conclusion as in the example above (inferential statistics can provide a
template for specific types of objectives)

CLASSIFICATION OF RESEARCH OBJECTIVES/GOALS:

Each stated objective can be differentiated according to the following classification. This will guide
the researcher to anticipate the type of analysis and interpretation that is required of the objective.

Analytic goals: directed toward finding out from the data one or more of the following attributes or
characteristics of the group being studied:

1. Central tendency – general characteristic of the group


Examples:
a. To determine the mean weekly allowance of USLS College Freshmen for the
first semester, AY 2020 – 2021.
b. To determine the percentage of USLS College students who prefer a Samsung
over a Vivo cellphone for the first semester, AY 2020-2021.

2. Variance in the group – how individual members of the group vary from the average
characteristic of the group
Examples:
a. To determine the age range of the students in this class.
b. To determine if the final grades in this class are similar.

3. Difference within the group/between groups – whether or not subgroups of the group, or two
separate groups being studied, are different or similar on certain traits investigated (special case:
comparison between/among two or more groups with regard to a particular variable)
Examples:
a. To compare the mean number of Coke Sakto bottles consumed in July 2020 between
the male and female USLS students.
b. To determine if there is a significant difference in the mean number of text
messages sent in a day among the students from the five different colleges
of USLS for the first semester, AY 2020-2021.

4. Relationships within the group – whether a relationship exists between certain variables covered
in the study
Examples:
a. To establish if there is a significant relationship between choice of cellphone
brand and the college a USLS student belongs to for the first semester, AY
2020-2021.
b. To determine if relationship status and final grades in Statistics are
independent for the first semester, AY 2020-2021.

5. Prediction – establishing a mathematical/statistical model to predict future outcomes


Examples:
a. What factors influence a graduate’s ability to land a job within one year
after graduation?
b. What are the estimated sales of a particular restaurant for next week if the
present conditions hold?
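
The first three analytic goals above can be illustrated with a short Python sketch. The allowance figures below are hypothetical values invented for this illustration, not data from any study. Goals 4 and 5 (relationships and prediction) require tools that are not shown here.

```python
from statistics import mean, pstdev

# Hypothetical weekly allowances in pesos for two subgroups (illustrative only).
allowance = {
    "male": [500, 650, 700, 550],
    "female": [600, 800, 750, 650],
}
everyone = allowance["male"] + allowance["female"]

# Goal 1, central tendency: the general characteristic of the group.
overall_mean = mean(everyone)

# Goal 2, variation: how individual members differ from the average.
value_range = max(everyone) - min(everyone)
spread = pstdev(everyone)  # standard deviation, treating the group as a whole population

# Goal 3, difference between groups: compare the subgroup means.
mean_gap = mean(allowance["female"]) - mean(allowance["male"])

print(f"Mean allowance: {overall_mean}")
print(f"Range: {value_range}, standard deviation: {spread:.2f}")
print(f"Female mean minus male mean: {mean_gap}")
```

Each quantity answers a different type of objective, which is why the type of objective should be identified before the data are collected.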

Types of Analysis:
1. Descriptive –
• limited to the description of the particular group being studied
• a conclusion cannot be applied to cases outside the study group

2. Inferential –
• application of the findings or conclusions from a small group to a large group from which
the smaller group was drawn

To summarize, the following diagram shows the aspects of statistics involved in a research process,
depending on the scope of the study:

Population study        Sample study

                        Sampling
Collection              Collection
Organization            Organization
Presentation            Presentation
Analysis                Analysis
                        Interpretation

              Conclusion
     (always about the population)

AVOID either of two possible procedural errors:


1. You did a population study but you used inferential statistics to arrive at the conclusion.
2. You did a sample study but you did not use inferential statistics to arrive at the conclusion.

Remember, inferential statistics is applied only in order to generate conclusions about the population
BASED ON SAMPLE DATA.
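
The distinction between the two kinds of study can be sketched in Python as follows. This is only an illustration under stated assumptions: the function name `conclude`, the expense figures, and the use of a normal-approximation confidence interval are all choices made for this sketch. With a sample this small, a t-based interval would usually be more appropriate.

```python
from math import sqrt
from statistics import NormalDist, mean, stdev

def conclude(values, is_census, confidence=0.95):
    """Return a conclusion about the population mean.

    Census: the computed mean IS the population mean, so it is reported as-is.
    Sample: the sample mean is reported with a margin of error, here taken
    from a normal approximation (an assumption of this sketch).
    """
    center = mean(values)
    if is_census:
        return {"mean": center}
    z = NormalDist().inv_cdf(0.5 + confidence / 2)
    margin = z * stdev(values) / sqrt(len(values))
    return {"mean": center, "low": center - margin, "high": center + margin}

# Hypothetical weekly household food expenses in pesos (illustrative only).
expenses = [1200, 1500, 900, 1100, 1300, 1000, 1400, 1200]

census_result = conclude(expenses, is_census=True)   # descriptive conclusion
sample_result = conclude(expenses, is_census=False)  # inferential conclusion

print(census_result)
print(sample_result)
```

In both cases the conclusion is about the population. The census result states the population mean outright, while the sample result states a range in which the population mean is estimated to lie.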

TYPES OF VARIABLES
(inherent characteristic of the variable; does not change)

1. Qualitative/Categorical
▪ Attributes are in terms of categories or levels - the descriptions that you give a variable that help
to explain how variables should be measured, manipulated and/or controlled.
Examples:
   Variable                          Categories/levels
   1. Sex                            categories - Male
                                                - Female
   2. Religion                       categories - Roman Catholic
                                                - Protestant
                                                - Iglesia ni Cristo
                                                - Islam
                                                - Others, please specify _______
   3. Importance of university to    levels     - strongly agree
      getting a good job                        - agree
                                                - neither agree nor disagree
                                                - disagree
                                                - strongly disagree
Notes:
1. categories vs levels
Categories – do not have/possess an intrinsic order; they are all considered equal
Levels – possess intrinsic or inherent order from one “category” to the next

2. Categories/levels should be
a. exhaustive – should cover all possible answers (oftentimes, the use of “Others, please
specify” serves the purpose of including all possibilities, especially those categories with
small frequencies). This prevents the respondent from being confused about what
answer to tick (✓) or mark with an x when his or her desired response is not among the
given options.
b. mutually exclusive – the categories should not overlap, so that the respondent can
provide only one answer. This prevents the respondent from being confused as to which
category to tick (✓) or mark with an x if there is more than one possible answer. This
holds true even for multiple response questions.
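
Both requirements can be checked mechanically when the answer options are numeric brackets. The sketch below is a hypothetical illustration: the function name `check_categories` and the age brackets are invented for this example.

```python
def check_categories(brackets, possible_values):
    """Check answer brackets against every possible answer.

    brackets: list of (low, high) pairs, both ends inclusive.
    Returns two lists: values that fit no bracket (not exhaustive)
    and values that fit more than one bracket (not mutually exclusive).
    """
    uncovered, overlapping = [], []
    for value in possible_values:
        hits = sum(low <= value <= high for low, high in brackets)
        if hits == 0:
            uncovered.append(value)
        elif hits > 1:
            overlapping.append(value)
    return uncovered, overlapping

# Hypothetical age options for respondents aged 15 to 30.
ages = range(15, 31)
flawed = [(15, 20), (20, 25), (26, 29)]  # 20 appears twice, 30 is left out
fixed = [(15, 19), (20, 25), (26, 30)]

print(check_categories(flawed, ages))
print(check_categories(fixed, ages))
```

A respondent aged 20 would not know which box to tick in the flawed set, and a respondent aged 30 would have no box at all.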

2. Quantitative/Numerical
▪ The variable has numerical properties which are the values by which the said variables can be
measured, manipulated and/or controlled
▪ Attributes are in terms of counts (discrete) or measurements (continuous)

▪ Distinctions/Types of quantitative variables:


a. Discrete Variable
• uses the process of counting to generate data
• values of attributes are in terms of whole numbers only

Examples:
a. Number of t-shirts owned
b. Number of pocketbooks read

b. Continuous Variable
• uses the process of measuring to generate data (with the use of a measuring instrument)
• values of attributes may have fractional or decimal parts
Examples:

a. Weight of a package
b. Volume of water
c. temperature

Note: for continuous variables, it is important to append the unit of measurement since
the result may have a different value depending on the unit

Example:
▪ For discrete variables, the value of a number remains the same regardless of the
variable: 5 chairs vs 5 students (the value of 5 is the same for both)
▪ For continuous variables, the value of a number depends on the unit of
measurement, even if the same variable is being measured: 5 inches vs 5 feet
(length measuring 5 inches is shorter than length measuring 5 feet)
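
The point about units can be made concrete by converting both lengths to a common unit. This is a minimal sketch; the function name `to_cm` is chosen here for illustration, and the conversion factors (1 inch = 2.54 cm, 1 foot = 12 inches) are the standard definitions.

```python
CM_PER_INCH = 2.54      # exact by definition
INCHES_PER_FOOT = 12

def to_cm(value, unit):
    """Convert a length given in inches ("in") or feet ("ft") to centimeters."""
    if unit == "in":
        return value * CM_PER_INCH
    if unit == "ft":
        return value * INCHES_PER_FOOT * CM_PER_INCH
    raise ValueError(f"unknown unit: {unit}")

five_inches_cm = to_cm(5, "in")
five_feet_cm = to_cm(5, "ft")

# The number 5 is the same, but the lengths are very different.
print(f"5 inches = {five_inches_cm:.1f} cm")
print(f"5 feet   = {five_feet_cm:.1f} cm")
```

Without the unit, the number 5 alone does not say which of these two lengths was measured.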

READ: http://dissertation.laerd.com/types-of-variables.php

FUNCTIONS OF VARIABLES

▪ Not an intrinsic property of the variable; it depends on the role of the variable in a study
▪ Important if the investigation is about cause and effect
▪ Distinctions:
a. Independent Variable
• sometimes called an experimental or predictor variable
• is a variable that is being manipulated in an experiment in order to observe the effect this
has on a dependent variable
• what the researcher (or nature) manipulates -- a treatment or program or cause

b. Dependent Variable
• sometimes called an outcome variable
• a variable that is dependent on an independent variable(s)
• what is affected by the independent variable -- the effects or outcomes

Example:
Study/Problem: the effects of a new educational program on student achievement
Independent variable - the program
Dependent variables - measures of achievement

▪ A variable may function as an independent variable in one study and a dependent variable in another.

MEASUREMENT AND MEASUREMENT SCALES

What is Measurement?

Defn: Measurement – The process of assigning numbers to observations or observed characteristics

Normally, when one hears the term measurement, they may think in terms of measuring the length of
something (e.g., the length of a piece of wood) or measuring a quantity of something (e.g., a cup of
coffee). This represents a limited use of the term measurement. In statistics, the term measurement is
used more broadly and is more appropriately termed scales of measurement.

Scales of measurement refer to ways in which variables/numbers are defined and categorized. Each scale
of measurement has certain properties which in turn determine the appropriateness for use of certain
statistical analyses. The four scales of measurement are nominal, ordinal, interval, and ratio.

1. Nominal Scale
▪ Consists of numbers which indicate categories for purely classification or identification purposes
▪ The numbers serve as codes only; any number can be used to represent a category as long as
no two categories share the same number
▪ The numbers do not indicate order among the categories
▪ The numbers have no numeric properties, hence, the four fundamental operations (addition,
subtraction, multiplication, division) cannot be applied to the numbers in the nominal scale
▪ The categories are mutually exclusive (the observations cannot fall into more than one category)
▪ The categories are exhaustive (there must be enough categories for all the observations)
Example:
Sex: Male = 1
     Female = 2
Remarks:
a. assigning the number 2 to Female does not imply that females are
“better” than males
b. these numbers cannot be arithmetically manipulated, for example, to
get the “average sex”

2. Ordinal Scale
▪ Possesses rank order characteristics
▪ The categories must still be mutually exclusive and exhaustive, but they also indicate the order
of magnitude of some variable
▪ The numbers serve as codes but must now be assigned in consecutive order, indicating degree
or level (for example: lowest to highest, most preferred to least preferred, etc.)
Example:
Likert item response: Strongly agree = 1
                      Agree = 2
                      Neither agree nor disagree = 3
                      Disagree = 4
                      Strongly disagree = 5

Remarks:
a. Although the numbers are arranged in consecutive order, it cannot be assumed
that the differences between two consecutive numbers are the same
anywhere in the scale, for example, the degree of difference of “1” in
responses between strongly agree (1) and agree (2) is not necessarily the
same as that between disagree (4) and strongly disagree (5)

b. Fundamentally, these scales do not represent a measurable quantity; for
this reason, arithmetic operations on the numbers are supposedly not
applicable

Example:
Likert-type items (such as "On a scale of 1 to 10, with one being no pain and ten
being high pain, how much pain are you in today?") also represent ordinal data. An
individual may respond 8 to this question and be in less pain than someone else who
responded 5. A person may not be in exactly half as much pain if they responded 4
than if they responded 8. All we know from this data is that an individual who
responds 6 is in less pain than if they responded 8 and in more pain than if they
responded 4. Therefore, Likert-type items only represent a rank ordering.

REMEMBER: a. Nominal and Ordinal scale data are basically categories/levels converted to numeric
codes.
b. Qualitative variables generate either nominal (categories) or ordinal (levels) scale data.

3. Interval Scale
▪ Has all the properties of the ordinal scale
▪ A scale that represents quantity and has equal units
▪ A given interval (distance) between scores has the same meaning anywhere on the scale
▪ Interval scale provides information about how much better one value is compared with another
▪ Zero does not represent the absolute lowest value but represents simply an additional point of
measurement and not the absence of the property being measured

Examples:
a. temperature measured on Celsius scale
▪ Temperature is defined as the measure of the warmth or coldness of an object
or substance with reference to some standard value
▪ Water boils at 100° Celsius, freezes at 0° Celsius (ice is cold to the touch)
▪ However, 0° Celsius does not imply complete absence of heat – there are
substances colder than ice (dry ice, liquid nitrogen) – so 0° Celsius is not the
absolute lowest value in the Celsius thermometer

b. score on a test
▪ Test measures knowledge gained by a student about the topic
▪ A score of 0 does not imply complete absence of knowledge gained by a student
about the topic

4. Ratio Scale
▪ Possesses all the characteristics of the interval scale (represents quantity and has equality of
units)
▪ The most informative scale as it tends to tell about the order and number of the object between
the values of the scale
▪ Allows comparison of intervals or differences
▪ Has a true or absolute zero point (no numbers exist below zero, i.e., there are no negative
numbers)
▪ The ratio of two values is meaningful because the zero point characteristic makes it relevant or
meaningful to say, “one object has twice the length of the other” or “is twice as long.”

Examples:
a. Very often, physical measures will represent ratio data (for example, height and
weight). If one is measuring the length of a piece of wood in centimeters,
there is quantity, equal units, and that measure cannot go below zero
centimeters. A negative length is not possible.
b. Cost of today’s lunch
c. length of time of a full-length movie

REMEMBER: a. Interval and Ratio scale data possess inherently numeric characteristics.
b. Quantitative variables generate either interval or ratio scale data.
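
The practical difference between the interval and ratio scales can be seen by computing ratios. The sketch below uses the Kelvin scale, which is not discussed in this handout, as an added reference point: it is a temperature scale with an absolute zero, obtained from Celsius by adding 273.15. The temperatures and lengths are illustrative values.

```python
def celsius_to_kelvin(celsius):
    """Convert Celsius (interval scale) to Kelvin (has an absolute zero)."""
    return celsius + 273.15

warm_c, cool_c = 20, 10

# Differences are meaningful on an interval scale: the gap is the same on both scales.
gap_celsius = warm_c - cool_c
gap_kelvin = celsius_to_kelvin(warm_c) - celsius_to_kelvin(cool_c)

# Ratios are not: 20 / 10 = 2, but 20 degrees Celsius is not "twice as hot".
naive_ratio = warm_c / cool_c
kelvin_ratio = celsius_to_kelvin(warm_c) / celsius_to_kelvin(cool_c)

# Length is on a ratio scale: the ratio stays the same whatever the unit.
ratio_in_cm = 200 / 100
ratio_in_inches = (200 / 2.54) / (100 / 2.54)

print(f"Gap: {gap_celsius} (Celsius), {gap_kelvin:.2f} (Kelvin)")
print(f"Ratio: {naive_ratio} (Celsius), {kelvin_ratio:.3f} (Kelvin)")
print(f"Length ratio: {ratio_in_cm} (cm), {ratio_in_inches:.1f} (inches)")
```

Because zero on the Celsius scale is an arbitrary point, the ratio changes when the zero point changes. For length, zero means no length in any unit, so "twice as long" holds in centimeters and in inches alike.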

The table below will help clarify the fundamental differences between the four scales of measurement:

            Indicates      Indicates Direction    Indicates Amount    Absolute
            Difference     of Difference          of Difference       Zero
Nominal     X
Ordinal     X              X
Interval    X              X                      X
Ratio       X              X                      X                   X

You will notice in the above table that only the ratio scale meets the criteria for all four properties of
scales of measurement.
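
The table can also be written as a small lookup structure, which may help when deciding what kind of statement a given scale supports. This is a sketch only; the property labels and function names are chosen here for illustration.

```python
# Properties of each scale, transcribed from the table above.
SCALE_PROPERTIES = {
    "nominal": {"difference"},
    "ordinal": {"difference", "direction"},
    "interval": {"difference", "direction", "amount"},
    "ratio": {"difference", "direction", "amount", "absolute_zero"},
}

def supports(scale, prop):
    """Return True if the given scale has the given property."""
    return prop in SCALE_PROPERTIES[scale]

def scales_supporting(prop):
    """List the scales, from nominal to ratio, that have the given property."""
    return [scale for scale, props in SCALE_PROPERTIES.items() if prop in props]

# "One individual scored 6 points higher" needs the amount of difference.
print(scales_supporting("amount"))
# "One score is twice as large as the other" needs an absolute zero.
print(scales_supporting("absolute_zero"))
```

Each scale keeps all the properties of the one before it and adds one more, which is why the rows of the table form a staircase.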

-------------------------------------------------------------------------------------------------------------------------------------------

EXERCISES

1. Indicate whether each of the following examples refers to a population or to a sample.


a. A group of 25 customers selected to taste a new soft drink
b. Salaries of all CEOs in the pharmaceutical industry
c. Customer satisfaction ratings of all clients of a local bank
d. Monthly phone expenses of selected Globe subscribers

2. Indicate whether the following are qualitative (QL), quantitative discrete (QD) or quantitative
continuous (QC) variables and the corresponding level of measurement of the data generated for
each variable.
a. Brand of jeans you prefer
b. Ratio of current assets to current liabilities
c. Number of text messages received per day
d. Rating of the management skills of a company president
e. Number of banks in the municipalities and cities of Negros Occidental
f. Ranking of professional tennis players

g. Scores of freshmen college students on an attitude towards math scale
h. Effectiveness of a drug for headache, measured in minutes
i. Earnings per share
j. Number of leaves
k. Weekly allowance
l. Distance of the student’s house from school
m. Color of the hair
n. Zip code

3. Identify the level of measurement of the following variables.

a. Age                                  f. Favorite TV show
b. Place of birth                       g. Shoe size
c. Number of children in the family     h. High school GPA
d. Grade in Math 1                      i. Family monthly income
e. Height (in cm.)                      j. Travel time (in minutes) from USLS to
                                           residence

4. A researcher measures two individuals and then uses the resulting scores to make a statement
comparing the two individuals. For each of the following statements, identify the scale of measurement
(nominal, ordinal, interval, ratio) that the researcher used.
a. I can only say that the two individuals are different.
b. I can say that one individual scored 6 points higher than the other.
c. I can say that one individual scored higher than the other, but I cannot specify how much
higher.
d. I can say that the score for one individual is twice as large as the score for the other
individual.

5. A firm is interested in testing the advertising effectiveness of a new television commercial. As part of
the test, the commercial is shown on a 6:30 PM local news program in Bacolod City. Two days later, a
market research firm conducts a telephone survey to obtain information on recall rates (percentage
of viewers who recall seeing the commercial) and impressions of the commercial.
a. What is the population for this study? __________________________________________
_________________________________________________________________________
b. What is the sample for this study?_____________________________________________
_________________________________________________________________________
c. Why would a sample be used in this situation? Explain.
