
Research Methods for Business

Variables:
They can be viewed as cause (independent variable) and effect
(dependent variable). The independent variable is controlled by the
experimenter, while the dependent variable changes in response to the
independent variable.

Components of scientific research:

 Objectivity
o A good researcher strives to be as objective as possible.
o We describe measurements as being empirical because they
are based on objectively quantifiable observations.
 Confirmation of findings
o Because the measurements are objective, we should be able
to repeat them and confirm the original results.
o Confirmation of findings is important for establishing the
validity of research.
 Self-correction
o If empirical evidence fails to support the predicted relations
between our independent and dependent variables, we
change our view about how the real world operates.
 Control of unwanted factors
o Potentially influential and undesirable factors (= other than
the IVs) are not allowed to change; use of control variables
for extraneous influences.
Research Question

Finding a problem
 Each research project begins as a problem or a question for which
we are seeking an answer.
 You find a research idea when you find a gap in the current
knowledge or an unanswered question that interests you.
 The most important characteristic of a good research idea is that
it is testable.
 Other than by direct trial-and-error investigation, how can we
determine the relevant factors?
o Examining past research is your best bet.
o Importance of literature review.

Hypotheses development (cont’d)

 FYI: Early in the developmental stages of a certain theory,
hypothesis testing may actually be harmful.
 Because the researchers do not know all of the relevant variables,
it may be very easy to disconfirm hypotheses and the underlying
theories.
 Researchers should use more inductive logic when a research area
is new, because there is a high probability of disconfirming
hypotheses when they might actually be true.
o Inductive logic – Involves reasoning from specific cases to
general principles.
o Deductive logic – Involves reasoning from general principles
to specific predictions.
 Directional versus non-directional.
o Non-directional – Does not predict the exact directional
outcome, but only that the conditions will differ.
o Directional – Specifies the outcome of the research.

“Directional (a) or non-directional (b) hypothesis: a directional
hypothesis specifies the expected direction of the relationship between
variables, whereas a non-directional hypothesis states that the
variables are related but does not predict the nature of that
relationship.”

a) “Practising yoga is beneficial for achieving a satisfactory
sleep pattern in people over 65.”
b) “There is a relationship between ecstasy use and the sexual
practices of young people in Madrid.”
The hypotheses are non-directional.
Dependent variable: the effect, so it would be leadership skills.
Independent variable: the cause, so it would be the ratio.

Variables

 Variable: a characteristic of interest that differs in kind or degree
among various observations (as opposed to a constant).
 Qualitative variable: labels or names are used to identify the
characteristic.
 Quantitative variable: numerical values are used to ...
o Discrete: assumes a countable number of values.
o Continuous: assumes an uncountable number of values.

Variable measurements
 Nominal scale
o A simple categorization system (e.g. types of ice-cream).
 Ordinal scale
o Categorized events can be rank ordered; the intervals
separating the ranks do not have to be comparable
(differences between scale values have no meaning).
 Interval scale
o Categorized events can be rank ordered and equal intervals
separate adjacent events; no true zero-point present.
 Ratio scale
o Takes the interval scale one step further: it also assumes the
presence of a true zero point.

Category scale

 Nominal: answers are different categories that cannot be ordered
o In which department of the organization do you work?
Finance, HR, marketing, etc.
 Ordinal: answers are different categories that can be ordered
o What is your functional level? Executive, higher
management, middle management, lower ranked jobs.

Continuous scale (arithmetic operations are valid)

 Interval: there is no true zero
o Temperature: difference between 10° and 20° is the same as
between 20° and 30°.
o However, there is no such thing as ‘no’ temperature.
o Also: time on a 12-hour clock.
 Ratio: there is a true zero
o What is your income? (ratio because 10.000€ is double the
amount of 5.000€).
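
The difference between the interval and ratio scales can be checked numerically. The sketch below is only an illustration: the helper name and the choice of a Celsius-to-Fahrenheit conversion are assumptions, not part of the course material. It shows that on an interval scale equal differences stay equal after a change of units while ratios do not, whereas on a ratio scale "double" keeps its meaning.

```python
def celsius_to_fahrenheit(celsius):
    """Convert a temperature from degrees Celsius to degrees Fahrenheit."""
    return celsius * 9 / 5 + 32

# Interval scale: equal differences stay equal after changing units...
temps_c = [10, 20, 30]
temps_f = [celsius_to_fahrenheit(t) for t in temps_c]
diffs_f = [temps_f[1] - temps_f[0], temps_f[2] - temps_f[1]]

# ...but ratios do not survive the change, because the zero point is arbitrary.
ratio_c = temps_c[1] / temps_c[0]
ratio_f = temps_f[1] / temps_f[0]

# Ratio scale: income has a true zero, so "double" survives a change of units.
incomes_eur = [5000, 10000]
incomes_cents = [amount * 100 for amount in incomes_eur]
ratio_eur = incomes_eur[1] / incomes_eur[0]
ratio_cents = incomes_cents[1] / incomes_cents[0]

print(diffs_f)                 # the two differences are equal
print(ratio_c, ratio_f)        # the two ratios disagree
print(ratio_eur, ratio_cents)  # the two ratios agree
```

This is why statements such as "20° is twice as warm as 10°" are not meaningful, while "10.000€ is double 5.000€" is.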
Nominal Variable

Ratio Variable
Ordinal Variable
Ordinal Variable
1) Ratio
2) Nominal
3) Ordinal
4) Ordinal (Semantic Differential)
5) Ordinal
a. Ratio
b. Ratio
c. Ordinal
d. Nominal
e. Nominal
f. Ratio?
g. Nominal
h. Ratio?
Exercise 1.1

Discuss in your own words what scientific research is and the
components of the scientific research process (possibly making use of a
few examples).

Scientific research is a process that helps validate a hypothesis with
empirical methods. The components are: Observation, Hypothesis,
Experimentation, Conclusion, Report.

Exercise 1.2

Discuss in your own words each type of research design presented in
class (possibly making use of a few examples).
Experiment: Tries to explain the cause of an effect. Ex. Testing
whether a higher proportion of water in alcohol is better for the liver.

Cross-sectional: Measures the effect at a specific point in time. Ex.
The effect of coronavirus on 1-year-old children in October 2020.

Longitudinal: Measures the effect over a period of time. Ex. The effect
of the virus on 1-year-old children throughout the pandemic.

Case study: Observational research that can be biased by the
observer. Ex. An anthropologist’s observations of the
behaviour of a tribe in Somalia.

Exercise 1.3

Variables can have category or continuous scale. Explain the difference
between these concepts in your own words (possibly making use of a
few examples).

A continuous scale (interval or ratio) can take any number within an
interval. Ex. Height = 1.7943

A category scale (nominal or ordinal) assigns each observation to one
of a set of categories. Ex. Department, functional level

Exercise 1.4

When measuring attitudes or opinions we can use Likert scale or
semantic differentials. What is the difference between these
answering formats?

Likert: Scale, typically with 5 points, that measures attitudes as a
degree of agreement with a statement

Semantic Differential: Scale, typically with 7 points, anchored by
bipolar labels (e.g. sweet versus bitter)

Exercise 1.5
The World Bank website reports the following information on each
country in the world:

 - Political system (republic, federal republic, constitutional
monarchy, ...): Nominal
 - Capital city: Nominal
 - Total area (in km²): Ratio
 - Population (in millions): Ratio
 - Currency: Nominal
 - Income range (from the lowest to the highest values in
thousands of USD): Ordinal

What is the scale of each variable (ordinal, nominal, interval,
ratio)?

Exercise 1.6

Identify whether the variables reported below are facts or
opinions, and their associated scale.

 - Gender (male, female, other): Fact, Nominal
 - Age (number of years): Fact, Ratio
 - Highest education level achieved: Fact, Ordinal
 - Number of years of education: Fact, Ratio
 - Starting a business (yes, no): Fact, Nominal
 - Duration of a business (number of years of activity): Fact, Ratio
 - Job autonomy (the degree of autonomy perceived by an
employee): Opinion, Ordinal (Likert)
 - Supervisor rating (from 1 to 5): Opinion, Ordinal (Likert)

Exercise 1.7

Give at least one example for each type of measurement scale
discussed in class.
Nominal: Political orientation, religious preference
Ordinal: How likely is it that you would recommend our product?
Ratio: How many meals do you have per day?
Interval: What is the temperature in your city?
Likert: On a scale from 1 to 5, 1 being Very Poor and 5 being
Excellent, how would you describe the quality of the product?
Semantic Differential: How would you describe the taste of our
product? 1 being sweet and 7 bitter.

Exercise 1.8

Describe the nature and related scale of each variable reported below:

 - Price range in EUR (< 40, 40 – 60, 60 – 100, >=100): Ordinal
(the prices are grouped into ordered classes)
 - Brand (Apple, Samsung, Sony, Huawei, ...): Nominal
 - Storage capacity (number of GBs): Ratio
 - Audio format (MP3, WMA, WAV, ...): Nominal
 - Weight (in ounces): Ratio
 - Voice recorder (1 = yes, 0 = no): Nominal
 - Sound quality (1, 2, 3, 4, 5): Ordinal
 - Easy to use (number of stars: 0.5, 1, 1.5 ... 5): Ordinal
(a star rating ranks products, like the sound quality score)

Exercise 1.9

Consider the following characteristics:

 - Temperature of water in a swimming pool: hot to cold
(ordinal) // degrees Celsius (interval)
 - Amount in your savings account: euros (ratio) // from 1 (low) to 5
(high) (ordinal)
 - Strength of wind at the airport: miles per hour (ratio) // From
1… (ordinal)
Think of two variables with different scale that may measure each
characteristic.

Exercise 1.10

Describe the nature and related scale of the variable associated with
each answer below:

 - Who is your favorite basketball player? (opinion, nominal)
 - What is the brand of your future car? (opinion, nominal)
 - Do you live in Paris? (fact, nominal)
 - What is the ZIP code of your address? (fact, nominal)
 - What is the floor of your classroom? (fact, ordinal)
 - How many pairs of shoes do you have at home? (fact, ratio)
 - From 1 to 5 how much do you like to drink a cappuccino in the
morning? (opinion, ordinal)
 - What has been your recent rating of an Uber driver? (opinion,
ordinal)
 - What was your average grade in high school? (fact, ratio)

Session 2

Probability, Quantitative Sampling

Probability - Degree of certainty that an event will occur. It is
usually expressed as a number between 0 and 1, where an impossible
event has probability zero and a certain event has probability one.
Terminology:

Mean – The same as the average.

Median - The middle number of a set of numbers (after the numbers
have been arranged from lowest to highest) or, if there is an even
number of data points, the average of the two middle numbers.

Mode - The number that appears most often in a set of numbers.

Standard Deviation - Indicates how dispersed the data are with respect
to the mean (average). The larger the standard deviation, the more
dispersed the data are.
Variance - A measure of the dispersion of a random variable; it equals
the standard deviation squared.
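
The terminology above can be tried out with Python's standard `statistics` module. This is a minimal sketch on hypothetical scores (the data are invented for illustration). The notes do not say whether the population or the sample formula is meant, so the population versions (`pvariance`, `pstdev`) are used here as an assumption.

```python
import statistics

# Hypothetical scores (illustrative values only).
scores = [2, 4, 4, 4, 5, 5, 7, 9]

mean = statistics.mean(scores)            # arithmetic average
median = statistics.median(scores)        # middle value of the sorted data
mode = statistics.mode(scores)            # most frequent value
variance = statistics.pvariance(scores)   # population variance
std_dev = statistics.pstdev(scores)       # population standard deviation

print(mean, median, mode, variance, std_dev)
```

With eight values, the median is the average of the two middle numbers (4 and 5), and the variance equals the standard deviation squared.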

Sampling Principles:

1) Precision: Sample size must be sufficiently large
2) Representativeness: Sample must have the same characteristics
as the population, representing the universe proportionally in the
sample.
Measurements
Reliability: The extent to which the results are consistent, i.e. they can
be reproduced when the research is repeated under the same
conditions.

 Test-retest: are the results consistent when you repeat the
measurement?
 Inter-rater: are they consistent when different people conduct the
measurement?
 Internal consistency: are they consistent across different parts of
a test aimed at measuring the same ‘thing’?

Validity: The extent to which the results are accurate, i.e. they measure
what they are supposed to measure.

 Construct: do the results adhere to existing theories and knowledge?
 Content: do they cover all aspects of the concept being measured?
 Criterion: do they correspond to other valid measures of the same
concept?

Branches of statistics

 Descriptive statistics – Are used when you want to summarize a
set or a distribution of numbers in order to communicate their
essential characteristics.
o Bar Graph – Presents data in terms of frequencies per
category. It is usually used with nominal categories.
o Histogram – Represents quantitative data in terms of
frequencies.
o Frequency polygon – Displays the frequency of each number
or score, but instead of drawing bars as in the histogram you
plot a dot for each frequency and connect the dots with lines.
o Line Graph – Two axes or dimensions.

 Inferential statistics – Are used to analyze data to determine
whether your independent variable had a significant effect on the
dependent variable.

Exercise 2.1
In your own words, please define the following concepts:
- population: The universe of our study field. Set of objects under
investigation.
- sample: A representative proportion of the universe. Subset of the
population, drawn randomly or non-randomly.
- data set: A collection of numbers or values. Information concerning
the sample consisting of a number of variables of interest.
- unit of analysis: The entity being studied. It identifies the nature of
each population element (e.g. person, company).
- variable: Characteristics that are being observed. Information about a
specific characteristic of the unit of analysis.
- value: Number or category that a variable can take on.

Example
Population: All car producers.
Sample: Top 10 world car producers (non-random or convenience
sampling).
Data set: Sales, Car Segment, Car Model, Car Producer, Year.
Unit of Analysis: car producer by year (from 2003 to 2007).
Variable: Annual sales.
Value: 200,000 Eur.

Exercise 2.2
Explain in your own words, the concept of probability by making use of
a couple of examples.
The degree of certainty that an event can occur: the likelihood that a
certain event occurs in a given place and time.
Ex. The probability that Peru wins the World Cup, or the probability
that I win the lottery.

Exercise 2.3
Explain in your own words, the two basic sampling principles. What is
the difference between probability and non-probability sampling?
Make a few examples.
Firstly, precision, meaning that the sample size must be sufficiently large;
secondly, representativeness, meaning that the sample must have the same
characteristics as the population.

In probability sampling, each population element is given (by a
controlled procedure) a known non-zero chance of selection.
Non-probability sampling, by contrast, is arbitrary and subjective:
there is no known non-zero chance of selection.

Ex. In probability sampling (e.g. a simple random draw), each person
out of 100 has the same known chance of being chosen, whereas in
non-probability sampling (e.g. asking whoever is easiest to reach) some
people have a better chance of being chosen than others and those
chances are unknown.
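
The contrast between the two sampling approaches can be sketched in a few lines. This is an illustration under stated assumptions: the population of 100 numbered people, the sample size of 10 and the "first 10 on the list" convenience rule are all hypothetical.

```python
import random

# Hypothetical population of 100 people, identified by number.
population = list(range(1, 101))

# Probability sampling (simple random sampling): a controlled random
# procedure gives every person the same known chance of selection.
rng = random.Random(42)  # fixed seed only to make the sketch reproducible
random_sample = rng.sample(population, k=10)

# Non-probability sampling (convenience sampling): take whoever is easiest
# to reach, here simply the first 10 people on the list. No random
# procedure governs who gets in, so selection chances are not known.
convenience_sample = population[:10]

print(sorted(random_sample))
print(convenience_sample)
```

The random sample can draw from anywhere in the list, while the convenience sample can never include persons 11 to 100, which is why it may fail the representativeness principle.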

Exercise 2.4
Define in your own words, the concepts of validity and reliability. What
is the difference between these two concepts? When a measure is valid
but not reliable, and vice versa?

Validity is the extent to which the results are accurate, and reliability is
the extent to which the results are consistent.

Reliable but not valid measure: When measuring weight, the scale
displays the same weight every time, so the results are reliable.
However, the scale is not calibrated properly, so the measurement is
not valid.

Valid but not reliable measure: A sphygmomanometer is used to
measure blood pressure. It is quite a precise instrument for such a
purpose. However, if a patient is not relaxed enough, the
sphygmomanometer will report different measurements, which won’t
help in detecting potential cardiovascular issues.
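
One simplified way to read the two examples is in terms of bias (how far the average reading is from the true value) and spread (how much repeated readings disagree). The sketch below uses hypothetical readings and a hypothetical true weight; the `summarize` helper is an assumption made for illustration, not a standard reliability or validity statistic.

```python
import statistics

TRUE_WEIGHT = 70.0  # hypothetical true weight in kg

# Hypothetical repeated readings from two instruments.
miscalibrated_scale = [72.0, 72.1, 71.9, 72.0]  # consistent but off target
noisy_scale = [66.0, 74.0, 69.0, 71.0]          # on target on average, but erratic

def summarize(readings):
    """Return (bias, spread) for a list of repeated readings."""
    bias = statistics.mean(readings) - TRUE_WEIGHT   # distance from the truth
    spread = statistics.pstdev(readings)             # disagreement between readings
    return bias, spread

bias_a, spread_a = summarize(miscalibrated_scale)
bias_b, spread_b = summarize(noisy_scale)

print(bias_a, spread_a)  # large bias, small spread: reliable, not valid
print(bias_b, spread_b)  # small bias, large spread: accurate on average, not reliable
```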

Exercise 2.5
In your own words, describe the difference between descriptive
statistics and inferential statistics. Is identifying the effect of smoking
on respiratory capacity an exercise in descriptive or inferential statistics?

Descriptive statistics are used when you want to summarize a set or a
distribution of numbers in order to communicate their essential
characteristics.
Inferential statistics are used to analyze data to determine whether
your independent variable had a significant effect on the dependent
variable.

Identifying the effect of smoking on respiratory capacity is an exercise
in inferential statistics.

Exercise 2.6
When do we use a bar chart? When do we use histograms? What is the
difference between a bar and histogram chart?

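We use bar charts for nominal categories that cannot be
numerically ordered.
We use histograms when we want to show quantitative data in terms of
frequencies.
The difference is that the bars of a bar chart stand for separate
categories, while the bars of a histogram cover adjacent numerical
intervals.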

Exercise 2.7
What is the probability of observing a student under 16? What is the
difference between mean, median and mode? Determine these values
for the following age distribution in a sample of young students:

Age Frequency
10 – 12 2
13 – 15 5
16 – 18 3
19 – 21 5
22 – 24 5
The probability of observing a student under 16 is 0.35 = (2 + 5) / 20
Mean: 17.9
Median: 16-18 (17)
Mode: 13-15; 19-21; 22-24
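
The answers above can be reproduced from the frequency table. This is a minimal sketch assuming that each class is represented by its midpoint for the mean, and that the median class is the first one whose cumulative frequency reaches 50% of the sample (the "at least 50% to its left" reading); the variable names are illustrative.

```python
# Age classes from the exercise as (lower, upper, frequency).
classes = [(10, 12, 2), (13, 15, 5), (16, 18, 3), (19, 21, 5), (22, 24, 5)]

n = sum(freq for _, _, freq in classes)

# Probability of observing a student under 16: share of the first two classes.
p_under_16 = sum(freq for _, upper, freq in classes if upper < 16) / n

# Grouped mean: each class is represented by its midpoint (an assumption).
mean = sum((lower + upper) / 2 * freq for lower, upper, freq in classes) / n

# Median class: first class whose cumulative frequency reaches 50% of n.
cumulative = 0
median_class = None
for lower, upper, freq in classes:
    cumulative += freq
    if cumulative >= n / 2:
        median_class = (lower, upper)
        break

# Modal class(es): every class that attains the highest frequency.
max_freq = max(freq for _, _, freq in classes)
modal_classes = [(lower, upper) for lower, upper, freq in classes if freq == max_freq]

print(n, p_under_16, mean, median_class, modal_classes)
```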

Exercise 2.8
Define the concept of skewness and how it relates to mean, median
and mode values. What are the first, second, third and fourth quartiles?
Consider the following distribution:
Class Frequency
[0,1) 70
[1,3) 50
[3,6) 100 Median
[6,10) 30
[10,15) 50
What are the mean and median values? And the first and third
quartiles? What is the sample size?

The skewness shows how the distribution is clustered. The quartiles
describe the division into intervals based on the values.

Skewness is a measure of how asymmetric a distribution is. Positive
(negative) skewness implies that the longer, more spread-out tail is
on the right (left).

First quartile: value that leaves to its left at least 25% of the
distribution.
Second quartile: it is the median, i.e. value that leaves to its left at least
50% of the distribution.
Third quartile: value that leaves to its left at least 75% of the
distribution.
Fourth quartile: largest value observed in the distribution.
Mean = (0.5 * 70 + 2 * 50 + 4.5 * 100 + 8 * 30 + 12.5 * 50) / 300 = 4.83
Median = within [3,6) (values around 4 if variable uniformly distributed
in the interval).
Mode(s) = within [3,6)

First quartile = within [1,3)
Second quartile = within [3,6)
Third quartile = within [6,10)
Sample size = 70 + 50 + 100 + 30 + 50 = 300
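
The grouped mean and the quartile classes can be checked with a short script. This is a sketch under the same assumption the answer mentions (values uniformly distributed within each class), which allows a linear interpolation inside the class; the function name `grouped_quantile` is hypothetical.

```python
# Classes from the exercise as (lower, upper, frequency); intervals are [lower, upper).
classes = [(0, 1, 70), (1, 3, 50), (3, 6, 100), (6, 10, 30), (10, 15, 50)]

n = sum(freq for _, _, freq in classes)

# Grouped mean: each class is represented by its midpoint.
mean = sum((lower + upper) / 2 * freq for lower, upper, freq in classes) / n

def grouped_quantile(q):
    """Locate the class holding quantile q and interpolate linearly inside it."""
    target = q * n
    cumulative = 0
    for lower, upper, freq in classes:
        if cumulative + freq >= target:
            # Assumes observations are spread uniformly within the class.
            value = lower + (target - cumulative) / freq * (upper - lower)
            return (lower, upper), value
        cumulative += freq
    raise ValueError("q must be between 0 and 1")

q1_class, q1 = grouped_quantile(0.25)
median_class, median = grouped_quantile(0.50)
q3_class, q3 = grouped_quantile(0.75)

print(n, round(mean, 2))
print(q1_class, median_class, q3_class)
print(round(q1, 2), round(median, 2), round(q3, 2))
```

The interpolated median falls just below 4, in line with the "values around 4" remark in the answer.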

Exercise 2.9
The following data report the weekly time spent (in hours) by
students on sports activities.
Time Frequency
< 1 38
1 – 2 22
2 – 3 20
3 – 5 13
>5 7
What is the median interval/value? And the first and third quartiles?
What is the sample size?
Is this distribution right or left skewed?

Median = (1 – 2)
First quartile = (<1)
Third quartile = (2 – 3)
Sample size = 100
Right skewed distribution
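
Because the first and last classes are open-ended, only the class containing each quartile can be identified, not an interpolated value. A minimal sketch of that lookup using cumulative percentages follows; the ASCII class labels and the helper name `class_of` are illustrative.

```python
# Classes from the exercise as (label, frequency); the outer classes are open-ended.
classes = [("< 1", 38), ("1 - 2", 22), ("2 - 3", 20), ("3 - 5", 13), ("> 5", 7)]

n = sum(freq for _, freq in classes)

# Cumulative percentage reached at the end of each class.
cumulative_pct = []
running = 0
for label, freq in classes:
    running += freq
    cumulative_pct.append((label, 100 * running / n))

def class_of(percent):
    """Return the first class whose cumulative percentage reaches `percent`."""
    for label, cum in cumulative_pct:
        if cum >= percent:
            return label
    raise ValueError("percent must be between 0 and 100")

first_quartile = class_of(25)
median = class_of(50)
third_quartile = class_of(75)

print(n, cumulative_pct)
print(first_quartile, median, third_quartile)
```

Most observations sit in the lowest classes and the frequencies thin out towards the higher ones, which is the pattern of a right-skewed distribution.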

Exercise 2.10
Use the data set Loan.sav. Compute mean and median values for the
household income. Is this variable right or left skewed? Report and
discuss a box-and-whisker plot of the household income.

Session 4

Statistical Hypothesis Testing
