Variables:
They can thus be viewed as cause (independent variable) and
effect (dependent variable). The independent variable is controlled by the
experimenter, while the dependent variable changes in response to the
independent variable.
Objectivity
o A good researcher strives to be as objective as possible.
o We describe measurements as being empirical because they
are based on objectively quantifiable observations.
Confirmation of findings
o Because the measurements are objective, we should be able
to repeat them and confirm the original results.
o Confirmation of findings is important for establishing the
validity of research.
Self-correction
o If empirical evidence fails to support the predicted relations
between our independent and dependent variables, we
change our view about how the real world operates.
Control of unwanted factors
o Potentially influential but unwanted factors (i.e. anything other
than the IVs) are not allowed to change; control variables are used
to hold extraneous influences constant.
Research Question
Finding a problem
Each research project begins as a problem or a question for which
we are seeking an answer.
You find a research idea when you find a gap in the current
knowledge or an unanswered question that interests you.
The most important characteristic of a good research idea is that
it is testable.
Other than by direct trial-and-error investigation, how can we
determine the relevant factors?
o Examining past research is your best bet.
o Importance of literature review.
Variables
Variable measurements
Nominal scale
o A simple categorization system (e.g. types of ice-cream).
Ordinal scale
o Categorized events can be rank ordered; the intervals
separating the ranks do not have to be comparable
(differences between scale values have no meaning).
Interval scale
o Categorized events can be rank ordered and equal intervals
separate adjacent events; no true zero-point present.
Ratio scale
o Takes the interval scale one step further: it also assumes the
presence of a true zero point.
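One way to make the four scales concrete is to ask which summary statistics can be interpreted at each level. The sketch below encodes the usual textbook convention; the `PERMISSIBLE` table and the `is_meaningful` helper are illustrative names introduced here, not part of the course material.

```python
# Illustrative mapping (an assumption for this sketch): which summary
# statistics are usually considered meaningful at each measurement scale.
PERMISSIBLE = {
    "nominal": {"mode"},
    "ordinal": {"mode", "median"},
    "interval": {"mode", "median", "mean"},
    "ratio": {"mode", "median", "mean", "ratio of two values"},
}


def is_meaningful(scale: str, statistic: str) -> bool:
    """Return True if `statistic` can be interpreted on `scale`."""
    return statistic in PERMISSIBLE[scale]


# Ranks carry no comparable intervals, so a mean of ranks is hard to interpret.
print(is_meaningful("ordinal", "mean"))
# Equal intervals make the mean usable on an interval scale.
print(is_meaningful("interval", "mean"))
# Saying "twice as much" requires a true zero point.
print(is_meaningful("interval", "ratio of two values"))
```

Each scale inherits everything allowed on the scale before it, which mirrors the way the definitions above build on one another.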
Category scale
Ratio Variable
Ordinal Variable
Ordinal Variable
1) Ratio
2) Nominal
3) Ordinal
4) Ordinal (Semantic Differential)
5) Ordinal
a. Ratio
b. Ratio
c. Ordinal
d. Nominal
e. Nominal
f. Ratio?
g. Nominal
h. Ratio?
Exercise 1.1
Exercise 1.2
Exercise 1.3
The category scale can be measured with exact numbers, e.g. the number of
participants.
Exercise 1.4
Exercise 1.5
The World Bank website reports the following information on each
country in the world:
Exercise 1.6
Exercise 1.7
Exercise 1.8
Describe the nature and related scale of each variable reported below:
Exercise 1.9
Exercise 1.10
Session 2
Sampling Principles:
Validity: The extent to which the results are accurate, i.e. they measure
what they are supposed to measure.
Branches of statistics
Exercise 2.1
In your own words, please define the following concepts:
- population: The universe of our field of study; the set of objects under
investigation.
- sample: A representative portion of the universe; a subset drawn
randomly or non-randomly from the population.
- data set: A collection of numbers or values; the information concerning
the sample, consisting of a number of variables of interest.
- unit of analysis: The entity being studied. It identifies the nature of
each population element (e.g. person, company).
- variable: A characteristic that is being observed; information about a
specific characteristic of the unit of analysis.
- value: A number or category that a variable can take on.
Example
Population: All car producers.
Sample: Top 10 world car producers (non-random or convenience
sampling).
Data set: Sales, Car Segment, Car Model, Car Producer, Year.
Unit of Analysis: car producer by year (from 2003 to 2007).
Variable: Annual sales.
Value: 200,000 Eur.
Exercise 2.2
Explain in your own words, the concept of probability by making use of
a couple of examples.
The degree of certainty that an event will occur, i.e. the likelihood that a
certain event occurs in a given place and time.
Examples: the probability that Peru wins the World Cup, or the probability
that I win the lottery.
Exercise 2.3
Explain in your own words, the two basic sampling principles. What is
the difference between probability and non-probability sampling?
Make a few examples.
First, precision: the sample size must be sufficiently large. Second,
representativeness: the sample must have the same characteristics as
the population.
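The distinction the exercise asks about can be sketched in a few lines: in probability sampling every unit has a known chance of selection, while in non-probability sampling units are picked because they are easy to reach. The population of 1,000 unit IDs, the function names, and the seed below are all assumptions made for illustration.

```python
import random

# Hypothetical population: 1,000 unit IDs (an assumption for illustration).
population = list(range(1, 1001))


def probability_sample(units, n, seed=0):
    """Simple random sample: every unit has the same chance of selection."""
    rng = random.Random(seed)  # fixed seed only so the sketch is repeatable
    return rng.sample(units, n)


def convenience_sample(units, n):
    """Non-probability sample: take whichever units are easiest to reach
    (here, simply the first n in the list)."""
    return units[:n]


random_units = probability_sample(population, 10)
easy_units = convenience_sample(population, 10)

print(sorted(random_units))  # spread across the whole population
print(easy_units)            # clustered at the start of the list
```

The convenience sample covers only one corner of the population, which is why it is less likely to be representative.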
Exercise 2.4
Define in your own words the concepts of validity and reliability. What
is the difference between these two concepts? When is a measure valid
but not reliable, and vice versa?
Validity is the extent to which the results are accurate, and reliability is
the extent to which the results are consistent.
Example: when measuring weight, a scale that displays the same weight every
time gives reliable results. If the scale is not calibrated properly,
however, the measurement is not valid.
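The scale example can be put into numbers. In the sketch below, the true weight and the five readings are hypothetical values chosen for illustration; the bias of the average reading stands in for (lack of) validity and the spread across repeated readings stands in for reliability.

```python
from statistics import mean, pstdev

TRUE_WEIGHT_KG = 70.0  # hypothetical true value, assumed known for the sketch

# Hypothetical repeated readings from a miscalibrated but consistent scale.
readings = [72.0, 72.1, 71.9, 72.0, 72.0]

# Systematic error: how far the average reading is from the truth.
bias = mean(readings) - TRUE_WEIGHT_KG

# Variation across repeated measurements of the same object.
spread = pstdev(readings)

print(f"bias = {bias:.2f} kg, spread = {spread:.2f} kg")
```

A small spread with a large bias corresponds to "reliable but not valid"; the opposite case would show readings scattered widely around the true weight.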
Exercise 2.5
In your own words, describe the difference between descriptive
statistics and inferential statistics. Is identifying the effect of smoking on
respiratory capacity an exercise in descriptive or inferential statistics?
Exercise 2.6
When do we use a bar chart? When do we use histograms? What is the
difference between a bar and histogram chart?
Exercise 2.7
What is the probability of observing a student under 16? What is the
difference between mean, median and mode? Determine these values
for the following age distribution in a sample of young students:
Age Frequency
10 – 12 2
13 – 15 5
16 – 18 3
19 – 21 5
22 – 24 5
The probability of observing a student under 16 is (2 + 5) / 20 = 0.35
Mean: 17.9
Median: 16-18 (17)
Mode: 13-15; 19-21; 22-24
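The figures above can be checked with a short script. This is a minimal sketch that assumes each class is represented by its midpoint and that the median class is the first one whose cumulative frequency reaches half the sample; the variable names are introduced here for illustration.

```python
# Age classes from Exercise 2.7 as (lower, upper, frequency).
classes = [(10, 12, 2), (13, 15, 5), (16, 18, 3), (19, 21, 5), (22, 24, 5)]

n = sum(f for _, _, f in classes)

# Probability of observing a student under 16: share of the first two classes.
p_under_16 = sum(f for _, hi, f in classes if hi < 16) / n

# Grouped mean: weight each class midpoint by its frequency.
grouped_mean = sum((lo + hi) / 2 * f for lo, hi, f in classes) / n

# Median class: first class whose cumulative frequency reaches n / 2.
cumulative = 0
median_class = None
for lo, hi, f in classes:
    cumulative += f
    if cumulative >= n / 2:
        median_class = (lo, hi)
        break

# Modal classes: every class that attains the highest frequency.
top = max(f for _, _, f in classes)
modal_classes = [(lo, hi) for lo, hi, f in classes if f == top]

print(n, p_under_16, grouped_mean, median_class, modal_classes)
```

Exactly half of the 20 observations fall at or below 18, so the median sits right at the upper edge of the 16-18 class; the class midpoint 17 is only a rough single-value summary.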
Exercise 2.8
Define the concept of skewness and how it relates to mean, median
and mode values. What are the first, second, third and fourth quartiles?
Consider the following distribution:
Class Frequency
[0,1) 70
[1,3) 50
[3,6) 100 (median class)
[6,10) 30
[10,15) 50
What are the mean and median values? And the first and third
quartiles? What is the sample size?
First quartile: the value at or below which at least 25% of the
distribution lies.
Second quartile: the median, i.e. the value at or below which at least
50% of the distribution lies.
Third quartile: the value at or below which at least 75% of the
distribution lies.
Fourth quartile: the largest value observed in the distribution.
Mean = (0.5 * 70 + 2 * 50 + 4.5 * 100 + 8 * 30 + 12.5 * 50) / 300 = 4.83
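Median = within [3,6) (about 3.9 if the variable is uniformly distributed
in the interval).
Mode(s) = within [3,6) by raw frequency; because the classes have unequal
widths, frequency density (frequency / width) points to [0,1) instead.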
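The exercise also asks for the quartiles and the sample size. A minimal sketch, assuming values are spread uniformly within each class so that quantiles can be found by linear interpolation; `grouped_mean` and `grouped_quantile` are helper names introduced here.

```python
# Classes from Exercise 2.8 as (lower, upper, frequency); intervals are [lower, upper).
classes = [(0, 1, 70), (1, 3, 50), (3, 6, 100), (6, 10, 30), (10, 15, 50)]
n = sum(f for _, _, f in classes)


def grouped_mean(classes):
    """Mean of grouped data, using each class midpoint as its representative value."""
    total = sum(f for _, _, f in classes)
    return sum((lo + hi) / 2 * f for lo, hi, f in classes) / total


def grouped_quantile(classes, q):
    """Interpolate the q-th quantile, assuming a uniform spread within each class."""
    total = sum(f for _, _, f in classes)
    target = q * total
    below = 0  # observations in the classes already passed
    for lo, hi, f in classes:
        if below + f >= target:
            return lo + (target - below) / f * (hi - lo)
        below += f
    return classes[-1][1]


print("sample size:", n)
print("mean:", round(grouped_mean(classes), 2))
for q in (0.25, 0.50, 0.75):
    print(f"quantile {q}:", round(grouped_quantile(classes, q), 2))
```

Under the uniform-spread assumption the first quartile falls in [1,3), the median in [3,6), and the third quartile in [6,10). The mean lies above the median, which is consistent with a right-skewed distribution.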
Exercise 2.9
The following data report the time (in hours) that students spend on
sports activities each week.
Time Frequency
< 1 38
1 – 2 22
2 – 3 20
3 – 5 13
>5 7
What is the median interval/value? And the first and third quartiles?
What is the sample size?
Is this distribution right or left skewed?
Median = (1 – 2)
First quartile = (<1)
Third quartile = (2 – 3)
Sample size = 100
Right skewed distribution
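Because the first and last classes are open-ended, only the quartile classes can be identified, not exact values. The sketch below locates them by cumulative frequency; the `quantile_class` helper and the class labels are names introduced for illustration.

```python
# Classes from Exercise 2.9 as (label, frequency); the first and last are open-ended.
classes = [("< 1", 38), ("1 - 2", 22), ("2 - 3", 20), ("3 - 5", 13), ("> 5", 7)]
n = sum(f for _, f in classes)


def quantile_class(classes, q):
    """Return the label of the first class whose cumulative share reaches q."""
    total = sum(f for _, f in classes)
    cumulative = 0
    for label, f in classes:
        cumulative += f
        if cumulative >= q * total:
            return label
    return classes[-1][0]


q1, median, q3 = (quantile_class(classes, q) for q in (0.25, 0.50, 0.75))
print(n, q1, median, q3, sep=" | ")
```

Most observations pile up in the lowest classes while a thin tail extends past 5 hours, which matches the right-skewed reading given above.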
Exercise 2.10
Use the data set Loan.sav. Compute the mean and median values of
household income. Is this variable right or left skewed? Report and
discuss a box-and-whisker plot of household income.
Session 4