Applied Business Statistics

Textbook
Wooldridge, Jeffrey (2008). Introductory Econometrics:
A Modern Approach. 4th edition, paperback. South-
Western, Division of Thomson Learning.
The Nature of Business Statistics Data
• Reference: Wooldridge, Chapter 1.
• Business statistics is used for:
– Estimating Business Models
– Evaluating & implementing policy
– Forecasting
• What is the effect of education on wages?
• How do training programs impact productivity?
• How will share prices develop in the future?
• Key ingredient: Data – typically in the form of
large samples.
• Data = information.
• Business statistics = a method for processing
data and learning about general patterns in
the population of interest.
• For example, what is the effect of education
on labor market outcomes in the US?
Common structures of Business data
1. Cross-sectional data: Sample of individuals, households, firms,
taken at a given point in time; often obtained from random sampling
from the underlying population.
2. Time series data: Observations on one or several variables over
time (e.g. GDP for Sweden 1971-2011). Time series observations are
unlikely to be independent over time which implies certain
methodological problems that we will study later.
3. Pooled cross sections: Combines cross-section datasets for
different time periods.
4. Panel (or longitudinal) data: Combines cross-section datasets
for different time periods for the same individuals.
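The four structures can be sketched with toy records. This is only an illustration: the names, years, and wage figures below are invented, and the helper `is_balanced_panel` is a hypothetical function, not something from the textbook.

```python
# Hypothetical toy records illustrating the four data structures.
# All names and numbers are invented for illustration only.

# 1. Cross-sectional data: many units, one point in time.
cross_section = [
    {"person": "A", "year": 2010, "wage": 21.0},
    {"person": "B", "year": 2010, "wage": 17.5},
]

# 2. Time series data: one variable observed over several periods.
time_series = [
    {"year": 2009, "gdp_index": 100.0},
    {"year": 2010, "gdp_index": 102.3},
    {"year": 2011, "gdp_index": 104.1},
]

# 3. Pooled cross sections: separate samples drawn in different periods,
#    so the individuals generally differ between periods.
pooled = [
    {"person": "A", "year": 2010, "wage": 21.0},
    {"person": "B", "year": 2010, "wage": 17.5},
    {"person": "C", "year": 2011, "wage": 19.0},
    {"person": "D", "year": 2011, "wage": 23.5},
]

# 4. Panel data: the same individuals followed across periods.
panel = [
    {"person": "A", "year": 2010, "wage": 21.0},
    {"person": "A", "year": 2011, "wage": 22.0},
    {"person": "B", "year": 2010, "wage": 17.5},
    {"person": "B", "year": 2011, "wage": 18.0},
]


def units_by_year(rows):
    """Map each year to the set of units observed in that year."""
    out = {}
    for row in rows:
        out.setdefault(row["year"], set()).add(row["person"])
    return out


def is_balanced_panel(rows):
    """True if exactly the same units appear in every period."""
    groups = list(units_by_year(rows).values())
    return len(groups) > 1 and all(g == groups[0] for g in groups)


print(is_balanced_panel(panel))   # same people in both years
print(is_balanced_panel(pooled))  # different people each year
```

The only difference between the pooled and the panel records is whether the same `person` values recur across years, which is exactly the distinction between structures 3 and 4.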
Example 1.1:
Becker’s model of crime
• Certain crimes have clear economic rewards,
but they also have costs.
• From Becker’s (1968) perspective, the decision
to participate in illegal activity is influenced by
the rewards and costs.
• Now write down an equation describing the
time spent in criminal activity as a function of
various factors:
A model of crime:
y = f ( x1, x2, x3, x4, x5, x6, x7 )
where f ( ) is a function (which remains unspecified for the moment)
y = hours spent in criminal activities
x1 = ”wage” for an hour spent in criminal activity
x2 = hourly wage in legal employment
x3 = other income
x4 = probability of getting caught
x5 = probability of getting convicted if caught
x6 = expected sentence if convicted
x7 = age
Think about whether each x-variable is likely to affect y positively or negatively.
Model Specification
• Before we can undertake statistical analysis
linked to crime or worker productivity, the
models above must be made specific.
• This means we must decide exactly what the
function f( ) looks like.
• A second issue is how to deal with variables
that cannot be observed (e.g. the wage that
someone can earn in criminal activity).
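A model of crime
$\text{crime} = \beta_0 + \beta_1 \text{wage} + \beta_2 \text{othinc} + \beta_3 \text{freqarr} + \beta_4 \text{freqconv} + \beta_5 \text{avgsen} + \beta_6 \text{age} + u$
where:
crime = measure of the frequency of criminal activity;
wage = wage that can be earned legally;
othinc = income from other sources;
freqarr = frequency of arrests for prior crimes;
freqconv = frequency of conviction;
avgsen = average sentence length after conviction.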
Causality
• A common goal for applied statisticians is to estimate
the causal effect of one variable on some outcome of
interest.
• Important: Distinguish correlation (association) from
causation.
• Ceteris paribus: other relevant factors being equal,
what is the effect of…
– a price increase on consumer demand
– training on worker productivity
Causality (cont’d)
If…
a) …we succeed in holding all other relevant
determinants of (say) productivity constant;
and
b) …find a link between training and
productivity,
…then we can conclude that training has a
causal effect on productivity.
Causality (cont’d)
• Ideal setting is experimental: laboratory –
administer treatment to half the sample and
use the other half as control.
• Much of the research in business and
economics uses non-experimental data.
• A key challenge in Business Statistics is to
condition on enough other factors, so that a
case for causality can be made.
Causality: Example
• Goal: Estimate the causal effect of education on wages
• Data: WAGE1.DTA. (Source: 1976 Current Population Survey in the US).
• Scatter plot: average hourly earnings (vertical axis, 0 to 25) against years of education (horizontal axis, 0 to 20).
Causality: Example
• The association visible in the scatter plot does not, of course, imply that education causes higher wages.
• Wages are determined by many factors other than education,
for example innate ability:
– High ability => high wages
– High ability => high education (e.g. intelligent individuals choose more education)
• Perhaps the correlation between education and wages visible
in the graph is driven by ability rather than education?
• To credibly estimate the causal effect of education, we must find
a way of determining the link between education and wages
holding innate ability constant!
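The concern can be illustrated with a small simulation. This is a sketch under invented assumptions, not an analysis of WAGE1.DTA: in the made-up data-generating process below, ability raises both education and wages, while education has no effect on wages at all. All coefficients are hypothetical.

```python
import random

random.seed(0)  # fixed seed so the sketch is reproducible

n = 5000
educ, wage = [], []
for _ in range(n):
    ability = random.gauss(0, 1)
    # Ability raises both education and wages ...
    educ.append(12 + 2 * ability + random.gauss(0, 1))
    # ... but education itself does not enter the wage equation here.
    wage.append(10 + 3 * ability + random.gauss(0, 1))


def corr(x, y):
    """Sample correlation coefficient between two equal-length lists."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5


r = corr(educ, wage)
print(round(r, 2))  # strongly positive despite a zero causal effect
```

A scatter plot of these simulated data would slope upward even though, by construction, changing education with ability held fixed leaves wages unchanged.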
Chapter 2:
The Simple Regression Model
The simple regression model
Suppose we want to ”explain y in terms of x”.
Three issues:
1. Since there’s never an exact relationship between two
variables: how do we allow for other factors affecting y?
2. What is the functional form?
3. Are we capturing a ceteris paribus (causal)
relationship between y and x?
The simple linear regression model
Assume that, in the population, outcome variable y can be
modeled as a function of x as follows:
$y = \beta_0 + \beta_1 x + u$
u: error term; disturbance term; residual; noise
β0, β1: parameters, coefficients, constants
Simple regression: The functional
relationship between y and x is linear:
$y = \beta_0 + \beta_1 x + u$
• If other factors in u are held fixed, so that the
change in u is zero (Δu=0), then in a linear model x
has a constant effect on y:
$\Delta y = \beta_1 \Delta x$ if $\Delta u = 0$
• Hence, β1 is the slope parameter, holding other
factors in u fixed – a parameter of primary interest
in applied business statistics.
• The intercept parameter β0 (sometimes called the
constant term) is rarely central to an analysis.
Examples:
• How do we interpret β1 in these equations?

• What are the ”other factors” that make up u in
these settings?
• To get reliable estimators of β0 and β1 from a random
sample of data we have to make an assumption
restricting how unobservable u is related to the
explanatory variable x.
• The crucial assumption:
$E(u|x) = E(u)$
– The left-hand side is a conditional expectation.
– The right-hand side is an unconditional expectation (just
the expected value of u, regardless of x).
– So, this expression says that the expected value of u does
not depend on x. Formally, we say that u is mean
independent of x.
Detour: Expected Values
• The expected value is one of the most important
concepts related to probability that we will come
across.
• If X is a random variable, the expected value of X is
denoted E(X), or sometimes μ.
• Sometimes the expected value is called the
population mean, emphasizing that X represents
some variable in a population.
The expected value
• The expected value is a weighted average of all
possible values of X, where the weights are
determined by the pdf.
• Example: Suppose X takes on the values -1, 0, and 2
with probabilities 1/8, 1/2, and 3/8, respectively.
Then,
$E(X) = (-1)(1/8) + (0)(1/2) + (2)(3/8)$,
which is equal to 5/8 (or 0.625).
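The same weighted average can be checked in a few lines. This is a minimal sketch using exact fractions; the variable names are chosen here for illustration.

```python
from fractions import Fraction

# Values of X and their probabilities, as given in the example.
values = [-1, 0, 2]
probs = [Fraction(1, 8), Fraction(1, 2), Fraction(3, 8)]

# The probabilities of a discrete random variable must sum to one.
total_prob = sum(probs)

# E(X) = sum over all values of (value * probability).
expected = sum(v * p for v, p in zip(values, probs))

print(expected, float(expected))  # 5/8 0.625
```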

Conditional Expectation
• We can summarize the relationship between one
variable Y and another variable X by looking at
the conditional expectation of Y given X.
• Basic idea: Suppose X has taken on a particular
value, say x.
• Then we can compute the expected value of Y,
given the outcome of X (i.e. x).
• We denote this expected value by E(Y|X=x), or
sometimes just E(Y|x).
Example: Conditional Expectation
• Let (X,Y) represent the population of all working
individuals, where X is education and Y is hourly
wage.
• E(Y|X=12) is simply the average hourly wage for all
people in the population with 12 years of education.
• Similarly, E(Y|X=16) is the average hourly wage for all
people in the population with 16 years of education.
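As a rough sketch, the conditional expectation is just a group average. The six (education, wage) pairs below are an invented miniature "population", and `conditional_mean` is a hypothetical helper written for this illustration.

```python
# Invented miniature population: (years of education, hourly wage).
population = [
    (12, 10.0), (12, 14.0), (12, 12.0),
    (16, 20.0), (16, 18.0), (16, 25.0),
]


def conditional_mean(pairs, x):
    """Average of Y over the units whose X equals x, i.e. E(Y | X = x)."""
    ys = [y for (xi, y) in pairs if xi == x]
    if not ys:
        raise ValueError("no units with X equal to the requested value")
    return sum(ys) / len(ys)


e12 = conditional_mean(population, 12)  # (10 + 14 + 12) / 3 = 12.0
e16 = conditional_mean(population, 16)  # (20 + 18 + 25) / 3 = 21.0
print(e12, e16)
```

Plotting `conditional_mean` against each education level would trace out how the average wage varies with education in this toy population.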
Illustration:
Now back to Chapter 2
• We encountered the following ’crucial assumption’: $E(u|x) = E(u)$
• Look at the figure on the previous slide.

• How would you draw a figure for which
?
Example
Model: $\text{wage} = \beta_0 + \beta_1 \text{educ} + u$
Assumption: E[u | educ] = E[u]
• What does the assumption mean in this
context?
• Does this assumption make sense, given the
context?
A more innocent assumption
• As long as the intercept β0 is included in the
equation, we can always assume that the average
value of u in the population is zero:
$E(u) = 0$
Model: $y = \beta_0 + \beta_1 x + u$
Assumption: $E(u|x) = E(u)$
Assumption: $E(u) = 0$
Now show that the following is true:
$E(y|x) = \beta_0 + \beta_1 x$
• E(y|x) is the population regression function (PRF).
• It is a linear function of x.
• A one-unit increase in x changes the expected value of y by β1.
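One possible way to show this is sketched here. It uses only the linear model y = β0 + β1x + u together with the two assumptions E(u|x) = E(u) and E(u) = 0; the slide itself leaves the step as an exercise, so treat this as a suggested route rather than the intended solution.

```latex
\begin{align*}
E(y \mid x) &= E(\beta_0 + \beta_1 x + u \mid x) \\
            &= \beta_0 + \beta_1 x + E(u \mid x)
               && \text{($\beta_0 + \beta_1 x$ is fixed once $x$ is given)} \\
            &= \beta_0 + \beta_1 x + E(u)
               && \text{(mean independence)} \\
            &= \beta_0 + \beta_1 x
               && \text{(since $E(u) = 0$)}
\end{align*}
```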
Dispersion around the
average, given X = x3
Interpretation:
Breaking y into two parts
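Given the PRF, it follows that
$y = \beta_0 + \beta_1 x + u$
= systematic part of y + unsystematic part of y,
where $\beta_0 + \beta_1 x = E(y|x)$ is the systematic part and u is the unsystematic part.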
Deriving the
Ordinary Least Squares Estimates
• We will now discuss how to estimate the
(unknown) parameters of this model.
• Let’s suppose we have a random sample of
size n drawn from the population:
$\{(x_i, y_i): i = 1, \dots, n\}$
• Note the i-subscripts on the two variables
(observation indices). The regression model:
$y_i = \beta_0 + \beta_1 x_i + u_i$
Estimation procedure
Assumption: $E(u|x) = E(u)$
Assumption: $E(u) = 0$
The first of these (mean independence) implies zero covariance
between x and u. We can now re-write the above assumptions as
$E(u) = 0$ (2.10)
$Cov(x,u) = E(xu) = 0$ (2.11)
(Note: Cov(x,u) means the covariance between x and u.
See Section B.4 in Appendix B.)
Note: $u = y - \beta_0 - \beta_1 x$
Re-write the above assumptions again:
$E(y - \beta_0 - \beta_1 x) = 0$ (2.12)
$E[x(y - \beta_0 - \beta_1 x)] = 0$ (2.13)
• We have two unknown parameters. Can’t we just solve for
these?
• Not quite, because we don’t know the expected values of
y and x in the population.
• It’s precisely for this reason that the best we can hope to
do is estimate the parameters.
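This applies for the population:
$E(y - \beta_0 - \beta_1 x) = 0$ (2.12)
$E[x(y - \beta_0 - \beta_1 x)] = 0$ (2.13)
The ingredients are unobserved.
Suppose we choose estimates $\hat{\beta}_0$ and $\hat{\beta}_1$
to solve the sample counterparts of (2.12) & (2.13):
$\frac{1}{n}\sum_{i=1}^{n} (y_i - \hat{\beta}_0 - \hat{\beta}_1 x_i) = 0$
$\frac{1}{n}\sum_{i=1}^{n} x_i (y_i - \hat{\beta}_0 - \hat{\beta}_1 x_i) = 0$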
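• Show how we can solve for $\hat{\beta}_0$ and $\hat{\beta}_1$ from these
equations (you need to know how to do this).
The solution is expressed in terms of the sample averages $\bar{y} = \frac{1}{n}\sum_{i=1}^{n} y_i$ and $\bar{x} = \frac{1}{n}\sum_{i=1}^{n} x_i$.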
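One way to carry out this step is sketched below. It starts from the two sample moment conditions (restated in the first display) and is meant as a guide for checking your own derivation, not as the only valid route.

```latex
% Sample counterparts of the two population moment conditions:
\begin{align*}
\frac{1}{n}\sum_{i=1}^{n} \bigl(y_i - \hat{\beta}_0 - \hat{\beta}_1 x_i\bigr) &= 0, \\
\frac{1}{n}\sum_{i=1}^{n} x_i \bigl(y_i - \hat{\beta}_0 - \hat{\beta}_1 x_i\bigr) &= 0.
\end{align*}

% Step 1: the first condition pins down the intercept.
\[
\bar{y} = \hat{\beta}_0 + \hat{\beta}_1 \bar{x}
\quad\Longrightarrow\quad
\hat{\beta}_0 = \bar{y} - \hat{\beta}_1 \bar{x}
\]

% Step 2: substitute the intercept into the second condition.
\[
\sum_{i=1}^{n} x_i (y_i - \bar{y}) = \hat{\beta}_1 \sum_{i=1}^{n} x_i (x_i - \bar{x})
\]

% Step 3: use the identities
\[
\sum_{i=1}^{n} x_i (x_i - \bar{x}) = \sum_{i=1}^{n} (x_i - \bar{x})^2,
\qquad
\sum_{i=1}^{n} x_i (y_i - \bar{y}) = \sum_{i=1}^{n} (x_i - \bar{x})(y_i - \bar{y})
\]

% Result, provided the denominator is strictly positive:
\[
\hat{\beta}_1 = \frac{\sum_{i=1}^{n} (x_i - \bar{x})(y_i - \bar{y})}{\sum_{i=1}^{n} (x_i - \bar{x})^2}
\]
```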
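$\hat{\beta}_0 = \bar{y} - \hat{\beta}_1 \bar{x}$ (2.17)
$\hat{\beta}_1 = \frac{\sum_{i=1}^{n} (x_i - \bar{x})(y_i - \bar{y})}{\sum_{i=1}^{n} (x_i - \bar{x})^2}$ (2.19)
• These are the OLS estimates of the
parameters of the simple regression model.
• Eq. (2.19): Sample covariance between x and y
divided by the sample variance of x.
• Hence, the sign of $\hat{\beta}_1$ is always the same as
the sign of the sample Cov(x,y) for this model.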
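The two formulas (slope = sum of cross-deviations divided by sum of squared deviations of x; intercept = mean of y minus slope times mean of x) translate directly into code. The sketch below uses only the standard library; the function name `ols_simple` and the five data points are invented for illustration.

```python
def ols_simple(x, y):
    """Return (b0_hat, b1_hat) for the simple regression of y on x."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("x and y must have the same length (at least 2)")
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    # Numerator and denominator of eq. (2.19).
    s_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    s_xx = sum((xi - x_bar) ** 2 for xi in x)
    if s_xx == 0:
        raise ValueError("no variation in x: the slope is not defined")
    b1 = s_xy / s_xx          # eq. (2.19)
    b0 = y_bar - b1 * x_bar   # eq. (2.17)
    return b0, b1


# Hypothetical data, chosen only to make the arithmetic easy to follow.
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.1, 3.9, 6.2, 7.8, 10.1]

b0_hat, b1_hat = ols_simple(x, y)
print(round(b0_hat, 2), round(b1_hat, 2))  # 0.05 1.99
```

Because the denominator is a sum of squares, the sign of the slope is determined entirely by the numerator, the sample co-movement of x and y.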
Note: We require
$\sum_{i=1}^{n} (x_i - \bar{x})^2 > 0$
• If this does not hold, $\hat{\beta}_1$ is not defined.
• Why?
• What does this imply in practice?
Why is this estimator called the ’ordinary
least squares’ (OLS) estimator?
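• To see why, first define a fitted value for y when
x=xi as
$\hat{y}_i = \hat{\beta}_0 + \hat{\beta}_1 x_i$
• Next, define the residual for observation i as
$\hat{u}_i = y_i - \hat{y}_i = y_i - \hat{\beta}_0 - \hat{\beta}_1 x_i$
Note that there are n such residuals.
• The OLS estimates minimize the sum of squared residuals:
$\sum_{i=1}^{n} \hat{u}_i^2 = \sum_{i=1}^{n} (y_i - \hat{\beta}_0 - \hat{\beta}_1 x_i)^2$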
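The least-squares property can be checked numerically. In this sketch (invented data, standard library only) the sum of squared residuals at the OLS estimates is compared with its value at nearby alternative intercept/slope pairs, and the two sample moment conditions are verified as well.

```python
# Hypothetical data for illustration.
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.1, 3.9, 6.2, 7.8, 10.1]
n = len(x)

# OLS estimates from the covariance/variance formulas.
x_bar, y_bar = sum(x) / n, sum(y) / n
b1 = (sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
      / sum((xi - x_bar) ** 2 for xi in x))
b0 = y_bar - b1 * x_bar


def ssr(a0, a1):
    """Sum of squared residuals for a candidate line a0 + a1 * x."""
    return sum((yi - a0 - a1 * xi) ** 2 for xi, yi in zip(x, y))


# Residuals at the OLS estimates.
residuals = [yi - b0 - b1 * xi for xi, yi in zip(x, y)]
sum_resid = sum(residuals)                                  # ~ 0
sum_x_resid = sum(xi * ui for xi, ui in zip(x, residuals))  # ~ 0

# Perturb both estimates on a small grid; no candidate should beat OLS.
best = ssr(b0, b1)
steps = [-0.5, -0.1, 0.1, 0.5]
ols_is_best = all(ssr(b0 + d0, b1 + d1) > best
                  for d0 in steps for d1 in steps)
print(ols_is_best)
```

A grid check is of course not a proof; the formal argument sets the two first-order conditions of the minimization problem to zero, which reproduces the sample moment conditions.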
Some related concepts…
• The OLS regression line (or, the sample
regression function; SRF):
$\hat{y} = \hat{\beta}_0 + \hat{\beta}_1 x$
• Interpretation:
What is the difference between the sample regression
function and the population regression function?
Taking stock
• We’ve derived the OLS estimator from two explicit
assumptions. Understanding these and why they
matter is important.
• We have said nothing so far about the statistical
properties of OLS
• So still some way to go… But we’ve made a start!
Now let’s look at an example.
Example:
CEO Salary and Return on Equity
• These data were obtained by Wooldridge from the
May 6, 1991 issue of Businessweek.
• It would be interesting to do a similar data collection
exercise now and then investigate if the relationship
between RoE and pay has changed.
Scatter plot: 1990 salary, thousands $ (vertical axis, 0 to 15000) against return on equity, 88-90 avg (horizontal axis, 0 to 60)
Cov(salary,roe) = 1342.5
Corr(salary,roe) = 0.11
Var(roe) = 72.6
We have enough information to figure out that the
regression coefficient on roe will be equal to…
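Since the OLS slope is the sample covariance divided by the sample variance of the regressor, the two reported statistics are enough. A minimal sketch (variable names are chosen here for illustration):

```python
# Summary statistics as reported with the scatter plot.
cov_salary_roe = 1342.5
var_roe = 72.6

# OLS slope = Cov(x, y) / Var(x), with x = roe and y = salary.
slope = cov_salary_roe / var_roe
print(round(slope, 2))  # 18.49
```

The small gap to the coefficient in the regression output (18.50119) is consistent with the covariance and variance having been rounded before being printed on the slide.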
Results from simple OLS regression
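. reg salary roe

      Source |       SS       df       MS              Number of obs =     209
-------------+------------------------------           F(  1,   207) =    2.77
       Model |  5166419.04     1  5166419.04           Prob > F      =  0.0978
    Residual |   386566563   207  1867471.32           R-squared     =  0.0132
-------------+------------------------------           Adj R-squared =  0.0084
       Total |   391732982   208  1883331.64           Root MSE      =  1366.6

------------------------------------------------------------------------------
      salary |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
         roe |   18.50119   11.12325     1.66   0.098    -3.428196    40.43057
       _cons |   963.1913   213.2403     4.52   0.000     542.7902    1383.592
------------------------------------------------------------------------------
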
• How do we interpret this equation?
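One way to read the output is to write the fitted line as predicted salary = 963.1913 + 18.50119 × roe and evaluate it. The sketch below assumes salary is measured in thousands of dollars (as the plot axis states) and that roe is in percentage points (an assumption based on the textbook's description of the data); the function name is invented.

```python
# Coefficients copied from the regression output.
B0_HAT = 963.1913   # _cons
B1_HAT = 18.50119   # coefficient on roe


def predicted_salary(roe):
    """Fitted salary (thousands of dollars) for a given return on equity."""
    return B0_HAT + B1_HAT * roe


p0 = predicted_salary(0)    # intercept: predicted salary when roe = 0
p30 = predicted_salary(30)  # predicted salary when roe = 30
one_point = predicted_salary(21) - predicted_salary(20)  # slope

print(round(p0, 1), round(p30, 1), round(one_point, 3))
```

Under these unit assumptions, a one-point higher return on equity is associated with a predicted salary that is about 18.5 thousand dollars higher. The 95% confidence interval for the slope includes zero, so the estimate is imprecise.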
Figure: The regression line. 1990 salary, thousands $ (scatter) and fitted values (line) against return on equity, 88-90 avg.
Assignment: Use the following model:
$Y = \beta_0 + \beta_1 X + u$
Compute OLS estimates based on the following data:

+---------+
|  Y    X |
|---------|
|  2    1 |
|  6    9 |
|  4    3 |
| 12   18 |
|  4    5 |
| 10   15 |
|  6    7 |
|  6    7 |
| 16   20 |
| 12   13 |
|  8   14 |
|  6    5 |
+---------+

Note: Complete the assignment using pencil, paper and a pocket
calculator. Once you have an answer, you may want to check it
using some statistical software (e.g. Stata).
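If Stata is not at hand, a hand calculation can also be checked with a short script. The sketch below mirrors the pocket-calculator route (raw sums first, then the slope and intercept) for a simple regression of Y on X with an intercept, which is the assumed model here; the variable names are chosen for illustration.

```python
# Data copied from the assignment table (12 observations).
Y = [2, 6, 4, 12, 4, 10, 6, 6, 16, 12, 8, 6]
X = [1, 9, 3, 18, 5, 15, 7, 7, 20, 13, 14, 5]
n = len(X)

# Raw sums, as one would accumulate them on a calculator.
sum_x = sum(X)
sum_y = sum(Y)
sum_xy = sum(x * y for x, y in zip(X, Y))
sum_xx = sum(x * x for x in X)

# Deviation-form quantities via the shortcut formulas:
#   S_xy = sum(xy) - sum(x) * sum(y) / n,  S_xx = sum(x^2) - sum(x)^2 / n
s_xy = sum_xy - sum_x * sum_y / n
s_xx = sum_xx - sum_x ** 2 / n

b1 = s_xy / s_xx                   # slope estimate
b0 = sum_y / n - b1 * sum_x / n    # intercept estimate

print("sums:", sum_x, sum_y, sum_xy, sum_xx)
print("slope:", round(b1, 4), "intercept:", round(b0, 4))
```

Comparing the printed intermediate sums with your own worksheet makes it easy to locate an arithmetic slip before comparing the final estimates.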