"Grouped" Data

We can extend this idea of likelihood to handle data that are grouped into intervals. Suppose an
observation is recorded as falling in the interval $(a, b]$ instead of as a specific value such as $c$. Then the
likelihood of this observation is $\Pr(a < X \le b) = F(b) - F(a)$, where $F(x)$ is the CDF of $X$.

By extension, for independently drawn "grouped" observations (i.e. the interval that each observation falls in
is known), the likelihood function of generic parameter θ is

$$L(\theta) = \prod_{i=1}^{n} \left[F(b_i) - F(a_i)\right]$$

where the $i$th observation is recorded to be in the interval $(a_i, b_i]$.

Once $L(\theta)$ is found, apply the same strategy already discussed to estimate $\theta$.
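The product formula above can be sketched in code as a sum of logs. This is a minimal illustration, not part of the lesson: the function name `grouped_log_likelihood`, the `(a, b, count)` layout, and the exponential CDF with its sample counts are all assumptions chosen only to make the sketch runnable.

```python
import math
from typing import Callable, Sequence, Tuple


def grouped_log_likelihood(
    theta: float,
    cdf: Callable[[float, float], float],
    groups: Sequence[Tuple[float, float, int]],
) -> float:
    """Log-likelihood of theta for grouped data.

    Each entry of `groups` is (a, b, count): `count` observations fell in (a, b].
    `cdf(x, theta)` is assumed to return F(x) under parameter theta.
    """
    total = 0.0
    for a, b, count in groups:
        prob = cdf(b, theta) - cdf(a, theta)  # Pr(a < X <= b) = F(b) - F(a)
        total += count * math.log(prob)       # identical intervals share one term
    return total


# Hypothetical illustration: an exponential CDF and made-up interval counts.
def exponential_cdf(x: float, rate: float) -> float:
    return 1.0 - math.exp(-rate * x)


example_groups = [(0.0, 1.0, 4), (1.0, 2.0, 3), (2.0, 5.0, 3)]
print(grouped_log_likelihood(0.5, exponential_cdf, example_groups))
```

Working on the log scale turns the product into a sum, which avoids underflow when there are many observations.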

EXAMPLE 2.2.3

The following table details how long it took 30 independent computers to finish running a particular script of
code:

Number of Computers    Running Time (seconds)
3                      (0, 1]
5                      (1, 3]
8                      (3, 7]
12                     (7, 15]
2                      (15, 31]

Suppose a Pareto distribution with the following CDF is used to model the running times:

$$F(x) = 1 - \left(\frac{1}{x+1}\right)^{\alpha}, \quad x > 0$$

Estimate $\alpha$ using MLE.

SOLUTION

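For generic constants $a$ and $b$,

$$
\begin{aligned}
F(b) - F(a) &= \left[1 - \left(\frac{1}{b+1}\right)^{\alpha}\right] - \left[1 - \left(\frac{1}{a+1}\right)^{\alpha}\right] \\
&= \left(\frac{1}{a+1}\right)^{\alpha} - \left(\frac{1}{b+1}\right)^{\alpha} \\
&= \frac{1}{(a+1)^{\alpha}} - \frac{1}{(b+1)^{\alpha}}
\end{aligned}
$$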

Therefore, the likelihood function of α is

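$$L(\alpha) = \left[1 - \frac{1}{2^{\alpha}}\right]^{3} \left[\frac{1}{2^{\alpha}} - \frac{1}{4^{\alpha}}\right]^{5} \left[\frac{1}{4^{\alpha}} - \frac{1}{8^{\alpha}}\right]^{8} \left[\frac{1}{8^{\alpha}} - \frac{1}{16^{\alpha}}\right]^{12} \left[\frac{1}{16^{\alpha}} - \frac{1}{32^{\alpha}}\right]^{2}$$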
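As a rough numerical companion to this likelihood, the sketch below evaluates the log-likelihood at a few trial values of α using the table from Example 2.2.3. The names `GROUPS`, `pareto_cdf`, and `log_likelihood`, and the choice of trial values, are assumptions made for illustration.

```python
import math

# (a, b, count) rows taken from the running-time table in Example 2.2.3.
GROUPS = [(0, 1, 3), (1, 3, 5), (3, 7, 8), (7, 15, 12), (15, 31, 2)]


def pareto_cdf(x: float, alpha: float) -> float:
    """F(x) = 1 - (1 / (x + 1))**alpha for x > 0, as given in the example."""
    return 1.0 - (1.0 / (x + 1.0)) ** alpha


def log_likelihood(alpha: float) -> float:
    """Sum of count * ln[F(b) - F(a)] over the five intervals."""
    return sum(
        n * math.log(pareto_cdf(b, alpha) - pareto_cdf(a, alpha))
        for a, b, n in GROUPS
    )


# Trial values are arbitrary; they only show how the log-likelihood varies with alpha.
for alpha in (0.25, 0.5, 0.75, 1.0):
    print(alpha, log_likelihood(alpha))
```

Among these four trial values, the log-likelihood is largest at α = 0.5, which suggests the maximizer lies somewhere near that value.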

From here, we could brute-force our way with arithmetic and calculus. However, we can reduce our efforts
by substituting

$$\beta = \frac{1}{2^{\alpha}}$$

to express the likelihood function as a function of $\beta$.

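$$\tilde{L}(\beta) = (1 - \beta)^{3} \left(\beta - \beta^{2}\right)^{5} \left(\beta^{2} - \beta^{3}\right)^{8} \left(\beta^{3} - \beta^{4}\right)^{12} \left(\beta^{4} - \beta^{5}\right)^{2}$$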

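Furthermore, we can express each of the five factors using only $\beta$ and $1 - \beta$, each raised to some
power. For example, $\left(\beta^{2} - \beta^{3}\right)^{8} = \left[\beta^{2}(1 - \beta)\right]^{8} = \beta^{2(8)} (1 - \beta)^{8}$. As a result,

$$
\begin{aligned}
\tilde{L}(\beta) &= \beta^{5 + 2(8) + 3(12) + 4(2)} (1 - \beta)^{3 + 5 + 8 + 12 + 2} \\
&= \beta^{65} (1 - \beta)^{30}
\end{aligned}
$$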
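The simplification can be spot-checked numerically. The sketch below compares the five-factor form with the collapsed form at a few arbitrary values of β; the function names and test points are assumptions for illustration only.

```python
import math


def likelihood_factored(beta: float) -> float:
    """Five-factor form of the likelihood in terms of beta."""
    return (
        (1 - beta) ** 3
        * (beta - beta**2) ** 5
        * (beta**2 - beta**3) ** 8
        * (beta**3 - beta**4) ** 12
        * (beta**4 - beta**5) ** 2
    )


def likelihood_simplified(beta: float) -> float:
    """Collapsed form: beta**65 * (1 - beta)**30."""
    return beta**65 * (1 - beta) ** 30


# Arbitrary points in (0, 1); the two forms should agree up to rounding.
for beta in (0.2, 0.5, 0.7, 0.9):
    same = math.isclose(
        likelihood_factored(beta), likelihood_simplified(beta), rel_tol=1e-9
    )
    print(beta, same)
```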
With this setup, the $\alpha$ that maximizes $L(\alpha)$ can be found by first calculating the $\beta$ that maximizes $\tilde{L}(\beta)$,
then plugging it into the equation $\beta = \frac{1}{2^{\alpha}}$. The reason why this works will be discussed at the end of the
solution.

Proceed with the usual steps.

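$$\tilde{l}(\beta) = \ln\left[\tilde{L}(\beta)\right] = 65 \ln[\beta] + 30 \ln[1 - \beta]$$

$$
\begin{aligned}
\tilde{l}'(\beta) = \frac{65}{\beta} - \frac{30}{1 - \beta} &= 0 \\
65(1 - \beta) - 30\beta &= 0 \\
65 - 95\beta &= 0 \\
\beta &= \frac{65}{95}
\end{aligned}
$$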

Therefore,

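$$
\begin{aligned}
\frac{1}{2^{\alpha}} &= \frac{65}{95} \\
2^{-\alpha} &= \frac{65}{95} \\
-\alpha \ln[2] &= \ln\left[\frac{65}{95}\right] \\
\hat{\alpha} &= -\frac{\ln[65/95]}{\ln[2]} = 0.547
\end{aligned}
$$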
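The closed-form answer can be cross-checked against a direct numerical maximization. The sketch below uses a simple ternary search, which is an assumed choice of method (the lesson itself solves the problem analytically), and the search bounds are arbitrary.

```python
import math

# Closed-form estimate from the solution: beta = 65/95, alpha = -ln(beta) / ln(2).
beta_hat = 65 / 95
alpha_hat = -math.log(beta_hat) / math.log(2)
print(round(alpha_hat, 3))  # 0.547


def log_likelihood(alpha: float) -> float:
    """l(alpha) = 65 ln(beta) + 30 ln(1 - beta), with beta = 2**(-alpha)."""
    beta = 2.0 ** (-alpha)
    return 65 * math.log(beta) + 30 * math.log(1 - beta)


# Ternary search on an assumed bracket; valid because l(alpha) has a single peak.
lo, hi = 0.01, 5.0
for _ in range(200):
    m1 = lo + (hi - lo) / 3
    m2 = hi - (hi - lo) / 3
    if log_likelihood(m1) < log_likelihood(m2):
        lo = m1  # the maximizer lies to the right of m1
    else:
        hi = m2  # the maximizer lies to the left of m2
alpha_numeric = (lo + hi) / 2
print(round(alpha_numeric, 3))
```

The numerical search does not rely on the substitution, so agreement between the two values supports the shortcut.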

COACH'S REMARKS

Applying the chain rule,

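$$l'(\alpha) = \tilde{l}'(\beta) \cdot \frac{d\beta}{d\alpha} = \tilde{l}'(\beta) \cdot \left(-\frac{\ln[2]}{2^{\alpha}}\right)$$

$$l'(\alpha) = 0 \;\Rightarrow\; \tilde{l}'(\beta) \cdot \left(-\frac{\ln[2]}{2^{\alpha}}\right) = 0$$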

For a Pareto distribution, $\alpha > 0$. Notice that no finite value of $\alpha$ can make $-\ln[2] / 2^{\alpha}$ equal 0.
Therefore, $-\ln[2] / 2^{\alpha}$ cannot contribute a solution, and the equation simplifies to
$\tilde{l}'(\beta) = 0$.

In general, when performing an $\eta = g(\theta)$ substitution, you should consider whether $\frac{d\eta}{d\theta}$
can equal 0 for some possible $\theta$.
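The chain-rule identity in these remarks can be checked with a finite-difference approximation. This is only a sketch: the evaluation point `alpha = 0.8` and step size `h` are arbitrary assumptions, and the function names are illustrative.

```python
import math


def l_alpha(alpha: float) -> float:
    """Log-likelihood written directly in alpha, with beta = 2**(-alpha)."""
    beta = 2.0 ** (-alpha)
    return 65 * math.log(beta) + 30 * math.log(1 - beta)


def l_tilde_prime(beta: float) -> float:
    """Derivative of the log-likelihood with respect to beta."""
    return 65 / beta - 30 / (1 - beta)


def dbeta_dalpha(alpha: float) -> float:
    """d(beta)/d(alpha) = -ln(2) / 2**alpha, which is never 0 for finite alpha."""
    return -math.log(2) / 2.0**alpha


alpha, h = 0.8, 1e-6  # arbitrary evaluation point and step size
numeric = (l_alpha(alpha + h) - l_alpha(alpha - h)) / (2 * h)  # central difference
chain = l_tilde_prime(2.0 ** (-alpha)) * dbeta_dalpha(alpha)   # chain-rule product
print(numeric, chain)
```

Because the second factor is strictly negative, the product can only vanish where the derivative with respect to β vanishes.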

MORE INFORMATION

The "take first derivative; set equal to 0; solve" strategy may not always work. It takes for granted that $l(\theta)$ (or
$L(\theta)$) has a global maximum at a critical point. Regardless, remember that the objective is to maximize
$l(\theta)$ (or $L(\theta)$). If and when the default strategy fails, it may be possible to obtain an estimate by reasoning
things through.
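One standard case where the derivative strategy fails, offered here as an assumed illustration rather than something taken from this lesson, is complete data from a uniform distribution on (0, θ). There $L(\theta) = \theta^{-n}$ for θ at least as large as the biggest observation and 0 otherwise, so the derivative is never 0 and the maximum sits at the boundary. The sample and candidate values below are hypothetical.

```python
from typing import Sequence

# Hypothetical complete-data sample, assumed to come from Uniform(0, theta).
sample = [1.2, 0.4, 2.7, 1.9]


def uniform_likelihood(theta: float, data: Sequence[float]) -> float:
    """L(theta) = theta**(-n) if theta >= max(data), else 0."""
    if theta < max(data):
        return 0.0  # some observation would be impossible under this theta
    return theta ** (-len(data))


# Arbitrary candidates: the likelihood jumps from 0 at max(data), then decreases.
candidates = [2.0, 2.7, 3.0, 4.0]
for theta in candidates:
    print(theta, uniform_likelihood(theta, sample))

best = max(candidates, key=lambda t: uniform_likelihood(t, sample))
print(best)  # 2.7
```

Reasoning it through, the likelihood is strictly decreasing wherever it is positive, so the estimate is the smallest admissible θ, namely the largest observation.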
