University of Southampton lecture notes Chp 1-11


In this module we are going to examine the properties of distributions of random variables. We begin by defining what we mean by a random variable.

1.1 Definitions:

The set of all possible elementary outcomes of an experiment is called the sample space.

A random variable is a mapping of a sample space to the real line.

Note: we will usually denote a random variable by a capital letter (e.g. Y) and the value taken by a random

variable by a lower case letter (e.g. y).

A discrete random variable is a random variable that can take only a finite or countably infinite

number of values.

Note: in this module discrete random variables will usually be integer-valued.

[see 1024 P10]

1.2 Examples

If the experiment consists of tossing a coin with outcomes head or tail and we toss the coin once,

then clearly the sample space is {head, tail}. One possible random variable, X say, is the number of

heads obtained. X can only take values 0 or 1 and so is a discrete random variable.

Suppose we are interested in monitoring the number of hits at a web site in a year. Denote the

number by Y. Clearly, Y must be integer-valued and so is a discrete random variable, but there is no

obvious upper bound to Y, so it may be convenient to take the set of possible values of Y to be the

countably infinite set {0, 1, 2, …}.

Associated with a discrete random variable is a probability function (sometimes called the

probability mass function), which gives the probability of each possible value of the random

variable.

Let the random variable of interest be denoted by X with a set of possible values D. Then the

probability function p(.) is given by

p(x) = P(X = x), for x ∈ D.

Note: It is important to include the domain D when specifying a probability function.

Note also that, for any probability function, ∑_{x∈D} p(x) = 1.

[see 1024 P10]

1.4 Example

If the experiment consists of tossing a coin with outcomes head or tail and we toss the coin once,

then clearly the sample space is {head, tail}. Suppose X, the number of heads obtained, is our

random variable of interest. Then, if the coin is fair, the probability function of X is

p(0) = p(1) = ½.

However, if the fairness of the coin is unknown the probability function of X could be taken to be

p(x) = θ^x (1 − θ)^{1−x}, x = 0, 1.

Here θ is a parameter, i.e. a fixed but unknown constant. Clearly, θ must lie between 0 and 1 in

this case since it is a probability. This is a common situation encountered in Statistics: we might

assume we know the form of a probability function but it contains one or more unknown quantities

(parameters) whose value(s) we need to estimate from sample data.

A Bernoulli trial is an experiment with just two possible outcomes, success and failure, that occur with probabilities θ and 1 − θ respectively, where θ is the success probability.

A Bernoulli random variable X has probability function

p(x) = θ^x (1 − θ)^{1−x}, x = 0, 1,

where 0 ≤ θ ≤ 1.

This is a basic building block for some familiar but more complex discrete random variables.

[see 1024 P10]

1.6.1 Binomial random variables

Suppose we undertake a fixed number, n, of independent Bernoulli trials, each with success

probability θ. Let X be the number of successes in these n trials. Then X is a Binomial random

variable with probability function

p(x) = [n!/(x!(n − x)!)] θ^x (1 − θ)^{n−x}, x = 0, 1, …, n,

where 0 ≤ θ ≤ 1.

We will often say in such circumstances "X is Binomially distributed", "X is Binomial(n, θ)" or X ~ Binomial(n, θ).
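As a quick numerical check (an illustration added here, not part of the original notes), the Binomial probability function can be evaluated in a few lines of Python; the name `binomial_pmf` is just for this sketch:

```python
import math

def binomial_pmf(x, n, theta):
    """P(X = x) for X ~ Binomial(n, theta)."""
    return math.comb(n, x) * theta**x * (1 - theta)**(n - x)

# The probabilities over x = 0, 1, ..., n should sum to one.
total = sum(binomial_pmf(x, 10, 0.3) for x in range(11))
```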

[see 1024 P10]


1.6.2 Negative Binomial random variables (including Geometric random variables)

Suppose we undertake a sequence of independent Bernoulli trials, each with success probability θ.

Let X be the number of failures that occur before the kth success. Then X is called a Negative

Binomial random variable with probability function

p(x) = [(k + x − 1)!/((k − 1)! x!)] θ^k (1 − θ)^x, for x = 0, 1, 2, …,

where 0 ≤ θ ≤ 1.

In the special case with k = 1, X is called a Geometric random variable with probability function

p(x) = θ(1 − θ)^x, for x = 0, 1, 2, ….
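For a quick sanity check (not in the original notes), here is a Python sketch of these two probability functions, confirming that the Geometric is the k = 1 special case:

```python
import math

def negbin_pmf(x, k, theta):
    """P(X = x): probability of x failures before the k-th success."""
    coeff = math.factorial(k + x - 1) // (math.factorial(k - 1) * math.factorial(x))
    return coeff * theta**k * (1 - theta)**x

def geometric_pmf(x, theta):
    """Special case of the Negative Binomial with k = 1."""
    return theta * (1 - theta)**x
```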

Poisson random variables arise in a variety of practical situations where events occur apparently at random with a rate of occurrence λ per unit time, e.g. queuing theory. The random variable Y is

defined as

Y = the number of events in an interval of fixed length t.

Provided that an event occurs instantaneously, the range of possible values for Y is 0, 1, 2, …,

showing that this is a discrete random variable defined over the non-negative integers.

Then the probability function for a Poisson random variable is given by

p(y) = e^{−λt}(λt)^y/y!, y = 0, 1, …,

where λ > 0.

This result was obtained in detail in MATH1024.

Of course, by defining the time unit appropriately one can take t = 1, giving probability function

p(y) = e^{−λ}λ^y/y!, y = 0, 1, …,

where λ > 0. This is a useful form of the probability function if one is aiming to model count data

more generally, particularly when time is not the main focus.

[see 1024 P13]

Suppose that X ~ Binomial(n, θ) with nθ = λ and let n → ∞; then it can be shown that for fixed x

p(x) = [n!/(x!(n − x)!)] θ^x (1 − θ)^{n−x} → e^{−λ}λ^x/x!

as n → ∞.

So for large n and small θ, Binomial probabilities can be approximated by Poisson probabilities.
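This limiting behaviour can be seen numerically. The following Python sketch (an illustration added here, not from the notes) fixes λ = nθ = 2 and shows that the largest disagreement between the two probability functions shrinks as n grows:

```python
import math

def binomial_pmf(x, n, theta):
    return math.comb(n, x) * theta**x * (1 - theta)**(n - x)

def poisson_pmf(x, lam):
    return math.exp(-lam) * lam**x / math.factorial(x)

# Hold lam = n * theta = 2 fixed and let n grow: the Binomial pmf
# approaches the Poisson pmf, so the worst-case error shrinks.
errs = []
for n in (10, 100, 1000):
    theta = 2.0 / n
    errs.append(max(abs(binomial_pmf(x, n, theta) - poisson_pmf(x, 2.0))
                    for x in range(6)))
```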


1.9 A practical illustration of using discrete random variables

Though this module is concerned primarily with obtaining theoretical results about random

variables, it is useful to remember that the results are often useful in applications. Here is an

illustration of a simple example in statistical modelling.

Suppose that we have data consisting of the number of oil producing wells in a region of Texas

(Data given in Davis (1986)). This shows the locations of oil-field discovery wells in part of the

Eastern Shelf area of the Permian Basin, Fisher and Noland counties, Texas. One question that could be asked of these data is: "Are the oil wells occurring at random in this region, or is there some pattern to their distribution?"

We may investigate this by defining a suitable random variable and investigating its distribution to

see whether a Poisson distribution might be appropriate. If so, then this would confirm that the

wells are occurring at random. Suppose that we define selected areas according to the grid of

squares in the picture below (these squares are called quadrats), and count the number of wells in

each area. Then the pictorial representation has been transformed into data consisting of counts of

wells over the grid. The data are discrete, in that only integer values are obtainable.

[Figure: map of well locations with the grid of quadrats overlaid.]

FIGURE 1. Locations of oil field discovery wells in part of the Eastern Shelf

area of the Permian Basin, Fisher and Noland counties, Texas. Quadrats are

approximately 10 square miles in size.


Suppose we count the number of wells in each quadrat. The data set produced is shown below,

from which we might ask the following questions.

1. What proportion of quadrats have no wells?

2. Is this the pattern we would expect if these wells were randomly distributed over this region?

The counts in the 160 quadrats are shown below:

0 1 2 2 1 2 0 0 0 0

0 0 2 0 3 3 1 1 0 0

0 3 0 0 3 5 2 1 1 0

2 3 1 0 1 4 1 1 0 1

3 2 2 0 0 1 2 1 2 0

3 0 0 3 1 0 2 1 0 0

2 0 0 0 0 3 1 1 1 0

4 0 0 1 0 0 4 2 1 1

2 0 1 0 0 0 0 1 2 1

1 0 0 1 0 0 2 2 1 0

0 2 2 0 0 1 0 0 0 1

0 1 0 0 1 1 1 0 2 2

1 1 3 4 2 1 3 0 1 0

0 3 3 3 1 0 0 0 0 1

2 2 3 1 1 0 0 0 0 0

6 3 1 2 2 1 0 0 0 0

The first step is to summarise these data and extract answers to some or all of these questions.

If we count the number of 0s, 1s etc and arrange the results in a table the following is produced:

Number of wells | Frequency | Cumulative frequency | Relative frequency | Cumulative relative frequency
0 | 69 | 69 | 0.4313 | 0.4313
1 | 43 | 112 | 0.2687 | 0.7000
2 | 26 | 138 | 0.1625 | 0.8625
3 | 16 | 154 | 0.1000 | 0.9625
4 | 4 | 158 | 0.0250 | 0.9875
5 | 1 | 159 | 0.0062 | 0.9938
6 | 1 | 160 | 0.0062 | 1.0000

This table shows the frequency of the number of wells in a quadrat, the cumulative frequency, the

proportion and the cumulative proportion. We can see that a proportion 0.4313 of the quadrats

contain no wells. Multiplying these proportions by 100 gives percentage frequencies which show

the percentage of the sample which has 0, 1, etc wells.


To obtain a diagram illustrating this data the frequencies or relative frequencies may be plotted

against the count number as in the diagram below.

[Figure: bar chart of frequency (left axis) and relative frequency (right axis) against number of wells.]

Figure 2 Frequency diagram for the discrete well data

The average number of wells per quadrat is found by adding the 160 observations and dividing by

the sample size (i.e. number of quadrats = 160).

Mean = (43×1 + 26×2 + 16×3 + … + 1×6)/160

= ∑(no. of wells × freq)/160

= ∑(no. of wells × rel. freq)

= 1.0625.

i.e. on average there is just over one well per quadrat in this area.
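For illustration (not part of the original notes), the same sample mean can be computed from the frequency table in a few lines of Python:

```python
counts = {0: 69, 1: 43, 2: 26, 3: 16, 4: 4, 5: 1, 6: 1}  # wells per quadrat -> frequency
n = sum(counts.values())                                  # total number of quadrats
mean = sum(wells * freq for wells, freq in counts.items()) / n
```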

Does the Poisson distribution provide a good explanation for these data?

Notice that if t is the length of the interval considered (in this case the area of a quadrat), and λ is the rate of occurrence per unit area, then we would expect on average an interval to contain μ = λt events. This value μ is the mean of the random variable of interest and is the parameter in the Poisson probability function which must be determined in order to try to answer the question. Since


we do not know the true mean μ or the true rate λ of the occurrence of the oil wells, we shall need to use the sample mean 1.06 as an estimate of μ.

If the Poisson distribution fits the data, the relative frequencies will reflect the probabilities that

would be given by the corresponding Poisson probabilities with parameter 1.06;

p(0) = e^{−1.06}(1.06)^0/0! = 0.3464

p(1) = e^{−1.06}(1.06)^1/1! = 0.3672

p(2) = e^{−1.06}(1.06)^2/2! = 0.1946

p(3) = e^{−1.06}(1.06)^3/3! = 0.0687

p(4) = e^{−1.06}(1.06)^4/4! = 0.0182

p(5) = e^{−1.06}(1.06)^5/5! = 0.0038

p(6) = e^{−1.06}(1.06)^6/6! = 0.0007

The expected frequencies for this model may be found using 160 × p(x), for x = 0, 1, 2, …, giving:

Number of wells | Observed frequency | Relative frequency | Poisson probability | Expected frequency
0 | 69 | 0.4313 | 0.3464 | 55.4
1 | 43 | 0.2687 | 0.3672 | 58.8
2 | 26 | 0.1625 | 0.1946 | 31.1
3 | 16 | 0.1000 | 0.0687 | 11.0
4 | 4 | 0.0250 | 0.0182 | 2.9
5 | 1 | 0.0062 | 0.0038 | 0.6
6 | 1 | 0.0062 | 0.0007 | 0.1
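The expected frequencies can be reproduced with a short Python sketch (added for illustration only; `lam` plays the role of the estimated mean 1.06):

```python
import math

counts = {0: 69, 1: 43, 2: 26, 3: 16, 4: 4, 5: 1, 6: 1}
n = sum(counts.values())  # 160 quadrats
lam = 1.06                # sample mean, used as the estimate of the Poisson parameter

# Expected frequency for y wells: n * p(y) under the fitted Poisson model.
expected = {y: n * math.exp(-lam) * lam**y / math.factorial(y) for y in counts}
```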

There are considerable differences between these results, with too many zero observations and too many quadrats with large numbers of wells relative to the results that assume the Poisson assumptions

(which themselves depend on events, i.e. the presence of a well, being randomly distributed in

two-dimensional space). The problem here appears to be that the wells are clustering together too

much for the Poisson model, which depends on random occurrence, to be valid. So our Poisson

model appears not to be suitable in this case. In fact, a version of the Negative Binomial model (see

section 1.6.2) turns out to fit the data very well. The details are beyond what we have done so far

(they require results that you will meet in the second half of the module) but, purely for illustration

that a good model for these data can be found, here are the corresponding results assuming a

Negative Binomial model.


Number of wells | Observed frequency | Relative frequency | Neg. Bin. probability | Expected frequency
0 | 69 | 0.4313 | 0.4124 | 66.0
1 | 43 | 0.2687 | 0.3112 | 49.8
2 | 26 | 0.1625 | 0.1611 | 25.8
3 | 16 | 0.1000 | 0.0706 | 11.3
4 | 4 | 0.0250 | 0.0281 | 4.5
5 | 1 | 0.0062 | 0.0106 | 1.7
6 | 1 | 0.0062 | 0.0038 | 0.6

The fit here is seen to be very much better, indicating that these oil-producing wells tend to cluster.

The Negative Binomial is thus seen to be a more appropriate model than the Poisson in explaining

this data set.

Final remarks

In this chapter and in subsequent chapters, where material appeared in MATH1024 lectures I refer

you back to the relevant notes from 2015-16 using the notation 1024 Px or 1024 Sx to mean

Probability Lecture x or Statistics Lecture x from the MATH1024 notes.

Where relevant I also give information where the material can be found in Mood, Graybill and Boes

(MGB), though the notation in MGB is not always the same as in these notes.

For this chapter the MGB reference is:

MGB Chapter II section 3.1 and Chapter III sections 2.2, 2.4 and 2.5.


MATH2011 Statistical Distribution Theory

We shall now look at random variables which may take any values within an interval, for example

in the range from zero to infinity. Such random variables are called continuous random variables.

In this chapter we revise three well-known types of continuous random variables.

Suppose that we have a Poisson process [1024 P13] with events occurring at random at rate λ per

unit time, but this time we consider the random variable Y = the time interval between two events.

Clearly this variable cannot be negative, but can take any positive value.

The domain of Y is (0, ∞), or 0 < y < ∞.

Consider P(Y > y) = P(no events in an interval of length y)

= e^{−λy}(λy)^0/0!

= e^{−λy},

i.e. P(Y ≤ y) = 1 − e^{−λy} for 0 < y < ∞.

Y is called an Exponential random variable. [1024 P17]

The expression P(Y ≤ y) above is known as the cumulative distribution function (cdf) of Y, and gives the probability that Y ≤ y, denoted by F(y) = P(Y ≤ y).

The cdf of any random variable is bounded by 0 and 1 and is monotonically increasing. In addition,

for a continuous random variable it is also a continuous function.

For an Exponential random variable, as in 2.1, we have:

F(y) = 1 − e^{−λy} for 0 < y < ∞.

[1024 P17]

If for a random variable Y there exists a function f(y) such that F(y) = ∫_{−∞}^{y} f(u)du, then f(y) is called the probability density function (pdf) of Y.

If such a function exists, then it has certain properties:

a) f(y) ≥ 0, for all y.

b) ∫_{−∞}^{∞} f(y)dy = 1.

c) ∫_{a}^{b} f(y)dy = F(b) − F(a) = P(a < Y ≤ b).

General relationship between the density function and the distribution function

We may find f(y) from F(y) by noting that

f(y) = dF(y)/dy

or, if we have f(y), we may find

F(y) = ∫_{−∞}^{y} f(u)du.

Typically, in many situations of practical interest, the pdf is more convenient to use than the cdf.

[1024 P17]

For the example concerning the time interval between events in a Poisson process,

F(y) = 1 − e^{−λy}, y > 0.

Consider the function f(y) = λe^{−λy}, y > 0.

∫_{0}^{y} λe^{−λu}du = 1 − e^{−λy}, y > 0.

Thus f(y) = λe^{−λy}, y > 0, is the density function of the Exponential distribution.

[1024 P17]

Suppose the lifetime of a certain type of electronic component is described by an Exponential random variable with parameter λ = 0.01 per hour. What is the probability such a component will have a lifetime of between 100 and 200 hours?

The probability is the area under the curve f(y) = 0.01e^{−0.01y} between y = 100 and y = 200:

P(100 ≤ Y ≤ 200) = ∫_{100}^{200} 0.01e^{−0.01y}dy = e^{−1} − e^{−2} ≈ 0.2325.
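Since F(y) = 1 − e^{−λy}, this probability can be checked numerically; the following Python sketch (an illustration, not part of the notes) evaluates F(200) − F(100):

```python
import math

lam = 0.01  # failure rate per hour
# P(100 <= Y <= 200) = F(200) - F(100) with F(y) = 1 - exp(-lam * y)
prob = (1 - math.exp(-lam * 200)) - (1 - math.exp(-lam * 100))
```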

2.5 Normal random variables

The most important continuous probability model is the Normal distribution. Two examples of

Normal distributions superimposed on observed sets of data are given below. The first example

shows a frequency diagram of heights of young adult males while the second involves diastolic

blood pressures of schoolboys.

[Figure: frequency diagram of heights (in) of young adult males, with an approximating Normal distribution (Martin, 1949, Table 17 (Grade 1)).]

[Figure: frequency diagram of diastolic blood pressures (mm Hg) of schoolboys, with an approximating Normal distribution (Rose, 1962, Table 1).]


Both distributions are approximately symmetrical about their central values and they exhibit a

similar shape, even though the units of the measurements are very different.

The observed frequencies have been approximated by a smooth curve which in each case is based

on a Normal probability distribution model with appropriately chosen mean and standard deviation

(see later).

Normal random variables are ubiquitous in theoretical statistics and in application areas. One reason

for this is the central limit theorem (which you met last year and which we shall prove later in this

module).

[1024 P19-20]

The pdf of a Normal random variable is given by

f(y) = (1/(σ√(2π))) exp{−(y − μ)²/(2σ²)}, −∞ < y < ∞

(where exp(z) is a convenient way of writing the exponential function e^z). Note that μ and σ > 0 are parameters. If Y is a random variable with the above pdf, we will write Y ~ N(μ, σ²).

The curve is shown in Figure 3. On the horizontal axis of this figure are marked the positions of the mean μ and the values of y that differ from μ by ±σ, ±2σ and ±3σ. The symmetry of the curve about μ is evident from the mathematical model, since changing the sign of y − μ leaves f(y) unchanged.

The figure shows that a relatively small proportion of the area under the curve lies outside the two values y = μ − 2σ and y = μ + 2σ. The vertical scale is arranged so that the area under the curve is equal to one. This implies that the area between any two points on the horizontal axis represents the probability that the variable takes a value between these two points. For example, the probability that the variable takes a value in the interval y = μ − 2σ up to y = μ + 2σ is very nearly 0.95 and the probability that Y lies outside this range is correspondingly approximately 0.05. It is important to be able to find the area under any part of a Normal pdf.



Figure 3 The probability density function of a Normal random variable showing the scales of

the original variable and the standardised variable.

Now f(y) depends on two parameters, the mean μ and standard deviation σ. It might be thought therefore that any relevant probabilities would have to be worked out separately for every pair of values μ, σ. Fortunately this is not so. We have seen that the probability that Y lies in the interval μ − 2σ up to μ + 2σ is about 0.95, which is true without specifying the values of μ and σ. In fact the probabilities depend on an expression of the departure of y from μ as a multiple of σ. The statement above is equivalent to saying that there is a probability of approximately 0.95 that y lies within two standard deviations of the mean. On the diagram these multiples are marked on the axis as ±1, ±2 and ±3 as shown on the lower scale. The probabilities under various parts of any Normal pdf may be expressed in terms of the standardised deviate

z = (y − μ)/σ.

A few important results are given in the table below. More detailed tables of the Normal

probabilities are available (just search on standard normal tables online).


Some probabilities associated with Normal random variables

Standardised deviate z = (y − μ)/σ | Probability of greater deviation in either direction | Probability of greater deviation in one direction
0.0 | 1.000 | 0.500
1.0 | 0.317 | 0.159
2.0 | 0.046 | 0.023
3.0 | 0.0027 | 0.0013
1.960 | 0.05 | 0.025
2.576 | 0.01 | 0.005

This table shows probabilities of obtaining a standardised deviate z = (y − μ)/σ more extreme (in either direction or in one direction) than the tabulated value. For example, for z = 2.0 the probability of obtaining a value of (y − μ)/σ outside ±2.0 is 0.046, while the probability of (y − μ)/σ being greater than 2.0 is 0.023. (By symmetry, the probability that (y − μ)/σ is less than −2.0 is also 0.023.) The figure below illustrates these probabilities.

[Figure: standard Normal density curve with the two tail areas of 0.023 beyond z = ±2 shaded.]

The usual tabulation of Normal probabilities is in the form of the cumulative probability that z = (y − μ)/σ is less than the tabulated value. This may be used for any Normally distributed random variable Y ~ N(μ, σ²) because

P(Y ≤ y) = F(y) = ∫_{−∞}^{y} (1/(σ√(2π))) exp{−(u − μ)²/(2σ²)}du

= ∫_{−∞}^{(y−μ)/σ} (1/√(2π)) exp{−z²/2}dz = Φ((y − μ)/σ) = P(Z ≤ (y − μ)/σ),

where Z ~ N(0, 1).

[1024 P19-20]


2.7 Example of Normal probability calculations

Suppose that daily water use at a factory varies about a mean of 15,500 gallons with standard

deviation 1,140 gallons. If demand is Normally distributed

(i) What proportion of days does the demand fall short of 14,000 gallons?

(ii) What proportion of days does demand exceed 18,000 gallons?

(iii) What is your reaction to a demand of 35,000 gallons?

In each case we first need to calculate the standard Normal deviate, z = (y − μ)/σ. Using the table of the Normal distribution function, and using the symmetry property where necessary, we have

(i) z = (14,000 − 15,500)/1,140 = −1.32.

From tables, the upper tail probability for z = 1.32 is 0.0934, and the lower tail probability for z = −1.32 will be identical.

Thus 9.34% of daily demands fall short of 14,000 gallons.

(ii) z = (18,000 − 15,500)/1,140 = 2.19, with upper tail probability 0.01426, i.e. about 1% of daily demands exceed 18,000 gallons.

(iii) z = (35,000 − 15,500)/1,140 = 17.11. This lies beyond the range of the tables, but the tail probability is less than one in a billion. One would be surprised and an explanation may be sought. It is possible that a mis-recording error has occurred, such as two days' data being taken together. This idea of surprise at an extreme result of low probability, as predicted by a statistical model, will be important later in this module and also in modules such as MATH2010 Statistical Methods I.
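These tail probabilities can be reproduced without tables using the error function. The following Python sketch (added for illustration; `phi` is a hand-rolled standard Normal cdf, not from the notes) mirrors parts (i) and (ii):

```python
import math

def phi(z):
    """Standard Normal cdf, written in terms of the error function."""
    return 0.5 * (1 + math.erf(z / math.sqrt(2)))

mu, sigma = 15500, 1140
p_short = phi((14000 - mu) / sigma)        # (i)  P(Y < 14,000)
p_exceed = 1 - phi((18000 - mu) / sigma)   # (ii) P(Y > 18,000)
```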

Frequency distributions resembling the Normal pdf in shape are often observed but this form should

not be taken as the norm - despite the use of the name 'Normal'. Many observed distributions are

undeniably far from 'Normal' in shape yet should not be regarded as abnormal in any way.

The importance of Normal random variables lies in the central place that they occupy in sampling theory, which we shall discuss later. Many of the usual estimation and testing procedures require

that the Normal model for the behaviour of the measurement is reasonably valid.


2.8 The use of a Normal approximation to Binomial probabilities

We have seen in 1.6.1 that the Binomial model is appropriate when considering the number of

successes in independent Bernoulli trials. However, Binomial probabilities can often be

approximated by Normal probabilities when n, the number of trials, is large.

Suppose that we have a Binomial situation, i.e. n trials of a dichotomous random variable (success or failure) with constant probability θ of success. The probability that the number of successes is r is

given by the Binomial probabilities

P(Y = r) = [n!/(r!(n − r)!)] θ^r (1 − θ)^{n−r} for r = 0, 1, …, n.

Therefore the probability that Y takes a value between r1 and r2 is given by

P(r1 ≤ Y ≤ r2) = ∑_{r=r1}^{r2} [n!/(r!(n − r)!)] θ^r (1 − θ)^{n−r}.

The Binomial mean μ = nθ and variance σ² = nθ(1 − θ) may be used here (see Chapter 4 for

details).

If n is sufficiently large, the observed number of individuals Y (with the particular characteristic, success) in the sample of size n is approximately Normally distributed with mean value μ = nθ and standard deviation σ = √{nθ(1 − θ)}.

Account should be taken of the fact that Binomial random variables are discrete, while the Normal

is continuous. Slightly more accurate approximations are provided if a continuity correction is

used. Thus the probability that the sample will contain between r1 and r2 individuals with the

characteristic of interest is approximately given by the standard Normal probability from

z1 = (r1 − 0.5 − nθ)/√{nθ(1 − θ)} to z2 = (r2 + 0.5 − nθ)/√{nθ(1 − θ)}.
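As an illustration (not from the notes), the continuity-corrected approximation can be compared with the exact Binomial sum in Python; the function names are made up for this sketch:

```python
import math

def phi(z):
    """Standard Normal cdf."""
    return 0.5 * (1 + math.erf(z / math.sqrt(2)))

def binom_normal_approx(r1, r2, n, theta):
    """P(r1 <= Y <= r2) via the Normal approximation with continuity correction."""
    mu = n * theta
    sd = math.sqrt(n * theta * (1 - theta))
    return phi((r2 + 0.5 - mu) / sd) - phi((r1 - 0.5 - mu) / sd)

def binom_exact(r1, r2, n, theta):
    """Exact Binomial sum over r = r1, ..., r2."""
    return sum(math.comb(n, r) * theta**r * (1 - theta)**(n - r)
               for r in range(r1, r2 + 1))

approx = binom_normal_approx(40, 60, 100, 0.5)
exact = binom_exact(40, 60, 100, 0.5)
```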

[1024 P21]

Perhaps the second most commonly occurring distribution in scientific investigations is the

Lognormal. The random variable Y is said to be Lognormal if X = log Y is a Normal random variable. (Note that all logarithms are assumed to be to base e in this module.)


Its pdf is

f(y) = (1/(yσ√(2π))) exp{−(log y − μ)²/(2σ²)}, y > 0.

Note that, as in the Normal case, the Lognormal model is a two-parameter model. It has applications

in a variety of fields, such as Economics, where a multiplicative form of the central limit theorem may

apply.

[1024 P additional handout]

Final remarks

In MGB the material in this chapter is covered in Chapter II section 2 and Chapter III sections 3.2,

3.3 and 3.5.


MATH2011 Statistical Distribution Theory

Suppose that Y1, , Yn represent n independently and identically distributed random

variables each with cumulative distribution function F.

Suppose that the corresponding observed values are y1, , yn.

Let these values, when ordered, be represented by

y(1) ≤ y(2) ≤ … ≤ y(n).

The y(i), i = 1, …, n, are called the order statistics corresponding to y1, …, yn.


You have already met certain order statistics. For example, the sample median is an order statistic: for odd values of n the sample median is equal to y((n+1)/2), while for even n the sample median is defined as (y(n/2) + y(n/2+1))/2.

We shall concentrate, however, on two particular order statistics: y(1), the sample minimum, and y(n), the sample maximum. We define the corresponding random variables

Y(1) = min{Y1, …, Yn} and Y(n) = max{Y1, …, Yn}.


6.2 Applications where maxima and/or minima are of interest

In many applications interest centres on the largest or smallest data point. For example:

In reliability engineering a system will tend to fail at its weakest point (which might be thought of as the point with the minimum strength).

In designing coastal defences one needs to understand the distribution of the wave heights of the highest tides.

There is a whole area of statistics devoted to the study of extremes (Extreme value

theory). In this short chapter we just give a brief introduction to the subject.


6.3 The cdf of Y(n), the largest value in a random sample of size n.

Since Y(n) = max{Y1, …, Yn}, the probability that Y(n) ≤ y gives the cumulative distribution function of Y(n), the sample maximum.

Now the event {Y(n) ≤ y} is identical to the event {Y1 ≤ y and Y2 ≤ y and … and Yn ≤ y}.

So

Gn(y) = P(Y(n) ≤ y) = P(all Yi ≤ y) = P(Y1 ≤ y and Y2 ≤ y and … and Yn ≤ y).

Thus, by independence,

Gn(y) = P(Y1 ≤ y) P(Y2 ≤ y) ⋯ P(Yn ≤ y) = [F(y)]^n.


6.4 Example: a simple discrete experiment

Suppose I roll a fair die twice. What is the probability function of the maximum of the two

scores?

We know that F(y) = y/6 for y= 1, 2, 3, 4, 5, 6. Also, n = 2 in this example.

So the distribution function of the higher score is:

G2(y) = (y/6)² for y = 1, 2, …, 6.

Hence

P(Y(2) = y) = (y/6)² − [(y − 1)/6]² for y = 1, 2, …, 6.
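This probability function is easy to verify numerically; a small Python sketch (illustrative, not part of the notes):

```python
def cdf_die(y):
    """F(y) for a single fair die, y = 0, 1, ..., 6."""
    return y / 6

# pmf of the maximum of two rolls: G2(y) - G2(y - 1), where G2 = F^2.
pmf_max = {y: cdf_die(y)**2 - cdf_die(y - 1)**2 for y in range(1, 7)}
```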


6.5 The pdf of the maximum in the continuous case

If the Yi are continuous, each with density function f, then the density function of Y(n) may

be found by differentiating Gn(y) with respect to y to give:

gn(y) = d/dy [F(y)]^n = n[F(y)]^{n−1} f(y),

where the domain of the maximum is the same as that of each of the Yi.


6.6 Example: the maximum of a uniform random sample

Suppose that the Yi are independent Uniform(0, θ) random variables.

Then on (0, θ) we have F(y) = y/θ.

Hence

gn(y) = n(y/θ)^{n−1}(1/θ) = ny^{n−1}/θ^n, 0 < y < θ.


Note how the probability piles up against the upper end point of the domain of the pdf.

How do you think the expected value and variance of the sample maximum in this case

will change as n increases?

What would happen to Y(n) when the domain of the Yi has no finite upper end point?
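One way to build intuition for these questions is simulation. The following Python sketch (illustrative; the choice θ = 1 and the sample sizes are arbitrary) estimates the expected value of the sample maximum for increasing n:

```python
import random

random.seed(0)
theta, reps = 1.0, 10000

def sample_max(n):
    """One draw of the maximum of n Uniform(0, theta) variables."""
    return max(random.uniform(0, theta) for _ in range(n))

# Monte Carlo estimates of E[Y(n)]: they should climb towards theta as n grows.
means = {n: sum(sample_max(n) for _ in range(reps)) / reps for n in (2, 10, 50)}
```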


6.7 The cdf of Y(1), the smallest value in a random sample of size n.

Since Y(1) = min{Y1, …, Yn}, the probability that Y(1) ≤ y gives the cumulative distribution function of Y(1), the smallest value in the sample.

Now

P(Y(1) ≤ y) = 1 − P(Y(1) > y) = 1 − P(all Yi > y) = 1 − P(Y1 > y and Y2 > y and … and Yn > y),

so, by independence,

G1(y) = P(Y(1) ≤ y) = 1 − P(Y1 > y) P(Y2 > y) ⋯ P(Yn > y) = 1 − [1 − F(y)]^n.


6.8 The pdf of the minimum in the continuous case

If the Yi are continuous, each with probability density function f, then the pdf of Y(1) may

be found by differentiating G1(y) with respect to y to give:

g1(y) = n[1 − F(y)]^{n−1} f(y),

where the domain of the minimum is the same as that of each of the Yi.


6.9 Example: The distribution of the minimum of an Exponential random sample.

Suppose that Yi for i = 1, , n, are independent Exponential random variables, each with

probability density function

f(y) = λe^{−λy}, 0 < y < ∞.

Then

g1(y) = n[1 − F(y)]^{n−1} f(y) for 0 < y < ∞.


Now F(y) = 1 − e^{−λy} so that

g1(y) = n(e^{−λy})^{n−1} λe^{−λy} = nλe^{−nλy}, for 0 < y < ∞.

That is, the distribution of the smallest value in an Exponential random sample of size n

with parameter λ (i.e. with mean value 1/λ) is also an Exponential random variable but with parameter nλ (i.e. with mean value 1/(nλ)).

So in this case Y(1) has expected value 1/(nλ) and variance 1/(nλ)², which both decrease as

n increases.

Is that a surprise?
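This closure property can be checked by simulation; a short illustrative Python sketch (the values of λ, n and the replication count are arbitrary choices for the check):

```python
import random

random.seed(1)
lam, n, reps = 0.5, 5, 20000

# Simulate the minimum of n Exp(lam) variables many times; the result above
# says the minimum is Exp(n * lam), so its mean should be near 1/(n * lam) = 0.4.
mean_min = sum(min(random.expovariate(lam) for _ in range(n))
               for _ in range(reps)) / reps
```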


6.10 Closing remarks

We have only given a taster here of an interesting area of statistics.

We have only looked at the marginal behaviour of the minimum and the maximum,

and we have only considered the extreme order statistics. The results can be extended

to include other order statistics.

The closure result in 6.9 hints at some interesting structure in the probabilistic

behaviour of maxima and minima. The central limit theorem basically says that under

certain conditions the sum of n independent, identically distributed random variables

is approximately Normal as n grows large. There are corresponding results for

maxima and minima, though the large-n distribution is not Normal (it is the so-called

generalised extreme value distribution).

MGB go into much greater detail on this topic in their Chapter VI section 5.

