Probability and
Stochastic Processes
This text can be used in junior- or senior-level courses in probability and stochastic processes. The mathematical exposition will appeal to students and practitioners in many areas. The examples, quizzes, and problems are typical of those encountered by practicing electrical and computer engineers. Professionals in the telecommunications and wireless industry will find it particularly useful.
What's New?
This text has been expanded with new introductory material:
- Over 160 new homework problems
- New chapters on Sequential Trials, Derived Random Variables, and Conditional Probability Models
- MATLAB examples and problems give students hands-on access to theory and applications. Every chapter includes guidance on how to use MATLAB to perform calculations and simulations relevant to the subject of the chapter.
- Advanced material online in Signal Processing and Markov Chains supplements
Notable Features
Instructor Support
Instructors can register for the Instructor Companion Site at www.wiley.com/college/yates
Probability and
Stochastic Processes
A Friendly Introduction
for Electrical and Computer Engineers
Third Edition
Roy D. Yates
Rutgers, The State University of New Jersey
David J. Goodman
New York University
WILEY
This book was set in Computer Modern by the authors using LaTeX and printed and bound by RR Donnelley. The cover was printed by RR Donnelley.
About the cover: The cover shows a circumhorizontal arc. As noted in Wikipedia, this is an ice halo formed by plate-shaped ice crystals in high-level cirrus clouds. The misleading term "fire rainbow" is sometimes used to describe this rare phenomenon, although it is neither a rainbow, nor related in any way to fire.
Founded in 1807, John Wiley & Sons, Inc. has been a valued source of knowledge and understanding for more than 200 years, helping people around the world meet their needs and fulfill their aspirations. Our company is built on a foundation of principles that include responsibility to the communities we serve and where we live and work. In 2008, we launched a Corporate Citizenship Initiative, a global effort to address the environmental, social, economic, and ethical challenges we face in our business. Among the issues we are addressing are carbon impact, paper specifications and procurement, ethical conduct within our business and among our vendors, and community and charitable support. For more information, please visit our website: www.wiley.com/go/citizenship.
Copyright © 2014 John Wiley & Sons, Inc. All rights reserved. No part of this publication may be reproduced, stored in a retrieval system or transmitted in any form or by any means, electronic, mechanical, photocopying, recording, scanning or otherwise, except as permitted under Sections 107 or 108 of the 1976 United States Copyright Act, without either the prior written permission of the Publisher, or authorization through payment of the appropriate per-copy fee to the Copyright Clearance Center, Inc., 222 Rosewood Drive, Danvers, MA 01923, website www.copyright.com. Requests to the Publisher for permission should be addressed to the Permissions Department, John Wiley & Sons, Inc., 111 River Street, Hoboken, NJ 07030-5774, (201) 748-6011, fax (201) 748-6008, website http://www.wiley.com/go/permissions.
Evaluation copies are provided to qualified academics and professionals for review purposes only, for use in their courses during the next academic year. These copies are licensed and may not be sold or transferred to a third party. Upon completion of the review period, please return the evaluation copy to Wiley. Return instructions and a free of charge return mailing label are available at www.wiley.com/go/returnlabel. If you have chosen to adopt this textbook for use in your course, please accept this book as your complimentary desk copy. Outside of the United States, please contact your local sales representative.
ISBN 978-1-118-32456-1
Printed in the United States of America
10 9 8 7 6 5 4 3 2 1
Preface
Welcome to the third edition
You are reading the third edition of our textbook. Although the fundamentals of probability and stochastic processes have not changed since we wrote the first edition, the world inside and outside universities is different now than it was in 1998. Outside of academia, applications of probability theory have expanded enormously in the past 16 years. Think of the 20 billion+ Web searches each month and the billions of daily computerized stock exchange transactions, each based on probability models, many of them devised by electrical and computer engineers.
Universities and secondary schools, recognizing the fundamental importance of probability theory to a wide range of subject areas, are offering courses in the subject to younger students than the ones who studied probability 16 years ago. At Rutgers, probability is now a required course for Electrical and Computer Engineering sophomores.
We have responded in several ways to these changes and to the suggestions of students and instructors who used the earlier editions. The first and second editions contain material found in postgraduate as well as advanced undergraduate courses. By contrast, the printed and e-book versions of this third edition focus on the needs of undergraduates studying probability for the first time. The more advanced material in the earlier editions, covering random signal processing and Markov chains, is available at the companion website (www.wiley.com/college/yates).
To promote intuition into the practical applications of the mathematics, we have expanded the number of examples and quizzes and homework problems to about 600, an increase of about 35 percent compared to the second edition. Many of the examples are mathematical exercises. Others are questions that are simple versions of the ones encountered by professionals working on practical applications.
Motivated by our teaching experience, we have rearranged the sequence in which we present the elementary material on probability models, counting methods, conditional probability models, and derived random variables. In this edition, the first chapter covers fundamentals, including axioms and probability of events, and the second chapter covers counting methods and sequential experiments. As before, we introduce discrete random variables and continuous random variables in separate chapters. The subject of Chapter 5 is multiple discrete and continuous random variables. The first and second editions present derived random variables and conditional random variables in the introductions to discrete and continuous random variables. In this third edition, derived random variables and conditional random variables appear in their own chapters, which cover both discrete and continuous random variables.
Chapter 8 introduces random vectors. It extends the material on multiple random variables in Chapter 5 and relies on principles of linear algebra to derive properties of random vectors that are useful in real-world data analysis and simulations. Chapter 12 on estimation relies on the properties of random vectors derived in Chapter 8. Chapters 9 through 12 cover subjects relevant to data analysis including Gaussian approximations based on the central limit theorem, estimates of model parameters, hypothesis testing, and estimation of random variables. Chapter 13 introduces stochastic processes in the context of the probability model that guides the entire book: an experiment consisting of a procedure and observations.
Each of the 92 sections of the 13 chapters ends with a quiz. By working on the quiz and checking the solution at the book's website, students will get quick feedback on how well they have grasped the material in each section.
We think that 60-80% (7 to 10 chapters) of the book would fit into a one-semester undergraduate course for beginning students in probability. We anticipate that all courses will cover the first five chapters, and that instructors will select the remaining course content based on the needs of their students. The "roadmap" on page ix displays the thirteen chapter titles and suggests a few possible undergraduate syllabi.
The Signal Processing Supplement (SPS) and Markov Chains Supplement (MCS) are the final chapters of the third edition. They are now available at the book's website. They contain postgraduate-level material. We, and colleagues at other universities, have used these two chapters in graduate courses that move very quickly through the early chapters to review material already familiar to students and to fill in gaps in learning of diverse postgraduate populations.
FUNDAMENTALS
1. Experiments, models, probabilities
2. Sequential experiments
3. Discrete random variables
4. Continuous random variables
5. Multiple random variables
6. Derived random variables
7. Conditional probability models
Further Reading
Libraries and bookstores contain an endless collection of textbooks at all levels covering the topics presented in this textbook. We know of two in comic book format [GS93, Pos01]. The reference list on page 489 is a brief sampling of books that can add breadth or depth to the material in this text. Most books on probability, statistics, stochastic processes, and random signal processing contain expositions of
Acknowledgments
We are grateful for assistance and suggestions from many sources including our students at Rutgers and New York Universities, instructors who adopted the previous editions, reviewers, and the Wiley team.
At Wiley, we are pleased to acknowledge the encouragement and enthusiasm of our executive editor Daniel Sayre and the support of sponsoring editor Mary O'Sullivan, project editor Ellen Keohane, production editor Eugenia Lee, and cover designer Samantha Low.
We also convey special thanks to Ivan Seskar of WINLAB at Rutgers University for exercising his magic to make the WINLAB computers particularly hospitable to the electronic versions of the book and to the supporting material on the World Wide Web.
The organization and content of the second edition has benefited considerably from the input of many faculty colleagues including Alhussein Abouzeid at Rensselaer Polytechnic Institute, Krishna Arora at Florida State University, Frank Candocia at Florida International University, Robin Carr at Drexel University, Keith Chugg at USC, Charles Doering at University of Michigan, Roger Green at North Dakota State University, Witold Krzymien at University of Alberta, Edl Schamiloglu at University of New Mexico, Arthur David Snider at University of South Florida, Junshan Zhang at Arizona State University, and colleagues Narayan Mandayam, Leo Razumov, Christopher Rose, Predrag Spasojevic, and Wade Trappe at Rutgers.
Unique among our teaching assistants, Dave Famolari took the course as an undergraduate. Later as a teaching assistant, he did an excellent job writing homework solutions with a tutorial flavor. Other graduate students who provided valuable feedback and suggestions on the first edition include Ricki Abboudi, Zheng
Cai, Pi-Chun Chen, Sorabh Gupta, Vahe Hagopian, Amar Mahboob, Ivana Maric, David Pandian, Mohammad Saquib, Sennur Ulukus, and Aylin Yener.
The first edition also benefited from reviews and suggestions conveyed to the publisher by D.L. Clark at California State Polytechnic University at Pomona, Mark Clements at Georgia Tech, Gustavo de Veciana at the University of Texas at Austin, Fred Fontaine at Cooper Union, Rob Frohne at Walla Walla College, Chris Genovese at Carnegie Mellon, Simon Haykin at McMaster, and Ratnesh Kumar at the University of Kentucky.
Finally, we acknowledge with respect and gratitude the inspiration and guidance of our teachers and mentors who conveyed to us when we were students the importance and elegance of probability theory. We cite in particular Robert Gallager and the late Alvin Drake of MIT and the late Colin Cherry of Imperial College of Science and Technology.
problems, and if you don't answer all the quiz questions correctly, go over them until you understand each one.
We can't resist commenting on the role of probability and stochastic processes in our careers. The theoretical material covered in this book has helped both of us devise new communication techniques and improve the operation of practical systems. We hope you find the subject intrinsically interesting. If you master the basic ideas, you will have many opportunities to apply them in other courses and throughout your career.
We have worked hard to produce a text that will be useful to a large population of students and instructors. We welcome comments, criticism, and suggestions. Feel free to send us email at ryates@winlab.rutgers.edu or dgoodman@poly.edu. In addition, the website www.wiley.com/college/yates provides a variety of supplemental materials, including the MATLAB code used to produce the examples in the text.
Contents
Features of this Text i
Preface vii
2 Sequential Experiments 35
2.1 Tree Diagrams 35
2.2 Counting Methods 40
2.3 Independent Trials 49
2.4 Reliability Analysis 52
2.5 MATLAB 55
Problems 57
7.3 Conditioning Two Random Variables by an Event 252
7.4 Conditioning by a Random Variable 256
7.5 Conditional Expected Value Given a Random Variable 262
7.6 Bivariate Gaussian Random Variables: Conditional PDFs 265
7.7 MATLAB 268
Problems 269
References 489
Index 491
Experiments, Models,
and Probabilities
The title of this book is Probability and Stochastic Processes. We say and hear and read the word probability and its relatives (possible, probable, probably) in many contexts. Within the realm of applied mathematics, the meaning of probability is a question that has occupied mathematicians, philosophers, scientists, and social scientists for hundreds of years.
Everyone accepts that the probability of an event is a number between 0 and 1. Some people interpret probability as a physical property (like mass or volume or temperature) that can be measured. This is tempting when we talk about the probability that a coin flip will come up heads. This probability is closely related to the nature of the coin. Fiddling around with the coin can alter the probability of heads.
Another interpretation of probability relates to the knowledge that we have about something. We might assign a low probability to the truth of the statement, It is raining now in Phoenix, Arizona, because we know that Phoenix is in the desert. However, our knowledge changes if we learn that it was raining an hour ago in Phoenix. This knowledge would cause us to assign a higher probability to the truth of the statement, It is raining now in Phoenix.
Both views are useful when we apply probability theory to practical problems. Whichever view we take, we will rely on the abstract mathematics of probability, which consists of definitions, axioms, and inferences (theorems) that follow from the axioms. While the structure of the subject conforms to principles of pure logic, the terminology is not entirely abstract. Instead, it reflects the practical origins of probability theory, which was developed to describe phenomena that cannot be predicted with certainty. The point of view is different from the one we took when we started studying physics. There we said that if we do the same thing in the same way over and over again (send a space shuttle into orbit, for example), the result will always be the same. To predict the result, we have to take account of all relevant facts.
The mathematics of probability begins when the situation is so complex that we just can't replicate everything important exactly, like when we fabricate and test an integrated circuit. In this case, repetitions of the same procedure yield different results. The situation is not totally chaotic, however. While each outcome may be unpredictable, there are consistent patterns to be observed when we repeat the procedure a large number of times. Understanding these patterns helps engineers establish test procedures to ensure that a factory meets quality objectives. In this repeatable procedure (making and testing a chip) with unpredictable outcomes (the quality of individual chips), the probability is a number between 0 and 1 that states the proportion of times we expect a certain thing to happen, such as the proportion of chips that pass a test.
As an introduction to probability and stochastic processes, this book serves three purposes:
- It introduces students to the logic of probability theory.
- It helps students develop intuition into how the theory relates to practical situations.
- It teaches students how to apply probability theory to solving engineering problems.
To exhibit the logic of the subject, we show clearly in the text three categories of theoretical material: definitions, axioms, and theorems. Definitions establish the logic of probability theory, and axioms are facts that we accept without proof. Theorems are consequences that follow logically from definitions and axioms. Each theorem has a proof that refers to definitions, axioms, and other theorems. Although there are dozens of definitions and theorems, there are only three axioms of probability theory. These three axioms are the foundation on which the entire subject rests. To meet our goal of presenting the logic of the subject, we could set out the material as dozens of definitions followed by three axioms followed by dozens of theorems. Each theorem would be accompanied by a complete proof.
While rigorous, this approach would completely fail to meet our second aim of conveying the intuition necessary to work on practical problems. To address this goal, we augment the purely mathematical material with a large number of examples of practical phenomena that can be analyzed by means of probability theory. We also interleave definitions and theorems, presenting some theorems with complete proofs, presenting others with partial proofs, and omitting some proofs altogether.
We find that most engineering students study probability with the aim of using it to solve practical problems, and we cater mostly to this goal. We also encourage students to take an interest in the logic of the subject; it is very elegant, and we feel that the material presented is sufficient to enable these students to fill in the gaps we have left in the proofs.
Therefore, as you read this book you will find a progression of definitions, axioms, theorems, more definitions, and more theorems, all interleaved with examples and comments designed to contribute to your understanding of the theory. We also include brief quizzes that you should try to solve as you read the book. Each one
This notation tells us to form a set by performing the operation to the left of the vertical bar, |, on the numbers to the right of the bar. Therefore, it follows that C ⊂ I, and D ⊂ I.
The definition of set equality, A = B, is

A = B if and only if B ⊂ A and A ⊂ B.

This is the mathematical way of stating that A and B are identical if and only if every element of A is an element of B and every element of B is an element of A. This definition implies that a set is unaffected by the order of the elements in a definition. For example, {0, 17, 46} = {17, 0, 46} = {46, 0, 17} are all the same set.
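Set equality and order-independence are easy to check by machine. The book's computational examples use MATLAB; the following is an equivalent sketch in Python, whose built-in set type likewise ignores the order (and repetition) of elements:

```python
# Three definitions of the same set, listed in different orders.
a = {0, 17, 46}
b = {17, 0, 46}
c = {46, 0, 17}
print(a == b == c)  # True

# Set equality A = B holds if and only if B ⊂ A and A ⊂ B;
# the <= operator tests the subset relation.
print((b <= a) and (a <= b))  # True
```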
To work with sets mathematically it is necessary to define a universal set. This is the set of all things that we could possibly consider in a given context. In any study, all set operations relate to the universal set for that study. The members of the universal set include all of the elements of all of the sets in the study. We will use the letter S to denote the universal set. For example, the universal set for A could be S = {all universities in the United States, all planets}. The universal set for C could be S = I = {0, 1, 2, ...}. By definition, every set is a subset of the universal set. That is, for any set X, X ⊂ S.
The null set, which is also important, may seem like it is not a set at all. By definition it has no elements. The notation for the null set is ∅. By definition ∅ is a subset of every set. For any set A, ∅ ⊂ A.
It is customary to refer to Venn diagrams to display relationships among sets. By convention, the region enclosed by the large rectangle is the universal set S. Closed surfaces within this rectangle denote sets. A Venn diagram depicting the relationship A ⊂ B is shown on the left.
When we do set algebra, we form new sets from existing sets. There are three operations for doing this: union, intersection, and complement. Union and intersection combine two existing sets to produce a third set. The complement operation forms a new set from one existing set. The notation and definitions follow.
The union of sets A and B is the set of all elements that are either in A or in B, or in both. The union of A and B is denoted by A ∪ B. In this Venn diagram, A ∪ B is the complete shaded area. Formally,

A ∪ B = {x : x ∈ A or x ∈ B}.
In working with probability we will often refer to two important properties of collections of sets. Here are the definitions. A collection of sets A1, ..., An is mutually exclusive if and only if Ai ∩ Aj = ∅ for i ≠ j. A collection of sets A1, ..., An is collectively exhaustive if and only if

A1 ∪ A2 ∪ ··· ∪ An = S.   (1.8)

In the definition of collectively exhaustive, we used the somewhat cumbersome notation A1 ∪ A2 ∪ ··· ∪ An for the union of n sets. Just as Σ_{i=1}^{n} x_i is a shorthand for x1 + x2 + ··· + xn, we will use a shorthand for unions and intersections of n sets:

⋃_{i=1}^{n} Ai = A1 ∪ A2 ∪ ··· ∪ An,   (1.9)

⋂_{i=1}^{n} Ai = A1 ∩ A2 ∩ ··· ∩ An.   (1.10)
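The shorthand in (1.9) and (1.10) is the set-algebra analog of the summation sign: a binary operation folded over n sets. As an illustration (in Python rather than the book's MATLAB), `functools.reduce` performs exactly this fold:

```python
from functools import reduce

# A1, A2, A3: three example sets
sets = [{1, 2, 3}, {2, 3, 4}, {3, 4, 5}]

# Union of n sets, as in (1.9): A1 ∪ A2 ∪ ... ∪ An
union_all = reduce(lambda x, y: x | y, sets)
print(sorted(union_all))  # [1, 2, 3, 4, 5]

# Intersection of n sets, as in (1.10): A1 ∩ A2 ∩ ... ∩ An
intersect_all = reduce(lambda x, y: x & y, sets)
print(sorted(intersect_all))  # [3]
```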
We will see that collections of sets that are both mutually exclusive and collectively exhaustive are sufficiently useful to merit a definition.
From the definition of set operations, we can derive many important relationships between sets and other sets derived from them. One example is

A ∩ B ⊂ A.   (1.11)

To prove that this is true, it is necessary to show that if x ∈ A ∩ B, then it is also true that x ∈ A. A proof that two sets are equal, for example, X = Y, requires two separate proofs: X ⊂ Y and Y ⊂ X. As we see in the following theorem, this can be complicated to show.
Quiz 1.1
Gerlanda's offers customers two kinds of pizza crust, Tuscan (T) and Neapolitan (N). In addition, each pizza may have mushrooms (M) or onions (O) as described by the Venn diagram at right. For the sets specified below, shade the corresponding region of the Venn diagram.
(a) N   (b) N ∪ M
(c) N ∩ M   (d) T^c ∩ M^c
Example 1.2
An experiment consists of the following procedure, observation, and model:
Procedure: Monitor activity at a Phonesmart store.
Observation: Observe which type of phone (Apricot or Banana) the next customer purchases.
Model: Apricots and Bananas are equally likely. The result of each purchase is unrelated to the results of previous purchases.
These two experiments have the same procedure: monitor the Phonesmart store until three customers purchase phones. They are different experiments because they require different observations. We will describe models of experiments in terms of a set of possible experimental outcomes. In the context of probability, we give precise meaning to the word outcome.
Implicit in the definition of an outcome is the notion that each outcome is distinguishable from every other outcome. As a result, we define the universal set of all possible outcomes. In probability terms, we call this universal set the sample space.
The finest-grain property simply means that all possible distinguishable outcomes are identified separately. The requirement that outcomes be mutually exclusive says that if one outcome occurs, then no other outcome also occurs. For the set of outcomes to be collectively exhaustive, every outcome of the experiment must be in the sample space.
Example 1.6
Manufacture an integrated circuit and test it to determine whether it meets quality objectives. The possible outcomes are "accepted" (a) and "rejected" (r). The sample space is S = {a, r}.
In common speech, an event is something that occurs. In an experiment, we may say that an event occurs when a certain phenomenon is observed. To define an event mathematically, we must identify all outcomes for which the phenomenon is observed. That is, for each outcome, either the particular event occurs or it does not. In probability terms, we define an event in terms of the outcomes in the sample space.
Table 1.1 relates the terminology of probability to set theory. All of this may seem so simple that it is boring. While this is true of the definitions themselves, applying them is a different matter. Defining the sample space and its outcomes are key elements of the solution of any probability problem. A probability problem arises from some practical situation that can be modeled as an experiment. To work on the problem, it is necessary to define the experiment carefully and then derive the sample space. Getting this right is a big step toward solving the problem.
Quiz 1.2
Monitor three consecutive packets going through an Internet router. Based on the packet header, each packet can be classified as either video (v) if it was sent from a YouTube server or as ordinary data (d). Your observation is a sequence of three letters (each letter is either v or d). For example, two video packets followed by one data packet corresponds to vvd. Write the elements of the following sets:
For each pair of events A1 and B1, A2 and B2, and so on, identify whether the pair of events is either mutually exclusive or collectively exhaustive or both.
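The sample space for this quiz is small enough to enumerate by machine, which is a useful way to check answers. A sketch in Python (the book's own examples use MATLAB):

```python
from itertools import product

# Each outcome is a sequence of three packet labels: video (v) or data (d).
sample_space = [''.join(seq) for seq in product('vd', repeat=3)]
print(sample_space)
# ['vvv', 'vvd', 'vdv', 'vdd', 'dvv', 'dvd', 'ddv', 'ddd']

# An event is a subset of the sample space, e.g. "exactly two video packets."
A = {s for s in sample_space if s.count('v') == 2}
print(sorted(A))  # ['dvv', 'vdv', 'vvd']
```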
We will build our entire theory of probability on these three axioms. Axioms 1 and 2 simply establish a probability as a number between 0 and 1. Axiom 3 states that the probability of the union of mutually exclusive events is the sum of the individual probabilities. We will use this axiom over and over in developing the theory of probability and in solving problems. In fact, it is really all we have to work with. Everything else follows from Axiom 3. To use Axiom 3 to solve a practical problem, we will learn in Section 1.5 to analyze a complicated event as the union of mutually exclusive events whose probabilities we can calculate. Then, we will add the probabilities of the mutually exclusive events to find the probability of the complicated event we are interested in.
A useful extension of Axiom 3 applies to the union of two mutually exclusive events.
Although it may appear that Theorem 1.2 is a trivial special case of Axiom 3, this is not so. In fact, a simple proof of Theorem 1.2 may also use Axiom 2! If you are curious, Problem 1.3.13 gives the first steps toward a proof. It is a simple matter to extend Theorem 1.2 to any finite union of mutually exclusive sets.
Theorem 1.3
If A = A1 ∪ A2 ∪ ··· ∪ Am and Ai ∩ Aj = ∅ for i ≠ j, then

P[A] = Σ_{i=1}^{m} P[Ai].
In Chapter 10, we show that the probability measure established by the axioms corresponds to the idea of relative frequency. The correspondence refers to a sequential experiment consisting of n repetitions of the basic experiment. We refer to each repetition of the experiment as a trial. In these n trials, N_A(n) is the number of times that event A occurs. The relative frequency of A is the fraction N_A(n)/n. Theorem 10.7 proves that lim_{n→∞} N_A(n)/n = P[A].
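The convergence of relative frequency to probability is easy to observe in simulation. The sketch below is in Python rather than the book's MATLAB, and the event A ("a fair die shows 4 or higher," so P[A] = 1/2) is chosen here only as a concrete example:

```python
import random

random.seed(1)  # fix the seed so repeated runs give the same trials

def relative_frequency(n):
    """Roll a fair die n times; return N_A(n)/n for A = {4, 5, 6}."""
    count_a = sum(1 for _ in range(n) if random.randint(1, 6) >= 4)
    return count_a / n

# N_A(n)/n approaches P[A] = 0.5 as the number of trials n grows.
for n in (100, 10_000, 1_000_000):
    print(n, relative_frequency(n))
```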
P[A ∪ B] = P[A] + P[B] - P[A ∩ B].
Theorem 1.5
The probability of an event B = {s1, s2, ..., sm} is the sum of the probabilities of the outcomes contained in the event:

P[B] = Σ_{i=1}^{m} P[{si}].

Proof Each outcome si is an event (a set) with the single element si. Since outcomes by definition are mutually exclusive, B can be expressed as the union of m mutually exclusive sets:

B = {s1} ∪ {s2} ∪ ··· ∪ {sm},   (1.13)

with {si} ∩ {sj} = ∅ for i ≠ j. Applying Theorem 1.3 with Bi = {si} yields

P[B] = Σ_{i=1}^{m} P[{si}].
Comments on Notation
We use the notation P[·] to indicate the probability of an event. The expression in the square brackets is an event. Within the context of one experiment, P[A] can be viewed as a function that transforms event A to a number between 0 and 1.
Note that {si} is the formal notation for a set with the single element si. For convenience, we will sometimes write P[si] rather than the more complete P[{si}] to denote the probability of this outcome.
We will also abbreviate the notation for the probability of the intersection of two events, P[A ∩ B]. Sometimes we will write it as P[A, B] and sometimes as P[AB]. Thus by definition, P[A ∩ B] = P[A, B] = P[AB].
Theorem 1.6
For an experiment with sample space S = {s1, ..., sn} in which each outcome si is equally likely,

P[si] = 1/n,   1 ≤ i ≤ n.

Proof Since all outcomes have equal probability, there exists p such that P[si] = p for i = 1, ..., n. Theorem 1.5 implies

P[S] = P[s1] + ··· + P[sn] = np.

Since P[S] = 1 by Axiom 2, p = 1/n.
Example 1.10
As in Example 1.7, roll a six-sided die in which all faces are equally likely. What is the probability of each outcome? Find the probabilities of the events: "Roll 4 or higher," "Roll an even number," and "Roll the square of an integer."
The probability of each outcome is P[i] = 1/6 for i = 1, 2, ..., 6. The probabilities of the three events are

P[Roll 4 or higher] = P[4] + P[5] + P[6] = 1/2,
P[Roll an even number] = P[2] + P[4] + P[6] = 1/2,
P[Roll the square of an integer] = P[1] + P[4] = 1/3.
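Because the die outcomes are equally likely, each event probability is just the number of favorable outcomes times 1/6 (Theorems 1.5 and 1.6). The events of Example 1.10 can be checked by enumeration; a sketch in Python rather than the book's MATLAB:

```python
from fractions import Fraction

outcomes = range(1, 7)   # the six equally likely faces
p = Fraction(1, 6)       # P[s_i] = 1/n by Theorem 1.6

def prob(event):
    """P[B] as the sum of P[{s_i}] over outcomes in B (Theorem 1.5)."""
    return sum((p for s in outcomes if event(s)), Fraction(0))

print(prob(lambda s: s >= 4))       # 1/2  "Roll 4 or higher"
print(prob(lambda s: s % 2 == 0))   # 1/2  "Roll an even number"
print(prob(lambda s: s in (1, 4)))  # 1/3  "Roll the square of an integer"
```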
Quiz 1.3
A student's test score T is an integer between 0 and 100 corresponding to the experimental outcomes s0, ..., s100. A score of 90 to 100 is an A, 80 to 89 is a B,
a high proportion of defective chips. When the first chip is a reject, the outcome of the experiment is in event B and P[A|B], the probability that the second chip will also be rejected, is higher than the a priori probability P[A] because of the likelihood that dust contaminated the entire wafer.

P[A|B] = P[AB] / P[B].
Theorem 1.7
A conditional probability measure P[A|B] has the following properties that correspond
to the axioms of probability.
Axiom 1: P[A|B] ≥ 0.
Axiom 2: P[B|B] = 1.
Axiom 3: If A = A1 ∪ A2 ∪ ··· with Ai ∩ Aj = ∅ for i ≠ j, then
P[A|B] = P[A1|B] + P[A2|B] + ···
Example 1.12
With respect to Example 1.11, consider the a priori probability model
P[rr] = 0.01, P[ra] = 0.01, P[ar] = 0.01, P[aa] = 0.97.
Find the probability of A = "second chip rejected" and B = "first chip rejected." Also
find the conditional probability that the second chip is a reject given that the first chip
is a reject.
We saw in Example 1.11 that A is the union of two mutually exclusive events (outcomes)
rr and ar. Therefore, the a priori probability that the second chip is rejected is
P[A] = P[rr] + P[ar] = 0.02.
The conditional probability of the second chip being rejected given that the first chip
is rejected is, by definition, the ratio of P[AB] to P[B], where, in this example,
P[AB] = P[rr] = 0.01 and P[B] = P[rr] + P[ra] = 0.02.
Thus
P[A|B] = P[AB] / P[B] = 0.01/0.02 = 0.5. (1.20)
The information that the first chip is a reject drastically changes our state of knowledge
about the second chip. We started with near certainty, P[A] = 0.02, that the second
chip would not fail and ended with complete uncertainty about the quality of the second
chip, P[A|B] = 0.5.
Example 1.13
Shuffle a deck of cards and observe the bottom card. What is the conditional probability
that the bottom card is the ace of clubs given that the bottom card is a black card?
The sample space consists of the 52 cards that can appear on the bottom of the deck.
Let A denote the event that the bottom card is the ace of clubs. Since all cards are
equally likely to be at the bottom, the probability that a particular card, such as the
ace of clubs, is at the bottom is P[A] = 1/52. Let B be the event that the bottom
card is a black card. The event B occurs if the bottom card is one of the 26 clubs or
spades, so that P[B] = 26/52. Given B, the conditional probability of A is
P[A|B] = P[AB] / P[B] = (1/52)/(26/52) = 1/26.
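The counting argument can also be verified by enumerating the deck in code. In this Python sketch (an illustration; the rank and suit labels are names chosen for readability, not notation from the text) the conditional probability is the fraction of outcomes in B that are also in A:

```python
from fractions import Fraction

# 52 equally likely bottom cards.
suits = ["clubs", "diamonds", "hearts", "spades"]
ranks = ["A"] + [str(n) for n in range(2, 11)] + ["J", "Q", "K"]
deck = [(r, s) for r in ranks for s in suits]
assert len(deck) == 52

black = [c for c in deck if c[1] in ("clubs", "spades")]   # event B, 26 cards
ace_of_clubs = [("A", "clubs")]                            # event A

p_B = Fraction(len(black), len(deck))                      # 26/52 = 1/2
p_A_given_B = Fraction(len([c for c in ace_of_clubs if c in black]),
                       len(black))                         # 1/26
```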
Example 1.14
Roll two fair four-sided dice. Let X1 and X2 denote the number of dots that appear
on die 1 and die 2, respectively. Let A be the event X1 ≥ 2. What is P[A]? Let B
denote the event X2 > X1. What is P[B]? What is P[A|B]?
We begin by observing that the sample space has 16 elements, the equally likely
pairs (X1, X2) with X1, X2 ∈ {1, 2, 3, 4}.
[Figure: the 4 × 4 grid of outcomes (X1, X2). The rectangle represents A; it contains 12
outcomes, each with probability 1/16. The triangle represents B, the outcomes with X2 > X1.]
To find P[A], we add up the probabilities of outcomes in A, so P[A] = 12/16 = 3/4.
The triangle represents B. It contains six outcomes. Therefore P[B] = 6/16 = 3/8.
The event AB has three outcomes, (2, 3), (2, 4), (3, 4), so P[AB] = 3/16. From the
definition of conditional probability, we write
P[A|B] = P[AB] / P[B] = (3/16)/(6/16) = 1/2. (1.22)
We can also derive this fact from the diagram by restricting our attention to the six
outcomes in B (the conditioning event) and noting that three of the six outcomes in
B (one-half of the total) are also in A.
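The 16-outcome sample space is small enough to verify all three probabilities directly. A Python sketch (illustrative only, not one of the text's MATLAB examples):

```python
from fractions import Fraction

# Enumerate the 16 equally likely outcomes (x1, x2) of two fair four-sided dice.
outcomes = [(x1, x2) for x1 in range(1, 5) for x2 in range(1, 5)]
A = [(x1, x2) for (x1, x2) in outcomes if x1 >= 2]   # event A: X1 >= 2
B = [(x1, x2) for (x1, x2) in outcomes if x2 > x1]   # event B: X2 > X1
AB = [o for o in A if o in B]                        # intersection

p_A = Fraction(len(A), len(outcomes))        # 12/16 = 3/4
p_B = Fraction(len(B), len(outcomes))        # 6/16 = 3/8
p_A_given_B = Fraction(len(AB), len(B))      # 3/6 = 1/2
```

Note that `p_A_given_B` is computed exactly as the example describes: restrict attention to the six outcomes in B and count how many are also in A.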
Quiz 1.4
Monitor three consecutive packets going through an Internet router. Classify each
one as either video (v) or data (d). Your observation is a sequence of three letters
(each one is either v or d). For example, three video packets corresponds to vvv.
The outcomes vvv and ddd each have probability 0.2 whereas each of the other
outcomes vvd, vdv, vdd, dvv, dvd, and ddv has probability 0.1. Count the number
of video packets Nv in the three packets you have observed. Describe in words and
also calculate the following probabilities:
(a) P[Nv = 2] (b) P[Nv ≥ 1]
(c) P[{vvd}|Nv = 2] (d) P[{ddv}|Nv = 2]
(e) P[Nv = 2|Nv ≥ 1] (f) P[Nv ≥ 1|Nv = 2]
A partition divides the sample space into mutually exclusive sets.
The law of total probability expresses the probability of an event
as the sum of the probabilities of outcomes that are in the separate
sets of a partition.
Example 1.15
Flip four coins, a penny, a nickel, a dime, and a quarter. Examine the coins in order
(penny, then nickel, then dime, then quarter) and observe whether each coin shows a
head (h) or a tail (t). What is the sample space? How many elements are in the sample
space?
The sample space consists of 16 four-letter words, with each letter either h or t. For
example, the outcome tthh refers to the penny and the nickel showing tails and the
dime and quarter showing heads. There are 16 members of the sample space.
Figure 1.1 In this example of Theorem 1.8, the partition is B = {B1, B2, B3, B4} and
Ci = A ∩ Bi for i = 1, ..., 4. It should be apparent that A = C1 ∪ C2 ∪ C3 ∪ C4.
Example 1.16
The experiment in Example 1.15 and Example 1.16 refers to a "toy problem,"
one that is easily visualized but isn't something we would do in the course of our
professional work. Mathematically, however, it is equivalent to many real engineering
problems. For example, observe a pair of modems transmitting four bits
from one computer to another. For each bit, observe whether the receiving modem
detects the bit correctly (c) or makes an error (e). Or test four integrated circuits.
For each one, observe whether the circuit is acceptable (a) or a reject (r). In all
of these examples, the sample space contains 16 four-letter words formed with an
alphabet containing two letters. If we are interested only in the number of times
one of the letters occurs, it is sufficient to refer only to the partition B, which does
not contain all of the information about the experiment but does contain all of
the information we need. The partition is simpler to deal with than the sample
space because it has fewer members (there are five events in the partition and 16
outcomes in the sample space). The simplification is much more significant when
the complexity of the experiment is higher. For example, in testing 20 circuits the
sample space has 2^20 = 1,048,576 members, while the corresponding partition has
only 21 members.
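The counting behind these figures is easy to check in code. In this Python sketch (illustrative only), the partition event "k rejects among n circuits" contains C(n, k) outcomes, and the n + 1 partition events together exhaust the 2^n outcomes:

```python
from math import comb

# Testing n = 20 circuits: 2**n equally likely outcomes, but the partition
# by the number of rejects has only n + 1 events.
n = 20
sample_space_size = 2 ** n                       # 1,048,576 members
partition = [comb(n, k) for k in range(n + 1)]   # outcomes with k rejects

assert len(partition) == 21                      # only 21 partition events
assert sum(partition) == sample_space_size       # the partition covers S
```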
We observed in Section 1.3 that the entire theory of probability is based on a
union of mutually exclusive events. The following theorem shows how to use a
partition to represent an event as a union of mutually exclusive events.
Theorem 1.8
For any partition B = {B1, B2, ...} and any event A in the sample space, let
Ci = A ∩ Bi. For i ≠ j, the events Ci and Cj are mutually exclusive and
A = C1 ∪ C2 ∪ ···
Example 1.17
In the coin-tossing experiment of Example 1.15, let A equal the set of outcomes with
fewer than three heads:
A = {tttt, httt, thtt, ttht, ttth, hhtt, htht, htth, tthh, thth, thht}. (1.23)
Let {B0, B1, B2, B3, B4} denote the partition in which Bi is the set of outcomes
with i heads. In terms of this partition,
A = B0 ∪ B1 ∪ B2. (1.24)
We advise you to make sure you understand Theorem 1.8 and Example 1.17.
Many practical problems use the mathematical technique contained in the theorem.
For example, find the probability that there are three or more bad circuits in a batch
that comes from a fabrication machine.
The following theorem refers to a partition {B1, B2, ..., Bm} and any event A.
It states that we can find the probability of A by adding the probabilities of the
parts of A that are in the separate components of the event space.
Theorem 1.9
P[A] = Σ_{i=1}^{m} P[A ∩ Bi].
Proof The proof follows directly from Theorem 1.8 and Theorem 1.3. In this case, the
mutually exclusive sets are Ci = A ∩ Bi.
Theorem 1.9 is often used when the sample space can be written in the form of a
table. In this table, the rows and columns each represent a partition. This method
is shown in the following example.
Example 1.18
A company has a model of email use. It classifies all emails as either long (l), if they
are over 10 MB in size, or brief (b). It also observes whether the email is just text
(t), has attached images (i), or has an attached video (v). This model implies an
experiment in which the procedure is to monitor an email and the observation consists
of the type of email, t, i, or v, and the length, l or b. The sample space has six
outcomes: S = {lt, bt, li, bi, lv, bv}. In this problem, each email is classified in two
ways: by length and by type. Using L for the event that an email is long and B for the
event that an email is brief, {L, B} is a partition. Similarly, the text (T), image (I), and
video (V) classification is a partition {T, I, V}. The sample space can be represented
by a table in which the rows and columns are labeled by events and the intersection of
each row and column event contains a single outcome. The corresponding table entry
is the probability of that outcome. In this case, the table is

        T      I      V
  L    0.3    0.12   0.15     (1.25)
  B    0.2    0.08   0.15

For example, from the table we can read that the probability of a brief image email is
P[bi] = P[BI] = 0.08. Note that {T, I, V} is a partition corresponding to {B1, B2, B3}
in Theorem 1.9. Thus we can apply Theorem 1.9 to find the probability of a long email:
P[L] = P[LT] + P[LI] + P[LV] = 0.3 + 0.12 + 0.15 = 0.57.
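The application of Theorem 1.9 to the table can be mirrored in code. A Python sketch using the outcome probabilities in (1.25) (illustrative only):

```python
# Probabilities of the six outcomes from the table in (1.25); keys are
# (length, type) with L = long, B = brief and T/I/V = text/image/video.
p = {("L", "T"): 0.30, ("L", "I"): 0.12, ("L", "V"): 0.15,
     ("B", "T"): 0.20, ("B", "I"): 0.08, ("B", "V"): 0.15}

# Theorem 1.9: P[L] is the sum over the mutually exclusive pieces LT, LI, LV.
p_long = sum(prob for (length, _), prob in p.items() if length == "L")
# p_long == 0.57
```

Summing the B row instead gives P[B] = 0.2 + 0.08 + 0.15 = 0.43, and the two row sums add to 1 as they must for a partition.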
Theorem 1.10 Law of Total Probability
For a partition {B1, B2, ..., Bm} with P[Bi] > 0 for all i,
P[A] = Σ_{i=1}^{m} P[A|Bi] P[Bi].
Proof This follows from Theorem 1.9 and the identity P[ABi] = P[A|Bi] P[Bi], which is a
direct consequence of the definition of conditional probability.
The production figures state that 3000 + 4000 + 3000 = 10,000 resistors per hour are
produced. The fraction from machine B1 is P[B1] = 3000/10,000 = 0.3. Similarly,
P[B2] = 0.4 and P[B3] = 0.3. Now it is a simple matter to apply the law of total
probability to find the probability that a resistor is acceptable, over all resistors shipped
by the company:
P[A] = P[A|B1]P[B1] + P[A|B2]P[B2] + P[A|B3]P[B3]
     = (0.8)(0.3) + (0.9)(0.4) + (0.6)(0.3) = 0.78.
For the whole factory, 78% of resistors are within 50 Ω of the nominal value.
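The arithmetic of the law of total probability is easy to reproduce in code. In this Python sketch (illustrative only) the machine shares come from the production figures; the per-machine acceptability values P[A|Bi] = 0.8, 0.9, 0.6 are assumptions consistent with the Bayes calculations shown in Example 1.20:

```python
# Machine shares P[Bi] from the production figures, and assumed per-machine
# acceptability probabilities P[A|Bi] (consistent with Example 1.20).
p_B = [0.3, 0.4, 0.3]
p_A_given_B = [0.8, 0.9, 0.6]

# Law of total probability: P[A] = sum_i P[A|Bi] P[Bi].
p_A = sum(pa * pb for pa, pb in zip(p_A_given_B, p_B))
# p_A == 0.78
```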
Bayes' Theorem
When we have advance information about P[A|B] and need to calculate P[B|A],
we refer to the following formula:
Theorem 1.11
P[B|A] = P[A|B] P[B] / P[A].
Proof By the definition of conditional probability, P[B|A] = P[AB]/P[A], and
P[AB] = P[A|B] P[B].
we cannot observe directly (for example, the machine that made a particular resistor).
For each possible state, Bi, we know the prior probability P[Bi] and P[A|Bi],
the probability that an event A occurs (the resistor meets a quality criterion) if
Bi is the actual state. Now we observe the actual event (either the resistor passes
or fails a test), and we ask about the thing we are interested in (the machines
that might have produced the resistor). That is, we use Bayes' theorem to find
P[B1|A], P[B2|A], ..., P[Bm|A]. In performing the calculations, we use the law of
total probability to calculate the denominator in Theorem 1.11. Thus for state Bi,
P[Bi|A] = P[A|Bi] P[Bi] / Σ_{i=1}^{m} P[A|Bi] P[Bi]. (1.31)
P[B3|A] = (0.6)(0.3)/(0.78) = 0.23. (1.33)
Similarly we obtain P[B1|A] = 0.31 and P[B2|A] = 0.46. Of all resistors within 50 Ω
of the nominal value, only 23% come from machine B3 (even though this machine
produces 30% of all resistors). Machine B1 produces 31% of the resistors that meet
the 50 Ω criterion and machine B2 produces 46% of them.
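These posterior probabilities can be recomputed in a few lines. As in the earlier total-probability sketch, the likelihood values 0.8, 0.9, 0.6 are assumptions consistent with the numbers shown in the example; Python is used for illustration only:

```python
# Priors P[Bi] and assumed likelihoods P[A|Bi] for the resistor example.
p_B = [0.3, 0.4, 0.3]
p_A_given_B = [0.8, 0.9, 0.6]

# Denominator of Bayes' theorem via the law of total probability.
p_A = sum(pa * pb for pa, pb in zip(p_A_given_B, p_B))          # 0.78

# Posteriors P[Bi|A] = P[A|Bi] P[Bi] / P[A].
posterior = [pa * pb / p_A for pa, pb in zip(p_A_given_B, p_B)]
rounded = [round(x, 2) for x in posterior]
# rounded == [0.31, 0.46, 0.23]
```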
Quiz 1.5
Monitor customer behavior in the Phonesmart store. Classify the behavior as buying
(B) if a customer purchases a smartphone. Otherwise the behavior is no purchase
(N). Classify the time a customer is in the store as long (L) if the customer
stays more than three minutes; otherwise classify the amount of time as rapid
(R). Based on experience with many customers, we use the probability model
P[N] = 0.7, P[L] = 0.6, P[NL] = 0.35. Find the following probabilities:
(a) P[B ∪ L] (b) P[N ∪ L]
(c) P[N ∪ B] (d) P[LR]
1.6 Independence
Definition 1.6
Events A and B are independent if and only if
P[AB] = P[A] P[B].
When events A and B have nonzero probabilities, the following formulas are equivalent
to the definition of independent events:
P[A|B] = P[A], P[B|A] = P[B].
Each element of the sample space S = {bbb, bbn, bnb, bnn, nbb, nbn, nnb, nnn} has
(1.35)
In this example we have analyzed a probability model to determine whether two
events are independent. In many practical applications we reason in the opposite
direction. Our knowledge of an experiment leads us to assume that certain pairs of
events are independent. We then use this knowledge to build a probability model
for the experiment.
Example 1.22
Integrated circuits undergo two tests. A mechanical test determines whether pins have
the correct spacing, and an electrical test checks the relationship of outputs to inputs.
We assume that electrical failures and mechanical failures occur independently. Our
information about circuit production tells us that mechanical failures occur with probability
0.05 and electrical failures occur with probability 0.2. What is the probability
model of an experiment that consists of testing an integrated circuit and observing the
results of the mechanical and electrical tests?
To build the probability model, we note that the sample space contains four outcomes:
Thus far, we have considered independence as a property of a pair of events.
Often we consider larger sets of independent events. For more than two events to
be independent, the probability model has to meet a set of conditions. To define
mutual independence, we begin with three sets.
The final condition is a simple extension of Definition 1.6. The following example
shows why this condition is insufficient to guarantee that "everything is independent
of everything else," the idea at the heart of independence.
These three sets satisfy the final condition of Definition 1.7 because A1 ∩ A2 ∩ A3 = ∅,
and
(1.41)
However, A1 and A2 are not independent because, with all outcomes equiprobable,
This definition and Example 1.23 show us that when n > 2 it is a complex matter
to determine whether or not n events are mutually independent. On the other
hand, if we know that n events are mutually independent, it is a simple matter to
determine the probability of the intersection of any subset of the n events. Just
multiply the probabilities of the events in the subset.
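A compact way to see the distinction is to enumerate a small model. The following Python sketch (an illustration; the three events form a standard construction, essentially the situation posed in Problem 1.6.9) exhibits events that are pairwise independent but not mutually independent:

```python
from fractions import Fraction

# Four equiprobable outcomes, P[s] = 1/4 for each s in S.
S = {1, 2, 3, 4}

def P(event):
    """Probability of an event under the equiprobable model."""
    return Fraction(len(event & S), len(S))

A, B, C = {1, 2}, {1, 3}, {1, 4}    # each has probability 1/2

# Every pair satisfies P[XY] = P[X] P[Y] (each intersection is {1}, prob 1/4).
pairwise = (P(A & B) == P(A) * P(B) and
            P(A & C) == P(A) * P(C) and
            P(B & C) == P(B) * P(C))          # True

# But the triple fails: P[ABC] = 1/4 while P[A] P[B] P[C] = 1/8.
mutual = P(A & B & C) == P(A) * P(B) * P(C)   # False
```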
Quiz 1.6
Monitor two consecutive packets going through a router. Classify each one as video
(v) if it was sent from a YouTube server or as ordinary data (d) otherwise. Your
observation is a sequence of two letters (either v or d). For example, two video
packets corresponds to vv. The two packets are independent and the probability
that any one of them is a video packet is 0.8. Denote the identity of packet i by Ci.
If packet i is a video packet, then Ci = v; otherwise, Ci = d. Count the number
Nv of video packets in the two packets you have observed. Determine whether the
following pairs of events are independent:
(a) {Nv = 2} and {Nv ≥ 1} (b) {Nv ≥ 1} and {C1 = v}
(c) {C2 = v} and {C1 = d} (d) {C2 = v} and {Nv is even}
1.7 MATLAB
MATLAB can be used in two ways to study and apply probability theory. Like a
sophisticated scientific calculator, it can perform complex numerical calculations
and draw graphs. It can also simulate experiments with random outcomes. To
simulate experiments, we need a source of randomness. MATLAB uses a computer
algorithm, referred to as a pseudorandom number generator, to produce a sequence
of numbers between 0 and 1. Unless someone knows the algorithm, it is impossible
to examine some of the numbers in the sequence and thereby calculate others.
The calculation of each random number is similar to an experiment in which all
outcomes are equally likely and the sample space is all binary numbers of a certain
length. (The length depends on the machine running MATLAB.) Each number
is interpreted as a fraction, with a binary point preceding the bits in the binary
number. To use the pseudorandom number generator to simulate an experiment
that contains an event with probability p, we examine one number, r, produced by
the MATLAB algorithm and say that the event occurs if r < p; otherwise it does
not occur.
A MATLAB simulation of an experiment starts with rand: the random number
generator rand(m,n) returns an m × n array of pseudorandom numbers. Similarly,
rand(n) produces an n × n array and rand(1) is just a scalar random number.
Each number produced by rand(1) is in the interval (0, 1). Each time we use rand,
we get new, seemingly unpredictable numbers. Suppose p is a number between 0
and 1. The comparison rand(1) < p produces a 1 if the random number is less
than p; otherwise it produces a zero. Roughly speaking, the function rand(1) < p
simulates a coin flip with P[tail] = p.
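For readers without MATLAB, the same idea can be sketched in Python, where random.random() plays the role of rand(1) (an illustration only; the seed value is arbitrary):

```python
import random

random.seed(2023)   # arbitrary seed, so the run is repeatable
p = 0.8             # probability of the event being simulated

# Each comparison is True (a "1") with probability p, just like rand(1) < p.
flips = [random.random() < p for _ in range(10_000)]
relative_freq = sum(flips) / len(flips)   # close to p = 0.8
```

Over many repetitions the relative frequency of the event settles near p, which is exactly why the comparison serves as a simulated coin flip.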
MATLAB also has some convenient variations on rand. For example, randi(k)
generates a random integer from the set {1, 2, ..., k} and randi(k,m,n) generates
an m × n array of such random integers.
Example 1.25
Use MATLAB to generate 12 random student test scores T as described in Quiz 1.3.
Since randi(50,1,12) generates 12 test scores from the set {1, ..., 50}, we need
only add 50 to each score to obtain test scores in the range {51, ..., 100}.
>> 50+randi(50,1,12)
ans =
69 78 60 68 93 99 77 95 88 57 51 90
Finally, we note that MATLAB's random numbers are only seemingly unpredictable.
In fact, MATLAB maintains a seed value that determines the subsequent "random"
numbers that will be returned. This seed is controlled by the rng function; s = rng
saves the current seed and rng(s) restores a previously saved seed. Initializing the
random number generator with the same seed always generates the same sequence:
Example 1.26
>> s=rng;
>> 50+randi(50,1,12)
ans =
89 76 80 80 72 92 58 56 77 78 59 58
>> rng(s);
>> 50+randi(50,1,12)
ans =
89 76 80 80 72 92 58 56 77 78 59 58
When you run a simulation that uses rand, it normally doesn't matter how the
rng seed is initialized. However, it can be instructive to use the same repeatable
sequence of rand values when you are debugging your simulation.
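Python's random module offers the same save-and-restore capability through getstate and setstate, which parallel s = rng and rng(s) (a sketch for illustration):

```python
import random

# Save the generator state, draw some scores, restore, and draw again.
state = random.getstate()                        # like s = rng
first = [random.randint(51, 100) for _ in range(12)]
random.setstate(state)                           # like rng(s)
second = [random.randint(51, 100) for _ in range(12)]

assert first == second   # same state, same "random" sequence
```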
Quiz 1.7
The number of characters in a tweet is equally likely to be any integer between 1
and 140. Simulate an experiment that generates 1000 tweets and counts the number
of "long" tweets that have over 120 characters. Repeat this experiment 5 times.
Problems
Difficulty: Easy / Moderate / Difficult / Experts Only
1.1.1 Continuing Quiz 1.1, write Gerlanda's entire menu in words (supply prices if you wish).

1.1.2 For Gerlanda's pizza in Quiz 1.1, answer these questions:
(a) Are N and M mutually exclusive?
(b) Are N, T, and M collectively exhaustive?
(c) Are T and O mutually exclusive? State this condition in words.
(d) Does Gerlanda's make Tuscan pizzas with mushrooms and onions?
(e) Does Gerlanda's make Neapolitan pizzas that have neither mushrooms nor onions?

1.1.3 Ricardo's offers customers two kinds of pizza crust, Roman (R) and Neapolitan (N). All pizzas have cheese but not all pizzas have tomato sauce. Roman pizzas can have tomato sauce or they can be white (W); Neapolitan pizzas always have tomato sauce. It is possible to order a Roman pizza with mushrooms (M) added. A Neapolitan pizza can contain mushrooms or onions (O) or both, in addition to the tomato sauce and cheese. Draw a Venn diagram that shows the relationship among the ingredients N, M, O, T, and W in the menu of Ricardo's pizzeria.

1.2.1 A hypothetical wifi transmission can take place at any of three speeds
(f) Are A1, A2, and A3 collectively exhaustive?

1.2.2 An integrated circuit factory has three machines X, Y, and Z. Test one integrated circuit produced by each machine. Either a circuit is acceptable (a) or it fails (f). An observation is a sequence of three test results corresponding to the circuits from machines X, Y, and Z, respectively. For example, aaf is the observation that the circuits from X and Y pass the test and the circuit from Z fails the test.
(a) What are the elements of the sample space of this experiment?
(b) What are the elements of the sets
ZF = {circuit from Z fails},
XA = {circuit from X is acceptable}.
(c) Are ZF and XA mutually exclusive?
(d) Are ZF and XA collectively exhaustive?

(a) Events A and B are a partition and P[A] = 3 P[B].
(b) For events A and B, P[A ∪ B] = P[A] and P[A ∩ B] = 0.
(c) For events A and B, P[A ∪ B] = P[A] − P[B].

1.3.2 You roll two fair six-sided dice; one die is red, the other is white. Let Ri be the event that the red die rolls i. Let Wj be the event that the white die rolls j.
(a) What is P[R3W2]?
(b) What is the probability P[S5] that the sum of the two rolls is 5?

1.3.3 You roll two fair six-sided dice. Find the probability P[D3] that the absolute value of the difference of the dice is 3.

1.3.4 Indicate whether each statement is true or false.
(a) If P[A] = 2P[A^c], then P[A] = 1/2.
(b) For all A and B, P[AB] ≤ P[A] P[B].
(c) If P[A] ≤ P[B], then P[AB] ≤ P[B].
(d) If P[A ∩ B] = P[A], then P[A] ≥ P[B].

1.3.5 Computer programs are classified by the length of the source code and by the execution time. Programs with more than 150 lines in the source code are big (B). Programs with ≤ 150 lines are little (L). Fast programs (F) run in less than 0.1 seconds. Slow programs (W) require at least 0.1 seconds. Monitor a program executed by a computer. Observe the length of the source code and the run time. The probability model for this experiment contains the following information: P[LF] = 0.5, P[BF] = 0.2, and P[BW] = 0.2. What is the sample space of the experiment? Calculate the following probabilities: P[W], P[B], and P[W ∪ B].

1.3.6 There are two types of cellular phones, handheld phones (H) that you carry and mobile phones (M) that are mounted in vehicles. Phone calls can be classified by the traveling speed of the user as fast (F) or slow (W). Monitor a cellular phone call and observe the type of telephone and the speed of the user. The probability model for this experiment has the following information: P[F] = 0.5, P[HF] = 0.2, P[MW] = 0.1. What is the sample space of the experiment? Find the following probabilities P[W], P[MF], and P[H].

1.3.7 Shuffle a deck of cards and turn over the first card. What is the probability that the first card is a heart?

1.3.8 You have a six-sided die that you roll once and observe the number of dots facing upwards. What is the sample space? What is the probability of each sample outcome? What is the probability of E, the event that the roll is even?

1.3.9 A student's score on a 10-point quiz is equally likely to be any integer between 0 and 10. What is the probability of an A, which requires the student to get a score of 9 or more? What is the probability the student gets an F by getting less than 4?

1.3.10 Use Theorem 1.4 to prove the following facts:
(a) P[A ∪ B] ≥ P[A]
(b) P[A ∪ B] ≥ P[B]
(c) P[A ∩ B] ≤ P[A]
(d) P[A ∩ B] ≤ P[B]

1.3.11 Use Theorem 1.4 to prove by induction the union bound: For any collection of events A1, ..., An,
P[A1 ∪ A2 ∪ ··· ∪ An] ≤ Σ_{i=1}^{n} P[Ai].

1.3.12 Using only the three axioms of probability, prove P[∅] = 0.

1.3.13 Using the three axioms of probability and the fact that P[∅] = 0, prove Theorem 1.3. Hint: Define Ai = Bi for i = 1, ..., m and Ai = ∅ for i > m.

1.3.14 For each fact stated in Theorem 1.4, determine which of the three axioms of probability are needed to prove the fact.

1.4.1 Mobile telephones perform handoffs as they move from cell to cell. During a call, a telephone either performs zero handoffs (H0), one handoff (H1), or more than one handoff (H2). In addition, each call is either long (L), if it lasts more than three minutes, or brief (B). The following table describes the probabilities of the possible types of calls.

       H0    H1    H2
  L   0.1   0.1   0.2
  B   0.4   0.1   0.1

(a) What is the probability that a brief call will have no handoffs?
(b) What is the probability that a call with one handoff will be long?
(c) What is the probability that a long call will have one or more handoffs?

1.4.2 You have a six-sided die that you roll once. Let Ri denote the event that the roll is i. Let Gj denote the event that
the roll is greater than j. Let E denote the event that the roll of the die is even-numbered.
(a) What is P[R3|G1], the conditional probability that 3 is rolled given that the roll is greater than 1?
(b) What is the conditional probability that 6 is rolled given that the roll is greater than 3?
(c) What is P[G3|E], the conditional probability that the roll is greater than 3 given that the roll is even?
(d) Given that the roll is greater than 3, what is the conditional probability that the roll is even?

1.4.3 You have a shuffled deck of three cards: 2, 3, and 4. You draw one card. Let Ci denote the event that card i is picked. Let E denote the event that the card chosen is an even-numbered card.
(a) What is P[C2|E], the probability that the 2 is picked given that an even-numbered card is chosen?
(b) What is the conditional probability that an even-numbered card is picked given that the 2 is picked?

1.4.4 Phonesmart is having a sale on Bananas. If you buy one Banana at full price, you get a second at half price. When couples come in to buy a pair of phones, sales of Apricots and Bananas are equally likely. Moreover, given that the first phone sold is a Banana, the second phone is twice as likely to be a Banana rather than an Apricot. What is the probability that a couple buys a pair of Bananas?

1.4.5 The basic rules of genetics were discovered in the mid-1800s by Mendel, who found that each characteristic of a pea plant, such as whether the seeds were green or yellow, is determined by two genes, one from each parent. In his pea plants, Mendel found that yellow seeds were a dominant trait over green seeds. A yy pea with two yellow genes has yellow seeds; a gg pea with two recessive genes has green seeds; a hybrid gy or yg pea has yellow seeds. In one of Mendel's experiments, he started with a parental generation in which half the pea plants were yy and half the plants were gg. The two groups were crossbred so that each pea plant in the first generation was gy. In the second generation, each pea plant was equally likely to inherit a y or a g gene from each first-generation parent. What is the probability P[Y] that a randomly chosen pea plant in the second generation has yellow seeds?

1.4.6 From Problem 1.4.5, what is the conditional probability of yy, that a pea plant has two dominant genes, given the event Y that it has yellow seeds?

1.4.7 You have a shuffled deck of three cards: 2, 3, and 4, and you deal out the three cards. Let Ei denote the event that the ith card dealt is even-numbered.
(a) What is P[E2|E1], the probability the second card is even given that the first card is even?
(b) What is the conditional probability that the first two cards are even given that the third card is even?
(c) Let Oi represent the event that the ith card dealt is odd-numbered. What is P[E2|O1], the conditional probability that the second card is even given that the first card is odd?
(d) What is the conditional probability that the second card is odd given that the first card is odd?

1.4.8 Deer ticks can carry both Lyme disease and human granulocytic ehrlichiosis (HGE). In a study of ticks in the Midwest, it was found that 16% carried Lyme disease, 10% had HGE, and that 10% of the ticks that had either Lyme disease or HGE carried both diseases.
(a) What is the probability P[LH] that a tick carries both Lyme disease (L) and HGE (H)?
(b) What is the conditional probability that a tick has HGE given that it has Lyme disease?
1.5.1 Given the model of handoffs and call lengths in Problem 1.4.1,
(a) What is the probability P[H0] that a phone makes no handoffs?
(b) What is the probability a call is brief?
(c) What is the probability a call is long or there are at least two handoffs?

1.5.2 For the telephone usage model of Example 1.18, let Bm denote the event that a call is billed for m minutes. To generate a phone bill, observe the duration of the call in integer minutes (rounding up). Charge for M minutes, M = 1, 2, 3, ..., if the exact duration T is M − 1 < t ≤ M. A more complete probability model shows that for m = 1, 2, ... the probability of each event Bm is
P[Bm] = α(1 − α)^(m−1),
where α = 1 − (0.57)^(1/3) = 0.171.
(a) Classify a call as long, L, if the call lasts more than three minutes. What is P[L]?
(b) What is the probability that a call will be billed for nine minutes or less?

1.5.3 Suppose a cellular telephone is equally likely to make zero handoffs (H0), one handoff (H1), or more than one handoff (H2). Also, a caller is either on foot (F) with probability 5/12 or in a vehicle (V).
(a) Given the preceding information, find three ways to fill in the following probability table:

       H0    H1    H2
  F
  V

(b) Suppose we also learn that 1/4 of all callers are on foot making calls with no handoffs and that 1/6 of all callers are vehicle users making calls with a single handoff. Given these additional facts, find all possible ways to fill in the table of probabilities.

1.6.1 Is it possible for A and B to be independent events yet satisfy A = B?

1.6.2 Events A and B are equiprobable, mutually exclusive, and independent. What is P[A]?

1.6.3 At a Phonesmart store, each phone sold is twice as likely to be an Apricot as a Banana. Also each phone sale is independent of any other phone sale. If you monitor the sale of two phones, what is the probability that the two phones sold are the same?

1.6.4 Use a Venn diagram in which the event areas are proportional to their probabilities to illustrate two events A and B that are independent.

1.6.5 In an experiment, A and B are mutually exclusive events with probabilities P[A] = 1/4 and P[B] = 1/8.
(a) Find P[A ∩ B], P[A ∪ B], P[A ∩ B^c], and P[A ∪ B^c].
(b) Are A and B independent?

1.6.6 In an experiment, C and D are independent events with probabilities P[C] = 5/8 and P[D] = 3/8.
(a) Determine the probabilities P[C ∩ D], P[C ∩ D^c], and P[C^c ∩ D^c].
(b) Are C^c and D^c independent?

1.6.7 In an experiment, A and B are mutually exclusive events with probabilities P[A ∪ B] = 5/8 and P[A] = 3/8.
(a) Find P[B], P[A ∩ B^c], and P[A ∪ B^c].
(b) Are A and B independent?

1.6.8 In an experiment, C and D are independent events with probabilities P[C ∩ D] = 1/3 and P[C] = 1/2.
(a) Find P[D], P[C ∩ D^c], and P[C^c ∩ D^c].
(b) Find P[C ∪ D] and P[C ∪ D^c].
(c) Are C and D^c independent?

1.6.9 In an experiment with equiprobable outcomes, the sample space is S = {1, 2, 3, 4} and P[s] = 1/4 for all s ∈ S. Find three events in S that are pairwise independent but are not independent. (Note: Pairwise independent events meet the first three conditions of Definition 1.7.)

1.6.10 (Continuation of Problem 1.4.5) One of Mendel's most significant results was the conclusion that genes determining different characteristics are transmitted independently. In pea plants, Mendel found that round peas (r) are a dominant trait over wrinkled peas (w). Mendel crossbred a group of (rr, yy) peas with a group of (ww, gg) peas. In this notation, rr denotes a pea with two "round" genes and ww denotes a pea with two "wrinkled" genes. The first generation were either (rw, yg), (rw, gy), (wr, yg), or (wr, gy) plants with both hybrid shape and hybrid color. Breeding among the first generation yielded second-generation plants in which genes for each characteristic were equally likely to be either dominant or recessive. What is the probability P[Y] that a second-generation pea plant has yellow seeds? What is the probability P[R] that a second-generation plant has round peas? Are R and Y independent events? How many visibly different kinds of pea plants would Mendel observe in the second generation? What are the probabilities of each of these kinds?

1.6.11 For independent events A and B, prove that
(a) A and B^c are independent.
(b) A^c and B are independent.
(c) A^c and B^c are independent.

1.6.12 Use a Venn diagram in which the event areas are proportional to their probabilities to illustrate three events A, B, and C that are independent.

1.6.13 Use a Venn diagram in which event areas are in proportion to their probabilities to illustrate events A, B, and C that are pairwise independent but not independent.

1.7.1 Following Quiz 1.3, use MATLAB, but not the randi function, to generate a vector T of 200 independent test scores such that all scores between 51 and 100 are equally likely.
[
Sequential Experiments

¹Unlike biological trees, which grow from the ground up, probability trees usually grow from left to right. Some of them have their roots on top and leaves on the bottom.
Example 2.1
For the resistors of Example 1.19, we used A to denote the event that a randomly chosen resistor is "within 50 Ω of the nominal value." This could mean "acceptable." We use the notation N ("not acceptable") for the complement of A. The experiment of testing a resistor can be viewed as a two-step procedure. First we identify which machine (B1, B2, or B3) produced the resistor. Second, we find out if the resistor is acceptable. Draw a tree for this sequential experiment. What is the probability of choosing a resistor from machine B2 that is not acceptable?
[Tree diagram: first-stage branches B1 (0.3), B2 (0.4), and B3 (0.3); second-stage branches A and N with P[A|B1] = 0.8, P[N|B1] = 0.2, P[A|B2] = 0.9, P[N|B2] = 0.1, P[A|B3] = 0.6, P[N|B3] = 0.4, giving leaf probabilities 0.24, 0.06, 0.36, 0.04, 0.18, and 0.12.]

This two-step procedure is shown in the tree on the left. To use the tree to find the probability of the event B2N, a nonacceptable resistor from machine B2, we start at the left and find that the probability of reaching B2 is P[B2] = 0.4. We then move to the right to B2N and multiply P[B2] by P[N|B2] = 0.1 to obtain P[B2N] = (0.4)(0.1) = 0.04.
We observe in this example a general property of all tree diagrams that represent sequential experiments. The probabilities on the branches leaving any node add up to 1. This is a consequence of the law of total probability and the property of conditional probabilities that corresponds to Axiom 3 (Theorem 1.7). Moreover, Axiom 2 implies that the probabilities of all of the leaves add up to 1.
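These two tree properties are easy to verify numerically. The sketch below (plain Python, with the branch probabilities read from the Example 2.1 tree) builds the six leaf probabilities and checks that the branches leaving each node, and all of the leaves, sum to 1.

```python
# Tree of Example 2.1: first stage picks the machine, second stage tests the resistor.
machines = {"B1": 0.3, "B2": 0.4, "B3": 0.3}      # P[Bi]
acceptable = {"B1": 0.8, "B2": 0.9, "B3": 0.6}    # P[A | Bi]

leaves = {}
for m, p_m in machines.items():
    leaves[(m, "A")] = p_m * acceptable[m]        # P[Bi A]
    leaves[(m, "N")] = p_m * (1 - acceptable[m])  # P[Bi N]

# Branches leaving the root add up to 1 ...
assert abs(sum(machines.values()) - 1) < 1e-12
# ... and so do the probabilities of all of the leaves.
assert abs(sum(leaves.values()) - 1) < 1e-12
print(leaves[("B2", "N")])   # P[B2 N] = (0.4)(0.1)
```

The dictionary-of-leaves representation mirrors the tree: each leaf probability is the product of the branch probabilities along its path.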
Example 2.2
Traffic engineers have coordinated the timing of two traffic lights to encourage a run of green lights. In particular, the timing was designed so that with probability 0.8 a driver will find the second light to have the same color as the first. Assuming the first light is equally likely to be red or green, what is the probability P[G2] that the second light is green? Also, what is P[W], the probability that you wait for at least one of the first two lights? Lastly, what is P[G1|R2], the conditional probability of a green first light given a red second light?
P[G2] = P[G2|G1]P[G1] + P[G2|R1]P[R1] = (0.8)(0.5) + (0.2)(0.5) = 0.5.    (2.2)

The probability that you wait for at least one light is

P[W] = P[R1 ∪ R2] = P[G1R2] + P[R1G2] + P[R1R2] = 0.1 + 0.1 + 0.4 = 0.6.    (2.3)

An alternative way to the same answer is to observe that W is also the complement of the event that both lights are green. Thus,

P[W] = 1 − P[G1G2] = 1 − P[G2|G1]P[G1] = 1 − (0.8)(0.5) = 0.6.    (2.4)

To find P[G1|R2], we need P[R2] = 1 − P[G2] = 0.5. Since P[G1R2] = 0.1, the conditional probability that you have a green first light given a red second light is

P[G1|R2] = P[G1R2]/P[R2] = 0.1/0.5 = 0.2.

P[C1|H] = (3/8)/(3/8 + 1/4) = 3/5.    (2.5)

Similarly,

P[C1|T] = (1/8)/(1/8 + 1/4) = 1/3.    (2.6)

As we would expect, we are more likely to have chosen coin 1 when the first flip is heads, but we are more likely to have chosen coin 2 when the first flip is tails.
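The answers of Example 2.2 can be cross-checked by enumerating the four leaves of its tree. A minimal sketch in plain Python (the leaf probabilities follow from P[G1] = P[R1] = 0.5 and the 0.8 same-color rule in the example):

```python
# Leaves of the Example 2.2 tree: (first light, second light) -> probability.
leaves = {
    ("G", "G"): 0.5 * 0.8,   # second light matches a green first light
    ("G", "R"): 0.5 * 0.2,
    ("R", "G"): 0.5 * 0.2,
    ("R", "R"): 0.5 * 0.8,
}

p_g2 = sum(p for (l1, l2), p in leaves.items() if l2 == "G")
p_w = sum(p for (l1, l2), p in leaves.items() if "R" in (l1, l2))
p_g1_given_r2 = leaves[("G", "R")] / sum(
    p for (l1, l2), p in leaves.items() if l2 == "R")

print(p_g2, p_w, p_g1_given_r2)
```

Each quantity comes straight from summing leaf probabilities, which is exactly how the tree diagram is used by hand.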
The next example is the "Monty Hall" game, a famous problem with a solution that many regard as counterintuitive. Tree diagrams provide a clear explanation of the answer.
Monty opens door 2 (event R2), you switch to door 3 and then Monty opens door 3 to reveal a goat (event G). On the other hand, if the car is behind door 2, Monty reveals the goat behind door 3 and you switch to door 2 and win the car. Similarly, if the car is behind door 3, Monty reveals the goat behind door 2 and you switch to door 3 and win the car. For always switch, we see that

P[C] = 2/3, where C is the event that you win the car.    (2.7)

If you do not switch, the tree is shown in Figure 2.1(b). In this tree, when the car is behind door 1 (event H1) and Monty opens door 2 (event R2), you stay with door 1 and then Monty opens door 1 to reveal the car. On the other hand, if the car is behind door 2, Monty will open door 3 to reveal the goat. Since your final choice was door 1, Monty opens door 1 to reveal the goat. For do not switch,

P[C] = 1/3.

Thus switching is better; if you don't switch, you win the car only if you initially guessed the location of the car correctly, an event that occurs with probability 1/3. If you switch, you win the car when your initial guess was wrong, an event with probability 2/3.

Note that the two trees look largely the same because the key step where you make a choice is somewhat hidden because it is implied by the first two branches followed in the tree.
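A short simulation supports the tree analysis. This is an illustrative sketch in plain Python (the door labels, seed, and trial count are arbitrary choices, not from the text):

```python
import random

random.seed(2023)  # arbitrary seed for reproducibility
trials = 100_000
switch_wins = 0
for _ in range(trials):
    car = random.randint(1, 3)     # door hiding the car
    guess = random.randint(1, 3)   # your initial choice
    # Monty opens a goat door you did not pick; switching then wins
    # exactly when your initial guess was wrong.
    if guess != car:
        switch_wins += 1

print(switch_wins / trials)   # close to 2/3
```

The simulation sidesteps the conditional-probability subtleties entirely: under the always-switch strategy, winning is equivalent to the initial guess being wrong.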
Quiz 2.1
In a cellular phone system, a mobile phone must be paged to receive a phone call. However, paging attempts don't always succeed because the mobile phone may not receive the paging signal clearly. Consequently, the system will page a phone up to three times before giving up. If the results of all paging attempts are independent and a single paging attempt succeeds with probability 0.8, sketch a probability tree for this experiment and find the probability P[F] that the phone receives the paging signal clearly.
[
Alt hougl1 rr1any practical experirnents are rnore complicated , the t ecl1r1iques for
det errnining the size of a sarnple sp ace all follo\v frorr1 the fundarnenta.l principle of
cot1r1t ing in Theorerr1 2 .1:
Example 2.6
There are two subexperiments. The first subexperiment is "Flip a coin and observe either heads H or tails T." The second subexperiment is "Roll a six-sided die and observe the number of spots." It has six outcomes, 1, 2, ..., 6. The experiment, "Flip a coin and roll a die," has 2 × 6 = 12 outcomes:

H1, H2, H3, H4, H5, H6, T1, T2, T3, T4, T5, T6.

(n)k = n(n − 1)(n − 2) ··· (n − k + 1).

Multiplying the right side by (n − k)!/(n − k)! yields our next theorem:

(n)k = n!/(n − k)!.

In contrast to this example with six outcomes, the next example shows that the k-permutation corresponding to an experiment in which the observation is the sequence of two letters has 4!/2! = 12 outcomes.
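The 4!/2! count can be checked directly by enumerating ordered two-letter sequences. A small sketch in plain Python (the four-letter alphabet A–D is an assumed stand-in for the example's letters):

```python
from itertools import permutations

# All ordered sequences of 2 distinct letters drawn from 4:
seqs = list(permutations("ABCD", 2))
print(len(seqs))   # 4!/2! = 12
```

itertools.permutations generates exactly the k-permutations counted by (n)k, so its output length matches the formula.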
(n choose k) = (n choose n − k).    (2.11)

The logic behind this identity is that choosing k out of n elements to be part of a subset is equivalent to choosing n − k elements to be excluded from the subset.

In most contexts, (n choose k) is undefined except for integers n and k with 0 ≤ k ≤ n. Here, we adopt the following definition that applies to all nonnegative integers n and all real numbers k:

    (n choose k) = n!/(k!(n − k)!)   if k is an integer with 0 ≤ k ≤ n,
                   0                 otherwise.
This definition captures the intuition that given, say, n = 33 objects, there are zero ways of choosing k = −5 objects, zero ways of choosing k = 8.7 objects, and zero ways of choosing k = 87 objects. Although this extended definition may seem unnecessary, and perhaps even silly, it will make many formulas in later chapters more concise and easier for students to grasp.
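The extended definition translates directly into code. A sketch in plain Python (the function name `choose` is ours, not the text's):

```python
import math

def choose(n, k):
    """Extended binomial coefficient: zero unless k is an
    integer with 0 <= k <= n."""
    if k != int(k) or k < 0 or k > n:
        return 0
    return math.comb(n, int(k))

print(choose(33, -5), choose(33, 8.7), choose(33, 87), choose(33, 2))
```

The three guard conditions mirror the "0 otherwise" clause of the definition, while the ordinary case falls through to math.comb.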
The number of combinations of seven cards chosen from a deck of 52 cards is

(52 choose 7) = 133,784,560.

Example 2.10
There are four queens in a deck of 52 cards. You are given seven cards at random from the deck. What is the probability that you have no queens?

Consider an experiment in which the procedure is to select seven cards at random from a set of 52 cards and the observation is to determine if there are one or more queens in the selection. The sample space contains n = (52 choose 7) possible combinations of seven cards, each with probability 1/n. There are n_NQ = (52 − 4 choose 7) = (48 choose 7) combinations with no queens. The probability of receiving no queens is the ratio of the number of outcomes with no queens to the number of outcomes in the sample space, n_NQ/n = 0.5504.

Another way of analyzing this experiment is to consider it as a sequence of seven subexperiments. The first subexperiment consists of selecting a card at random and observing whether it is a queen. If it is a queen, an outcome with probability 4/52 (because there are 52 outcomes in the sample space and four of them are in the event {queen}), stop looking for queens. Otherwise, with probability 48/52, select another card from the remaining 51 cards and observe whether it is a queen. This outcome of this subexperiment has probability 4/51. If the second card is not a queen, an outcome with probability 47/51, continue until you select a queen or you have seven cards with no queen. Using Qi and Ni to indicate a "Queen" or "No queen" on subexperiment i, the tree for this experiment is

[Tree diagram: branches Q1 (4/52) and N1 (48/52); from N1, branches Q2 (4/51) and N2 (47/51); and so on through Q7 and N7.]

The probability of the event N7 that no queen is received in your seven cards is the product of the probabilities of the branches leading to N7:

P[N7] = (48/52)(47/51)(46/50)(45/49)(44/48)(43/47)(42/46) = 0.5504.

The sample space contains 52^7 outcomes. There are 48^7 outcomes with no queens. The ratio is (48/52)^7 = 0.5710, the probability of receiving no queens. If this experiment is considered as a sequence of seven subexperiments, the tree looks the same as the tree in Example 2.10, except that all the horizontal branches have probability 48/52 and all the diagonal branches have probability 4/52.
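Both counting arguments, and the with-replacement variant, can be checked numerically. A sketch in plain Python:

```python
from math import comb

# Without replacement (Example 2.10): ratio of combinations ...
p_comb = comb(48, 7) / comb(52, 7)
# ... equals the product of the no-queen branch probabilities in the tree.
p_tree = 1.0
for i in range(7):
    p_tree *= (48 - i) / (52 - i)

# With replacement: seven independent draws from the full deck.
p_repl = (48 / 52) ** 7

print(round(p_comb, 4), round(p_repl, 4))
```

That the combinatorial ratio and the tree product agree to machine precision is a numerical restatement of the equivalence of the two analyses.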
Example 2.12
A laptop computer has USB slots A and B. Each slot can be used for connecting a memory card (m), a camera (c) or a printer (p). It is possible to connect two memory cards, two cameras, or two printers to the laptop. How many ways can we use the two USB slots?

This example corresponds to sampling two times with replacement from the set {m, c, p}. Let xy denote the outcome that device type x is used in slot A and device type y is used in slot B. The possible outcomes are S = {mm, mc, mp, cm, cc, cp, pm, pc, pp}. The sample space S contains nine outcomes.
The fact that Example 2.12 has nine possible outcomes should not be surprising. Since we were sampling with replacement, there were always three possible outcomes for each of the subexperiments to attach a device to a USB slot. Hence, by the fundamental theorem of counting, Example 2.12 must have 3 × 3 = 9 possible outcomes.

In Example 2.12, mc and cm are distinct outcomes. This result generalizes naturally when we want to choose with replacement a sample of n objects out of a collection of m distinguishable objects. The experiment consists of a sequence of n identical subexperiments with m outcomes in the sample space of each subexperiment. Hence there are m^n ways to choose with replacement a sample of n objects.
Example 2.14
The letters A through Z can produce 26^4 = 456,976 four-letter words.

Note that we can think of the observation sequence x1, ..., xn as the result of sampling with replacement n times from a sample space S_sub. For sequences of identical subexperiments, we can express Theorem 2.4 as
In Example 2.12 and Example 2.16, repeating a subexperiment n times and recording the observation consists of constructing a word with n letters. In general, n repetitions of the same subexperiment consists of choosing symbols from the alphabet {s0, ..., s_{m−1}}. In Example 2.15, m = 2 and we have a binary alphabet with symbols s0 = 0 and s1 = 1.

A more challenging problem than finding the number of possible combinations of n objects sampled with replacement from a set of m objects is to calculate the number of observation sequences such that each object appears a specified number of times. We start with the case in which each subexperiment is a trial with sample space S_sub = {0, 1} indicating failure or success.
The 10 five-letter words with 0 appearing twice and 1 appearing three times are:

11100, 11010, 11001, 10110, 10101, 10011, 01110, 01101, 01011, 00111.

Theorem 2.6
The number of observation sequences for n subexperiments with sample space S = {0, 1} with 0 appearing n0 times and 1 appearing n1 = n − n0 times is (n choose n1).
Theorem 2.6 can be generalized to subexperiments with m > 2 elements in the sample space. For n trials of a subexperiment with sample space S_sub = {s0, ..., s_{m−1}}, we want to find the number of outcomes in which s0 appears n0 times, s1 appears n1 times, and so on. Of course, there are no such outcomes unless n0 + ··· + n_{m−1} = n. The notation for the number of outcomes is

(n choose n0, ..., n_{m−1}).
Theorem 2.7
For n repetitions of a subexperiment with sample space S = {s0, ..., s_{m−1}}, the number of length n = n0 + ··· + n_{m−1} observation sequences with s_i appearing n_i times is

(n choose n0, ..., n_{m−1}).

Proof Let M = (n choose n0, ..., n_{m−1}). Start with n empty slots and perform the following sequence of subexperiments:

    Subexperiment    Procedure
    0                Label n0 slots as s0.
    1                Label n1 slots as s1.
    ⋮                ⋮
    m − 1            Label n_{m−1} slots as s_{m−1}.

There are (n choose n0) ways to perform subexperiment 0. After n0 slots have been labeled, there are (n − n0 choose n1) ways to perform subexperiment 1. After subexperiment j − 1, n0 + ··· + n_{j−1} slots have already been filled, leaving (n − (n0 + ··· + n_{j−1}) choose n_j) ways to perform subexperiment j. From the fundamental counting principle,

M = [n! / ((n − n0)! n0!)] · [(n − n0)! / ((n − n0 − n1)! n1!)] ··· [(n − n0 − ··· − n_{m−2})! / ((n − n0 − ··· − n_{m−1})! n_{m−1}!)].    (2.15)

Canceling the common factors (n − n0)!, (n − n0 − n1)!, ..., we obtain M = n! / (n0! n1! ··· n_{m−1}!).
Note that a binomial coefficient is the special case of the multinomial coefficient for an alphabet with m = 2 symbols. In particular, for n = n0 + n1,

(n choose n0, n1) = (n choose n0) = (n choose n1).    (2.16)

Lastly, in the same way that we extended the definition of the binomial coefficient, we will employ an extended definition for the multinomial coefficient:

    (n choose n0, ..., n_{m−1}) = n! / (n0! n1! ··· n_{m−1}!)   if n0 + ··· + n_{m−1} = n and each n_i is a nonnegative integer,
                                  0                             otherwise.
Example 2.18
In Example 2.16, the professor uses a curve in determining student grades. When there are ten students in a probability class, the professor always issues two grades of A, three grades of B, three grades of C and two grades of F. How many different ways can the professor assign grades to the ten students?

In Example 2.16, we determine that with four possible grades there are 4^10 = 1,048,576 ways of assigning grades to ten students. However, now we are limited to choosing n0 = 2 students to receive an A, n1 = 3 students to receive a B, n2 = 3 students to receive a C and n3 = 2 students to receive an F. The number of ways that fit the curve is the multinomial coefficient

(10 choose 2, 3, 3, 2) = 10! / (2! 3! 3! 2!) = 25,200.    (2.17)
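The multinomial coefficient in (2.17) is easy to verify. A sketch in plain Python implementing the extended definition (the helper name `multinomial` is ours):

```python
from math import factorial, prod

def multinomial(n, counts):
    """Extended multinomial coefficient: zero unless the counts are
    nonnegative integers summing to n."""
    if any(c != int(c) or c < 0 for c in counts) or sum(counts) != n:
        return 0
    return factorial(n) // prod(factorial(int(c)) for c in counts)

print(multinomial(10, (2, 3, 3, 2)))   # 25200
```

The guard clause reproduces the "0 otherwise" case, so the same function also handles count vectors that do not fit the curve.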
Quiz 2.2
Consider a binary code with 4 bits (0 or 1) in each code word. An example of a code word is 0110.
Example 2.19
What is the probability P[E2,3] of two failures and three successes in five independent trials with success probability p?

To find P[E2,3], we observe that the outcomes with three successes in five trials are 11100, 11010, 11001, 10110, 10101, 10011, 01110, 01101, 01011, and 00111. We note that the probability of each outcome is a product of five probabilities, each related to one subexperiment. In outcomes with three successes, three of the probabilities are p and the other two are 1 − p. Therefore each outcome with three successes has probability (1 − p)^2 p^3.

From Theorem 2.6, we know that the number of such sequences is (5 choose 3). To find P[E2,3], we add up the probabilities associated with the 10 outcomes with 3 successes, yielding

P[E2,3] = (5 choose 3) (1 − p)^2 p^3.    (2.18)
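Equation (2.18) can be confirmed by brute-force enumeration of all 2^5 outcome words. A sketch in plain Python with an arbitrary p = 0.6 (the text leaves p general):

```python
from itertools import product
from math import comb

p = 0.6  # arbitrary illustrative success probability

total = 0.0
for word in product("01", repeat=5):          # all 32 length-5 words
    if word.count("1") == 3:                  # three successes
        total += p ** 3 * (1 - p) ** 2        # each such word's probability

formula = comb(5, 3) * p ** 3 * (1 - p) ** 2  # Equation (2.18)
print(total, formula)
```

The loop adds identical terms, one per outcome word, which is exactly the counting argument behind the binomial formula.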
The second formula in this theorem is the result of multiplying the probability of n0 failures in n trials by the number of outcomes with n0 failures.
Example 2.20
In Example 1.19, we found that a randomly tested resistor was acceptable with probability P[A] = 0.78. If we randomly test 100 resistors, what is the probability of Ti, the event that i resistors test acceptable?

Testing each resistor is an independent trial with a success occurring when a resistor is acceptable. Thus for 0 ≤ i ≤ 100,

P[Ti] = (100 choose i) (0.78)^i (0.22)^(100−i).

We note that our intuition says that since 78% of the resistors are acceptable, then in testing 100 resistors, the number acceptable should be near 78. However, P[T78] ≈ 0.096, which is fairly small. This shows that although we might expect the number acceptable to be close to 78, that does not mean that the probability of exactly 78 acceptable is high.
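The value P[T78] ≈ 0.096 quoted above follows directly from the binomial formula. A one-line check in plain Python:

```python
from math import comb

# P[T78]: exactly 78 acceptable resistors out of 100 with P[A] = 0.78.
p78 = comb(100, 78) * 0.78 ** 78 * 0.22 ** 22
print(round(p78, 3))   # about 0.096
```

Even at the most likely count, the probability mass is spread over many nearby values of i, which is why the single-point probability is small.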
In this case, we have five trials corresponding to the five times the binary symbol is sent. On each trial, a success occurs when a binary symbol is received correctly. The probability of a success is p = 1 − q = 0.9. The error event E occurs when the number of successes is strictly less than three:

P[E] = P[E3,2] + P[E4,1] + P[E5,0].

By increasing the number of binary symbols per information bit from 1 to 5, the cellular phone reduces the probability of error by more than one order of magnitude, from 0.1 to 0.0081.
E_{n0,...,n_{m−1}} = {s0 occurs n0 times, ..., s_{m−1} occurs n_{m−1} times}    (2.22)

The probability of one particular sequence in which each s_i occurs n_i times is

p0^{n0} p1^{n1} ··· p_{m−1}^{n_{m−1}}.    (2.24)

Next, we observe that any other experimental outcome that is a reordering of the preceding sequence has the same probability because on each path through the tree to such an outcome there are n_i occurrences of s_i. As a result,

P[E_{n0,...,n_{m−1}}] = M p0^{n0} p1^{n1} ··· p_{m−1}^{n_{m−1}},    (2.25)

where M, the number of such outcomes, is the multinomial coefficient (n choose n0, ..., n_{m−1}) of Definition 2.2. Applying Theorem 2.7, we have the following theorem:

P[E_{n0,...,n_{m−1}}] = (n choose n0, ..., n_{m−1}) p0^{n0} ··· p_{m−1}^{n_{m−1}}.

P[E_{a,v,t}] = (100 choose a, v, t) (7/10)^a (2/10)^v (1/10)^t.    (2.26)
[
Kee p in mind t hat by the extended def inition of the m ultinom ial coefficient , P[Ea,v,t]
is nonzero on ly if a + v + t = 100 and a, v, and tare nonnegative integers.
Example 2.23
Continuing with Example 2.16, suppose in testing a microprocessor that all four grades have probability 0.25, independent of any other microprocessor. In testing n = 100 microprocessors, what is the probability of exactly 25 microprocessors of each grade?

Let E25,25,25,25 denote the probability of exactly 25 microprocessors of each grade. From Theorem 2.9,

P[E25,25,25,25] = (100 choose 25, 25, 25, 25) (0.25)^100 = 0.0010.    (2.27)
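Evaluating (2.27) exactly is straightforward with integer factorials. A sketch in plain Python:

```python
from math import factorial

# Multinomial coefficient (100 choose 25,25,25,25) times (0.25)^100.
m = factorial(100) // factorial(25) ** 4
p = m * 0.25 ** 100
print(round(p, 4))   # about 0.0010
```

Python's arbitrary-precision integers make the huge coefficient exact before the final floating-point product.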
(a) Let E_{k,100−k} denote the event that a received packet has k bits in error and 100 − k correctly decoded bits. What is P[E_{k,100−k}] for k = 0, 1, 2, 3?
(b) Let C denote the event that a packet is decoded correctly. What is P[C]?
[Figure 2.2: components W1, W2, W3 connected in series (left, "Components in Series") and in parallel (right, "Components in Parallel").]

process succeed with probability p independent of the success or failure of other components.

Let Wi denote the event that component i succeeds. As depicted in Figure 2.2, there are two basic types of operations.
Components in series. The operation succeeds if all of its components succeed. One example of such an operation is a sequence of computer programs in which each program after the first one uses the result of the previous program. Therefore, the complete operation fails if any component program fails. Whenever the operation consists of k components in series, we need all k components to succeed in order to have a successful operation. The probability that the operation succeeds is

P[W] = P[W1 ∩ W2 ∩ ··· ∩ Wk]    (2.28)
     = P[W1] P[W2] ··· P[Wk] = p^k.    (2.29)
[Figure 2.3: W1–W2 in series, in parallel with W3–W4 in series (left); the equivalent components W5 and W6 in parallel (right).]

Figure 2.3 The operation described in Example 2.24. On the left is the original operation. On the right is the equivalent operation with each pair of series components replaced with an equivalent component.
Example 2.24
An operation consists of two redundant parts. The first part has two components in series (W1 and W2) and the second part has two components in series (W3 and W4). All components succeed with probability p = 0.9. Draw a diagram of the operation and calculate the probability that the operation succeeds.

A diagram of the operation is shown in Figure 2.3. The combination of W1 and W2 in series produces an equivalent component, W5, with probability of success

p5 = p^2 = 0.81.    (2.34)

Similarly, the combination of W3 and W4 in series produces an equivalent component, W6, with probability of success p6 = p5 = 0.81. The entire operation then consists of W5 and W6 in parallel, which is also shown in Figure 2.3. The success probability of the operation is

P[W] = 1 − (1 − p5)(1 − p6) = 1 − (0.19)^2 = 0.9639.
Note that in Equation (2.29) we computed the probability of success of a process with components in series as the product of the success probabilities of the components. The reason is that for the process to be successful, all components must be successful. The event {all components successful} is the intersection of the individual success events, and the probability of the intersection of two events is the product of the two success probabilities. On the other hand, with components in parallel, the process is successful when one or more components is successful. The event {one or more components successful} is the union of the individual success events. Recall that the probability of the union of two events is the difference between the sum of the individual probabilities and the probability of their intersection. The formula for the probability of more than two events is even more complicated. On the other hand, with components in parallel, the process fails when all of the components fail. The event {all components fail} is the intersection of the individual failure events. Each failure probability is the difference between 1 and the success probability. Hence in Equation (2.30) and Example 2.24 we first compute the failure probability of a process with components in parallel.

In general, De Morgan's law (Theorem 1.1) allows us to express a union as the complement of an intersection and vice versa. Therefore, in many applications of probability, when it is difficult to calculate directly the probability we need, we can often calculate the probability of the complementary event and then subtract this probability from 1 to find the answer. This is how we calculated the probability of success of a process with components in parallel.
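The series and parallel rules reduce to two one-line functions. A sketch in plain Python, checked against Example 2.24 (the function names are ours):

```python
from math import prod

def series(probs):
    """All components must succeed: product of success probabilities."""
    return prod(probs)

def parallel(probs):
    """The operation fails only if every component fails."""
    return 1 - prod(1 - p for p in probs)

p5 = series([0.9, 0.9])        # W1, W2 in series -> 0.81
p6 = series([0.9, 0.9])        # W3, W4 in series -> 0.81
print(parallel([p5, p6]))      # about 0.9639
```

The parallel function is the De Morgan shortcut in code: it computes the complement's probability and subtracts it from 1.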
Quiz 2.4
A merriory rriodule consist s of nine chips. The device is designed vvit h redundan cy
so tliat it \vorks even if one of its chips is defective. Eacli chip contains n, t r ansistors
and functions properly only if all of its transistors V\rork. A trarisistor V\rorks \vitli
probabilit}' '[J independent of an}' other transistor.
2.5 MATLAB
y =
Columns 1 through 12
47 52 48 46 54 48 47 48 59 44 49 48
Columns 13 through 24
42 52 40 40 47 48 48 48 53 49 45 61
Columns 25 through 36
60 59 49 47 49 45 48 51 48 53 52 53
Columns 37 through 48
56 54 60 53 52 51 58 47 50 48 44 49
Columns 49 through 60
50 46 52 50 51 51 57 50 49 56 44 56
Figure 2.4 The simulation output of 60 repeated experiments of 100 coin flips.
>> X=rand(100,60)<0.5;    The MATLAB code for this task appears on the left. The
>> Y=sum(X,1)             100 × 60 matrix X has i,jth element X(i,j)=0 (tails)
or X(i,j)=1 (heads) to indicate the result of flip i of subexperiment j. Since Y sums X across the first dimension, Y(j) is the number of heads in the jth subexperiment. Each Y(j) is between 0 and 100 and generally in the neighborhood of 50. The output of a sample run is shown in Figure 2.4.
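For readers working outside MATLAB, the same experiment can be sketched in plain Python (an illustrative translation, not from the text; the seed is arbitrary):

```python
import random

random.seed(7)  # arbitrary seed for reproducibility

# 60 subexperiments, each flipping a fair coin 100 times;
# Y[j] counts the heads in subexperiment j.
Y = [sum(random.random() < 0.5 for _ in range(100)) for _ in range(60)]

print(min(Y), max(Y))   # each count is between 0 and 100, near 50
```

The comparison random.random() < 0.5 plays the role of MATLAB's rand(100,60)<0.5, and the inner sum plays the role of sum(X,1).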
Example 2.26
Simulate the testing of 100 microprocessors as described in Example 2.23. Your output should be a 4 × 1 vector X such that Xi is the number of grade i microprocessors.
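A plain-Python version of this simulation, for comparison (an illustrative sketch; Example 2.26 itself asks for MATLAB, and the seed is arbitrary):

```python
import random

random.seed(42)  # arbitrary seed

# Each of 100 microprocessors independently receives one of four
# equally likely grades, labeled 1 through 4.
grades = [random.randint(1, 4) for _ in range(100)]
X = [grades.count(g) for g in (1, 2, 3, 4)]

print(X, sum(X))   # four counts that total 100
```

Counting occurrences of each grade is the Python analogue of building the 4 × 1 output vector.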
Note that in MATLAB all variables are assumed to be matrices. In writing MATLAB code, X may be an n × m matrix, an n × 1 column vector, a 1 × m row vector, or a 1 × 1 scalar. In MATLAB, we write X(i,j) to index the i,jth element. By contrast, in this text, we vary the notation depending on whether we have a scalar X, or a vector or matrix X. In addition, we use Xi,j to denote the i,jth element. Thus, X and X (in a MATLAB code fragment) may both refer to the same variable.
Quiz 2.5
The flip of a thick coin yields heads with probability 0.4, tails with probability 0.5, or lands on its edge with probability 0.1. Simulate 100 thick coin flips. Your output should be a 3 × 1 vector X such that X1, X2, and X3 are the number of occurrences of heads, tails, and edge.
Problems
Difficulty: Easy Moderate D ifficu lt Experts Only
2.1.1 Suppose you flip a coin twice. On any flip, the coin comes up heads with probability 1/4. Use Hi and Ti to denote the result of flip i.
(a) What is the probability, P[H1|H2], that the first flip is heads given that the second flip is heads?
(b) What is the probability that the first flip is heads and the second flip is tails?

2.1.2 For Example 2.2, suppose P[G1] = 1/2, P[G2|G1] = 3/4, and P[G2|R1] = 1/4. Find P[G2], P[G2|G1], and P[G1|G2].

2.1.3 At the end of regulation time, a basketball team is trailing by one point and a player goes to the line for two free throws. If the player makes exactly one free throw, the game goes into overtime. The probability that the first free throw is good is 1/2. However, if the first attempt is good, the player relaxes and the second attempt is good with probability 3/4. However, if the player misses the first attempt, the added pressure reduces the success probability to 1/4. What is the probability that the game goes into overtime?

2.1.4 You have two biased coins. Coin A comes up heads with probability 1/4. Coin B comes up heads with probability 3/4. However, you are not sure which is which, so you choose a coin randomly and you flip it. If the flip is heads, you guess that the flipped coin is B; otherwise, you guess that the flipped coin is A. What is the probability P[C] that your guess is correct?

2.1.5 Suppose that for the general population, 1 in 5000 people carries the human immunodeficiency virus (HIV). A test for the presence of HIV yields either a positive (+) or negative (−) response. Suppose the test gives the correct answer 99% of the time. What is P[−|H], the conditional probability that a person tests negative given that the person does have the HIV virus? What is P[H|+], the conditional probability that a randomly chosen person has the HIV virus given that the person tests positive?

2.1.6 A machine produces photo detectors in pairs. Tests show that the first photo detector is acceptable with probability 3/5. When the first photo detector is acceptable, the second photo detector is acceptable with probability 4/5. If the first photo detector is defective, the second photo detector is acceptable with probability 2/5.
(a) Find the probability that exactly one photo detector of a pair is acceptable.
(b) Find the probability that both photo detectors in a pair are defective.

2.1.7 You have two biased coins. Coin A comes up heads with probability 1/4. Coin B comes up heads with probability 3/4. However, you are not sure which is which so you flip each coin once, choosing the first coin randomly. Use Hi and Ti to denote the result of flip i. Let A1 be the event that coin A was flipped first. Let B1 be the event that coin B was flipped first. What is P[H1H2]?
Are H1 and H2 independent? Explain your answer.

2.1.8 A particular birth defect of the heart is rare; a newborn infant will have the defect D with probability P[D] = 10^-4. In the general exam of a newborn, a particular heart arrhythmia A occurs with probability 0.99 in infants with the defect. However, the arrhythmia also appears with probability 0.1 in infants without the defect. When the arrhythmia is present, a lab test for the defect is performed. The result of the lab test is either positive (event T+) or negative (event T−). In a newborn with the defect, the lab test is positive with probability p = 0.999 independent from test to test. In a newborn without the defect, the lab test is negative with probability p = 0.999. If the arrhythmia is present and the test is positive, then heart surgery (event H) is performed.
(a) Given the arrhythmia A is present, what is the probability the infant has the defect D?
(b) Given that an infant has the defect, what is the probability P[H|D] that heart surgery is performed?
(c) Given that the infant does not have the defect, what is the probability q = P[H|Dc] that an unnecessary heart surgery is performed?
(d) Find the probability P[H] that an infant has heart surgery performed for the arrhythmia.
(e) Given that heart surgery is performed, what is the probability that the newborn does not have the defect?

2.1.9 Suppose Dagwood (Blondie's husband) wants to eat a sandwich but needs to go on a diet. Dagwood decides to let the flip of a coin determine whether he eats. Using an unbiased coin, Dagwood will postpone the diet (and go directly to the refrigerator) if either (a) he flips heads on his first flip or (b) he flips tails on the first flip but then proceeds to get two heads out of the next three flips. Note that the first flip is not counted in the attempt to win two of three and that Dagwood never performs any unnecessary flips. Let Hi be the event that Dagwood flips heads on try i. Let Ti be the event that tails occurs on flip i.
(a) Draw the tree for this experiment. Label the probabilities of all outcomes.
(b) What are P[H3] and P[T3]?
(c) Let D be the event that Dagwood must diet. What is P[D]? What is P[H1|D]?
(d) Are H3 and H2 independent events?

2.1.10 The quality of each pair of photo detectors produced by the machine in Problem 2.1.6 is independent of the quality of every other pair of detectors.
(a) What is the probability of finding no good detectors in a collection of n pairs produced by the machine?
(b) How many pairs of detectors must the machine produce to reach a probability of 0.99 that there will be at least one acceptable photo detector?

2.1.11 In Steven Strogatz's New York Times blog http://opinionator.blogs.nytimes.com/2010/04/25/chancesare/?ref=opinion, the following problem was posed to highlight the confusing character of conditional probabilities.

Before going on vacation for a week, you ask your spacey friend to water your ailing plant. Without water, the plant has a 90 percent chance of dying. Even with proper watering, it has a 20 percent chance of dying. And the probability that your friend will forget to water it is 30 percent. (a) What's the chance that your plant will survive the week? (b) If it's dead when you return, what's the chance that your friend forgot to water it? (c) If your friend forgot to water it, what's the chance it'll be dead when you return?

Solve parts (a), (b) and (c) of this problem.

2.1.12 Each time a fisherman casts his line, a fish is caught with probability p, independent of whether a fish is caught on any other cast of the line. The fisherman will fish all day until a fish is caught and
[
PROBLEMS 59
then he will quit and go home. Let Ci denote the event that on cast i the fisherman catches a fish. Draw the tree for this experiment and find P[C1], P[C2], and P[Cn] as functions of p.

2.2.1 On each turn of the knob, a gumball machine is equally likely to dispense a red, yellow, green or blue gumball, independent from turn to turn. After eight turns, what is the probability P[R2Y2G2B2] that you have received 2 red, 2 yellow, 2 green and 2 blue gumballs?

2.2.2 A Starburst candy package contains 12 individual candy pieces. Each piece is equally likely to be berry, orange, lemon, or cherry, independent of all other pieces.

(a) What is the probability that a Starburst package has only berry or cherry pieces and zero orange or lemon pieces?

(b) What is the probability that a Starburst package has no cherry pieces?

(c) What is the probability P[F1] that all twelve pieces of your Starburst are the same flavor?

2.2.3 Your Starburst candy has 12 pieces, three pieces of each of four flavors: berry, lemon, orange, and cherry, arranged in a random order in the pack. You draw the first three pieces from the pack.

(a) What is the probability they are all the same flavor?

(b) What is the probability they are all different flavors?

2.2.4 Your Starburst candy has 12 pieces, three pieces of each of four flavors: berry, lemon, orange, and cherry, arranged in a random order in the pack. You draw the first four pieces from the pack.

(a) What is the probability P[F1] they are all the same flavor?

(b) What is the probability P[F4] they are all different flavors?

(c) What is the probability P[F2] that your Starburst has exactly two pieces of each of two different flavors?

2.2.5 In a game of rummy, you are dealt a seven-card hand.

(a) What is the probability P[R7] that your hand has only red cards?

(b) What is the probability P[F] that your hand has only face cards?

(c) What is the probability P[R7F] that your hand has only red face cards? (The face cards are jack, queen, and king.)

2.2.6 In a game of poker, you are dealt a five-card hand.

(a) What is the probability P[R5] that your hand has only red cards?

(b) What is the probability of a "full house" with three-of-a-kind and two-of-a-kind?

2.2.7 Consider a binary code with 5 bits (0 or 1) in each code word. An example of a code word is 01010. How many different code words are there? How many code words have exactly three 0's?

2.2.8 Consider a language containing four letters: A, B, C, D. How many three-letter words can you form in this language? How many four-letter words can you form if each letter appears only once in each word?

2.2.9 On an American League baseball team with 15 field players and 10 pitchers, the manager selects a starting lineup with 8 field players, 1 pitcher, and 1 designated hitter. The lineup specifies the players for these positions and the positions in a batting order for the 8 field players and designated hitter. If the designated hitter must be chosen among all the field players, how many possible starting lineups are there?

2.2.10 Suppose that in Problem 2.2.9, the designated hitter can be chosen from among all the players. How many possible starting lineups are there?

2.2.11 At a casino, the only game is numberless roulette. On a spin of the wheel, the ball lands in a space with color red (r), green (g), or black (b). The wheel has 19 red spaces, 19 green spaces and 2 black spaces.
(a) In 40 spins of the wheel, find the probability of the event

A = {19 reds, 19 greens, and 2 blacks}.

(b) In 40 spins of the wheel, find the probability of G19 = {19 greens}.

(c) The only bets allowed are red and green. Given that you randomly choose to bet red or green, what is the probability p that your bet is a winner?

2.2.12 A basketball team has three pure centers, four pure forwards, four pure guards, and one swingman who can play either guard or forward. A pure position player can play only the designated position. If the coach must start a lineup with one center, two forwards, and two guards, how many possible lineups can the coach choose?

2.2.13 An instant lottery ticket consists of a collection of boxes covered with gray wax. For a subset of the boxes, the gray wax hides a special mark. If a player scratches off the correct number of the marked boxes (and no boxes without the mark), then that ticket is a winner. Design an instant lottery game in which a player scratches five boxes and the probability that a ticket is a winner is approximately 0.01.

2.3.1 Consider a binary code with 5 bits (0 or 1) in each code word. An example of a code word is 01010. In each code word, a bit is a zero with probability 0.8, independent of any other bit.

(a) What is the probability of the code word 00111?

(b) What is the probability that a code word contains exactly three ones?

2.3.2 The Boston Celtics have won 16 NBA championships over approximately 50 years. Thus it may seem reasonable to assume that in a given year the Celtics win the title with probability p = 16/50 = 0.32, independent of any other year. Given such a model, what would be the probability of the Celtics winning eight straight championships beginning in 1959? Also, what would be the probability of the Celtics winning the title in 10 out of 11 years, starting in 1959? Given your answers, do you trust this simple probability model?

2.3.3 Suppose each day that you drive to work a traffic light that you encounter is either green with probability 7/16, red with probability 7/16, or yellow with probability 1/8, independent of the status of the light on any other day. If over the course of five days, G, Y, and R denote the number of times the light is found to be green, yellow, or red, respectively, what is the probability P[G = 2, Y = 1, R = 2]? Also, what is the probability P[G = R]?

2.3.4 In a game between two equal teams, the home team wins with probability p > 1/2. In a best-of-three playoff series, a team with the home advantage has a game at home, followed by a game away, followed by a home game if necessary. The series is over as soon as one team wins two games. What is P[H], the probability that the team with the home advantage wins the series? Is the home advantage increased by playing a three-game series rather than a one-game playoff? That is, is it true that P[H] > p for all p > 1/2?

2.3.5 A collection of field goal kickers are divided into groups 1 and 2. Group i has 3i kickers. On any kick, a kicker from group i will kick a field goal with probability 1/(i + 1), independent of the outcome of any other kicks.

(a) A kicker is selected at random from among all the kickers and attempts one field goal. Let K be the event that a field goal is kicked. Find P[K].

(b) Two kickers are selected at random. Kj is the event that kicker j kicks a field goal. Are K1 and K2 independent?

(c) A kicker is selected at random and attempts 10 field goals. Let M be the number of misses. Find P[M = 5].
2.4.1 A particular operation has six components. Each component has a failure probability q, independent of any other component. A successful operation requires both of the following conditions:

Components 1, 2, and 3 all work, or component 4 works.

Component 5 or component 6 works.

Draw a block diagram for this operation similar to those of Figure 2.2 on page 53. Derive a formula for the probability P[W] that the operation is successful.

2.4.2 We wish to modify the cellular telephone coding system in Example 2.21 in order to reduce the number of errors. In particular, if there are two or three zeroes in the received sequence of 5 bits, we will say that a deletion (event D) occurs. Otherwise, if at least 4 zeroes are received, the receiver decides a zero was sent, or if at least 4 ones are received, the receiver decides a one was sent. We say that an error occurs if i was sent and the receiver decides j ≠ i was sent. For this modified protocol, what is the probability P[E] of an error? What is the probability P[D] of a deletion?

2.4.3 Suppose a 10-digit phone number is transmitted by a cellular phone using four binary symbols for each digit, using the model of binary symbol errors and deletions given in Problem 2.4.2. Let C denote the number of bits sent correctly, D the number of deletions, and E the number of errors. Find P[C = c, D = d, E = e] for all c, d, and e.

2.4.4 Consider the device in Problem 2.4.1. Suppose we can replace any one component with an ultrareliable component that has a failure probability of q/2 = 0.05. Which component should we replace?

2.5.1 Build a MATLAB simulation of 50 trials of the experiment of Example 2.3. Your output should be a pair of 50 x 1 vectors C and H. For the ith trial, Hi will record whether it was heads (Hi = 1) or tails (Hi = 0), and Ci in {1, 2} will record which coin was picked.

2.5.2 Following Quiz 2.3, suppose the communication link has different error probabilities for transmitting 0 and 1. When a 1 is sent, it is received as a 0 with probability 0.01. When a 0 is sent, it is received as a 1 with probability 0.03. Each bit in a packet is still equally likely to be a 0 or 1. Packets have been coded such that if five or fewer bits are received in error, then the packet can be decoded. Simulate the transmission of 100 packets, each containing 100 bits. Count the number of packets decoded correctly.

2.5.3 For a failure probability q = 0.2, simulate 100 trials of the six-component test of Problem 2.4.1. How many devices were found to work? Perform 10 repetitions of the 100 trials. What do you learn from 10 repetitions of 100 trials compared to a simulated experiment with 100 trials?

2.5.4 Write a MATLAB function

N=countequal(G,T)

that duplicates the action of hist(G,T) in Example 2.26. Hint: Use ndgrid.

2.5.5 In this problem, we use a MATLAB simulation to "solve" Problem 2.4.4. Recall that a particular operation has six components. Each component has a failure probability q independent of any other component. The operation is successful if both

Components 1, 2, and 3 all work, or component 4 works.

Component 5 or component 6 works.

With q = 0.2, simulate the replacement of a component with an ultrareliable component. For each replacement of a regular component, perform 100 trials. Are 100 trials sufficient to decide which component should be replaced?
3.1 Definitions

A random variable assigns numbers to outcomes in the sample space of an experiment.

Example 3.1

The experiment is to attach a photodetector to an optical fiber and count the number of photons arriving in a one-microsecond time interval. Each observation
Example 3.2

The experiment is to test six integrated circuits and after each test observe whether the circuit is accepted (a) or rejected (r). Each observation is a sequence of six letters where each letter is either a or r. For example, s8 = aaraaa. The sample space S consists of the 64 possible sequences. A random variable related to this experiment is N, the number of accepted circuits. For outcome s8, N = 5 circuits are accepted. The range of N is SN = {0, 1, ..., 6}.
Example 3.3

In Example 3.2, the net revenue R obtained for a batch of six integrated circuits is $5 for each circuit accepted minus $7 for each circuit rejected. (This is because for each bad circuit that goes out of the factory, it will cost the company $7 to deal with the customer's complaint and supply a good replacement circuit.) When N circuits are accepted, 6 - N circuits are rejected, so that the net revenue R is related to N by the function

R = 5N - 7(6 - N) = 12N - 42 dollars.

The revenue associated with s8 = aaraaa and all other outcomes for which N = 5 is

R = 12(5) - 42 = 18 dollars.
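The revenue rule above is just a function of N. As a quick illustrative check (in Python rather than the MATLAB the text uses elsewhere; the helper name is ours, not the book's):

```python
# Revenue rule of Example 3.3: $5 per accepted circuit, -$7 per rejected
# circuit, for a batch of 6 circuits, so R = 5*N - 7*(6 - N) = 12*N - 42.
def revenue(n_accepted, total=6):
    return 5 * n_accepted - 7 * (total - n_accepted)

print(revenue(5))  # outcome aaraaa has N = 5, so R = 18 dollars
```

Note that R can be negative: a batch with no accepted circuits loses $42.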
If we have a probability model for the integrated circuit experiment in Example 3.2, we can use that probability model to obtain a probability model for the random variable. The remainder of this chapter will develop methods to characterize probability models for random variables. We observe that in the preceding examples, the value of a random variable can always be derived from the outcome of the underlying experiment. This is not a coincidence. The formal definition of a random variable reflects this fact.
This definition acknowledges that a random variable is the result of an underlying experiment, but it also permits us to separate the experiment, in particular the observations, from the process of assigning numbers to outcomes. As we saw in Example 3.1, the assignment may be implicit in the definition of the experiment, or it may require further analysis.

In some definitions of experiments, the procedures contain variable parameters. In these experiments, there can be values of the parameters for which it is impossible to perform the observations specified in the experiments. In these cases, the experiments do not produce random variables. We refer to experiments with parameter settings that do not produce random variables as improper experiments.
Example 3.4

The procedure of an experiment is to fire a rocket in a vertical direction from Earth's surface with initial velocity V km/h. The observation is T seconds, the time elapsed until the rocket returns to Earth. Under what conditions is the experiment improper?

{X = x} = {s in S | X(s) = x}.      (3.4)
sample space. In this chapter, we study the properties of discrete random variables. Chapter 4 covers continuous random variables.

The defining characteristic of a discrete random variable is that the set of possible values can (in principle) be listed, even though the list may be infinitely long. Often, but not always, a discrete random variable takes on integer values. An exception is the random variable related to your probability grade. The experiment is to take this course and observe your grade. At Rutgers, the sample space is

S = {F, D, C, C+, B, B+, A}.

We use a function G1(·) to map this sample space into a random variable. For example, G1(A) = 4 and G1(F) = 0. The table

Outcomes  F  D  C  C+   B  B+   A
G1        0  1  2  2.5  3  3.5  4
Quiz 3.1

A student takes two courses. In each course, the student will earn either a B or a C. To calculate a grade point average (GPA), a B is worth 3 points and a C is worth 2 points. The student's GPA G2 is the sum of the points earned for each course divided by 2. Make a table of the sample space of the experiment and the corresponding values of the GPA, G2.
Recall that the probability model of a discrete random variable assigns a number between 0 and 1 to each outcome in a sample space. When we have a discrete random variable X, we express the probability model as a probability mass function (PMF) PX(x). The argument of a PMF ranges over all real numbers.

PX(x) = P[X = x]

Note that X = x is an event consisting of all outcomes s of the underlying experiment for which X(s) = x. On the other hand, PX(x) is a function ranging over all real numbers x. For any value of x, the function PX(x) is the probability of the event X = x.
Observe our notation for a random variable and its PMF. We use an uppercase letter (X in the preceding definition) for the name of a random variable. We usually use the corresponding lowercase letter (x) to denote a possible value of the random variable. The notation for the PMF is the letter P with a subscript indicating the name of the random variable. Thus PR(r) is the notation for the PMF of random variable R. In these examples, r and x are dummy variables. The same random variables and PMFs could be denoted PR(u) and PX(u) or, indeed, PR(·) and PX(·).

We derive the PMF from the sample space, the probability model, and the rule that maps outcomes to values of the random variable. We then graph a PMF by marking on the horizontal axis each value with nonzero probability and drawing a vertical bar with length proportional to the probability.
Example 3.5

When the basketball player Wilt Chamberlain shot two free throws, each shot was equally likely either to be good (g) or bad (b). Each shot that was good was worth 1 point. What is the PMF of X, the number of points that he scored?

There are four outcomes of this experiment: gg, gb, bg, and bb. A simple tree diagram indicates that each outcome has probability 1/4. The sample space and probabilities of the experiment and the corresponding values of X are given in the table:

Outcomes  bb   bg   gb   gg
P[·]      1/4  1/4  1/4  1/4
X         0    1    1    2

The random variable X has three possible values corresponding to three events:

{X = 0} = {bb},   {X = 1} = {bg, gb},   {X = 2} = {gg}.

Since each outcome has probability 1/4, these three events have probabilities

P[X = 0] = 1/4,   P[X = 1] = 1/2,   P[X = 2] = 1/4.      (3.7)
We can express the probabilities of these events in terms of the probability mass function

PX(x) = 1/4   x = 0,
        1/2   x = 1,      (3.8)
        1/4   x = 2,
        0     otherwise.

It is often useful or convenient to depict PX(x) in two other display formats: as a bar plot or as a table.

x      0    1    2
PX(x)  1/4  1/2  1/4

Each PMF display format has its uses. The function definition (3.8) is best when PX(x) is given in terms of algebraic functions of x for various subsets of SX. The bar plot is best for visualizing the probability masses. The table can be a convenient compact representation when the PMF is a long list of sample values and corresponding probabilities.
No matter how the PX(x) is formatted, the PMF of X states the value of PX(x) for every real number x. The first three lines of Equation (3.8) give the function for the values of X associated with nonzero probabilities: x = 0, x = 1 and x = 2. The final line is necessary to specify the function at all other numbers. Although it may look silly to see "PX(x) = 0 otherwise" included in most formulas for a PMF, it is an essential part of the PMF. It is helpful to keep this part of the definition in mind when working with the PMF. However, in the bar plot and table representations of the PMF, it is understood that PX(x) is zero except for those values x explicitly shown.

The PMF contains all of our information about the random variable X. Because PX(x) is the probability of the event {X = x}, PX(x) has a number of important properties. The following theorem applies the three axioms of probability to discrete random variables.
(a) For any x, PX(x) >= 0.

(b) The sum of PX(x) over all x in SX equals 1.

(c) For any event B contained in SX, the probability that X is in the set B is

P[B] = sum over x in B of PX(x).

Proof All three properties are consequences of the axioms of probability (Section 1.3). First, PX(x) >= 0 since PX(x) = P[X = x]. Next, we observe that every outcome s in S is associated with a number x in SX. Therefore, P[x in SX] = sum over x in SX of PX(x) = P[s in S] = P[S] = 1. Since the events {X = x} and {X = y} are mutually exclusive when x ≠ y, B can be written as the union of mutually exclusive events B = union over x in B of {X = x}. Thus we can use Axiom 3 (if B is countably infinite) or Theorem 1.3 (if B is finite) to write

P[B] = sum over x in B of P[X = x] = sum over x in B of PX(x).      (3.9)
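These three properties are easy to check numerically. As an illustrative sketch (in Python rather than the text's MATLAB), take the PMF of Example 3.5:

```python
# PMF of Example 3.5: P_X(0) = 1/4, P_X(1) = 1/2, P_X(2) = 1/4.
pmf = {0: 0.25, 1: 0.5, 2: 0.25}

nonneg = all(p >= 0 for p in pmf.values())      # property (a): P_X(x) >= 0
total = sum(pmf.values())                       # property (b): sums to 1
p_b = sum(p for x, p in pmf.items() if x >= 1)  # property (c) with B = {X >= 1}
print(nonneg, total, p_b)
```

Here the event B = {X >= 1} gets probability 1/2 + 1/4 = 3/4, the sum of the masses of the outcomes it contains.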
Quiz 3.2

The random variable N has PMF

PN(n) = c/n   n = 1, 2, 3,
        0     otherwise.      (3.10)

Find

(a) The value of the constant c
(b) P[N = 1]
(c) P[N >= 2]
(d) P[N > 3]
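For part (a), the key fact is property (b) of the PMF: the masses must sum to 1. A small Python sketch of that normalization step (our working, not the book's printed solution):

```python
from fractions import Fraction

# c/1 + c/2 + c/3 = 1, so c = 1 / (1 + 1/2 + 1/3)
c = 1 / sum(Fraction(1, n) for n in (1, 2, 3))
print(c)  # 6/11
```

Exact rational arithmetic avoids any floating-point round-off in the check that the three masses sum to 1.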
Thus far in our discussion of random variables we have described how each random variable is related to the outcomes of an experiment. We have also introduced the probability mass function, which contains the probability model of the experiment. In practical applications, certain families of random variables appear over and over again in many experiments. In each family, the probability mass functions of all the random variables have the same mathematical form. They differ only in the values of one or two parameters. This enables us to study in advance each family of random variables and later apply the knowledge we gain to specific practical applications. In this section, we define six families of discrete random variables. There is one formula for the PMF of all the random variables in a family. Depending on the family, the PMF formula contains one or two parameters. By assigning numerical values to the parameters, we obtain a specific random variable. Our nomenclature for a family consists of the family name followed by one or two parameters in parentheses. For example, binomial (n, p) refers in general to the family of binomial random variables. Binomial (7, 0.1) refers to the binomial random variable with parameters n = 7 and p = 0.1. Appendix A summarizes important properties of 17 families of random variables.
PX(x) = 1/2   x = 0,
        1/2   x = 1,      (3.11)
        0     otherwise.

Because all three experiments lead to the same probability mass function, they can all be analyzed the same way. The PMF in Example 3.6 is a member of the family of Bernoulli random variables.

PX(x) = 1 - p   x = 0,
        p       x = 1,
        0       otherwise,

where the parameter p is in the range 0 < p < 1.
Example 3.7

Test one circuit and observe X, the number of rejects. What is PX(x), the PMF of random variable X?

Because there are only two outcomes in the sample space, X = 1 with probability p and X = 0 with probability 1 - p,

PX(x) = 1 - p   x = 0,
        p       x = 1,      (3.12)
        0       otherwise.

Therefore, the number of circuits rejected in one test is a Bernoulli (p) random variable.
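The Bernoulli PMF, including its essential "0 otherwise" line, is a one-liner to encode. A Python sketch (the text's own examples use MATLAB):

```python
def bernoulli_pmf(x, p):
    """PMF of a Bernoulli (p) random variable, 0 < p < 1."""
    if x == 0:
        return 1 - p
    if x == 1:
        return p
    return 0.0  # the "0 otherwise" line is part of the PMF

print(bernoulli_pmf(0, 0.2), bernoulli_pmf(1, 0.2), bernoulli_pmf(2, 0.2))
```

Evaluating the function at an argument outside {0, 1}, such as x = 2, correctly returns zero.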
Example 3.8

If there is a 0.2 probability of a reject, the PMF of the Bernoulli (0.2) random variable is

PX(x) = 0.8   x = 0,
        0.2   x = 1,      (3.13)
        0     otherwise.
Example 3.9

In a sequence of independent tests of integrated circuits, each circuit is rejected with probability p. Let Y equal the number of tests up to and including the first test that results in a reject. What is the PMF of Y?

The procedure is to keep testing circuits until a reject appears. Using a to denote an accepted circuit and r to denote a reject, the tree branches at each test: r (probability p) ends the experiment with Y equal to the test number, while a (probability 1 - p) continues to the next test.

From the tree, we see that P[Y = 1] = p, P[Y = 2] = p(1 - p), P[Y = 3] = p(1 - p)^2, and, in general, P[Y = y] = p(1 - p)^(y-1). Therefore,

PY(y) = p(1 - p)^(y-1)   y = 1, 2, ...
        0                otherwise.      (3.14)

Y is referred to as a geometric random variable because the probabilities in the PMF constitute a geometric series.

In general, the number of Bernoulli trials that take place until the first observation of one of the two outcomes is a geometric random variable.
PX(x) = p(1 - p)^(x-1)   x = 1, 2, ...
        0                otherwise,

where the parameter p is in the range 0 < p < 1.
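A simulation makes the geometric PMF concrete: repeat Bernoulli trials until the first success and tally how long it took. A Python sketch (function and variable names are ours; the text's simulation exercises use MATLAB):

```python
import random

def trials_until_first_reject(p, rng):
    """Simulate Bernoulli (p) trials; return the number of the first success."""
    y = 1
    while rng.random() >= p:  # a failure occurs with probability 1 - p
        y += 1
    return y

rng = random.Random(2023)  # fixed seed so the sketch is reproducible
p, n = 0.2, 100_000
counts = {}
for _ in range(n):
    y = trials_until_first_reject(p, rng)
    counts[y] = counts.get(y, 0) + 1

# Empirical frequencies should track P_Y(y) = p * (1 - p)**(y - 1)
for y in (1, 2, 3):
    print(y, counts[y] / n, p * (1 - p) ** (y - 1))
```

With 100,000 samples the empirical frequencies typically land within a few tenths of a percent of the theoretical values 0.2, 0.16, and 0.128.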
Example 3.10

If there is a 0.2 probability of a reject, the PMF of the geometric (0.2) random variable is

PY(y) = (0.2)(0.8)^(y-1)   y = 1, 2, ...
        0                  otherwise.
Example 3.11

In a sequence of n independent tests of integrated circuits, each circuit is rejected with probability p. Let K equal the number of rejects in the n tests. Find the PMF PK(k).

Adopting the vocabulary of Section 2.3, we call each discovery of a defective circuit a success, and each test is an independent trial with success probability p. The event K = k corresponds to k successes in n trials. We refer to Theorem 2.8 to determine that the PMF of K is

PK(k) = (n choose k) p^k (1 - p)^(n-k).      (3.15)

Whenever we have a sequence of n independent Bernoulli trials each with success probability p, the number of successes is a binomial random variable. Note that a Bernoulli random variable is a binomial random variable with n = 1.
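Both facts in the preceding paragraph are easy to verify numerically. A Python sketch of the binomial PMF (the text's computational examples are in MATLAB):

```python
from math import comb

def binomial_pmf(k, n, p):
    """Binomial (n, p) PMF; zero outside k = 0, 1, ..., n."""
    if 0 <= k <= n:
        return comb(n, k) * p**k * (1 - p) ** (n - k)
    return 0.0

# A Bernoulli random variable is the binomial with n = 1:
print(binomial_pmf(0, 1, 0.2), binomial_pmf(1, 1, 0.2))
# And, like every PMF, it sums to 1 over its range:
print(sum(binomial_pmf(k, 10, 0.2) for k in range(11)))
```

The n = 1 case reproduces the Bernoulli (0.2) masses 0.8 and 0.2 from Example 3.8.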
Example 3.12

If there is a 0.2 probability of a reject and we perform 10 tests, the PMF of the binomial (10, 0.2) random variable is

PK(k) = (10 choose k) (0.2)^k (0.8)^(10-k)   k = 0, 1, ..., 10,
        0                                    otherwise.      (3.16)
Example 3.13

Perform independent tests of integrated circuits in which each circuit is rejected with probability p. Observe L, the number of tests performed until there are k rejects. What is the PMF of L?

For large values of k, it is not practical to draw the tree. In this case, L = l if and only if there are k - 1 successes in the first l - 1 trials and there is a success on trial l, so that

P[L = l] = P[A B],

where A = {k - 1 rejects in the first l - 1 trials} and B = {a reject on trial l}. The events A and B are independent since the outcome of attempt l is not affected by the previous l - 1 attempts. Note that P[A] is the binomial probability of k - 1 successes (i.e., rejects) in l - 1 trials so that

P[A] = (l - 1 choose k - 1) p^(k-1) (1 - p)^(l-k).

Since P[B] = p, the PMF of L is

PL(l) = (l - 1 choose k - 1) p^k (1 - p)^(l-k)   l = k, k + 1, ...
        0                                        otherwise.

In general, the number of Bernoulli trials that take place until one of the two outcomes is observed k times is a Pascal random variable. For a Pascal (k, p)
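The claim that Pascal (1, p) is the geometric (p) random variable follows directly from the PMF formula, and a Python sketch confirms it (our encoding, not the text's MATLAB):

```python
from math import comb

def pascal_pmf(x, k, p):
    """Pascal (k, p) PMF: x trials to get k successes; comb() is 0 for x < k."""
    if x < 1:
        return 0.0
    return comb(x - 1, k - 1) * p**k * (1 - p) ** (x - k)

# Pascal (1, p) reduces to the geometric (p) PMF p * (1 - p)**(x - 1):
p = 0.2
print([pascal_pmf(x, 1, p) for x in (1, 2, 3)])
print(pascal_pmf(3, 4, p))  # zero: cannot see 4 rejects in only 3 tests
```

The binomial coefficient (x - 1 choose k - 1) vanishing for x < k is exactly why the definition need not state separately where the PMF is zero.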
Example 3.14

If there is a 0.2 probability of a reject and we seek four defective circuits, the random variable L is the number of tests necessary to find the four circuits. The PMF of the Pascal (4, 0.2) random variable is

PL(l) = (l - 1 choose 3) (0.2)^4 (0.8)^(l-4)   l = 4, 5, ...
        0                                      otherwise.
To describe this discrete uniform random variable, we use the expression "X is uniformly distributed between k and l."

Example 3.16

Roll a fair die. The random variable N is the number of spots on the side facing up. Therefore, N is a discrete uniform (1, 6) random variable with PMF

PN(n) = 1/6   n = 1, 2, 3, 4, 5, 6,
        0     otherwise.      (3.21)
The probability model of a Poisson random variable describes phenomena that occur randomly in time. While the time of each occurrence is completely random, there is a known average number of occurrences per unit time. The Poisson model is used widely in many fields. For example, the arrival of information requests at a World Wide Web server, the initiation of telephone calls, and the emission of particles from a radioactive source are often modeled as Poisson random variables. We will return to Poisson random variables many times in this text. At this point, we consider only the basic properties.

PX(x) = a^x e^(-a) / x!   x = 0, 1, 2, ...,
        0                 otherwise,

To describe a Poisson random variable, we will call the occurrence of the phenomenon of interest an arrival. A Poisson model often specifies an average rate, λ arrivals per second, and a time interval, T seconds. In this time interval, the number of arrivals X has a Poisson PMF with a = λT.
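The rule a = λT is the only bookkeeping needed to apply the Poisson PMF. A Python sketch of the formula with the numbers of the website example below (the text's code examples are MATLAB):

```python
from math import exp, factorial

def poisson_pmf(x, alpha):
    """Poisson (alpha) PMF: alpha**x * exp(-alpha) / x! for x = 0, 1, 2, ..."""
    if x < 0:
        return 0.0
    return alpha**x * exp(-alpha) / factorial(x)

# With lam = 2 arrivals per second observed for T = 0.25 seconds, a = lam*T = 0.5
lam, T = 2.0, 0.25
p_no_arrivals = poisson_pmf(0, lam * T)
print(p_no_arrivals)
```

The probability of zero arrivals is e^(-0.5), roughly 0.607, and the masses sum to 1 over x = 0, 1, 2, ... as every PMF must.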
Example 3.17

The number of hits at a website in any time interval is a Poisson random variable. A particular site has on average λ = 2 hits per second. What is the probability that there are no hits in an interval of 0.25 seconds? What is the probability that there are no more than two hits in an interval of one second?

In an interval of 0.25 seconds, the number of hits H is a Poisson random variable with a = λT = (2 hits/s) x (0.25 s) = 0.5 hits. The PMF of H is

PH(h) = (0.5)^h e^(-0.5) / h!   h = 0, 1, 2, ...
        0                       otherwise.

The probability of no hits is

P[H = 0] = PH(0) = e^(-0.5) = 0.607.

In an interval of 1 second, the number of hits J is a Poisson random variable with a = λT = 2 hits and PMF

PJ(j) = 2^j e^(-2) / j!   j = 0, 1, 2, ...
        0                 otherwise.

To find the probability of no more than two hits, we note that

P[J <= 2] = PJ(0) + PJ(1) + PJ(2) = e^(-2)(1 + 2 + 2) = 5e^(-2) = 0.677.
The PMF of K is

PK(k) = 5^k e^(-5) / k!   k = 0, 1, 2, ...
        0                 otherwise.

Therefore, P[K = 0] = PK(0) = e^(-5) = 0.0067. To answer the question about the 2-second interval, we note in the problem definition that a = 5 queries = λT with T = 10 seconds. Therefore, λ = 0.5 queries per second. If N is the number of queries processed in a 2-second interval, a = 2λ = 1 and N is the Poisson (1) random variable with PMF

PN(n) = e^(-1) / n!   n = 0, 1, 2, ...
        0             otherwise.      (3.25)

Therefore,
Note that the units of λ and T have to be consistent. Instead of λ = 0.5 queries per second for T = 10 seconds, we could use λ = 30 queries per minute for the time interval T = 1/6 minutes to obtain the same a = 5 queries, and therefore the same probability model.

In the following examples, we see that for a fixed rate λ, the shape of the Poisson PMF depends on the duration T over which arrivals are counted.
Example 3.19

Calls arrive at random times at a telephone switching office with an average of λ = 0.25 calls/second. The PMF of the number of calls that arrive in a T = 2-second interval is the Poisson (0.5) random variable with PMF

PJ(j) = (0.5)^j e^(-0.5) / j!   j = 0, 1, ...,
        0                       otherwise.

Note that we obtain the same PMF if we define the arrival rate as λ = 60 x 0.25 = 15 calls per minute and derive the PMF of the number of calls that arrive in T = 2/60 = 1/30 minutes.
Example 3.20

Calls arrive at random times at a telephone switching office with an average of λ = 0.25 calls per second. The PMF of the number of calls that arrive in any T = 20-second interval is the Poisson (5) random variable with PMF

PJ(j) = 5^j e^(-5) / j!   j = 0, 1, ...,
        0                 otherwise.
Quiz 3.3

Each time a modem transmits one bit, the receiving modem analyzes the signal that arrives and decides whether the transmitted bit is 0 or 1. It makes an error with probability p, independent of whether any other bit is received correctly.

(a) If the transmission continues until the receiving modem makes its first error, what is the PMF of X, the number of bits transmitted?

(b) If p = 0.1, what is the probability that X = 10? What is the probability that X > 10?

(c) If the modem transmits 100 bits, what is the PMF of Y, the number of errors?

(d) If p = 0.01 and the modem transmits 100 bits, what is the probability of Y = 2 errors at the receiver? What is the probability that Y < 2?

(e) If the transmission continues until the receiving modem makes three errors, what is the PMF of Z, the number of bits transmitted?

(f) If p = 0.25, what is the probability of Z = 12 bits transmitted until the modem makes three errors?
Like the PMF, the CDF of random variable X expresses the probability model of an experiment as a mathematical function. The function is the probability P[X <= x] for every number x.

The PMF and CDF are closely related. Each can be obtained easily from the other. For any real number x, the CDF is the probability that the random variable X is no larger than x. All random variables have cumulative distribution functions, but only discrete random variables have probability mass functions. The notation convention for the CDF follows that of the PMF, except that we use the letter F with a subscript corresponding to the name of the random variable. Because FX(x) describes the probability of an event, the CDF has a number of properties.
(d) FX(x) = FX(xi) for all x such that xi <= x < xi+1.

Each property of Theorem 3.2 has an equivalent statement in words:

(a) Going from left to right on the x-axis, FX(x) starts at zero and ends at one.

(b) The CDF never decreases as it goes from left to right.

(c) For a discrete random variable X, there is a jump (discontinuity) at each value of xi in SX. The height of the jump at xi is PX(xi).

(d) Between jumps, the graph of the CDF of the discrete random variable X is a horizontal line.
Another important consequence of the definition of the CDF is that the difference between the CDF evaluated at two points is the probability that the random variable takes on a value between these two points:
Theorem 3.3
For all b ≥ a,

Fx(b) − Fx(a) = P[a < X ≤ b].

Proof Define the events Ea = {X ≤ a}, Eb = {X ≤ b}, and Eab = {a < X ≤ b}, so that Eb = Ea ∪ Eab. Note also that Ea and Eab are mutually exclusive, so that P[Eb] = P[Ea] + P[Eab]. Since P[Eb] = Fx(b) and P[Ea] = Fx(a), we can write Fx(b) = Fx(a) + P[a < X ≤ b]. Therefore, P[a < X ≤ b] = Fx(b) − Fx(a).
In working with the CDF, it is necessary to pay careful attention to the nature of inequalities, strict (<) or loose (≤). The definition of the CDF contains a loose (less than or equal to) inequality, which means that the function is continuous from the right. To sketch a CDF of a discrete random variable, we draw a graph with the vertical value beginning at zero at the left end of the horizontal axis (negative numbers with large magnitude). It remains zero until x1, the first value of x with nonzero probability. The graph jumps by an amount Px(xi) at each xi with nonzero probability. We draw the graph of the CDF as a staircase with jumps at each xi with nonzero probability. The CDF is the upper value of every jump in the staircase.
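The staircase construction described above can be sketched in Python (a stand-in for the book's MATLAB workflow; the helper name `discrete_cdf` is mine):

```python
# Sketch: build the CDF of a discrete random variable from its PMF by
# accumulating the probability of every sample value at or below x.
def discrete_cdf(pmf, x):
    """pmf maps sample values to probabilities; returns P[X <= x]."""
    return sum(p for xi, p in pmf.items() if xi <= x)

pmf = {0: 0.25, 1: 0.5, 2: 0.25}   # the PMF of Example 3.21
print(discrete_cdf(pmf, -1))  # 0     (left of the first jump)
print(discrete_cdf(pmf, 0))   # 0.25  (upper value of the jump at x = 0)
print(discrete_cdf(pmf, 1.5)) # 0.75  (flat between jumps)
print(discrete_cdf(pmf, 3))   # 1.0
```

Evaluating at a non-sample point such as x = 1.5 returns the height of the staircase between jumps, exactly as the sketch rules require.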
Example 3.21
In Example 3.5, random variable X has PMF

Px(x) = 1/4, x = 0;
        1/2, x = 1;                                   (3.28)
        1/4, x = 2;
        0,   otherwise.

Find and sketch the CDF of random variable X.
Referring to the PMF Px(x), we derive the CDF of random variable X:

Fx(x) = P[X ≤ x] = 0,   x < 0;
                   1/4, 0 ≤ x < 1;
                   3/4, 1 ≤ x < 2;
                   1,   x ≥ 2.

(The accompanying sketches show the PMF as a bar plot and the CDF as a staircase over −1 ≤ x ≤ 3.)
Keep in mind that at the discontinuities x = 0, x = 1 and x = 2, the values of Fx(x) are the upper values: Fx(0) = 1/4, Fx(1) = 3/4 and Fx(2) = 1. Math texts call this the right-hand limit of Fx(x).
Consider any finite random variable X with all elements of Sx between xmin and xmax. For this random variable, the numerical specification of the CDF begins with

Fx(x) = 0, x < xmin,

and ends with

Fx(x) = 1, x ≥ xmax.

Like the statement "Px(x) = 0 otherwise," the description of the CDF is incomplete without these two statements. The next example displays the CDF of an infinite discrete random variable.
Example 3.22
In Example 3.9, let the probability that a circuit is rejected equal p = 1/4. The PMF of Y, the number of tests up to and including the first reject, is the geometric (1/4) random variable with PMF

Py(y) = (1/4)(3/4)^(y−1), y = 1, 2, ...;
        0,                otherwise.

What is the CDF of Y?
Random variable Y has nonzero probabilities for all positive integers. For any integer n ≥ 1, the CDF is

Fy(n) = Σ_{j=1}^{n} Py(j) = Σ_{j=1}^{n} (1/4)(3/4)^(j−1).   (3.30)

Equation (3.30) is a geometric series. Familiarity with the geometric series is essential for calculating probabilities involving geometric random variables. Appendix B summarizes the most important facts. In particular, Math Fact B.4 implies (1 − x) Σ_{j=1}^{n} x^(j−1) = 1 − x^n. Substituting x = 3/4, we obtain

Fy(n) = 1 − (3/4)^n.

The complete expression for the CDF of Y must show Fy(y) for all integer and noninteger values of y. For an integer-valued random variable Y, we can do this in a simple way using the floor function ⌊y⌋, which is the largest integer less than or equal to y. In particular, if n ≤ y < n + 1 for some integer n, then ⌊y⌋ = n and Fy(y) = Fy(⌊y⌋) = 1 − (3/4)^⌊y⌋ for y ≥ 1. With Theorem 3.3,

P[3 < Y ≤ 8] = Fy(8) − Fy(3) = (3/4)^3 − (3/4)^8 = 0.322.   (3.34)
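The floor-function form of the geometric CDF can be checked numerically; this Python sketch (helper name `geometric_cdf` is mine, not the text's) reproduces the 0.322 result:

```python
import math

# CDF of the geometric (1/4) random variable of Example 3.22:
# F_Y(y) = 1 - (3/4)**floor(y) for y >= 1, and 0 otherwise.
def geometric_cdf(y, p=0.25):
    if y < 1:
        return 0.0
    return 1 - (1 - p) ** math.floor(y)

prob = geometric_cdf(8) - geometric_cdf(3)   # P[3 < Y <= 8]
print(round(prob, 3))  # 0.322, matching Equation (3.34)
```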
Quiz 3.4
The random variable Y has the staircase CDF Fy(y) sketched here (vertical axis from 0 to 1 in steps of 0.2, horizontal axis 0 ≤ y ≤ 5). Use the CDF Fy(y) to find the following probabilities:
(a) P[Y < 1]    (b) P[Y ≤ 1]
(c) P[Y > 2]    (d) P[Y ≥ 2]
(e) P[Y = 1]    (f) P[Y = 3]
Example 3.23 and the preceding comments on averages apply to a set of numbers observed in a practical situation. The probability models of random variables characterize experiments with numerical outcomes, and in practical applications of probability, we assume that the probability models are related to the numbers observed in practice. Just as a statistic describes a set of numbers observed in practice, a parameter describes a probability model. Each parameter is a number that can be computed from the PMF or CDF of a random variable. When we use a probability model of a random variable to represent an application that results in a set of numbers, the expected value of the random variable corresponds to the mean value of the set of numbers. Expected values appear throughout the remainder of this textbook. Two notations for the expected value of random variable X are E[X] and μX.
Corresponding to the other two averages, we have the following definitions:
Neither the mode nor the median of a random variable X is necessarily unique. There are random variables that have several modes or medians.
Definition 3.13
The expected value of X is

E[X] = μX = Σ_{x∈Sx} x Px(x).
Expectation is a synonym for expected value. Sometimes the term mean value is also used as a synonym for expected value. We prefer to use mean value to refer to a statistic of a set of experimental data (the sum divided by the number of data items) to distinguish it from expected value, which is a parameter of a probability model. If you recall your studies of mechanics, the form of Definition 3.13 may look familiar. Think of point masses on a line with a mass of Px(x) kilograms at a distance of x meters from the origin. In this model, μX in Definition 3.13 is the center of mass. This is why Px(x) is called a probability mass function.
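Definition 3.13 is a single weighted sum, which a few lines of Python make concrete (a sketch, not the book's code):

```python
# Expected value as the center of mass of the PMF of Example 3.5:
# point masses of 1/4, 1/2, 1/4 kg at distances 0, 1, 2 meters.
pmf = {0: 0.25, 1: 0.5, 2: 0.25}
expected = sum(x * p for x, p in pmf.items())
print(expected)  # 1.0, the balance point of the three masses
```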
Example 3.24
Random variable X in Example 3.5 has PMF

Px(x) = 1/4, x = 0;
        1/2, x = 1;                                   (3.36)
        1/4, x = 2;
        0,   otherwise.

What is E[X]?
From Definition 3.13,

E[X] = 0 · Px(0) + 1 · Px(1) + 2 · Px(2) = 0(1/4) + 1(1/2) + 2(1/4) = 1.
Each x(i) takes values in the set Sx. Out of the n trials, assume that each x ∈ Sx occurs Nx times. Then the sum (3.38) becomes

m_n = (1/n) Σ_{x∈Sx} Nx x = Σ_{x∈Sx} x (Nx/n).   (3.39)

Recall our intuition that the relative frequency of x approaches Px(x) as n grows:

Px(x) = lim_{n→∞} Nx/n.   (3.41)

Combining these observations,

lim_{n→∞} m_n = Σ_{x∈Sx} x ( lim_{n→∞} Nx/n ) = Σ_{x∈Sx} x Px(x) = E[X].   (3.42)
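The convergence in Equation (3.42) is easy to see by simulation; this Python sketch (a stand-in for a MATLAB run, with my own sampler choice) draws many samples from the PMF of Example 3.24 and compares the sample mean with E[X] = 1:

```python
import random

# Simulate n trials of X and compare the sample mean m_n with E[X] = 1.
random.seed(1)  # fixed seed so the run is repeatable
values, probs = [0, 1, 2], [0.25, 0.5, 0.25]
samples = random.choices(values, weights=probs, k=100_000)
m_n = sum(samples) / len(samples)
print(abs(m_n - 1.0) < 0.02)  # the sample mean is close to E[X]
```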
Theorem 3.4
The Bernoulli (p) random variable X has expected value E[X] = p.
Theorem 3.5
The geometric (p) random variable X has expected value E[X] = 1/p.
Proof Let q = 1 − p. The geometric (p) random variable has PMF

Px(x) = p q^(x−1), x = 1, 2, ...;
        0,         otherwise.   (3.43)

The expected value is the infinite sum

E[X] = Σ_{x=1}^{∞} x Px(x) = Σ_{x=1}^{∞} x p q^(x−1).   (3.44)

Applying the identity Σ_{x=1}^{∞} x q^x = q/(1 − q)², we have

E[X] = p Σ_{x=1}^{∞} x q^(x−1) = (p/q) Σ_{x=1}^{∞} x q^x = (p/q) · q/(1 − q)² = p/p² = 1/p.   (3.45)
This result is intuitive if you recall the integrated circuit testing experiments and consider some numerical values. If the probability of rejecting an integrated circuit is p = 1/5, then on average, you have to perform E[Y] = 1/p = 5 tests until you observe the first reject. If p = 1/10, the average number of tests until the first reject is E[Y] = 1/p = 10.
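The intuition can be checked by simulating the testing experiment directly; this Python sketch (helper name `tests_until_reject` is mine) averages many runs with p = 1/5:

```python
import random

# Simulate testing circuits until the first reject, with rejection
# probability p = 1/5; the average run length should be near 1/p = 5.
random.seed(2)
p = 0.2

def tests_until_reject():
    n = 1
    while random.random() > p:   # circuit accepted with probability 1 - p
        n += 1
    return n

avg = sum(tests_until_reject() for _ in range(50_000)) / 50_000
print(abs(avg - 1 / p) < 0.2)  # sample mean close to E[Y] = 5
```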
Theorem 3.6
The Poisson (α) random variable in Definition 3.9 has expected value E[X] = α.
Proof

E[X] = Σ_{x=0}^{∞} x Px(x) = Σ_{x=0}^{∞} x (α^x/x!) e^(−α).   (3.46)

We observe that x/x! = 1/(x − 1)! and also that the x = 0 term in the sum is zero. In addition, we substitute α^x = α · α^(x−1) to factor α from the sum to obtain

E[X] = α Σ_{x=1}^{∞} (α^(x−1)/(x − 1)!) e^(−α) = α.   (3.47)

We can conclude that the sum in this formula equals 1 either by referring to the identity e^α = Σ_{l=0}^{∞} α^l/l! or by applying Theorem 3.1(b) to the fact that the sum is the sum of the PMF of a Poisson random variable L over all values in SL and P[SL] = 1.
Theorem 3.7
(a) For the binomial (n, p) random variable X,
E[X] = np.
(b) For the Pascal (k, p) random variable X,
E[X] = k/p.
(c) For the discrete uniform (k, l) random variable X of Definition 3.8,
E[X] = (k + l)/2.
Theorem 3.8
Perform n Bernoulli trials. In each trial, let the probability of success be α/n, where α > 0 is a constant and n > α. Let the random variable Kn be the number of successes in the n trials. As n → ∞, P_Kn(k) converges to the PMF of a Poisson (α) random variable.
Proof We first note that Kn is the binomial (n, α/n) random variable with PMF

P_Kn(k) = (n choose k) (α/n)^k (1 − α/n)^(n−k)
        = [n(n−1)···(n−k+1)/n^k] · (α^k/k!) · (1 − α/n)^(n−k).   (3.50)

Notice that in the first fraction, there are k terms in the numerator. The denominator is n^k, also a product of k terms, all equal to n. Therefore, we can express this fraction as the product of k fractions, each of the form (n − j)/n. As n → ∞, each of these fractions approaches 1. Hence,

lim_{n→∞} n(n−1)···(n−k+1)/n^k = 1.   (3.51)

Furthermore, from calculus,

lim_{n→∞} (1 − α/n)^(n−k) = e^(−α).   (3.52)

It follows that

lim_{n→∞} P_Kn(k) = α^k e^(−α)/k!, k = 0, 1, ...;
                    0,             otherwise,   (3.53)

which is the Poisson (α) PMF.
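The convergence in Theorem 3.8 is visible numerically; this Python sketch (my own check, using α = 2 and k = 3 as illustrative values) shows the gap between the binomial (n, α/n) and Poisson (α) PMFs shrinking as n grows:

```python
import math

alpha, k = 2.0, 3

def binom_pmf(n, p, k):
    # PMF of a binomial (n, p) random variable evaluated at k
    return math.comb(n, k) * p**k * (1 - p) ** (n - k)

# Poisson (alpha) PMF at k, the limiting value from Theorem 3.8
poisson_k = alpha**k * math.exp(-alpha) / math.factorial(k)
gaps = [abs(binom_pmf(n, alpha / n, k) - poisson_k) for n in (10, 100, 10_000)]
print(gaps)  # each gap is smaller than the last
```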
Example 3.25
A parcel shipping company offers a charging plan: $1.00 for the first pound, $0.90 for the second pound, etc., down to $0.60 for the fifth pound, with rounding up for a fraction of a pound. For all packages between 6 and 10 pounds, the shipper will charge $5.00 per package. (It will not accept shipments over 10 pounds.) Find a function Y = g(X) for the charge in cents for sending one package.
When the package weight is an integer X ∈ {1, 2, ..., 10} that specifies the number of pounds with rounding up for a fraction of a pound, the function is

Y = g(X) = 105X − 5X², x = 1, 2, 3, 4, 5;
           500,         x = 6, 7, 8, 9, 10.   (3.54)
In this section we determine the probability model of a derived random variable from the probability model of the original random variable. We start with Px(x) and a function Y = g(X). We use this information to obtain Py(y).
Before we present the procedure for obtaining Py(y), we alert students to the different nature of the functions Px(x) and g(x). Although they are both functions with the argument x, they are entirely different. Px(x) describes the probability model of a random variable. It has the special structure prescribed in Theorem 3.1. On the other hand, g(x) can be any function at all. When we combine Px(x) and g(x) to derive the probability model for Y, we arrive at a PMF that also conforms to Theorem 3.1.
To describe Y in terms of our basic model of probability, we specify an experiment consisting of the following procedure and observation: perform an experiment and observe the sample value x of X; then calculate the sample value y = g(x). If g(x) is one-to-one, each y ∈ Sy is the image of a unique x ∈ Sx, and

Py(g(x)) = Px(x).   (3.55)

The situation is a little more complicated when g(x) transforms several values of x to the same y. For each y ∈ Sy, we add the probabilities of all of the values x ∈ Sx for which g(x) = y. Theorem 3.9 applies in general. It reduces to Equation (3.55) when g(x) is a one-to-one transformation.
Theorem 3.9
For a discrete random variable X, the PMF of Y = g(X) is

Py(y) = Σ_{x: g(x)=y} Px(x).

If we view X = x as the outcome of an experiment, then Theorem 3.9 says that Py(y) is the sum of the probabilities of all the outcomes X = x for which Y = y.
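Theorem 3.9 is a few lines of code; this Python sketch (helper name `derived_pmf` and the weight PMF are my illustrative choices, with g taken from Example 3.25) accumulates Px(x) over every x with g(x) = y:

```python
from collections import defaultdict

# P_Y(y) = sum of P_X(x) over all x with g(x) = y (Theorem 3.9).
def derived_pmf(pmf_x, g):
    pmf_y = defaultdict(float)
    for x, p in pmf_x.items():
        pmf_y[g(x)] += p
    return dict(pmf_y)

# Illustrative weight PMF: 0.15 for x = 1..4, 0.10 for x = 5..8,
# with the shipping-charge function g of Example 3.25.
pmf_x = {x: (0.15 if x <= 4 else 0.10) for x in range(1, 9)}
g = lambda x: 105 * x - 5 * x**2 if x <= 5 else 500
pmf_y = derived_pmf(pmf_x, g)
print(round(pmf_y[500], 2))  # 0.3: the probabilities of x = 6, 7, 8 combine
```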
Figure 3.1 (referenced below): a tree in which each branch X = x leads to the charge Y = g(x). Branches X = 1, ..., 4, each with probability 0.15, lead to Y = 100, 190, 270, 340; the branch X = 5, with probability 0.10, leads to Y = 400; and branches X = 6, 7, 8, each with probability 0.10, all lead to Y = 500.
Example 3.26
Suppose all packages in Example 3.25 weigh 1, 2, 3, or 4 pounds with equal probability, so that

Px(x) = 1/4, x = 1, 2, 3, 4;
        0,   otherwise.   (3.56)

The charge for a shipment, Y, has range Sy = {100, 190, 270, 340} corresponding to Sx = {1, ..., 4}. The experiment can be described by a tree in which each equally likely branch X = x leads to the unique charge Y = g(x). Here each value of Y derives from a unique value of X. Hence, we can use Equation (3.55) to find Py(y).
Example 3.27
Suppose the probability model for the weight in pounds X of a package in Example 3.25 is

Px(x) = 0.15, x = 1, 2, 3, 4;
        0.1,  x = 5, 6, 7, 8;
        0,    otherwise.

For the pricing plan given in Example 3.25, what is the PMF and expected value of Y, the cost of shipping a package?
Now we have three values of X, specifically (6, 7, 8), transformed by g(·) into Y = 500.
For this situation we need the more general view of the PMF of Y, given by Theorem 3.9. In particular, y6 = 500, and we have to add the probabilities of the outcomes X = 6, X = 7, and X = 8 to find Py(500). That is,

Py(500) = Px(6) + Px(7) + Px(8) = 0.30.   (3.57)

The steps in the procedure are illustrated in the diagram of Figure 3.1. Applying Theorem 3.9, we have

Py(y) = 0.15, y = 100, 190, 270, 340;
        0.10, y = 400;
        0.30, y = 500;
        0,    otherwise.

For this probability model, the expected cost of shipping a package is

E[Y] = 0.15(100 + 190 + 270 + 340) + 0.10(400) + 0.30(500) = 325 cents.
Example 3.28
The amplitude V (volts) of a transmitted signal has PMF

Pv(v) = 1/7, v = −3, −2, ..., 3;
        0,   otherwise.

Let Y = V²/2 watts denote the power of the transmitted signal. Find Py(y).
Each positive value of Y results from two values of V (for example, y = 0.5 from v = ±1), while y = 0 results only from v = 0. Applying Theorem 3.9,

Py(y) = 1/7, y = 0;
        2/7, y = 0.5, 2, 4.5;   (3.58)
        0,   otherwise.
Quiz 3.6
Monitor three customers purchasing smartphones at the Phonesmart store and observe whether each buys an Apricot phone for $450 or a Banana phone for $300. The random variable N is the number of customers purchasing an Apricot phone. Assume N has PMF

PN(n) = 0.4, n = 0;
        0.2, n = 1, 2, 3;   (3.59)
        0,   otherwise.
If Y = g(X), E[Y] can be calculated from Px(x) and g(X) without deriving Py(y).
We encounter many situations in which we need to know only the expected value of a derived random variable rather than the entire probability model. Fortunately, to obtain this average, it is not necessary to compute the PMF or CDF of the new random variable. Instead, we can use the following property of expected values.
Theorem 3.10
For a random variable Y = g(X),

E[Y] = Σ_{x∈Sx} g(x) Px(x).

Proof From the definition of E[Y] and Theorem 3.9, we can write

E[Y] = Σ_{y∈Sy} y Py(y) = Σ_{y∈Sy} y Σ_{x: g(x)=y} Px(x) = Σ_{y∈Sy} Σ_{x: g(x)=y} g(x) Px(x),

where the last double summation follows because g(x) = y for each x in the inner sum. Since g(x) transforms each possible outcome x ∈ Sx to a value y ∈ Sy, the preceding double summation can be written as a single sum over all possible values x ∈ Sx. That is,

E[Y] = Σ_{x∈Sx} g(x) Px(x).
Example 3.29
In Example 3.26,

E[Y] = Σ_{x=1}^{4} g(x) Px(x) = (1/4)(100 + 190 + 270 + 340) = 225 cents.

This of course is the same answer obtained in Example 3.26 by first calculating Py(y) and then applying Definition 3.13. As an exercise, you might want to compute E[Y] in Example 3.27 directly from Theorem 3.10.
From this theorem we can derive some important properties of expected values. The first one has to do with the difference between a random variable and its expected value. When students learn their own grades on a midterm exam, they are quick to ask about the class average. Let's say one student has 73 and the class average is 80. She may be inclined to think of her grade as "seven points below average," rather than "73." In terms of a probability model, we would say that the random variable X points on the midterm has been transformed to the random variable Y = X − μX.
Theorem 3.11

E[X − μX] = 0.

Proof Applying Theorem 3.10 with g(X) = X − μX,

E[X − μX] = Σ_{x∈Sx} (x − μX) Px(x) = Σ_{x∈Sx} x Px(x) − μX Σ_{x∈Sx} Px(x).

The first term on the right side is μX by definition. In the second term, Σ_{x∈Sx} Px(x) = 1, so both terms on the right side are μX and the difference is zero.
Another property of the expected value of a function of a random variable applies to linear transformations.¹
¹We call the transformation aX + b linear although, strictly speaking, it should be called affine.
Theorem 3.12

E[aX + b] = aE[X] + b.

This follows directly from Definition 3.13 and Theorem 3.10. A linear transformation is essentially a scale change of a quantity, like a transformation from inches to centimeters or from degrees Fahrenheit to degrees Celsius. If we express the data (random variable X) in new units, the new average is just the old average transformed to the new units. (If the professor adds five points to everyone's grade, the average goes up by five points.)
This is a rare example of a situation in which E[g(X)] = g(E[X]). It is tempting, but usually wrong, to apply it to other transformations. For example, if Y = X², it is usually the case that E[Y] ≠ (E[X])². Expressing this in general terms, it is usually the case that E[g(X)] ≠ g(E[X]).
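A quick numeric illustration of this warning (a Python sketch of my own, using the PMF of Example 3.24 with g(X) = X²):

```python
# For g(X) = X**2, E[g(X)] and g(E[X]) differ, so the identity
# E[g(X)] = g(E[X]) holds only for linear g.
pmf = {0: 0.25, 1: 0.5, 2: 0.25}
e_x = sum(x * p for x, p in pmf.items())       # E[X] = 1
e_x2 = sum(x**2 * p for x, p in pmf.items())   # E[X^2] = 1.5
print(e_x2, e_x**2)  # 1.5 and 1.0: not equal
```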
Example 3.30
Recall from Examples 3.5 and 3.24 that X has PMF

Px(x) = 1/4, x = 0;
        1/2, x = 1;   (3.66)
        1/4, x = 2;
        0,   otherwise.

What is the expected value of V = g(X) = 4X + 7?
From Theorem 3.12,

E[V] = 4E[X] + 7 = 4(1) + 7 = 11.
Example 3.31
Continuing Example 3.30, let W = h(X) = X². What is E[W]?
From Theorem 3.10,

E[W] = Σ_x h(x) Px(x) = (1/4)0² + (1/2)1² + (1/4)2² = 1.5.   (3.69)
σX = √Var[X].
It is useful to take the square root of Var[X] because σX has the same units (for example, exam points) as X. The units of the variance are squares of the units of the random variable (exam points squared). Thus σX can be compared directly with the expected value. Informally, we think of sample values within σX of the expected value, x ∈ [μX − σX, μX + σX], as "typical" values of X and other values as "unusual." In many applications, about 2/3 of the observations of a random variable are within one standard deviation of the expected value. Thus if the standard deviation of exam scores is 12 points, the student with a score of +7 with respect to the mean can think of herself in the middle of the class. If the standard deviation is 3 points, she is likely to be near the top.
The variance is also useful when you guess or predict the value of a random variable X. Suppose you are asked to make a prediction x̂ before you perform an experiment and observe a sample value of X. The prediction x̂ is also called a blind estimate of X since your prediction is an estimate of X without the benefit of any observation. Since you would like the prediction error X − x̂ to be small, a popular approach is to choose x̂ to minimize the expected square error

e = E[(X − x̂)²].   (3.71)

Another name for e is the mean square error or MSE. With knowledge of the PMF Px(x), we can choose x̂ to minimize the MSE.
Theorem 3.13
In the absence of observations, the minimum mean square error estimate of random variable X is

x̂ = E[X].

Proof We expand the square in Equation (3.71) to write

e = E[X²] − 2x̂ E[X] + x̂².

To minimize e, we solve

de/dx̂ = −2E[X] + 2x̂ = 0,   (3.73)

yielding x̂ = E[X].
Therefore, E[X] is a best estimate of X and Var[X] is the MSE associated with this best estimate.
Because (X − μX)² is a function of X, Var[X] can be computed according to Theorem 3.10:

Var[X] = E[(X − μX)²] = Σ_{x∈Sx} (x − μX)² Px(x).

By expanding the square in this formula, we arrive at the most useful approach to computing the variance.
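A brute-force check of Theorem 3.13 (a Python sketch, not from the text): scanning a grid of candidate predictions for the PMF of Example 3.24 shows the MSE is smallest at x̂ = E[X], where it equals Var[X].

```python
pmf = {0: 0.25, 1: 0.5, 2: 0.25}   # E[X] = 1

def mse(x_hat):
    # mean square error e = E[(X - x_hat)**2] from Equation (3.71)
    return sum((x - x_hat) ** 2 * p for x, p in pmf.items())

candidates = [i / 100 for i in range(201)]  # 0.00, 0.01, ..., 2.00
best = min(candidates, key=mse)
print(best, mse(best))  # 1.0 0.5: the expected value, and Var[X] as the MSE
```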
Theorem 3.14

Var[X] = E[X²] − μX² = E[X²] − (E[X])².

Proof Expanding the square in the definition of Var[X] and applying Theorem 3.10,

Var[X] = Σ_{x∈Sx} x² Px(x) − 2μX Σ_{x∈Sx} x Px(x) + μX² Σ_{x∈Sx} Px(x)
       = E[X²] − 2μX · μX + μX² = E[X²] − μX².   (3.76)

Thus, E[X] is the first moment of random variable X. Similarly, E[X²] is the second moment. Theorem 3.14 says that the variance of X is the second moment of X minus the square of the first moment.
Like the PMF and the CDF of a random variable, the set of moments of X is a complete probability model. We learn in Section 9.2 that the model based on moments can be expressed as a moment generating function.
Example 3.32
Continuing Examples 3.5, 3.24, and 3.30, we recall that X has PMF

Px(x) = 1/4, x = 0;
        1/2, x = 1;   (3.77)
        1/4, x = 2;
        0,   otherwise,

and expected value E[X] = 1. What is the variance of X?
In order of increasing simplicity, we present three ways to compute Var[X].
From Definition 3.15, define W = (X − μX)² = (X − 1)². The PMF of W is

Pw(w) = 1/2, w = 0, 1;   (3.79)
        0,   otherwise.

Then

Var[X] = E[W] = 0(1/2) + 1(1/2) = 1/2.

Recall that Theorem 3.10 produces the same result without requiring the derivation of Pw(w):

Var[X] = E[(X − μX)²]
       = (0 − 1)² Px(0) + (1 − 1)² Px(1) + (2 − 1)² Px(2)
       = 1/2.   (3.81)

Finally, Theorem 3.14 gives Var[X] = E[X²] − μX² = 1.5 − 1² = 1/2.
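The agreement between the definition and the moment formula of Theorem 3.14 is easy to verify in code; this Python sketch (my own, using the PMF of Example 3.32) computes Var[X] both ways:

```python
# Variance computed from Definition 3.15 and from Theorem 3.14.
pmf = {0: 0.25, 1: 0.5, 2: 0.25}
mu = sum(x * p for x, p in pmf.items())                   # E[X] = 1
var_def = sum((x - mu) ** 2 * p for x, p in pmf.items())  # E[(X - mu)^2]
var_mom = sum(x**2 * p for x, p in pmf.items()) - mu**2   # E[X^2] - mu^2
print(var_def, var_mom)  # 0.5 0.5: the two formulas agree
```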
Theorem 3.15

Var[aX + b] = a² Var[X].

Proof We let Y = aX + b and apply Theorem 3.14. We first expand the second moment to obtain

E[Y²] = a² E[X²] + 2ab E[X] + b².   (3.85)

Since E[Y] = aμX + b, Theorem 3.14 gives Var[Y] = E[Y²] − (aμX + b)² = a²(E[X²] − μX²) = a² Var[X].
Recall the signal amplitude V of Example 3.28 with PMF

Pv(v) = 1/7, v = −3, −2, ..., 3;   (3.88)
        0,   otherwise.

A new voltmeter records the amplitude U in millivolts. Find the variance and standard deviation of U.
Note that U = 1000V. To use Theorem 3.15, we first find the variance of V. The expected value of the amplitude is μV = 0 by symmetry, so

Var[V] = E[V²] = (1/7)(9 + 4 + 1 + 0 + 1 + 4 + 9) = 4,

and therefore Var[U] = 1000² Var[V] = 4,000,000 mV² and σU = 2000 mV.
Quiz 3.8
Ir1 an experirr1en t wit l1 three cust orr1ers enterir1g the Phonesrnart store> t 11e obser
vation is N, the number of pl1ones pl1rchased. The P MF of N is
(4  n,)/10 n, = 0, 1, 2>3
(3.92)
0 othervvise.
Find
(a) The expected value E[N]  (b) The second moment E[N²]
3.9 MATLAB
²Although column vectors are supposed to appear as columns, we generally write a column vector x in the form of a transposed row vector [x1 ··· xm]' to save space.
Px(x) = 0.15, x = 1, 2, 3, 4;
        0.1,  x = 5, 6, 7, 8;   (3.93)
        0,    otherwise.

Write a MATLAB function that calculates Px(x). Calculate the probability of an xi-pound package for x1 = 2, x2 = 2.5, and x3 = 6.
Example 3.36
Write a MATLAB function geometricpmf(p,x) to calculate, for the sample values in vector x, Px(x) for a geometric (p) random variable.
Example 3.37
Write a MATLAB function that calculates the Poisson (α) PMF.
For an integer x, we could calculate Px(x) by the direct calculation

px=((alpha^x)*exp(-alpha))/factorial(x)

This will yield the right answer as long as the argument x for the factorial function is not too large. In MATLAB version 6, factorial(171) causes an overflow. In addition, for α > 1, calculating the ratio α^x/x! for large x can cause numerical problems because both α^x and x! will be very large numbers, possibly with a small quotient. Another shortcoming of the direct calculation is apparent if you want to calculate Px(x) for a range of values of x. In that case, successive values of the PMF are related by the recursion

Px(x) = (α^x e^(−α))/x! = (α/x) Px(x − 1).   (3.94)
The poissonpmf.m function uses Equation (3.94) to calculate Px(x). Even this code is not perfect because MATLAB has limited range.
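The same recursion can be sketched in Python (a stand-in for poissonpmf.m, not the book's code): each PMF value is α/x times the previous one, so no factorial is ever formed.

```python
import math

def poisson_pmf(alpha, xmax):
    """Return [P_X(0), ..., P_X(xmax)] for a Poisson (alpha) random variable,
    using the recursion P_X(x) = (alpha/x) * P_X(x-1) of Equation (3.94)."""
    probs = [math.exp(-alpha)]           # P_X(0) = e**(-alpha)
    for x in range(1, xmax + 1):
        probs.append(probs[-1] * alpha / x)
    return probs

probs = poisson_pmf(2.0, 50)
print(abs(sum(probs) - 1.0) < 1e-9)  # True: nearly all mass lies below x = 50
```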
For the Poisson CDF, there is no simple way to avoid summing the PMF. The following example shows an implementation of the Poisson CDF. The code for a CDF tends to be more complicated than that for a PMF because if x is not an integer, Fx(x) may still be nonzero. Other CDFs are easily developed following the same approach.
Example 3.38
Write a MATLAB function that calculates the CDF of a Poisson random variable.
Here we present the MATLAB code for the Poisson CDF. Since the sample values of a Poisson random variable X are integers, we observe that Fx(x) = Fx(⌊x⌋), where ⌊x⌋, equivalent to the MATLAB function floor(x), denotes the largest integer less than or equal to x.

function cdf=poissoncdf(alpha,x)
%output cdf(i)=Prob[X<=x(i)]
x=floor(x(:));
sx=0:max(x);
cdf=cumsum(poissonpmf(alpha,sx));
%cdf from 0 to max(x)
okx=(x>=0); %x(i)<0 -> cdf=0
x=(okx.*x); %set negative x(i)=0
cdf=okx.*cdf(x+1);
%cdf=0 for x(i)<0
Let M equal the number of hits in one minute (60 seconds). Note that M is a Poisson random variable. The MATLAB session below executes the following math calculations:

P[M ≤ 130] = Σ_{m=0}^{130} PM(m),   (3.96)

P[M > 110] = 1 − P[M ≤ 110] = 1 − Σ_{m=0}^{110} PM(m).   (3.97)

>> poissoncdf(120,130)
ans =
   0.8315
>> 1-poissoncdf(120,110)
ans =
   0.8061
The programs described thus far in this section perform the familiar task of calculating a function of a single variable. Here, the functions are PMFs and CDFs. As described in Section 2.5, MATLAB can also be used to simulate experiments. In this section we present MATLAB programs that generate data conforming to families of discrete random variables. When many samples are generated by these programs, the relative frequency of data in an event in the sample space converges to the probability of the event. As in Chapter 2, we use rand() as a source of randomness. Let R = rand(1). Recall that rand(1) simulates an experiment that is equally likely to produce any real number in the interval [0, 1]. We will learn in Chapter 4 that to express this idea in mathematics, we say that for any interval [a, b] ⊂ [0, 1],

P[a < R ≤ b] = b − a.   (3.98)

For example, P[0.4 < R ≤ 0.53] = 0.13. Now suppose we wish to generate samples of discrete random variable K with SK = {0, 1, ...}. Since 0 ≤ FK(k − 1) ≤ FK(k) ≤ 1 for all k, we observe that

P[FK(k − 1) < R ≤ FK(k)] = FK(k) − FK(k − 1) = PK(k).   (3.99)

This fact leads to the following approach (as shown in pseudocode) to using rand() to produce a sample of random variable K:

Generate R = rand(1)
Find k* ∈ SK such that FK(k* − 1) < R ≤ FK(k*)
Set K = k*
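The pseudocode above can be sketched in Python (my own rendering, for a geometric (p) random variable rather than the book's generic K): a single uniform sample R is compared against the running CDF until FK(k* − 1) < R ≤ FK(k*).

```python
import random

def geometric_rv(p):
    """One sample of a geometric (p) random variable by CDF inversion."""
    r = random.random()            # plays the role of R = rand(1)
    k, cdf = 1, p                  # F(1) = p
    while r > cdf:                 # stop at the first k with F(k-1) < r <= F(k)
        k += 1
        cdf += p * (1 - p) ** (k - 1)
    return k

random.seed(3)
samples = [geometric_rv(0.25) for _ in range(20_000)]
print(abs(sum(samples) / len(samples) - 4.0) < 0.2)  # sample mean near 1/p
```

The relative frequency of each value k among the samples converges to PK(k), exactly as Equation (3.99) promises.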
[
A MATLAB function that uses rand() in this way simulates an experiment that produces samples of random variable K. Generally, this implies that before we can produce a sample of random variable K, we need to generate the CDF of K. We can reuse the work of this computation by defining our MATLAB functions such as geometricrv(p,m) to generate m sample values each time. We now present the details associated with generating binomial random variables.
Example 3.40
Write a function that generates m samples of a binomial (n, p) random variable.
Generating binomial random variables is easy because the range is simply {0, ..., n} and the minimum value is zero. The MATLAB code for geometricrv, poissonrv, and pascalrv is slightly more complicated because we need to generate enough terms of the CDF to ensure that we find k*.
Table 3.1 contains a collection of functions for an arbitrary probability model and the six families of random variables introduced in Section 3.3. As in Example 3.35, the functions in the first row can be used for any discrete random variable X with a finite sample space. The argument s is the vector of sample values si of X, and p is the corresponding vector of probabilities P[si] of those sample values. For PMF and CDF calculations, x is the vector of numbers for which the calculation is to be performed. In the function finiterv, m is the number of random samples returned by the function. Each of the final six rows of the table contains for one family the pmf function for calculating values of the PMF, the cdf function for calculating values of the CDF, and the rv function for generating random samples. In each function description, x denotes a column vector x = [x1 ··· xm]'. The pmf function output is a vector y such that yi = Px(xi). The cdf function output is a vector y such that yi = Fx(xi). The rv function output is a vector X =
Figure 3.2  The PMF of Y and the relative frequencies found in two sample runs of voltpower(100). Note that in each run, the relative frequencies are close to (but not exactly equal to) the corresponding PMF. (Three bar plots over y = 0, 1, ..., 5, each with vertical axis from 0 to 0.2: the PMF Py(y), Sample Run 1, and Sample Run 2.)
[X1 ··· Xm]' such that each Xi is a sample value of the random variable X. If m = 1, then the output is a single sample value of random variable X.
We present an additional example, partly because it demonstrates some useful MATLAB functions, and also because it shows how to generate the relative frequencies of random samples.
Example 3.42
finitepmf() accounts for multiple occurrences of a sample value. In the session below, pmfx(3) = px(2) + px(5) = 0.4.

>> sx=[1 3 5 7 3];
>> px=[0.1 0.2 0.2 0.3 0.2];
>> pmfx=finitepmf(sx,px,1:7);
>> pmfx'
ans =
   0.10  0  0.40  0  0.20  0  0.30
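For readers following along without MATLAB, the accumulation finitepmf() performs can be sketched in Python (the function name and interface are my stand-ins for the book's finitepmf):

```python
# Probabilities of repeated sample values in sx are accumulated,
# mirroring the MATLAB session above.
def finite_pmf(sx, px, x):
    return [sum(p for s, p in zip(sx, px) if s == xi) for xi in x]

pmfx = finite_pmf([1, 3, 5, 7, 3], [0.1, 0.2, 0.2, 0.3, 0.2], range(1, 8))
print([round(p, 2) for p in pmfx])  # [0.1, 0, 0.4, 0, 0.2, 0, 0.3]
```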
It may seem unnecessary and perhaps even bizarre to allow these repeated values. However, we see in the next example that it is quite convenient for derived random variables Y = g(X) with the property that g(xi) is the same for multiple xi. There, X has PMF

Px(x) = 0.15, x = 1, 2, 3, 4;
        0.1,  x = 5, 6, 7, 8;
        0,    otherwise,

and the derived random variable is

Y = g(X) = 105X − 5X², 1 ≤ x ≤ 5;
           500,         6 ≤ x ≤ 8.
A sequence of sample means of samples x(1), x(2), ..., x(n) of a random variable X will converge to E[X] as n becomes large. For a discrete uniform (0, 10) random variable X, use MATLAB to examine this convergence.
(a) For 100 sample values of X, plot the sequence m1, m2, ..., m100. Repeat this experiment five times, plotting all five mn curves on common axes.
(b) Repeat part (a) for 1000 sample values of X.
Problems
Difficulty: Easy · Moderate · Difficult · Experts Only
3.2.1  The random variable N has PMF

PN(n) = c(1/2)^n, n = 0, 1, 2;
        0,        otherwise.

(a) What is the value of the constant c?
(b) What is P[N ≤ 1]?

3.2.2  The random variable V has PMF

Pv(v) = cv², v = 1, 2, 3, 4;
        0,   otherwise.

... result is either a home run (with probability q = 0.05) or a strike. Of course, three strikes and Casey is out.
(a) What is the probability P[H] that Casey hits a home run?
(b) For one at-bat, what is the PMF of N, the number of times Casey swings his bat?

3.2.5  A tablet computer transmits a file over a wifi link to an access point. Depending on the size of the file, it is transmitted as N packets where N has PMF
a 1 a nd 1 given t hat any free throv1 goes (a) Draw a tree d iagram t hat describes the
in 'vith probability p, independent of any call setup procedure.
other free t hrow . (b) If all transmissions are indepe ndent
3.2.7 You roll a 6sided die repeatedly. and the probability is p that a SETUP
Starting with roll i = 1, le t Ri denote the message will get through, 'vhat is the
result of roll i. If Ri > i, t hen you will roll PMF of K , the number of messages
again; otherwise you stop. Let N denote trans1nitted in a call attempt?
t he number of rolls. (c) \i\fhat is the probability that the phone
(a) What is P [N > 3]? will generate a busy signal?
(b) F ind the PlVIF of J\T. (d) As manager of a cellular phone system,
you 'vant the probability of a busy sig
3.2.8 v ou are manager of a t icket agency nal to be less than 0.02. If p = 0.9,
t hat sells concert t ickets. You assume that 'vhat is the minimum value of n neces
people 'vill call three times in a n attempt sary to achieve your goal?
to buy t ickets and then give up. You vvant
to make sure that yo u are able to serve at 3.3.1 In a package of lVI&Ms, Y, the
least 95% of the people who want tickets. Let p be the probability that a caller gets through to your ticket agency. What is the minimum value of p necessary to meet your goal?

3.2.9 In the ticket agency of Problem 3.2.8, each telephone ticket agent is available to receive a call with probability 0.2. If all agents are busy when someone calls, the caller hears a busy signal. What is the minimum number of agents that you have to hire to meet your goal of serving 95% of the customers who want tickets?

3.2.10 Suppose when a baseball player gets a hit, a single is twice as likely as a double, which is twice as likely as a triple, which is twice as likely as a home run. Also, the player's batting average, i.e., the probability the player gets a hit, is 0.300. Let B denote the number of bases touched safely during an at-bat. For example, B = 0 when the player makes an out, B = 1 on a single, and so on. What is the PMF of B?

3.2.11 When someone presses SEND on a cellular phone, the phone attempts to set up a call by transmitting a SETUP message to a nearby base station. The phone waits for a response, and if none arrives within 0.5 seconds it tries again. If it doesn't get a response after n = 6 tries, the phone stops transmitting messages and generates a busy signal.

number of yellow M&Ms, is uniformly distributed between 5 and 15.
(a) What is the PMF of Y?
(b) What is P[Y < 10]?
(c) What is P[Y > 12]?
(d) What is P[8 < Y < 12]?

3.3.2 In a bag of 25 M&Ms, each piece is equally likely to be red, green, orange, blue, or brown, independent of the color of any other piece. Find the PMF of R, the number of red pieces. What is the probability a bag has no red M&Ms?

3.3.3 When a conventional paging system transmits a message, the probability that the message will be received by the pager it is sent to is p. To be confident that a message is received at least once, a system transmits the message n times.
(a) Assuming all transmissions are independent, what is the PMF of K, the number of times the pager receives the same message?
(b) Assume p = 0.8. What is the minimum value of n that produces a probability of 0.95 of receiving the message at least once?

3.3.4 You roll a pair of fair dice until you roll "doubles" (i.e., both dice are the same). What is the expected number, E[N], of rolls?
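Although the text's computational examples use MATLAB, a quick Monte Carlo sketch in Python can check an answer to a problem like 3.3.4. Each roll shows doubles with probability 1/6, so the number of rolls N is geometric (1/6) with E[N] = 6; the function name, seed, and trial count below are arbitrary choices for illustration.

```python
import random

def rolls_until_doubles(rng: random.Random) -> int:
    """Roll two fair dice until both show the same face; count the rolls."""
    n = 0
    while True:
        n += 1
        if rng.randint(1, 6) == rng.randint(1, 6):
            return n

rng = random.Random(1)
trials = 200_000
avg = sum(rolls_until_doubles(rng) for _ in range(trials)) / trials
print(avg)  # near E[N] = 6, since doubles occur with probability 1/6 per roll
```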
3.3.5 When you go fishing, you attach m hooks to your line. Each time you cast your line, each hook will be swallowed by a fish with probability h, independent of whether any other hook is swallowed. What is the PMF of K, the number of fish that are hooked on a single cast of the line?

3.3.6 Any time a child throws a Frisbee, the child's dog catches the Frisbee with probability p, independent of whether the Frisbee is caught on any previous throw. When the dog catches the Frisbee, it runs away with the Frisbee, never to be seen again. The child continues to throw the Frisbee until the dog catches it. Let X denote the number of times the Frisbee is thrown.

K, the number of tickets you buy up to and including your fifth winning ticket.
(b) L is the number of flips of a fair coin up to and including the 33rd occurrence of tails. What is the PMF of L?
(c) Starting on day 1, you buy one lottery ticket each day. Each ticket is a winner with probability 0.01. Let M equal the number of tickets you buy up to and including your first winning ticket. What is the PMF of M?

3.3.10 The number of buses that arrive at a bus stop in T minutes is a Poisson random variable B with expected value T/5.
(a) What is the PMF of B, the number of buses that arrive in T minutes?
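For Poisson models like the bus arrivals in Problem 3.3.10, the PMF is easy to tabulate directly. The sketch below (Python rather than the text's MATLAB; the choice T = 10 minutes is an arbitrary illustration) checks that the tabulated probabilities sum to one and that the mean equals α = T/5.

```python
import math

def poisson_pmf(alpha: float, k: int) -> float:
    """PMF of a Poisson random variable with expected value alpha."""
    return math.exp(-alpha) * alpha ** k / math.factorial(k)

alpha = 10 / 5   # T = 10 minutes gives E[B] = T/5 = 2
pmf = [poisson_pmf(alpha, k) for k in range(50)]
total = sum(pmf)
mean = sum(k * p for k, p in enumerate(pmf))
print(total, mean)  # total ~ 1, mean ~ alpha
```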
3.3.13 In a bag of 64 "holiday season" M&Ms, each M&M is equally likely to be red or green, independent of any other M&M in the bag.
(a) If you randomly grab four M&Ms, what is the probability P[E] that you grab an equal number of red and green M&Ms?
(b) What is the PMF of G, the number of green M&Ms in the bag?
(c) You begin eating randomly chosen M&Ms one by one. Let R equal the number of red M&Ms you eat before you eat your first green M&M. What is the PMF of R?

3.3.14 A radio station gives a pair of concert tickets to the sixth caller who knows the birthday of the performer. For each person who calls, the probability is 0.75 of knowing the performer's birthday. All calls are independent.
(a) What is the PMF of L, the number of calls necessary to find the winner?
(b) What is the probability of finding the winner on the tenth call?
(c) What is the probability that the station will need nine or more calls to find a winner?

3.3.15 In a packet voice communications system, a source transmits packets containing digitized speech to a receiver. Because transmission errors occasionally occur, an acknowledgment (ACK) or a negative acknowledgment (NAK) is transmitted back to the source to indicate the status of each received packet. When the transmitter gets a NAK, the packet is retransmitted. Voice packets are delay sensitive, and a packet can be transmitted a maximum of d times. If a packet transmission is an independent Bernoulli trial with success probability p, what is the PMF of T, the number of times a packet is transmitted?

3.3.16 At Newark airport, your jet joins a line as the tenth jet waiting for takeoff. At Newark, takeoffs and landings are synchronized to the minute. In each one-minute interval, an arriving jet lands with probability p = 2/3, independent of an arriving jet in any other minute. Such an arriving jet blocks any waiting jet from taking off in that one-minute interval. However, if there is no arrival, then the waiting jet at the head of the line takes off. Each takeoff requires exactly one minute.
(a) Let L1 denote the number of jets that land before the jet at the front of the line takes off. Find the PMF PL1(l).
(b) Let W denote the number of minutes you wait until your jet takes off. Find P[W = 10]. (Note that if no jets land for ten minutes, then one waiting jet will take off each minute and W = 10.)
(c) What is the PMF of W?

3.3.17 Suppose each day (starting on day 1) you buy one lottery ticket with probability 1/2; otherwise, you buy no tickets. A ticket is a winner with probability p independent of the outcome of all other tickets. Let Ni be the event that on day i you do not buy a ticket. Let Wi be the event that on day i, you buy a winning ticket. Let Li be the event that on day i you buy a losing ticket.
(a) What are P[W33], P[L81], and P[N99]?
(b) Let K be the number of the day on which you buy your first lottery ticket. Find the PMF PK(k).
(c) Find the PMF of R, the number of losing lottery tickets you have purchased in m days.
(d) Let D be the number of the day on which you buy your jth losing ticket. What is PD(d)? Hint: If you buy your jth losing ticket on day d, how many losers did you have after d − 1 days?

3.3.18 The Sixers and the Celtics play a best out of five playoff series. The series ends as soon as one of the teams has won three games. Assume that either team is equally likely to win any game independently of any other game played. Find
(a) The PMF PN(n) for the total number N of games played in the series;
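A simulation can serve as a sanity check on a PMF derived by counting game sequences, as in Problem 3.3.18. The sketch below (Python rather than the text's MATLAB; the seed and trial count are arbitrary) estimates the relative frequencies of the series lengths N = 3, 4, 5 for evenly matched teams.

```python
import random
from collections import Counter

def series_length(rng: random.Random) -> int:
    """Play one best-of-five series between evenly matched teams."""
    wins_a = wins_b = games = 0
    while wins_a < 3 and wins_b < 3:
        games += 1
        if rng.random() < 0.5:
            wins_a += 1
        else:
            wins_b += 1
    return games

rng = random.Random(7)
trials = 100_000
counts = Counter(series_length(rng) for _ in range(trials))
rel_freq = {n: counts[n] / trials for n in sorted(counts)}
print(rel_freq)  # relative frequencies of N = 3, 4, 5
```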
prove the binomial theorem for any a > 0 and b > 0. That is, show that

(a + b)^n = Σ_{k=0}^{n} (n choose k) a^k b^(n−k).

3.4.2 The random variable X has CDF

         0,                  k < 1,
F_K(k) =
         1 − (1 − p)^⌊k⌋,    k ≥ 1.

3.5.1 Let X have the uniform PMF
(b) What is E[C], the expected value of C?

3.5.3
(a) The number of trains J that arrive at the station in time t minutes is a Poisson random variable with E[J] = t. Find t such that P[J > 0] = 0.9.
(b) The number of buses K that arrive at the station in one hour is a Poisson random variable with E[K] = 10. Find P[K = 10].
(c) In a 1 ms interval, the number of hits L on a Web server is a Poisson random variable with expected value E[L] = 2 hits. What is P[L < 1]?

3.5.4 You simultaneously flip a pair of fair coins. Your friend gives you one dollar if both coins come up heads. You repeat this ten times and your friend gives you X dollars. Find E[X], the expected number of dollars you receive. What is the probability that you do "worse than average"?

3.5.5 A packet received by your smartphone is error-free with probability 0.95, independent of any other packet.
(a) Out of 10 packets received, let X equal the number of packets received with errors. What is the PMF of X?
(b) In one hour, your smartphone receives 12,000 packets. Let X equal the number of packets received with errors. What is E[X]?

3.5.6 Find the expected value of the random variable Y in Problem 3.4.1.

3.5.7 Find the expected value of the random variable X in Problem 3.4.2.

3.5.8 Find the expected value of the random variable X in Problem 3.4.3.

3.5.9 Use Definition 3.13 to calculate the expected value of a binomial (4, 1/2) random variable X.

3.5.10 X is the discrete uniform (1, 5) random variable.
(a) What is P[X = E[X]]?
(b) What is P[X > E[X]]?

3.5.11 K is the geometric (1/11) random variable.
(a) What is P[K = E[K]]?
(b) What is P[K > E[K]]?
(c) What is P[K < E[K]]?

3.5.12 At a casino, people line up to pay $20 each to be a contestant in the following game: The contestant flips a fair coin repeatedly. If she flips heads 20 times in a row, she walks away with R = 20 million dollars; otherwise she walks away with R = 0 dollars.
(a) Find the PMF of R, the reward earned by the contestant.
(b) The casino counts "losing contestants" who fail to win the 20 million dollar prize. Let L equal the number of losing contestants before the first winning contestant. What is the PMF of L?
(c) Why does the casino offer this game?

3.5.13 Give examples of practical applications of probability theory that can be modeled by the following PMFs. In each case, state an experiment, the sample space, the range of the random variable, the PMF of the random variable, and the expected value:
(a) Bernoulli
(b) Binomial
(c) Pascal
(d) Poisson
Make up your own examples. (Don't copy examples from the text.)

3.5.14 Find P[K < E[K]] when
(a) K is geometric (1/3).
(b) K is binomial (6, 1/2).
(c) K is Poisson (3).
(d) K is discrete uniform (0, 6).

3.5.15 Suppose you go to a casino with exactly $63. At this casino, the only game is roulette and the only bets allowed are red and green. The payoff for a winning bet
is the amount of the bet. In addition, the wheel is fair so that P[red] = P[green] = 1/2. You have the following strategy: First, you bet $1. If you win the bet, you quit and leave the casino with $64. If you lose, you then bet $2. If you win, you quit and go home. If you lose, you bet $4. In fact, whenever you lose, you double your bet until either you win a bet or you lose all of your money. However, as soon as you win a bet, you quit and go home. Let Y equal the amount of money that you take home. Find PY(y) and E[Y]. Would you like to play this game every day?

3.5.16 In a TV game show, there are three identical-looking suitcases. The first suitcase has 3 dollars, the second has 30 dollars and the third has 300 dollars. You start the game by randomly choosing a suitcase. Between the two unchosen suitcases, the game show host opens the suitcase with more money. The host then asks you if you want to keep your suitcase or switch to the other remaining suitcase. After you make your decision, you open your suitcase and keep the D dollars inside. Should you switch suitcases? To answer this question, solve the following subproblems and use the following notation:

Ci is the event that you first choose the suitcase with i dollars.

Oi denotes the event that the host opens a suitcase with i dollars.

In addition, you may wish to go back and review the Monty Hall problem in Example 2.4.
(a) Suppose you never switch; you always stick with your original choice. Use a tree diagram to find the PMF PD(d) and expected value E[D].
(b) Suppose you always switch. Use a tree diagram to find the PMF PD(d) and expected value E[D].
(c) Perhaps your rule for switching should depend on how many dollars are in the suitcase that the host opens? What is the optimal strategy to maximize E[D]? Hint: Consider making a random decision; if the host opens a suitcase with i dollars, let ai denote the probability that you switch.

3.5.17 You are a contestant on a TV game show; there are four identical-looking suitcases containing $100, $200, $400, and $800. You start the game by randomly choosing a suitcase. Among the three unchosen suitcases, the game show host opens the suitcase that holds the median amount of money. (For example, if the unopened suitcases contain $100, $400 and $800, the host opens the $400 suitcase.) The host then asks you if you want to keep your suitcase or switch to one of the other remaining suitcases. For your analysis, use the following notation for events:

Ci is the event that you choose a suitcase with i dollars.

Oi denotes the event that the host opens a suitcase with i dollars.

R is the reward in dollars that you keep.

(a) You refuse the host's offer and open the suitcase you first chose. Find the PMF of R and the expected value E[R].
(b) You always switch and randomly choose one of the two remaining suitcases with equal probability. You receive the R dollars in this chosen suitcase. Sketch a tree diagram for this experiment, and find the PMF and expected value of R.
(c) Can you do better than either always switching or always staying with your original choice? Explain.

3.5.18 You are a contestant on a TV game show; there are four identical-looking suitcases containing $200, $400, $800, and $1600. You start the game by randomly choosing a suitcase. Among the three unchosen suitcases, the game show host opens the suitcase that holds the least money. The host then asks you if you want to keep
your suitcase or switch to one of the other remaining suitcases. For the following analysis, use the following notation for events:

Ci is the event that you choose a suitcase with i dollars.

Oi denotes the event that the host opens a suitcase with i dollars.

R is the reward in dollars that you keep.

(a) You refuse the host's offer and open the suitcase you first chose. Find the PMF of R and the expected value E[R].
(b) You switch and randomly choose one of the two remaining suitcases. You receive the R dollars in this chosen suitcase. Sketch a tree diagram for this experiment, and find the PMF and expected value of R.

3.5.19 Let binomial random variable Xn denote the number of successes in n Bernoulli trials with success probability p. Prove that E[Xn] = np. Hint: Use the fact that Σ_{k=0}^{n−1} P_{Xn−1}(k) = 1.

3.5.20 Prove that if X is a nonnegative integer-valued random variable, then

E[X] = Σ_{k=0}^{∞} P[X > k].

weightlifting work. What mass m, in the range 1 ≤ m ≤ 100, should she use to maximize her probability of winning the money? For the best choice of m, what is the probability that she wins the money?

3.5.22 At the gym, a weightlifter can bench press a maximum of 200 kg. For a mass of m kg (1 ≤ m ≤ 200), the maximum number of repetitions she can complete is R, a geometric random variable with expected value 200/m.
(a) In terms of the mass m, what is the PMF of R?
(b) When she performs one repetition, she lifts the m kg mass a height h = 4/9.8 meters and thus does work w = mgh = 4m Joules. For R repetitions, she does W = 4mR Joules of work. What is the expected work E[W] that she will complete?
(c) A friend offers to pay her 1000 dollars if she can perform 1000 Joules of weightlifting work. What mass m, in the range 1 ≤ m ≤ 200, should she use to maximize her probability of winning the money?
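The identity of Problem 3.5.20 can be spot-checked numerically before attempting a proof. The PMF below is a hypothetical example chosen only for illustration (Python rather than the text's MATLAB).

```python
# Hypothetical PMF on {0, 1, 2, 3}, chosen only for illustration.
pmf = {0: 0.1, 1: 0.4, 2: 0.3, 3: 0.2}

expected = sum(x * p for x, p in pmf.items())
# Right side of the identity: sum over k = 0, 1, 2 of P[X > k].
tail_sum = sum(sum(p for x, p in pmf.items() if x > k) for k in range(max(pmf)))
print(expected, tail_sum)  # both ~1.6
```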
What is the expected number E[T] of winning tickets in fifty years? If each winning ticket is worth $1000, what is the expected amount E[R] collected on these winning tickets? Lastly, if each ticket costs $2, what is your expected net profit E[Q]?

3.7.4 Suppose an NBA basketball player shooting an uncontested 2-point shot will make the basket with probability 0.6. However, if you foul the shooter, the shot will be missed, but two free throws will be awarded. Each free throw is an independent Bernoulli trial with success probability p. Based on the expected number of points the shooter will score, for what values of p may it be desirable to foul the shooter?

3.7.5 It can take up to four days after you call for service to get your computer repaired. The computer company charges for repairs according to how long you have to wait. The number of days D until the service technician arrives and the service charge C, in dollars, are described by

d        1     2     3     4
PD(d)   0.2   0.4   0.3   0.1

and

      90   for 1-day service,
      70   for 2-day service,
C =
      40   for 3-day service,
      40   for 4-day service.

(a) What is the expected waiting time µD = E[D]?
(b) What is the expected deviation E[D − µD]?
(c) Express C as a function of D.
(d) What is the expected value E[C]?

3.7.6 True or False: For any random variable X, E[1/X] = 1/E[X].

3.7.7 For the cellular phone in Problem 3.6.6, express the monthly cost C as a function of M, the number of minutes used. What is the expected monthly cost E[C]?

3.7.8 A new cellular phone billing plan costs $15 per month plus $1 for each minute of use. If the number of minutes you use the phone in a month is a geometric random variable with expected value 1/p, what is the expected monthly cost E[C] of the phone? For what values of p is this billing plan preferable to the billing plan of Problem 3.6.6 and Problem 3.7.7?

3.7.9 A particular circuit works if all 10 of its component devices work. Each circuit is tested before leaving the factory. Each working circuit can be sold for k dollars, but each nonworking circuit is worthless and must be thrown away. Each circuit can be built with either ordinary devices or ultrareliable devices. An ordinary device has a failure probability of q = 0.1 and costs $1. An ultrareliable device has a failure probability of q/2 but costs $3. Assuming device failures are independent, should you build your circuit with ordinary devices or ultrareliable devices in order to maximize your expected profit E[R]? Keep in mind that your answer will depend on k.

3.7.10 In the New Jersey state lottery, each $1 ticket has six randomly marked numbers out of 1, ..., 46. A ticket is a winner if the six marked numbers match six numbers drawn at random at the end of a week. For each ticket sold, 50 cents is added to the pot for the winners. If there are k winning tickets, the pot is divided equally among the k winners. Suppose you bought a winning ticket in a week in which 2n tickets are sold and the pot is n dollars.
(a) What is the probability q that a random ticket will be a winner?
(b) Find the PMF of Kn, the number of other (besides your own) winning tickets.
(c) What is the expected value of Wn, the prize for your winning ticket?

3.7.11 If there is no winner for the lottery described in Problem 3.7.10, then the pot is carried over to the next week. Suppose that in a given week, an r dollar pot is carried over from the previous week and
2n tickets sold. Answer the following questions.
(a) What is the probability q that a random ticket will be a winner?
(b) If you own one of the 2n tickets sold, what is the expected value of V, the value (i.e., the amount you win) of that ticket? Is it ever possible that E[V] > 1?
(c) Suppose that in the instant before the ticket sales are stopped, you are given the opportunity to buy one of each possible ticket. For what values (if any) of n and r should you do it?

3.8.1 In an experiment to monitor two packets, the PMF of N, the number of video packets, is

n        0     1     2
PN(n)   0.2   0.7   0.1

Find E[N], E[N^2], Var[N], and σN.

3.8.2 Find the variance of the random variable Y in Problem 3.4.1.

3.8.3 Find the variance of the random variable X in Problem 3.4.2.

3.8.4 Find the variance of the random variable X in Problem 3.4.3.

3.8.5 Let X have the binomial PMF

PX(x) = (4 choose x)(1/2)^4.

(a) Find the standard deviation of X.
(b) What is P[µX − σX ≤ X ≤ µX + σX], the probability that X is within one standard deviation of the expected value?

3.8.6 X is the binomial (5, 0.5) random variable.
(a) Find the standard deviation of X.
(b) Find P[µX − σX ≤ X ≤ µX + σX], the probability that X is within one standard deviation of the expected value.

3.8.7 Show that the variance of Y = aX + b is Var[Y] = a^2 Var[X].

3.8.8 Given a random variable X with expected value µX and variance σX^2, find the expected value and variance of

Y = (X − µX)/σX.

3.8.9 In real-time packet data transmission, the time between successfully received packets is called the interarrival time, and randomness in packet interarrival times is called jitter. Jitter is undesirable. One measure of jitter is the standard deviation of the packet interarrival time. From Problem 3.6.5, calculate the jitter σT. How large must the successful transmission probability q be to ensure that the jitter is less than 2 milliseconds?

3.8.10 Random variable K has a Poisson (α) distribution. Derive the properties E[K] = Var[K] = α. Hint: E[K^2] = E[K(K − 1)] + E[K].

3.8.11 For the delay D in Problem 3.7.5, what is the standard deviation σD of the waiting time?

3.9.1 Let X be the binomial (100, 1/2) random variable. Let E2 denote the event that X is a perfect square. Calculate P[E2].

3.9.2 Write a MATLAB function x=shipweight8(m) that produces m random sample values of the package weight X with PMF given in Example 3.27.

3.9.3 Use the unique function to write a MATLAB script shipcostpmf.m that outputs the pair of vectors sy and py representing the PMF PY(y) of the shipping cost Y in Example 3.27.

3.9.4 For m = 10, m = 100, and m = 1000, use MATLAB to find the average cost of sending m packages using the model of Example 3.27. Your program input should have the number of trials m as the input. The output should be Y = (1/m) Σ_{i=1}^{m} Yi, where Yi is the cost of the ith package. As m becomes large, what do you observe?
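The scaling property of Problem 3.8.7 is easy to verify numerically for any particular distribution before proving it in general. The PMF and the constants a and b below are hypothetical, chosen only for illustration (Python rather than the text's MATLAB).

```python
def mean(p: dict) -> float:
    """Expected value of a finite PMF given as {value: probability}."""
    return sum(x * q for x, q in p.items())

def var(p: dict) -> float:
    """Variance of a finite PMF given as {value: probability}."""
    m = mean(p)
    return sum((x - m) ** 2 * q for x, q in p.items())

pmf = {1: 0.3, 2: 0.5, 4: 0.2}   # hypothetical PMF, for illustration only
a, b = 3.0, -7.0

var_x = var(pmf)
var_y = var({a * x + b: q for x, q in pmf.items()})  # PMF of Y = aX + b
print(var_x, var_y)  # var_y = a**2 * var_x
```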
3.9.5 The Zipf (n, α = 1) random variable X introduced in Problem 3.3.12 is often used to model the "popularity" of a collection of n objects. For example, a Web server can deliver one of n Web pages. The pages are numbered such that page 1 is the most requested page, page 2 is the second most requested page, and so on. If page k is requested, then X = k.
To reduce external network traffic, an ISP gateway caches copies of the k most popular pages. Calculate, as a function of n for 1 ≤ n ≤ 1000, how large k must be to ensure that the cache can deliver a page with probability 0.75.

3.9.6 Generate n independent samples of the Poisson (5) random variable Y. For each y ∈ SY, let n(y) denote the number of times that y was observed. Thus Σ_{y∈SY} n(y) = n, and the relative frequency of y is R(y) = n(y)/n. Compare the relative frequency of y against PY(y) by plotting R(y) and PY(y) on the same graph as functions of y for n = 100, n = 1000 and n = 10,000. How large should n be to have reasonable agreement?

3.9.7 Test the convergence of Theorem 3.8. For α = 10, plot the PMF of Kn for (n, p) = (10, 1), (n, p) = (100, 0.1), and (n, p) = (1000, 0.01) and compare each result with the Poisson (α) PMF.

3.9.8 Use the result of Problem 3.4.4 and the Random Sample Algorithm on Page 102 to write a MATLAB function k=geometricrv(p,m) that generates m samples of a geometric (p) random variable.

3.9.9 Find n*, the smallest value of n for which the function poissonpmf(n,n) shown in Example 3.37 reports an error. What is the source of the error? Write a function bigpoissonpmf(alpha,n) that calculates poissonpmf(n,n) for values of n much larger than n*. Hint: For a Poisson (α) random variable K,

PK(k) = exp(−α + k ln(α) − Σ_{i=1}^{k} ln(i)).
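The log-space evaluation suggested in the hint to Problem 3.9.9 can be sketched in Python (rather than the MATLAB the problem asks for); the function names here are placeholders, not the text's. Direct evaluation fails for large k because the floating-point power and factorial terms blow up, while the log-space form stays finite.

```python
import math

def poisson_pmf_direct(alpha: float, k: int) -> float:
    """Direct evaluation; alpha ** k overflows a float for large k."""
    return math.exp(-alpha) * alpha ** k / math.factorial(k)

def poisson_pmf_log(alpha: float, k: int) -> float:
    """Evaluate the Poisson PMF in log space, following the hint."""
    log_p = -alpha + k * math.log(alpha) - sum(math.log(i) for i in range(1, k + 1))
    return math.exp(log_p)

p_direct = poisson_pmf_direct(5, 3)
p_log = poisson_pmf_log(5, 3)       # agrees with the direct formula
p_big = poisson_pmf_log(1000, 1000) # far beyond where the direct form fails
print(p_direct, p_log, p_big)
```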
Until now, we have studied discrete random variables. By definition, the range of a discrete random variable is a countable set of numbers. This chapter analyzes random variables that range over continuous sets of numbers. A continuous set of numbers, sometimes referred to as an interval, contains all of the real numbers between two limits. Many experiments lead to random variables with a range that is a continuous interval. Examples include measuring T, the arrival time of a particle (ST = {t | 0 ≤ t < ∞}); measuring V, the voltage across a resistor (SV = {v | −∞ < v < ∞}); and measuring the phase angle A of a sinusoidal radio wave (SA = {a | 0 ≤ a < 2π}). We will call T, V, and A continuous random variables, although we will defer a formal definition until Section 4.2.

Consistent with the axioms of probability, we assign numbers between zero and one to all events (sets of elements) in the sample space. A distinguishing feature of the models of continuous random variables is that the probability of each individual outcome is zero! To understand this intuitively, consider an experiment in which the observation is the arrival time of the professor at a class. Assume this professor always arrives between 8:55 and 9:05. We model the arrival time as a random variable T minutes relative to 9:00 o'clock. Therefore, ST = {t | −5 ≤ t ≤ 5}. Think about predicting the professor's arrival time. The more precise the prediction, the lower the chance it will be correct. For example, you might guess the interval −1 ≤ T ≤ 1 minute (8:59 to 9:01). Your probability of being correct is higher than if you guess −0.5 ≤ T ≤ 0.5 minute (8:59:30 to 9:00:30). As your prediction becomes more and more precise, the probability that it will be correct gets smaller and smaller. The chance that the professor will arrive within a femtosecond of 9:00
Example 4.1
Suppose we have a wheel of circumference one meter and we mark a point on the perimeter at the top of the wheel. In the center of the wheel is a radial pointer that we spin. After spinning the pointer, we measure the distance, X meters, around the circumference of the wheel going clockwise from the marked point to the pointer position as shown in Figure 4.1. Clearly, 0 ≤ X < 1. Also, it is reasonable to believe that if the spin is hard enough, the pointer is just as likely to arrive at any part of the circle as at any other. For a given x, what is the probability P[X = x]?

of the arc in which the pointer stops. Y is a discrete random variable with range SY = {1, 2, ..., n}. Since all parts of the wheel are equally likely, all arcs have the same probability. Thus the PMF of Y is

PY(y) = 1/n for y = 1, 2, ..., n, and PY(y) = 0 otherwise.

From the wheel on the right side of Figure 4.1, we can deduce that if X = x, then Y = ⌈nx⌉, where the notation ⌈a⌉ is defined as the smallest integer greater than or equal to a. Note that the event {X = x} ⊂ {Y = ⌈nx⌉}, which implies that

P[X = x] ≤ P[Y = ⌈nx⌉] = 1/n.
Just as in the discussion of the professor arriving in class, similar reasoning can be applied to other experiments to show that for any continuous random variable, the probability of any individual outcome is zero. This is a fundamentally different situation than the one we encountered in our study of discrete random variables. Clearly a probability mass function defined in terms of probabilities of individual outcomes has no meaning in this context. For a continuous random variable, the interesting probabilities apply to intervals.
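The shrinking-interval argument can also be seen empirically: for a spinner modeled as uniform on [0, 1), the fraction of simulated outcomes falling in a window around any fixed x shrinks in proportion to the window's width, so the probability of the exact outcome x is zero in the limit. A minimal Python sketch (the sample size, seed, and the point x are arbitrary choices):

```python
import random

rng = random.Random(3)
samples = [rng.random() for _ in range(100_000)]  # spinner: uniform on [0, 1)

x = 0.37  # an arbitrary outcome to "predict"
fracs = {}
for width in (0.1, 0.01, 0.001):
    inside = sum(1 for s in samples if abs(s - x) <= width / 2)
    fracs[width] = inside / len(samples)
    print(width, fracs[width])  # the fraction tracks the interval width
```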
The key properties of the CDF, described in Theorem 3.2 and Theorem 3.3, apply to all random variables. Graphs of all cumulative distribution functions start at zero on the left and end at one on the right. All are nondecreasing, and, most importantly, the probability that the random variable is in an interval is the difference in the CDF evaluated at the ends of the interval.

Theorem 4.1
For any random variable X,
(a) FX(−∞) = 0
(b) FX(∞) = 1
(c) P[x1 < X ≤ x2] = FX(x2) − FX(x1)

Although these properties apply to any CDF, there is one important difference between the CDF of a discrete random variable and the CDF of a continuous random variable. Recall that for a discrete random variable X, FX(x) has zero slope everywhere except at values of x with nonzero probability. At these points, the function has a discontinuity in the form of a jump of magnitude PX(x). By contrast, the defining property of a continuous random variable X is that FX(x) is a continuous function of x.

Example 4.2
In the wheel-spinning experiment of Example 4.1, find the CDF of X.
We begin by observing that any outcome x ∈ SX = [0, 1). This implies that FX(x) = 0 for x < 0, and FX(x) = 1 for x > 1. To find the CDF for x between 0 and 1 we consider the event {X ≤ x}, with x growing from 0 to 1. Each event corresponds to an arc on the circle in Figure 4.1. The arc is small when x ≈ 0 and it includes nearly the whole circle when x ≈ 1. FX(x) = P[X ≤ x] is the probability that the pointer stops somewhere in the arc. This probability grows from 0 to 1 as the arc increases to include the whole circle. Given our assumption that the pointer has no preferred stopping places, it is reasonable to expect the probability to grow in proportion to the fraction of the circle occupied by the arc X ≤ x. This fraction is simply x. To be more formal, we can refer to Figure 4.1 and note that with the circle divided into n arcs,

{Y ≤ ⌈nx⌉ − 1} ⊂ {X ≤ x} ⊂ {Y ≤ ⌈nx⌉},    (4.5)

where Y has CDF

         0,      y < 0,
FY(y) =  k/n,    (k − 1)/n < y ≤ k/n, k = 1, 2, ..., n,    (4.6)
         1,      y > 1.

This implies

(⌈nx⌉ − 1)/n ≤ FX(x) ≤ ⌈nx⌉/n.    (4.7)

In Problem 4.2.3, we ask the reader to verify that lim_{n→∞} ⌈nx⌉/n = x. This implies that as n → ∞, both fractions approach x. The CDF of X is

         0,    x < 0,
FX(x) =  x,    0 ≤ x ≤ 1,    (4.8)
         1,    x > 1.
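The limit cited from Problem 4.2.3 is easy to check numerically. The Python sketch below (the point x is an arbitrary choice) shows both bounds of Equation (4.7) closing in on x as n grows, which is why FX(x) = x in the limit.

```python
import math

x = 0.6180339887  # an arbitrary point in (0, 1)
for n in (10, 1000, 100_000):
    lower = (math.ceil(n * x) - 1) / n
    upper = math.ceil(n * x) / n
    print(n, lower, upper)  # the sandwich of Equation (4.7) closes in on x

# The gap between the upper bound and x is at most 1/n.
gap = math.ceil(100_000 * x) / 100_000 - x
```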
Quiz 4.2
The cumulative distribution function of the random variable Y is

         0,     y < 0,
FY(y) =  y/4,   0 ≤ y ≤ 4,    (4.9)
         1,     y > 4.

Sketch the CDF of Y and calculate the following probabilities:
Like the CDF, the PDF fX(x) is a probability model for a continuous random variable X. fX(x) is the derivative of the CDF. It is proportional to the probability that X is close to x.

The slope of the CDF contains the most interesting information about a continuous random variable. The slope at any point x indicates the probability that X is near x. To understand this intuitively, consider the graph of a CDF FX(x) given in Figure 4.2. Theorem 4.1(c) states that the probability that X is in the interval of width Δ to the right of x1 is

P[x1 < X ≤ x1 + Δ] = FX(x1 + Δ) − FX(x1).    (4.10)

Note in Figure 4.2 that this is less than the probability of the interval of width Δ to the right of x2,

P[x2 < X ≤ x2 + Δ] = FX(x2 + Δ) − FX(x2).    (4.11)

The comparison makes sense because both intervals have the same length. If we reduce Δ to focus our attention on outcomes nearer and nearer to x1 and x2, both probabilities get smaller. However, their relative values still depend on the average slope of FX(x) at the two points. This is apparent if we rewrite Equation (4.10) in the form

P[x1 < X ≤ x1 + Δ] = ((FX(x1 + Δ) − FX(x1))/Δ) Δ.    (4.12)

Here the fraction on the right side is the average slope, and Equation (4.12) states that the probability that a random variable is in an interval near x1 is the average
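The average-slope reading of Equation (4.12) can be illustrated with any smooth CDF. The Python sketch below uses the hypothetical CDF FX(x) = x^2 on [0, 1] (an assumption chosen only for illustration, not a model from the text), whose derivative is fX(x) = 2x; shrinking Δ drives the average slope toward the density at x1.

```python
def F(x: float) -> float:
    """Hypothetical CDF for illustration: F_X(x) = x**2 on [0, 1]."""
    return x ** 2

x1 = 0.25
for delta in (0.1, 0.01, 0.001):
    avg_slope = (F(x1 + delta) - F(x1)) / delta
    print(delta, avg_slope)  # approaches f_X(x1) = 2 * x1 = 0.5
```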
slope over the interval times the length of the interval. By definition, the limit of the average slope as Δ → 0 is the derivative of FX(x) evaluated at x1.

We conclude from the discussion leading to Equation (4.12) that the slope of the CDF in a region near any number x is an indicator of the probability of observing the random variable X near x. Just as the amount of matter in a small volume is the density of the matter times the size of the volume, the amount of probability in a small region is the slope of the CDF times the size of the region. This leads to the term probability density, defined as the slope of the CDF.

fX(x) = dFX(x)/dx.

This definition displays the conventional notation for a PDF. The name of the function is a lowercase f with a subscript that is the name of the random variable. As with the PMF and the CDF, the argument is a dummy variable: fX(x), fX(u), and fX(·) are all the same PDF.

The PDF is a complete probability model of a continuous random variable. While there are other functions that also provide complete models (the CDF and the moment generating function that we study in Chapter 9), the PDF is the most useful. One reason for this is that the graph of the PDF provides a good indication of the likely values of observations.
Example 4.3
Figure 4.3 depicts the PDF of a random variable X that describes the voltage at the receiver in a modem. What are probable values of X?
Note that there are two places where the PDF has high values and that it is low elsewhere. The PDF indicates that the random variable is likely to be near −5 V (corresponding to the symbol 0 transmitted) and near +5 V (corresponding to a 1 transmitted). Values far from ±5 V (due to strong distortion) are possible but much less likely.
key role in calculating the expected value of a continuous random variable, the
subject of the next section. Important properties of the PDF follow directly from
Definition 4.3 and the properties of the CDF.
(a) fx(x) ≥ 0 for all x,
(b) Fx(x) = ∫_{-∞}^{x} fx(u) du,
(c) ∫_{-∞}^{∞} fx(x) dx = 1.
Proof The first statement is true because Fx(x) is a nondecreasing function of x and
therefore its derivative, fx(x), is nonnegative. The second fact follows directly from the
definition of fx(x) and the fact that Fx(−∞) = 0. The third statement follows from the
second one and Theorem 4.1(b).
Given these properties of the PDF, we can prove the next theorem, which relates
the PDF to the probabilities of events.
Equation (4.14) is useful because it permits us to interpret the integral of Theo-
rem 4.3 as the limiting case of a sum of probabilities of events {x < X ≤ x + dx}.
Example 4.4
For the experiment in Examples 4.1 and 4.2, find the PDF of X and the probability of
the event {1/4 < X ≤ 3/4}.
Taking the derivative of the CDF in Equation (4.8), fx(x) = 0 when x < 0 or x > 1.
For x between 0 and 1 we have fx(x) = dFx(x)/dx = 1. Thus the PDF of X is

fx(x) = 1, 0 ≤ x < 1; 0 otherwise. (4.15)

The fact that the PDF is constant over the range of possible values of X reflects the
fact that the pointer has no favorite stopping places on the circumference of the circle.
To find the probability that X is between 1/4 and 3/4, we can use either Theorem 4.1
or Theorem 4.3. Thus

P[1/4 < X ≤ 3/4] = Fx(3/4) − Fx(1/4) = 1/2, (4.16)

and equivalently,

P[1/4 < X ≤ 3/4] = ∫_{1/4}^{3/4} fx(x) dx = ∫_{1/4}^{3/4} dx = 1/2. (4.17)
When the PDF and CDF are both known, it is easier to use the CDF to find the
probability of an interval. However, in many cases we begin with the PDF, in which
case it is usually easiest to use Theorem 4.3 directly. The alternative is to find the
CDF explicitly by means of Theorem 4.2(b) and then to use Theorem 4.1.
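Both routes give the same answer, of course. As an illustrative sketch in Python (the text's own examples use MATLAB), here is the interval probability of Example 4.4 computed once from the CDF and once by integrating the PDF:

```python
# Sketch for the spinner of Example 4.4: X uniform on [0, 1),
# so F_X(x) = x and f_X(x) = 1 on (0, 1).
def F(x):
    return min(max(x, 0.0), 1.0)

# Method 1 (Theorem 4.1): difference of CDF values.
p_cdf = F(3 / 4) - F(1 / 4)

# Method 2 (Theorem 4.3): integrate the PDF with a Riemann sum.
n = 100_000
h = (3 / 4 - 1 / 4) / n
p_pdf = sum(1.0 * h for _ in range(n))  # f_X(x) = 1 on the interval

print(p_cdf, round(p_pdf, 6))  # both 0.5
```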
Example 4.5
Consider an experiment that consists of spinning the pointer in Example 4.1 three times
and observing Y meters, the maximum value of X in the three spins. In Example 8.3,
we show that the CDF of Y is

Fy(y) = 0, y < 0; y^3, 0 ≤ y ≤ 1; 1, y > 1. (4.18)
Find the PDF of Y and the probability that Y is between 1/4 and 3/4.
We apply Definition 4.3 to the CDF Fy(y). When Fy(y) is piecewise differentiable, we
take the derivative of each piece:

fy(y) = dFy(y)/dy = 3y^2, 0 ≤ y ≤ 1; 0 otherwise. (4.19)

Note that the PDF has values between 0 and 3. Its integral between any pair of
numbers is less than or equal to 1. The graph of fy(y) shows that there is a higher
probability of finding Y at the right side of the range of possible values than at the left
side. This reflects the fact that the maximum of three spins produces higher numbers
than individual spins. Either Theorem 4.1 or Theorem 4.3 can be used to calculate the
probability of observing Y between 1/4 and 3/4:

P[1/4 < Y ≤ 3/4] = Fy(3/4) − Fy(1/4) = (3/4)^3 − (1/4)^3 = 26/64.

Note that this probability is less than 1/2, which is the probability of 1/4 < X ≤ 3/4
calculated in Example 4.4 for one spin of the pointer.
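The maximum-of-three-spins model is easy to check by simulation. This Python sketch (illustrative only; the seed and trial count are arbitrary choices, and the text's own simulations use MATLAB) estimates P[1/4 < Y ≤ 3/4] and compares it with the exact value 26/64:

```python
import random

# Monte Carlo sketch of Example 4.5: Y = max of three independent
# uniform (0,1) spins has CDF F_Y(y) = y**3, so
# P[1/4 < Y <= 3/4] = (3/4)**3 - (1/4)**3 = 26/64.
random.seed(1)
trials = 200_000
hits = sum(1 for _ in range(trials)
           if 1 / 4 < max(random.random() for _ in range(3)) <= 3 / 4)
estimate = hits / trials
exact = (3 / 4) ** 3 - (1 / 4) ** 3
print(round(exact, 5))  # 0.40625
```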
For this random variable X, the probabilities of the four sets are
So we see that the nature of an inequality in the definition of an event does not
affect the probability when we examine continuous random variables. With discrete
random variables, it is critically important to examine the inequality carefully.
If we compare other characteristics of discrete and continuous random variables,
we find that with discrete random variables, many facts are expressed as sums. With
continuous random variables, the corresponding facts are expressed as integrals. For
example, when X is discrete,
Quiz 4.3
Random variable X has probability density function

fx(x) = …, x ≥ 0; 0 otherwise. (4.23)
is a sum of the possible values yi, each multiplied by its probability. For a continuous
random variable X, this definition is inadequate because all possible values of X
have probability zero. However, we can develop a definition for the expected value
Y = Δ⌊X/Δ⌋, (4.25)

where the notation ⌊a⌋ denotes the largest integer less than or equal to a. Y is an
approximation to X in that Y = kΔ if and only if kΔ ≤ X < kΔ + Δ. Since the
range of Y is Sy = {…, −Δ, 0, Δ, 2Δ, …}, the expected value is

E[Y] = Σ_{k=−∞}^{∞} kΔ P[Y = kΔ] = Σ_{k=−∞}^{∞} kΔ P[kΔ ≤ X < kΔ + Δ]. (4.26)

As Δ approaches zero and the intervals under consideration grow smaller, Y more
closely approximates X. Furthermore, P[kΔ ≤ X < kΔ + Δ] approaches fx(kΔ)Δ,
so that for small Δ,

E[Y] ≈ Σ_{k=−∞}^{∞} kΔ fx(kΔ)Δ. (4.27)

In the limit as Δ goes to zero, the sum converges to the integral in Definition 4.4,

E[X] = ∫_{-∞}^{∞} x fx(x) dx.
When we consider Y, the discrete approximation of X, the intuition developed
in Section 3.5 suggests that E[Y] is what we will observe if we add up a very
large number n of independent observations of Y and divide by n. This same
intuition holds for the continuous random variable X. As n → ∞, the average
of n independent samples of X will approach E[X]. In probability theory, this
observation is known as the Law of Large Numbers, Theorem 10.6.
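The limiting argument above can be watched converge numerically. This Python sketch (illustrative only; an exponential PDF with E[X] = 1 is an assumed example, and the truncation of the infinite sum is a practical compromise) evaluates Σ kΔ·fx(kΔ)·Δ for a coarse and a fine Δ:

```python
import math

# Sketch of the limiting argument: compare the discretized sum
# sum_k kΔ f(kΔ) Δ with E[X] = ∫ x f(x) dx. Here f(x) = e^{-x}, E[X] = 1.
def f(x):
    return math.exp(-x)

def discretized_mean(delta, kmax=20_000):
    # truncate the infinite sum; e^{-kmax*delta} is negligible here
    return sum(k * delta * f(k * delta) * delta for k in range(kmax))

coarse = discretized_mean(0.5)
fine = discretized_mean(0.001)
print(abs(fine - 1.0) < 1e-3)  # the sum approaches E[X] = 1 as Δ → 0
```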
Example 4.6
In Example 4.4, we found that the stopping point X of the spinning wheel experiment
was a uniform random variable with PDF

fx(x) = 1, 0 ≤ x < 1; 0 otherwise. (4.28)

The expected value is

E[X] = ∫_{-∞}^{∞} x fx(x) dx = ∫_{0}^{1} x dx = 1/2 meter. (4.29)
With no preferred stopping points on the circle, the average stopping point of the
pointer is exactly halfway around the circle.
Example 4.7
In Example 4.5, find the expected value of the maximum stopping point Y of the three
spins:

E[Y] = ∫_{-∞}^{∞} y fy(y) dy = ∫_{0}^{1} y(3y^2) dy = 3/4 meter. (4.30)
Example 4.8
Let X be a uniform random variable with PDF

fx(x) = 1, 0 ≤ x < 1; 0 otherwise. (4.31)
Theorem 4.4
The expected value of a function, g(X), of random variable X is

E[g(X)] = ∫_{-∞}^{∞} g(x) fx(x) dx.

Many of the properties of expected values of discrete random variables also apply
to continuous random variables. Definition 3.15 and Theorems 3.11, 3.12, 3.14, and
3.15 apply to all random variables. All of these relationships are written in terms of
expected values in the following theorem, where we use both notations for expected
value, E[X] and μX, to make the expressions clear and concise.
Theorem 4.5
For any random variable X,
(a) E[X − μX] = 0, (b) E[aX + b] = aE[X] + b,
The method of calculating expected values depends on the type of random var-
iable, discrete or continuous. Theorem 4.4 states that E[X^2], the mean square value
of X, and Var[X] are the integrals

E[X^2] = ∫_{-∞}^{∞} x^2 fx(x) dx,   Var[X] = ∫_{-∞}^{∞} (x − μX)^2 fx(x) dx. (4.32)
Our interpretation of expected values of discrete random variables carries over to
continuous random variables. First, E[X] represents a typical value of X, and
the variance describes the dispersion of outcomes relative to the expected value.
Second, E[X] is a best guess for X in the sense that it minimizes the mean square
error (MSE), and Var[X] is the MSE associated with the guess. Furthermore, if we
view the PDF fx(x) as the density of a mass distributed on a line, then E[X] is
the center of mass.
Example 4.9
Find the variance and standard deviation of the pointer position in Example 4.1.

E[X^2] = ∫_{-∞}^{∞} x^2 fx(x) dx = ∫_{0}^{1} x^2 dx = 1/3 m^2. (4.33)

In Example 4.6, we have E[X] = 1/2. Thus Var[X] = 1/3 − (1/2)^2 = 1/12, and the
standard deviation is σX = √Var[X] = 1/√12 = 0.289 meters.
Example 4.10
Find the variance and standard deviation of Y, the maximum pointer position after
three spins, in Example 4.5.
We proceed as in Example 4.9. We have fy(y) from Example 4.5 and E[Y] = 3/4
from Example 4.7:

Var[Y] = ∫_{0}^{1} y^2 (3y^2) dy − (3/4)^2 = 3/5 − 9/16 = 3/80. (4.34)
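The variance integrals of Examples 4.9 and 4.10 are simple enough to verify with a crude numerical quadrature. This Python sketch (illustrative only; a midpoint Riemann sum is an assumed method, not the book's) computes both variances:

```python
# Sketch for Examples 4.9-4.10: compute Var[X] and Var[Y] by a midpoint
# Riemann sum, for the uniform spinner (f_X = 1 on (0,1)) and for
# Y = max of three spins (f_Y(y) = 3y^2 on (0,1)).
def riemann(g, a, b, n=100_000):
    h = (b - a) / n
    return sum(g(a + (i + 0.5) * h) * h for i in range(n))

var_x = riemann(lambda x: x * x, 0, 1) - 0.5 ** 2           # 1/12
var_y = riemann(lambda y: 3 * y ** 4, 0, 1) - 0.75 ** 2     # 3/5 - 9/16 = 3/80
print(round(var_x, 4), round(var_y, 4))  # 0.0833 0.0375
```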
Quiz 4.4
The probability density function of the random variable Y is

fy(y) = 3y^2/2, −1 ≤ y ≤ 1; 0 otherwise. (4.36)
Section 3.3 introduces several families of discrete random variables that arise in a
wide variety of practical applications. In this section, we introduce three important
families of continuous random variables: uniform, exponential, and Erlang. We
devote all of Section 4.6 to Gaussian random variables. Like the families of dis-
crete random variables, the PDFs of the members of each family all have the same
mathematical form. They differ only in the values of one or two parameters. We
have already encountered an example of a continuous uniform random variable in
the wheel-spinning experiment. The general definition is

fx(x) = 1/(b − a), a ≤ x < b; 0 otherwise.

Expressions that are synonymous with X is a uniform random variable are X is
uniformly distributed and X has a uniform distribution.
If X is a uniform random variable there is an equal probability of finding an
outcome x in any interval of length Δ < b − a within Sx = [a, b). We can use
Theorem 4.2(b), Theorem 4.4, and Theorem 4.5 to derive the following properties
of a uniform random variable.
The CDF of X is Fx(x) = 0, x ≤ a; (x − a)/(b − a), a < x ≤ b; 1, x > b.
The expected value of X is E[X] = (b + a)/2.
The variance of X is Var[X] = (b − a)^2/12.
Example 4.11
The phase angle, Θ, of the signal at the input to a modem is uniformly distributed
between 0 and 2π radians. What are the PDF, CDF, expected value, and variance of
Θ?
From the problem statement, we identify the parameters of the uniform (a, b) random
variable as a = 0 and b = 2π. Therefore the PDF and CDF of Θ are

fΘ(θ) = 1/(2π), 0 ≤ θ < 2π; 0 otherwise,
FΘ(θ) = 0, θ < 0; θ/(2π), 0 ≤ θ ≤ 2π; 1, θ > 2π. (4.37)

By Theorem 4.6, the expected value is E[Θ] = b/2 = π radians, and the variance is
Var[Θ] = (2π)^2/12 = π^2/3 rad^2.
The relationship between the family of discrete uniform random variables and the
family of continuous uniform random variables is fairly direct. The following theo-
rem expresses the relationship formally.
Theorem 4.7
Let X be a uniform (a, b) random variable, where a and b are both integers. Let
K = ⌈X⌉. Then K is a discrete uniform (a + 1, b) random variable.
This expression for PK(k) conforms to Definition 3.8 of a discrete uniform (a + 1, b) PMF.
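Theorem 4.7 is easy to sanity-check by simulation. This Python sketch (illustrative only; a = 0, b = 4, the seed, and the trial count are assumed choices) draws uniform (0, 4) samples, applies the ceiling, and checks that each of {1, 2, 3, 4} appears with relative frequency near 1/4:

```python
import math
import random

# Simulation sketch of Theorem 4.7: if X is uniform (0, 4), then
# K = ceil(X) should be discrete uniform on {1, 2, 3, 4}.
random.seed(2)
a, b = 0, 4
counts = {k: 0 for k in range(a + 1, b + 1)}
trials = 100_000
for _ in range(trials):
    u = 1.0 - random.random()        # u in (0, 1], so x lands in (a, b]
    x = a + (b - a) * u
    counts[math.ceil(x)] += 1        # P[X = integer] = 0, so ties are immaterial
freqs = {k: c / trials for k, c in counts.items()}
print(all(abs(f - 0.25) < 0.01 for f in freqs.values()))  # True
```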
fx(x) = λe^{−λx}, x ≥ 0; 0 otherwise,
where the parameter λ > 0.
Example 4.12
The probability that a telephone call lasts no more than t minutes is often modeled as
an exponential CDF:

FT(t) = 1 − e^{−t/3}, t ≥ 0; 0 otherwise.

What is the PDF of the duration in minutes of a telephone conversation? What is the
probability that a conversation will last between 2 and 4 minutes?
Taking the derivative of the CDF, we find the PDF

fT(t) = (1/3)e^{−t/3}, t ≥ 0; 0 otherwise. (4.39)

The probability that a conversation lasts between 2 and 4 minutes is
P[2 ≤ T ≤ 4] = FT(4) − FT(2) = e^{−2/3} − e^{−4/3} = 0.250.
Example 4.13
In Example 4.12, what is E[T], the expected duration of a telephone call? What are
the variance and standard deviation of T? What is the probability that a call duration
is within 1 standard deviation of the expected call duration?
Using the PDF fT(t) in Example 4.12, we calculate the expected duration of a call:

E[T] = ∫_{-∞}^{∞} t fT(t) dt = ∫_{0}^{∞} t (1/3) e^{−t/3} dt. (4.40)

Integration by parts yields

E[T] = −t e^{−t/3} |_{0}^{∞} + ∫_{0}^{∞} e^{−t/3} dt = 3 minutes. (4.41)
E[T^2] = −t^2 e^{−t/3} |_{0}^{∞} + ∫_{0}^{∞} 2t e^{−t/3} dt = 2 ∫_{0}^{∞} t e^{−t/3} dt. (4.43)

With the knowledge that E[T] = 3, we observe that ∫_{0}^{∞} t e^{−t/3} dt = 3E[T] = 9. Thus
E[T^2] = 6E[T] = 18 and

Var[T] = E[T^2] − (E[T])^2 = 18 − 9 = 9,

so the standard deviation is σT = 3 minutes. The probability that a call duration is
within 1 standard deviation of the expected value is
P[0 ≤ T ≤ 6] = FT(6) − FT(0) = 1 − e^{−2} = 0.865.
To derive general expressions for the CDF, the expected value, and the variance
of an exponential random variable, we apply Theorem 4.2(b), Theorem 4.4, and
Theorem 4.5 to the exponential PDF in Definition 4.6.
The CDF of X is Fx(x) = 1 − e^{−λx}, x ≥ 0; 0 otherwise.
The expected value of X is E[X] = 1/λ.
The variance of X is Var[X] = 1/λ^2.
The following theorem shows the relationship between the family of exponential
random variables and the family of geometric random variables.
Theorem 4.9
If X is an exponential (λ) random variable, then K = ⌈X⌉ is a geometric (p)
random variable with p = 1 − e^{−λ}.

PK(k) = (1 − p)^{k−1} p, k = 1, 2, …; 0 otherwise, (4.47)

which conforms to Definition 3.5 of a geometric (p) random variable with p = 1 − e^{−λ}.
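The exponential-to-geometric connection of Theorem 4.9 can be checked by simulation. This Python sketch (illustrative only; λ = 1/3, the seed, and inverse-CDF sampling are assumed choices) verifies that P[K = 1] is near p = 1 − e^{−λ}:

```python
import math
import random

# Simulation sketch of Theorem 4.9: for X exponential (λ = 1/3),
# K = ceil(X) should be geometric with p = 1 - e^{-λ}.
random.seed(3)
lam = 1 / 3
p = 1 - math.exp(-lam)
trials = 200_000
count_k1 = sum(1 for _ in range(trials)
               if math.ceil(-math.log(1.0 - random.random()) / lam) == 1)
print(round(p, 4))  # 0.2835; the simulated frequency lands nearby
```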
Example 4.14
Phone company A charges $0.15 per minute for telephone calls. For any fraction of
a minute at the end of a call, they charge for a full minute. Phone company B also
charges $0.15 per minute. However, phone company B calculates its charge based on
the exact duration of a call. If T, the duration of a call in minutes, is an exponential
(λ = 1/3) random variable, what are the expected revenues per call E[RA] and E[RB]
for companies A and B?
Because T is an exponential random variable, we have in Theorem 4.8 (and in Exam-
ple 4.13) E[T] = 1/λ = 3 minutes per call. Therefore, for phone company B, which
charges for the exact duration of a call,

E[RB] = 0.15 E[T] = $0.45 per call. (4.48)

Company A charges for K = ⌈T⌉ minutes. By Theorem 4.9, K is a geometric random
variable with p = 1 − e^{−1/3} = 0.283, so E[K] = 1/p = 3.53 and

E[RA] = 0.15 E[K] = 0.15/p = (0.15)(3.53) = $0.529 per call. (4.49)
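The two revenue formulas are a one-line computation each. This Python sketch (illustrative only, using the example's λ = 1/3 and $0.15 rate) reproduces the comparison:

```python
import math

# Sketch of Example 4.14: with T exponential (λ = 1/3), company B earns
# 0.15·E[T] per call while company A earns 0.15·E[ceil(T)] = 0.15/p,
# where p = 1 - e^{-λ} (Theorem 4.9).
lam = 1 / 3
p = 1 - math.exp(-lam)
revenue_b = 0.15 * (1 / lam)   # 0.15 * 3 = $0.45
revenue_a = 0.15 / p           # 0.15 * E[K] for K geometric (p)
print(round(revenue_b, 3), round(revenue_a, 3))  # 0.45 0.529
```

Rounding up to whole minutes earns company A about 18% more per call than exact billing.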
In Theorem 9.9, we show that the sum of a set of independent identically dis-
tributed exponential random variables is an Erlang random variable.

fx(x) = λ^n x^{n−1} e^{−λx} / (n − 1)!, x ≥ 0; 0 otherwise,

where the parameter λ > 0, and the parameter n ≥ 1 is an integer.
The parameter n is often called the order of an Erlang random variable. Prob-
lem 4.5.16 outlines a procedure to verify that the integral of the Erlang PDF over
all x is 1. The Erlang (n = 1, λ) random variable is identical to the exponential
(λ) random variable. Just as the exponential (λ) random variable is related to the
Table 4.1 Five probability models all describing the same pattern of arrivals at the Phones-
mart store. The expected arrival rate is λ = 0.1 customers/minute. When we monitor
arrivals in discrete one-minute intervals, the probability we observe a nonempty interval
(with one or more arrivals) is p = 1 − e^{−λ} = 0.095.
geometric (1 − e^{−λ}) random variable, the Erlang (n, λ) continuous random variable
is related to the Pascal (n, 1 − e^{−λ}) discrete random variable.
By comparing Theorem 4.8 and Theorem 4.10, we see for X, an Erlang (n, λ)
random variable, and Y, an exponential (λ) random variable, that E[X] = nE[Y]
and Var[X] = nVar[Y]. In the following theorem, we can also connect Erlang and
Poisson random variables.
Fx(x) = 1 − Σ_{k=0}^{n−1} (λx)^k e^{−λx} / k!, x ≥ 0; 0 otherwise.

Problem 4.5.18 outlines a proof of Theorem 4.11. Theorem 4.11 states that the
probability that the Erlang (n, λ) random variable is ≤ x is the probability that
the Poisson (λx) random variable is ≥ n, because the sum in Theorem 4.11 is the
CDF of the Poisson (λx) random variable evaluated at n − 1.
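Theorem 4.11 is exactly how the chapter's MATLAB function erlangcdf (Section 4.8) computes the Erlang CDF. As an illustrative Python sketch of the same idea:

```python
import math

# Python sketch of Theorem 4.11 (mirroring the book's MATLAB erlangcdf):
# the Erlang (n, λ) CDF is 1 minus the Poisson (λx) CDF evaluated at n - 1.
def erlang_cdf(n, lam, x):
    if x < 0:
        return 0.0
    # 1 - sum_{k=0}^{n-1} (λx)^k e^{-λx} / k!
    return 1.0 - sum((lam * x) ** k * math.exp(-lam * x) / math.factorial(k)
                     for k in range(n))

# n = 1 reduces to the exponential CDF 1 - e^{-λx}
print(abs(erlang_cdf(1, 2.0, 1.5) - (1 - math.exp(-3.0))) < 1e-12)  # True
```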
The mathematical relationships between the geometric, Pascal, exponential, Er-
lang, and Poisson random variables derive from the widely used Poisson process
model for arrivals of customers to a service facility. Formal definitions and theo-
rems for the Poisson process appear in Section 13.4. The arriving customers can
be, for example, shoppers at the Phonesmart store, packets at an Internet router,
or requests to a Web server. In this model, the number of customers that arrive
in a T-minute time period is a Poisson (λT) random variable. Under continuous
monitoring, the time that we wait for one arrival is an exponential (λ) random
variable and the time we wait for n arrivals is an Erlang (n, λ) random variable.
On the other hand, when we monitor arrivals in discrete one-minute intervals, the
number of intervals we wait until we observe a nonempty interval (with one or more
arrivals) is a geometric (p = 1 − e^{−λ}) random variable and the number of intervals
we wait for n nonempty intervals is a Pascal (n, p) random variable. Table 4.1
summarizes these properties for experiments that monitor customer arrivals to the
Phonesmart store.
Quiz 4.5
Continuous random variable X has E[X] = 3 and Var[X] = 9. Find the PDF,
fx(x), if
(a) X is an exponential random variable,
(b) X is a continuous uniform random variable,
(c) X is an Erlang random variable.
Figure 4.5 Two examples of a Gaussian random variable X with expected value μ and
standard deviation σ: (a) μ = 2, σ = 1/2; (b) μ = 2, σ = 2.
fx(x) = (1/√(2πσ^2)) e^{−(x−μ)^2/(2σ^2)},

where the parameter μ can be any real number and the parameter σ > 0.
Many statistics texts use the notation X is N[μ, σ^2] as shorthand for X is a
Gaussian (μ, σ) random variable. In this notation, the N denotes normal. The
graph of fx(x) has a bell shape, where the center of the bell is x = μ and σ reflects
the width of the bell. If σ is small, the bell is narrow, with a high, pointy peak. If σ
is large, the bell is wide, with a low, flat peak. (The height of the peak is 1/(σ√(2π)).)
Figure 4.5 contains two examples of Gaussian PDFs with μ = 2. In Figure 4.5(a),
σ = 0.5, and in Figure 4.5(b), σ = 2. Of course, the area under any Gaussian PDF
is ∫_{-∞}^{∞} fx(x) dx = 1. Furthermore, the parameters of the PDF are the expected
value of X and the standard deviation of X.
Theorem 4.12
If X is a Gaussian (μ, σ) random variable,
E[X] = μ and Var[X] = σ^2.
The proof of Theorem 4.12, as well as the proof that the area under a Gaussian
PDF is 1, employs integration by parts and other calculus techniques. We leave
them as an exercise for the reader in Problem 4.6.13.
Theorem 4.13
If X is Gaussian (μ, σ), Y = aX + b is Gaussian (aμ + b, aσ).
The theorem states that any linear transformation of a Gaussian random variable
produces another Gaussian random variable. This theorem allows us to relate the
properties of an arbitrary Gaussian random variable to the properties of a specific
random variable.
Theorem 4.12 indicates that E[Z] = 0 and Var[Z] = 1. The tables that we use
to find integrals of Gaussian PDFs contain values of Fz(z), the CDF of Z. We
introduce the special notation Φ(z) for this function:

Φ(z) = (1/√(2π)) ∫_{-∞}^{z} e^{−u^2/2} du.
Given a table of values of Φ(z), we use the following theorem to find probabilities
of a Gaussian random variable with parameters μ and σ.
Theorem 4.14
If X is a Gaussian (μ, σ) random variable, the CDF of X is

Fx(x) = Φ((x − μ)/σ).

The probability that X is in the interval (a, b] is

P[a < X ≤ b] = Φ((b − μ)/σ) − Φ((a − μ)/σ).
Equation (4.50) indicates that z = (46 − 61)/10 = −1.5. Therefore your score is 1.5
standard deviations less than the expected value.
To find probabilities of Gaussian random variables, we use the values of Φ(z) pre-
sented in Table 4.2. Note that this table contains entries only for z ≥ 0. For
negative values of z, we apply the following property of Φ(z):

Φ(−z) = 1 − Φ(z).

Figure 4.6 displays the symmetry properties of Φ(z). Both graphs contain the
standard normal PDF. In Figure 4.6(a), the shaded area under the PDF is Φ(z).
Since the area under the PDF equals 1, the unshaded area under the PDF is 1 − Φ(z).
In Figure 4.6(b), the shaded area on the right is 1 − Φ(z) and the shaded area on
the left is Φ(−z). This graph demonstrates that Φ(−z) = 1 − Φ(z).
Example 4.16
If X is the Gaussian (61, 10) random variable, what is P[X ≤ 46]?
Applying Theorem 4.14, Theorem 4.15, and the result of Example 4.15, we have

P[X ≤ 46] = Φ((46 − 61)/10) = Φ(−1.5) = 1 − Φ(1.5) = 1 − 0.933 = 0.067. (4.51)

This suggests that if your test score is 1.5 standard deviations below the expected value,
you are in the lowest 6.7% of the population of test takers.
Example 4.17
If X is a Gaussian (μ = 61, σ = 10) random variable, what is P[51 < X ≤ 71]?
Applying Equation (4.50), Z = (X − 61)/10 and

{51 < X ≤ 71} = {−1 < (X − 61)/10 ≤ 1} = {−1 < Z ≤ 1}. (4.52)

It follows that

P[51 < X ≤ 71] = P[−1 < Z ≤ 1] = Φ(1) − Φ(−1) = 2Φ(1) − 1 = 0.683. (4.53)

The solution to Example 4.17 reflects the fact that in an experiment with a Gaussian
probability model, 68.3% (about two thirds) of the outcomes are within 1 standard
deviation of the expected value. About 95% (2Φ(2) − 1) of the outcomes are within
two standard deviations of the expected value.
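Φ(z) has no elementary closed form, but standard math libraries expose the closely related error function. As an illustrative Python sketch (the chapter's own tooling is MATLAB), the one-sigma and two-sigma probabilities above can be computed directly:

```python
import math

# Sketch: the standard normal CDF Φ(z) computed from the error function
# in Python's math module; check the one- and two-sigma probabilities.
def phi(z):
    return 0.5 + 0.5 * math.erf(z / math.sqrt(2))

within_one_sigma = 2 * phi(1.0) - 1
within_two_sigma = 2 * phi(2.0) - 1
print(round(within_one_sigma, 3), round(within_two_sigma, 3))  # 0.683 0.954
```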
Tables of Φ(z) are useful for obtaining numerical values of integrals of a Gaussian
PDF over intervals near the expected value. Regions farther than three standard
deviations from the expected value (corresponding to |z| > 3) are in the tails of
the PDF. When |z| > 3, Φ(z) is very close to one; for example, Φ(3) = 0.9987 and
Φ(4) = 0.9999768. The properties of Φ(z) for extreme values of z are apparent in
the standard normal complementary CDF:

Q(z) = P[Z > z] = (1/√(2π)) ∫_{z}^{∞} e^{−u^2/2} du = 1 − Φ(z).

Although we may regard both Φ(3) = 0.9987 and Φ(4) = 0.9999768 as being very
close to one, we see in Table 4.3 that Q(3) = 1.35·10^{−3} is almost two orders of
magnitude larger than Q(4) = 3.17·10^{−5}.
Example 4.18
In an optical fiber transmission system, the probability of a bit error is Q(√(γ/2)), where
γ is the signal-to-noise ratio. What is the minimum value of γ that produces a bit error
rate not exceeding 10^{−6}?
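Since Q is decreasing, the threshold can be found numerically. This Python sketch (illustrative only, not the book's solution method; the bisection bracket is an assumed choice) computes Q from the complementary error function and searches for the smallest γ meeting the requirement:

```python
import math

# Sketch for Example 4.18: Q(z) = 0.5*erfc(z/sqrt(2)); bisect for the
# smallest signal-to-noise ratio γ with Q(sqrt(γ/2)) <= 1e-6.
def Q(z):
    return 0.5 * math.erfc(z / math.sqrt(2))

lo, hi = 1.0, 100.0   # assumed bracket: lo fails the target, hi meets it
for _ in range(60):
    mid = (lo + hi) / 2
    if Q(math.sqrt(mid / 2)) <= 1e-6:
        hi = mid
    else:
        lo = mid
gamma_min = hi
print(round(gamma_min, 1))  # about 45.2
```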
Table 4.3 (excerpt): values of the standard normal complementary CDF Q(z) for
3.18 ≤ z ≤ 4.99. For example, Q(3.20) = 6.87·10^{−4}, Q(4.00) = 3.17·10^{−5}, and
Q(4.80) = 7.93·10^{−7}.
Keep in mind that Q(z) is the probability that a Gaussian random variable ex-
ceeds its expected value by more than z standard deviations. We can observe from
Table 4.3 that Q(3) = 0.0013. This means that the probability that a Gaussian random
variable is more than three standard deviations above its expected value is approxi-
mately one in a thousand. In conversation we refer to the event {X − μX > 3σX} as
a three-sigma event. It is unlikely to occur. Table 4.3 indicates that the probability
of a 5σ event is on the order of 10^{−7}.
Quiz 4.6
X is the Gaussian (0, 1) random variable and Y is the Gaussian (0, 2) random
variable. Sketch the PDFs fx(x) and fy(y) on the same axes and find:
(a) P[−1 < X ≤ 1], (b) P[−1 < Y ≤ 1],
(c) P[X > 3.5], (d) P[Y > 3.5].
Figure 4.7 As ε → 0, dε(x) approaches the delta function δ(x). For each ε, the area under
the curve of dε(x) equals 1.
The mathematical problem with Definition 4.12 is that dε(x) has no limit at x = 0.
As indicated in Figure 4.7, dε(0) just gets bigger and bigger as ε → 0. Although
this makes Definition 4.12 somewhat unsatisfactory, the useful properties of the
delta function are readily demonstrated when δ(x) is approximated by dε(x) for
very small ε. We now present some properties of the delta function. We state these
properties as theorems even though they are not theorems in the usual sense of this
text because we cannot prove them. Instead of theorem proofs, we refer to dε(x)
for small values of ε to indicate why the properties hold.
Although dε(0) blows up as ε → 0, the area under dε(x) is the integral

∫_{-∞}^{∞} dε(x) dx = ∫_{−ε/2}^{ε/2} (1/ε) dx = 1. (4.54)

That is, the area under dε(x) is always 1, no matter how small the value of ε. We
conclude that the area under δ(x) is also 1:

∫_{-∞}^{∞} δ(x) dx = 1. (4.55)
Theorem 4.16
For any continuous function g(x),

∫_{-∞}^{∞} g(x) δ(x − x0) dx = g(x0).

Theorem 4.16 is often called the sifting property of the delta function. We can
see that Equation (4.55) is a special case of the sifting property for g(x) = 1 and
x0 = 0. To understand Theorem 4.16, consider the integral

∫_{-∞}^{∞} g(x) dε(x − x0) dx = (1/ε) ∫_{x0−ε/2}^{x0+ε/2} g(x) dx. (4.56)

On the right side, we have the average value of g(x) over the interval [x0 − ε/2, x0 +
ε/2]. As ε → 0, this average value must converge to g(x0).
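The averaging argument behind the sifting property can be seen numerically. This Python sketch (illustrative only; g = cos and the particular ε are assumed choices) integrates g against dε(x − x0) for a small ε and recovers g(x0):

```python
import math

# Numeric sketch of the sifting property: integrating g(x)·d_ε(x - x0)
# averages g over a width-ε window and converges to g(x0) as ε → 0.
def sift(g, x0, eps, n=10_000):
    # midpoint-rule integral of g over [x0 - eps/2, x0 + eps/2],
    # against the rectangle of height 1/eps
    h = eps / n
    total = sum(g(x0 - eps / 2 + (i + 0.5) * h) * h for i in range(n))
    return total / eps

approx = sift(math.cos, 0.5, eps=1e-4)
print(abs(approx - math.cos(0.5)) < 1e-6)  # True
```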
The delta function has a close connection to the unit step function,

u(x) = 0, x < 0; 1, x ≥ 0,

through the relationship ∫_{-∞}^{x} δ(v) dv = u(x) stated in Theorem 4.17.
To understand Theorem 4.17, we observe that for any x > 0, we can choose ε ≤ 2x
so that

∫_{-∞}^{x} dε(v) dv = 1. (4.57)

Thus for any x ≠ 0, in the limit as ε → 0, ∫_{-∞}^{x} dε(v) dv = u(x). Note that we
have not yet considered x = 0. In fact, it is not completely clear what the value
of ∫_{-∞}^{0} δ(v) dv should be. Reasonable arguments can be made for 0, 1/2, or 1.
We have adopted the convention that ∫_{-∞}^{0} δ(x) dx = 1. We will see that this is a
particularly convenient choice when we reexamine discrete random variables.
From Definition 4.3, we take the derivative of Fx(x) to find the PDF fx(x). Refer-
ring to Equation (4.58), the PDF of the discrete random variable X is

fx(x) = Σ_{xi∈Sx} Px(xi) δ(x − xi). (4.60)

When the PDF includes delta functions of the form δ(x − xi), we say there is
an impulse at xi. When we graph a PDF fx(x) that contains an impulse at xi, we
draw a vertical arrow labeled by the constant that multiplies the impulse. We draw
each arrow representing an impulse at the same height because the PDF is always
infinite at each such point. For example, the graph of fx(x) from Equation (4.60)
consists of an impulse arrow at each point xi in Sx.
Using delta functions in the PDF, we can apply the formulas in this chapter
to all random variables. In the case of discrete random variables, these formulas
are equivalent to the ones presented in Chapter 3. For example, if X is a discrete
random variable, Definition 4.4 becomes

E[X] = ∫_{-∞}^{∞} x Σ_{xi∈Sx} Px(xi) δ(x − xi) dx. (4.61)

By writing the integral of the sum as a sum of integrals and using the sifting
property of the delta function,

E[X] = Σ_{xi∈Sx} ∫_{-∞}^{∞} x Px(xi) δ(x − xi) dx = Σ_{xi∈Sx} xi Px(xi), (4.62)
Example 4.19
Suppose Y takes on the values 1, 2, 3 with equal probability. The PMF and the corre-
sponding CDF of Y are

Py(y) = 1/3, y = 1, 2, 3; 0 otherwise,
Fy(y) = 0, y < 1; 1/3, 1 ≤ y < 2; 2/3, 2 ≤ y < 3; 1, y ≥ 3. (4.63)

Using the unit step function u(y), we can write Fy(y) more compactly as

Fy(y) = (1/3) u(y − 1) + (1/3) u(y − 2) + (1/3) u(y − 3). (4.64)

The PDF of Y is

fy(y) = dFy(y)/dy = (1/3) δ(y − 1) + (1/3) δ(y − 2) + (1/3) δ(y − 3). (4.65)

We see that the discrete random variable Y can be represented graphically either by a
PMF Py(y) with bars at y = 1, 2, 3, by a CDF with jumps at y = 1, 2, 3, or by a PDF
fy(y) with impulses at y = 1, 2, 3. These three representations are shown in Figure 4.8.
The expected value of Y can be calculated either by summing over the PMF Py(y) or
integrating over the PDF fy(y). Using the PDF, we have

E[Y] = ∫_{-∞}^{∞} y fy(y) dy = 1/3 + 2/3 + 3/3 = 2.
When Fx(x) has a discontinuity at x, we use Fx(x+) and Fx(x−) to denote the
upper and lower limits at x. That is,
Using this notation, we can say that if the CDF Fx(x) has a jump at x0, then fx(x)
has an impulse at x0 weighted by the height of the discontinuity Fx(x0+) − Fx(x0−).
Example 4.20
For the random variable Y of Example 4.19,

Fy(2+) − Fy(2−) = 2/3 − 1/3 = 1/3 = Py(2). (4.68)
Theorem 4.18
For a random variable X, we have the following equivalent statements:
(a) P[X = x0] = q, (b) Px(x0) = q,
(c) Fx(x0+) − Fx(x0−) = q, (d) fx(x0) = qδ(0).
In Example 4.19, we saw that fy(y) consists of a series of impulses. The value
of fy(y) is either 0 or ∞. By contrast, the PDF of a continuous random variable
has nonzero, finite values over intervals of x. In the next example, we encounter a
random variable that has continuous parts and impulses.
Example 4.21
Observe someone dialing a telephone and record the duration of the call. In a simple
model of the experiment, 1/3 of the calls never begin either because no one answers or
the line is busy. The duration of these calls is 0 minutes. Otherwise, with probability
2/3, a call duration is uniformly distributed between 0 and 3 minutes. Let Y denote
the call duration. Find the CDF Fy(y), the PDF fy(y), and the expected value E[Y].
Let A denote the event that the phone was answered. P[A] = 2/3 and P[Ac] = 1/3.
Since Y ≥ 0, we know that for y < 0, Fy(y) = 0. Similarly, we know that for y > 3,
Fy(y) = 1. For 0 ≤ y ≤ 3, we apply the law of total probability to write

Fy(y) = P[Y ≤ y] = P[Y ≤ y | Ac] P[Ac] + P[Y ≤ y | A] P[A]. (4.69)
When Ac occurs, Y = 0, so that for 0 ≤ y ≤ 3, P[Y ≤ y | Ac] = 1. When A
occurs, the call duration is uniformly distributed over [0, 3], so that for 0 ≤ y ≤ 3,
P[Y ≤ y | A] = y/3. So, for 0 ≤ y ≤ 3, Fy(y) = (1/3)(1) + (2/3)(y/3) = 1/3 + 2y/9.
The complete CDF of Y is

Fy(y) = 0, y < 0; 1/3 + 2y/9, 0 ≤ y < 3; 1, y ≥ 3. (4.70)

Consequently, the corresponding PDF fy(y) is

fy(y) = δ(y)/3 + 2/9, 0 ≤ y ≤ 3; 0 otherwise.

For the mixed random variable Y, it is easiest to calculate E[Y] using the PDF:

E[Y] = ∫_{-∞}^{∞} (y/3) δ(y) dy + ∫_{0}^{3} (2/9) y dy = 0 + (2/9)(y^2/2) |_{0}^{3} = 1 minute. (4.71)
When X is discrete or mixed, the PDF fx(x) contains one or more delta
functions.
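A mixed random variable like the call duration of Example 4.21 is also easy to simulate. This Python sketch (illustrative only; the seed and trial count are assumed choices, and the text's own simulations use MATLAB) checks that the sample mean is near E[Y] = 1 minute:

```python
import random

# Simulation sketch of Example 4.21: with probability 1/3 the call
# duration is 0; otherwise it is uniform on (0, 3). E[Y] = 1 minute.
random.seed(4)
trials = 200_000
total = 0.0
for _ in range(trials):
    if random.random() < 1 / 3:
        y = 0.0                      # call never begins
    else:
        y = 3 * random.random()      # uniform (0, 3) duration
    total += y
print(round(total / trials, 1))  # close to 1.0
```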
Quiz 4.7
The cumulative distribution function of random variable X is

Fx(x) = 0, x < −1; (x + 1)/4, −1 ≤ x < 1; 1, x ≥ 1. (4.72)

Sketch the CDF and find the following:
(a) P[X ≤ 1], (b) P[X < 1],
(c) P[X = 1], (d) the PDF fx(x).
4.8 MATLAB
Built-in MATLAB functions, either alone or with additional code,
can be used to calculate PDFs and CDFs of several random variable
families. The rand and randn functions simulate experiments
that generate sample values of continuous uniform (0, 1) random
variables and Gaussian (0, 1) random variables, respectively.
Probability Functions
Table 4.4 describes MATLAB functions related to four families of continuous random
variables introduced in this chapter: uniform, exponential, Erlang, and Gaussian.
The functions calculate directly the CDFs and PDFs of uniform and exponential
random variables.
function F=erlangcdf(n,lambda,x)
F=1.0-poissoncdf(lambda*x,n-1);

For Erlang and Gaussian random variables, the PDFs can be calculated directly but
the CDFs require numerical integration. For Erlang random variables, erlangcdf
uses Theorem 4.11. For the Gaussian CDF, we use the built-in MATLAB error
function

erf(x) = (2/√π) ∫_{0}^{x} e^{−u^2} du. (4.73)

It is related to the Gaussian CDF by

Φ(x) = 1/2 + (1/2) erf(x/√2), (4.74)

which is how we implement the MATLAB function phi(x). In each function
description in Table 4.4, x denotes a vector x = [x1 ··· xm]′. The pdf function
output is a vector y such that yi = fx(xi). The cdf function output is a vector y
[
such that y,i = Fx(xi) T11e rv f\1nction output is a vector X = [X1 X rn] '
such that each X ,i is a san1ple value of the random variable X. If m, = 1, then the
output is a single sarnple va1t1e of randorn variable X.
Random Samples
Now that we have introduced continuous random variables, we can say that the built-in function y=rand(m,n) is MATLAB's approximation to a uniform (0, 1) random variable. It is an approximation for two reasons. First, rand produces pseudorandom numbers; the numbers seem random but are actually the output of a deterministic algorithm. Second, rand produces a double precision floating point number, represented in the computer by 64 bits. Thus MATLAB distinguishes no more than 2^64 unique double precision floating point numbers. By comparison, there are uncountably infinitely many real numbers in (0, 1). Even though rand is not random and does not have a continuous range, we can for all practical purposes use it as a source of independent sample values of the uniform (0, 1) random variable.

We have already employed the rand function to generate random samples of uniform (0, 1) random variables. Conveniently, MATLAB also includes the built-in function randn to generate random samples of standard normal random variables.
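In the same spirit, Python's standard library provides pseudorandom uniform (0, 1) and Gaussian (0, 1) generators; this sketch (illustrative, not from the text) plays the role of rand and randn and checks that the sample means land near 1/2 and 0.

```python
import random

random.seed(2024)  # fix the deterministic pseudorandom stream for reproducibility

# Analogue of rand: uniform (0, 1) samples.
u = [random.random() for _ in range(100_000)]
# Analogue of randn: Gaussian (0, 1) samples.
g = [random.gauss(0.0, 1.0) for _ in range(100_000)]

mean_u = sum(u) / len(u)
mean_g = sum(g) / len(g)
print(round(mean_u, 3), round(mean_g, 3))  # close to 0.5 and 0.0
```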
4.2.1  The cumulative distribution function of random variable X is

            0            x < −1,
  F_X(x) =  (x + 1)/2    −1 ≤ x < 1,
            1            x ≥ 1.

Use the PDF to find
(a) the constant c,
(b) P[0 ≤ X ≤ 1],
(c) P[−1/2 < X ≤ 1/2],
(d) the CDF F_X(x).

4.2.3  In this problem, we verify that lim_{n→∞} ⌈nx⌉/n = x.
(a) Verify that nx ≤ ⌈nx⌉ ≤ nx + 1.
(b) Use part (a) to show lim_{n→∞} ⌈nx⌉/n = x.
(c) Use a similar argument to show that lim_{n→∞} ⌊nx⌋/n = x.

4.2.4  The CDF of random variable W is

            0                   w < −5,
            (w + 5)/8           −5 ≤ w < −3,
  F_W(w) =  1/4                 −3 ≤ w < 3,
            1/4 + 3(w − 3)/8    3 ≤ w < 5,
            1                   w ≥ 5.

(a) What is c?
(b) What is P[V > 4]?
(c) What is P[−3 < V ≤ 0]?
(d) What is the value of a such that P[V > a] = 2/3?

4.3.2  The cumulative distribution function of random variable X is

            0            x < −1,
  F_X(x) =  (x + 1)/2    −1 ≤ x < 1,
            1            x ≥ 1.

Find the PDF f_X(x) of X.

4.3.3  Find the PDF f_U(u) of the random variable U in Problem 4.2.4.
[
PROBLEMS 155
4.3.4  For a constant parameter a > 0, a Rayleigh random variable X has PDF

            a²x e^{−a²x²/2}    x > 0,
  f_X(x) =
            0                  otherwise.

What is the CDF of X?

4.3.5  Random variable X has a PDF of the form f_X(x) = (1/2)f₁(x) + (1/2)f₂(x), where

            c₂e^{−x}    x ≥ 0,
  f₂(x) =
            0           otherwise.

What conditions must c₁ and c₂ satisfy so that f_X(x) is a valid PDF?

4.3.6  For constants a and b, random variable X has PDF

            ax² + bx    0 ≤ x ≤ 1,
  f_X(x) =
            0           otherwise.

What conditions on a and b are necessary and sufficient to guarantee that f_X(x) is a valid PDF?

4.4.1  Random variable X has PDF

            1/4    −1 ≤ x ≤ 3,
  f_X(x) =
            0      otherwise.

Define the random variable Y by Y = h(X) = X².
(a) Find E[X] and Var[X].
(b) Find h(E[X]) and E[h(X)].
(c) Find E[Y] and Var[Y].

4.4.2  Let X be a continuous random variable with PDF

            1/8    1 ≤ x ≤ 9,
  f_X(x) =
            0      otherwise.

Let Y = h(X) = 1/√X.
(a) Find E[X] and Var[X].
(b) Find h(E[X]) and E[h(X)].
(c) Find E[Y] and Var[Y].

4.4.3  Random variable X has CDF

            0      x < 0,
  F_X(x) =  x/2    0 ≤ x ≤ 2,
            1      x > 2.

(a) What is E[X]?
(b) What is Var[X]?

4.4.4  Random variable Y has PDF

            y/2    0 ≤ y ≤ 2,
  f_Y(y) =
            0      otherwise.

What are E[Y] and Var[Y]?

4.4.5  The cumulative distribution function of the random variable Y is

            0            y < −1,
  F_Y(y) =  (y + 1)/2    −1 ≤ y < 1,
            1            y ≥ 1.

What are E[Y] and Var[Y]?

4.4.6  The cumulative distribution function of random variable V is

            0                v < −5,
  F_V(v) =  (v + 5)²/144     −5 ≤ v < 7,
            1                v ≥ 7.

(a) What are E[V] and Var[V]?
(b) What is E[V³]?

4.4.7  The cumulative distribution function of random variable U is

            0                   u < −5,
            (u + 5)/8           −5 ≤ u < −3,
  F_U(u) =  1/4                 −3 ≤ u < 3,
            1/4 + 3(u − 3)/8    3 ≤ u < 5,
            1                   u ≥ 5.

(a) What are E[U] and Var[U]?
(b) What is E[2^U]?
minute. (Note that these plans measure your call duration exactly, without rounding to the next minute or even second.) If your long-distance calls have exponential distribution with expected value τ minutes, which plan offers a lower expected cost per call?

4.5.16  In this problem we verify that an Erlang (n, λ) PDF integrates to 1. Let the integral of the nth-order Erlang PDF be denoted

  I_n = ∫_0^∞ λⁿ xⁿ⁻¹ e^{−λx} / (n − 1)! dx.

First, show directly that the Erlang PDF

value 1/λ has nth moment

Hint: Use integration by parts (Appendix B, Math Fact B.10).

4.5.20  This problem outlines the steps needed to show that a nonnegative continuous random variable X has expected value

  E[X] = ∫_0^∞ (1 − F_X(x)) dx.

(a) For any r > 0, show that
(a) Let X_n denote an Erlang (n, λ) random variable. Use the definition of the Erlang PDF to show that for any x ≥ 0,

4.6.1  The peak temperature T, as measured in degrees Fahrenheit, on a July day in New Jersey is the Gaussian (85, 10) random variable. What is P[T > 100], P[T < 60], and P[70 ≤ T ≤ 100]?

  erfc(z) = (2/√π) ∫_z^∞ e^{−x²} dx.

Show that

  F_Y(y) = ∫_{−∞}^y f_Y(u) du = 1/2 + (1/2) erf(y).

(b) Observe that Z = √2 Y is Gaussian (0, 1) and show that

  Q(z) = (1/2) erfc(z/√2).
(a) Use the substitution x = (w − μ)/σ to show that

  1 = (1/√(2π)) ∫_{−∞}^{∞} e^{−x²/2} dx.

The average probability of bit error, also known as the bit error rate or BER, is

  P_e = E[P_e(Y)] = ∫_{−∞}^{∞} Q(√(2y)) f_Y(y) dy.
4.6.15  In mobile radio communications, the radio channel can vary randomly. In particular, in communicating with a fixed transmitter power over a "Rayleigh fading" channel, the receiver signal-to-noise ratio Y is an exponential random variable with expected value γ. Moreover, when Y = y, the probability of an error in decoding a transmitted bit is P_e(y) = Q(√(2y)), where Q(·) is the standard normal complementary CDF.

Find the expected value E[R′] and variance Var[R′].
(c) Explain why selling the straddle might be attractive compared to selling just the put or just the call.

4.6.17  Continuing Problem 4.6.16, suppose you sell the straddle at time t = 0 and liquidate your position at time t, generating a profit (or perhaps a loss) R′. Find the
To compare this approximation to Q(z), use MATLAB to graph

  e(z) = (Q(z) − Q̂(z)) / Q(z).

If we generate a large number n of samples of random variable X, let n_i denote the number of occurrences of the event

  {iΔ < X ≤ (i + 1)Δ}.
Chapter 3 and Chapter 4 analyze experiments in which an outcome is one number. Beginning with this chapter, we analyze experiments in which an outcome is a collection of numbers. Each number is a sample value of a random variable. The probability model for such an experiment contains the properties of the individual random variables and it also contains the relationships among the random variables. Chapter 3 considers only discrete random variables and Chapter 4 considers only continuous random variables. The present chapter considers all random variables because a high proportion of the definitions and theorems apply to both discrete and continuous random variables. However, just as with individual random variables, the details of numerical calculations depend on whether random variables are discrete or continuous. Consequently, we find that many formulas come in pairs. One formula, for discrete random variables, contains sums, and the other formula, for continuous random variables, contains integrals.

In this chapter, we consider experiments that produce a collection of random variables, X₁, X₂, ..., X_n, where n can be any integer. For most of this chapter, we study n = 2 random variables: X and Y. A pair of random variables is enough to show the important concepts and useful problem-solving techniques. Moreover, the definitions and theorems we introduce for X and Y generalize to n random variables. These generalized definitions appear near the end of this chapter in Section 5.10.

We also note that a pair of random variables X and Y is the same as the two-dimensional vector [X Y]′. Similarly, the random variables X₁, ..., X_n can be written as the n-dimensional vector X = [X₁ ⋯ X_n]′. Since the components of X are random variables, X is called a random vector. Thus this chapter begins our study of random vectors. This subject is continued in Chapter 8, which uses techniques of linear algebra to develop further the properties of random vectors.

We begin here with the definition of F_{X,Y}(x, y), the joint cumulative distribution function of two random variables, a generalization of the CDF introduced in
Example 5.1
We would like to measure random variable X, but we instead observe

  Y = X + Z.    (5.1)

In an experiment that produces one random variable, events are points or intervals on a line. In an experiment that leads to two random variables X and Y, each outcome (x, y) is a point in a plane and events are points or areas in the plane.

Just as the CDF of one random variable, F_X(x), is the probability of the interval to the left of x, the joint CDF F_{X,Y}(x, y) of two random variables is the probability of the area below and to the left of (x, y). This is the infinite region that includes the shaded area in Figure 5.1 and everything below and to the left of it.
The joint CDF is a complete probability model. The notation is an extension of the notation convention adopted in Chapter 3. The subscripts of F, separated by a comma, are the names of the two random variables. Each name is an uppercase letter. We usually write the arguments of the function as the lowercase letters associated with the random variable names.

The joint CDF has properties that are direct consequences of the definition. For example, we note that the event {X ≤ x} suggests that Y can have any value so long as the condition on X is met. This corresponds to the joint event {X ≤ x, Y < ∞}. Therefore,

  F_X(x) = P[X ≤ x] = P[X ≤ x, Y < ∞] = lim_{y→∞} F_{X,Y}(x, y) = F_{X,Y}(x, ∞).    (5.2)

We obtain a similar result when we consider the event {Y ≤ y}. The following theorem summarizes some basic properties of the joint CDF.

Although its definition is simple, we rarely use the joint CDF to study probability
Example 5.2
X years is the age of children entering first grade in a school. Y years is the age of children entering second grade. The joint CDF of X and Y is

                   0                 x < 5,
                   0                 y < 6,
                   (x − 5)(y − 6)    5 ≤ x < 6, 6 ≤ y < 7,
  F_{X,Y}(x,y) =   y − 6             x ≥ 6, 6 ≤ y < 7,         (5.3)
                   x − 5             5 ≤ x < 6, y ≥ 7,
                   1                 otherwise.

            0        x < 5,                     0        y < 6,
  F_X(x) =  x − 5    5 ≤ x < 6,      F_Y(y) =   y − 6    6 ≤ y < 7,      (5.4)
            1        x ≥ 6,                     1        y ≥ 7.

Referring to Theorem 4.6, we see from Equation (5.4) that X is a continuous uniform (5, 6) random variable and Y is a continuous uniform (6, 7) random variable.
In this example, we need to refer to six different regions in the x, y plane and three different formulas to express a probability model as a joint CDF. Section 5.4 introduces the joint probability density function f_{X,Y}(x, y) as another representation of the probability model of a pair of random variables. For children's ages X and Y in Example 5.2, we will show in Example 5.6 that the CDF F_{X,Y}(x, y) implies that the joint PDF is the simple expression

To get another idea of the complexity of using the joint CDF, try proving the following theorem, which expresses the probability that an outcome is in a rectangle in the X, Y plane in terms of the joint CDF.

Theorem 5.2
  P[x₁ < X ≤ x₂, y₁ < Y ≤ y₂] = F_{X,Y}(x₂, y₂) − F_{X,Y}(x₂, y₁) − F_{X,Y}(x₁, y₂) + F_{X,Y}(x₁, y₁).
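To see the four-corner formula of Theorem 5.2 in action, consider a hypothetical concrete model (not from the text): independent uniform (0, 1) random variables, whose joint CDF is the product F_{X,Y}(x, y) = F_X(x)F_Y(y). A Python check:

```python
def F(x, y):
    # Joint CDF of independent uniform (0,1) random variables:
    # F_{X,Y}(x, y) = F_X(x) * F_Y(y), each marginal CDF clipped to [0, 1].
    clip = lambda t: min(max(t, 0.0), 1.0)
    return clip(x) * clip(y)

def rect_prob(x1, x2, y1, y2):
    # Theorem 5.2: probability of the rectangle (x1, x2] x (y1, y2]
    return F(x2, y2) - F(x2, y1) - F(x1, y2) + F(x1, y1)

p = rect_prob(0.2, 0.7, 0.1, 0.4)
print(p)   # equals the rectangle area (0.7-0.2)*(0.4-0.1) = 0.15
```

For this product-form CDF, the four-corner combination reduces to the area of the rectangle, as it should for a uniform pair.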
The steps needed to prove the theorem are outlined in Problem 5.1.5. The theorem says that to find the probability that an outcome is in a rectangle, it is necessary to evaluate the joint CDF at all four corners. When the probability of interest corresponds to a nonrectangular area, using the joint CDF is even more complex.

Quiz 5.1
Express the following extreme values of the joint CDF F_{X,Y}(x, y) as numbers or in terms of the CDFs F_X(x) and F_Y(y).
(a) F_{X,Y}(−∞, 2)    (b) F_{X,Y}(∞, ∞)
(c) F_{X,Y}(∞, y)     (d) F_{X,Y}(∞, −∞)
  P_{X,Y}(x, y) = P[X = x, Y = y].

For a pair of discrete random variables, the joint PMF P_{X,Y}(x, y) is a complete probability model. For any pair of real numbers, the PMF is the probability of observing these numbers. The notation is consistent with that of the joint CDF. The uppercase subscripts of P, separated by a comma, are the names of the two random variables. We usually write the arguments of the function as the lowercase letters associated with the random variable names. Corresponding to S_X, the range of a single discrete random variable, we use the notation S_{X,Y} to denote the set of possible values of the pair (X, Y). That is,

There are various ways to represent a joint PMF. We use three of them in the following example: a graph, a list, and a table.
Example 5.3
Test two integrated circuits one after the other. On each test, the possible outcomes are a (accept) and r (reject). Assume that all circuits are acceptable with probability 0.9 and that the outcomes of successive tests are independent. Count the number of acceptable circuits X and count the number of successful tests Y before you observe the first reject. (If both tests are successful, let Y = 2.) Draw a tree diagram for the experiment and find the joint PMF P_{X,Y}(x, y).

  S = {aa, ar, ra, rr}.    (5.7)

Observing the tree diagram, we compute P[aa] = 0.81, P[ar] = P[ra] = 0.09, and P[rr] = 0.01.

Each outcome specifies a pair of values X and Y. Let g(s) be the function that transforms each outcome s in the sample space S into the pair of random variables (X, Y). Then

  g(aa) = (2, 2),  g(ar) = (1, 1),  g(ra) = (1, 0),  g(rr) = (0, 0).    (5.10)

For each pair of values x, y, P_{X,Y}(x, y) is the sum of the probabilities of the outcomes for which X = x and Y = y. For example, P_{X,Y}(1, 1) = P[ar].

Note that all of the probabilities add up to 1. This reflects the second axiom
  Σ_{x∈S_X} Σ_{y∈S_Y} P_{X,Y}(x, y) = 1.    (5.11)
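The tree-diagram bookkeeping of Example 5.3 is easy to automate; the Python sketch below (illustrative, not from the text) enumerates the four outcomes, assigns each its probability and its (X, Y) pair, and accumulates the joint PMF.

```python
from itertools import product

p_accept = 0.9

def p_of(t):
    # probability of a single test outcome: 'a' accept, 'r' reject
    return p_accept if t == 'a' else 1 - p_accept

pmf = {}
for t1, t2 in product('ar', repeat=2):
    prob = p_of(t1) * p_of(t2)
    x = (t1 == 'a') + (t2 == 'a')   # number of acceptable circuits
    if t1 == 'r':
        y = 0                        # first test rejected
    elif t2 == 'r':
        y = 1                        # one success before the reject
    else:
        y = 2                        # both tests successful
    pmf[(x, y)] = pmf.get((x, y), 0.0) + prob

print(pmf)   # masses 0.81, 0.09, 0.09, 0.01 on (2,2), (1,1), (1,0), (0,0)
```

Summing the four masses confirms the normalization in Equation (5.11).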
Theorem 5.3
For discrete random variables X and Y and any set B in the X, Y plane, the probability of the event {(X, Y) ∈ B} is

  P[B] = Σ_{(x,y)∈B} P_{X,Y}(x, y).
Example 5.4
Continuing Example 5.3, find the probability of the event B that X, the number of acceptable circuits, equals Y, the number of tests before observing the first failure.

Therefore,

If we view x, y as the outcome of an experiment, then Theorem 5.3 simply says that to find the probability of an event, we sum over all the outcomes in that event. In essence, Theorem 5.3 is a restatement of Theorem 1.5 in terms of random variables X and Y and joint PMF P_{X,Y}(x, y).
Quiz 5.2
The joint PMF P_{Q,G}(q, g) for random variables Q and G is given in the following table:

  P_{Q,G}(q,g)   g=0    g=1    g=2    g=3
  q=0            0.06   0.18   0.24   0.12
  q=1            0.04   0.12   0.16   0.08
5.3 Marginal PMF

For discrete random variables, the marginal PMFs P_X(x) and P_Y(y) are probability models for the individual random variables X and Y, but they do not provide a complete probability model for the pair X, Y.

Theorem 5.4
For discrete random variables X and Y with joint PMF P_{X,Y}(x, y),

We note that both X and Y have range {0, 1, 2}. Theorem 5.4 gives

  P_X(0) = Σ_{y=0}^{2} P_{X,Y}(0, y) = 0.01,   P_X(1) = Σ_{y=0}^{2} P_{X,Y}(1, y) = 0.18,    (5.14)
  P_X(2) = Σ_{y=0}^{2} P_{X,Y}(2, y) = 0.81,   P_X(x) = 0 for x ≠ 0, 1, 2.    (5.15)

Referring to the table representation of P_{X,Y}(x, y), we observe that each value of P_X(x) is the result of adding all the entries in one row of the table. Similarly, the formula for the PMF of Y in Theorem 5.4, P_Y(y) = Σ_{x∈S_X} P_{X,Y}(x, y), is the sum of all the entries in one column of the table. We display P_X(x) and P_Y(y) by rewriting the table and placing the row sums and column sums in the margins.

  P_{X,Y}(x,y)   y=0    y=1    y=2    P_X(x)
  x=0            0.01   0      0      0.01
  x=1            0.09   0.09   0      0.18
  x=2            0      0      0.81   0.81
  P_Y(y)         0.10   0.09   0.81
Thus the column in the right margin shows P_X(x) and the row in the bottom margin shows P_Y(y). Note that the sum of all the entries in the bottom margin is 1 and so is the sum of all the entries in the right margin. This is simply a verification of Theorem 3.1(b), which states that the PMF of any random variable must sum to 1.
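Theorem 5.4's row and column sums are easy to verify mechanically; this Python fragment (illustrative only) recomputes the marginals from the joint PMF table above.

```python
# Joint PMF P_{X,Y}(x, y) of the circuit-test example, as a dictionary
P = {(0, 0): 0.01, (1, 0): 0.09, (1, 1): 0.09, (2, 2): 0.81}

xs = {x for x, _ in P}
ys = {y for _, y in P}
# Row sums give P_X(x); column sums give P_Y(y).
Px = {x: sum(p for (xx, yy), p in P.items() if xx == x) for x in xs}
Py = {y: sum(p for (xx, yy), p in P.items() if yy == y) for y in ys}

print(Px)   # masses 0.01, 0.18, 0.81 at x = 0, 1, 2
print(Py)   # masses 0.10, 0.09, 0.81 at y = 0, 1, 2
```

Both marginals sum to 1, as Theorem 3.1(b) requires.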
  P_{H,B}(h,b)   b=0    b=2    b=4
  h=−1           0      0.4    0.2
  h=0            0.1    0      0.1         (5.16)
  h=1            0.1    0.1    0
Definition 5.3 and Theorem 5.5 demonstrate that the joint CDF F_{X,Y}(x, y) and the joint PDF f_{X,Y}(x, y) represent the same probability model for random variables X and Y. In the case of one random variable, we found in Chapter 4 that the PDF is typically more useful for problem solving. The advantage is even stronger for a pair of random variables.

Referring to Equation (5.3) for the joint CDF F_{X,Y}(x, y), we must evaluate the partial derivative ∂²F_{X,Y}(x, y)/∂x∂y for each of the six regions specified in Equation (5.3). However, ∂²F_{X,Y}(x, y)/∂x∂y is nonzero only if F_{X,Y}(x, y) is a function of both x and y. In this example, only the region {5 ≤ x < 6, 6 ≤ y < 7} meets this requirement.
Over this region,

  f_{X,Y}(x, y) = ∂²/∂x∂y [(x − 5)(y − 6)] = (∂/∂x [x − 5]) (∂/∂y [y − 6]) = 1.    (5.18)

Over all other regions, the joint PDF f_{X,Y}(x, y) is zero.
Of course, not every function f_{X,Y}(x, y) is a joint PDF. Properties (e) and (f) of Theorem 5.1 for the CDF F_{X,Y}(x, y) imply corresponding properties for the PDF.

Given an experiment that produces a pair of continuous random variables X and Y, an event A corresponds to a region of the X, Y plane. The probability of A is the double integral of f_{X,Y}(x, y) over the region A of the X, Y plane.

Example 5.7
Random variables X and Y have joint PDF

                   c    0 ≤ x ≤ 5, 0 ≤ y ≤ 3,
  f_{X,Y}(x,y) =
                   0    otherwise.

Find the constant c and P[A] = P[2 ≤ X < 3, 1 ≤ Y < 3].
The large rectangle in the diagram is the area of nonzero probability. Theorem 5.6 states that the integral of the joint PDF over this rectangle is 1:

  1 = ∫_0^5 ∫_0^3 c dy dx = 15c.    (5.20)

Therefore, c = 1/15, and

  P[A] = ∫_2^3 ∫_1^3 (1/15) dy dx = 2/15.    (5.21)

This probability model is an example of a pair of random variables uniformly distributed over a rectangle in the X, Y plane.
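A numerical check of Example 5.7 (a Python sketch, not part of the text): normalization over the 5 × 3 rectangle forces c = 1/15, and a midpoint Riemann sum over the event rectangle reproduces P[A] = 2/15.

```python
# Example 5.7: f_{X,Y}(x, y) = c on 0 <= x <= 5, 0 <= y <= 3.
c = 1 / 15  # since the normalization integral is c * (5 * 3) = 1

def f(x, y):
    return c if (0 <= x <= 5 and 0 <= y <= 3) else 0.0

# P[A] = P[2 <= X < 3, 1 <= Y < 3] via a midpoint Riemann sum
n = 200
dx = (3 - 2) / n
dy = (3 - 1) / n
p_a = sum(f(2 + (i + 0.5) * dx, 1 + (j + 0.5) * dy) * dx * dy
          for i in range(n) for j in range(n))
print(c, p_a)   # 1/15 and 2/15, i.e. about 0.0667 and 0.1333
```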
The following example derives the CDF of a pair of random variables that has a joint PDF that is easy to write mathematically. The purpose of the example is to introduce techniques for analyzing a more complex probability model than the one in Example 5.7. Typically, we extract interesting information from a model by integrating the PDF or a function of the PDF over some region in the X, Y plane. In performing this integration, the most difficult task is to identify the limits. The PDF in the example is very simple, just a constant over a triangle in the X, Y plane. However, to evaluate its integral over the region in Figure 5.1 we need to consider five different situations depending on the values of (x, y). The solution of the example demonstrates the point that the PDF is usually a more concise probability model that offers more insights into the nature of an experiment than the CDF.
Example 5.8
Find the joint CDF F_{X,Y}(x, y) when X and Y have joint PDF

                   2    0 ≤ y ≤ x ≤ 1,
  f_{X,Y}(x,y) =                          (5.22)
                   0    otherwise.

Figure 5.3  Five cases for the CDF F_{X,Y}(x, y) of Example 5.8.
The remaining situation to consider is shown in Figure 5.3d, when (x, y) is to the right of the triangle of nonzero probability, in which case the integral is

  F_{X,Y}(x, y) = ∫_0^y ∫_v^1 2 du dv = 2y − y².  (Figure 5.3d)    (5.25)
In Example 5.8, it takes careful study to verify that F_{X,Y}(x, y) is a valid CDF that satisfies the properties of Theorem 5.1, or even that it is defined for all values x and y. Comparing the joint PDF with the joint CDF, we see that the PDF indicates clearly that X, Y occurs with equal probability in all areas of the same size in the triangular region 0 ≤ y ≤ x ≤ 1. The joint CDF completely hides this simple, important property of the probability model.

In the previous example, the triangular shape of the area of nonzero probability demanded our careful attention. In the next example, the area of nonzero probability is a rectangle. However, the area corresponding to the event of interest is more complicated.
Example 5.9
As in Example 5.7, random variables X and Y have joint PDF

                   1/15    0 ≤ x ≤ 5, 0 ≤ y ≤ 3,
  f_{X,Y}(x,y) =                                     (5.27)
                   0       otherwise.

For the event A = {Y > X},

  P[A] = ∫_0^3 ∫_x^3 (1/15) dy dx    (5.28)

       = ∫_0^3 (3 − x)/15 dx = −(3 − x)²/30 |_0^3 = 3/10.    (5.29)
In this example, it makes little difference whether we integrate first over y and then over x or the other way around. In general, however, an initial effort to decide the simplest way to integrate over a region can avoid a lot of complicated mathematical maneuvering in performing the integration.
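Either order of integration gives P[A] = 3/10 in Example 5.9; the sketch below (illustrative Python, not from the text) does the inner y-integral exactly and the outer x-integral with a midpoint sum.

```python
# Example 5.9: f = 1/15 on the 5-by-3 rectangle, A = {Y > X}; exact answer 3/10.
n = 3000
dx = 3 / n
p = 0.0
for i in range(n):
    x = (i + 0.5) * dx
    p += (3 - x) / 15 * dx   # inner integral over y in (x, 3], done exactly
print(p)   # close to 0.3
```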
Quiz 5.4
The joint probability density function of random variables X and Y is

5.5 Marginal PDF
We use Theorem 5.8 to find the marginal PDF f_X(x). In the figure that accompanies Equation (5.33) below, the gray bowl-shaped region depicts those values of X and Y for which f_{X,Y}(x, y) > 0. When x < −1 or when x > 1, f_{X,Y}(x, y) = 0, and therefore f_X(x) = 0. For −1 ≤ x ≤ 1, we integrate over the vertical bar X = x, which runs from y = x² to y = 1:

  f_X(x) = ∫_{x²}^{1} (5y/4) dy = 5(1 − x⁴)/8.    (5.33)
            5(1 − x⁴)/8    −1 ≤ x ≤ 1,
  f_X(x) =                                  (5.34)
            0              otherwise.
For the marginal PDF of Y, we note that for y < 0 or y > 1, f_Y(y) = 0. For 0 ≤ y ≤ 1, we integrate over the horizontal bar marked Y = y. The boundaries of the bar are x = −√y and x = √y. Therefore, for 0 ≤ y ≤ 1,

  f_Y(y) = ∫_{−√y}^{√y} (5y/4) dx = (5y/4) x |_{x=−√y}^{x=√y} = 5y^{3/2}/2.    (5.35)
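As a check on Equations (5.33)–(5.35), the following Python sketch (illustrative; it assumes the joint PDF 5y/4 on the region x² ≤ y ≤ 1, as read from the example) integrates out y numerically and compares with the closed form 5(1 − x⁴)/8.

```python
def f_joint(x, y):
    # joint PDF 5y/4 on the bowl-shaped region x**2 <= y <= 1 (assumed model)
    return 5 * y / 4 if x * x <= y <= 1 else 0.0

def f_x(x, n=20000):
    # marginal PDF f_X(x) by numerically integrating out y
    dy = 1.0 / n
    return sum(f_joint(x, (j + 0.5) * dy) * dy for j in range(n))

closed_form = 5 * (1 - 0.5**4) / 8
print(f_x(0.5), closed_form)   # both close to 0.586
```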
Quiz 5.5
The joint probability density function of random variables X and Y is

Applying the idea of independence to random variables, we say that X and Y are independent random variables if and only if the events {X = x} and {Y = y} are independent for all x ∈ S_X and all y ∈ S_Y. In terms of probability mass functions and probability density functions, we have the following definition.
Example 5.11
Are the children's ages X and Y in Example 5.2 independent?

In Example 5.2, we derived the CDFs F_X(x) and F_Y(y), which showed that X is uniform (5, 6) and Y is uniform (6, 7). Thus X and Y have marginal PDFs

            1    5 ≤ x ≤ 6,               1    6 ≤ y ≤ 7,
  f_X(x) =                      f_Y(y) =                      (5.38)
            0    otherwise,               0    otherwise.

Because Definition 5.4 is an equality of functions, it must be true for all values of x and y. It is easily verified that f_{X,Y}(x, y) = f_X(x)f_Y(y) for all pairs (x, y), and so we conclude that X and Y are independent.
Since f_{U,V}(u, v) looks similar in form to f_{X,Y}(x, y) in the previous example, we might suppose that U and V can also be factored into marginal PDFs f_U(u) and f_V(v). However, this is not the case. Owing to the triangular shape of the region of nonzero probability, the marginal PDFs are

Clearly, U and V are not independent. Learning U changes our knowledge of V. For example, learning U = 1/2 informs us that P[V ≤ 1/2] = 1.

In these two examples, we see that the region of nonzero probability plays a crucial role in determining whether random variables are independent. Once again, we emphasize that to infer that X and Y are independent, it is necessary to verify the functional equalities in Definition 5.4 for all x ∈ S_X and y ∈ S_Y. There are many cases in which some events of the form {X = x} and {Y = y} are independent and others are not independent. If this is the case, the random variables X and Y are not independent.

In Examples 5.12 and 5.13, we are given a joint PDF and asked to determine whether the random variables are independent. By contrast, in many applications of probability, the nature of an experiment leads to a model in which X and Y are independent. In these applications we examine an experiment and determine that it is appropriate to model a pair of random variables X and Y as independent. To analyze the experiment, we start with the PDFs f_X(x) and f_Y(y), and then construct the joint PDF f_{X,Y}(x, y) = f_X(x)f_Y(y).
Example 5.14
Consider again the noisy observation model of Example 5.1. Suppose X is a Gaussian (0, σ_X) information signal sent by a radio transmitter and Y = X + Z is the output of a low-noise amplifier attached to the antenna of a radio receiver. The noise Z is a Gaussian (0, σ_Z) random variable that is generated within the receiver. What is the joint PDF f_{X,Z}(x, z)?

  f_{X,Z}(x, z) = f_X(x) f_Z(z) = (1/(2π σ_X σ_Z)) e^{−(x²/(2σ_X²) + z²/(2σ_Z²))}.    (5.42)
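Equation (5.42) is just the product of two Gaussian PDFs. A small Python sketch (illustrative only, with arbitrary values σ_X = 1 and σ_Z = 1/2) evaluates it and confirms that the peak value at the origin is 1/(2π σ_X σ_Z).

```python
import math

def f_xz(x, z, sigma_x=1.0, sigma_z=0.5):
    # Product of independent Gaussian (0, sigma_x) and (0, sigma_z) PDFs
    coeff = 1.0 / (2 * math.pi * sigma_x * sigma_z)
    expo = -(x**2 / (2 * sigma_x**2) + z**2 / (2 * sigma_z**2))
    return coeff * math.exp(expo)

print(f_xz(0.0, 0.0))   # peak value 1/(2*pi*1*0.5) = 1/pi, about 0.318
```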
Quiz 5.6
(A) Random variables X and Y in Example 5.3 and random variables Q and G in Quiz 5.2 have joint PMFs:
There are many situations in which we observe two random variables and use their values to compute a new random variable. For example, we can model the amplitude of the signal transmitted by a radio station as a random variable, X. We can model the attenuation of the signal as it travels to the antenna of a moving car as another random variable, Y. In this case the amplitude of the signal at the radio receiver in the car is the random variable W = X/Y.

Formally, we have the following situation. We perform an experiment and observe sample values of two random variables X and Y. Based on our knowledge of the experiment, we have a probability model for X and Y embodied in a joint PMF P_{X,Y}(x, y) or a joint PDF f_{X,Y}(x, y). After performing the experiment, we calculate a sample value of the random variable W = g(X, Y). W is referred to as a derived random variable. This section identifies important properties of the expected value, E[W]. The probability model for W, embodied in P_W(w) or f_W(w), is the subject of Chapter 6.

As with a function of one random variable, we can calculate E[W] directly from P_{X,Y}(x, y) or f_{X,Y}(x, y) without deriving P_W(w) or f_W(w). Corresponding to Theorems 3.10 and 4.4, we have:
We can break the double summation into n weighted double summations:

By Theorem 5.9, the ith double summation on the right side is E[g_i(X, Y)]; thus,

To complete the proof, we express this integral as the sum of n integrals and recognize that each of the new integrals is a weighted expected value, a_i E[g_i(X, Y)].

In words, Theorem 5.10 says that the expected value of a linear combination equals the linear combination of the expected values. We will have many occasions to apply this theorem. The following theorem describes the expected sum of two random variables, a special case of Theorem 5.10.

  E[X + Y] = E[X] + E[Y].
This theorem implies that we can find the expected sum of two random variables from the separate probability models: P_X(x) and P_Y(y) or f_X(x) and f_Y(y). We do not need a complete probability model embodied in P_{X,Y}(x, y) or f_{X,Y}(x, y).
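The expected-sum property can be checked against the circuit-test PMF of Example 5.3 (a Python sketch, not part of the text): E[X + Y] computed from the joint PMF equals E[X] + E[Y] computed from the marginals.

```python
# Joint PMF of the circuit-test example
P = {(0, 0): 0.01, (1, 0): 0.09, (1, 1): 0.09, (2, 2): 0.81}

e_sum_joint = sum((x + y) * p for (x, y), p in P.items())  # E[X + Y] directly
e_x = sum(x * p for (x, _), p in P.items())                # E[X] from marginal
e_y = sum(y * p for (_, y), p in P.items())                # E[Y] from marginal
print(e_sum_joint, e_x + e_y)   # both close to 3.51
```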
By contrast, the variance of X + Y depends on the entire joint PMF or joint CDF:
We observe that each of the three terms in the preceding expected values is a function of X and Y. Therefore, Theorem 5.10 implies

The expression E[(X − μ_X)(Y − μ_Y)] in the final term of Theorem 5.12 is a parameter of the probability model of X and Y. It reveals important properties of the relationship of X and Y. This quantity appears over and over in practical applications, and it has its own name, covariance.
Example 5.15
A company website has three pages. They require 750 kilobytes, 1500 kilobytes, and 2500 kilobytes for transmission. The transmission speed can be 5 Mb/s for external requests or 10 Mb/s for internal requests. Requests arrive randomly from inside and outside the company independently of page length, which is also random. The probability models for transmission speed, R, and page length, L, are:

            0.4    r = 5,                  0.3    l = 750,
  P_R(r) =  0.6    r = 10,       P_L(l) =  0.5    l = 1500,      (5.49)
            0      otherwise,              0.2    l = 2500,
                                           0      otherwise.

Write an expression for the transmission time g(R, L) seconds. Derive the expected transmission time E[g(R, L)]. Does E[g(R, L)] = g(E[R], E[L])?

The transmission time T seconds is the page length (in kb) divided by the transmission speed, so that g(R, L) = 8L/(1000R), and
  E[g(R, L)] = Σ_r P_R(r) Σ_l P_L(l) · 8l/(1000r)
             = (8/1000) (0.4/5 + 0.6/10) (0.3(750) + 0.5(1500) + 0.2(2500))
             = 1.652 s.    (5.50)

By comparison, E[R] = Σ_r r P_R(r) = 8 Mb/s and E[L] = Σ_l l P_L(l) = 1475 kilobytes. This implies

  g(E[R], E[L]) = 8 E[L] / (1000 E[R]) = 1.475 s ≠ E[g(R, L)].    (5.51)
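The computation in Example 5.15 is a double weighted sum and is easy to reproduce; this Python sketch (illustrative only) evaluates E[g(R, L)] and contrasts it with g(E[R], E[L]).

```python
PL = {750: 0.3, 1500: 0.5, 2500: 0.2}   # page length in kilobytes
PR = {5: 0.4, 10: 0.6}                  # transmission speed in Mb/s

def g(r, l):
    # transmission time in seconds: 8 bits/byte, 1000 kb per Mb
    return 8 * l / (1000 * r)

e_g = sum(pr * pl * g(r, l) for r, pr in PR.items() for l, pl in PL.items())
e_r = sum(r * p for r, p in PR.items())
e_l = sum(l * p for l, p in PL.items())
print(e_g, g(e_r, e_l))   # 1.652 s versus 1.475 s
```

The gap between the two numbers illustrates that, in general, E[g(R, L)] ≠ g(E[R], E[L]).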
Sometimes, the notation σ_{XY} is used to denote the covariance of X and Y. We have already learned that the expected value parameter, E[X], is a typical value of X and that the variance parameter, Var[X], is a single number that describes how samples of X tend to be spread around the expected value E[X]. In an analogous way, the covariance parameter Cov[X, Y] is a single number that describes how the pair of random variables X and Y vary together.

The key to understanding covariance is the random variable W = (X − μ_X)(Y − μ_Y). Since Cov[X, Y] = E[W], we observe that Cov[X, Y] > 0 tells us that the typical values of (X − μ_X)(Y − μ_Y) are positive. However, this is equivalent to saying that X − μ_X and Y − μ_Y typically have the same sign. That is, if X > μ_X then we would
Note that the covariance has units equal to the product of the units of X and Y. Thus, if X has units of kilograms and Y has units of seconds, then Cov[X, Y] has units of kilogram-seconds. By contrast, ρ_{X,Y} is a dimensionless quantity that is not affected by scale changes.

Theorem 5.13
If X̂ = aX + b and Ŷ = cY + d, then
(a) ρ_{X̂,Ŷ} = ρ_{X,Y},
(b) Cov[X̂, Ŷ] = ac Cov[X, Y].
(a) ρ_{X,Y} = −0.9    (b) ρ_{X,Y} = 0    (c) ρ_{X,Y} = 0.9

Figure 5.5  Each graph has 200 samples, each marked by a dot, of the random variable pair (X, Y) such that E[X] = E[Y] = 0, Var[X] = Var[Y] = 1.
The proof steps are outlined in Problem 5.8.9. Related to this insensitivity of ρ_{X,Y} to scale changes, an important property of the correlation coefficient is that it is bounded by −1 and 1:

  −1 ≤ ρ_{X,Y} ≤ 1.

Proof  Let σ_X² and σ_Y² denote the variances of X and Y, and for a constant a, let W = X − aY. Then

  Var[W] = Var[X] − 2a Cov[X, Y] + a² Var[Y].

Since Var[W] ≥ 0 for any a, we have 2a Cov[X, Y] ≤ Var[X] + a² Var[Y]. Choosing a = σ_X/σ_Y yields Cov[X, Y] ≤ σ_Y σ_X, which implies ρ_{X,Y} ≤ 1. Choosing a = −σ_X/σ_Y yields Cov[X, Y] ≥ −σ_Y σ_X, which implies ρ_{X,Y} ≥ −1.
When ρX,Y > 0, we say that X and Y are positively correlated, and when ρX,Y < 0 we say X and Y are negatively correlated. If |ρX,Y| is close to 1, say |ρX,Y| ≥ 0.9, then X and Y are highly correlated. Note that high correlation can be positive or negative. Figure 5.5 shows outcomes of independent trials of an experiment that produces random variables X and Y for random variable pairs with (a) negative correlation, (b) zero correlation, and (c) positive correlation. The following theorem demonstrates that |ρX,Y| = 1 when there is a linear relationship between X and Y.
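A quick simulation, in the spirit of Figure 5.5, illustrates both the bound |ρX,Y| ≤ 1 and positive versus negative correlation. This Python sketch (an illustrative stand-in for the book's MATLAB) generates Y = ρX + √(1 − ρ²)Z with X, Z iid Gaussian (0, 1), so the true correlation coefficient is ρ:

```python
import random, math

def sample_rho(rho, n=200000, seed=7):
    # Generate n pairs (X, Y) with correlation coefficient rho and
    # return the sample correlation coefficient.
    rng = random.Random(seed)
    sx = sy = sxx = syy = sxy = 0.0
    for _ in range(n):
        x = rng.gauss(0, 1)
        y = rho * x + math.sqrt(1 - rho * rho) * rng.gauss(0, 1)
        sx += x; sy += y; sxx += x * x; syy += y * y; sxy += x * y
    mx, my = sx / n, sy / n
    cov = sxy / n - mx * my
    vx, vy = sxx / n - mx * mx, syy / n - my * my
    return cov / math.sqrt(vx * vy)

for rho in (-0.9, 0.0, 0.9):
    r = sample_rho(rho)
    print(rho, round(r, 3), -1.0 <= r <= 1.0)
```

Each sample correlation lands close to the target ρ and, as the theorem requires, inside [−1, 1].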
Definition 5.7  Correlation
The correlation of X and Y is rX,Y = E[XY].
The following theorem contains useful relationships among three expected values: the covariance of X and Y, the correlation of X and Y, and the variance of X + Y.
Proof  Cross-multiplying inside the expected value of Definition 5.5 yields
Cov[X, Y] = E[XY − μY X − μX Y + μX μY].
Note that in the expression E[μY X], μY is a constant. Referring to Theorem 3.12, we set a = μY and b = 0 to obtain E[μY X] = μY E[X] = μY μX. The same reasoning demonstrates that E[μX Y] = μX E[Y] = μX μY. Therefore,
Cov[X, Y] = E[XY] − μY μX − μX μY + μX μY = rX,Y − μX μY.
The other relationships follow directly from the definitions and Theorem 5.12.
Example 5.11
For the integrated circuits tests in Example 5.3, we found in Example 5.5 that the probability model for X and Y is given by the following matrix.

PX,Y(x, y)   y = 0   y = 1   y = 2   PX(x)
x = 0        0.01    0       0       0.01
x = 1        0.09    0.09    0       0.18
x = 2        0       0       0.81    0.81
PY(y)        0.10    0.09    0.81
rX,Y = E[XY] = Σ_{x=0}^{2} Σ_{y=0}^{2} x y PX,Y(x, y)   (5.59)
     = (1)(1)(0.09) + (2)(2)(0.81) = 3.33.   (5.60)
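The sum in Equation (5.59) is mechanical, so it is a natural candidate for a short script. The following Python fragment (an analogue of the chapter's MATLAB style, not the book's code) stores the nonzero entries of the PMF of Example 5.11 and computes rX,Y together with Cov[X, Y] = rX,Y − μX μY from Theorem 5.16(a):

```python
# Nonzero entries of the joint PMF in Example 5.11: {(x, y): P[X=x, Y=y]}.
pmf = {(0, 0): 0.01, (1, 0): 0.09, (1, 1): 0.09, (2, 2): 0.81}

r_xy = sum(x * y * p for (x, y), p in pmf.items())  # r_XY = E[XY]
ex = sum(x * p for (x, y), p in pmf.items())        # E[X] = 1.80
ey = sum(y * p for (x, y), p in pmf.items())        # E[Y] = 1.71
cov = r_xy - ex * ey                                # Theorem 5.16(a)

print(round(r_xy, 2))  # 3.33, matching Equation (5.60)
print(round(cov, 3))   # 0.252
```

Note that X and Y are positively correlated here: accepting more chips in the first tests goes with accepting more later.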
The terms orthogonal and uncorrelated describe random variables for which rX,Y = 0 and random variables for which Cov[X, Y] = 0, respectively. This terminology, while widely used, is somewhat confusing, since orthogonal means zero correlation and uncorrelated means zero covariance.
Theorem 5.17
For independent random variables X and Y,
(a) E[g(X)h(Y)] = E[g(X)] E[h(Y)],
(b) rX,Y = E[XY] = E[X] E[Y],
(c) Cov[X, Y] = ρX,Y = 0,
(d) Var[X + Y] = Var[X] + Var[Y].
Proof  We present the proof for discrete random variables. By replacing PMFs and sums with PDFs and integrals we arrive at essentially the same proof for continuous random variables. Since PX,Y(x, y) = PX(x)PY(y),
E[g(X)h(Y)] = Σ_x Σ_y g(x)h(y) PX(x)PY(y) = ( Σ_x g(x)PX(x) )( Σ_y h(y)PY(y) ) = E[g(X)] E[h(Y)].
If g(X) = X and h(Y) = Y, this equation implies rX,Y = E[XY] = E[X] E[Y]. This equation and Theorem 5.16(a) imply Cov[X, Y] = 0. As a result, Theorem 5.16(b) implies Var[X + Y] = Var[X] + Var[Y]. Furthermore, ρX,Y = Cov[X, Y]/(σX σY) = 0.
These results all follow directly from the joint PMF for independent random variables. We observe that Theorem 5.17(c) states that independent random variables are uncorrelated. We will have many occasions to refer to this property. It is important to know that while Cov[X, Y] = 0 is a necessary property for independence, it is not sufficient. There are many pairs of uncorrelated random variables that are not independent.
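A standard counterexample makes the last point concrete. In this Python sketch (illustrative, not from the text), X is uniform on {−1, 0, 1} and Y = X²; Y is a deterministic function of X, yet the pair is uncorrelated:

```python
from fractions import Fraction

# X uniform on {-1, 0, 1}; Y = X^2. Y is a function of X, so X and Y
# are clearly dependent, yet Cov[X, Y] = 0.
pmf = {x: Fraction(1, 3) for x in (-1, 0, 1)}

ex  = sum(x * p for x, p in pmf.items())            # E[X] = 0
ey  = sum(x * x * p for x, p in pmf.items())        # E[Y] = E[X^2]
exy = sum(x * (x * x) * p for x, p in pmf.items())  # E[XY] = E[X^3] = 0
cov = exy - ex * ey
print(cov)  # 0 -> uncorrelated

# Dependence: P[X = 1, Y = 0] = 0, but P[X = 1] * P[Y = 0] = 1/9.
p_joint = Fraction(0)
p_prod = pmf[1] * pmf[0]   # P[Y = 0] = P[X = 0] = 1/3
print(p_joint == p_prod)   # False -> not independent
```

The exact-arithmetic Fraction type avoids any floating-point ambiguity in the zero covariance.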
Example 5.18
We recall from Example 5.14 that the signal X is Gaussian (0, σX), that the noise Z is Gaussian (0, σZ), and that X and Z are independent. We know from Theorem 5.17(c) that X and Z are uncorrelated. Since E[X] = E[Z] = 0, Theorem 5.11 tells us that E[Y] = E[X] + E[Z] = 0 and Theorem 5.17(b) says that E[XZ] = E[X] E[Z] = 0. This permits us to write
ρX,Y = Cov[X, Y] / √(Var[X] Var[Y]) = σX² / (σX √(σX² + σZ²)) = 1 / √(1 + σZ²/σX²).   (5.65)
We see in Example 5.18 that the covariance between the transmitted signal X and the received signal Y depends on the ratio σX²/σZ². This ratio, referred to as the signal-to-noise ratio, has a strong effect on communication quality. If σX²/σZ² ≪ 1, the correlation of X and Y is weak and the noise dominates the signal at the receiver. Learning y, a sample of the received signal, is not very helpful in determining the corresponding sample of the transmitted signal, x. On the other hand, if σX²/σZ² ≫ 1, the transmitted signal dominates the noise and ρX,Y ≈ 1, an indication of a close relationship between X and Y. When there is strong correlation between X and Y, learning y is very helpful in determining x.
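The effect of the signal-to-noise ratio on ρX,Y can be checked by simulation. The Python sketch below (a stand-in for the book's MATLAB) draws X Gaussian (0, σX) and Z Gaussian (0, σZ), forms Y = X + Z, and compares the sample correlation coefficient with the exact value σX/√(σX² + σZ²) for this model:

```python
import random, math

def rho_hat(sigma_x, sigma_z, n=200000, seed=3):
    # Simulate Y = X + Z and estimate the correlation coefficient of (X, Y).
    rng = random.Random(seed)
    sx = sy = sxx = syy = sxy = 0.0
    for _ in range(n):
        x = rng.gauss(0, sigma_x)
        y = x + rng.gauss(0, sigma_z)
        sx += x; sy += y; sxx += x * x; syy += y * y; sxy += x * y
    mx, my = sx / n, sy / n
    cov = sxy / n - mx * my
    return cov / math.sqrt((sxx / n - mx * mx) * (syy / n - my * my))

for sig_x, sig_z in ((1.0, 10.0), (1.0, 1.0), (10.0, 1.0)):
    theory = sig_x / math.sqrt(sig_x ** 2 + sig_z ** 2)
    print(round(theory, 3), round(rho_hat(sig_x, sig_z), 3))
```

The three cases correspond to noise-dominated, balanced, and signal-dominated channels; the estimated ρ moves from near 0 toward 1 as the signal-to-noise ratio grows.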
Quiz 5.8
Definition 5.10  Bivariate Gaussian Random Variables
fX,Y(x, y) = (1 / (2π σX σY √(1 − ρX,Y²))) exp[ −( ((x − μX)/σX)² − 2ρX,Y (x − μX)(y − μY)/(σX σY) + ((y − μY)/σY)² ) / (2(1 − ρX,Y²)) ].
Figure 5.6 illustrates the bivariate Gaussian PDF for μX = μY = 0, σX = σY = 1, and three values of ρX,Y = ρ. When ρ = 0, the joint PDF has the circular symmetry of a sombrero. When ρ = 0.9, the joint PDF forms a ridge over the line x = y, and when ρ = −0.9 there is a ridge over the line x = −y. The ridge becomes increasingly steep as |ρ| approaches 1. Adjacent to each PDF, we repeat the graphs in Figure 5.5; each graph shows 200 sample pairs (X, Y) drawn from that bivariate Gaussian PDF. We see that the sample pairs are clustered in the region of the x, y plane where the PDF is large.
To examine mathematically the properties of the bivariate Gaussian PDF, we define
μ̃Y(x) = μY + ρX,Y (σY/σX)(x − μX),   σ̃Y = σY √(1 − ρX,Y²),   (5.67)
Figure 5.6  The joint Gaussian PDF fX,Y(x, y) for μX = μY = 0, σX = σY = 1, and three values of ρX,Y = ρ (0.9, 0, and −0.9). Next to each PDF, we plot 200 sample pairs (X, Y) generated with that PDF.
and manipulate the formula in Definition 5.10 to obtain the following expression for the joint Gaussian PDF:
fX,Y(x, y) = (1/(σX √(2π))) e^{−(x − μX)²/2σX²} · (1/(σ̃Y √(2π))) e^{−(y − μ̃Y(x))²/2σ̃Y²}.   (5.68)
Equation (5.68) expresses fX,Y(x, y) as the product of two Gaussian PDFs, one with parameters μX and σX and the other with parameters μ̃Y(x) and σ̃Y. This formula plays a key role in the proof of the following theorem.
fX(x) = ∫_{−∞}^{∞} fX,Y(x, y) dy = (1/(σX √(2π))) e^{−(x − μX)²/2σX²} ∫_{−∞}^{∞} (1/(σ̃Y √(2π))) e^{−(y − μ̃Y(x))²/2σ̃Y²} dy.   (5.69)
The remaining integral equals 1 because it integrates a Gaussian PDF over its entire range; hence fX(x) is the Gaussian (μX, σX) PDF.
The next theorem identifies ρX,Y in Definition 5.10 as the correlation coefficient of X and Y.
The proof of Theorem 5.19 involves algebra that is more easily digested with some insight from Chapter 7; see Section 7.6 for the proof.
From Theorem 5.19, we observe that if X and Y are uncorrelated, then ρX,Y = 0 and, by evaluating the PDF in Definition 5.10 with ρX,Y = 0, we have fX,Y(x, y) = fX(x) fY(y). Thus we have the following theorem.
Theorem 5.20
Bivariate Gaussian random variables X and Y are uncorrelated if and only if they are independent.
Theorem 5.21
then W1 and W2 are bivariate Gaussian random variables such that
Theorem 5.21 is a special case of Theorem 8.11 when we have n = 2 jointly Gaussian random variables. We omit the proof since the proof of Theorem 8.11 for n jointly Gaussian random variables is, with some knowledge of linear algebra, simpler. The requirement that the equations for W1 and W2 be "linearly independent" is linear algebra terminology that excludes degenerate cases such as W1 = X + 2Y and W2 = 3X + 6Y, where W2 = 3W1 is just a scaled replica of W1.
Theorem 5.21 is powerful. Even the partial result that Wi by itself is Gaussian is a nontrivial conclusion. When an experiment produces linear combinations of Gaussian random variables, knowing that these combinations are Gaussian simplifies the analysis because all we need to do is calculate the expected values, variances, and covariances of the outputs in order to derive probability models.
Example 5.19
For the noisy observation in Example 5.14, find the PDF of Y = X + Z.
Since X is Gaussian (0, σX) and Z is Gaussian (0, σZ) and X and Z are independent, X and Z are jointly Gaussian. It follows from Theorem 5.21 that Y is Gaussian with E[Y] = E[X] + E[Z] = 0 and variance σY² = σX² + σZ². The PDF of Y is
fY(y) = (1/√(2π(σX² + σZ²))) e^{−y²/2(σX² + σZ²)}.
Example 5.20
Continuing Example 5.19, find the joint PDF of X and Y when σX = 4 and σZ = 3.
From Theorem 5.21, we know that X and Y are bivariate Gaussian. We also know that μX = μY = 0 and that Y has variance σY² = σX² + σZ² = 25. Substituting σX = 4 and σZ = 3 in the formula for the correlation coefficient derived in Example 5.18, we have
ρX,Y = 1/√(1 + σZ²/σX²) = 4/5.   (5.71)
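The arithmetic in Examples 5.19 and 5.20 can be confirmed in a couple of lines. This Python check (illustrative only) uses σX = 4 and σZ = 3:

```python
import math

# Parameters from Examples 5.19-5.20: signal X and noise Z are
# independent zero-mean Gaussians with these standard deviations.
sigma_x, sigma_z = 4.0, 3.0

sigma_y = math.sqrt(sigma_x ** 2 + sigma_z ** 2)  # Var[Y] = 16 + 9 = 25
rho = sigma_x / math.sqrt(sigma_x ** 2 + sigma_z ** 2)

print(sigma_y)  # 5.0
print(rho)      # 0.8
```

The standard deviation of Y is 5 and the correlation coefficient is 4/5, consistent with Equation (5.71).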
Definition 5.11 is concise and general. It provides a complete probability model regardless of whether any or all of the Xi are discrete, continuous, or mixed. However, the joint CDF is usually not convenient to use in analyzing practical probability models. Instead, we use the joint PMF or the joint PDF.
fX1,...,Xn(x1, ..., xn) = ∂ⁿ FX1,...,Xn(x1, ..., xn) / (∂x1 ⋯ ∂xn).
Theorems 5.22 and 5.23 indicate that the joint PMF and the joint PDF have properties that are generalizations of the axioms of probability.
Theorem 5.22
If X1, ..., Xn are discrete random variables with joint PMF PX1,...,Xn(x1, ..., xn),
(a) PX1,...,Xn(x1, ..., xn) ≥ 0,
(b) Σ_{x1} ⋯ Σ_{xn} PX1,...,Xn(x1, ..., xn) = 1.
Theorem 5.23
If X1, ..., Xn are continuous random variables with joint PDF fX1,...,Xn(x1, ..., xn),
(a) fX1,...,Xn(x1, ..., xn) ≥ 0,
(b) FX1,...,Xn(x1, ..., xn) = ∫_{−∞}^{x1} ⋯ ∫_{−∞}^{xn} fX1,...,Xn(u1, ..., un) du1 ⋯ dun,
(c) ∫_{−∞}^{∞} ⋯ ∫_{−∞}^{∞} fX1,...,Xn(x1, ..., xn) dx1 ⋯ dxn = 1.
Theorem 5.24
The probability of an event A expressed in terms of the random variables X1, ..., Xn is
Discrete:    P[A] = Σ_{(x1,...,xn)∈A} PX1,...,Xn(x1, ..., xn),
Continuous:  P[A] = ∫⋯∫_A fX1,...,Xn(x1, ..., xn) dx1 dx2 ⋯ dxn.
Although we have written the discrete version of Theorem 5.24 with a single summation, we must remember that in fact it is a multiple sum over the n variables x1, ..., xn.
Table 5.1  The PMF PX,Y,Z(x, y, z) and the events A and B for Example 5.22.
Example 5.21
Consider a set of n independent trials in which there are r possible outcomes s1, ..., sr for each trial. In each trial, P[si] = pi. Let Ni equal the number of times that outcome si occurs over n trials. What is the joint PMF of N1, ..., Nr?
The solution to this problem appears in Theorem 2.9 and is repeated here:
PN1,...,Nr(n1, ..., nr) = ( n! / (n1! n2! ⋯ nr!) ) p1^{n1} p2^{n2} ⋯ pr^{nr},   n1 + ⋯ + nr = n.   (5.73)
The downloads are independent trials, each with three possible outcomes: L = 1, L = 2, and L = 3. Hence, the probability model of the number of downloads of each length is the multinomial PMF
PX,Y,Z(x, y, z) = ( 4! / (x! y! z!) ) (1/3)^x (1/2)^y (1/6)^z,   x + y + z = 4.   (5.74)
The PMF is displayed numerically in Table 5.1. The final column of the table indicates that there are three outcomes in event A and 12 outcomes in event B. Adding the probabilities in the two events, we have P[A] = 107/432 and P[B] = 8/9.
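The multinomial PMF of Example 5.22 is easy to tabulate by brute force. The Python sketch below (a hypothetical helper, not the book's code) assumes per-download page probabilities 1/3, 1/2, and 1/6 for L = 1, 2, 3, as in the example, and verifies that the PMF sums to 1 and that the marginal of X is binomial:

```python
from fractions import Fraction
from math import factorial
from itertools import product

# Joint PMF of the page counts in four downloads (x + y + z = 4),
# with page probabilities 1/3, 1/2, 1/6 for lengths 1, 2, 3.
p1, p2, p3 = Fraction(1, 3), Fraction(1, 2), Fraction(1, 6)

def pmf(x, y, z):
    if x + y + z != 4:
        return Fraction(0)
    coef = factorial(4) // (factorial(x) * factorial(y) * factorial(z))
    return coef * p1**x * p2**y * p3**z

total = sum(pmf(x, y, z) for x, y, z in product(range(5), repeat=3))
print(total)  # 1

# Marginal of X is binomial (4, 1/3): P[X = 0] = (2/3)^4.
px0 = sum(pmf(0, y, z) for y, z in product(range(5), repeat=2))
print(px0 == Fraction(2, 3) ** 4)  # True
```

Exact Fraction arithmetic makes the normalization check exact rather than approximate.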
In analyzing an experiment, we might wish to study some of the random variables and ignore other ones. To accomplish this, we can derive marginal PMFs or marginal PDFs that are probability models for a fraction of the random variables in the complete experiment. Consider an experiment with four random variables W, X, Y, Z. The probability model for the experiment is the joint PMF, PW,X,Y,Z(w, x, y, z), or the joint PDF, fW,X,Y,Z(w, x, y, z). The following theorems give examples of marginal PMFs and PDFs.
Theorem 5.26
For a joint PDF fW,X,Y,Z(w, x, y, z) of continuous random variables W, X, Y, Z, some marginal PDFs are
Theorems 5.25 and 5.26 can be generalized in a straightforward way to any marginal PMF or marginal PDF of an arbitrary number of random variables. For a probability model described by the set of random variables {X1, ..., Xn}, each nonempty strict subset of those random variables has a marginal probability model. There are 2^n subsets of {X1, ..., Xn}. After excluding the entire set and the null set ∅, we find that there are 2^n − 2 marginal probability models.
[
f y,,Y. (yi, 'Y4) = 1: 1: f y., ,Y4 (Yi, ... , Y4) dyz dy3 . (5 .76)
In the foregoing integral, the hard part is ident ifying the correct limits . Th ese lim its
will depend on YI and y4. For 0 <YI < 1 and 0 < y4 < 1,
(5 .77)
(5 .79)
T he complete expression is
Example 5.22 demonstrates that a fairly simple experiment can generate a joint PMF that, in table form, is perhaps surprisingly long. In fact, a practical experiment often generates a joint PMF or PDF that is forbiddingly complex. The important exception is an experiment that produces n independent random variables. The following definition extends the definition of independence of two random variables. It states that X1, ..., Xn are independent when the joint PMF or PDF can be factored into a product of n marginal PMFs or PDFs.
Discrete:    PX1,...,Xn(x1, ..., xn) = PX1(x1) PX2(x2) ⋯ PXn(xn),
Continuous:  fX1,...,Xn(x1, ..., xn) = fX1(x1) fX2(x2) ⋯ fXn(xn).
Example 5.24
The random variables X1, ..., Xn have the joint PDF
fX1,...,Xn(x1, ..., xn) = { 1   0 ≤ xi ≤ 1, i = 1, ..., n;   0 otherwise.
Let A denote the event that max_i Xi ≤ 1/2. Find P[A].
We can solve this problem by applying Theorem 5.24:
P[A] = P[max_i Xi ≤ 1/2] = ∫_0^{1/2} ⋯ ∫_0^{1/2} dx1 ⋯ dxn = (1/2)^n.   (5.84)
As n grows, the probability that the maximum is less than 1/2 rapidly goes to 0. We note that inspection of the joint PDF reveals that X1, ..., Xn are iid continuous uniform (0, 1) random variables. The integration in Equation (5.84) is easy because independence implies
fX1,...,Xn(x1, ..., xn) = fX1(x1) ⋯ fXn(xn).
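The independence argument of Example 5.24 is easy to test by simulation. This Python sketch (illustrative, not the book's code) estimates P[max_i Xi < 1/2] for iid uniform (0, 1) samples and compares the estimate with (1/2)^n:

```python
import random

# Estimate P[max(X_1, ..., X_n) < 1/2] for iid uniform (0, 1) samples.
def sim(n, trials=200000, seed=11):
    rng = random.Random(seed)
    hits = sum(1 for _ in range(trials)
               if max(rng.random() for _ in range(n)) < 0.5)
    return hits / trials

for n in (1, 2, 4):
    print(n, 0.5 ** n, round(sim(n), 3))
```

For each n, the simulated frequency tracks the product-form answer (1/2)^n, and the probability visibly collapses as n grows.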
5.11 MATLAB
We start with the case when X and Y are finite random variables with ranges
SX = {x1, ..., xn},   SY = {y1, ..., ym}.
In this case, we can take advantage of MATLAB techniques for surface plots of g(x, y) over the x, y plane. In MATLAB, we represent SX and SY by the n-element vector sx and m-element vector sy. The function [SX,SY]=ndgrid(sx,sy) produces the pair of n × m matrices,
SX = [ x1 ⋯ x1 ]        SY = [ y1 ⋯ ym ]
     [  ⋮     ⋮ ]             [  ⋮     ⋮ ]
     [ xn ⋯ xn ]             [ y1 ⋯ ym ]   (5.88)
We refer to the matrices SX and SY as a sample space grid because they are a grid representation of the joint sample space.
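For readers working outside MATLAB, the ndgrid construction is easy to reproduce. This small Python function (an illustrative stand-in; NumPy's meshgrid with indexing='ij' does the same job) builds the sample space grid from the two range vectors:

```python
# Pure-Python analogue of [SX, SY] = ndgrid(sx, sy): SX[i][j] = sx[i] and
# SY[i][j] = sy[j], so the pairs (SX[i][j], SY[i][j]) cover the whole grid.
def ndgrid(sx, sy):
    SX = [[x for _ in sy] for x in sx]
    SY = [[y for y in sy] for _ in sx]
    return SX, SY

SX, SY = ndgrid([800, 1200, 1600], [400, 800, 1200])
print(SX)  # [[800, 800, 800], [1200, 1200, 1200], [1600, 1600, 1600]]
print(SY)  # [[400, 800, 1200], [400, 800, 1200], [400, 800, 1200]]
```

The printed matrices match the SX and SY produced by the imagepmf.m script below.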
Example 5.25
An Internet photo developer website prints compressed photo images. Each image file contains a variable-sized image of X × Y pixels described by the joint PMF.
For random variables X, Y, write a script imagepmf.m that defines the sample space grid matrices SX, SY, and PXY.
In the script imagepmf.m, the matrix SX has [800 1200 1600]' for each column and SY has [400 800 1200] for each row. After running imagepmf.m, we can inspect the variables:

%imagepmf.m
PXY=[0.2 0.05 0.1; 0.05 0.2 0.1; 0 0.1 0.2];
[SX,SY]=ndgrid([800 1200 1600],[400 800 1200]);

>> imagepmf; SX
SX =
    800   800   800
   1200  1200  1200
   1600  1600  1600
>> SY
SY =
    400   800  1200
    400   800  1200
    400   800  1200
Example 5.26
At 24 bits (3 bytes) per pixel, a 10:1 image compression factor yields image files with B = 0.3XY bytes. Find the expected value E[B] and the PMF PB(b).
In particular, the ith pair, SX(i),SY(i), will occur with probability PXY(i). The output xy will be an m × 2 matrix such that each row represents a sample pair x, y.
The function imagerv uses the imagepmf.m script to define the matrices SX, SY, and PXY. It then calls the finiterv.m function. Here is the code imagerv.m and a sample run:

function xy = imagerv(m);
imagepmf;
S=[SX(:) SY(:)];
xy=finiterv(S,PXY(:),m);

>> xy=imagerv(3)
xy =
    800   400
   1200   800
   1600   800
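The finiterv idea, drawing pairs according to a finite joint PMF, has a compact Python analogue using random.choices. Everything below (including the function name) is an illustrative sketch rather than the book's code:

```python
import random

# Draw m sample pairs (x, y) from a finite joint PMF given as parallel
# lists of support points and probabilities.
def finite_rv_pairs(support, probs, m, seed=5):
    rng = random.Random(seed)
    return rng.choices(support, weights=probs, k=m)

# Support and probabilities matching the PXY matrix of Example 5.25:
# rows of PXY index x in {800, 1200, 1600}, columns index y in {400, 800, 1200}.
support = [(800, 400), (800, 800), (800, 1200),
           (1200, 400), (1200, 800), (1200, 1200),
           (1600, 400), (1600, 800), (1600, 1200)]
probs = [0.2, 0.05, 0.1, 0.05, 0.2, 0.1, 0.0, 0.1, 0.2]

xy = finite_rv_pairs(support, probs, 3)
print(len(xy))                        # 3
print(all(p in support for p in xy))  # True
```

As with imagerv, each call returns one row per requested sample, and zero-probability pairs such as (1600, 400) are never drawn.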
Example 5.27 can be generalized to produce sample pairs for any discrete random variable pair X, Y. However, given a collection of, for example, m = 10,000 samples of X, Y, it is desirable to be able to check whether the code generates the sample pairs properly. In particular, we wish to check for each x ∈ SX and y ∈ SY whether the relative frequency of x, y in m samples is close to PX,Y(x, y). In the following example, we develop a program to calculate a matrix of relative frequencies that corresponds to the matrix PXY.
Example 5.28
Given a list xy of sample pairs of random variables X, Y with MATLAB range grids SX and SY, write a MATLAB function fxy=freqxy(xy,SX,SY) that calculates the relative frequency of every pair x, y. The output fxy should correspond to the matrix [SX(:) SY(:) PXY(:)].
MATLAB provides the function stem3(x,y,z), where x, y, and z are length-n vectors, for visualizing a bivariate PMF PX,Y(x, y) or for visualizing relative frequencies of sample values of a pair of random variables. At each position x(i), y(i) on the x, y plane, the function draws a stem of height z(i).
The script imagestem.m generates the following relative frequency stem plot.

%imagestem.m
imagepmf;
xy=imagerv(10000);
fxy=freqxy(xy,SX,SY);
stem3(fxy(:,1),fxy(:,2),fxy(:,3));
xlabel('\it x');
ylabel('\it y');
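The relative-frequency check that freqxy performs can be sketched in Python with a Counter (again an illustrative analogue, not the text's MATLAB): draw 10,000 pairs from the PMF of Example 5.25 and confirm that every relative frequency is close to the corresponding probability:

```python
import random
from collections import Counter

# Joint PMF of Example 5.25; the zero-probability pair (1600, 400) is omitted.
pmf = {(800, 400): 0.2, (800, 800): 0.05, (800, 1200): 0.1,
       (1200, 400): 0.05, (1200, 800): 0.2, (1200, 1200): 0.1,
       (1600, 800): 0.1, (1600, 1200): 0.2}

rng = random.Random(9)
pairs = rng.choices(list(pmf), weights=pmf.values(), k=10000)
freq = Counter(pairs)

# Largest gap between a relative frequency and its probability.
max_err = max(abs(freq[p] / 10000 - q) for p, q in pmf.items())
print(max_err < 0.02)  # True: frequencies track the PMF
```

With 10,000 samples, each relative frequency typically lands within a few thousandths of its probability, which is the same sanity check the stem plot makes visually.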
For continuous random variables, MATLAB can be useful in a variety of ways. Some of these are obvious. For example, a joint PDF fX,Y(x, y) or CDF FX,Y(x, y) can be viewed using the function plot3. Figure 5.4 was generated this way. However, for generating sample pairs of continuous random variables, there are no general techniques such as the sample space grids we employed with discrete random variables.
When we introduced continuous random variables in Chapter 4, we also introduced families of widely used random variables. In Section 4.8, we provided a collection of MATLAB functions such as x=erlangrv(n,lambda,m) to generate m samples from the corresponding PDF. However, for pairs of continuous random variables, we introduced only one family of probability models, namely the bivariate Gaussian random variables X and Y. For the bivariate Gaussian model, we can use Theorem 5.21 and the randn function to generate sample values. The
command Z=randn(2,1) returns the vector Z = [Z1 Z2]′, where Z1 and Z2 are iid Gaussian (0, 1) random variables. Next we form the linear combinations
W1 = σ1 Z1,   (5.91a)
W2 = ρ σ2 Z1 + σ2 √(1 − ρ²) Z2.   (5.91b)
From Theorem 5.21 we know that W1 and W2 are a bivariate Gaussian pair. In addition, from the formulas given in Theorem 5.21, we can show that E[W1] = E[W2] = 0, Var[W1] = σ1², Var[W2] = σ2², and ρW1,W2 = ρ. This implies that
X1 = W1 + μ1,   X2 = W2 + μ2   (5.92)
is a pair of bivariate Gaussian random variables with E[Xi] = μi, Var[Xi] = σi², and ρX1,X2 = ρ. We implement this algorithm that transforms the iid pair Z1, Z2 into the bivariate Gaussian pair X1, X2 in the MATLAB function gauss2rv.
We observe that this example with ρX,Y = 0.5 shows random variables that are less correlated than the examples in Figure 5.5 with |ρ| = 0.9.
We note that bivariate Gaussian random variables are a special case of n-dimensional Gaussian random vectors, which are introduced in Chapter 8. Based on linear algebra techniques, Chapter 8 introduces the gaussvector function to generate samples of Gaussian random vectors; it generalizes gauss2rv to n dimensions.
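The Z → W → X transformation above is straightforward to port. The following Python sketch (an illustrative analogue of gauss2rv, with hypothetical argument names) generates bivariate Gaussian pairs and checks the sample covariance for ρ = 0.5, the case discussed above:

```python
import random, math

# Transform iid Gaussian (0, 1) pairs Z1, Z2 into bivariate Gaussian
# pairs X1, X2 with means mu1, mu2, std devs s1, s2, and correlation rho.
def gauss2rv(mu1, mu2, s1, s2, rho, m, seed=13):
    rng = random.Random(seed)
    out = []
    for _ in range(m):
        z1, z2 = rng.gauss(0, 1), rng.gauss(0, 1)
        w1 = s1 * z1                                        # Eq. (5.91a)
        w2 = rho * s2 * z1 + s2 * math.sqrt(1 - rho ** 2) * z2
        out.append((mu1 + w1, mu2 + w2))                    # Eq. (5.92)
    return out

xy = gauss2rv(0.0, 0.0, 1.0, 1.0, 0.5, 200000)
n = len(xy)
mx = sum(x for x, _ in xy) / n
my = sum(y for _, y in xy) / n
cov = sum((x - mx) * (y - my) for x, y in xy) / n
print(round(cov, 2))  # sample covariance near rho = 0.5
```

With unit variances, the covariance equals the correlation coefficient, so the sample covariance estimates ρ directly.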
Beyond bivariate Gaussian pairs, there exist a variety of techniques for generating sample values of pairs of continuous random variables of specific types. A basic approach is to generate X based on the marginal PDF fX(x) and then generate Y using a conditional probability model that depends on the value of X. Conditional probability models and MATLAB techniques that employ these models are the subject of Chapter 7.
Problems
Difficulty:  Easy · Moderate · Difficult · Experts Only
PROBLEMS 207
5.1.4 Random variables X and Y have CDFs FX(x) and FY(y). Is F(x, y) = FX(x)FY(y) a valid CDF? Explain your answer.

5.1.5 In this problem, we prove Theorem 5.2.
(a) Sketch the following events on the X, Y plane:
A = {X ≤ x1, y1 < Y ≤ y2},
B = {x1 < X ≤ x2, Y ≤ y1},
C = {x1 < X ≤ x2, y1 < Y ≤ y2}.
(b) Express the probability of the events A, B, and A ∪ B ∪ C in terms of the joint CDF FX,Y(x, y).
(c) Use the observation that the events A, B, and C are mutually exclusive to prove Theorem 5.2.

5.1.6 Can the following function be the joint CDF of random variables X and Y?
F(x, y) = { 1 − e^{−(x+y)}   x ≥ 0, y ≥ 0;   0 otherwise.

5.2.1 Random variables X and Y have the joint PMF
PX,Y(x, y) = { cxy   x = 1, 2, 4; y = 1, 3;   0 otherwise.
(a) What is the value of the constant c?
(b) What is P[Y < X]?
(c) What is P[Y > X]?
(d) What is P[Y = X]?
(e) What is P[Y = 3]?

5.2.2
(b) What is P[Y < X]?
(c) What is P[Y > X]?
(d) What is P[Y = X]?
(e) What is P[X < 1]?

5.2.3 Test two integrated circuits. In each test, the probability of rejecting the circuit is p, independent of the other test. Let X be the number of rejects (either 0 or 1) in the first test and let Y be the number of rejects in the second test. Find the joint PMF PX,Y(x, y).

5.2.4 For two independent flips of a fair coin, let X equal the total number of tails and let Y equal the number of heads on the last flip. Find the joint PMF PX,Y(x, y).

5.2.5 In Figure 5.2, the axes of the figures are labeled X and Y because the figures depict possible values of the random variables X and Y. However, the figure at the end of Example 5.3 depicts PX,Y(x, y) on axes labeled with lowercase x and y. Should those axes be labeled with the uppercase X and Y? Hint: Reasonable arguments can be made for both views.

5.2.6 As a generalization of Example 5.3, consider a test of n circuits such that each circuit is acceptable with probability p, independent of the outcome of any other test. Show that the joint PMF of X, the number of acceptable circuits, and Y, the number of acceptable circuits found before observing the first reject, is
PX,Y(x, y) = { (n−y−1 choose x−y) p^x (1−p)^{n−x}   0 ≤ y ≤ x < n;
              p^n   x = y = n;
              0   otherwise.
5.2.7 With two minutes left in a five-minute overtime, the score is 0–0 in a Rutgers soccer match versus Villanova. (Note that the overtime is NOT sudden death.) In the next-to-last minute of the game, either (1) Rutgers scores a goal with probability p = 0.2, (2) Villanova scores with probability p = 0.2, or (3) neither team scores with probability 1 − 2p = 0.6. If neither

5.2.8 Each test of an integrated circuit produces an acceptable circuit with probability p, independent of the outcome of the test of any other circuit. In testing n circuits, let K denote the number of circuits rejected and let X denote the number of acceptable circuits (either 0 or 1) in the last test. Find the joint PMF PK,X(k, x).

5.2.9 Each test of an integrated circuit produces an acceptable circuit with probability p, independent of the outcome of the test of any other circuit. In testing n circuits, let K denote the number of circuits rejected and let X denote the number of acceptable circuits that appear before the first reject is found. Find the joint PMF PK,X(k, x).

5.3.1 Given the random variables X and Y in Problem 5.2.1, find
(a) The marginal PMFs PX(x) and PY(y),
(b) The expected values E[X] and E[Y],
(c) The standard deviations σX and σY.

Find the marginal PMFs PX(x) and PY(y) and the expected values E[X] and E[Y].

5.3.5 Random variables N and K have the joint PMF
PN,K(n, k) defined for k = 1, ..., n; n = 1, 2, ....
Find the marginal PMFs PN(n) and PK(k).

5.3.6 Random variables N and K have the joint PMF
PN,K(n, k) defined for k = 0, 1, ..., n; n = 0, 1, ...,
and equal to 0 otherwise.
5.5.6 Over the circle X² + Y² ≤ r², random variables X and Y have the PDF
fX,Y(x, y) = { 2|xy|/r⁴   x² + y² ≤ r²;   0 otherwise.
(a) What is the marginal PDF fX(x)?
(b) What is the marginal PDF fY(y)?

5.5.7 For a random variable X, let Y = aX + b. Show that if a > 0 then ρX,Y = 1. Also show that if a < 0, then ρX,Y = −1.

5.5.8 Random variables X and Y have joint PDF

... strawberry supplier is 300 miles away. An experiment consists of monitoring an order and observing W, the weight of the order, and D, the distance the shipment must be sent. The following probability model describes the experiment:

         van.   choc.   straw.
small    0.2    0.2     0.2
big      0.1    0.2     0.1

(a) What is the joint PMF PW,D(w, d) of the weight and the distance?
(b) Find the expected shipping distance E[D].
(c) Are W and D independent?
first H. Let X2 equal the number of additional flips up to and including the second H. What are PX1(x1) and PX2(x2)? Are X1 and X2 independent? Find PX1,X2(x1, x2).

5.6.5 X is the continuous uniform (0, 2) random variable. Y has the continuous uniform (0, 5) PDF, independent of X. What is the joint PDF fX,Y(x, y)?

5.6.6 X1 and X2 are independent random variables such that Xi has PDF
x > 0,
0 otherwise.

5.6.7 Random variables X and Y have the joint PDF
fX,Y(x, y) = { k + 3x²   −1/2 ≤ x ≤ 1/2, −1/2 ≤ y ≤ 1/2;   0 otherwise.
(a) What is k?
(b) What is the marginal PDF of X?
(c) What is the marginal PDF of Y?
(d) Are X and Y independent?

5.6.8 X1 and X2 are independent, identically distributed random variables with PDF

5.7.1 Continuing Problem 5.6.1, the price per kilogram for shipping the order is one cent per mile. C cents is the shipping cost of one order. What is E[C]?

5.7.2 Continuing Problem 5.6.2, the price per mile of shipping each box is one cent per mile the box travels. C cents is the price of one shipment. What is E[C], the expected price of one shipment?

5.7.3 A random ECE sophomore has height X (rounded to the nearest foot) and GPA Y (rounded to the nearest integer). These random variables have joint PMF
Find E[X + Y] and Var[X + Y].

5.7.4 X and Y are independent, identically distributed random variables with PMF
PX(k) = PY(k) = { 3/4   k = 0;   1/4   k = 20;   0 otherwise.
Find the following quantities:
E[X], Var[X], E[X + Y], Var[X + Y], E[XY 2^{XY}].
5.7.13 Random variables X and Y have joint PDF

5.8.2 For the random variables X and Y in Problem 5.2.1, find
(d) The correlation coefficient, ρX,Y,
(e) The variance of X + Y, Var[X + Y].
(Refer to the results of Problem 5.3.1 to answer some of these questions.)

5.8.3 For the random variables X and Y in Problem 5.2.2, find
(a) The expected value of W = 2XY,
(b) The correlation, rX,Y = E[XY],
(c) The covariance, Cov[X, Y],
(d) The correlation coefficient, ρX,Y,
(e) The variance of X + Y, Var[X + Y].
(Refer to the results of Problem 5.3.2 to answer some of these questions.)

5.8.4 Let H and B be the random variables in Quiz 5.3. Find rH,B and Cov[H, B].

5.8.5 X and Y are independent random variables with PDFs
fX(x) = { (1/3) e^{−x/3}   x ≥ 0;   0 otherwise,
fY(y) = { (1/2) e^{−y/2}   y ≥ 0;   0 otherwise.
(a) Find the correlation rX,Y.
(b) Find the covariance Cov[X, Y].

5.8.6 The random variables X and Y have joint PMF
Find
(a) The expected values E[X] and E[Y],
(b) The variances Var[X] and Var[Y],
(c) The correlation, rX,Y = E[XY],
(d) The covariance, Cov[X, Y],
(e) The correlation coefficient, ρX,Y.

5.8.7 For X and Y with PMF PX,Y(x, y) given in Problem 5.8.6, let W = min(X, Y) and V = max(X, Y). Find
(a) The expected values, E[W] and E[V],
(b) The variances, Var[W] and Var[V],
(c) The correlation, rW,V,
(d) The covariance, Cov[W, V],
(e) The correlation coefficient, ρW,V.

5.8.8 Random variables X and Y have joint PDF
fX,Y(x, y) = { 1/2   −1 ≤ x ≤ y ≤ 1;   0 otherwise.
Find rX,Y and E[e^{X+Y}].

5.8.9 This problem outlines a proof of Theorem 5.13.
(a) Show that
X̂ − E[X̂] = a(X − E[X]),
Ŷ − E[Ŷ] = c(Y − E[Y]).
(b) Use part (a) to show that
Cov[X̂, Ŷ] = ac Cov[X, Y].
(c) Show that Var[X̂] = a² Var[X] and Var[Ŷ] = c² Var[Y].
(d) Combine parts (b) and (c) to relate ρX̂,Ŷ and ρX,Y.

5.8.10 Random variables N and K have the joint PMF
PN,K(n, k) = { (1−p)^{n−1} p/n   k = 1, ..., n; n = 1, 2, ...;   0 otherwise.
Find the marginal PMF PN(n) and the expected values E[N], Var[N], E[N²], E[K], Var[K], E[N + K], rN,K, Cov[N, K].
5.9.1 Random variables X and Y have joint PDF
fX,Y(x, y) = c e^{−(x²/8) − (y²/18)}.
What is the constant c? Are X and Y independent?

5.9.2 X is the Gaussian (μ = 1, σ = 2) random variable. Y is the Gaussian (μ = 2, σ = 4) random variable. X and Y are independent.
(a) What is the PDF of V = X + Y?
(b) What is the PDF of W = 3X + 2Y?

5.9.3 TRUE OR FALSE: X1 and X2 are bivariate Gaussian random variables. For any constant y, there exists a constant a such that P[X1 + aX2 ≤ y] = 1/2.

5.9.4 X1 and X2 are identically distributed Gaussian (0, 1) random variables. Moreover, they are jointly Gaussian. Under what conditions are X1, X2 and X1 + X2 identically distributed?

5.9.5 Random variables X and Y have joint PDF
fX,Y(x, y) = c e^{−(2x² − 4xy + 4y²)}.
(a) What are E[X] and E[Y]?
(b) Find the correlation coefficient ρX,Y.
(c) What are Var[X] and Var[Y]?
(d) What is the constant c?
(e) Are X and Y independent?

5.9.6 An archer shoots an arrow at a circular target of radius 50 cm. The arrow pierces the target at a random position (X, Y), measured in centimeters from the center of the disk at position (X, Y) = (0, 0). The bullseye is a solid black circle of radius 2 cm at the center of the target. Calculate the probability P[B] of the event that the archer hits the bullseye under each of the following models:
(a) X and Y are iid continuous uniform (−50, 50) random variables.
(b) The PDF fX,Y(x, y) is uniform over the 50 cm circular target.
(c) X and Y are iid Gaussian (μ = 0, σ = 10) random variables.

5.9.7 A person's white blood cell (WBC) count W (measured in thousands of cells per microliter of blood) and body temperature T (in degrees Celsius) can be modeled as bivariate Gaussian random variables such that W is Gaussian (7, 2) and T is Gaussian (37, 1). To determine whether a person is sick, first the person's temperature T is measured. If T > 38, then the person's WBC count is measured. If W > 10, the person is declared ill (event I).
(a) Suppose W and T are uncorrelated. What is P[I]? Hint: Draw a tree diagram for the experiment.
(b) Now suppose W and T have correlation coefficient ρW,T = 1/√2. Find the conditional probability P[I|T = t] that a person is declared ill given that the person's temperature is T = t.

5.9.8 Suppose your grade in a probability course depends on your exam scores X1 and X2. The professor, a fan of probability, releases exam scores in a normalized fashion such that X1 and X2 are iid Gaussian (μ = 0, σ = √2) random variables. Your semester average is X = 0.5(X1 + X2).
(a) You earn an A grade if X > 1. What is P[A]?
(b) To improve his SIRS (Student Instructional Rating Service) score, the professor decides he should award more A's. Now you get an A if max(X1, X2) > 1. What is P[A] now?
(c) The professor found out he is unpopular at ratemyprofessor.com and decides to award an A if either X > 1 or max(X1, X2) > 1. Now what is P[A]?
(d) Under criticism of grade inflation from the department chair, the professor adopts a new policy. An A is awarded if max(X1, X2) > 1 and min(X1, X2) > 0. Now what is P[A]?
[
PROBLEMS 215
5.9.10 u nder what conditions on the con 5.10.2 When ordering a personal com
stants a, b, c, and d is puter, a customer can add the follo,ving fea
tures to t he basic configuration: (1) addi
t ional memory, (2) flat panel display, (3)
professional software, and (4) wireless mo
a joint Gaussian PDF? dem. A random computer order has fea
ture i with probability Pi = 2 i indepen
5.9.11 Show that the joint Gaussian PDF dent of other features. In an hour in 'vhich
f x ,y(x, y) given by Definition 5.10 satisfies three computers are ordered, let Ni equal
5.10.3 The random variables X1, ..., Xn have the joint PDF

f_{X1,...,Xn}(x1, ..., xn) = { 1,   0 < x_i < 1, i = 1, ..., n,
                            { 0,   otherwise.

Find
(a) the joint CDF, F_{X1,...,Xn}(x1, ..., xn),
(b) P[min(X1, X2, X3) ≤ 3/4].

5.10.4 Are N1, N2, N3, N4 in Problem 5.10.1 independent?

5.10.5 In a compressed data file of 10,000 bytes, each byte is equally likely to be any one of 256 possible characters b0, ..., b255, independent of any other byte. If Ni is the number of times bi appears in the file, find the joint PMF of N0, ..., N255. Also, what is the joint PMF of N0 and N1?

5.10.6 In Example 5.22, we derived the joint PMF of the number of pages in each of four downloads:

… (a) the PDF of V = min(X1, X2, X3),
(b) the PDF of W = max(X1, X2, X3).

5.10.8 In a race of 10 sailboats, the finishing times of all boats are iid Gaussian random variables with expected value 35 minutes and standard deviation 5 minutes.
(a) What is the probability that the winning boat will finish the race in less than 25 minutes?
(b) What is the probability that the last boat will cross the finish line in more than 50 minutes?
(c) Given this model, what is the probability that a boat will finish before it starts (negative finishing time)?

5.10.9 Random variables X1, X2, ..., Xn are iid; each Xj has CDF F_X(x) and PDF f_X(x). Consider

L_n = min(X1, ..., Xn),    U_n = max(X1, ..., Xn).
After opening suitcase i, you accept the amount Xi if Xi > Ti. Otherwise, you reject suitcase i and open suitcase i − 1. If you have rejected suitcases n down through 2, then you must accept the amount X1 in suitcase 1. Thus the threshold T1 = 0, since you never reject the amount in the last suitcase.

5.11.1 For random variables X and Y in Example 5.26, use MATLAB to generate a list of the form

x1   y1   P_{X,Y}(x1, y1)
x2   y2   P_{X,Y}(x2, y2)
⋮

that includes all possible pairs (x, y).
There are many situations in which we observe one or more random variables and use their values to compute a new random variable. For example, when the voltage across an r0 ohm resistor is a random variable X, the power dissipated in that resistor is Y = X²/r0. Circuit designers need a probability model for Y to evaluate the power consumption of the circuit. Similarly, if the amplitude (current or voltage) of a radio signal is X, the received signal power is proportional to Y = X². A probability model for Y is essential in evaluating the performance of a radio receiver. The output of a limiter or rectifier is another random variable that a circuit designer may need to analyze.

Radio systems also provide practical examples of functions of two random variables. For example, we can describe the amplitude of the signal transmitted by a radio station as a random variable, X. We can describe the attenuation of the signal as it travels to the antenna of a moving car as another random variable, Y. In this case the amplitude of the signal at the radio receiver in the car is the random variable W = X/Y. Other practical examples appear in cellular telephone base stations with two antennas. The amplitudes of the signals arriving at the two antennas are modeled as random variables X and Y. The radio receiver connected to the two antennas can use the received signals in a variety of ways.

- It can choose the signal with the larger amplitude and ignore the other one. In this case, the receiver produces the random variable W = X if |X| > |Y| and W = Y otherwise. This is an example of selection diversity combining.
- The receiver can add the two signals and use W = X + Y. This process is referred to as equal gain combining because it treats both signals equally.
- A third alternative is to combine the two signals unequally in order to give less weight to the signal considered to be more distorted. In this case W = aX + bY. If a and b are optimized, the receiver performs maximal ratio combining.
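The three combining rules can also be sketched in code. The following Python fragment is an illustration only (the text's own computational examples use MATLAB, and the sample amplitudes and MRC weights here are made-up numbers, not values from the text):

```python
# Illustrative sketch of the three diversity-combining rules described above.
# The amplitudes and weights below are arbitrary demonstration values.

def selection(x, y):
    # Selection diversity combining: keep the larger-amplitude signal.
    return x if abs(x) > abs(y) else y

def equal_gain(x, y):
    # Equal gain combining: W = X + Y.
    return x + y

def maximal_ratio(x, y, a, b):
    # Maximal ratio combining: W = aX + bY for weights a, b.
    return a * x + b * y

x, y = 0.8, -1.2
print(selection(x, y), equal_gain(x, y), maximal_ratio(x, y, 2.0, 0.5))
```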
We perform an experiment and observe a sample value of two random variables X and Y. Based on our knowledge of the experiment, we have a probability model for X and Y embodied in a joint PMF P_{X,Y}(x, y) or a joint PDF f_{X,Y}(x, y). After performing the experiment, we calculate a sample value of the random variable W = g(X, Y).

When X and Y are discrete random variables, S_W, the range of W, is a countable set corresponding to all possible values of g(X, Y). Therefore, W is a discrete random variable and has a PMF P_W(w). We can apply Theorem 5.3 to find P_W(w) = P[W = w]. Since {W = w} is another name for the event {g(X, Y) = w}, we obtain P_W(w) by adding the values of P_{X,Y}(x, y) corresponding to the x, y pairs for which g(x, y) = w.
has PMF

P_W(w) = Σ_{(x,y): g(x,y)=w} P_{X,Y}(x, y).
P_{L,X}(l, x)    x = 40    x = 60
   l = 1          0.15      0.1
   l = 2          0.3       0.2
   l = 3          0.15      0.1

A firm sends out two kinds of newsletters. One kind contains only text and grayscale images and requires 40 cents to print each page. The other kind contains color pictures that cost 60 cents per page. Newsletters can be 1, 2, or 3 pages long. Let the random variable L represent the length of a newsletter in pages, S_L = {1, 2, 3}. Let the random variable X represent the cost in cents to print each page, S_X = {40, 60}. After observing many newsletters, the firm has derived the probability model shown above. Let W = g(L, X) = LX be the total cost in cents of a newsletter. Find the range S_W and the PMF P_W(w).
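One way to see where this example is headed is to tabulate W = LX by machine. The Python sketch below (illustrative only; the book's computational examples use MATLAB) sums P_{L,X}(l, x) over all (l, x) pairs with lx = w:

```python
from collections import defaultdict

# Build the PMF of W = LX from the joint PMF table P_{L,X}(l, x) above.
P_LX = {(1, 40): 0.15, (2, 40): 0.30, (3, 40): 0.15,
        (1, 60): 0.10, (2, 60): 0.20, (3, 60): 0.10}

P_W = defaultdict(float)
for (l, x), p in P_LX.items():
    P_W[l * x] += p   # add P_{L,X}(l, x) over all pairs with lx = w

print(sorted(P_W.items()))
```

The range works out to S_W = {40, 60, 80, 120, 180}; note that w = 120 collects probability from two pairs, (l, x) = (3, 40) and (2, 60).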
Example 6.2
In Example 4.2, W centimeters is the location of the pointer on the 1-meter circumference of the circle, so that W = 100X, where X is the uniform (0, 1) pointer position of Example 4.2 measured in meters. Use the solution of Example 4.2 to derive f_W(w).
F_W(w) = P[W ≤ w] = P[100X ≤ w] = P[X ≤ w/100] = F_X(w/100).   (6.1)

The calculation of F_X(w/100) depends on the probability model for X. For this problem, we recall that Example 4.2 derives the CDF of X,

         { 0,   x < 0,
F_X(x) = { x,   0 ≤ x ≤ 1,     (6.2)
         { 1,   x > 1.

Substituting w/100 for x,

                      { 0,       w < 0,
F_W(w) = F_X(w/100) = { w/100,   0 ≤ w ≤ 100,     (6.3)
                      { 1,       w > 100.

We take the derivative of the CDF of W over each of the intervals to find the PDF:

         { 1/100,   0 ≤ w ≤ 100,
f_W(w) = { 0,       otherwise.
We use this two-step procedure in the following theorem to generalize Example 6.2 by deriving the CDF and PDF for any scale change and any continuous random variable.

Theorem 6.2
If W = aX, where a > 0, then W has CDF and PDF

F_W(w) = F_X(w/a),    f_W(w) = (1/a) f_X(w/a).
Example 6.3
The triangular PDF of X is

f_X(x) = { …,   (6.8)
         { 0,   otherwise.

[Figure: f_W(w) for W = aX with a = 1, 2, 3 plotted on 0 ≤ w ≤ 3.] As a increases, the PDF stretches horizontally.
For the families of continuous random variables in Sections 4.5 and 4.6, we can use Theorem 6.2 to show that multiplying a random variable by a constant produces a new family member with transformed parameters.
The next theorem shows that adding a constant to a random variable simply shifts the CDF and the PDF by that constant.

Theorem 6.4
If W = X + b,

F_W(w) = F_X(w − b),    f_W(w) = f_X(w − b).

Proof  First, we find the CDF F_W(w) = P[X + b ≤ w] = P[X ≤ w − b] = F_X(w − b). We take the derivative of F_W(w) to find the PDF: f_W(w) = dF_W(w)/dw = f_X(w − b).

In contrast to the linear transformations of Theorem 6.2 and Theorem 6.4, the following example is tricky because g(X) transforms more than one value of X to the same W.
Example 6.4
Suppose X is the continuous uniform (−1, 3) random variable and W = X². Find the CDF F_W(w) and PDF f_W(w).

F_W(w) = P[W ≤ w] = P[X² ≤ w] = P[−√w ≤ X ≤ √w].   (6.9)

We can take one more step by writing the probability (6.9) as an integral using the PDF f_X(x):

F_W(w) = P[−√w ≤ X ≤ √w] = ∫_{−√w}^{√w} f_X(x) dx.   (6.10)
So far, we have used no properties of the PDF f_X(x). However, to evaluate the integral (6.10), we now recall from the problem statement and Definition 4.5 that the PDF of X is

f_X(x) = { 1/4,   −1 < x < 3,     (6.11)
         { 0,     otherwise.

The integral (6.10) is somewhat tricky because the limits depend on the value of w. We first observe that −1 < X < 3 implies 0 ≤ W < 9. Thus F_W(w) = 0 for w < 0, and F_W(w) = 1 for w ≥ 9. For 0 ≤ w < 1,

F_W(w) = ∫_{−√w}^{√w} (1/4) dx = √w / 2.   (6.12)

For 1 ≤ w < 9, the lower limit −√w lies below −1, where f_X(x) = 0, so

F_W(w) = ∫_{−1}^{√w} (1/4) dx = (√w + 1)/4.   (6.13)

By combining the separate pieces, we can write a complete expression for F_W(w):

         { 0,            w < 0,
F_W(w) = { √w / 2,       0 ≤ w < 1,     (6.14)
         { (√w + 1)/4,   1 ≤ w < 9,
         { 1,            w ≥ 9.

To find f_W(w), we take the derivative of F_W(w) over each interval.
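The piecewise CDF above can be checked numerically. The sketch below (illustrative only; not from the text) integrates f_X(x) = 1/4 on (−1, 3) over {x : x² ≤ w} with a midpoint rule and compares against the closed forms √w/2 and (√w + 1)/4:

```python
import math

def F_W(w, n=200000):
    # Midpoint-rule integration of f_X over {x : x^2 <= w},
    # with f_X(x) = 1/4 on (-1, 3) and 0 elsewhere.
    if w <= 0:
        return 0.0
    total = 0.0
    dx = 4.0 / n
    for i in range(n):
        x = -1.0 + (i + 0.5) * dx
        if x * x <= w:
            total += 0.25 * dx
    return total

assert abs(F_W(0.25) - math.sqrt(0.25) / 2) < 1e-3      # sqrt(w)/2 branch
assert abs(F_W(4.0) - (math.sqrt(4.0) + 1) / 4) < 1e-3  # (sqrt(w)+1)/4 branch
print(F_W(0.25), F_W(4.0))
```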
We end this section with a useful application of derived random variables. The following theorem shows how to derive sample values of random variables using the sample values of a uniform (0, 1) random variable.

Theorem 6.5
Let U be a uniform (0, 1) random variable and let F(x) denote a cumulative distribution function with an inverse F^{−1}(u) defined for 0 < u < 1. The random variable X = F^{−1}(U) has CDF F_X(x) = F(x).

We observe that the requirement that F(x) have an inverse for 0 < u < 1 limits the applicability of Theorem 6.5. For example, this requirement is not met by the mixed random variables of Section 4.7. A generalization of the theorem that does hold for mixed random variables is given in Problem 6.3.13. The following examples demonstrate the utility of Theorem 6.5.
Example 6.5
U is the uniform (0, 1) random variable and X = g(U). Derive g(U) such that X is the exponential (1) random variable.

The CDF of X is

         { 0,             x < 0,     (6.17)
F_X(x) = { 1 − e^{−x},    x ≥ 0.

Note that if u = F_X(x) = 1 − e^{−x}, then x = −ln(1 − u). That is, F_X^{−1}(u) = −ln(1 − u) for 0 < u < 1. Thus, by Theorem 6.5,

X = g(U) = −ln(1 − U)   (6.18)

is the exponential random variable with parameter λ = 1. Problem 6.2.7 asks the reader to derive the PDF of X = −ln(1 − U) directly from first principles.
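Theorem 6.5 turns this result into a one-line sampler. Here is a hedged Python port (the book's own implementations are in MATLAB; the seed and sample count below are arbitrary choices for illustration):

```python
import math
import random

def exp1_sample(rng):
    # Theorem 6.5 with F_X^{-1}(u) = -ln(1 - u): an exponential (1) sample.
    u = rng.random()              # uniform (0, 1) sample
    return -math.log(1.0 - u)

rng = random.Random(1)            # fixed seed for reproducibility
samples = [exp1_sample(rng) for _ in range(200000)]
print(sum(samples) / len(samples))    # sample mean, near E[X] = 1
```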
Example 6.6
For a uniform (0, 1) random variable U, find a function g(·) such that X = g(U) has a uniform (a, b) distribution.

The CDF of X is

         { 0,                 x < a,
F_X(x) = { (x − a)/(b − a),   a ≤ x ≤ b,
         { 1,                 x > b.

Setting u = F_X(x) and solving for x gives F_X^{−1}(u) = a + (b − a)u, so by Theorem 6.5, X = g(U) = a + (b − a)U.

The technique of Theorem 6.5 is particularly useful when the CDF is an easily invertible function. Unfortunately, there are many random variables, including Gaussian and Erlang, for which the CDF and its inverse are difficult to compute. In these cases, we need to develop other methods for transforming sample values of a uniform random variable to sample values of a random variable of interest.
Quiz 6.2
X is an exponential (λ) random variable. Show that Y = √X is a Rayleigh random variable (see Appendix A.2). Express the Rayleigh parameter a in terms of the exponential parameter λ.
Y = g(X), where

g(x) = { 1,   x ≤ 0,     (6.21)
       { 3,   x > 0.

[Figure: g(x) versus x.]

         { 0,        y < 1,
F_Y(y) = { F_X(0),   1 ≤ y < 3,     (6.23)
         { 1,        y ≥ 3.

[Figure: F_Y(y) versus y.]

The PDF consists of impulses at y = 1 and y = 3. The weights of the impulses are the sizes of the two jumps in the CDF: F_X(0) and 1 − F_X(0), respectively:

f_Y(y) = F_X(0) δ(y − 1) + [1 − F_X(0)] δ(y − 3).

[Figure: f_Y(y) versus y.]
The following example contains a function that transforms continuous random variables to a mixed random variable.

Example 6.8
The output voltage of a microphone is a Gaussian random variable V with expected value μ_V = 0 and standard deviation σ_V = 5 V. The microphone signal is the input to a soft limiter circuit with cutoff value 10 V. The random variable W is the output of the limiter:

           { −10,   V < −10,
W = g(V) = { V,     −10 ≤ V ≤ 10,     (6.24)
           { 10,    V > 10.

To find the CDF, we need to find F_W(w) = P[W ≤ w] for all values of w. The key is that all possible pairs (V, W) satisfy W = g(V). This implies each w belongs to one of three cases:

(a) w < −10     (b) −10 ≤ w < 10     (c) w ≥ 10

(a) w < −10: From the function W = g(V) we see that no possible pairs (V, W) satisfy W ≤ w < −10. Hence F_W(w) = P[W ≤ w] = 0 in this case. This is perhaps a roundabout way of observing that W = −10 is the minimum possible W.

(b) −10 ≤ w < 10: In this case we see that the event {W ≤ w}, marked in gray on the vertical axis, corresponds to the event {V ≤ w}, marked in gray on the horizontal axis. The corresponding (V, W) pairs are shown in the highlighted segment of the function W = g(V). In this case, F_W(w) = P[W ≤ w] = P[V ≤ w] = F_V(w).

(c) w ≥ 10: Here we see that the event {W ≤ w} corresponds to all values of V and P[W ≤ w] = P[V < ∞] = 1. This is another way of saying W = 10 is the maximum W.

We combine these separate cases in the CDF

         { 0,        w < −10,
F_W(w) = { F_V(w),   −10 ≤ w < 10,
         { 1,        w ≥ 10.

These conclusions are based solely on the structure of the limiter function g(V) without regard for the probability model of V. Now we observe that because V is Gaussian (0, 5), Theorem 4.14 states that F_V(v) = Φ(v/5). Therefore,

         { 0,        w < −10,
F_W(w) = { Φ(w/5),   −10 ≤ w < 10,
         { 1,        w ≥ 10.
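The resulting CDF is easy to evaluate numerically, since Φ(z) = (1 + erf(z/√2))/2. A Python sketch (illustrative only; not the book's code) of the final expression:

```python
import math

def phi(z):
    # Standard normal CDF via the error function.
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

def F_W(w):
    # CDF of the soft limiter output derived above.
    if w < -10:
        return 0.0
    if w < 10:
        return phi(w / 5.0)
    return 1.0

# W is mixed: the CDF jumps by Phi(-2) (about 0.0228) at w = -10.
print(F_W(-10), F_W(0), F_W(10))
```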
Quiz 6.3
Random variable X is passed to a hard limiter that outputs Y. The PDF of X and the limiter output Y are

f_X(x) = { 1 − x/2,   0 ≤ x ≤ 2,        Y = { X,   X < 1,     (6.28)
         { 0,         otherwise,            { 1,   X ≥ 1.
At the start of this chapter, we described three ways radio receivers can use signals from two antennas. These techniques are examples of the following situation. We perform an experiment and observe sample values of two random variables X and Y. After performing the experiment, we calculate a sample value of the random variable W = g(X, Y). Based on our knowledge of the experiment, we have a probability model for X and Y embodied in a joint PMF P_{X,Y}(x, y) or a joint PDF f_{X,Y}(x, y).

In this section, we present methods for deriving a probability model for W. When X and Y are continuous random variables and g(x, y) is a continuous function, W = g(X, Y) is a continuous random variable. To find the PDF, f_W(w), it is usually helpful to first find the CDF F_W(w) and then calculate the derivative. Viewing {W ≤ w} as an event A, we can apply Theorem 5.7.

Theorem 6.6 is analogous to our approach in Sections 6.2 and 6.3 for functions W = g(X). There we used the function g(X) to translate the event {W ≤ w} into an event {g(X) ≤ w} that was a subset of the X axis. We then calculated F_W(w) by integrating f_X(x) over that subset.
In Theorem 6.6, we translate the event {g(X, Y) ≤ w} into a region of the X, Y plane. Integrating the joint PDF f_{X,Y}(x, y) over that region will yield the CDF F_W(w). Once we obtain F_W(w), it is generally straightforward to calculate the derivative f_W(w) = dF_W(w)/dw. However, for most functions g(x, y), performing the integration to find F_W(w) can be a tedious process. Fortunately, there are convenient techniques for finding f_W(w) for certain functions that arise in many applications. Section 6.5 and Chapter 9 consider the function g(X, Y) = X + Y. The following theorem addresses W = max(X, Y), the maximum of two random variables. It follows from the fact that {max(X, Y) ≤ w} = {X ≤ w} ∩ {Y ≤ w}.
Theorem 6.7
For continuous random variables X and Y, the CDF of W = max(X, Y) is

F_W(w) = F_{X,Y}(w, w) = ∫_{−∞}^{w} ∫_{−∞}^{w} f_{X,Y}(x, y) dx dy.
Example 6.9
In Examples 5.7 and 5.9, X and Y have joint PDF

f_{X,Y}(x, y) = { 1/15,   0 ≤ x ≤ 5, 0 ≤ y ≤ 3,     (6.29)
               { 0,      otherwise.

Find the CDF and PDF of W = max(X, Y).

Because X ≥ 0 and Y ≥ 0, W ≥ 0. Therefore, F_W(w) = 0 for w < 0. Because X ≤ 5 and Y ≤ 3, W ≤ 5. Thus F_W(w) = 1 for w > 5. For 0 ≤ w ≤ 5, diagrams showing the regions of integration provide a guide to calculating F_W(w). Two cases, 0 ≤ w ≤ 3 and 3 ≤ w ≤ 5, have to be considered separately. When 0 ≤ w ≤ 3, Theorem 6.7 yields

F_W(w) = ∫_0^w ∫_0^w (1/15) dx dy = w²/15.   (6.30)

Because the joint PDF is uniform, we see this probability is the area w² times the value of the joint PDF over that area. When 3 ≤ w ≤ 5, the integral over the region {X ≤ w, Y ≤ w} is

F_W(w) = ∫_0^w ( ∫_0^3 (1/15) dy ) dx = ∫_0^w (1/5) dx = w/5,   (6.31)

which is the area 3w times the value of the joint PDF over that area. Combining the parts, we can write the complete CDF:

         { 0,       w < 0,
F_W(w) = { w²/15,   0 ≤ w ≤ 3,     (6.32)
         { w/5,     3 < w ≤ 5,
         { 1,       w > 5.

By taking the derivative, we find the corresponding PDF:

         { 2w/15,   0 ≤ w ≤ 3,
f_W(w) = { 1/5,     3 < w ≤ 5,     (6.33)
         { 0,       otherwise.
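As a sanity check (an illustration, not part of the text), the CDF of the maximum can be recomputed by brute-force integration of the uniform joint PDF over the square {X ≤ w, Y ≤ w}:

```python
def F_W(w, n=400):
    # Midpoint-rule double integral of f_{X,Y} = 1/15 on (0,5) x (0,3)
    # over the region {x <= w, y <= w}.
    dx, dy = 5.0 / n, 3.0 / n
    total = 0.0
    for i in range(n):
        x = (i + 0.5) * dx
        for j in range(n):
            y = (j + 0.5) * dy
            if x <= w and y <= w:
                total += (1.0 / 15.0) * dx * dy
    return total

assert abs(F_W(2.0) - 2.0 ** 2 / 15) < 1e-2   # w^2/15 branch
assert abs(F_W(4.0) - 4.0 / 5) < 1e-2         # w/5 branch
print(F_W(2.0), F_W(4.0))
```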
Example 6.10
X and Y have the joint PDF

f_{X,Y}(x, y) = { λμ e^{−(λx + μy)},   x ≥ 0, y ≥ 0,     (6.34)
               { 0,                   otherwise.

Find the PDF of W = Y/X.

First we find the CDF. For w < 0, F_W(w) = 0. For w ≥ 0, we integrate the joint PDF f_{X,Y}(x, y) over the region of the X, Y plane for which Y ≤ wX, X ≥ 0, and Y ≥ 0 as shown:

F_W(w) = ∫_0^∞ ( ∫_0^{wx} λμ e^{−(λx + μy)} dy ) dx     (6.35)
       = ∫_0^∞ λ e^{−λx} (1 − e^{−μwx}) dx
       = 1 − λ/(λ + μw).                                (6.36)

Therefore,

         { 0,                w < 0,
F_W(w) = { 1 − λ/(λ + μw),   w ≥ 0.     (6.37)

Differentiating with respect to w, we obtain

         { λμ/(λ + μw)²,   w ≥ 0,     (6.38)
f_W(w) = { 0,              otherwise.
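The integral above can also be evaluated numerically, which is a useful check on the algebra. The sketch below is illustrative only: the rates λ = 1 and μ = 2 are arbitrary choices, the inner y-integral is done in closed form, and the result is compared with 1 − λ/(λ + μw), the closed form obtained in the example:

```python
import math

lam, mu = 1.0, 2.0   # illustrative rates, not values from the text

def F_W_numeric(w, n=4000, x_max=25.0):
    # Outer integral over x by midpoint rule; inner integral over
    # 0 <= y <= w*x done in closed form: 1 - exp(-mu*w*x).
    dx = x_max / n
    total = 0.0
    for i in range(n):
        x = (i + 0.5) * dx
        total += lam * math.exp(-lam * x) * (1.0 - math.exp(-mu * w * x)) * dx
    return total

w = 1.5
exact = 1.0 - lam / (lam + mu * w)   # 0.75 for these rates
assert abs(F_W_numeric(w) - exact) < 1e-2
print(F_W_numeric(w), exact)
```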
Quiz 6.4
(A) … Let T denote the number of seconds needed for the transfer. Express T as a function of L and B. What is the PMF of T?

(B) Find the CDF and the PDF of W = XY when random variables X and Y have joint PDF

f_{X,Y}(x, y) = { 1,   0 ≤ x ≤ 1, 0 ≤ y ≤ 1,     (6.39)
               { 0,   otherwise.
The PDF of the sum of two independent continuous random variables X and Y is the convolution of the PDF of X and the PDF of Y. The PMF of the sum of two independent integer-valued random variables is the discrete convolution of the two PMFs.

Theorem 6.8
The PDF of W = X + Y is

f_W(w) = ∫_{−∞}^{∞} f_{X,Y}(x, w − x) dx.

Proof

F_W(w) = P[X + Y ≤ w] = ∫_{−∞}^{∞} ( ∫_{−∞}^{w−x} f_{X,Y}(x, y) dy ) dx.   (6.40)

Taking the derivative of the CDF,

f_W(w) = dF_W(w)/dw = ∫_{−∞}^{∞} ( d/dw ∫_{−∞}^{w−x} f_{X,Y}(x, y) dy ) dx = ∫_{−∞}^{∞} f_{X,Y}(x, w − x) dx.
Example 6.11
Find the PDF of W = X + Y when X and Y have the joint PDF

f_{X,Y}(x, y) = { 2,   x ≥ 0, y ≥ 0, x + y ≤ 1,
               { 0,   otherwise.

The PDF of W = X + Y can be found using Theorem 6.8. The possible values of X, Y are in the shaded triangular region where 0 ≤ X + Y = W ≤ 1. Thus f_W(w) = 0 for w < 0 or w > 1. For 0 ≤ w ≤ 1, applying Theorem 6.8 yields

f_W(w) = ∫_0^w 2 dx = 2w.   (6.44)

When X and Y are independent, the joint PDF of X and Y is the product of the marginal PDFs, f_{X,Y}(x, y) = f_X(x) f_Y(y). Applying Theorem 6.8 to this special case, we obtain the following theorem.
Theorem 6.9
If X and Y are independent random variables, the PDF of W = X + Y is

f_W(w) = ∫_{−∞}^{∞} f_X(w − y) f_Y(y) dy = ∫_{−∞}^{∞} f_X(x) f_Y(w − x) dx.

In Theorem 6.9, we combine two univariate functions, f_X(·) and f_Y(·), in order to produce a third function, f_W(·). The combination in Theorem 6.9, referred to as a convolution, arises in many branches of applied mathematics.

When X and Y are independent integer-valued discrete random variables, the PMF of W = X + Y is a convolution (see Problem 6.5.1):

P_W(w) = Σ_{k=−∞}^{∞} P_X(k) P_Y(w − k).

You may have encountered convolutions already in studying linear systems. Sometimes, we use the notation f_W(w) = f_X(x) * f_Y(y) to denote convolution.
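The discrete convolution is easy to carry out by machine. This Python sketch (an illustration; the two four-sided dice are a made-up example, not one from the text) convolves two integer-valued PMFs to obtain the PMF of the sum:

```python
def pmf_convolve(px, py):
    # Discrete convolution: P_W(w) = sum_k P_X(k) P_Y(w - k),
    # with PMFs stored as {value: probability} dictionaries.
    pw = {}
    for x, p in px.items():
        for y, q in py.items():
            pw[x + y] = pw.get(x + y, 0.0) + p * q
    return pw

die = {k: 0.25 for k in (1, 2, 3, 4)}     # fair four-sided die
pw = pmf_convolve(die, die)
print(sorted(pw.items()))   # P_W(5) = 4/16: four equally likely pairs sum to 5
```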
Quiz 6.5
Let X and Y be independent exponential random variables with expected values E[X] = 1/3 and E[Y] = 1/2. Find the PDF of W = X + Y.
6.6 MATLAB

Example 6.12
Use Example 6.5 to write a MATLAB program that generates m samples of an exponential (λ) random variable.

Example 6.13
Use Example 6.6 to write a MATLAB function that generates m samples of a uniform (a, b) random variable.

Example 6.14
Write a MATLAB function that uses icdfrv.m to generate samples of Y, the maximum of three pointer spins, in Example 4.5.

function y = icdf3spin(u);
y=u.^(1/3);

From Equation (4.18), we see that for 0 ≤ y ≤ 1, F_Y(y) = y³. If u = F_Y(y) = y³, then y = F_Y^{−1}(u) = u^{1/3}. So we define (and save to disk) icdf3spin.m. Now, the function call y=icdfrv(@icdf3spin,1000) generates a vector holding 1000 samples of random variable Y. The notation @icdf3spin is the function handle for the function icdf3spin.m.
Keep in mind that for the MATLAB code to run quickly, it is best for the inverse CDF function (icdf3spin.m in the case of the last example) to process the vector u without using a for loop to find the inverse CDF for each element u(i). We also note that this same technique can be extended to cases where the inverse CDF F_X^{−1}(u) does not exist for all 0 < u < 1. For example, the inverse CDF does not exist if X is a mixed random variable or if f_X(x) is constant over an interval (a, b). How to use icdfrv.m in these cases is addressed in Problems 6.3.13 and 6.6.4.
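For readers working outside MATLAB, the same idea ports directly. The following Python sketch is illustrative only: the `icdfrv` function here is a stand-in mirroring the role of the book's icdfrv.m, not its actual code.

```python
import random

def icdfrv(icdf, m, rng=random):
    # Stand-in for icdfrv.m: draw m uniform (0,1) samples and push
    # them through the supplied inverse CDF.
    return [icdf(rng.random()) for _ in range(m)]

def icdf3spin(u):
    # Inverse CDF for Y = max of three spins: F_Y(y) = y^3 on [0, 1].
    return u ** (1.0 / 3.0)

rng = random.Random(7)
y = icdfrv(icdf3spin, 1000, rng)
print(len(y), min(y), max(y))   # 1000 samples, all inside [0, 1]
```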
Quiz 6.6
Write a MATLAB function V=Vsample(m) that returns m samples of random variable V with PDF

         { (v + 5)/72,   −5 ≤ v ≤ 7,     (6.47)
f_V(v) = { 0,            otherwise.
Problems
Difficulty: Easy / Moderate / Difficult / Experts Only
6.2.8 X is the uniform (0, 1) random variable. Find a function g(x) such that the PDF of Y = g(X) is

f_Y(y) = { …,   0 < y < 1,
         { 0,   otherwise.

6.2.9 An amplifier circuit has power consumption Y that grows nonlinearly with the input signal voltage X. When the input signal is X volts, the instantaneous power consumed by the amplifier is Y = 20 + 15X² Watts. The input signal X is the continuous uniform (−1, 1) random variable. Find the PDF f_Y(y).

6.3.1 X has CDF

         { 0,            x < −1,
F_X(x) = { x/3 + 1/3,    −1 ≤ x < 0,
         { x/3 + 2/3,    0 ≤ x < 1,
         { 1,            1 ≤ x.

Y = g(X) where

g(X) = { …,   x < 0,
       { …,   x ≥ 0.

(a) What is F_Y(y)?
(b) What is f_Y(y)?
(c) What is E[Y]?
6.2.13 X is a continuous random variable. Y = aX + b, where a ≠ 0. Prove that

f_Y(y) = f_X((y − b)/a) / |a|.

Hint: Consider the cases a < 0 and a > 0 separately.

… W = g(U) = { …,   u < 0,
             { …,   u > 0.

Find the CDF F_W(w) and the expected value E[W].

… L = { |V|,   |V| ≤ 0.5,
      { 0.5,   otherwise.

(a) What is P[L = 0.5]?
(b) What is F_L(l)?
(c) What is E[L]?

… W = { …,   |V| ≤ 0.6,
      { …,   otherwise.

If V is the continuous uniform (−5, 5) random variable, what is the PDF of W?
6.3.8 Random variable X has PDF

f_X(x) = { x/2,   0 ≤ x ≤ 2,
         { 0,     otherwise.

X is processed by a clipping circuit with output

Y = { …,   x ≤ 1,
    { …,   x > 1.

(a) What is P[Y = 0.5]?
(b) Find the CDF F_Y(y).

6.3.9 Given an input voltage V, the output voltage of a half-wave rectifier is given

6.3.12 X is the continuous uniform (−3, 3) random variable. When X is passed through a limiter, the output is the discrete random variable

X̂ = g(X) = { −c,   X < 0,
           { c,    X ≥ 0,

where c is an unspecified positive constant.
(a) What is the PMF P_X̂(x) of X̂?
(b) When the limiter input is X, the distortion D between the input X and the limiter output X̂ is

D = d(X) = (X − g(X))².
… (c) Prove that X̂ = F̂(U) has CDF F_X̂(x) = F_X(x).

6.4.1 Random variables X and Y have joint PDF

f_{X,Y}(x, y) = { 6xy²,   0 < x < 1, 0 < y < 1,
               { 0,       otherwise.

… f_{X,Y}(x, y) = { A,   …,
                  { 0,   otherwise.

Let W = Y/X.
(a) What is S_W, the range of W?
(b) Find F_W(w), f_W(w), and E[W].

6.4.7 Random variables X and Y have joint PDF
6.5.2 X and Y have joint PDF

f_{X,Y}(x, y) = { 2,   x ≥ 0, y ≥ 0, x + y ≤ 1,
               { 0,   otherwise.

Find the PDF of W = X + Y.

6.5.3 Find the PDF of W = X + Y when

f_{X,Y}(x, y) = { 2,   0 ≤ x ≤ y ≤ 1,
               { 0,   otherwise.

6.5.8 In this problem we show directly that the sum of independent Poisson random variables is Poisson. Let J and K be independent Poisson random variables with expected values α and β, respectively, and show that N = J + K is a Poisson random variable with expected value α + β. Hint: Show that

P_N(n) = Σ_{m=0}^{n} P_K(m) P_J(n − m),

and then simplify the summation by extracting the sum of a binomial PMF over all possible values.

… F_X(x) = { …,   x < −1,
           { …,   −1 ≤ x < 1,
           { 1,   x ≥ 1.

Since F_X^{−1}(u) is not defined for 1/2 < u < 1, use the result of Problem 6.3.13.
is a number that expresses our new knowledge about the occurrence of event A when we learn that another event B occurs. In this section, we consider an event A
Example 7.1
Let N equal the number of bytes in an email. A conditioning event might be the event I that the email contains an image. A second kind of conditioning would be the event {N > 100,000}, which tells us that the email required more than 100,000 bytes. Both events I and {N > 100,000} give us information that the email is likely to have many bytes.
The definition of the conditional CDF applies to discrete, continuous, and mixed random variables. However, just as we have found in prior chapters, the conditional CDF is not the most convenient probability model for many calculations. Instead we have definitions for the special cases of discrete X and continuous X that are more useful.

The functions P_{X|B}(x) and f_{X|B}(x) are probability models for a new random variable related to X. Here we have extended our notation convention for probability functions. We continue the old convention that a CDF is denoted by the letter F, a PMF by P, and a PDF by f, with the subscript containing the name of the random variable. However, with a conditioning event, the subscript contains the name of the random variable followed by a vertical bar followed by a statement of the conditioning event. The argument of the function is usually the lowercase letter corresponding to the variable name. The argument is a dummy variable. It could be any letter, so that P_{X|B}(x) and f_{Y|B}(y) are the same functions as P_{X|B}(u) and f_{Y|B}(v). Sometimes we write the function with no specified argument at all: P_{X|B}(·).

When a conditioning event B ⊂ S_X, both P[B] and P[AB] in Equation (7.1) are properties of the PMF P_X(x) or PDF f_X(x). Now either the event A = {X = x} is contained in the event B or it is not. If X is discrete and x ∈ B, then AB = {X = x} ∩ B = {X = x} and P[X = x, B] = P_X(x). Otherwise, if x ∉ B, then {X = x} ∩ B = ∅ and P[X = x, B] = 0. Similar observations apply when X is continuous. The next theorem uses these observations to calculate the conditional probability models.
Theorem 7.1
For a random variable X and an event B ⊂ S_X with P[B] > 0, the conditional PMF or PDF of X given B is

Discrete:     P_{X|B}(x) = { P_X(x)/P[B],   x ∈ B,
                           { 0,             otherwise;

Continuous:   f_{X|B}(x) = { f_X(x)/P[B],   x ∈ B,
                           { 0,             otherwise.
         { 0.15,   x = 1, 2, 3, 4,
P_X(x) = { 0.1,    x = 5, 6, 7, 8,     (7.3)
         { 0,      otherwise.

Suppose the website has two servers, one for videos shorter than five minutes and the other for videos of five or more minutes. What is the PMF of video length in the second server?

We seek a conditional PMF for the condition x ∈ L = {5, 6, 7, 8}. From Theorem 7.1,

P_{X|L}(x) = { P_X(x)/P[L],   x = 5, 6, 7, 8,     (7.4)
             { 0,             otherwise.

Thus the lengths of long videos are equally likely. Among the long videos, each length has probability 0.25.
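The same conditioning computation can be scripted. A Python sketch (illustrative only; the book's computational examples use MATLAB) of the discrete case of Theorem 7.1 applied to the video-length PMF above:

```python
# Condition the video-length PMF on the long-video event L = {5, 6, 7, 8}.
P_X = {x: 0.15 for x in (1, 2, 3, 4)}
P_X.update({x: 0.10 for x in (5, 6, 7, 8)})

L = {5, 6, 7, 8}
P_L = sum(P_X[x] for x in L)                    # P[L] = 0.4
P_X_given_L = {x: P_X[x] / P_L for x in L}      # P_X(x)/P[L] for x in L

print(P_L, P_X_given_L)   # each long length has conditional probability 0.25
```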
Let L denote the left side of the circle. In terms of the stopping position, L = [1/2, 1). Recalling from Example 4.4 that the pointer position X has a uniform PDF over [0, 1),

f_{X|L}(x) = { 2,   1/2 ≤ x < 1,     (7.8)
             { 0,   otherwise.
Suppose the bus has not arrived by the eighth minute; what is the conditional PMF of your waiting time X?

Let A denote the event X > 8. Observing that P[A] = 12/20, we can write the conditional PMF of X as

P_{X|A}(x) = { (1/20)/(12/20) = 1/12,   x = 9, 10, ..., 20,     (7.10)
             { 0,                       otherwise.
Example 7.6

                { f_X(x)/P[B_i],   iΔ ≤ x < (i + 1)Δ,       { 1/Δ,   iΔ ≤ x < (i + 1)Δ,     (7.13)
f_{X|B_i}(x) =  {                                        =  {
                { 0,               otherwise,               { 0,     otherwise.

Given B_i, the conditional PDF of X is uniform over the ith quantization interval.
In some applications, we begin with a set of conditional probability models such as the PMFs P_{X|B_i}(x), i = 1, 2, ..., m, where B_1, B_2, ..., B_m is a partition. We then use the law of total probability to find the PMF P_X(x).

Theorem 7.2

P_X(x) = Σ_{i=1}^{m} P_{X|B_i}(x) P[B_i],    f_X(x) = Σ_{i=1}^{m} f_{X|B_i}(x) P[B_i].

Proof  The theorem follows directly from Theorem 1.10 with A = {X = x} for discrete X or A = {x < X ≤ x + dx} when X is continuous.
Example 7.7
Let X denote the number of additional years that a randomly chosen 70-year-old person will live. If the person has high blood pressure, denoted as event H, then X is a geometric (p = 0.1) random variable. Otherwise, if the person's blood pressure is normal, event N, X has a geometric (p = 0.05) PMF. Find the conditional PMFs P_{X|H}(x) and P_{X|N}(x). If 40 percent of all 70-year-olds have high blood pressure, what is the PMF of X?

The problem statement specifies the conditional PMFs in words. Mathematically, the two conditional PMFs are

P_{X|H}(x) = { 0.1(0.9)^{x−1},   x = 1, 2, ...,
             { 0,                otherwise;

P_{X|N}(x) = { 0.05(0.95)^{x−1},   x = 1, 2, ...,
             { 0,                  otherwise.

By Theorem 7.2,

P_X(x) = 0.4 P_{X|H}(x) + 0.6 P_{X|N}(x) = { 0.4(0.1)(0.9)^{x−1} + 0.6(0.05)(0.95)^{x−1},   x = 1, 2, ...,
                                           { 0,                                            otherwise.

Problem 7.7.1 asks the reader to graph P_X(x) to show its similarity to Figure 4.3.
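The total-probability mixture in Example 7.7 can be tabulated directly. A Python sketch (illustrative only; the truncation point 2000 is an arbitrary choice large enough that the tail is negligible):

```python
def geometric_pmf(x, p):
    # Geometric PMF: p(1 - p)^(x-1) for x = 1, 2, ...
    return p * (1.0 - p) ** (x - 1)

def P_X(x):
    # Law of total probability with P[H] = 0.4 and P[N] = 0.6.
    return 0.4 * geometric_pmf(x, 0.1) + 0.6 * geometric_pmf(x, 0.05)

total = sum(P_X(x) for x in range(1, 2001))
print(P_X(1), total)   # P_X(1) = 0.07; the truncated sum is close to 1
```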
Quiz 7.1
(A) On the Internet, data is transmitted in packets. In a simple model for World Wide Web traffic, the number of packets N needed to transmit a Web page depends on whether the