**Rohit Vishal Kumar**

For Circulation to Marketing & Financial Management (1st Semester) Students Xavier Institute of Social Service Ranchi, Jharkhand, India

July 29, 2008

Contents

1 Introduction
2 Common Terms
3 Theory of Probability
  3.1 Definitions of Probability
    3.1.1 Classical Definition of Probability
    3.1.2 Empirical Definition of Probability
    3.1.3 Axiomatic Definition of Probability
4 Permutation and Combination
5 Conditional Probability
  5.1 Introduction
  5.2 Definition and Properties
  5.3 Bayes Rule
  5.4 Independence of Events
6 Some Cautions

List of Tables

1 Laws of Set Theory used in Probability

1 Introduction

The idea of probability, chance, or randomness is quite old, whereas its rigorous axiomatisation in mathematical terms occurred relatively recently. Many of the ideas of probability theory originated in the study of games of chance. In this century, the mathematical theory of probability has been applied to a wide variety of phenomena. For example:

• In genetic theory, it has been used to understand mutation and gene sequences.
• In information management, it has been used in designing and optimising operating systems and in modelling the length of queues.
• In communication theory, it has been used to study noise in electrical devices and communication systems.
• In atmospheric research, turbulence is modeled using probability.
• Actuarial science, used by insurance companies, relies heavily on the theory of probability to determine premiums.

In this article, we treat the basic ideas of probability and statistics. In the first part we explain the theory of probability, and subsequently we deal with problems of probability.

2 Common Terms

Probability theory is concerned with situations in which the outcomes occur randomly. Generally, such situations are called experiments, and the set of all possible outcomes is known as the sample space corresponding to the experiment. The sample space is generally denoted by S and an element of S is denoted by ω.

Example A: Driving to work, a commuter passes through a sequence of three intersections with traffic lights. At each light, she either stops (s) or continues (c). The sample space is the set of all possible outcomes:

S = {ccc, ccs, csc, scc, css, scs, ssc, sss}

where csc, for example, denotes the outcome that the commuter continues through the first light, stops at the second light and continues through the third light.

Example B: The number of jobs in a print queue of a mainframe computer may be modeled as random. Theoretically, the sample space would consist of all non-negative integers. In practice, there is an upper limit N on how large the print queue can be, so the sample space can be defined as

S = {0, 1, 2, 3, ..., N}


We are often interested in particular subsets of S, which in the language of probability are called events. In Example A, the event that the commuter stops at the first light is a subset of S and is given by

A = {sss, ssc, scc, scs}

Events, or subsets, are usually denoted by uppercase roman letters. Similarly, in Example B, the event that the print queue has fewer than five jobs can be denoted by

A = {0, 1, 2, 3, 4}

The algebra of set theory is directly applicable to events in probability theory. The union of two events A and B, denoted by A ∪ B, is defined as the event that either A occurs or B occurs or both occur. The intersection of two events, denoted by A ∩ B, is defined as the event that A and B both occur. The complement of event A, denoted by Ā, A^c, or A′, is the event that A does not occur, and thus consists of all the elements of S which are not in A. The empty set, denoted by φ, is the set which has no elements, i.e. it is the event with no outcomes. If two events A and B satisfy A ∩ B = φ, then A and B are said to be disjoint events.

The laws of set theory are extensively used in probability theory. Table 1 gives some of the laws of set theory which are used frequently in statistics. You are advised to check the validity of the laws using Venn diagrams.
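The sample space and events of Example A can be sketched with Python sets. This is only an illustration: the variable names and the second event (stopping at the third light) are assumptions introduced here and do not appear in the text.

```python
from itertools import product

# Example A: at each of three lights the commuter continues ("c") or stops ("s").
sample_space = {"".join(outcome) for outcome in product("cs", repeat=3)}

# Event A from the text: the commuter stops at the first light.
stops_first = {w for w in sample_space if w[0] == "s"}

# Hypothetical event B (not in the text): the commuter stops at the third light.
stops_third = {w for w in sample_space if w[2] == "s"}

union = stops_first | stops_third          # A or B or both occur
intersection = stops_first & stops_third   # A and B both occur
complement = sample_space - stops_first    # A does not occur

# De Morgan's law from Table 1: (A ∪ B)' = A' ∩ B'
de_morgan_holds = (sample_space - union) == (
    (sample_space - stops_first) & (sample_space - stops_third)
)

print(sorted(stops_first))
print(sorted(intersection))
print(de_morgan_holds)
```

Representing events as sets means the set-theoretic laws of Table 1 can be checked directly with the `|`, `&` and `-` operators.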

3 Theory of Probability

3.1 Definitions of Probability

Even though the intuitive meaning of probability had been understood for a long time, there was significant disagreement among theoreticians as to how to define it. The earliest definition was the classical definition, which was later rejected because it lacked certain desirable properties. Subsequently, the empirical definition, which defines probability in terms of limits, was also used and later disputed. The current definition of probability is based on three axioms and is known as the axiomatic definition of probability. All three definitions are provided here for the sake of completeness.

Law                  Identity
Commutative Law      A ∪ B = B ∪ A
                     A ∩ B = B ∩ A
Complementary Law    A ∪ A′ = S
                     A ∩ A′ = φ
                     A ∪ S = S
                     A ∩ S = A
                     A ∪ φ = A
                     A ∩ φ = φ
Involution Law       (A′)′ = A
Idempotency Law      A ∩ A = A
                     A ∪ A = A
Associative Law      (A ∪ B) ∪ C = A ∪ (B ∪ C)
                     (A ∩ B) ∩ C = A ∩ (B ∩ C)
Distributive Law     (A ∪ B) ∩ C = (A ∩ C) ∪ (B ∩ C)
                     (A ∩ B) ∪ C = (A ∪ C) ∩ (B ∪ C)
De Morgan's Law      (A ∪ B)′ = A′ ∩ B′
                     (A ∩ B)′ = A′ ∪ B′

Table 1: Laws of Set Theory used in Probability

3.1.1 Classical Definition of Probability

Suppose a trial results in n mutually exclusive, exhaustive and equally likely cases. Let m be the number of cases which are favorable to the event A. Then the probability of event A, denoted by P(A), is defined as:

$$P(A) = \frac{\text{No. of cases favorable to the event } A}{\text{Total number of cases}} = \frac{m}{n} \tag{1}$$

The classical definition introduces some concepts which are defined here. Consider an experiment which, though repeated under essentially identical conditions, does not give a unique result but may result in any one of several possible outcomes. Such an experiment is known as a trial, and the outcomes are known as events or cases. The totality of possible outcomes of a trial is known as the exhaustive events (or exhaustive cases), which correspond to the sample space of set theory. Events are said to be mutually exclusive if the occurrence of any one of them precludes (or prevents) the occurrence of all the others. Outcomes of a trial are said to be equally likely if, taking into consideration all the relevant evidence, there is no reason to expect one in preference to any other. Favorable events are those outcomes of a trial whose occurrence leads to the occurrence of the defined event.

Understanding the Classical Probability Definition

Let us consider the toss of a die. The toss is a trial, and getting 1 (or 2 or 3 or 4 or 5 or 6) is an event. Thus the trial leads to the events S = {1, 2, 3, 4, 5, 6}, which is the sample space of the outcomes. If we assume that the die is unbiased, we have no reason to favor any event over any other, so all the events are said to be equally likely. Now if, after tossing the die, we get 1 (say), then no other number (2, 3, 4, 5, 6) can occur in this toss. Hence the outcomes of the trial are mutually exclusive. Now suppose we define a new event A as the outcome of an odd number when an unbiased die is tossed. The outcomes favorable to event A are A = {1, 3, 5}, so the probability of event A is P(A) = 3/6 = 1/2.
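The counting argument above can be written as a short sketch. The function name `classical_probability` is hypothetical, and the code assumes, as the classical definition does, that all cases are equally likely.

```python
from fractions import Fraction


def classical_probability(event, sample_space):
    """Return m/n, assuming every case in the sample space is equally likely."""
    event = set(event)
    sample_space = set(sample_space)
    if not sample_space:
        raise ValueError("the sample space must contain at least one case")
    if not event <= sample_space:
        raise ValueError("the event must be a subset of the sample space")
    # m favorable cases out of n cases in total
    return Fraction(len(event), len(sample_space))


die = {1, 2, 3, 4, 5, 6}
odd = {face for face in die if face % 2 == 1}

p_odd = classical_probability(odd, die)
print(p_odd)  # 1/2
```

Using `Fraction` keeps the result exact, which makes it easy to compare with the hand calculation 3/6 = 1/2.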

It can be easily seen that 0 ≤ m ≤ n and therefore the value of P(A) lies between 0 and 1, both inclusive. It can also be seen that the classical definition depends on a finite, non-zero number of cases (n ≠ 0). Sometimes, expression (1) is also stated as "the odds in favor of event A are m : (n − m)" or "the odds against event A are (n − m) : m".

If the sample space is infinite, the classical definition fails to define probability. The classical definition also fails if the outcomes of a trial are not equally likely. For example, suppose a candidate appears for a test; we normally assume that the candidate is equally likely to fail or pass. However, if we already know that the candidate has more than a 50% chance of passing, then the outcomes are not equally likely and hence we cannot apply the classical definition. Given these limitations of the classical definition, we now look at other definitions of probability.

3.1.2 Empirical Definition of Probability

If a trial is repeated a number of times under essentially homogeneous and identical conditions, then the limiting value of the ratio of the number of times the event happens (m) to the number of trials (n), as the number of trials becomes indefinitely large, is called the probability of the event (under the assumption that the limit is finite and unique). Symbolically, if in n trials an event A occurs m times, the probability of event A is denoted by

$$P(A) = \lim_{n \to \infty} \frac{m}{n} \tag{2}$$

3.1.3 Axiomatic Definition of Probability

A probability measure on S is a function P which assigns a real number to every event A and which satisfies the following axioms:

1. P(S) = 1.
2. For each event A ⊆ S, P(A) is defined, is real, and P(A) ≥ 0.
3. If A1, A2, A3, ..., An, ... are mutually disjoint, then

$$P\left(\bigcup_{i=1}^{\infty} A_i\right) = \sum_{i=1}^{\infty} P(A_i)$$

The first two axioms are obviously desirable. Since S consists of all possible outcomes, one of them must occur, hence P(S) = 1. The second axiom simply states that the probability of any event A is defined and non-negative. To understand the third axiom, first consider two events A1 and A2 which are disjoint, i.e. they have no outcome in common; then P(A1 ∪ A2) = P(A1) + P(A2). More generally, the third axiom says that if a countable collection of events A1, A2, A3, ..., An, ... is defined on the sample space S and the events are all mutually disjoint, then the probability of the union of these events is the sum of the probabilities of the individual events.

The following important properties derive from the axiomatic definition of probability:

1. The probability of an impossible event is zero, i.e. P(φ) = 0.
2. The probability of the complementary event A′ is given by P(A′) = 1 − P(A).
3. If the event A is a subset of event B, then P(A) ≤ P(B).
4. If A and B are any two events defined on the sample space S, not necessarily disjoint, then P(A ∪ B) = P(A) + P(B) − P(A ∩ B). This is also known as the addition law of probability.

You are advised to prove all the properties.
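A numerical check is not a proof, but it can make the properties above concrete. The sketch below assumes the uniform probability measure on one roll of a fair die; the event `at_least_four` and the function names are hypothetical. It also simulates rolls to illustrate the relative frequency m/n of the empirical definition in Section 3.1.2 settling near the exact value.

```python
import random
from fractions import Fraction

SAMPLE_SPACE = frozenset(range(1, 7))  # one roll of a fair die


def prob(event):
    """Uniform probability measure on the die: P(A) = |A| / |S|."""
    return Fraction(len(event & SAMPLE_SPACE), len(SAMPLE_SPACE))


odd = frozenset({1, 3, 5})
at_least_four = frozenset({4, 5, 6})  # hypothetical second event

# Property 2: P(A') = 1 - P(A)
complement_ok = prob(SAMPLE_SPACE - odd) == 1 - prob(odd)

# Property 4 (addition law): P(A ∪ B) = P(A) + P(B) - P(A ∩ B)
addition_ok = prob(odd | at_least_four) == (
    prob(odd) + prob(at_least_four) - prob(odd & at_least_four)
)


def relative_frequency(event, trials, seed=0):
    """Empirical estimate m/n from simulated rolls of a fair die."""
    rng = random.Random(seed)
    hits = sum(rng.randint(1, 6) in event for _ in range(trials))
    return hits / trials


print(complement_ok, addition_ok)
for n in (100, 10_000, 100_000):
    print(n, relative_frequency(odd, n))
```

The simulated ratios fluctuate for small n and move closer to 1/2 as n grows, which is the behaviour the limit in equation (2) describes.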

4 Permutation and Combination

A permutation is an ordered arrangement of objects. Suppose that from a set containing n objects we are to choose r objects and list them in order. In how many ways can we do this? The answer depends on whether or not we are allowed to duplicate objects in the list. If duplicates are allowed, we are sampling with replacement; if not, we are sampling without replacement.

First suppose that we are allowed to duplicate, i.e. we are sampling with replacement. The first object can be chosen in any of n ways. After we have chosen the first object, we put it back into the set and choose another object from the full set of n objects, so the second object can also be chosen in n ways. Continuing in this manner, there are $n^r$ ways of choosing r objects from a set of n objects.

Now suppose that the sampling is done without replacement. The first object can be chosen in n ways, the second in (n − 1) ways, the third in (n − 2) ways, and the r-th object in (n − r + 1) ways. Thus the total number of ways in which we can choose r objects from a set of n objects without replacement is n(n − 1)(n − 2)...(n − r + 1). This is known as permutation and is usually denoted by $^nP_r$ or by $(n)_r$.

Some Propositions to Remember

Proposition A: For a set of size n and a sample of size r, there are $n^r$ different ordered samples with replacement and

$$^nP_r = \frac{n!}{(n-r)!} = n(n-1)(n-2)\cdots(n-r+1)$$

different ordered samples without replacement.

Proposition B: The number of unordered samples of r objects selected from n objects without replacement is

$$^nC_r = \frac{n!}{r!\,(n-r)!}$$

Corollary: The number of orderings of n elements is $n(n-1)(n-2)\cdots 1 = n!$.

Let us now consider a special case, in which we are not interested in ordered samples, but in the constituents of the sample regardless of the order in which they were obtained. In particular, we ask the following question: if r objects are taken from a set of n objects without replacement and disregarding order of selection, how many different samples are possible? We know that the number of ordered samples without replacement is n(n − 1)(n − 2)...(n − r + 1), and since a sample of size r can be ordered in r! ways, the number of unordered samples is n(n − 1)(n − 2)...(n − r + 1)/r!. This is known as combination and is denoted by $^nC_r$ or by $\binom{n}{r}$:

$$^nC_r = \frac{n!}{r!\,(n-r)!} = \frac{n(n-1)(n-2)\cdots(n-r+1)}{r(r-1)(r-2)\cdots 3\cdot 2\cdot 1}$$
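The three counting formulas can be checked by brute-force enumeration for small values. The sketch below assumes n = 5 and r = 3, values chosen here purely for illustration, and compares the enumerated counts with the closed forms using the standard library functions `math.perm` and `math.comb` (available from Python 3.8).

```python
import math
from itertools import combinations, permutations, product

n, r = 5, 3  # illustrative values, not taken from the text
items = range(n)

# Ordered samples with replacement: n^r
ordered_with_replacement = len(list(product(items, repeat=r)))

# Ordered samples without replacement: nPr = n! / (n - r)!
ordered_without_replacement = len(list(permutations(items, r)))

# Unordered samples without replacement: nCr = n! / (r! (n - r)!)
unordered_without_replacement = len(list(combinations(items, r)))

assert ordered_with_replacement == n ** r
assert ordered_without_replacement == math.perm(n, r)
assert ordered_without_replacement == math.factorial(n) // math.factorial(n - r)
assert unordered_without_replacement == math.comb(n, r)
# Each unordered sample corresponds to r! ordered samples.
assert unordered_without_replacement == math.perm(n, r) // math.factorial(r)

print(ordered_with_replacement, ordered_without_replacement, unordered_without_replacement)
```

Enumeration is only practical for small n and r; the closed forms are what one would use in practice.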

5 Conditional Probability

5.1 Introduction

To introduce conditional probability we take the help of an example. Digitalis therapy is often beneficial to patients who have suffered congestive heart failure, a type of cardiac disease. But giving Digitalis to patients has a serious side effect: the patient runs the risk of Digitalis toxicity, which can prove fatal. To improve the chance of correct diagnosis, the concentration of Digitalis in the blood can be measured. A study was conducted on 135 cardiac patients to find the concentration of Digitalis in their blood. The table below gives the results, using the following notation: D+ congestive heart disease is present, D− congestive heart disease is not present, T+ there is a high concentration of Digitalis in the blood, and T− there is a low concentration of Digitalis in the blood.

         D+    D−    Total
T+       25    14     39
T−       18    78     96
Total    43    92    135

Thus, for example, 25 patients had a high concentration of Digitalis in the blood and the disease present. Assuming that the findings of the study hold for all cardiac patients, the probability of having congestive heart disease is 43/135 ≈ 0.319. But suppose now that we have a patient who shows a high concentration of Digitalis in the blood. What is the probability that this patient has congestive heart disease? To answer this question we can restrict our attention to the first row of the table. We see that out of the 39 cardiac patients who have a high concentration of Digitalis in the blood, 25 suffer from congestive heart disease. Thus the probability of having congestive heart disease, given that the patient has a high concentration of Digitalis in the blood, is 25/39 ≈ 0.641.

Let us understand the results. The probability of having congestive heart disease among cardiac patients is P(D+) ≈ 0.319, but when we get the additional information of Digitalis concentration in the blood, the probability of having congestive heart disease becomes P(D+|T+) ≈ 0.641, which is much higher than P(D+). P(D+) is the unconditional probability of having congestive heart disease, and P(D+|T+) is known as the conditional probability of having congestive heart disease given that we have the information T+. Conditional probability can thus be looked upon as the probability of a particular event when some additional information about the occurrence (or non-occurrence) of a related event is available.
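The two probabilities can be recomputed from the table counts. In this sketch the dictionary layout and variable names are assumptions made for illustration; the conditional probability is obtained by dividing the joint proportion by the proportion of patients with T+, which amounts to restricting attention to the first row of the table.

```python
from fractions import Fraction

# Counts from the Digitalis table, indexed as counts[test result][disease status].
counts = {
    "T+": {"D+": 25, "D-": 14},
    "T-": {"D+": 18, "D-": 78},
}
total = sum(sum(row.values()) for row in counts.values())

# Unconditional probability of disease: column total over grand total.
p_disease = Fraction(sum(row["D+"] for row in counts.values()), total)

# Probability of a high Digitalis concentration: row total over grand total.
p_high = Fraction(sum(counts["T+"].values()), total)

# Joint probability of disease and high concentration.
p_disease_and_high = Fraction(counts["T+"]["D+"], total)

# Conditional probability of disease given a high concentration.
p_disease_given_high = p_disease_and_high / p_high

print(total)
print(p_disease, float(p_disease))
print(p_disease_given_high, float(p_disease_given_high))
```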

5.2 Definition and Properties

Let A and B be two events with P(B) ≠ 0. Then the conditional probability of A given B is defined as

$$P(A \mid B) = \frac{P(A \cap B)}{P(B)}$$

The following properties hold for conditional probability:

1. P(S|A) = 1, where S is the sample space or sure event.
2. If A1 and A2 are two events such that A1 ∩ A2 = φ, then P(A1 ∪ A2 | B) = P(A1|B) + P(A2|B).
3. P(A′|B) = 1 − P(A|B).
4. Given the definition of conditional probability, we also have P(A|B) P(B) = P(A ∩ B). This is also known as the multiplication law of probability.

You are advised to prove all the properties.

5.3 Bayes Rule

Let the events A1, A2, A3, ..., An be defined on the sample space S, and let these events be exhaustive and mutually exclusive with P(Ai) > 0 for all i. Then for any event B defined on the sample space we have

$$P(B) = \sum_{i=1}^{n} P(A_i)\,P(B \mid A_i)$$

Proof: Since the events Ai are exhaustive, we have

$$\bigcup_{i=1}^{n} A_i = S$$

Now we can write event B as B ∩ S (from the Complementary Law of set theory), and as such we have:

$$B = B \cap S = B \cap \left(\bigcup_{i=1}^{n} A_i\right) = \bigcup_{i=1}^{n} (B \cap A_i)$$

Since the events A1, A2, A3, ..., An are mutually exclusive, the events B ∩ A1, B ∩ A2, B ∩ A3, ..., B ∩ An are also mutually exclusive. Therefore, by the addition theorem of probability for disjoint events, we have:

$$P(B) = P\left(\bigcup_{i=1}^{n} (B \cap A_i)\right) = \sum_{i=1}^{n} P(B \cap A_i)$$

And from the multiplication law of probability we know that P(B ∩ Ai) = P(Ai) P(B|Ai), hence

$$P(B) = \sum_{i=1}^{n} P(A_i)\,P(B \mid A_i)$$

QED
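The result just proved can be tried out on the Digitalis data of Section 5.1, taking A1 = T+ and A2 = T− as the exhaustive, mutually exclusive events and B = D+. Dividing P(B ∩ Ak) by this total gives the usual statement of Bayes' rule, P(Ak|B) = P(Ak) P(B|Ak) / Σ P(Ai) P(B|Ai), which the text does not write out explicitly. The function names below are hypothetical and the sketch is only an illustration.

```python
import math
from fractions import Fraction


def total_probability(priors, likelihoods):
    """P(B) = sum_i P(A_i) P(B|A_i) for exhaustive, mutually exclusive A_1..A_n."""
    if len(priors) != len(likelihoods):
        raise ValueError("one likelihood P(B|A_i) is needed for every P(A_i)")
    if not math.isclose(sum(priors), 1):
        raise ValueError("the P(A_i) of an exhaustive, exclusive set must sum to 1")
    return sum(p * l for p, l in zip(priors, likelihoods))


def posterior(priors, likelihoods, k):
    """Bayes' rule: P(A_k|B) = P(A_k) P(B|A_k) / P(B)."""
    return priors[k] * likelihoods[k] / total_probability(priors, likelihoods)


# Digitalis table: A_1 = T+, A_2 = T-, B = D+
priors = [Fraction(39, 135), Fraction(96, 135)]      # P(T+), P(T-)
likelihoods = [Fraction(25, 39), Fraction(18, 96)]   # P(D+|T+), P(D+|T-)

p_disease = total_probability(priors, likelihoods)
p_high_given_disease = posterior(priors, likelihoods, 0)

print(p_disease)             # agrees with the column total 43/135
print(p_high_given_disease)  # agrees with reading 25/43 off the D+ column
```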

5.4 Independence of Events

Two events A and B are said to be stochastically independent if the probability of occurrence of one of the events does not depend on the occurrence or non-occurrence of the other. Thus the two events are said to be independent iff P(A|B) = P(A|B′) = P(A). It can be shown that the two events are stochastically independent iff P(A ∩ B) = P(A)P(B). If the events A and B are independent, then the events (i) A′ and B, (ii) A and B′, and (iii) A′ and B′ are also independent.

You are advised to prove the above independence results.

Independence of Many Events

If there are more than two events, say A1, A2, A3, ..., An, then for stochastic independence it is required that the following conditions are met for all distinct indices i, j, k, ...:

$$
\begin{aligned}
P(A_i \cap A_j) &= P(A_i)\,P(A_j) \\
P(A_i \cap A_j \cap A_k) &= P(A_i)\,P(A_j)\,P(A_k) \\
&\;\;\vdots \\
P(A_1 \cap A_2 \cap \cdots \cap A_n) &= P(A_1)\,P(A_2)\cdots P(A_n)
\end{aligned}
$$

Note: For n events, $2^n - n - 1$ conditions are required to be met for stochastic independence.
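The need for all of these conditions, and not just the pairwise ones, can be seen in a small example. The sketch below uses two tosses of a fair coin and three events chosen here for illustration (they are not from the text): heads on the first toss, heads on the second toss, and both tosses showing the same face. It checks the product condition for every sub-collection of two or more events.

```python
from fractions import Fraction
from itertools import combinations, product
from math import prod

# Two tosses of a fair coin; the four outcomes are assumed equally likely.
SAMPLE_SPACE = frozenset(product("HT", repeat=2))


def prob(event):
    """Classical probability of an event on the two-toss sample space."""
    return Fraction(len(event), len(SAMPLE_SPACE))


def independence_conditions(events):
    """Yield (subset, holds) for every sub-collection of two or more events."""
    for size in range(2, len(events) + 1):
        for subset in combinations(events, size):
            joint = frozenset.intersection(*subset)
            yield subset, prob(joint) == prod(prob(e) for e in subset)


first_heads = frozenset(w for w in SAMPLE_SPACE if w[0] == "H")
second_heads = frozenset(w for w in SAMPLE_SPACE if w[1] == "H")
same_face = frozenset(w for w in SAMPLE_SPACE if w[0] == w[1])

events = [first_heads, second_heads, same_face]
results = [holds for _, holds in independence_conditions(events)]

print(len(results))  # number of conditions checked for n = 3
print(results)       # pairwise conditions first, then the triple condition
```

Here every pair of events satisfies the product condition, but the three events taken together do not, so they are pairwise independent without being stochastically independent as a collection.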

6 Some Cautions

The concept of probability is not easily understood by people, and it is therefore at times used to confuse the population at large. Take for example the following quote from the Los Angeles Times (August 24, 1987), which talks about AIDS:

"Several studies of people infected with the AIDS virus show that a single act of unprotected sex with an infected partner has a surprisingly low risk of infecting partners, probably one in 100 to one in 1000. For an average, consider the risk to be 1 in 500. Statistically, 500 acts of unprotected sex with an infected partner or 100 acts with five partners leads to a 100% probability of infection."

Have you spotted the flaw? There are many, but we will consider only a few. First and foremost, the report says that 500 acts with one infected partner will lead to infection. So suppose a person has 1000 acts of sex with an infected partner; what is his probability of getting infected? According to the reasoning of the report it is 2 (which, of course, is not possible theoretically).

Let us assume that the probability of infection is 1/500, as reported in the news. Now suppose a person has 500 acts of sex with an infected partner. What is his probability of getting infected? Let us work it out. Assume that the sexual acts are independent of each other and that each act has a 1/500 probability of causing infection. Then the probability of non-infection in each act is 1 − (1/500) = 499/500. So in 500 acts of unprotected sex, the probability of non-infection is $(499/500)^{500} \approx 0.3675$. Therefore the probability of infection is 1 − 0.3675 = 0.6325, which is much less than the 100% claimed by the report.

If you did not identify the flaws in the report, don't despair. Research has shown that people are not too good at understanding probability. For example, consider the following question: "If Linda is a 31 year old woman who is outspoken on social issues such as disarmament and equal rights, which of the following statements is more likely to be true?"

• Linda is a Bank Teller
• Linda is a Bank Teller and active in the feminist movement

More than 80% of those questioned chose the second statement, despite the fact that the correct answer is the first statement: the second statement describes a subset of the first, so it can never be the more probable of the two.

Even hardened professionals have difficulty with probabilistic calculations. For example, the following question was asked of 100 doctors: "In the absence of any special information, the probability that a woman has breast cancer is 1%. If the patient has breast cancer, the probability that the radiologist will correctly diagnose it is 80%. And if the patient has a benign lesion (no breast cancer), then the probability that the radiologist will incorrectly diagnose it as breast cancer is 10%. Then what is the probability that a patient with a positive mammogram actually has breast cancer?" 95 out of the 100 physicians estimated the probability to be about 75%. However, the correct probability, as given by Bayes rule, is about 7.5% (you can check this). So even experts make mistakes.

However, in spite of its misuse and frequent misinterpretation, probability is the cornerstone of all the sciences and also of various subjects in the humanities and management. So it is imperative that you have a clear idea of probability and understand its basics well.
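The two calculations in this section can be checked numerically. The sketch below is a minimal illustration with hypothetical function names. It assumes, as the text does, that the acts are independent with a risk of 1/500 each, and it applies Bayes rule to the figures quoted in the mammogram question.

```python
def prob_at_least_one(p_single, n_trials):
    """P(at least one occurrence in n independent trials) = 1 - (1 - p)^n."""
    return 1 - (1 - p_single) ** n_trials


def posterior_positive(prior, sensitivity, false_positive_rate):
    """Bayes rule for P(disease | positive test) with two hypotheses."""
    p_positive = prior * sensitivity + (1 - prior) * false_positive_rate
    return prior * sensitivity / p_positive


# Risk of 1 in 500 per act, 500 and 1000 independent acts.
p_500 = prob_at_least_one(1 / 500, 500)
p_1000 = prob_at_least_one(1 / 500, 1000)

# Mammogram question: 1% prior, 80% correct diagnosis, 10% false positives.
p_cancer_given_positive = posterior_positive(0.01, 0.80, 0.10)

print(round(p_500, 4))
print(round(p_1000, 4))
print(round(p_cancer_given_positive, 4))
```

Even with 1000 acts the probability stays below 1, as it must, and the posterior for the mammogram question comes out near 7.5% and not the 75% most physicians guessed.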

This document can be obtained from:

Rohit Vishal Kumar
Reader, Department of Marketing
Xavier Institute of Social Service
P.O. Box No: 7, Purulia Road
Ranchi - 834001, Jharkhand, India
Phone: (91-651) 2200-873 Ext. 308
Email: rohitvishalkumar@yahoo.com

Final Print on: July 29, 2008
© 2007, Rohit Vishal Kumar
