If we roll a standard 6-sided die, describe the sample space and some simple events.
The sample space is the set of all possible simple events: {1,2,3,4,5,6}
Some examples of simple events:
We roll a 1
We roll a 5
Some compound events:
We roll a number bigger than 4
We roll an even number
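The events above can be expressed as subsets of the sample space. A minimal Python sketch (the variable names here are illustrative, not from the text):

```python
# Sample space for one roll of a standard 6-sided die
sample_space = {1, 2, 3, 4, 5, 6}

# Simple events contain exactly one outcome
roll_a_1 = {1}
roll_a_5 = {5}

# Compound events contain more than one outcome
bigger_than_4 = {x for x in sample_space if x > 4}      # {5, 6}
even_number = {x for x in sample_space if x % 2 == 0}   # {2, 4, 6}

print(bigger_than_4, even_number)
```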
Basic Probability
RIOJA, ANNA MILCA V.
BSHM 3-2
Given that all outcomes are equally likely, we can compute the probability of an event
E using this formula:
P(E) = (number of outcomes in E) / (total number of outcomes in the sample space)
Since every probability lies between 0 and 1, an answer that does not satisfy
0 ≤ P(E) ≤ 1 is obviously wrong! We note that (A ∩ B) ∩ (A ∩ Bᶜ) = ∅ and
(A ∩ B) ∪ (A ∩ Bᶜ) = A, and so it follows that P(A) = P(A ∩ B) + P(A ∩ Bᶜ).
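The equally-likely formula and the identity P(A) = P(A ∩ B) + P(A ∩ Bᶜ) can be verified with a short Python sketch (the `prob` helper is illustrative, not from the text; exact fractions avoid rounding issues):

```python
from fractions import Fraction

sample_space = {1, 2, 3, 4, 5, 6}

def prob(event, space=sample_space):
    """P(E) = |E| / |S| when all outcomes are equally likely."""
    return Fraction(len(event & space), len(space))

A = {2, 4, 6}      # roll an even number
B = {1, 2, 3, 4}   # roll at most 4
B_c = sample_space - B

# P(A) decomposes as P(A ∩ B) + P(A ∩ B^c)
print(prob(A), prob(A & B) + prob(A & B_c))  # 1/2 1/2
```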
C. Conditional probability
Each of the probabilities computed in the previous section (e.g., P(boy), P(7 years of
age)) is an unconditional probability, because the denominator for each is the total
population size (N=5,290) reflecting the fact that everyone in the entire population is
eligible to be selected. However, sometimes it is of interest to focus on a particular
subset of the population (e.g., a sub-population). For example, suppose we are
interested just in the girls and ask the question, what is the probability of selecting a 9
year old from the sub-population of girls? There is a total of NG=2,730 girls (here NG
refers to the population of girls), and the probability of selecting a 9 year old from the
sub-population of girls is written as follows:
where | girls indicates that we are conditioning the question to a specific subgroup, i.e.,
the subgroup specified to the right of the vertical line.
The conditional probability is computed using the same approach we used to compute
unconditional probabilities. In this case:
P(9 years of age | girls) = 461/2,730 = 0.169
This also means that 16.9% of the girls are 9 years of age. Note that this is not the same
as the probability of selecting a 9-year old girl from the overall population, which is P(girl
who is 9 years of age) = 461/5,290 = 0.087.
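The distinction between the conditional and the unconditional probability comes down to the denominator, which a short Python sketch (using the counts given in the text) makes explicit:

```python
# Counts from the example population in the text
N = 5290            # total population size
N_girls = 2730      # size of the sub-population of girls
n_9yo_girls = 461   # girls who are 9 years of age

# Conditional probability: denominator is restricted to the subgroup
p_9_given_girl = n_9yo_girls / N_girls
# Unconditional (joint) probability: denominator is the whole population
p_girl_and_9 = n_9yo_girls / N

print(round(p_9_given_girl, 3), round(p_girl_and_9, 3))  # 0.169 0.087
```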
D. Random variables
A random variable is a numerical description of the outcome of a statistical experiment.
A random variable that may assume only a finite number or an infinite sequence of
values is said to be discrete; one that may assume any value in some interval on the real
number line is said to be continuous. For instance, a random variable representing the
number of automobiles sold at a particular dealership on one day would be discrete,
while a random variable representing the weight of a person in kilograms (or pounds)
would be continuous.
A continuous random variable may assume any value in an interval on the real number
line or in a collection of intervals. Since there is an infinite number of values in any
interval, it is not meaningful to talk about the probability that the random variable will
take on a specific value; instead, the probability that a continuous random variable will
lie within a given interval is considered.
The expected value, or mean, of a random variable, denoted by E(x) or μ, is a weighted
average of the values the random variable may assume. In the discrete case the weights
are given by the probability mass function, and in the continuous case the weights are
given by the probability density function. The formulas for computing the expected
values of discrete and continuous random variables are given by equations 2 and 3,
respectively.
E(x) = Σxf(x) (2)
E(x) = ∫xf(x)dx (3)
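Equations 2 and 3 can be illustrated with a short Python sketch: the discrete sum for a fair die, and a midpoint Riemann-sum approximation of the integral for the uniform density on [0, 1] (both examples are mine, not from the text):

```python
from fractions import Fraction

# Discrete case, equation (2): E(x) = Σ x f(x), for a fair 6-sided die
pmf = {x: Fraction(1, 6) for x in range(1, 7)}
expected = sum(x * p for x, p in pmf.items())
print(expected)  # 7/2

# Continuous case, equation (3): E(x) = ∫ x f(x) dx, approximated with a
# midpoint Riemann sum for the uniform density f(x) = 1 on [0, 1]
n = 100_000
dx = 1.0 / n
approx = sum((i + 0.5) * dx * 1.0 * dx for i in range(n))
print(round(approx, 6))  # ≈ 0.5, the mean of the uniform distribution
```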
The variance of a random variable, denoted by Var(x) or σ², is a weighted average of the
squared deviations from the mean. In the discrete case the weights are given by the
probability mass function, and in the continuous case the weights are given by the
probability density function. The formulas for computing the variances of discrete and
continuous random variables are given by equations 4 and 5, respectively. The standard
deviation, denoted σ, is the positive square root of the variance. Since the standard
deviation is measured in the same units as the random variable and the variance is
measured in squared units, the standard deviation is often the preferred measure.
Var(x) = σ² = Σ(x − μ)²f(x) (4)
Var(x) = σ² = ∫(x − μ)²f(x)dx (5)
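Equation 4 and the standard deviation can be computed for the same fair-die example in a few lines of Python (the example is mine, not from the text):

```python
from fractions import Fraction

# Var(x) = Σ (x − μ)² f(x), equation (4), for a fair 6-sided die
pmf = {x: Fraction(1, 6) for x in range(1, 7)}
mu = sum(x * p for x, p in pmf.items())                # μ = 7/2
var = sum((x - mu) ** 2 * p for x, p in pmf.items())   # σ² = 35/12
std = float(var) ** 0.5  # σ, in the same units as x
print(var, round(std, 3))
```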
E. Probability distributions
In probability theory and statistics, a probability distribution is the mathematical
function that gives the probabilities of occurrence of different possible outcomes for an
experiment. It is a mathematical description of a random phenomenon in terms of its
sample space and the probabilities of events. The probability distribution for a random
variable describes how the probabilities are distributed over the values of the random
variable. For a discrete random variable, x, the probability distribution is defined by
a probability mass function, denoted by f(x). This function provides the probability for
each value of the random variable. In the development of the probability function for a
discrete random variable, two conditions must be satisfied: (1) f(x) must be nonnegative
for each value of the random variable, and (2) the sum of the probabilities for each
value of the random variable must equal one.
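The two conditions on a probability mass function translate directly into a small validity check; the sketch below (my own helper, not from the text) represents a pmf as a dict from values to probabilities:

```python
def is_valid_pmf(f):
    """Check the two pmf conditions: (1) f(x) >= 0 for every value of the
    random variable, and (2) the probabilities sum to one."""
    probs = list(f.values())
    return all(p >= 0 for p in probs) and abs(sum(probs) - 1) < 1e-9

die = {x: 1 / 6 for x in range(1, 7)}
bad = {0: 0.5, 1: 0.6}  # probabilities sum to 1.1, so not a valid pmf
print(is_valid_pmf(die), is_valid_pmf(bad))  # True False
```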
Special probability distributions
The binomial distribution
Two of the most widely used discrete probability distributions are the binomial and
Poisson. The binomial probability mass function (equation 6) provides the probability
that x successes will occur in n trials of a binomial experiment.
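Equation 6 itself is cut off in this copy of the text; the standard binomial probability mass function it refers to is f(x) = C(n, x) pˣ(1 − p)ⁿ⁻ˣ, which can be sketched in Python as:

```python
from math import comb

def binomial_pmf(x, n, p):
    """Probability of exactly x successes in n independent trials,
    each succeeding with probability p (the standard binomial pmf)."""
    return comb(n, x) * p**x * (1 - p)**(n - x)

# Example: probability of exactly 2 heads in 4 fair coin flips
print(binomial_pmf(2, 4, 0.5))  # 0.375
```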