
probability

Abdallah Ezat, Physics department, Sohag University

April 11, 2016

Supervisor: Dr. Mohamed Mekhemar, Sohag University

Abstract

Statistics, the study of the collection, analysis, interpretation, presentation, and organization of data, is, alongside probability, the way we deal with large populations and approximations in experimental nuclear physics. In this report, I discuss some of the basic concepts of statistics that are used in specific nuclear experiments. At the end of the report, I discuss probability theory and two probability distributions, namely the normal (Gaussian) distribution and the Poisson distribution.

Contents

1. Statistics
1.1 Descriptive statistics
1.2 Inferential statistics: population and sample
1.3 Variable, observation, and data set
3. The Mean and the Median
3.1 The arithmetic Mean
3.2 The properties of arithmetic mean
3.3 The Median
4.1 Dispersion or variation
4.2 The range
4.3 The standard deviation
5. Probability
5.1 Experiment, outcomes, and sample space
5.2 Events, simple events, and compound events
5.3 Probability
5.4 Poisson's distribution
5.5 Normal distribution
References

1. Statistics

1.1 Descriptive statistics

The use of graphs, charts, and tables and the calculation of various statistical measures to organize

and summarize information is called descriptive statistics. Descriptive statistics help to

reduce our information to a manageable size and put it into focus.

1.2 Inferential statistics: population and sample

The complete collection of items or data under consideration in a statistical study

is referred to as the population. The portion of the population selected for analysis is called the

sample. Inferential statistics consists of techniques for reaching conclusions about a population

based upon information contained in a sample.

1.3 Variable, observation, and data set

A characteristic of interest concerning the individual elements of a population or a sample is called a variable. A variable is often represented by a letter such as x, y, or z. The value of a variable for one particular element from the sample or population is called an observation. A data set consists of the observations of a variable for the elements of a sample.

If we have large masses of raw data (collected data that have not been organized numerically, such as the masses of 100 male students obtained from an alphabetical listing of university records), it is often useful to distribute the data into classes or categories and to determine the number of individuals belonging to each class, which is called the class frequency. The following table is an example:

Mass (kg)    Number of students
60-62        5
63-65        18
66-68        42
69-71        27
72-74        8
Total        100

The first class or category, for example, consists of masses from 60 to 62 kilograms. Since 5 students have masses belonging to this class, the corresponding class frequency is 5.

A symbol defining a class, such as 60-62 in the above table, is called a class interval. The end numbers, 60 and 62, are the class limits (the lower class limit is 60 and the upper class limit is 62).
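
As a rough illustration of how class frequencies are obtained from raw data, the sketch below tallies a short list of readings into the class intervals of the table. The readings in `raw_masses` are hypothetical values chosen only for this example, and the function name `class_frequencies` is our own.

```python
from collections import Counter

# Class intervals from the table, as (lower limit, upper limit) in kg.
CLASSES = [(60, 62), (63, 65), (66, 68), (69, 71), (72, 74)]


def class_frequencies(masses, classes=CLASSES):
    """Count how many readings fall inside each class interval."""
    counts = Counter()
    for m in masses:
        for lower, upper in classes:
            if lower <= m <= upper:
                counts[(lower, upper)] += 1
                break
    return [counts[c] for c in classes]


# Hypothetical raw readings, rounded to the nearest kilogram.
raw_masses = [61, 64, 67, 67, 70, 73, 66, 68, 65, 62]
frequencies = class_frequencies(raw_masses)

for (lower, upper), f in zip(CLASSES, frequencies):
    print(f"{lower}-{upper}: {f}")
```

Readings that fall outside every class are simply ignored here; a real analysis would probably want to report them.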


3. The Mean and the Median

As most nuclear experiments are affected by many factors that may lead to inaccurate results, we should measure the same quantity more than once and then take a value that best represents the readings, i.e. their average. The average is a value which is typical or representative of a set of data. Since such typical values tend to lie centrally within a set of data arranged according to magnitude, averages are also called measures of central tendency. There are several types of averages, the most common being the mean (also known as the arithmetic mean) and the median.

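3.1 The arithmetic Mean

The arithmetic mean of a set of N numbers $X_1, X_2, \ldots, X_N$ is denoted by $\bar{X}$ and is defined as

$$\bar{X} = \frac{X_1 + X_2 + \cdots + X_N}{N} = \frac{1}{N}\sum_{i=1}^{N} X_i$$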

If the numbers $X_1, X_2, \ldots, X_k$ occur $f_1, f_2, \ldots, f_k$ times respectively (i.e. occur with frequencies $f_1, f_2, \ldots, f_k$), the arithmetic mean is:

$$\bar{X} = \frac{f_1 X_1 + f_2 X_2 + \cdots + f_k X_k}{f_1 + f_2 + \cdots + f_k}$$
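
The frequency-weighted mean can be sketched in a few lines of Python. As an illustration we apply it to the mass table given earlier, under the assumption (not stated in the table itself) that every class is represented by its midpoint; the function name `weighted_mean` is our own.

```python
def weighted_mean(values, freqs):
    """Arithmetic mean of values X_j that occur with frequencies f_j."""
    if len(values) != len(freqs):
        raise ValueError("values and freqs must have the same length")
    total = sum(freqs)
    if total <= 0:
        raise ValueError("total frequency must be positive")
    return sum(f * x for f, x in zip(freqs, values)) / total


# Assumption: each class 60-62, 63-65, ... is represented by its midpoint.
midpoints = [61, 64, 67, 70, 73]
frequencies = [5, 18, 42, 27, 8]

mean_mass = weighted_mean(midpoints, frequencies)
print(f"mean mass = {mean_mass:.2f} kg")  # 6745 / 100 = 67.45
```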

3.2 The properties of arithmetic mean

a) The algebraic sum of the deviations of a set of numbers from their arithmetic mean is zero.
Example: the deviations of the numbers 8, 3, 5, 12, 10 from their arithmetic mean 7.6 are 8 - 7.6, 3 - 7.6, 5 - 7.6, 12 - 7.6, 10 - 7.6, with algebraic sum 0.4 - 4.6 - 2.6 + 4.4 + 2.4 = 0.

b) The sum of the squares of the deviations of a set of numbers $X_j$ from any number a is a minimum if and only if $a = \bar{X}$.

c) If $f_1$ numbers have mean $m_1$, $f_2$ numbers have mean $m_2$, ..., $f_k$ numbers have mean $m_k$, then the mean of all the numbers is

$$\bar{X} = \frac{f_1 m_1 + f_2 m_2 + \cdots + f_k m_k}{f_1 + f_2 + \cdots + f_k}$$
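
The three properties can be checked numerically on the example set 8, 3, 5, 12, 10. The sketch below is only a spot check, not a proof; the split of the data into two groups for property (c) is an arbitrary choice made for illustration.

```python
data = [8, 3, 5, 12, 10]
mean = sum(data) / len(data)  # 38 / 5 = 7.6

# Property (a): the deviations from the mean sum to zero
# (up to floating-point rounding error).
deviation_sum = sum(x - mean for x in data)


def sum_sq_dev(values, a):
    """Sum of squared deviations of the values from the number a."""
    return sum((x - a) ** 2 for x in values)


# Property (b): the sum of squared deviations is smallest at a = mean.
at_mean = sum_sq_dev(data, mean)
nearby = [sum_sq_dev(data, mean + d) for d in (-1.0, -0.1, 0.1, 1.0)]

# Property (c): combine the means of two groups, weighted by group size.
group_1, group_2 = [8, 3], [5, 12, 10]  # arbitrary split of the data
m1 = sum(group_1) / len(group_1)
m2 = sum(group_2) / len(group_2)
combined = (len(group_1) * m1 + len(group_2) * m2) / (len(group_1) + len(group_2))

print(mean, deviation_sum, at_mean, combined)
```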

3.3 The Median

The median of a set of numbers arranged in order of magnitude (i.e. in an array) is the middle value or the arithmetic mean of the two middle values.

Example: the set of numbers 5, 5, 7, 9, 11, 12, 15, 18 has median $\frac{1}{2}(9 + 11) = 10$.

For grouped data (like the frequency table above) the median is given by

$$\text{Median} = L_1 + \left( \frac{\frac{N}{2} - (\sum f)_1}{f_{\text{median}}} \right) c$$

where
$L_1$ = lower class boundary of the median class (i.e. the class containing the median)
N = number of items in the data (i.e. total frequency)
$(\sum f)_1$ = sum of frequencies of all classes lower than the median class
$f_{\text{median}}$ = frequency of the median class
c = size of the median class interval
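
Both forms of the median can be sketched in Python. For the grouped case we apply the formula to the mass table, assuming the masses were recorded to the nearest kilogram so that the class boundaries lie halfway between adjacent class limits (59.5, 62.5, ...); this assumption and the function names are ours.

```python
def median(values):
    """Middle value, or mean of the two middle values, of the sorted data."""
    ordered = sorted(values)
    n = len(ordered)
    if n == 0:
        raise ValueError("median of empty data")
    mid = n // 2
    if n % 2 == 1:
        return ordered[mid]
    return (ordered[mid - 1] + ordered[mid]) / 2


def grouped_median(lower_boundaries, freqs, class_size):
    """Median = L1 + ((N/2 - (sum f)_1) / f_median) * c for grouped data."""
    n = sum(freqs)
    if n <= 0:
        raise ValueError("total frequency must be positive")
    cumulative = 0  # sum of frequencies of the classes below the current one
    for boundary, f in zip(lower_boundaries, freqs):
        if f > 0 and cumulative + f >= n / 2:
            return boundary + (n / 2 - cumulative) / f * class_size
        cumulative += f
    raise ValueError("median class not found")


raw_median = median([5, 5, 7, 9, 11, 12, 15, 18])

# Assumed class boundaries for the classes 60-62, 63-65, ..., 72-74.
boundaries = [59.5, 62.5, 65.5, 68.5, 71.5]
frequencies = [5, 18, 42, 27, 8]
mass_median = grouped_median(boundaries, frequencies, class_size=3)

print(raw_median, round(mass_median, 2))
```

With these assumptions the median class is 66-68, so the calculation is 65.5 + (50 - 23) / 42 × 3, or roughly 67.4 kg.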

4.1 Dispersion or variation

The degree to which numerical data tend to spread about an average value is called variation or

dispersion of the data. Various measures of dispersion or variation are available, the most common

being the range, the mean deviation, the semi-interquartile range, and the standard deviation. We will discuss

only the range and the standard deviation.

4.2 The range

The range of a set of numbers is the difference between the largest and the smallest number in the set.

Example: the range of the set 2, 3, 3, 5, 5, 5, 8, 10, 12 is 12 - 2 = 10. Sometimes the range is given by quoting the smallest and largest numbers. In the last example, for instance, the range could be indicated as 2 to 12 or 2-12.

4.3 The standard deviation

The standard deviation of a set of N numbers $X_1, X_2, \ldots, X_N$ is denoted by s (the symbol is used here both for the standard deviation of a sample of a population and for the standard deviation of the population itself) and is defined by

$$s = \sqrt{\frac{\sum_{i=1}^{N} (X_i - \bar{X})^2}{N}} = \sqrt{\frac{\sum x^2}{N}}$$

where x represents the deviation of each of the numbers $X_i$ from the mean $\bar{X}$.

Thus s is the root mean square¹ of the deviations from the mean, or the root mean square deviation.

1- As its name suggests, the root mean square is the square root of the mean of the squares of the data. This type of average is frequently used in physical applications.

The variance of a set of data is defined as the square of the standard deviation and is thus given by $s^2$ in the above equation.
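
A minimal sketch of these measures of dispersion, computed for the set 8, 3, 5, 12, 10 used earlier. The definition above divides by N, which corresponds to `statistics.pstdev` in Python's standard library (not `statistics.stdev`, which divides by N - 1); the function name `population_std` is our own.

```python
import math
import statistics


def population_std(values):
    """Root mean square deviation from the mean (divides by N)."""
    n = len(values)
    if n == 0:
        raise ValueError("standard deviation of empty data")
    mean = sum(values) / n
    return math.sqrt(sum((x - mean) ** 2 for x in values) / n)


data = [8, 3, 5, 12, 10]

data_range = max(data) - min(data)   # largest minus smallest value
s = population_std(data)             # standard deviation
variance = s ** 2                    # square of the standard deviation

# Cross-check against the standard library, which uses the same definition.
library_s = statistics.pstdev(data)

print(data_range, round(s, 3), round(variance, 2))
```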

4.3.1 The properties of the standard deviation

1. For normal distributions² it turns out that:

a) 68.3% of the cases are included between $\bar{X} - s$ and $\bar{X} + s$
(i.e. one standard deviation on either side of the mean) (see the figure below)

b) 95.45% of the cases are included between $\bar{X} - 2s$ and $\bar{X} + 2s$
(i.e. two standard deviations on either side of the mean)

c) 99.73% of the cases are included between $\bar{X} - 3s$ and $\bar{X} + 3s$
(i.e. three standard deviations on either side of the mean)

2. Suppose that two sets consisting of $N_1$ and $N_2$ numbers have variances given by $s_1^2$ and $s_2^2$ respectively and the same mean $\bar{X}$. Then the combined variance of both sets is given by

$$s^2 = \frac{N_1 s_1^2 + N_2 s_2^2}{N_1 + N_2}$$
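
The combined-variance formula can be checked on a small case. The two sets below are hypothetical numbers chosen so that both have the same mean (2), as the property requires; `statistics.pvariance` computes the variance with division by N, matching the definition used in this report.

```python
import statistics

# Hypothetical sets with the same mean (2), as the property requires.
set_1 = [1, 3]
set_2 = [0, 2, 4]

n1, n2 = len(set_1), len(set_2)
var_1 = statistics.pvariance(set_1)  # 1
var_2 = statistics.pvariance(set_2)  # 8/3

# Combined variance from the formula.
combined_formula = (n1 * var_1 + n2 * var_2) / (n1 + n2)

# Combined variance computed directly from the pooled data.
combined_direct = statistics.pvariance(set_1 + set_2)

print(combined_formula, combined_direct)
```

If the two means differed, the formula would underestimate the pooled variance, because it ignores the spread between the two means.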

2- A distribution that describes most statistical processes having a continuously varying magnitude.

5. Probability

5.1 Experiment, outcomes, and sample space

An experiment is any operation or procedure whose outcomes cannot be predicted with certainty.

The set of all possible outcomes for an experiment is called the sample space for the experiment.

Example: When a quality control technician selects an item for inspection from a production line, it may be classified as defective or non-defective. The sample space may be represented by S = {D, N}. When the blood type of a patient is determined, the sample space may be represented as S = {A, AB, B, O}.

5.2 Events, simple events, and compound events

An event is a subset of the sample space consisting of at least one outcome from the sample space. If the event consists of exactly one outcome, it is called a simple event. If an event consists of more than one outcome, it is called a compound event.

Example: A quality control technician selects two computer motherboards and classifies each as defective or non-defective. The sample space may be represented as S = {NN, ND, DN, DD}, where D represents a defective unit and N represents a non-defective unit. Let A represent the event that neither unit is defective and let B represent the event that at least one of the units is defective. A = {NN} is a simple event and B = {ND, DN, DD} is a compound event.
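
The motherboard example can be reproduced by enumerating the sample space in Python. This is only a sketch: outcomes are represented as two-letter strings, and the helper `classify` is a name we introduce for illustration.

```python
from itertools import product

# Each board is either non-defective (N) or defective (D); two boards are drawn.
sample_space = {"".join(pair) for pair in product("ND", repeat=2)}

# Event A: neither unit is defective. Event B: at least one unit is defective.
event_a = {outcome for outcome in sample_space if "D" not in outcome}
event_b = {outcome for outcome in sample_space if "D" in outcome}


def classify(event):
    """Label an event as simple (one outcome) or compound (several outcomes)."""
    if len(event) == 1:
        return "simple"
    if len(event) > 1:
        return "compound"
    raise ValueError("an event must contain at least one outcome")


print(sorted(sample_space))
print(sorted(event_a), classify(event_a))
print(sorted(event_b), classify(event_b))
```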

5.3 Probability

Probability is a measure of the likelihood of the occurrence of some event. There are several

different definitions of probability. Three definitions are discussed in the next section. The

particular definition that is utilized depends upon the nature of the event under consideration.

However, all the definitions satisfy the following two specific properties and obey the rules of

probability.

The probability of any event E is represented by the symbol P(E), which is read as "P of E" or as "the probability of event E". P(E) is a real number between zero and one, as indicated in the following inequality:

$$0 \le P(E) \le 1$$

The sum of the probabilities for all the simple events of an experiment must equal one. That is, if $E_1, E_2, \ldots, E_n$ are the simple events for an experiment, then the following equality must be true:

$$P(E_1) + P(E_2) + \cdots + P(E_n) = 1$$

This equation is also sometimes expressed as

$$P(S) = 1$$

The last equation states that the probability that some outcome in the sample space will occur is

one.

5.4 Poisson's distribution

A Poisson distribution is the probability distribution that results from a Poisson experiment.

Attributes of a Poisson Experiment

A Poisson experiment is a statistical experiment that has the following properties:

The experiment results in outcomes that can be classified as successes or failures.

The average number of successes (μ) that occurs in a specified region is known.

The probability that a success will occur is proportional to the size of the region.

The probability that a success will occur in an extremely small region is virtually zero.

Note that the specified region could take many forms. For instance, it could be a length, an area, a volume, a period of time, etc.

Notation

The following notation is helpful when we talk about the Poisson distribution.

e: A constant equal to approximately 2.71828. (Actually, e is the base of the natural logarithm system.)

μ: The mean number of successes that occur in a specified region.

x: The actual number of successes that occur in a specified region.

P(x; μ): The Poisson probability that exactly x successes occur in a Poisson experiment, when the mean number of successes is μ.

Poisson Distribution

A Poisson random variable is the number of successes that result from a Poisson experiment. The probability distribution of a Poisson random variable is called a Poisson distribution.

Given the mean number of successes (μ) that occur in a specified region, we can compute the Poisson probability based on the following formula.

Poisson Formula. Suppose we conduct a Poisson experiment, in which the average number of successes within a given region is μ. Then, the Poisson probability is:

$$P(x; \mu) = \frac{e^{-\mu}\, \mu^{x}}{x!}$$

where x is the actual number of successes that result from the experiment, and e is approximately equal to 2.71828.

The Poisson distribution has the following properties:

The mean of the distribution is equal to μ.

The variance is also equal to μ.
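
The Poisson formula and its two properties can be checked numerically. The sketch below evaluates P(x; μ) for an arbitrarily chosen mean μ = 2 and sums over x = 0, ..., 59; truncating the sum there is an assumption that is harmless for such a small mean, because the remaining terms are negligible.

```python
import math


def poisson_pmf(x, mu):
    """Poisson probability P(x; mu) = e^(-mu) * mu^x / x!."""
    if x < 0 or mu < 0:
        raise ValueError("x and mu must be non-negative")
    return math.exp(-mu) * mu ** x / math.factorial(x)


mu = 2.0           # arbitrary mean chosen for this check
xs = range(60)     # truncated support; the tail beyond 59 is negligible here
probs = [poisson_pmf(x, mu) for x in xs]

total = sum(probs)                                            # should be ~1
mean = sum(x * p for x, p in zip(xs, probs))                  # should be ~mu
variance = sum((x - mean) ** 2 * p for x, p in zip(xs, probs))  # should be ~mu

print(round(total, 6), round(mean, 6), round(variance, 6))
```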

Example 1

The average number of homes sold by the Acme Realty company is 2 homes per day. What is the probability that exactly 3 homes will be sold tomorrow?

Solution: This is a Poisson experiment in which we know the following:

μ = 2; since 2 homes are sold per day, on average.

x = 3; since we want to find the likelihood that 3 homes will be sold tomorrow.

e = 2.71828; since e is a constant equal to approximately 2.71828.

We plug these values into the Poisson formula as follows:

$$P(x; \mu) = \frac{e^{-\mu}\, \mu^{x}}{x!}$$

$$P(3; 2) = \frac{(2.71828^{-2})\,(2^{3})}{3!} = \frac{(0.13534)\,(8)}{6} = 0.180$$

Thus, the probability of selling 3 homes tomorrow is 0.180.
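
The same calculation can be reproduced step by step in Python, which is a convenient way to check the arithmetic. The variable names are ours.

```python
import math

mu = 2   # average number of homes sold per day
x = 3    # number of sales whose probability we want

exp_term = math.exp(-mu)         # e^(-2), about 0.13534
power_term = mu ** x             # 2^3 = 8
factorial_term = math.factorial(x)  # 3! = 6

p = exp_term * power_term / factorial_term
print(f"P({x}; {mu}) = {p:.3f}")  # P(3; 2) = 0.180
```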

5.5 Normal distribution

The normal distribution refers to a family of continuous probability distributions described by the normal equation.

The Normal Equation

The normal distribution is defined by the following equation:

Normal equation. The value of the random variable Y is:

$$Y = \frac{1}{\sigma \sqrt{2\pi}}\; e^{-(x - \mu)^2 / (2\sigma^2)}$$

where x is a value of the normal random variable X, μ is the mean, σ is the standard deviation, π is approximately 3.14159, and e is approximately 2.71828.

The random variable X in the normal equation is called the normal random variable. The normal equation is the probability density function for the normal distribution.

The Normal Curve

The graph of the normal distribution depends on two factors - the mean and the standard deviation.

The mean of the distribution determines the location of the center of the graph, and the standard

deviation determines the height and width of the graph. When the standard deviation is large, the

curve is short and wide; when the standard deviation is small, the curve is tall and narrow. All

normal distributions look like a symmetric, bell-shaped curve, as shown below.

The curve on the left is shorter and wider than the curve on the right, because the curve on the left

has a bigger standard deviation.
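
The effect of the standard deviation on the shape of the curve can be seen by evaluating the normal equation directly. The sketch below compares the peak height of two curves with the same mean; the values σ = 1 and σ = 2 are arbitrary choices for illustration, and `normal_pdf` is our own name.

```python
import math


def normal_pdf(x, mu, sigma):
    """Normal equation: Y = 1 / (sigma * sqrt(2*pi)) * exp(-(x - mu)^2 / (2 sigma^2))."""
    if sigma <= 0:
        raise ValueError("sigma must be positive")
    coeff = 1.0 / (sigma * math.sqrt(2.0 * math.pi))
    return coeff * math.exp(-((x - mu) ** 2) / (2.0 * sigma ** 2))


# The curve peaks at x = mu; the peak height is 1 / (sigma * sqrt(2*pi)).
peak_narrow = normal_pdf(0.0, mu=0.0, sigma=1.0)  # small sigma: tall and narrow
peak_wide = normal_pdf(0.0, mu=0.0, sigma=2.0)    # large sigma: short and wide

# Far from the mean the ordering reverses: the wide curve is the higher one.
tail_narrow = normal_pdf(3.0, mu=0.0, sigma=1.0)
tail_wide = normal_pdf(3.0, mu=0.0, sigma=2.0)

print(round(peak_narrow, 4), round(peak_wide, 4))
```

Doubling σ halves the peak height, which is what keeps the total area under the curve unchanged.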

The normal distribution is a continuous probability distribution. This has several implications for

probability.

The total area under the normal curve is equal to 1.

The probability that a normal random variable X equals any particular value is 0.

The probability that X is greater than a equals the area under the normal curve bounded by

a and plus infinity (as indicated by the non-shaded area in the figure below).

The probability that X is less than a equals the area under the normal curve bounded by a

and minus infinity (as indicated by the shaded area in the figure below).

Additionally, every normal curve (regardless of its mean or standard deviation) conforms to the

following "rule".

About 68% of the area under the curve falls within 1 standard deviation of the mean.

About 95% of the area under the curve falls within 2 standard deviations of the mean.

About 99.7% of the area under the curve falls within 3 standard deviations of the mean.

Collectively, these points are known as the empirical rule or the 68-95-99.7 rule. Clearly, given a

normal distribution, most outcomes will be within 3 standard deviations of the mean.
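
The percentages in this rule can be computed rather than memorized. For any normal distribution, the area within k standard deviations of the mean equals erf(k / √2), where erf is the error function available in Python's `math` module; the function name `fraction_within` is ours.

```python
import math


def fraction_within(k):
    """Area under any normal curve within k standard deviations of the mean."""
    if k < 0:
        raise ValueError("k must be non-negative")
    return math.erf(k / math.sqrt(2.0))


for k in (1, 2, 3):
    print(f"within {k} standard deviation(s): {fraction_within(k):.4f}")
```

The result does not depend on the mean or the standard deviation, which is why the rule holds for every normal curve.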

To find the probability associated with a normal random variable, use a graphing calculator, an online normal distribution calculator, or a normal distribution table. In the examples below, we use Stat Trek's Normal Distribution Calculator, a free online tool.

Example 1

An average light bulb manufactured by the Acme Corporation lasts 300 days with a standard

deviation of 50 days. Assuming that bulb life is normally distributed, what is the probability that an

Acme light bulb will last at most 365 days?

Solution: Given a mean of 300 days and a standard deviation of 50 days, we want to find the

cumulative probability that bulb life is less than or equal to 365 days. Thus, we know the following:

The value of the normal random variable is 365 days.

The mean is equal to 300 days.

The standard deviation is equal to 50 days.

We enter these values into the Normal Distribution Calculator and compute the cumulative

probability. The answer is: P( X < 365) = 0.90. Hence, there is a 90% chance that a light bulb will

burn out within 365 days.
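
Instead of an online calculator, the cumulative probability can be sketched in Python. We first convert 365 days to a standard score z = (x - μ) / σ and then use the standard relation Φ(z) = ½ [1 + erf(z / √2)] for the normal cumulative distribution; the function name `normal_cdf` is our own.

```python
import math


def normal_cdf(x, mu, sigma):
    """P(X <= x) for a normal random variable with mean mu and std dev sigma."""
    if sigma <= 0:
        raise ValueError("sigma must be positive")
    z = (x - mu) / sigma  # standard score
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


z = (365 - 300) / 50             # 1.3 standard deviations above the mean
p = normal_cdf(365, mu=300, sigma=50)

print(f"z = {z:.1f}, P(X <= 365) = {p:.2f}")
```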


Example 2

Suppose scores on an IQ test are normally distributed. If the test has a mean of 100 and a standard

deviation of 10, what is the probability that a person who takes the test will score between 90 and

110?

Solution: Here, we want to know the probability that the test score falls between 90 and 110. The

"trick" to solving this problem is to realize the following:

P( 90 < X < 110 ) = P( X < 110 ) - P( X < 90 )

We use the Normal Distribution Calculator to compute both probabilities on the right side of the

above equation.

To compute P( X < 110 ), we enter the following inputs into the calculator: The value of the

normal random variable is 110, the mean is 100, and the standard deviation is 10. We find

that P( X < 110 ) is 0.84.

To compute P( X < 90 ), we enter the following inputs into the calculator: The value of the

normal random variable is 90, the mean is 100, and the standard deviation is 10. We find that

P( X < 90 ) is 0.16.

We use these findings to compute our final answer as follows:

P( 90 < X < 110 ) = P( X < 110 ) - P( X < 90 )

P( 90 < X < 110 ) = 0.84 - 0.16

P( 90 < X < 110 ) = 0.68

Thus, about 68% of the test scores will fall between 90 and 110.
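
The same subtraction of cumulative probabilities can be sketched with `statistics.NormalDist` from Python's standard library (available from Python 3.8), used here in place of the online calculator.

```python
from statistics import NormalDist

# IQ scores: mean 100, standard deviation 10.
iq = NormalDist(mu=100, sigma=10)

p_below_110 = iq.cdf(110)  # P(X < 110)
p_below_90 = iq.cdf(90)    # P(X < 90)

# P(90 < X < 110) = P(X < 110) - P(X < 90)
p_between = p_below_110 - p_below_90

print(f"{p_below_110:.2f} - {p_below_90:.2f} = {p_between:.2f}")
```

Since 90 and 110 lie exactly one standard deviation on either side of the mean, the answer agrees with the first line of the 68-95-99.7 rule.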


References

[1] Spiegel, Murray R. Schaum's Outline of Theory and Problems of Statistics.

New York: Schaum Pub., 1961. Print.

[2] Dodge, Yadolah. The Oxford Dictionary of Statistical Terms. Oxford: Oxford

UP, 2003. Web.

[3] "Normal Distribution" <http://stattrek.com/probabilitydistributions/normal.aspx>.

[4] "Poisson Distribution" <http://stattrek.com/probabilitydistributions/poisson.aspx>.
