SLUO Lecture 9: Unfolding

Roger Barlow, Manchester University, 30th August 2000

1. The problem

You are interested in the distribution of events in some quantity x. Unfortunately x is not directly measurable; the best you can obtain is some `smeared' quantity y. The connection can be written in terms of functions or matrices:

g(y) = \int A(y, x)\, f(x)\, dx, \qquad g_i = \sum_j A_{ij} f_j    (1)

A is the folding of the wanted f(x) distribution into the observed g(y) distribution; it includes acceptance and resolution. There are lots of examples: smearing of the acoplanarity in Z^0 decays to lepton pairs; observed and true charged multiplicity in events; measured and true mass of W particles decaying to jets; the measured mass of two pions (in the \rho region); visible energy and true energy in photon-photon collisions. The `unfolding problem' is the problem of getting from the observed histogram of g_i values (call it \tilde{g}) to an estimate of the original values \tilde{f}. A is completely understood (probably from Monte Carlo). You might think this was therefore an easy problem. Dream on!
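To make equation (1) concrete, here is a minimal sketch (Python with NumPy; the Gaussian resolution of 0.5 units, the binning, and the toy `true' distribution are all invented for illustration) that estimates A from a toy Monte Carlo and folds a true histogram into the expected observed one:

```python
import numpy as np

rng = np.random.default_rng(1)
edges = np.linspace(0.0, 9.0, 10)          # 9 bins of unit width

# Toy Monte Carlo: true values x, smeared observations y.
x = rng.uniform(0.0, 9.0, 200_000)
y = x + rng.normal(0.0, 0.5, x.size)       # assumed resolution: 0.5 units

# A_ij = P(observed in bin i | true in bin j); events smeared out of
# range simply drop out, so the columns also encode the acceptance.
counts2d, _, _ = np.histogram2d(y, x, bins=[edges, edges])
gen_per_true_bin = np.histogram(x, bins=edges)[0]
A = counts2d / gen_per_true_bin

# Fold a 'true' distribution f into the expected observation g (equation 1).
f = np.array([10, 30, 80, 150, 200, 150, 80, 30, 10], dtype=float)
g = A @ f
print(np.round(g, 1))
```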

1.1 Correction Factors - a disaster

A simple approach is to evaluate correction factors from the Monte Carlo. If the progress from `true' quantities to fully simulated ones multiplies the content of bin i by C_i, this is recovered by dividing the observed data in bin i by C_i. This is horrible: the data will tend to follow the MC that gave you the correction factors. It can only be justified if the smearing process is due to losses (acceptance) and there is no bin-to-bin (resolution) movement.

Example 1: Suppose there are just 2 bins. Your (Standard Model, perhaps?) MC gave 75 events in bin 1 and 25 in bin 2 at the true level, changing to 50 and 50 after detector simulation. You observe 5 and 5 in the real data, so you `correct' to 7.5 and 2.5. The fact that bin 2 is corrected upwards shows that this is not just a variable efficiency. What is probably happening is that your detector is smearing the information so much that the bin is completely random. So your data really tells you nothing, yet after `correction' it gives precise detail, consistent with the MC.
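Example 1 in code (a minimal sketch; the numbers are exactly those of the text):

```python
import numpy as np

mc_true = np.array([75.0, 25.0])   # MC at the true level
mc_reco = np.array([50.0, 50.0])   # same MC after detector simulation
C = mc_reco / mc_true              # bin-by-bin correction factors

data = np.array([5.0, 5.0])        # what you actually observe
print(data / C)                    # [7.5, 2.5]: precise 'detail' from nothing
```

Whatever the data said, the `corrected' result reproduces the MC's 3:1 ratio.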

1.2 Matrix inversion - another disaster

The matrix A_ij is known, so it can be inverted and applied to the observed data:

\hat{f}_i = \sum_j A^{-1}_{ij} g_j, \qquad \hat{f} = A^{-1} \tilde{g}    (2)

This gives a solution which is, in a sense, technically correct.

Example 2: Suppose there are just 2 bins, for both x and y. If an event lies in bin 1 in x, it has a 60% chance of lying in bin 1 in y and a 40% chance of lying in bin 2; likewise an event in bin 2 has a 60% chance of staying and a 40% chance of moving. The matrix is

A = \begin{pmatrix} 0.6 & 0.4 \\ 0.4 & 0.6 \end{pmatrix}

and the inverse matrix is

A^{-1} = \begin{pmatrix} 3 & -2 \\ -2 & 3 \end{pmatrix}

If we measure (10, 10), unfolding gives (10, 10). But (12, 8) unfolds to (20, 0); (9, 11) gives (5, 15), which is odd but not obviously wrong; and (13, 7) gives (25, -5), obviously wrong. All these observed values are within a mild statistical fluctuation of a 50/50 split between the two bins, yet the inverse matrix forces a large difference between the contents of the unfolded bins.

In a sense the result is right: a distribution of (20, 0) subjected to the above smearing matrix gives exactly the observed distribution (12, 8), so any subsequent \chi^2 calculations carried out using these values will come out sensible. But the graphs don't look right: there is a large negative correlation between adjacent bin contents.

For another example, consider a 9-bin histogram with the resolution matrix

A = \begin{pmatrix}
0.4 & 0.2 & 0.1 & 0 & 0 & 0 & 0 & 0 & 0 \\
0.2 & 0.4 & 0.2 & 0.1 & 0 & 0 & 0 & 0 & 0 \\
0.1 & 0.2 & 0.4 & 0.2 & 0.1 & 0 & 0 & 0 & 0 \\
0 & 0.1 & 0.2 & 0.4 & 0.2 & 0.1 & 0 & 0 & 0 \\
0 & 0 & 0.1 & 0.2 & 0.4 & 0.2 & 0.1 & 0 & 0 \\
0 & 0 & 0 & 0.1 & 0.2 & 0.4 & 0.2 & 0.1 & 0 \\
0 & 0 & 0 & 0 & 0.1 & 0.2 & 0.4 & 0.2 & 0.1 \\
0 & 0 & 0 & 0 & 0 & 0.1 & 0.2 & 0.4 & 0.2 \\
0 & 0 & 0 & 0 & 0 & 0 & 0.1 & 0.2 & 0.4
\end{pmatrix}
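The arithmetic of Example 2, as a sketch:

```python
import numpy as np

A = np.array([[0.6, 0.4],
              [0.4, 0.6]])
A_inv = np.linalg.inv(A)               # [[ 3, -2], [-2,  3]]

for g in ([10, 10], [12, 8], [9, 11], [13, 7]):
    print(g, "->", A_inv @ np.array(g, dtype=float))
# (10,10) -> (10,10); (12,8) -> (20,0); (9,11) -> (5,15); (13,7) -> (25,-5)
```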

[Figure 1: Folding and unfolding. A `typical' ideal distribution, the effect of the smearing on it, and the result of multiplying the smeared distribution by the inverse smearing matrix, getting back where one started.]

[Figure 2: Folding and unfolding (continued). The same distribution is smeared, then `randomised' as would happen in reality (actually more tamely: 1 was added to or subtracted from each bin alternately), and the resulting histogram is unfolded using the inverse of the matrix. The horrifying output is worse than useless.]

1.3 The right approach

So why does the technically correct solution fail? Because we know something more that we haven't told the fit: these are bins on a physically meaningful scale, and we have good reason to believe that the contents should not oscillate wildly from one bin to the next. Including this information in the fit (in some way) is a process called regularization.

This proceeds by saying: there is a term which expresses the disagreement between the prediction A\tilde{f} and the data \tilde{g}; this is often a \chi^2, but in general it is the log of a likelihood. There is also a term S which expresses the `spikiness' of the distribution. We minimise

- \ln L + \lambda S    (3)

where \lambda is the regularisation parameter, to be chosen by you. If it is zero then we can get spiky distributions that fit the data perfectly. Moderate \lambda results in smoother distributions that don't agree 100% with the observed data but are very close. Large \lambda will smooth out everything.
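The Figure 2 experiment is easy to reproduce numerically. A sketch, assuming the 9-bin matrix of section 1.2 and an invented smooth `true' distribution:

```python
import numpy as np

# The banded 9x9 resolution matrix of section 1.2.
A = sum(np.eye(9, k=k) * w for k, w in
        [(-2, 0.1), (-1, 0.2), (0, 0.4), (1, 0.2), (2, 0.1)])

f_true = np.array([5, 20, 50, 100, 140, 100, 50, 20, 5], dtype=float)
g = A @ f_true                                    # fold, as in Figure 1

# 'Randomise' very tamely: +1 / -1 on alternate bins, as in Figure 2.
g_meas = g + np.array([1, -1, 1, -1, 1, -1, 1, -1, 1], dtype=float)

f_hat = np.linalg.solve(A, g_meas)                # unfold by matrix inversion
print(np.round(f_hat, 1))                         # large bin-to-bin oscillations
```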

2. Matrix Methods

If the log likelihood is taken as a \chi^2, then the first term of Eq. 3 can be written (dropping a factor of 2)

\chi^2 = (A\tilde{f} - \tilde{g})^T V^{-1} (A\tilde{f} - \tilde{g})    (4)

The bin contents will be independent, so the matrix V is diagonal. However the bin errors are in general different (they are like \sqrt{n}), which matters in a minimisation. This can be taken care of by a suitable rescaling of the data (see e.g. Ref [1] section 5), so for present purposes we will write

\chi^2 = (A\tilde{f} - \tilde{g})^T (A\tilde{f} - \tilde{g})    (5)

The matrix A can be diagonalised. It has a set of eigenvalues \lambda_i and orthogonal eigenvectors: these are the distributions whose shape is unaffected by the smearing; they just get multiplied by their eigenvalue. Figure 3 shows the 9 eigenvectors of the previous smearing matrix, with their eigenvalues.

[Figure 3: Eigenvectors of the previous matrix.]

The unfolding \hat{f} = A^{-1}\tilde{g} is equivalent to decomposing \tilde{g} into the eigenvectors, dividing each by its appropriate eigenvalue, and adding them up again to give \hat{f}. Clearly it is the eigenvectors with small eigenvalues that are giving trouble: they have rapid fluctuations, and they make large contributions to \hat{f} because of the division by small \lambda_i. It is tempting to throw away the contributions from the low-eigenvalue eigenvectors.
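A sketch of the diagonalisation using the library routine (section 2.1 below gives the do-it-yourself recipe); the matrix is the 9x9 one from section 1.2:

```python
import numpy as np

A = sum(np.eye(9, k=k) * w for k, w in
        [(-2, 0.1), (-1, 0.2), (0, 0.4), (1, 0.2), (2, 0.1)])

lam, U = np.linalg.eigh(A)        # A is symmetric: real eigenvalues,
print(np.round(lam, 3))           # orthonormal eigenvector columns of U
print(np.round(1.0 / lam, 1))     # 1/lambda: how much each component's
                                  # statistical noise is amplified in A^{-1} g
```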

2.1 Finding Eigenvalues and Eigenvectors

You were taught to find eigenvalues by finding the N solutions of the Nth-order secular equation. The easy way is to use Jacobi Rotation:

1. Find the off-diagonal element of largest magnitude. Say it's A_{ij}.
2. Calculate z = (A_{jj} - A_{ii}) / (2 A_{ij}).
3. If z is positive take \theta = \tan^{-1}(z - \sqrt{z^2 + 1}); if negative take \theta = \tan^{-1}(z + \sqrt{z^2 + 1}).
4. Rotate the matrix by the angle \theta in the ij plane: postmultiply by a matrix R which is the unit matrix except that R_{ii} = R_{jj} = \cos\theta and -R_{ij} = R_{ji} = \sin\theta, and premultiply by its transpose. The rotated element A'_{ij} is zero. (A little thought can save a lot of arithmetic here, as only the elements of rows and columns i and j are affected.)
5. Repeat until the largest off-diagonal element is negligible.

When this has converged (and it will) the matrix has the eigenvalues down the trace. If you keep a running product of the rotation matrices, you have the eigenvectors. Of course, there are lots of excellent library routines that will do this for you; a hand-rolled sketch follows.
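A minimal implementation of the five-step recipe above (Python; in real analyses use a library routine, exactly as the text advises):

```python
import numpy as np

def jacobi_eig(A, tol=1e-12):
    """Eigen-decomposition of a symmetric matrix by Jacobi rotation."""
    A = np.array(A, dtype=float)
    n = len(A)
    V = np.eye(n)                          # running product of rotations
    for _ in range(50 * n * n):            # plenty of rotations to converge
        # step 1: off-diagonal element of largest magnitude, A[i, j]
        off = np.abs(A) - np.diag(np.diag(np.abs(A)))
        i, j = np.unravel_index(off.argmax(), off.shape)
        if off[i, j] < tol:
            break                          # step 5: converged
        # steps 2 and 3: the angle that zeroes A[i, j]
        z = (A[j, j] - A[i, i]) / (2.0 * A[i, j])
        t = z - np.hypot(z, 1.0) if z > 0 else z + np.hypot(z, 1.0)  # tan(theta)
        c = 1.0 / np.hypot(t, 1.0)         # cos(theta)
        s = t * c                          # sin(theta)
        # step 4: rotate in the (i, j) plane, A <- R^T A R
        R = np.eye(n)
        R[i, i] = R[j, j] = c
        R[i, j], R[j, i] = -s, s
        A = R.T @ A @ R
        V = V @ R
    return np.diag(A), V                   # eigenvalues, eigenvector columns

A9 = sum(np.eye(9, k=k) * w for k, w in
         [(-2, 0.1), (-1, 0.2), (0, 0.4), (1, 0.2), (2, 0.1)])
vals, vecs = jacobi_eig(A9)
print(np.round(np.sort(vals), 3))          # agrees with np.linalg.eigh(A9)
```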

2.2 Tikhonov regularization

Our requirement that the distribution be smooth can be translated into a requirement that the total squared derivative (of some order) be small. For equal bin widths, the first derivative gives

S_1 = \sum_{i=1}^{N-1} (f_i - f_{i+1})^2

the second

S_2 = \sum_{i=2}^{N-1} (2 f_i - f_{i-1} - f_{i+1})^2

and the third

S_3 = \sum_{i=2}^{N-2} (-f_{i-1} + 3 f_i - 3 f_{i+1} + f_{i+2})^2

S_2 is often chosen: distributions with large first derivatives do occur and are not objectionable. It can be written as

S_2 = \tilde{f}^T C^T C \tilde{f}    (6)

with the second-difference matrix

C = \begin{pmatrix}
-1 &  1 &  0 &  0 & \cdots \\
 1 & -2 &  1 &  0 & \cdots \\
 0 &  1 & -2 &  1 & \cdots \\
\vdots & & \ddots & \ddots & \\
\cdots & 0 & 0 & 1 & -1
\end{pmatrix}

We are trying to estimate \tilde{f} by minimising

(A\tilde{f} - \tilde{g})^T (A\tilde{f} - \tilde{g}) + \lambda \tilde{f}^T C^T C \tilde{f}    (7)

and the result will depend on \lambda. However, you don't have to repeat the minimisation for every \lambda. Suppose we work with \tilde{f}' = C\tilde{f} and A' = AC^{-1}; the function becomes

(A'\tilde{f}' - \tilde{g})^T (A'\tilde{f}' - \tilde{g}) + \lambda \tilde{f}'^T \tilde{f}'

Now diagonalise the A' matrix with some matrix R, so that A'' = R A' R^{-1} is a diagonal matrix with the eigenvalues \lambda_i down the diagonal. With \tilde{g}' = R\tilde{g} and \tilde{f}'' = R\tilde{f}', the \chi^2 term becomes (A''\tilde{f}'' - \tilde{g}')^T (A''\tilde{f}'' - \tilde{g}'), which (being diagonal) is just \sum_i (\lambda_i f''_i - g'_i)^2. Because this diagonalisation is just a rotation, the lengths of \tilde{f}' and \tilde{f}'' are the same, so the quantity to be minimised is just

\sum_i (\lambda_i f''_i - g'_i)^2 + \lambda \sum_i f''^2_i

giving solutions

f''_i = \lambda_i g'_i / (\lambda_i^2 + \lambda)    (8)

You can see from this how the low-eigenvalue components are suppressed by \lambda, like a low-pass filter; \lambda = 0 gives the standard components g'_i / \lambda_i. The desired f_i are obtained from the f''_i by reversing the rotation and applying C^{-1}.

Höcker and Kartvelishvili (Ref [1]) have extended and simplified the method and generally made it more user-friendly. They consider the possibility of having input and output distributions with different numbers of bins (i.e. A non-square), and they have a neat method that automatically incorporates the errors in A due to Monte Carlo statistics. They suggest that the g''_i (the components of the eigenvectors in the data) be examined, and \lambda chosen so as to suppress the `small' ones, where `small' means `not statistically different from zero'. (In their notation these are the d_i and \tau.) The reference contains several plots showing how the apparatus works, including an example of a low-eigenvalue solution. Other approaches and checks are also possible.

2.3 Blobel

Blobel's classic unfolding method (Ref [2]) has been used for many years, particularly in two-photon physics, though most people treat it as a black box. Some of the technology is such as to make it unnecessarily complicated (for example, he fits using B-splines to ensure smoothness). The package is available and widely used.

2.4 Guru

Don't try this at home, folks! If you want to do this seriously, you are strongly advised to use a package, written by folk who've taken care of all the difficulties I've told you about, and a whole lot more that I haven't.
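For intuition (not for serious use, per section 2.4), here is an end-to-end sketch of the section 2 machinery. Rather than the rotation trick of equation (8), it solves the quadratic minimisation of equations (6) and (7) directly through its normal equations, which gives the same minimum; the Poisson fluctuations and the values of \lambda are invented:

```python
import numpy as np

A = sum(np.eye(9, k=k) * w for k, w in
        [(-2, 0.1), (-1, 0.2), (0, 0.4), (1, 0.2), (2, 0.1)])

# Second-difference matrix C of section 2.2 (first differences at the ends).
C = np.eye(9, k=-1) - 2.0 * np.eye(9) + np.eye(9, k=1)
C[0, 0] = C[-1, -1] = -1.0

rng = np.random.default_rng(7)
f_true = np.array([5, 20, 50, 100, 140, 100, 50, 20, 5], dtype=float)
g_meas = rng.poisson(A @ f_true).astype(float)    # smear, then fluctuate

for lam in (0.0, 0.1, 10.0):
    # minimiser of (Af - g)^T (Af - g) + lam * f^T C^T C f
    f_hat = np.linalg.solve(A.T @ A + lam * (C.T @ C), A.T @ g_meas)
    print(lam, np.round(f_hat, 1))
# lam = 0 is raw matrix inversion; moderate lam tames the oscillations.
```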

3. Maximum Entropy methods

3.1 Information

We take quite a long step backwards. Suppose a system is giving you information. It comes in the form of characters, and to keep things simple the probability P of any particular character E is independent of what came previously. (It is a discrete memoryless source.) Some characters are more common than others; these are not very interesting, as they do not contain much information. The scarcer, low-probability characters are more informative, but occur less often.

Example 3: A fire alarm sends either a 0 or a 1 at regular intervals. `0' indicates no fire; `1' indicates there is a fire. The character `1' is low-probability and informative.

We would like to define a quantity `information' such that the information of low-probability events is higher, and such that the total information of two separate messages is the sum of the information carried by each separately:

I(E_1 + E_2) = I(E_1) + I(E_2)

This is fulfilled by

I(E) = -\ln P(E)    (9)

This is the Shannon Information of an event: minus the log of the probability. Low-probability events carry more information.

3.2 Mean information and Entropy

If you have a whole set of possible characters E_1 \ldots E_N, the mean information carried by an event is

\langle I \rangle = S = \sum_{i=1}^{N} -P_i \ln P_i    (10)

hence the mean information is also known as the Entropy.

Now suppose there is a system: a bunch of particles which can each be in particular states. If I measure the state a particle is in, that gives me information. If I set up the system precisely (say, put them all in the 1st excited state) and then measure it straight away, this yields no information, because I know what state every particle is in: the mean information is 0 because P is 0 for all the states except the first excited state, for which \ln P is 0. If I then let the system interact with itself for a bit, particles will gain and lose energy; I will lose my omniscience, and a measurement will convey information. It seems fair to assume that this process will continue as far as it can: the mean -\sum_i P_i \ln P_i increases to a maximum, subject to the constraints that the total probability is 1 and that the mean energy is fixed. The system will tend to a configuration in which the average result of a measurement contains the maximum possible information. You can actually derive the whole of Statistical Mechanics from this starting point: you maximise the Entropy.
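Equations (9) and (10) in code (a minimal sketch; the one-in-a-thousand fire probability is invented):

```python
import numpy as np

def entropy(P):
    """Mean information <I> = S = -sum_i P_i ln P_i, in nats (equation 10)."""
    P = np.asarray(P, dtype=float)
    P = P[P > 0]                        # -P ln P -> 0 as P -> 0
    return -np.sum(P * np.log(P))

p = 1e-3                                # assumed probability of a '1' (fire)
print(-np.log(p))                       # a '1' carries ~6.9 nats (equation 9)
print(-np.log(1 - p))                   # a '0' carries ~0.001 nats
print(entropy([p, 1 - p]))              # mean information per character
print(entropy(np.full(8, 0.125)))       # equal probabilities maximise S: ln 8
```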

3.3 MaxEnt

The `Maximum Entropy' principle uses the entropy as a regularisation term. The probability of an event being in bin i is f_i/N, where N is the total number of entries in the histogram, so we minimise (following Eq. 3 and Eq. 10)

- \ln L + \lambda \sum_i (f_i/N) \ln(f_i/N)    (11)

where L is still the \chi^2 or other likelihood measure expressing the agreement between the observed data and the prediction from the unfolded \tilde{f}.

Effectively this is a term which prefers all the bin contents to be the same: it tends to flatten everything. But the preference is (for reasonable \lambda) not too strong, and the occasional blip can be tolerated. It does not contain any prescription about adjacent bins, so it is more suitable than Tikhonov regularisation in instances where the true distribution really does have sharp gradients. Typical examples occur in astronomy (where bright stars stand out from black backgrounds) and in image processing (everyday objects have sharp edges). It also generalises trivially to distributions with more than one dimension.

3.4 The Bayesian Motivation

In this language, Bayes' theorem runs

P(\tilde{f} | \tilde{g}) \propto P(\tilde{g} | \tilde{f}) \, P(\tilde{f})

where P(\tilde{g} | \tilde{f}) is the same resolution-folded matrix term as earlier, and P(\tilde{f}) is our prior knowledge about the ideal distribution. If we assume complete ignorance, then we suppose that N entries are going to be placed in this histogram completely randomly among the M bins. The probability of a particular histogram distribution \tilde{f} is thus proportional to the number of ways that this set of values could have been produced by these entries, i.e. the number of permutations of this particular distribution:

N! / (f_1! \, f_2! \cdots f_M!)

Taking the logarithm and using Stirling's approximation gives

- \sum_i f_i \ln(f_i / N)

This actually gives the probability for any distribution \tilde{f}, which is more than we can handle; we settle for the distribution with the maximum probability. Minimise

- \ln L + \sum_i f_i \ln(f_i / N)    (12)

This is in accord with Eq. 11, with \lambda prescribed as N. (Perhaps insisting that the prior is uniform is a bit strong.) In practice you use the Bayesian motivation but keep \lambda free.
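A sketch of the MaxEnt fit of equation (11) with a general-purpose minimiser (Python with scipy; \lambda = 200, the toy inputs, and the positivity floor are all invented, and a real implementation would enforce the normalisation carefully, as section 3.6 discusses):

```python
import numpy as np
from scipy.optimize import minimize

A = sum(np.eye(9, k=k) * w for k, w in
        [(-2, 0.1), (-1, 0.2), (0, 0.4), (1, 0.2), (2, 0.1)])
rng = np.random.default_rng(7)
f_true = np.array([5, 20, 50, 100, 140, 100, 50, 20, 5], dtype=float)
g_meas = rng.poisson(A @ f_true).astype(float)
N = g_meas.sum()

def objective(f, lam=200.0):
    chi2 = np.sum((A @ f - g_meas) ** 2)          # the -ln L term, as a chi^2
    neg_S = np.sum((f / N) * np.log(f / N))       # entropy term of eq. (11)
    return chi2 + lam * neg_S

f0 = np.full(9, N / 9.0)                # start at the maximum-entropy point
res = minimize(objective, f0, method="L-BFGS-B",
               bounds=[(1e-6, None)] * 9)         # MaxEnt needs f_i > 0
print(np.round(res.x, 1))
```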

3.5 Cross Entropy

The entropy can be extended to include cases where there is prior knowledge of the distribution, either on theoretical grounds or because the bin widths are different. Equation (10) becomes

S = -\sum_i P_i \ln(P_i / Q_i)    (13)

where Q_i is the prior knowledge about bin i. (There is also an author-dependent minus sign, and a factor of M which appears in the denominator with some authors but not others.) This is also known as the Shannon-Jaynes Entropy, or the Kullback number. The normalisation of the Q_i really doesn't matter, as it just gives an additive constant.

3.6 Use in practice

The entropy is unamenable to differentiation, so unlike the Matrix Methods a direct solution can't be found. One normally works iteratively, starting at the maximum-entropy point (i.e. all bins having equal contents) and then proceeding to minimise (or maximise) the sum of both terms. This presents the usual problems of multidimensional fitting.

For MaxEnt the normalisation generally has to be strictly enforced, because any increase in the numbers drives the entropy term up; this introduces a bias, which does matter, and you have to take pains to work specifically in the subspace that gives the right total. For Tikhonov regularisation this is not generally necessary: the `solution' may correspond to a different number of observed events from those you actually observed, but this effect is small and not a problem.

On the plus side, the MaxEnt term ensures that the bin contents are all non-negative. Tikhonov regularisation can give negative bin contents, though the values are usually small and can be discarded.

MaxEnt has been used in reconstructing photon energies and directions from the OPAL calorimeter, and there are no doubt many other potential applications.

Final thought

"You can get it wrong, and still you think it's all right" (Lennon and McCartney, quoted by Blobel in Ref [2]).

References

[1] A. Höcker and V. Kartvelishvili, SVD approach to data unfolding, Nucl. Instr. & Meth. A372 (1996) 469.
[2] V. Blobel, Unfolding methods in HEP experiments, DESY 84-118 (1984).
[3] ALEPH Collaboration (R. Barate et al.), Measurement of the Spectral Functions of Vector Current Hadronic Tau Decays, CERN-PPE/97-013 (1997).
[4] Glen Cowan, Statistical Data Analysis (Chapter 11), OUP 1998.
