
# Probability, Risk, and (Anti)fragility

N. N. Taleb

## 1 Risk is Not in The Past (the Turkey Problem)

### Introduction: Fragility, not Statistics

Fragility (Chapter 2) can be defined as an accelerating sensitivity to a harmful stressor: this response plots as a concave curve and mathematically culminates in more harm than benefit from the disorder cluster: (i) uncertainty, (ii) variability, (iii) imperfect, incomplete knowledge, (iv) chance, (v) chaos, (vi) volatility, (vii) disorder, (viii) entropy, (ix) time, (x) the unknown, (xi) randomness, (xii) turmoil, (xiii) stressor, (xiv) error, (xv) dispersion of outcomes, (xvi) unknowledge. Antifragility is the opposite, producing a convex response that leads to more benefit than harm.

We do not need to know the history and statistics of an item to measure its fragility or antifragility, or to be able to predict rare and random ("black swan") events. All we need is to be able to assess whether the item is accelerating towards harm or benefit. The relation of fragility, convexity and sensitivity to disorder is thus mathematical, and not derived from empirical data. This explains in a way why the detection of fragility is vastly more potent than that of risk, and much easier.

The problem with risk management is that "past" time series can be (and actually are) unreliable. The risk of breaking of the coffee cup is not necessarily in the past time series of the variable; in fact surviving objects have to have had a "rosy" past. Some finance journalist (Bloomberg) was commenting on my statement in Antifragile about our chronic inability to get the risk of a variable from the past with economic time series. "Where is he going to get the risk from, since we cannot get it from the past? From the future?", he wrote. Not really, you finance-imbecile: from the present, the present state of the system. But this is not just a problem with journalism: naive inference from time series is incompatible with rigorous statistical inference, yet workers with time series believe that it is statistical inference.

### Turkey Problems

**Definition:** Take, as of time $T$, a standard sequence $X = \{X_{t_0 + i\Delta t}\}_{i=0}^{N}$, with $T = t_0 + N\Delta t$, as the discretely monitored history of the process $X_t$. The estimator $M_T^X(A, f)$ is defined over the interval $(t_0, T]$.
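The turkey problem can be illustrated with a minimal sketch (a hypothetical toy process, with made-up numbers, not from the text): a series that gains steadily until a single hidden crash. Any risk estimator built only from the pre-crash past sees a perfectly rosy history.

```python
def turkey_path(days=1001, crash_day=1000):
    """Daily P&L of a toy 'turkey' process: steady small gains,
    one large hidden loss on crash_day (hypothetical numbers)."""
    return [-100.0 if t == crash_day else 1.0 for t in range(days)]

path = turkey_path()

# Naive risk estimate "as of" the day before the crash:
worst_seen_before = min(path[:-1])   # worst daily outcome in the observed past
worst_overall = min(path)            # what the full history reveals
```

Every past-based estimator (worst loss, variance, value-at-risk) computed on `path[:-1]` signals safety; the one observation that matters is precisely the one not yet in the sample.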

**a) Standard Estimator.**

$$M_T^X(A, f) \equiv \frac{\sum_{i=0}^{N} 1_A\, f\!\left(X_{t_0 + i\Delta t}\right)}{\sum_{i=0}^{N} 1_A}$$

where $1_A : X \to \{0,1\}$ is an indicator function taking the value 1 if $X \in A$ and 0 otherwise, $A$ is defined on the domain of the process $X$, and $f$ is a function of $X$. For instance $f(X)=1$, $f(X)=X$, and $f(X)=X^N$ correspond to the probability, the first moment, and the $N^{\text{th}}$ moment, respectively, and so on for standard measures from $X$ such as moments of order $z$. For the plain estimator, $A = (-\infty, \infty)$ and $f(x)=x$.

**b) Standard Risk Estimator.** The shortfall $S = E[X \mid X < K]$ is estimated by $M_T^X(A, f)$ with $A = (-\infty, K]$ and $f(x)=x$, with a threshold $K$. The measure might be useful for the knowledge of a process, but remains insufficient for decision making, as the decision-maker may be concerned for risk-management purposes with the left tail (for distributions that are not entirely skewed, such as purely loss functions: damage from earthquakes, terrorism, etc.).

**Criterion:** The measures $M$ or $S$ are considered to be an estimator over the interval $[t, t + i\Delta t]$ if and only if they hold in expectation over the period, that is, $\left|\,E\left[M_t^X(A, f)\right] - M_{t + i\Delta t}^X(A, f)\,\right| < \xi$, with a threshold $\xi$: the estimator should have some stability, for it not to be considered random; it estimates the "true" value of the variable across counterfactuals of the process. This is standard sampling theory; it is at the core of statistics.

Let us rephrase: standard statistical theory doesn't allow claims on estimators made in a given set unless these can "generalize", that is, reproduce out of sample, into the part of the series that has not taken place (or has not been seen), that is, for time series, beyond $T$. The so-called "p values" you find in studies have no meaning with economic and financial variables. The results of most papers in economics based on these standard statistical methods, the kind of stuff people learn in statistics class, are thus not expected to replicate, and they effectively don't. The implication is that those tools used in economics that are based on squaring variables (more technically, the Euclidian, or L-2, norm), such as standard deviation, variance, correlation, regression, or value-at-risk, the kind of stuff you find in textbooks, are not valid scientifically (except in some rare cases where the variable is bounded); further, these tools invite foolish risk taking. Neither do alternative techniques yield reliable measures of rare events, except that we can tell if a remote event is underpriced, without assigning an exact value. Even the more sophisticated techniques of stochastic calculus used in mathematical finance do not work in economics except in selected pockets.

The problem is not just that the data had "fat tails", something people knew but sort of wanted to forget; it was that we would never be able to determine "how fat" the tails were. With economic variables, one single observation in 10,000, that is, one single day in 40 years, can explain the bulk of the "kurtosis", a measure of "fat tails", that is, both a measure of how much the distribution under consideration departs from the standard Gaussian and of the role of remote events in determining the total properties. For the U.S. stock market, a single day, the crash of 1987, determined 80% of the kurtosis. The same problem is found with interest and exchange rates, commodities, and other variables.

### An Application: Test of Turkey Problem on Macroeconomic data

**Performance of Standard Parametric Risk Estimators, $f(x) = x^n$ (Norm $\ell^2$).**

From Taleb (2009), using log returns, $X_t \equiv \log\frac{P(t)}{P(t - i\Delta t)}$. Take the measure

$$M_t^X\left((-\infty, \infty),\, X^4\right) \equiv \frac{1}{N}\sum_{i=0}^{N} \left(X_{t - i\Delta t}\right)^4$$

of the fourth noncentral moment, and the $N$-sample maximum quartic observation $\operatorname{Max}\left\{\left(X_{t - i\Delta t}\right)^4\right\}_{i=0}^{N}$. $Q(N)$ is the contribution of the maximum quartic variation:

$$Q(N) \equiv \frac{\operatorname{Max}\left\{\left(X_{t - i\Delta t}\right)^4\right\}_{i=0}^{N}}{\sum_{i=0}^{N} \left(X_{t - i\Delta t}\right)^4}$$
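The statistic $Q(N)$ is easy to compute. A sketch on simulated data (the macroeconomic series themselves are not reproduced here; the Student T with 3 degrees of freedom is a stand-in for a fat-tailed return series): under fat tails a single observation carries a large share of the fourth moment.

```python
import math
import random

random.seed(42)

def student_t3():
    """Draw from a Student T with 3 degrees of freedom
    (Gaussian over the sqrt of an independent chi-square / 3)."""
    z = random.gauss(0.0, 1.0)
    chi2 = sum(random.gauss(0.0, 1.0) ** 2 for _ in range(3))
    return z / math.sqrt(chi2 / 3.0)

def max_quartic_contribution(xs):
    """Q(N): share of the summed fourth powers owed to the single largest term."""
    quartics = [x ** 4 for x in xs]
    return max(quartics) / sum(quartics)

n = 10_000
q_thin = max_quartic_contribution([random.gauss(0.0, 1.0) for _ in range(n)])
q_fat = max_quartic_contribution([student_t3() for _ in range(n)])
```

For the Gaussian sample `q_thin` sits near the .008 benchmark cited below; for the fat-tailed sample one observation in ten thousand dominates the kurtosis, which is the turkey problem in miniature.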

Naively, the fourth moment expresses the stability of the second moment, and the higher the variations, the higher the kurtosis. For a Gaussian (i.e., from the distribution of the square of a Chi-square distributed variable), the maximum contribution $Q(N)$ should be around .008 ± .0028.

Description of dataset: the entire dataset.

| VARIABLE | Q (Max Quartic Contr.) | N (years) |
|---|---|---|
| Silver | 0.94 | 46 |
| SP500 | 0.79 | 56 |
| CrudeOil | 0.79 | 26 |
| Short Sterling | 0.75 | 17 |
| Heating Oil | 0.74 | 31 |
| Nikkei | 0.72 | 23 |
| FTSE | 0.54 | 25 |
| JGB | 0.48 | 24 |
| Eurodollar Depo 1M | 0.31 | 19 |
| Sugar #11 | 0.30 | 48 |
| Yen | 0.27 | 38 |
| Bovespa | 0.27 | 16 |
| Eurodollar Depo 3M | 0.25 | 28 |
| CT | 0.25 | 48 |
| DAX | 0.20 | 18 |

**Performance of Standard NonParametric Risk Estimators, $f(x) = x$ or $|x|$ (Norm $\ell^1$).** Does the past resemble the future?

*Fig 1: Comparing $M[t-1, t]$ and $M[t, t+1]$, where $t$ = 1 year (252 days), for macroeconomic data using regular deviations, $A = (-\infty, -2\text{ standard deviations (equivalent)}]$, $f(x) = x$ (replication of data from The Fourth Quadrant, Taleb, 2009).*

*Fig 2: The results are a lot worse for large deviations, $A = (-\infty, -4\text{ standard deviations (equivalent)}]$, $f(x) = x$; note the concentration of tail events without predecessors and of tail events without successors.*

*Fig 3: The "regular" is predictive of the regular, that is, mean deviation; comparing $M[t]$ and $M[t + 1\text{ year}]$ for macroeconomic data using regular deviations, $f(x) = |x|$.*

**Typical Manifestations of The Turkey Surprise**
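The figures' point, that the tail estimator $M[t]$ barely predicts $M[t+1]$ while the "regular" mean deviation does, can be sketched in a small simulation (a toy stand-in using Student T(3) returns; the $-2$ cutoff and the sample sizes are arbitrary choices, not from the text):

```python
import math
import random

random.seed(1)

def student_t3():
    # toy fat-tailed "return" generator (Student T, 3 degrees of freedom)
    z = random.gauss(0.0, 1.0)
    chi2 = sum(random.gauss(0.0, 1.0) ** 2 for _ in range(3))
    return z / math.sqrt(chi2 / 3.0)

def mean_abs_deviation(xs):
    return sum(abs(x) for x in xs) / len(xs)

def left_tail_mean(xs, cut=-2.0):
    tail = [x for x in xs if x < cut]
    return sum(tail) / len(tail) if tail else 0.0

def past_future_gap(estimator, reps=100, n=300):
    """Average |M[t] - M[t+1]| over independent past/future samples."""
    gaps = []
    for _ in range(reps):
        past = [student_t3() for _ in range(n)]
        future = [student_t3() for _ in range(n)]
        gaps.append(abs(estimator(past) - estimator(future)))
    return sum(gaps) / reps

gap_regular = past_future_gap(mean_abs_deviation)
gap_tail = past_future_gap(left_tail_mean)
```

The past-versus-future disagreement of the tail estimator is several times that of the mean deviation: the regular predicts the regular, the tail does not predict the tail.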

*Fig x: The Turkey Problem (The Black Swan, 2007/2010).*

When the generating process is powerlaw with low exponent, plenty of confusion can take place. For instance, take Pinker (2011), claiming that the generating process has a tail exponent ~1.15 and drawing quantitative conclusions from it. The next figures show the realizations of two subsamples of the exact same process, one before, and the other after, the turkey problem, illustrating the inability of a set to deliver true probabilities.

*Fig x: First 100 years (Sample Path): A Monte Carlo generated realization of a process of the "80/20 or 80/02 style", that is, tail exponent α = 1.1.*

*Fig x: The Turkey Surprise: now 200 years, seen with a longer window and at a different scale; the second 100 years dwarf the first; these are realizations of the exact same process.*

## Summary and Conclusion: The Problem With The Use of Statistics in Social Science

Many social scientists do not have a clear idea of the difference between science and journalism, or of the one between rigorous empiricism and anecdotal statements. Science is not about making claims about a sample, but using a sample to make general claims and discuss properties that apply outside the sample. Take $M^*$, the estimator we saw above, built from the realizations (a sample path) of some process, and $M$, the "true" mean that would emanate from knowledge of the generating process for such a variable. When someone says "Crime rate in NYC dropped between 2000 and 2010", the claim is about $M^*$, the observed mean, not $M$, the true mean; hence the claim can be deemed merely journalistic, not scientific. No scientific and causal statement should be made from $M^*$ on "why violence has dropped" unless one establishes a link to $M$, the true mean: $M^*$ cannot be deemed "evidence" by itself, and working with $M^*$ cannot be called "empiricism". What I just wrote is at the foundation of statistics (and, it looks like, science). Now Pinker is excusable: from his statements, Pinker seems to be aware that $M^*$ may have dropped (which is a straight equality), and sort of, perhaps, that we might not be able to make claims on $M$, which might not have really been dropping; and journalists are there to report "facts", not theories. But the practice is widespread in social science, where academics use mechanistic techniques of statistics without understanding the properties of the statistical claims; much of econometrics and of risk-management methods does not meet this simple point, nor the rigor required by orthodox, basic statistical theory. Bayesians disagree on how $M^*$ converges to $M$, m[$M^*$], but never on this point.

### The Problem of Past Time Series

The four aspects of what we will call the nonreplicability issue, particularly for measures that are in the tails:

a. **Mathematical argument about statistical decidability.** No probability without metaprobability: metadistributions matter more with tail events, and with fat-tailed distributions; hence, propose a metadistribution with compact support.

b. **Statistical argument on the limit of knowledge of tail events.** Problems of replicability are acute for tail events; tail events are impossible to price owing to the limitations from the size of the sample.

c. **Economic arguments:** the Friedman-Phelps and Lucas critiques, and Goodhart's law. Acting on statistical information (a metric, a response) changes the statistical properties of some processes.

d. **Statistical rigor (or the Pinker Problem).** The idea that an estimator is not about fitness to past data, but related to how it can capture future realizations of a process, seems absent from the discourse, and in some areas not involving time series as well.

Two further problems:

† The hard problem (Taleb and Pilpel, 2001, 2009): we need to specify an a priori probability distribution from which we depend.

† The soft problem: we accept the probability distribution, but the imprecision in the calibration (or parameter errors) percolates in the tails.

† Both problems are bridged in that a nested stochastization of standard deviation (or of the scale of the parameters) for a Gaussian turns a thin-tailed distribution into a powerlaw (and a stochastization that includes the mean turns it into a jump-diffusion or mixed-Poisson).

So I rapidly jot down a few rules before showing proofs and derivations (limiting $M$ to the arithmetic mean). Where E is the expectation operator under the "real-world" probability measure P:

- **Tails Sampling Property:** $E[\,|M^* - M|\,]$ increases with fat-tailedness (the mean deviation of $M^*$ seen from the realizations in different samples of the same process). Naively, rare events have little data, hence whatever estimator we may have is noisier; for thin-tailed distributions the difference between $M^*$ and $M$ is negligible, while fat tails tend to mask the distributional properties.
- **Counterfactual Property:** Another way to view the previous point: the distance between different values of $M^*$ one gets from repeated sampling of the process (say, counterfactual history) increases with fat tails.
- **Survivorship Bias (Casanova) Property:** $E[M^* - M]$ increases under the presence of an absorbing barrier for the process.
- **Left Tail Sample Insufficiency:** $E[M^* - M]$ increases with the negative skewness of the true underlying variable.
- **Asymmetry in Inference:** Under both negative skewness and fat tails, negative deviations from the mean are more informational than positive deviations.
- **Power of Extreme Deviations (N=1 is OK):** Under fat tails, large deviations from the mean are vastly more informational than small ones; they are not "anecdotal". (The last two properties correspond to the black swan problem.)
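A sketch of the Counterfactual Property: the spread of $M^*$ across resampled (counterfactual) histories grows with fat-tailedness. Here two hypothetical processes share the same true mean $M = 3$, one exponential (thin-tailed), one Pareto with tail exponent 1.5 (fat-tailed); the sample sizes are arbitrary choices:

```python
import random

random.seed(3)

def mstar_spread(draw, reps=300, n=1_000):
    """Mean absolute spread of the sample mean M* across counterfactual samples."""
    means = [sum(draw() for _ in range(n)) / n for _ in range(reps)]
    grand = sum(means) / len(means)
    return sum(abs(m - grand) for m in means) / len(means)

# Both processes have the same true mean M = 3.
thin_spread = mstar_spread(lambda: random.expovariate(1.0 / 3.0))  # thin tails
fat_spread = mstar_spread(lambda: random.paretovariate(1.5))       # tail exponent 1.5
```

With identical true means, the Pareto sample means scatter far more across counterfactual histories: $M^*$ tells us much less about $M$ under fat tails.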

## 2 Preasymptotics and The Central Limit in the Real World

**An Erroneous Notion of Limit.** Take the conventional formulation of the Central Limit Theorem (Grimmett & Stirzaker, 1982; Feller, 1971, Vol. II): Let $X_1, X_2, \dots$ be a sequence of independent identically distributed random variables with mean $m$ and variance $\sigma^2$ satisfying $m < \infty$ and $0 < \sigma^2 < \infty$. Then

$$\frac{\sum_{i=1}^{N} X_i - N m}{\sigma\sqrt{N}} \;\xrightarrow{\;D\;}\; N(0, 1) \quad \text{as } N \to \infty$$

where $\xrightarrow{D}$ denotes convergence "in distribution". Granted, convergence "in distribution" is about the weakest form of convergence. Effectively we are dealing with a double problem. The first, as uncovered by Jaynes, corresponds to the abuses of measure theory: some properties that hold at infinity might not hold in all limiting processes, a manifestation of the classical problem of uniform and pointwise convergence. Jaynes 2003 (p. 44): "The danger is that the present measure theory notation presupposes the infinite limit already accomplished, but contains no symbol indicating which limiting process was used (...) Any attempt to go directly to the limit can result in nonsense." Granted, Jaynes is still too Platonic (he falls headlong for the Gaussian by mixing thermodynamics and information), but we accord with him on this point, along with the definition of probability as information incompleteness, about which later. The second problem is that we do not have a "clean" limiting process: the process is itself idealized.

**The Problem of Convergence.** The CLT does not fill in uniformly, but in a Gaussian way; indeed, disturbingly so. Simply, whatever your distribution (assuming one mode), your sample is going to be skewed to deliver more central observations and fewer tail events. The consequence is that, under aggregation, the sum of these variables will converge "much" faster in the body of the distribution than in the tails. As N, the number of observations, increases, the Gaussian zone should cover more ground, but not the "tails". This quick note shows the intuition of the convergence and presents the difference between distributions.

Assume 0 mean for simplicity (and symmetry, absence of skewness, to simplify). Take the sum of random independent variables $X_i$ with finite variance under distribution $\varphi(X)$. A more useful formulation of the Central Limit Theorem (Kolmogorov et al.):

$$P\left[-u \le Z = \frac{\sum_{i=0}^{n} X_i}{s\sqrt{n}} \le u\right] = \frac{1}{\sqrt{2\pi}}\int_{-u}^{u} e^{-\frac{Z^2}{2}}\, dZ$$

Now how should we look at the Central Limit Theorem? Let us see how we arrive at it assuming "independence".
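The body-versus-tails convergence can be checked exactly for a sum of exponentials, whose $n$-fold sum has a closed-form (Erlang) tail. At $n = 30$ the CLT approximation is nearly perfect at 1 standard deviation but off by an order of magnitude at 4 (a sketch; the choice of $n$ and of the exponential are illustrative):

```python
import math

def erlang_sf(x, n):
    """Exact P(S > x) where S is a sum of n iid Exponential(1) variables."""
    return math.exp(-x) * sum(x ** k / math.factorial(k) for k in range(n))

def gauss_sf(z):
    """P(Z > z) for a standard Gaussian."""
    return 0.5 * math.erfc(z / math.sqrt(2.0))

n = 30
# a standardized deviation z corresponds to the raw sum value n + z*sqrt(n)
body_ratio = erlang_sf(n + 1 * math.sqrt(n), n) / gauss_sf(1)
tail_ratio = erlang_sf(n + 4 * math.sqrt(n), n) / gauss_sf(4)
```

`body_ratio` is close to 1 (the Gaussian zone already covers the center), while `tail_ratio` is large: the "tunnel" of convergence excludes the tails.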

So, inside the "tunnel" $[-u, u]$, the distribution of the $n$-summed variable is Gaussian,

$$\frac{1}{\sqrt{2\pi}}\int_{-u}^{u} e^{-\frac{Z^2}{2}}\, dZ \quad \text{for } -u \le z \le u,$$

and outside the tunnel it is governed by the pre-asymptotic distribution,

$$\int_{-\infty}^{-u} \varphi'[n](Z)\, dZ + \int_{u}^{\infty} \varphi'[n](Z)\, dZ,$$

where $\varphi'[n]$ is the $n$-summed distribution of $\varphi$. How $\varphi'[n]$ behaves is a bit interesting here: it is distribution dependent. I break it into two situations:

1) **Case 1:** The distribution $\varphi(x)$ is not scale free, i.e., for $x$ large, $\frac{\varphi(n x)}{\varphi(x)}$ depends on $x$; in other words the distribution has an exponential tail, $\sim e^{-k x}$. It has all the moments.

2) **Case 2:** The distribution $\varphi(x)$ is scale free, i.e., for $x$ large, $\frac{\varphi(n x)}{\varphi(x)}$ depends on $n$, not $x$ (so that $\frac{\varphi(n x)}{\varphi(2 n x)}$ is independent of $x$): a power-law tail. Fat tails imply that the higher moments implode, not just the 4th.

**Width of the Tunnel $[-u, u]$.** Clearly we do not have a sharp "tunnel", but rather a statistical area for crossover points, determined by the odds of falling inside the tunnel itself.

**Dealing With the Distribution of the Summed distribution $\varphi$.** Assume the very simple case of a mixed distribution, where $X$ follows a Gaussian $(m_1, \sigma_1)$ with probability $p$ and, with probability $(1-p)$, another Gaussian $(m_2, \sigma_2)$. Where $(1-p)$ is very small, $m_2$ very large and $\sigma_2$ small, we can be dealing with a jump (at the limit it becomes a Poisson), leading to a huge, but unlikely, jump. Alternatively, I can take means of 0 and the variance in the small-probability case to be very large, a route I am taking here for simplification of the calculations.

**3) Using Log Cumulants and Observing Convergence to the Gaussian.** Take $\varphi(z)$ the characteristic function, and $\varphi^N$ the one under $N$-convolutions. The normalized cumulant of order $n$ is

$$C(n, N) = (-i)^n \,\frac{\partial^n \log\left(\varphi^N\right)}{\left(-\partial^2 \log\left(\varphi^N\right)\right)^{n-1}}\Bigg|_{z \to 0},$$

that is, the $n$th derivative of the log of the characteristic function $\varphi$, which we convolute $N$ times, divided by the appropriate power of the second cumulant (i.e., the second moment). Since $C(N+M) = C(N) + C(M)$, the additivity of the log characteristic function under convolution makes it easy to see the speed of the convergence to the Gaussian.

**Table of Normalized Cumulants: Speed of Convergence** (each cumulant normalized as above).

| | Normal$(\mu,\sigma)$ | Poisson$(\lambda)$ | Exponential$(\lambda)$ | $\Gamma(a,b)$ |
|---|---|---|---|---|
| PDF | $\frac{e^{-\frac{(x-\mu)^2}{2\sigma^2}}}{\sqrt{2\pi}\,\sigma}$ | $\frac{e^{-\lambda}\lambda^x}{x!}$ | $\lambda\, e^{-x\lambda}$ | $\frac{b^{-a} e^{-\frac{x}{b}} x^{a-1}}{\Gamma(a)}$ |
| $N$-convoluted log characteristic | $N \log e^{i z \mu - \frac{z^2\sigma^2}{2}}$ | $N \log e^{\left(e^{i z}-1\right)\lambda}$ | $N \log \frac{\lambda}{\lambda - i z}$ | $N \log (1 - i b z)^{-a}$ |
| 2nd cumulant | 1 | 1 | 1 | 1 |
| 3rd | 0 | $\frac{1}{N\lambda}$ | $\frac{2\lambda}{N}$ | $\frac{2}{a b N}$ |
| 4th | 0 | $\frac{1}{N^2\lambda^2}$ | $\frac{3!\,\lambda^2}{N^2}$ | $\frac{3!}{a^2 b^2 N^2}$ |
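The cumulant additivity can be observed directly: for sums of $N$ Exponential(1) variables the standard excess kurtosis should decay like $6/N$ (a quick Monte Carlo check; the sample sizes are arbitrary):

```python
import random

random.seed(9)

def excess_kurtosis(xs):
    n = len(xs)
    m = sum(xs) / n
    m2 = sum((x - m) ** 2 for x in xs) / n
    m4 = sum((x - m) ** 4 for x in xs) / n
    return m4 / m2 ** 2 - 3.0

def kurtosis_of_sum(N, reps=15_000):
    """Empirical excess kurtosis of a sum of N iid Exponential(1) variables."""
    sums = [sum(random.expovariate(1.0) for _ in range(N)) for _ in range(reps)]
    return excess_kurtosis(sums)

k1, k10, k100 = kurtosis_of_sum(1), kurtosis_of_sum(10), kurtosis_of_sum(100)
# theory: 6/N, i.e. about 6, 0.6, 0.06
```

The decay toward Gaussian kurtosis is fast for the exponential (an "all moments" Case 1 distribution); for a power-law (Case 2) summand no such decay occurs, since the relevant cumulants do not exist.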

Continuing the table for higher orders:

| Order | Normal | Poisson$(\lambda)$ | Exponential$(\lambda)$ | $\Gamma(a,b)$ |
|---|---|---|---|---|
| 5th | 0 | $\frac{1}{N^3\lambda^3}$ | $\frac{4!\,\lambda^3}{N^3}$ | $\frac{4!}{a^3 b^3 N^3}$ |
| 6th | 0 | $\frac{1}{N^4\lambda^4}$ | $\frac{5!\,\lambda^4}{N^4}$ | $\frac{5!}{a^4 b^4 N^4}$ |
| 7th | 0 | $\frac{1}{N^5\lambda^5}$ | $\frac{6!\,\lambda^5}{N^5}$ | $\frac{6!}{a^5 b^5 N^5}$ |
| 8th | 0 | $\frac{1}{N^6\lambda^6}$ | $\frac{7!\,\lambda^6}{N^6}$ | $\frac{7!}{a^6 b^6 N^6}$ |
| 9th | 0 | $\frac{1}{N^7\lambda^7}$ | $\frac{8!\,\lambda^7}{N^7}$ | $\frac{8!}{a^7 b^7 N^7}$ |
| 10th | 0 | $\frac{1}{N^8\lambda^8}$ | $\frac{9!\,\lambda^8}{N^8}$ | $\frac{9!}{a^8 b^8 N^8}$ |

And for fat-tailed distributions:

| | Mixed Gaussians (Stoch Vol) | StudentT(3) | StudentT(4) |
|---|---|---|---|
| PDF | $p\,\frac{e^{-\frac{x^2}{2\sigma_1^2}}}{\sqrt{2\pi}\,\sigma_1} + (1-p)\,\frac{e^{-\frac{x^2}{2\sigma_2^2}}}{\sqrt{2\pi}\,\sigma_2}$ | $\frac{6\sqrt{3}}{\pi\left(x^2+3\right)^2}$ | $12\left(\frac{1}{x^2+4}\right)^{5/2}$ |
| $N$-convoluted log characteristic | $N\log\!\left(p\, e^{-\frac{z^2\sigma_1^2}{2}} + (1-p)\, e^{-\frac{z^2\sigma_2^2}{2}}\right)$ | $N\left(\log\!\left(\sqrt{3}\,|z|+1\right) - \sqrt{3}\,|z|\right)$ | $N \log\!\left(2\,|z|^2 K_2(2\,|z|)\right)$ |
| 2nd cumulant | 1 | 1 | 1 |
| 3rd | 0 | Ind | 0 |
| 4th | $\frac{3(1-p)\,p\left(\sigma_1^2-\sigma_2^2\right)^2}{N^2\left(p\,\sigma_1^2+(1-p)\,\sigma_2^2\right)^3}$ | Ind | Ind |
| 5th | 0 | Ind | Ind |
| 6th | $\frac{15(1-p)\,p\,(1-2p)\left(\sigma_1^2-\sigma_2^2\right)^3}{N^4\left(p\,\sigma_1^2+(1-p)\,\sigma_2^2\right)^5}$ | Ind | Ind |

**Note: On "Infinite Kurtosis".** The entries marked "Ind" are indeterminate: the corresponding normalized cumulants of the Student T do not exist, so the speed-of-convergence analysis does not apply there.

**Discussion: Note on Chebyshev's Inequality and the upper bound on deviations under finite variance.** Even finite variance does not mean much. Consider Chebyshev's inequality:

$$P[X > a] \le \frac{\sigma^2}{a^2}, \qquad P[X > n\sigma] \le \frac{1}{n^2},$$

which effectively accommodates power laws but still puts an upper bound on the probability of large deviations, and a significant one.

**The Effect of Finiteness of Variance.** This table shows the probability of exceeding a certain number of $\sigma$ for the Gaussian, and the lower limit on that probability for any distribution with finite variance:

| Deviation $n$ | Gaussian (1 in ...) | Chebyshev upper bound (1 in ...) |
|---|---|---|
| 3 | $7 \times 10^{2}$ | 9 |
| 4 | $3 \times 10^{4}$ | 16 |
| 5 | $3 \times 10^{6}$ | 25 |
| 6 | $1 \times 10^{9}$ | 36 |
| 7 | $8 \times 10^{11}$ | 49 |
| 8 | $2 \times 10^{15}$ | 64 |
| 9 | $9 \times 10^{18}$ | 81 |
| 10 | $1 \times 10^{23}$ | 100 |
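The table's two columns can be recomputed directly, the Gaussian tail via the complementary error function and the Chebyshev bound via $1/n^2$:

```python
import math

def gaussian_exceed(n):
    """P(X > n sigma) for a Gaussian."""
    return 0.5 * math.erfc(n / math.sqrt(2.0))

# each row: (deviation n, "1 in ..." under the Gaussian, Chebyshev's "1 in n^2")
rows = [(n, 1.0 / gaussian_exceed(n), n ** 2) for n in range(3, 11)]
```

The gap between the columns is the whole story: finite variance alone only guarantees the right-hand column, which at 10 sigma is twenty orders of magnitude looser than the Gaussian.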

## Extreme Value Theory: Fuhgetaboudit

Extreme Value Theory has been considered a panacea for dealing with extreme events by some "risk modelers". On paper it looks great. But only on paper. The problem is the calibration and parameter uncertainty: in the real world we don't know the parameters, and the ranges in the probabilities we generate are monstrous. This is a short presentation of the idea, followed by an exposition of the difficulty.

### What is Extreme Value Theory? A Simplified Exposition

**Case 1, Thin Tailed Distribution.** Let us proceed with a simple example, the extremum of a Gaussian variable. Say we generate $N$ Gaussian variables $\{Z_i\}_{i=1}^{N}$ with mean 0 and unitary standard deviation, and take the highest value we find, the upper bound $E_j$ for the $N$-size sample run $j$:

$$E_j = \operatorname{Max}\left\{Z_{i,j}\right\}_{i=1}^{N}$$

Assume we do so $M$ times, to get $M$ samples of maxima for the set

$$E = \left\{\operatorname{Max}\left\{Z_{i,j}\right\}_{i=1}^{N}\right\}_{j=1}^{M}$$

Here $N = 30{,}000$ and $M = 10{,}000$. The next figure plots a histogram of the result.

*Figure 1: Taking M samples of Gaussian maxima.*

We get: Mean of the maxima = 4.11159, Median = 4.07344, Standard Deviation = 0.286938. Let us now fit to the sample an Extreme Value Distribution (Gumbel) with location and scale parameters $a$ and $b$, respectively:

$$f(x;\, a, b) = \frac{1}{b}\; e^{\frac{a-x}{b} - e^{\frac{a-x}{b}}}$$
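A scaled-down version of the experiment ($N = 1{,}000$ and $M = 500$ rather than the text's 30,000 and 10,000, to keep the run cheap) shows the same behavior: the maxima cluster tightly, Gumbel-style.

```python
import random
import statistics

random.seed(12)

N, M = 1_000, 500   # the text uses N = 30,000 and M = 10,000; scaled down here
maxima = [max(random.gauss(0.0, 1.0) for _ in range(N)) for _ in range(M)]

mean_max = statistics.mean(maxima)
median_max = statistics.median(maxima)
sd_max = statistics.pstdev(maxima)
```

The standard deviation of the maxima is a small fraction of their mean, which is what makes the thin-tailed case look so tame.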

*Figure 2: Fitting an extreme value distribution (Gumbel), $a = 3.97904$, $b = 0.235239$.*

So far, beautiful. The next two graphs show the fit more closely.

**Case 2, Fat-Tailed Distribution.** Now let us generate, exactly as before, but change the distribution: $N$ random powerlaw distributed variables $Z_i$, generated from a Student T Distribution with $m = 3$ degrees of freedom. Again, we take the upper bound. This time it is not the Gumbel, but the Fréchet distribution, that would fit the result:

$$f(x;\, \alpha, \beta) = \frac{\alpha}{\beta}\left(\frac{x}{\beta}\right)^{-1-\alpha} e^{-\left(\frac{x}{\beta}\right)^{-\alpha}}, \quad \text{for } x > 0$$

*Figure 3: Fitting a Fréchet distribution to the Student T generated with m = 3 degrees of freedom. The Fréchet distribution with $\alpha = 3$, $\beta = 32$ fits up to higher values of E.*

*Figure 4: Seen more closely.*

*Figure 5: Q-Q plot: the fit holds up to extremely high values of E, but the points tend to fall to the right, the rest of course owing to sample insufficiency for extremely large values, a bias that typically causes the underestimation of tails.*

### How Extreme Value Has a Severe Inverse Problem In the Real World

In the previous case we started with the distribution, with the assumed parameters, and then got the corresponding values, as these "risk modelers" do. In the real world, we don't quite know the calibration, the $\alpha$ of the distribution, assuming (generously) that we know the distribution itself. So here we go with the inverse problem. The next table illustrates the different calibrations of $P_K$, the probabilities that the maximum exceeds a certain value $K$ (expressed as a multiple of $\beta$), under different values of $K$ and $\alpha$.

| $\alpha$ | $\frac{1}{P_{>3\beta}}$ | $\frac{1}{P_{>10\beta}}$ | $\frac{1}{P_{>20\beta}}$ |
|---|---|---|---|
| 1. | 3.52773 | 10.5083 | 20.5042 |
| 1.25 | 4.46931 | 18.2875 | 42.7968 |
| 1.5 | 5.71218 | 32.1254 | 89.9437 |
| 1.75 | 7.3507 | 56.7356 | 189.649 |
| 2. | 9.50926 | 100.501 | 400.5 |
| 2.25 | 12.3517 | 178.328 | 846.5 |
| 2.5 | 16.0938 | 316.728 | 1789.35 |
| 2.75 | 21.0196 | 562.841 | 3783.47 |
| 3. | 27.5029 | 1000.5 | 8000.47 |
| 3.25 | 36.0364 | 1778.78 | 16918.8 |
| 3.5 | 47.2672 | 3162.78 | 35777.6 |
| 3.75 | 62.0479 | 5623.91 | 75659.8 |
| 4. | 81.501 | 10000.5 | 160000. |
| 4.25 | 107.103 | 17783.3 | 338359. |
| 4.5 | 140.796 | 31623.3 | 715542. |
| 4.75 | 185.141 | 56234.6 | $1.51319 \times 10^{6}$ |
| 5. | 243.5 | 100001. | $3.2 \times 10^{6}$ |

Consider that the error in estimating the $\alpha$ of a distribution is quite large, often > 1/2, and typically overestimated. So we can see that we get the probabilities mixed up by more than an order of magnitude: the imprecision in the computation of the $\alpha$ compounds in the evaluation of the probabilities of extreme values.
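The table's entries follow from the Fréchet CDF $e^{-(x/\beta)^{-\alpha}}$, so $P(\text{max} > K\beta) = 1 - e^{-K^{-\alpha}}$, and an error in $\alpha$ compounds geometrically through $K^{\alpha}$. A sketch:

```python
import math

def one_in(K, alpha):
    """1 / P(max > K * beta) under a Frechet with tail exponent alpha."""
    return 1.0 / (1.0 - math.exp(-K ** -alpha))

# a +/- 0.5 error around alpha = 3, read at K = 10:
p_low, p_mid, p_high = one_in(10, 2.5), one_in(10, 3.0), one_in(10, 3.5)
```

A half-point error in $\alpha$, well within realistic estimation error, moves the "1 in 1000" event to roughly "1 in 300" or "1 in 3000": an order of magnitude of disagreement from one parameter.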

## 3 On the Difference Between Binaries and Vanillas

This explains how and where prediction markets (or, more generally, discussions of betting matters) do not correspond to reality, and have little to do with exposures to fat tails and "Black Swan" effects. This discussion is based on Taleb (1997), showing the difference between a binary and a vanilla option, and shows how conflation of the two takes place: prediction markets, the ludic fallacy (using the world of games to apply to real life), and so on.

### Definitions

- **A binary bet** (or just "a binary", or "a digital"): an outcome with payoff 0 or 1 (or yes/no, -1/1); any statistic based on a YES/NO switch. Examples: a "prediction market", an "election", most games and "lottery tickets". Binaries are effectively bets on probability, never the pair probability × payoff; more technically, they are mapped by the Heaviside function. They are rarely ecological, except for political predictions.
- **An exposure, or "vanilla"**: an outcome with no open limit: say "revenues", "market crash", "casualties from war", "success", "growth", "inflation", "epidemics", etc. Exposures are generally "expectations", or the arithmetic mean, never bets on probability; rather, the pair probability × payoff.
- **A bounded exposure**: an exposure (vanilla) with an upper and lower bound: say an insurance policy with a cap, or a lottery ticket. When the boundary is close, it approaches a binary bet in properties; when the boundary is remote (and unknown), it can be treated like a pure exposure. The idea of "clipping tails" of exposures transforms them into such a category.

### The Problem

The properties of binaries diverge from those of vanilla exposures: they respond differently to fat-tailedness (sometimes in opposite directions), and they have diametrically opposite responses to skewness. Fat tails make binaries more tractable. These are elementary facts, but with implications. Some direct applications:

1. Studies of "long shot biases" that typically apply to binaries should not port to vanillas; the "long shot bias" is misapplied in real-life variables.
2. Why prediction markets provide very limited information outside specific domains.
3. Why political predictions are more robust than economic ones.
4. Why many are surprised that I find many econometricians total charlatans, while finding Nate Silver to be immune to my problem.
5. The rise in complexity lowers the value of the binary and increases that of the exposure.

### The Elementary Betting Mistake

One can hold beliefs that a variable can go lower yet bet that it is going higher. Simply, the digital and the vanilla diverge: one can believe that $P(X > X_0) > \frac{1}{2}$ and yet that $E(X) < E(X_0)$. This is normal in the presence of skewness, and extremely common with economic variables. Philosophers have a related problem called the lottery paradox, which in statistical terms is not a paradox.

### The Elementary Fat Tails Mistake

A slightly more difficult problem: when I ask economists or social scientists, "what happens to the probability of a deviation > 1σ when you fatten the tail (while preserving other properties)?", almost all answer: it increases (so far all have made the mistake). Wrong. They miss the idea that fat tails is the contribution of the extreme events to the total properties, and that it is the pair probability × payoff that matters, not just probability.
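The claim can be checked directly with the variance-preserving fattening used below: turn a unit-variance Gaussian into a half/half mix of Gaussians with variances $(1-a)$ and $(1+a)$, and the probability of staying inside $\pm 1\sigma$ goes up, not down, while the kurtosis rises by the factor $(1+a^2)$.

```python
import math

def p_inside_one_sigma(a):
    """P(|X| < sigma) when X is, half/half, a Gaussian of variance (1 - a)
    or (1 + a): a variance-preserving fattening with overall sigma = 1."""
    def sf(s):  # P(X > 1) for a centered Gaussian of standard deviation s
        return 0.5 * math.erfc(1.0 / (s * math.sqrt(2.0)))
    return 1.0 - sf(math.sqrt(1.0 - a)) - sf(math.sqrt(1.0 + a))

def kurtosis(a):
    return 3.0 * (1.0 + a ** 2)   # E[x^4] / E[x^2]^2 for this mixture

p_gaussian = p_inside_one_sigma(0.0)   # about 0.6827
p_fattened = p_inside_one_sigma(0.8)
```

Fattening the tails concentrates more, not less, of the mass in the center: the deviations become rarer and bigger.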

"The Gaussian distribution spends 68.2% of the time between ±1 standard deviation. The real world has fat tails. In finance, how much time do stocks spend between ±1 standard deviations?" I've asked variants of the same question. The answer has been invariably "lower". Why? "Because there are more deviations." Sorry: there are fewer deviations. Stocks spend between 78% and 98% of the time between ±1 standard deviations (computed from past samples).

Some simple derivations. Let $x$ follow a Gaussian distribution $(m, s)$; assume $m = 0$ for the exercise. What is the probability of exceeding one standard deviation?

$$P_{>1\sigma} = 1 - \tfrac{1}{2}\operatorname{erfc}\!\left(-\tfrac{1}{\sqrt{2}}\right),$$

where erfc is the complementary error function, so that $P_{>1\sigma} = P_{<-1\sigma} \approx 15.86\%$, and the probability of staying within the "stability tunnel" between ±1σ is about 68.2%.

Let us fatten the tail, using a standard method of linear combination of two Gaussians with two standard deviations separated by $s\sqrt{1+a}$ and $s\sqrt{1-a}$, where $a$ is the "vvol" (this mixture is variance preserving; a technicality of no big effect here, as a standard deviation-preserving spreading gives the same qualitative result). Such a method leads to an immediate raising of the kurtosis by a factor of $\left(1 + a^2\right)$, since $\frac{E\left(x^4\right)}{E\left(x^2\right)^2} = 3\left(a^2 + 1\right)$. Then:

$$P_{-1\sigma < X < 1\sigma} = 1 - \tfrac{1}{2}\operatorname{erfc}\!\left(\frac{1}{\sqrt{2}\sqrt{1-a}}\right) - \tfrac{1}{2}\operatorname{erfc}\!\left(\frac{1}{\sqrt{2}\sqrt{1+a}}\right)$$

So then, for different values of $a$, as we can see in the figure, the probability of staying inside 1 sigma rises: a higher peak means a lower probability of anything leaving the ±1σ tunnel.

### The Event Timing Mistake

Fatter tails increase the time spent between deviations, giving the illusion of absence of volatility when in fact events are delayed and made worse (my critique of the "Great Moderation").

**Stopping Time and Fattening of the Tails of a Brownian Motion.** Consider the distribution of the time it takes for a continuously monitored Brownian motion $S$ to exit from a "tunnel" with a lower bound $L$ and an upper bound $H$. Let $\psi$ be the distribution of the exit time $\tau$, where $\tau \equiv \inf\{t : S \notin [L, H]\}$. Counterintuitively, fatter tails make an exit (at some sigma) take longer: you are likely to spend more time inside the tunnel, since the exits are far more dramatic. From Taleb (1997) we have the following approximation (for $m = 0$):

$$\psi(\tau \mid \sigma) = \frac{\pi \sigma^2}{\log\!\left(\frac{H}{L}\right)^{2}} \sum_{n=1}^{\infty} n\, e^{-\frac{n^2 \pi^2 \sigma^2 \tau}{2 \log\left(\frac{H}{L}\right)^{2}}} \left[\sin\!\left(\frac{n \pi \log\frac{S}{L}}{\log\frac{H}{L}}\right) - \sin\!\left(\frac{n \pi \log\frac{S}{H}}{\log\frac{H}{L}}\right)\right]$$

and the fatter-tailed distribution comes from mixing Brownians with $\sigma$ separated by the coefficient $a$:

$$\psi(\tau \mid \sigma, a) = \tfrac{1}{2}\,\psi\!\left(\tau \mid \sigma\sqrt{1-a}\right) + \tfrac{1}{2}\,\psi\!\left(\tau \mid \sigma\sqrt{1+a}\right)$$

*Figure: Fatter and fatter tails: different values of $a$.*

*Figure: Probability of exit as a function of time, and expected exit time: this graph shows the lengthening of the stopping time between events coming from fatter tails.*

**More Complicated: MetaProbabilities (TK)** {This is a note: a more advanced discussion explains why a more uncertain mean (vanilla) might mean a less uncertain probability (prediction). Also see "Why We Don't Know What We Are Talking About When We Talk About Probability", etc.}
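The lengthening of exit times can be checked with a crude discrete simulation (the step size, tunnel bounds, and path counts are arbitrary choices, not from the text): mixing a low and a high volatility regime, variance preserved, lengthens the mean exit time, since $E\left[1/\sigma^2\right] \ge 1/E\left[\sigma^2\right]$.

```python
import math
import random

random.seed(5)

def exit_steps(sigma_for_path, L=-1.0, H=1.0, dt=0.01, cap=200_000):
    """Steps until a driftless walk started at 0 leaves the tunnel (L, H);
    the volatility regime is drawn once and held for the whole path."""
    s = sigma_for_path()
    x, t = 0.0, 0
    while L < x < H and t < cap:
        x += random.gauss(0.0, s * math.sqrt(dt))
        t += 1
    return t

def mean_exit(sigma_for_path, paths=300):
    return sum(exit_steps(sigma_for_path) for _ in range(paths)) / paths

a = 0.8
steps_thin = mean_exit(lambda: 1.0)  # constant sigma = 1
steps_fat = mean_exit(
    lambda: math.sqrt(1 - a) if random.random() < 0.5 else math.sqrt(1 + a)
)
```

With the same overall variance, the mixed-regime walk stays inside the tunnel far longer on average: quiet stretches lengthen, and the eventual exits are the dramatic ones.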