
Zipf–Mandelbrot law

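In probability theory and statistics, the Zipf–Mandelbrot law is a discrete probability distribution.
Also known as the Pareto–Zipf law, it is a power-law distribution on ranked data, named after the
linguist George Kingsley Zipf, who suggested a simpler distribution called Zipf's law, and the
mathematician Benoit Mandelbrot, who subsequently generalized it.

Zipf–Mandelbrot
Parameters: $N \in \{1, 2, 3, \ldots\}$ (integer), $q \in [0, \infty)$ (real), $s > 0$ (real)
Support: $k \in \{1, 2, \ldots, N\}$
PMF: $\dfrac{1/(k+q)^s}{H_{N,q,s}}$
CDF: $\dfrac{H_{k,q,s}}{H_{N,q,s}}$
Mean: $\dfrac{H_{N,q,s-1}}{H_{N,q,s}} - q$
Mode: $1$
Entropy: $\dfrac{s}{H_{N,q,s}} \sum_{k=1}^{N} \dfrac{\ln(k+q)}{(k+q)^s} + \ln(H_{N,q,s})$

The probability mass function is given by:

$$f(k; N, q, s) = \frac{1/(k+q)^s}{H_{N,q,s}}$$

where $H_{N,q,s}$ is given by:

$$H_{N,q,s} = \sum_{i=1}^{N} \frac{1}{(i+q)^s}$$

which may be thought of as a generalization of a harmonic number. In the formula, $k$ is the rank of the
data, and $q$ and $s$ are parameters of the distribution. In the limit as $N$ approaches infinity, this
becomes the Hurwitz zeta function $\zeta(s, q+1)$ (the sum above starts at $i = 1$), which converges for
$s > 1$. For finite $N$ and $q = 0$ the Zipf–Mandelbrot law becomes Zipf's law. For infinite $N$ and
$q = 0$ it becomes a Zeta distribution.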
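
As an illustration of the definition, the following is a minimal Python sketch of the probability mass function, taking the normalizing constant to be the sum of 1/(i + q)^s over the ranks i = 1, ..., N. The function names and the parameter values (N = 10, q = 2.7, s = 1.2) are arbitrary choices made for this example and do not come from any cited source.

```python
def generalized_harmonic(N, q, s):
    """Normalizing constant: sum of 1 / (i + q)**s for i = 1..N."""
    return sum(1.0 / (i + q) ** s for i in range(1, N + 1))


def zipf_mandelbrot_pmf(k, N, q, s):
    """Probability of rank k under the Zipf-Mandelbrot law (0 outside 1..N)."""
    if not 1 <= k <= N:
        return 0.0
    return (1.0 / (k + q) ** s) / generalized_harmonic(N, q, s)


# Illustrative parameters (assumed for this sketch only).
N, q, s = 10, 2.7, 1.2
probs = [zipf_mandelbrot_pmf(k, N, q, s) for k in range(1, N + 1)]

# The probabilities sum to 1 (up to rounding) and decrease with rank.
print(sum(probs))
print(probs[:3])

# Setting q = 0 gives Zipf's law: p(k) is proportional to 1 / k**s.
zipf_probs = [zipf_mandelbrot_pmf(k, 3, 0.0, 1.0) for k in range(1, 4)]
print(zipf_probs)
```

In the q = 0 case with N = 3 and s = 1, the normalizing constant is 1 + 1/2 + 1/3 = 11/6, so the three probabilities are 6/11, 3/11 and 2/11.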

Applications
The distribution of words ranked by their frequency in a random text corpus is approximated by a power-
law distribution, known as Zipf's law.

If one plots the frequency rank of the words in a moderately sized text corpus against their number of
occurrences (or actual frequencies), one obtains a power-law distribution with an exponent close to
one (but see Powers, 1998 and Gelbukh & Sidorov, 2001). Zipf's law implicitly assumes a fixed
vocabulary size: the harmonic series with s = 1 does not converge, so the law cannot be normalized over
an unbounded vocabulary, whereas the Zipf–Mandelbrot generalization with s > 1 does converge.
Furthermore, there is evidence that the closed class of functional words that define a language obeys a
Zipf–Mandelbrot distribution with different parameters from the open classes of contentive words that
vary by topic, field and register.[1]
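
The convergence point can be checked numerically. The sketch below compares partial sums of 1/(i + q)^s for s = 1 and for s = 2 with q = 0. The function name and the chosen values of N and s are assumptions made for illustration. For any finite N the normalizing sum is finite, so the difference only matters when the vocabulary is allowed to grow without bound.

```python
import math


def partial_sum(N, q, s):
    """Partial sum of 1 / (i + q)**s for i = 1..N."""
    return sum(1.0 / (i + q) ** s for i in range(1, N + 1))


for N in (10**2, 10**4, 10**6):
    # s = 1: keeps growing, by roughly ln(100) for every factor of 100 in N.
    # s = 2: settles towards pi**2 / 6.
    print(N, partial_sum(N, 0.0, 1.0), partial_sum(N, 0.0, 2.0))

print("ln(100)   =", math.log(100))
print("pi**2 / 6 =", math.pi ** 2 / 6)
```

With s = 1 the sum grows without limit as N increases, so no normalizing constant exists for an infinite vocabulary. With s = 2 the sum approaches a finite value and the distribution remains well defined.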

In ecological field studies, the relative abundance distribution (i.e. the graph of the number of species
observed as a function of their abundance) is often found to conform to a Zipf–Mandelbrot law.[2]

In music, many metrics used to measure "pleasing" music conform to Zipf–Mandelbrot distributions.[3]

Notes
1. Powers, David M W (1998). "Applications and explanations of Zipf's law". New methods in
language processing and computational natural language learning. Joint conference on new
methods in language processing and computational natural language learning. Association
for Computational Linguistics. pp. 151–160.
2. Mouillot, D; Lepretre, A (2000). "Introduction of relative abundance distribution (RAD)
indices, estimated from the rank-frequency diagrams (RFD), to assess changes in
community diversity" (http://cat.inist.fr/?aModele=afficheN&cpsidt=1411186). Environmental
Monitoring and Assessment. Springer. 63 (2): 279–295. doi:10.1023/A:1006297211561
(https://doi.org/10.1023%2FA%3A1006297211561). S2CID 102285701
(https://api.semanticscholar.org/CorpusID:102285701). Retrieved 24 Dec 2008.
3. Manaris, B; Vaughan, D; Wagner, CS; Romero, J; Davis, RB. "Evolutionary Music and the
Zipf–Mandelbrot Law: Developing Fitness Functions for Pleasant Music"
(https://archive.today/wQYN). Proceedings of 1st European Workshop on Evolutionary Music and Art
(EvoMUSART2003). 611.

References
Mandelbrot, Benoît (1965). "Information Theory and Psycholinguistics". In B.B. Wolman and
E. Nagel (eds.). Scientific psychology. Basic Books. Reprinted as
Mandelbrot, Benoît (1968) [1965]. "Information Theory and Psycholinguistics". In R.C.
Oldfield and J.C. Marshall (eds.). Language. Penguin Books.
Powers, David M W (1998). "Applications and explanations of Zipf's law". New methods in
language processing and computational natural language learning. Joint conference on new
methods in language processing and computational natural language learning. Association
for Computational Linguistics. pp. 151–160.
Zipf, George Kingsley (1932). Selected Studies of the Principle of Relative Frequency in
Language. Cambridge, MA: Harvard University Press.
Van Droogenbroeck, F.J. (2019). "An essential rephrasing of the Zipf–Mandelbrot law to solve
authorship attribution applications by Gaussian statistics" (https://www.academia.edu/40029629).

External links
Z. K. Silagadze: Citations and the Zipf–Mandelbrot's law (https://arxiv.org/abs/physics/9901035)
NIST: Zipf's law (https://xlinux.nist.gov/dads/HTML/zipfslaw.html)
W. Li's References on Zipf's law
(https://web.archive.org/web/20060428014625/http://www.nslij-genetics.org/wli/zipf/index.html)
Gelbukh & Sidorov, 2001: Zipf and Heaps Laws’ Coefficients Depend on Language
(http://www.gelbukh.com/CV/Publications/2001/CICLing-2001-Zipf.htm)
C++ Library for generating random Zipf–Mandelbrot deviates (https://github.com/gkohri/discreteRNG)

Retrieved from "https://en.wikipedia.org/w/index.php?title=Zipf–Mandelbrot_law&oldid=1121958884"
