
Estimation is the determination of an approximate value of an unknown characteristic or parameter of a distribution (of the general population), or of another component of a mathematical model of a real (economic, technical, etc.) phenomenon or process, based on the results of observations. A briefer formulation: estimation is the determination of an approximate value of an unknown parameter of the general population from the results of observations. The parameter of the general population may be a number, a set of numbers (a vector), a function, a set, or another object of a non-numeric nature. For example, from observations distributed according to the binomial law a single number is estimated: the parameter p (the probability of success). From observations with a gamma distribution a set of three numbers is estimated: the shape parameter a, the scale parameter b and the shift parameter c. A method of estimating the distribution function is given by the theorems of V.I. Glivenko and A.N. Kolmogorov. Probability densities are also estimated, as are functions expressing dependencies between the variables included in probabilistic models of economic, managerial or technological processes, etc. The purpose of estimation may be to order investment projects by economic efficiency, or technical products (objects) by quality, to formulate rules for technical or medical diagnostics, etc. (In mathematical statistics an ordering is also called a ranking; it is one of the types of objects of a non-numeric nature.)

Estimation is carried out using statistical estimates, which serve as approximations of the unknown distribution parameter. In a number of sources the term "estimate" is used as a synonym for "estimation". It is impractical to use the same word for two different concepts: estimation is an action, while an estimate is a statistic (a function of the results of observations) used in the course of this action or resulting from it. In other words, a statistic is any random variable that is a function of the sample.

There are two types of estimation: point estimation and estimation using a confidence region. Point estimation is a method of estimation in which the value of the estimate is taken as the value of the unknown distribution parameter.

Example 2. Let the results of observations x1, x2, ..., xn be considered in a probabilistic model as a random sample from a normal distribution N(m, σ). That is, it is assumed that the results of observations are modeled as realizations of n independent identically distributed random variables having a normal distribution function N(m, σ) with some mathematical expectation m and standard deviation σ unknown to the statistician. It is required to estimate the parameters m and σ (or σ²) from the results of observations. The estimates are denoted by m* and (σ²)*, respectively. Usually the sample arithmetic mean x̄ is used as the estimate m* of the mathematical expectation m, and the sample variance s² is used as the estimate (σ²)* of the variance σ², i.e.

m* = x̄, (σ²)* = s².

Other statistics can also be used to estimate the mathematical expectation m, for example, the sample median or the half-sum of the minimum and maximum terms of the variation series

m** = [x(1) + x(n)]/2,

among others. To estimate the variance σ² there are also a number of estimates, in particular the sample variance s² (see above) and an estimate based on the range R, which has the form

(σ²)** = [a(n)R]²,

where the coefficients a(n) are taken from special tables. These coefficients are chosen so that, for samples from the normal distribution,

M[a(n)R] = σ.
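The point estimates above can be sketched in a few lines of Python. The simulated sample and the true parameter values m = 10, σ = 2 are illustrative assumptions; only standard-library functions are used (the range-based estimate is omitted, since its coefficients a(n) come from special tables).

```python
import random
import statistics

# Simulated sample from N(m = 10, sigma = 2); in practice these
# would be the observed data x1, ..., xn.
random.seed(0)
sample = [random.gauss(10, 2) for _ in range(100)]

# Point estimate m* of the expectation m: the sample arithmetic mean.
m_star = statistics.fmean(sample)

# Point estimate (sigma^2)* of the variance: the sample variance s^2
# (the unbiased version, with divisor n - 1).
var_star = statistics.variance(sample)

# Alternative estimates of m: the sample median and the midrange,
# i.e. the half-sum of the minimum and maximum terms of the
# variation series, m** = [x(1) + x(n)] / 2.
m_median = statistics.median(sample)
m_midrange = (min(sample) + max(sample)) / 2
```

For a sample of this size all four statistics land close to the true values, but they differ in efficiency, which is exactly why a choice between estimation methods is needed.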

If several numerical parameters, or functions, orderings, etc., are being estimated at once, then one speaks of estimation using a confidence region. A confidence region is a region in the parameter space that includes the unknown value of the estimated distribution parameter with a given probability. The "given probability" is called the confidence probability and is usually denoted by γ. Let Θ be the parameter space. Consider a statistic Θ1 = Θ1(x1, x2, ..., xn), a function of the results of observations x1, x2, ..., xn whose values are subsets of the parameter space Θ. Since the results of observations are random variables, Θ1 is also a random variable whose values are subsets of the set Θ, i.e. Θ1 is a random set. Recall that a set is one of the types of objects of a non-numeric nature; random sets are studied in probability theory and in the statistics of objects of a non-numeric nature. Thus random vectors, random functions, random sets and random rankings (orderings) are distinct types of random variables. Θ1 is called a confidence region corresponding to the confidence probability γ if

P{θ0 ∈ Θ1} = γ,

where θ0 is the true value of the estimated parameter. A confidence interval is an interval that, with a given probability, covers the unknown value of the estimated distribution parameter. The boundaries of the confidence interval are called confidence limits. The confidence probability γ is the probability that the confidence interval covers the actual value of the parameter estimated from the sample data. Estimation using a confidence interval is a method of estimation in which the boundaries of the confidence interval are set with a given confidence probability. The presence of several methods for estimating the same parameters leads to the need to choose between these methods.
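A confidence interval for the expectation m of a normal sample can be sketched as follows. For simplicity σ is assumed known, so the normal quantile applies (with σ unknown, the Student t quantile would be used instead); the simulated sample and γ = 0.95 are illustrative assumptions.

```python
import math
import random
from statistics import NormalDist, fmean

# Simulated sample from N(m = 10, sigma = 2); sigma is treated as known.
random.seed(1)
sigma = 2.0
sample = [random.gauss(10, sigma) for _ in range(50)]

gamma = 0.95                                # confidence probability
z = NormalDist().inv_cdf((1 + gamma) / 2)   # two-sided normal quantile, about 1.96

# Interval [x_bar - z*sigma/sqrt(n), x_bar + z*sigma/sqrt(n)],
# which covers the true m with probability gamma.
m_hat = fmean(sample)
half_width = z * sigma / math.sqrt(len(sample))
lower, upper = m_hat - half_width, m_hat + half_width
```

The interval is random (its endpoints depend on the sample), while the parameter m is fixed; the statement "with probability γ" refers to the procedure, not to any single computed interval.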

1. Methods of parameter estimation

Various parametric models are used in applied statistics. The term "parametric" means that the probabilistic-statistical model is completely described by a finite-dimensional parameter vector of fixed dimension, and this dimension does not depend on the sample size.

Consider a sample x1, x2, ..., xn from a distribution with density f(x; θ0), where f(x; θ0) is an element of a parametric family of probability densities {f(x; θ), θ ∈ Θ}. Here Θ is a known k-dimensional parameter space, a subset of the Euclidean space Rk, and the specific value θ0 of the parameter is unknown to the statistician. Parametric families with k = 1, 2, 3 are usually used in applied statistics (see Chapter 1.2). In the statistics of non-numeric data, probabilities of hitting individual points are often considered instead of densities. Recall that in parametric estimation problems a probabilistic model is adopted according to which the results of observations x1, x2, ..., xn are considered as realizations of n independent random variables.

1.2 Maximum likelihood method


In works intended for an initial acquaintance with mathematical statistics, maximum likelihood estimates (abbreviated MLE) are usually considered:

θ* = arg max over θ ∈ Θ of f(x1; θ) f(x2; θ) ... f(xn; θ).

Thus, the probability density corresponding to the sample is first constructed. Since the sample elements are independent, this density is represented as a product of the densities of the individual sample elements, and the joint density is taken at the point corresponding to the observed values. This expression, regarded as a function of the parameter (for fixed sample elements), is called the likelihood function. Then, in one way or another, the value of the parameter at which the joint density attains its maximum is found. This is the maximum likelihood estimate.
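These steps can be illustrated for the normal family N(m, σ), where the maximizer of the likelihood has a closed form (the sample mean, and the variance with divisor n rather than n − 1). The simulated sample and the true values m = 5, σ = 3 are assumptions for illustration; the log-likelihood is used, since maximizing it is equivalent to maximizing the product of densities.

```python
import math
import random

# Simulated sample from N(m = 5, sigma = 3).
random.seed(2)
sample = [random.gauss(5, 3) for _ in range(200)]
n = len(sample)

def log_likelihood(m, sigma2, xs):
    """Log of the joint density of the sample, as a function of (m, sigma^2)."""
    return (-0.5 * len(xs) * math.log(2 * math.pi * sigma2)
            - sum((x - m) ** 2 for x in xs) / (2 * sigma2))

# Closed-form maximum likelihood estimates for the normal family:
# the sample mean, and the mean of squared deviations (divisor n).
m_mle = sum(sample) / n
var_mle = sum((x - m_mle) ** 2 for x in sample) / n
```

By construction the likelihood at (m_mle, var_mle) is no smaller than at any other parameter point, including the true values; for families without a closed-form maximizer, the same log-likelihood would be maximized numerically.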
