Statistics comprises useful data interpretation tools like mean, median, mode, standard
deviation, coefficient of variation, and sample tests. Raw financial data in a numerical format is
interpreted using mathematical formulas. Many sectors like science, government, manufacturing,
population studies, psychology, banking, and financial markets rely on statistical data.
Statistics Explained
Statistics is the systematic processing and interpretation of raw data to reach a conclusive
result. The results are reported in a numerical format and presented succinctly, so that a reader
can understand them at a glance.
Financial data is in a numerical format and includes details about portfolios, investments, and
assets. Historical data and present data are interpreted using mathematical formulas. Forecasts
are based on available information and requirements.
Application of Statistics
Statistics is indispensable for decision-making in various sectors and
verticals. It is applied in marketing, e-commerce, banking, finance,
human resources, production, and information technology. In addition,
this mathematical discipline has been a prominent part of research
and is widely used in data mining, medicine, aerospace, robotics,
psychology, and machine learning.
One can understand the importance of Statistics in business from the following:
(i) Marketing - Statistical analysis is frequently used to provide information for making
decisions in the field of marketing. It is necessary first to find out what can be sold and then to
evolve a suitable strategy so that the goods reach the ultimate consumer. Any attempt to
establish a new market should rest on a skilful analysis of data on production, purchasing power,
manpower, habits of consumers, and transportation cost.
(ii) Production - In the field of production, statistical data and methods play a very important role.
The decisions about what to produce, how to produce, when to produce, and for whom to produce
are based largely on statistical analysis.
(iii) Finance - Financial organizations depend very heavily on statistical analysis of facts and
figures to discharge their finance function effectively.
(iv) Banking - Banking institutions have found it increasingly necessary to establish research
departments within their organizations for the purpose of gathering and analysing information, not
only regarding their own business but also regarding the general economic situation and every
segment of business in which they may have an interest.
(v) Investment - Statistics greatly assists investors in making clear and valid judgments when
selecting securities which are safe and have the best prospects of yielding a good income.
(vi) Purchase - The purchasing department, in discharging its function, makes use of statistical
data to frame suitable purchase policies, such as what to buy, what quantity to buy, when to
buy, where to buy, and from whom to buy.
(vii) Accounting - Statistical data are also employed in accounting. Particularly in the auditing
function, the techniques of sampling and estimation are frequently used.
(viii) Control - The management control process combines statistical and accounting methods in
making the overall budget for the coming year, including sales, materials, labour and other costs,
net profits, and capital requirements.
Marketing
As per Philip Kotler and Gary Armstrong, marketing “identifies customer needs and wants,
determines which target markets the organisation can serve best, and designs appropriate
products, services and programs to serve these markets”.
Marketing is all about creating and growing customers profitably, and statistics is used in
almost every aspect of it. Statistics is used extensively in making decisions regarding how to
sell products to customers. Also, intelligent use of statistics helps managers to design
marketing campaigns targeted at potential customers.
Marketing research is the systematic and objective gathering, recording and analysis of data
about aspects related to marketing. IMRB International, TNS India, RNB Research,
Nielsen, Hansa Research and Ipsos Indica Research are some of the popular market research
companies in India. Web analytics is about tracking the online behaviour of potential
customers and studying how visitors browse various websites.
The use of statistics is indispensable in forecasting sales, market share and demand for various
types of industrial products.
Factor analysis, conjoint analysis and multidimensional scaling are invaluable tools, based on
statistical concepts, for designing products and services based on customer response.
Finance
Uncertainty is the hallmark of the financial world. All financial decisions are
based on “expectation”, which is best analysed with the help of the theory of
probability and statistical techniques. Probability and statistics are used
extensively in designing new insurance policies and in fixing premiums for
them. Statistical tools and techniques are used for analysing and quantifying
risk, for valuing derivative instruments, and for comparing return on
investment in two or more instruments or companies.
The beta of a stock is a statistical measure of its volatility relative to the
market, and is highly useful for selecting a portfolio of stocks.
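As a rough illustration of how such a volatility comparison can be computed, the sketch below estimates beta as the sample covariance of stock and market returns divided by the sample variance of market returns. The function name `beta` and the monthly return figures are hypothetical and chosen only for illustration.

```python
from statistics import mean


def beta(stock_returns, market_returns):
    """Estimate beta as Cov(stock, market) / Var(market) from sample data."""
    n = len(stock_returns)
    if n != len(market_returns) or n < 2:
        raise ValueError("need two equally long return series with at least 2 points")
    s_bar = mean(stock_returns)
    m_bar = mean(market_returns)
    # Sample covariance between stock and market returns
    cov = sum((s - s_bar) * (m - m_bar)
              for s, m in zip(stock_returns, market_returns)) / (n - 1)
    # Sample variance of market returns
    var = sum((m - m_bar) ** 2 for m in market_returns) / (n - 1)
    return cov / var


# Hypothetical monthly returns: this stock moves exactly twice as much as the market
market = [0.01, -0.02, 0.03, 0.00, 0.02]
stock = [0.02, -0.04, 0.06, 0.00, 0.04]
print(beta(stock, market))  # close to 2: the stock is about twice as volatile as the market
```

A beta above 1 suggests the stock tends to move more than the market, and a beta below 1 suggests it tends to move less.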
The most sophisticated traders in today’s stock markets are those who trade in
“derivatives”, i.e. financial instruments whose price depends on the price of
some other underlying asset.
Economics
Statistical data and methods render valuable assistance in the proper
understanding of economic problems and the formulation of economic
policies. Most economic phenomena and indicators can be quantified and dealt
with using statistically sound logic.
In fact, statistics became so integrated with economics that it led to the
development of a new subject called econometrics, which basically deals with
economic issues involving the use of statistics.
Operations
The field of operations is about transforming various resources into products and
services in the place, quantity, cost, quality and time required by the
customers. Statistics plays a very useful role at the input stage through sampling
inspection and inventory management, in the process stage through statistical
quality control and the Six Sigma method, and in the output stage through sampling
inspection. The term Six Sigma quality refers to a situation where there are only
3.4 defects per million opportunities.
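The defects-per-million-opportunities figure can be computed directly from inspection counts. The sketch below is a minimal illustration. The function name `dpmo` and the inspection numbers are hypothetical.

```python
def dpmo(defects, units, opportunities_per_unit):
    """Defects per million opportunities (DPMO)."""
    total_opportunities = units * opportunities_per_unit
    if total_opportunities <= 0:
        raise ValueError("need at least one opportunity for a defect")
    return defects * 1_000_000 / total_opportunities


# Hypothetical inspection: 17 defects found in 1,000 units,
# each unit having 5 opportunities for a defect
rate = dpmo(defects=17, units=1_000, opportunities_per_unit=5)
print(rate)
print("meets the 3.4 DPMO level" if rate <= 3.4 else "does not meet the 3.4 DPMO level")
```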
Human Resource Management or Development
Human Resource departments are inter alia entrusted with the responsibility of
evaluating performance, developing rating systems, evolving compensatory
reward and training systems, etc. All these functions involve designing forms and
collecting, storing, retrieving and analysing a mass of data, and they can be
performed efficiently and effectively with the help of statistics.
What is Population?
In statistics, a population is the entire set of items from which you draw data for a
statistical study. It makes up the data pool for the study.
In everyday usage, population refers to the people who live in a particular area at a
specific time. In statistics, however, it refers to the complete set of units under study,
which can be a group of individuals, objects, events, organizations, etc. You use
populations to draw conclusions.
When the population is small, willing to provide data, and can be contacted, it is easy
to collect data, and the data collected will be complete and reliable.
If you had to collect the same data from a larger population, say the entire country of
India, it would be impossible to draw reliable conclusions because of geographical and
accessibility constraints, not to mention time and resource constraints. A lot of data
would be missing or might be unreliable. Furthermore, due to accessibility issues,
marginalized tribes or villages might not provide data at all, making the data biased
towards certain regions or groups.
What is a Sample?
The sample is an unbiased subset of the population that best represents the whole
data.
To overcome the constraints of working with a whole population, you can collect data
from a subset of the population and treat it as representative. You collect the
information from the groups who take part in the study, which keeps the data
reliable. The results obtained for the groups who took part can then be
extrapolated to the population.
Population: All residents above the poverty line in a country would be the population.
Sample: All residents who are millionaires would make up the sample.
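The idea of estimating a population value from a subset can be sketched with a simple random sample. In the example below the population is a hypothetical list of 10,000 numeric values, and the helper name `simple_random_sample` is an assumption made for illustration.

```python
import random
from statistics import mean


def simple_random_sample(population, size, seed=None):
    """Draw `size` distinct items from the population, each equally likely."""
    rng = random.Random(seed)  # seeded only so the sketch is reproducible
    return rng.sample(population, size)


# Hypothetical population: one numeric value per member
population = list(range(1, 10_001))
sample = simple_random_sample(population, size=500, seed=1)

print("population mean:", mean(population))
print("sample mean:    ", mean(sample))
```

With a random sample, the sample mean is usually close to the population mean, which is what allows results from the subset to be generalized.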
English
German
French
Punjabi
What’s your nationality?
American
Indian
Japanese
German
You can clearly see that in these examples of nominal data the
categories have no order.
2. Ordinal Data
Ordinal data is similar to nominal data, except that its categories can be
ordered, like 1st, 2nd, etc. However, the relative distances between
adjacent categories are not uniform.
Ordinal data is observed but not measured, is ordered but non-equidistant,
and has no meaningful zero. Ordinal scales are often used for measuring
happiness, satisfaction, etc.
With ordinal data, as with nominal data, you can group the observations
by checking whether they are the same or different.
Because ordinal data are ordered, they can also be arranged by making basic
comparisons between the categories, for example, greater or less than,
higher or lower, and so on.
You cannot perform arithmetic operations on ordinal data, however,
because they are not numerical data.
With ordinal data, you can calculate the same things as for nominal data
(frequencies, proportions, percentages, and a central point such as the
mode). Because the categories are ordered, you can also calculate
order-based summary statistics such as the median and percentiles.
Examples of Ordinal data:
Opinion
o Agree
o Mostly agree
o Neutral
o Mostly disagree
o Disagree
Time of day
o Morning
o Noon
o Night
In these examples, there is an obvious order to the categories.
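A short sketch of what can be computed for ordinal data: frequencies, as for nominal data, and an order-based middle category. The survey responses and the helper name `ordinal_median` are hypothetical. The lower of the two middle items is returned when the count is even, since ordinal categories cannot be averaged.

```python
from collections import Counter

# Categories listed from lowest to highest agreement
OPINION_ORDER = ["Disagree", "Mostly disagree", "Neutral", "Mostly agree", "Agree"]


def ordinal_median(responses, order):
    """Return the middle category after sorting responses by their rank."""
    if not responses:
        raise ValueError("need at least one response")
    rank = {category: i for i, category in enumerate(order)}
    ranked = sorted(responses, key=lambda r: rank[r])
    return ranked[(len(ranked) - 1) // 2]  # lower middle item for even counts


# Hypothetical survey responses
responses = ["Agree", "Neutral", "Mostly agree", "Disagree",
             "Neutral", "Agree", "Mostly agree"]

print(Counter(responses))                        # frequencies per category
print(ordinal_median(responses, OPINION_ORDER))  # middle category
```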
3. Interval Data
Interval data are measured and ordered with equidistant items, but have
no meaningful zero.
The word 'interval' signifies 'space in between', which is the important
thing to remember: interval scales tell us not only about the order of the
items but also about the size of the difference between them.
Interval data can be negative, though ratio data can't.
Even though interval data can look very similar to ratio data, the
difference lies in their defined zero points. If the zero point of the scale
has been chosen arbitrarily, then the data cannot be ratio data and must
be interval data.
Hence, with interval data you can easily compare the sizes of differences
between values, and you can add or subtract the values.
The descriptive statistics that you can calculate for interval data are
central point (mean, median, mode), range (minimum, maximum), and
spread (percentiles, interquartile range, and standard deviation).
In addition, other statistical data analysis techniques can be used for
further analysis.
Examples of Interval data:
Temperature (°C or °F, but not Kelvin)
Dates (1066, 1492, 1776, etc.)
Time on a 12-hour clock (6 am, 6 pm)
4. Ratio Data
Ratio data are measured and ordered with equidistant items and a
meaningful zero and, unlike interval data, can never be negative.
An outstanding example of ratio data is the measurement of heights. It
could be measured in centimetres, inches, meters, or feet, and it is not
possible to have a negative height.
Ratio data tell us about the order of the values and the differences
among them, and they have an absolute zero. This permits a wide range of
calculations and inferences to be performed and drawn.
Ratio data is fundamentally the same as interval data, except that zero
means none.
The descriptive statistics which you can calculate for ratio data are the
same as for interval data: central point (mean, median, mode),
range (minimum, maximum), and spread (percentiles, interquartile range,
and standard deviation).
Example of Ratio data:
Age (from 0 years to 100+)
Temperature (in Kelvin, but not °C or °F)
Distance (measured with a ruler or any other measuring device)
Time interval (measured with a stop-watch or similar)
Therefore, in these examples of ratio data there is an actual, meaningful
zero point: the age of a person, absolute zero temperature, distance measured
from a specified point, and elapsed time all have real zeros.
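The difference between an arbitrary zero and a true zero can be made concrete with temperature. In the sketch below, which uses illustrative values, differences agree on both scales, but only the Kelvin values support a meaningful ratio.

```python
def celsius_to_kelvin(celsius):
    """Convert a Celsius temperature (interval scale) to Kelvin (ratio scale)."""
    return celsius + 273.15


c1, c2 = 10.0, 20.0  # illustrative temperatures in degrees Celsius

# Differences are meaningful on both scales and have the same size
difference_c = c2 - c1
difference_k = celsius_to_kelvin(c2) - celsius_to_kelvin(c1)

# A ratio computed in Celsius depends on the arbitrary zero point
naive_ratio = c2 / c1
# A ratio computed in Kelvin compares against a true zero
kelvin_ratio = celsius_to_kelvin(c2) / celsius_to_kelvin(c1)

print(difference_c, difference_k)
print(naive_ratio, kelvin_ratio)
```

Here 20 °C is not "twice as hot" as 10 °C: the Kelvin ratio is only slightly above 1.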
MEANING OF MEASURES OF CENTRAL TENDENCY
For example, when we talk about the achievement scores of the students of a
class, we find some students with very high or very low scores. However, the
scores of most students lie somewhere between the highest and the
lowest scores of the whole class. The score around which the data
converge is used as a measure of central tendency.
Mean:
Mean is the average of all values in a distribution, whether discrete or
continuous, although it is calculated differently in the two cases. For
discrete data, all the scores are added and the sum is divided by the number
of scores. For a continuous distribution, there are different
methods to calculate the mean.
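A minimal sketch of both cases follows. For the continuous (grouped) case it uses the common class-midpoint method, which is one of several possible methods and is an assumption here. The function names and data are hypothetical.

```python
def mean_discrete(scores):
    """Add all scores and divide by how many there are."""
    if not scores:
        raise ValueError("need at least one score")
    return sum(scores) / len(scores)


def mean_grouped(classes):
    """Mean of grouped data from (lower, upper, frequency) class intervals,
    using each class midpoint as the representative value (an assumption)."""
    total_frequency = sum(freq for _, _, freq in classes)
    if total_frequency <= 0:
        raise ValueError("total frequency must be positive")
    weighted_sum = sum((lower + upper) / 2 * freq for lower, upper, freq in classes)
    return weighted_sum / total_frequency


print(mean_discrete([4, 8, 6, 5, 7]))                            # (4+8+6+5+7) / 5
print(mean_grouped([(0, 10, 2), (10, 20, 5), (20, 30, 3)]))      # midpoints 5, 15, 25
```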
Median:
The median is the middle value in a series of data. For discrete data, when
the number of entries in the list is odd, the median is the middle entry in
the list after sorting the list into increasing order.
When the number of entries is even, the median is equal to the sum of the two
middle numbers (after sorting the list into increasing order) divided by two. For
a continuous distribution, a different formula is applied.
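The odd and even cases for discrete data can be sketched as follows. The function name `median` and the sample lists are illustrative only.

```python
def median(values):
    """Median of a list of numbers, following the odd/even rule for discrete data."""
    if not values:
        raise ValueError("need at least one value")
    ordered = sorted(values)          # sort into increasing order
    n = len(ordered)
    mid = n // 2
    if n % 2 == 1:
        return ordered[mid]           # odd count: the middle entry
    return (ordered[mid - 1] + ordered[mid]) / 2   # even count: mean of the two middle entries


print(median([7, 3, 9]))      # sorted: 3, 7, 9
print(median([7, 3, 9, 1]))   # sorted: 1, 3, 7, 9
```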
Mode:
The mode of a list of numbers is the value that occurs most frequently.
For example, in the data 7, 2, 2, 43, 11, 11, 44, 18, 18, 18, 27, 39, 6, the
value 18 occurs most often, at 3 times. There can be more than one
mode for a distribution with a discrete random variable.
A distribution with two modes is called bimodal and a distribution with three
modes is called trimodal. The mode of a distribution with a continuous
random variable is calculated differently.
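Because a discrete distribution can have more than one mode, the sketch below returns every value that shares the highest count. The function name `modes` is an illustrative choice, and the second list is a hypothetical bimodal example.

```python
from collections import Counter


def modes(values):
    """Return all values that occur with the highest frequency, in sorted order."""
    counts = Counter(values)
    if not counts:
        raise ValueError("need at least one value")
    highest = max(counts.values())
    return sorted(value for value, count in counts.items() if count == highest)


data = [7, 2, 2, 43, 11, 11, 44, 18, 18, 18, 27, 39, 6]
print(modes(data))              # a single mode
print(modes([1, 1, 2, 2, 3]))   # two modes: a bimodal list
```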
Range:
The range of a set of data is the difference between the largest and smallest
values. However, in descriptive statistics, this concept of range has a more
complex meaning.
It is calculated in the same way for discrete random variable series and
continuous random variable series.
Rightly and rigidly defined: It implies that the definition of the measure should be
calculating it.
Easy to understand: It should be simple to calculate. Too much complexity and
Measures of Dispersion
A measure of dispersion indicates the scattering of data. It describes how much the data values
differ from one another, giving a clearer view of their distribution.
In other words, dispersion is the extent to which values in a distribution differ from the average of
the distribution. It gives us an idea about the extent to which individual items vary from one another,
and from the central value.
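Common measures of dispersion can be computed with Python's standard `statistics` module. The sketch below treats the data as a whole population, which is an assumption (a sample would use `variance` and `stdev` instead). The data values and the helper name `dispersion_summary` are hypothetical.

```python
from statistics import mean, pstdev, pvariance


def dispersion_summary(values):
    """Range, population variance, standard deviation and coefficient of variation."""
    average = mean(values)
    std_dev = pstdev(values)
    return {
        "range": max(values) - min(values),
        "variance": pvariance(values),
        "std_dev": std_dev,
        # Coefficient of variation: spread relative to the mean (undefined if the mean is 0)
        "coeff_of_variation": std_dev / average,
    }


data = [2, 4, 4, 4, 5, 5, 7, 9]
for name, value in dispersion_summary(data).items():
    print(name, value)
```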
What is Probability Distribution?
A probability distribution gives the probabilities of the possible outcomes of a random event. It is
defined on the underlying sample space, that is, the set of possible outcomes of a random experiment.
This set could be a set of real numbers, a set of vectors, or a set of any entities. It is a part of
probability and statistics.
A random experiment is an experiment whose outcome cannot be predicted. For example, if we toss
a coin, we cannot predict whether it will come up Head or Tail. A possible result of a random
experiment is called an outcome, and the set of all outcomes is called the sample space. With the
help of these experiments or events, we can always create a probability table in terms of variables
and probabilities.
1. Normal or Cumulative Probability Distribution
2. Binomial or Discrete Probability Distribution
Let us now discuss both types, along with their definitions, formulas and examples.
For example, a variable that can take any real value within a range has a continuous distribution, of
which the normal distribution is the best-known case. In real-life scenarios, the temperature of the
day is an example of a continuous random variable. Based on these outcomes we can create a
distribution table. A probability density function describes the distribution. The formula for the
normal distribution is:

$$f(x) = \frac{1}{\sigma\sqrt{2\pi}}\, e^{-\frac{(x-\mu)^2}{2\sigma^2}}$$

Where,
μ = Mean value
σ = Standard deviation of the distribution
x = Normal random variable
If mean (μ) = 0 and standard deviation (σ) = 1, then the distribution is known
as the standard normal distribution.
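The normal density with mean μ and standard deviation σ can be evaluated numerically. The sketch below implements the standard formula, the exponential of -(x - μ)² / (2σ²) divided by σ√(2π). The function name `normal_pdf` is an illustrative choice.

```python
import math


def normal_pdf(x, mu=0.0, sigma=1.0):
    """Density of a normal distribution with mean mu and standard deviation sigma at x."""
    if sigma <= 0:
        raise ValueError("sigma must be positive")
    z = (x - mu) / sigma
    return math.exp(-0.5 * z * z) / (sigma * math.sqrt(2 * math.pi))


# Standard normal (mu = 0, sigma = 1): the density peaks at the mean
print(normal_pdf(0.0))
# The curve is symmetric about the mean
print(normal_pdf(-1.0), normal_pdf(1.0))
```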
You can use a Poisson distribution to predict or explain the number of events occurring
within a given interval of time or space. “Events” could be anything from disease cases
to customer purchases to meteor strikes. The interval can be any specific amount of
time or space, such as 10 days or 5 square inches.
1. Individual events happen at random and independently. That is, the probability of one
event doesn’t affect the probability of another event.
2. You know the mean number of events occurring within a given interval of time or space.
This number is called λ (lambda), and it is assumed to be constant.
When events follow a Poisson distribution, λ is the only thing you need to know to
calculate the probability of an event occurring a certain number of times.
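Given λ, the probability of observing exactly k events follows the standard Poisson formula P(X = k) = λ^k e^(-λ) / k!. The sketch below assumes a hypothetical shop that averages 3 purchases per hour. The function name `poisson_pmf` is illustrative.

```python
import math


def poisson_pmf(k, lam):
    """Probability of exactly k events in an interval when the mean count is lam."""
    if k < 0 or lam < 0:
        raise ValueError("k and lam must be non-negative")
    return lam ** k * math.exp(-lam) / math.factorial(k)


lam = 3.0  # hypothetical mean number of purchases per hour
for k in range(6):
    print(k, round(poisson_pmf(k, lam), 4))
```

Summing the probabilities over all possible counts gives 1, which is a quick sanity check on the function.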
Sampling Distribution
A sampling distribution, in the field of statistics, is a type of probability
distribution in which a statistic is calculated from random samples drawn
repeatedly from a given population. It describes how that statistic varies
from sample to sample, and it is used in numerous fields.
The frequency distribution of a sample statistic such as the mean often
approximates a normal distribution: most samples lie close to the
population's mean value, and few deviate far from it.
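The clustering of sample means around the population mean can be seen in a small simulation. The sketch below repeatedly draws random samples from a hypothetical population of the numbers 1 to 100 and records each sample's mean. The function name, sample size and number of repetitions are arbitrary choices.

```python
import random
from statistics import mean, pstdev


def sample_means(population, sample_size, n_samples, seed=0):
    """Draw n_samples random samples and return the mean of each one."""
    rng = random.Random(seed)  # seeded only so the sketch is reproducible
    return [mean(rng.sample(population, sample_size)) for _ in range(n_samples)]


population = list(range(1, 101))   # hypothetical population, mean 50.5
means = sample_means(population, sample_size=30, n_samples=2000, seed=42)

print("population mean:       ", mean(population))
print("mean of sample means:  ", mean(means))
print("spread of population:  ", pstdev(population))
print("spread of sample means:", pstdev(means))
```

The sample means vary much less than the individual population values, and their average sits close to the population mean.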
Types of Sampling Distribution