Abstract :
Introduction:
Statistics is a branch of science that deals with the collection, organization and analysis
of data and the drawing of inferences from samples to the whole population. This
requires a proper design of the study, an appropriate selection of the study sample
and the choice of a suitable statistical test. An adequate knowledge of statistics is
necessary for the proper design of an epidemiological study or a clinical trial.
Improper statistical methods may result in erroneous conclusions, which may lead to
unethical practice. A standard statistical procedure involves the collection of data leading
to a test of the relationship between two statistical data sets, or between a data set and
synthetic data drawn from an idealized model. A hypothesis is proposed for the statistical
relationship between the two data sets, and this is compared as an alternative to an
idealized null hypothesis of no relationship between the two data sets.
Variables:
Statistics: Descriptive And Inferential Statistics
Descriptive statistics try to describe the relationship between variables in a sample or
population. Descriptive statistics provide a summary of data in the form of mean,
median and mode. Inferential statistics use a random sample of data taken from a
population to describe and make inferences about the whole population. They are
valuable when it is not possible to examine each member of an entire population.
Descriptive statistics
The extent to which the observations cluster around a central location is described by
the central tendency and the spread towards the extremes is described by the degree
of dispersion.
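As a minimal sketch of these two ideas, the snippet below computes common measures of central tendency and dispersion with Python's standard `statistics` module. The blood pressure readings are made-up values used only for illustration.

```python
import statistics

# Hypothetical sample: systolic blood pressure readings in mmHg (made-up values).
readings = [118, 122, 122, 125, 130, 134, 141]

# Measures of central tendency
mean_value = statistics.mean(readings)      # arithmetic mean
median_value = statistics.median(readings)  # middle value of the sorted data
mode_value = statistics.mode(readings)      # most frequent value

# Measures of dispersion
value_range = max(readings) - min(readings)  # distance between the extremes
sample_sd = statistics.stdev(readings)       # sample standard deviation (n - 1 denominator)

print(mean_value, median_value, mode_value, value_range, round(sample_sd, 2))
```

The mean here is larger than the median because the highest readings pull it upwards, which is one reason both measures are usually reported together.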
Normal distribution or Gaussian distribution
Most biological variables cluster around a central value, with
symmetrical positive and negative deviations about this point. The standard normal
distribution curve is a symmetrical, bell-shaped curve. In a normal distribution, about
68% of the scores are within 1 SD of the mean, around 95% of the scores are within
2 SDs of the mean and about 99.7% are within 3 SDs of the mean.
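The shares of scores within 1, 2 and 3 SDs of the mean (roughly 68%, 95% and 99.7%) can be checked numerically. The sketch below uses `statistics.NormalDist` from the Python standard library. The helper name `share_within` is introduced here for illustration only.

```python
from statistics import NormalDist

# Standard normal distribution: mean 0, standard deviation 1.
standard_normal = NormalDist(mu=0.0, sigma=1.0)

def share_within(k):
    """Probability that a normal variable lies within k SDs of its mean."""
    return standard_normal.cdf(k) - standard_normal.cdf(-k)

for k in (1, 2, 3):
    print(f"within {k} SD: {share_within(k):.4f}")
```

Because every normal distribution is a shifted and rescaled copy of the standard one, the same shares hold for any mean and SD.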
Skewed distribution
It is a distribution in which the values are asymmetric about the mean. In a negatively
skewed distribution, the mass of the distribution is concentrated on the right of
the figure, leading to a longer left tail. In a positively skewed distribution, the mass
of the distribution is concentrated on the left of the figure, leading to a longer right tail.
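The direction of skew can be illustrated with a moment-based skewness coefficient. The following is a sketch under stated assumptions: the function `sample_skewness` and both data sets are hypothetical, and the coefficient used is the simple (population) form rather than a bias-corrected one.

```python
import statistics

def sample_skewness(values):
    """Moment-based skewness: mean cubed deviation divided by the cubed population SD."""
    n = len(values)
    mean_value = statistics.fmean(values)
    sd = statistics.pstdev(values)
    return sum((v - mean_value) ** 3 for v in values) / (n * sd ** 3)

# Hypothetical data: most values are small, a few large ones form a long right tail.
right_skewed = [1, 2, 2, 3, 3, 3, 4, 5, 9, 15]
# Mirroring the data flips the long tail to the left.
left_skewed = [-v for v in right_skewed]

print(sample_skewness(right_skewed) > 0)   # positive skew
print(sample_skewness(left_skewed) < 0)    # negative skew
print(statistics.fmean(right_skewed), statistics.median(right_skewed))
```

In the positively skewed sample the mean lies above the median, because the long right tail pulls the mean towards it.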
Inferential statistics
In inferential statistics, data from a sample are analysed to make inferences about the
larger population. The purpose is to answer questions or test hypotheses.
A hypothesis (plural hypotheses) is a proposed explanation for a phenomenon.
Hypothesis tests are thus procedures for making rational decisions about the reality
of observed effects.
Probability is the measure of the likelihood that an event will occur. Probability is
quantified as a number between 0 and 1 (where 0 indicates impossibility and 1
indicates certainty).
In inferential statistics, the term ‘null hypothesis’ (H0, read ‘H-naught’ or ‘H-null’)
denotes that there is no relationship (difference) between the population variables in
question.
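One way to make the null hypothesis concrete is a permutation test: if there is truly no difference between two groups, the group labels are interchangeable, so shuffling them should often produce a difference as large as the one observed. The sketch below is only an illustration of that idea and is not a method taken from this paper. The function name, the group values and the number of permutations are all assumptions.

```python
import random
import statistics

def permutation_p_value(group_a, group_b, n_permutations=5000, seed=0):
    """Two-sided permutation test for a difference in means."""
    rng = random.Random(seed)  # fixed seed so the result is reproducible
    observed = abs(statistics.fmean(group_a) - statistics.fmean(group_b))
    pooled = list(group_a) + list(group_b)
    n_a = len(group_a)
    at_least_as_extreme = 0
    for _ in range(n_permutations):
        rng.shuffle(pooled)  # reassign group labels at random, as H0 allows
        diff = abs(statistics.fmean(pooled[:n_a]) - statistics.fmean(pooled[n_a:]))
        if diff >= observed:
            at_least_as_extreme += 1
    # Add one to numerator and denominator so the estimate is never exactly zero.
    return (at_least_as_extreme + 1) / (n_permutations + 1)

# Hypothetical measurements for two groups (made-up values).
treated = [5.1, 5.8, 6.2, 6.0, 5.9, 6.4]
control = [4.2, 4.9, 4.4, 5.0, 4.6, 4.8]

p_value = permutation_p_value(treated, control)
print(p_value)
```

A small p-value means that a difference this large would rarely arise under H0, which is taken as evidence against the null hypothesis.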
Statistics is a mathematical discipline that deals with the collection and analysis of
data. Its steps include data collection, organization or summarization, analysis
and interpretation. Statistics is a form of applied mathematics that draws
conclusions from the obtained data, and this mathematical analysis makes the
dataset applicable to real life. Statistics is widely used in fields such as
psychology, geology and weather forecasting. The data are collected in either
quantitative or qualitative form.
Types of statistics
There are two major types of statistics: descriptive statistics and
inferential statistics. Each type is described in more detail here.
1. Descriptive Statistics
This type of statistics describes the population through numerical summaries, graphs
or tables built from the given data. It is further categorized as,
1. The measure of central tendency
2. Measure of variability
2. Inferential Statistics
This type of statistics makes predictions about the population based on the given
sample data. Inferential statistics uses probability theory to quantify the
uncertainty of those predictions.
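As an illustration of inferring a population value from a sample, the sketch below computes an approximate 95% confidence interval for a population mean. It rests on assumptions: the sample values are made up, the helper `mean_confidence_interval` is hypothetical, and a normal approximation is used (for a sample this small a t-distribution would usually be preferred).

```python
import math
import statistics
from statistics import NormalDist

def mean_confidence_interval(sample, confidence=0.95):
    """Approximate confidence interval for the population mean (normal approximation)."""
    n = len(sample)
    sample_mean = statistics.fmean(sample)
    standard_error = statistics.stdev(sample) / math.sqrt(n)
    # Critical value of the standard normal, about 1.96 for 95% confidence.
    z = NormalDist().inv_cdf(0.5 + confidence / 2)
    return sample_mean - z * standard_error, sample_mean + z * standard_error

# Hypothetical random sample drawn from a larger population (made-up values).
sample = [72, 75, 69, 80, 77, 74, 71, 78, 76, 73, 79, 70]

low, high = mean_confidence_interval(sample)
print(round(low, 2), round(high, 2))
```

The interval narrows as the sample grows, since the standard error shrinks with the square root of the sample size.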
• Mean
$\bar{X} = \frac{\sum x}{n}$
Where,
$\sum x$ = sum of numbers
$n$ = number of items
• Standard Deviation
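No formula survives for this entry in the text. Assuming the conventional definition of the sample standard deviation is intended, it can be written as below, where $x$ denotes each observation, $\bar{X}$ the mean of the observations and $n$ the number of items. The symbol $s$ is introduced here and does not appear in the original.

```latex
s = \sqrt{\frac{\sum \left(x - \bar{X}\right)^{2}}{n - 1}}
```

Dividing by $n$ instead of $n - 1$ gives the population standard deviation, which applies when the data cover the whole population rather than a sample.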
• Regression analysis
$Y_i = f(X_i, \beta) + e_i$
Where,
$Y_i$ = dependent variable
$X_i$ = independent variable
$e_i$ = error terms
$\beta$ = unknown parameters
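The symbols listed above match the usual notation of a regression model. The sketch below assumes the simplest case, a straight line Y_i = β0 + β1·X_i + e_i fitted by ordinary least squares. The function `fit_line` and the data points are hypothetical and serve only to show how the unknown parameters and error terms are obtained.

```python
def fit_line(xs, ys):
    """Ordinary least squares fit of y = beta0 + beta1 * x; returns (beta0, beta1)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of squared deviations of x, and sum of cross-products of deviations.
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope

# Hypothetical observations: X is the independent variable, Y the dependent one.
xs = [1.0, 2.0, 3.0, 4.0, 5.0]
ys = [2.1, 3.9, 6.2, 7.8, 10.1]

beta0, beta1 = fit_line(xs, ys)

# Estimated error terms (residuals): observed Y minus fitted Y.
residuals = [y - (beta0 + beta1 * x) for x, y in zip(xs, ys)]
print(round(beta0, 3), round(beta1, 3))
```

With an intercept in the model, the least squares residuals sum to zero, which is a quick check that the fit was computed correctly.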
• Hypothesis testing
Statistical Tools used for Data Analysis
• SPSS (IBM)
• R (R Foundation for Statistical Computing)
• MATLAB (The MathWorks)
• Microsoft Excel
• SAS (Statistical Analysis System)
• GraphPad Prism
• Minitab
Conclusion:
In this research paper we have reviewed several statistical methods, their formulae
and their use. The paper also reviewed statistical methods that are used to
analyse and clean data. We have also seen the types of statistics and how to
identify and handle problems with messy data, such as outliers and missing values, using
statistical methods. Statistical methods are vital to data science.