
Research Paper on Different Statistical Tools Used for Analysis


Authors: Yasin Pathan and Mangesh Kotgire

Abstract:

Statistical methods involved in carrying out a study include planning, designing,
collecting data, analysing, drawing meaningful interpretation and reporting of the
research findings. Statistical analysis gives meaning to otherwise meaningless numbers,
thereby breathing life into lifeless data. The results and inferences are precise only
if proper statistical tests are used. This article will try to acquaint the reader with the
basic research tools that are utilised while conducting various studies. The article
covers a brief outline of the variables, an understanding of quantitative and
qualitative variables and the measures of central tendency. An idea of the sample
size estimation, power analysis and the statistical errors is given. Finally, there is a
summary of parametric and non-parametric tests used for data analysis.
Key words: Basic statistical tools, degree of dispersion, measures of central
tendency, parametric tests and non-parametric tests, variables, variance

Introduction:

Statistics is a branch of science that deals with the collection, organization, analysis
of data and drawing of inferences from the samples to the whole population. This
requires a proper design of the study, an appropriate selection of the study sample
and choice of a suitable statistical test. An adequate knowledge of statistics is
necessary for proper designing of an epidemiological study or a clinical trial.
Improper statistical methods may result in erroneous conclusions which may lead to
unethical practice. A standard statistical procedure involves the collection of data, leading
to a test of the relationship between two statistical data sets, or a data set and synthetic
data drawn from an idealized model. A hypothesis is proposed for the statistical
relationship between the two data sets, and this is compared as an alternative to an
idealized null hypothesis of no relationship between two data sets.

Variables:

A variable is a characteristic that varies from one individual member of a population to
another. Variables such as height and weight are measured on some type
of scale, convey quantitative information and are called quantitative variables. Sex
and eye color give qualitative information and are called qualitative variables.

Statistics: Descriptive And Inferential Statistics
Descriptive statistics try to describe the relationship between variables in a sample or
population. Descriptive statistics provide a summary of data in the form of mean,
median and mode. Inferential statistics use a random sample of data taken from a
population to describe and make inferences about the whole population. It is
valuable when it is not possible to examine each member of an entire population.

Descriptive statistics
The extent to which the observations cluster around a central location is described by
the central tendency and the spread towards the extremes is described by the degree
of dispersion.
Normal distribution or Gaussian distribution

Most biological variables cluster around a central value, with
symmetrical positive and negative deviations about this point. The standard normal
distribution curve is a symmetrical bell-shaped curve. In a normal distribution, about
68% of the scores are within 1 SD of the mean, around 95% are within
2 SDs of the mean and about 99.7% are within 3 SDs of the mean.
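
The percentages quoted above can be checked numerically. The following is a minimal sketch, assuming an exactly normal variable; the function name prob_within is ours and is used only for illustration. It relies on the identity P(|Z| <= k) = erf(k / sqrt(2)).

```python
import math


def prob_within(k: float) -> float:
    """Probability that a normally distributed value lies within k SDs of its mean."""
    return math.erf(k / math.sqrt(2))


for k in (1, 2, 3):
    # Prints the coverage for 1, 2 and 3 standard deviations.
    print(f"within {k} SD: {prob_within(k):.4f}")
```

The three values come out at roughly 0.68, 0.95 and 0.997, which is why the figure for 3 SDs is usually quoted as 99.7% rather than 99%.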

Skewed distribution
It is a distribution in which the values are asymmetric about the mean. In a negatively
skewed distribution, the mass of the distribution is concentrated on the right of
the figure, leading to a longer left tail. In a positively skewed distribution, the
mass of the distribution is concentrated on the left of the figure, leading to a
longer right tail.
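
The direction of skew can also be read from a single number. The sketch below assumes the moment-based (population) skewness coefficient, which is one of several common definitions; the function name and the sample values are made up for illustration.

```python
from statistics import mean, pstdev


def skewness(data):
    """Moment-based skewness: the average cubed z-score of the data."""
    m = mean(data)
    s = pstdev(data)  # population standard deviation
    return sum(((x - m) / s) ** 3 for x in data) / len(data)


right_tailed = [1, 2, 2, 3, 3, 3, 4, 15]   # one large value stretches the right tail
left_tailed = [-x for x in right_tailed]   # mirror image, so the long tail is on the left

print(skewness(right_tailed))  # positive: positively skewed
print(skewness(left_tailed))   # negative: negatively skewed
```

A value near zero suggests a roughly symmetrical distribution.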

Inferential statistics
In inferential statistics, data from a sample are analysed to make inferences about the
larger population. The purpose is to answer questions or test hypotheses.
A hypothesis (plural hypotheses) is a proposed explanation for a phenomenon.
Hypothesis tests are thus procedures for making rational decisions about the reality
of observed effects.
Probability is the measure of the likelihood that an event will occur. Probability is
quantified as a number between 0 and 1 (where 0 indicates impossibility and 1
indicates certainty).
In inferential statistics, the term ‘null hypothesis’ (H0 ‘H-naught,’ ‘H-null’) denotes
that there is no relationship (difference) between the population variables in
question.
Statistics is a mathematical discipline that deals with the collection and analysis
of data. Its steps include data collection, organization or summarization,
analysis and interpretation. As a form of applied mathematics, it draws
conclusions from the data obtained, which makes a dataset usable for
real-life decisions. Statistics is widely used in fields such as
psychology, geology and weather forecasting. The data are collected in either
quantitative or qualitative form.

Types of statistics

There are two major types of statistics: descriptive statistics and
inferential statistics. Each is described in more detail below.
1. Descriptive Statistics
Descriptive statistics describe the population through numerical summaries, graphs
or tables built from the given data. They are further categorized as:
   1. Measures of central tendency
   2. Measures of variability
2. Inferential Statistics
Inferential statistics make predictions about the population based on the given
sample data, using the methods of probability to reach those conclusions.

Five methods of statistical analysis:


• Mean

The mean, or average, is the most commonly used method of
statistical analysis. It is widely used in
research, academics and sports. The mean is calculated by
adding up the given numbers and dividing the sum by the number of
items.
The mathematical formula of the mean is given by

x̄ = ∑x / n
Where,
x̄ = mean
∑x = sum of numbers
n = number of items
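
As a minimal sketch of this formula, the function below adds up the values and divides by their count. The function name and the example scores are made up for illustration.

```python
def mean(values):
    """Arithmetic mean: the sum of the values divided by the number of items."""
    if not values:
        raise ValueError("mean of an empty list is undefined")
    return sum(values) / len(values)


scores = [4, 8, 15, 16, 23, 42]  # illustrative data; the sum is 108 and n is 6
print(mean(scores))  # prints 18.0
```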

• Standard Deviation

Standard deviation is a measure used in statistical analysis that describes the
spread of the data around the mean. Its calculation is based on how far each
observation lies from the mean.
The mathematical formula of the sample standard deviation is given by

s = √( ∑(x − x̄)² / (n − 1) )
Where,
x̄ = mean
n = number of items
Dividing by n instead of n − 1 gives the population standard deviation.
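
The calculation can be sketched as follows, assuming the sample form of the standard deviation, which divides the summed squared deviations from the mean by n − 1. The function name and the data are illustrative only.

```python
import math


def sample_sd(values):
    """Sample standard deviation: sqrt( sum((x - mean)^2) / (n - 1) )."""
    n = len(values)
    if n < 2:
        raise ValueError("at least two values are needed")
    m = sum(values) / n
    squared_deviations = sum((x - m) ** 2 for x in values)
    return math.sqrt(squared_deviations / (n - 1))


data = [2, 4, 4, 4, 5, 5, 7, 9]  # mean is 5, squared deviations sum to 32
print(sample_sd(data))           # sqrt(32 / 7), a little over 2.1
```

Python's standard library offers the same calculation as statistics.stdev, with statistics.pstdev for the population form.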
• Regression

In statistical analysis, regression describes the relationship between an
independent variable and a dependent variable. The line drawn on a regression
chart shows the relationship between the variables, for example between a factor and time.
For simple linear regression, the equation of the line is
Y = a + b(x)
Y = dependent variable
b = slope
x = independent variable
a = y-intercept
The general formula is,
Yi = f (Xi, β) + ei

Yi = dependent variable
Xi = independent variable
ei = error term
β = unknown parameters
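
For the straight-line case Y = a + b(x), the slope and intercept can be estimated from data. The sketch below assumes ordinary least squares, which the paper does not name explicitly; the function name and the data points are made up for illustration.

```python
def fit_line(xs, ys):
    """Least-squares estimates of intercept a and slope b in Y = a + b(x)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Slope: co-variation of x and y divided by the variation of x.
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    b = sxy / sxx
    a = mean_y - b * mean_x  # the fitted line passes through (mean_x, mean_y)
    return a, b


xs = [1, 2, 3, 4, 5]    # independent variable
ys = [3, 5, 7, 9, 11]   # dependent variable, exactly 1 + 2x here
a, b = fit_line(xs, ys)
print(a, b)  # prints 1.0 2.0
```

With real data the points do not fall exactly on a line, and the differences between observed and fitted values are the error terms ei.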
• Hypothesis testing

Hypothesis testing, of which the t-test is one widely used example, is a statistical
analysis method that checks the given data against stated
assumptions (hypotheses). It is widely used to support decisions, for example in business.
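
As an illustration, the sketch below computes the statistic for a one-sample t-test, which is only one of several kinds of t-test. The null hypothesis is that the population mean equals a stated value mu0. The function name, the data and the value of mu0 are assumptions made for this example.

```python
import math
from statistics import mean, stdev


def one_sample_t(sample, mu0):
    """Return the t statistic and degrees of freedom for H0: population mean == mu0."""
    n = len(sample)
    standard_error = stdev(sample) / math.sqrt(n)  # stdev is the sample SD
    t = (mean(sample) - mu0) / standard_error
    return t, n - 1


sample = [5.1, 4.9, 5.6, 5.8, 6.0, 5.4]  # illustrative measurements
t, df = one_sample_t(sample, mu0=5.0)
print(f"t = {t:.3f} with {df} degrees of freedom")
```

The statistic is then compared with the t distribution for the given degrees of freedom; a large absolute value is evidence against the null hypothesis.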
• Sample Size Determination

Sample size determination is a method of deciding how much data to examine when
the full dataset is excessively large. The population is often so enormous that it is hard to gather exact
information on every member, so a sample of adequate size is studied instead.
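
One common way to choose a sample size is sketched below. It assumes the goal is to estimate a population mean to within a chosen margin of error, with a known standard deviation and a normal approximation, so that n = (z × σ / E)². This particular formula, the function name and the numbers are assumptions made for illustration and are not taken from the paper.

```python
import math
from statistics import NormalDist


def sample_size_for_mean(sigma, margin, confidence=0.95):
    """Smallest n such that the confidence interval half-width is at most `margin`."""
    # Two-sided critical value of the standard normal distribution.
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    return math.ceil((z * sigma / margin) ** 2)


# Assumed SD of 15 units, desired margin of error of 2 units, 95% confidence.
print(sample_size_for_mean(sigma=15, margin=2))  # prints 217
```

A smaller margin of error or a higher confidence level increases the required sample size.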

Statistical Tools used for Data Analysis
• SPSS (IBM) ...
• R (R Foundation for Statistical Computing) ...
• MATLAB (The Mathworks) ...
• Microsoft Excel. ...
• SAS (Statistical Analysis Software) ...
• GraphPad Prism. ...
• Minitab.

Conclusion:

In this research paper we have reviewed several statistical methods, their formulae
and their use. The paper also reviewed various statistical methods that are used to
analyse and clean data. We have also seen the types of statistics and how to
identify and handle problems with messy data, such as outliers and missing values, using statistical
methods. In data science, statistical methods are vital.
