
PRESENTATION ON DATA ANALYSIS

SUBMITTED BY :
ISHITA (30818210004) KARAN (3081821008)

AMISHA (30818210025) SAGAR (30818210038)

SUBMITTED TO : DR. RADHA RANI


Contents
 Meaning of Data Analysis.
 Importance of Data Analysis.
 Difference Between Data Analysis, Processing and
Interpretation.
 Types of Data Analysis.
 Statistical Analysis and Tools of Statistical
Analysis.
 Precautions in Analysis of Data
 Diagrammatic Representation of Data
Concept Of Data Analysis
 In any research, the analysis of the data is one of the most crucial tasks.
It requires proficient knowledge to handle the data collected as per the
pre-decided research design of the project.

 Data analysis is the process of cleaning, analyzing, interpreting, and
visualizing data to discover valuable insights that drive smarter and more
effective business decisions. Data analysis tools are used to extract useful
information from business data, and help make the data analysis
process easier.

 For example, researchers studying the concept of ‘diabetes’ amongst
respondents might analyze the context of when and how each respondent has
used or referred to the word ‘diabetes’.
Importance Of Data Analysis
 Data analysis can help businesses improve specific aspects of their products
and services, as well as their overall brand image and customer experience.

 In short, analysed data reveals insights that tell you where you need to focus
your efforts.

 Instead of relying on intuition or experience, analysing data provides solid
evidence to support decisions.

 Product teams, for example, often analyse customer feedback to understand
how customers interact with their product, what they’re frustrated with, and
which new features they’d like to see.
Difference Between Data Analysis,
Processing and Interpretation
 Data Analysis :
Data Analysis involves actions and methods
performed on data that help describe facts,
detect patterns, develop explanations and test
hypotheses. This includes data quality
assurance, statistical data analysis, modelling,
and interpretation of results.
 Data processing:
A series of actions or steps performed on data to verify,
organize, transform, integrate, and extract data in an
appropriate output form for subsequent use.
Example : Data processing includes the calculation of
satellite orbits, weather forecasting, statistical
analyses and, in a more practical sense, business
applications such as accounting, payroll, and billing.
 Data Interpretation :
Once the data has been processed and analyzed, the final
step required in the research process is interpretation of
the data. The line between analysis and interpretation is
very thin. Through interpretation one understands what
the given research findings really mean.

Example : When founders are pitching to potential
investors, they must interpret data (e.g. market size,
growth rate) so that the investors understand what the
figures mean.
Types of Data Analysis
Data analysis depends upon the nature of research that the
researcher is undertaking. Types of data analysis vary depending
upon whether the research is qualitative or quantitative in nature.
Descriptive & Inferential Analysis
1. Descriptive Analysis : The first type of data analysis is
descriptive analysis. It is at the foundation of all data insight.
It is the simplest and most common use of data in business
today. It uses statistical tools such as percentages and means,
and the results are then represented through a graph.

2. Inferential Analysis : It is concerned with the various tests
of significance for testing hypotheses, in order to determine
with what validity the data can be said to indicate some
conclusion or conclusions. It is also concerned with the
estimation of population values. It is mainly on the basis of
inferential analysis that the task of interpretation (i.e., the task
of drawing inferences and conclusions) is performed.
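
A minimal sketch of descriptive analysis in Python, computing a percentage distribution and a mean. The survey responses and ratings below are hypothetical values invented for illustration only.

```python
from collections import Counter
from statistics import mean

# Hypothetical survey data (illustrative only, not from any study)
responses = ["yes", "no", "yes", "yes", "no"]
ratings = [4, 5, 3, 4, 4]

# Percentage of respondents giving each answer
counts = Counter(responses)
percentages = {answer: 100 * count / len(responses)
               for answer, count in counts.items()}

# Mean rating across all respondents
average_rating = mean(ratings)

print(percentages)     # {'yes': 60.0, 'no': 40.0}
print(average_rating)  # 4
```

Figures like these are what would then be shown in a graph such as a bar diagram or pie chart.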
SOME OF THE EXAMPLES :
Statistical Analysis
Statistics may be defined as “the science of
collection, presentation and interpretation
of data from the logical analysis.”

Statistical Analysis is the systematic
collection and analysis of numerical data in
order to investigate and discover
relationships among phenomena.
Pros of Statistical Analysis
Statistics is useful in all fields of research
and study. One of the greatest advantages of
statistics is that, in research with a large
volume of data, it helps reduce the data to
a more manageable size for the purpose of
analysis and interpretation. It also helps in
comparing two or more series, as well as in
drawing inferences and conclusions from the
research.
Cons Of Statistical Analysis
1. Qualitative values like subjective perceptions,
qualities and attributes are not considered under
statistics. It only considers quantities. This is by
far the greatest limitation of statistics.
2. Statistics studies and analyses group attributes
rather than individual characteristics and values.
3. Statistical analysis is mostly based on averages;
hence the inferences drawn from it are only
approximate and not exact like those of
mathematics.
Tools of statistical analysis
There are various statistical tools which are available for
the researcher’s assistance.
DATA ANALYSIS TOOLS

■ Measures of Central Tendency
■ Measures of Dispersion
■ Measures of Asymmetry
■ Measures of Relationship
■ Other Measures
MEASURES OF CENTRAL TENDENCY
■ Measures of central tendency are commonly
called averages.
■ They give us an idea about the concentration
of the values in the central part of the
distribution.
■ The following are the six measures of central
tendency that are in common use: (i)
Arithmetic mean, (ii) Median, (iii) Mode, (iv)
Geometric mean, (v) Harmonic mean and (vi)
Weighted mean
MEAN
 The mean (average) locates the center of the distribution.
 Also known as arithmetic mean.
 The mean is simply the sum of the values divided by the total
number of items in the set.

MEAN = $\dfrac{\text{SUM OF ALL THE VALUES IN THE SAMPLE}}{\text{NUMBER OF VALUES IN THE SAMPLE}}$

MERITS :-
 It is easy to understand and easy to calculate
 It is based upon all the observations
 It is familiar to the common man and is rigidly defined
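
As a small illustration, the mean formula above can be sketched in Python. The sample values are hypothetical, and `statistics.mean` from the standard library is shown only as a cross-check.

```python
from statistics import mean

# Hypothetical sample values (illustrative only)
sample = [4, 8, 15, 16, 23, 42]

# Mean = sum of all the values / number of values
manual_mean = sum(sample) / len(sample)

# Same result from the standard library
library_mean = mean(sample)

print(manual_mean)   # 18.0
print(library_mean)  # 18
```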
TYPES OF MEAN
 HARMONIC MEAN - is a type of average that is calculated by
dividing the number of values in a data series by the sum of the
reciprocals of each value in the data series.
 GEOMETRIC MEAN - It is defined as the nth root of the product of n
numbers. The geometric mean is different from the arithmetic mean: in the
arithmetic mean, we add the data values and then divide the sum by the
total number of values, whereas in the geometric mean, we multiply the
given data values and then take the nth root. For example: for a given set
of two numbers such as 3 and 1, the geometric mean is equal to
√(3×1) = √3 ≈ 1.732.
 WEIGHTED MEAN - The weighted mean is a type of mean that is
calculated by multiplying the weight (or probability) associated with each
event or outcome by its quantitative outcome and then summing all the
products together. When the weights do not sum to 1, this total is divided
by the sum of the weights.
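
The three means can be sketched in Python as follows. The geometric mean is computed as the nth root of the product of the n values and the harmonic mean as n divided by the sum of reciprocals. The outcomes and weights are hypothetical, and the weighted mean is divided by the sum of the weights so the sketch also works when the weights do not add up to 1.

```python
import math
from statistics import geometric_mean, harmonic_mean

values = [3, 1]

# Geometric mean: nth root of the product of the n values
gm_manual = math.prod(values) ** (1 / len(values))
gm_library = geometric_mean(values)

# Harmonic mean: number of values / sum of reciprocals
hm_manual = len(values) / sum(1 / v for v in values)
hm_library = harmonic_mean(values)

# Weighted mean: sum of (weight * outcome) / sum of weights
outcomes = [10, 20, 30]      # hypothetical outcomes
weights = [0.2, 0.3, 0.5]    # hypothetical weights (probabilities)
weighted = sum(w * x for w, x in zip(weights, outcomes)) / sum(weights)

print(round(gm_manual, 3))  # 1.732
print(hm_manual)            # 1.5
print(round(weighted, 2))   # 23.0
```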
MODE
A mode is defined as the value that has the
highest frequency in a given set of values. It is
the value that appears the greatest number of
times.

Example: In the given set of data: 2, 4, 5, 5, 6,
7, the mode of the data set is 5 since it appears
in the set twice, more often than any other value.
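
The same example can be checked with Python's standard library. `multimode` is included as an aside for data sets in which several values tie for the highest frequency; the tied data set is hypothetical.

```python
from statistics import mode, multimode

data = [2, 4, 5, 5, 6, 7]

# The value that appears the most number of times
most_frequent = mode(data)

# When several values tie, multimode returns all of them
tied_modes = multimode([1, 1, 2, 2, 3])  # hypothetical data with a tie

print(most_frequent)  # 5
print(tied_modes)     # [1, 2]
```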
MEDIAN
The median is the middle value. It is the value that splits the
dataset in half. To find the median, order your data from smallest
to largest, and then find the data point that has an equal number of
values above it and below it. The formula to calculate the median
of the data set is given as follows:

Odd Number of Observations - If the total number of observations is odd,
then the formula to calculate the median is:
Median = $\left(\frac{n+1}{2}\right)^{\text{th}}$ term

Even Number of Observations - If the total number of observations is even, then
the median formula is:
Median = $\dfrac{\left(\frac{n}{2}\right)^{\text{th}} \text{ term} + \left(\frac{n}{2}+1\right)^{\text{th}} \text{ term}}{2}$
where n is the number of observations and the terms are counted in the ordered data.
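
The two median formulas can be sketched in Python as below. The function name `median_by_position` and the sample data are hypothetical; note that the formulas count terms from 1 while Python lists are indexed from 0, hence the `- 1` adjustments.

```python
from statistics import median

def median_by_position(data):
    """Median via the {(n+1)/2}th-term and (n/2)th-term formulas."""
    ordered = sorted(data)           # the data must be ordered first
    n = len(ordered)
    if n % 2 == 1:
        # Odd n: the {(n+1)/2}th term (1-based)
        return ordered[(n + 1) // 2 - 1]
    # Even n: average of the (n/2)th and {(n/2)+1}th terms (1-based)
    lower = ordered[n // 2 - 1]
    upper = ordered[n // 2]
    return (lower + upper) / 2

odd_result = median_by_position([7, 3, 9, 1, 5])   # ordered: 1, 3, 5, 7, 9
even_result = median_by_position([7, 1, 5, 3])     # ordered: 1, 3, 5, 7

print(odd_result)   # 5
print(even_result)  # 4.0
```

The standard library function `statistics.median` gives the same results for both cases.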
MEASURES OF DISPERSION
 Dispersion refers to the variations of the
items among themselves / around an
average.
 Greater the variation amongst different
items of a series, the more will be the
dispersion.
 As per Bowley, “Dispersion is a measure of
the variation of the items”.
METHODS OF MEASURING
DISPERSION
 Range
 Interquartile Range & Quartile Deviation
 Mean Deviation
 Standard Deviation
 Coefficient of Variation
RANGE
 It is the simplest measure of dispersion.
 It is defined as the difference between the
largest and smallest values in the series.
R= L–S
Where R = Range
L = Largest Value
S = Smallest Value
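
A short Python sketch of R = L - S, using hypothetical values:

```python
# Hypothetical series (illustrative only)
series = [12, 7, 3, 18, 9]

largest = max(series)    # L
smallest = min(series)   # S
value_range = largest - smallest  # R = L - S

print(value_range)  # 15
```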
Interquartile Range & Quartile
Deviation

Interquartile Range is the difference between the upper
quartile (Q3) and the lower quartile (Q1).
Symbolically, Interquartile Range = $Q_3 - Q_1$

Quartile Deviation is half of the interquartile range. It is
also called the Semi-Interquartile Range.
Symbolically, Quartile Deviation = $\dfrac{Q_3 - Q_1}{2}$
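
A possible Python sketch of both measures is shown below. The data are hypothetical. Several conventions exist for locating quartiles, so other tools may report slightly different Q1 and Q3 for the same data; this sketch relies on the default method of `statistics.quantiles`.

```python
from statistics import quantiles

# Hypothetical ordered data (illustrative only)
data = [1, 2, 3, 4, 5, 6, 7]

# n=4 splits the data into quartiles and returns the cut points Q1, Q2, Q3
q1, q2, q3 = quantiles(data, n=4)

interquartile_range = q3 - q1                  # Q3 - Q1
quartile_deviation = interquartile_range / 2   # (Q3 - Q1) / 2

print(q1, q3)                # 2.0 6.0
print(interquartile_range)   # 4.0
print(quartile_deviation)    # 2.0
```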
MEAN DEVIATION
 It is also called Average Deviation
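 It is defined as the arithmetic average of the absolute
deviations of the various items of a series, computed from a
measure of central tendency such as the mean or median.
 M.D. from Median = $\dfrac{\sum |X - M|}{N}$, where M is the median
 M.D. from Mean = $\dfrac{\sum |X - \bar{x}|}{N}$, where $\bar{x}$ is the mean
 N is the number of items in the series.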
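
Both forms of the mean deviation can be sketched in Python as follows. The series is hypothetical and the helper name `mean_deviation` is introduced only for this illustration.

```python
from statistics import mean, median

def mean_deviation(series, center):
    """Average of the absolute deviations of the items from `center`."""
    return sum(abs(x - center) for x in series) / len(series)

# Hypothetical series (illustrative only)
series = [1, 2, 3, 4, 10]

md_from_mean = mean_deviation(series, mean(series))      # mean is 4
md_from_median = mean_deviation(series, median(series))  # median is 3

print(md_from_mean)    # 2.4
print(md_from_median)  # 2.2
```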
STANDARD DEVIATION
 Most important & widely used measure of dispersion.
 First used by Karl Pearson in 1893
 Also called the root mean square deviation.
 It is defined as the square root of the arithmetic mean
of the squares of the deviation of the values taken
from the mean.
 Denoted by σ (sigma)
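
The definition above can be followed step by step in Python. The data are hypothetical. The sketch uses the population form (dividing by the number of values), which is what `statistics.pstdev` computes; the sample form, `statistics.stdev`, divides by one less than the number of values.

```python
import math
from statistics import pstdev

# Hypothetical values (illustrative only)
data = [2, 4, 4, 4, 5, 5, 7, 9]

mean_value = sum(data) / len(data)

# Squares of the deviations of the values taken from the mean
squared_deviations = [(x - mean_value) ** 2 for x in data]

# Square root of the arithmetic mean of those squares
sigma = math.sqrt(sum(squared_deviations) / len(data))

print(sigma)         # 2.0
print(pstdev(data))  # 2.0
```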
COEFFICIENT OF VARIATION
 It was developed by Karl Pearson.
 It is an important relative measure of dispersion.
 It is used in comparing the variability, homogeneity,
stability, uniformity & consistency of two or more
series.
 The higher the C.V., the lesser the consistency.
C.V. = $\dfrac{\sigma}{\bar{x}} \times 100$, where σ is the standard deviation and $\bar{x}$ is the mean
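
As an illustration of comparing two series, the sketch below computes the coefficient of variation as the standard deviation divided by the mean, multiplied by 100. Both series are hypothetical, and the population standard deviation is assumed.

```python
from statistics import mean, pstdev

def coefficient_of_variation(series):
    """Standard deviation as a percentage of the mean."""
    return pstdev(series) / mean(series) * 100

# Two hypothetical series (illustrative only)
series_a = [2, 4, 4, 4, 5, 5, 7, 9]
series_b = [48, 50, 52]

cv_a = coefficient_of_variation(series_a)
cv_b = coefficient_of_variation(series_b)

# The series with the lower C.V. is the more consistent one
more_consistent = "B" if cv_b < cv_a else "A"

print(round(cv_a, 2))  # 40.0
print(round(cv_b, 2))  # 3.27
print(more_consistent) # B
```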

Measures Of Asymmetry
(Skewness)

SYMMETRICAL DISTRIBUTION - A frequency
distribution is said to be symmetrical if the frequencies are equally
distributed on both the sides of central value. A symmetrical
distribution may be either bell-shaped or U-shaped.
Skewed (Asymmetric) Distribution

A frequency distribution is said to be skewed if the
frequencies are not equally distributed on both the sides
of the central value. A skewed distribution may be:

 Positively Skewed

 Negatively Skewed
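
One way to tell the two kinds of skewness apart numerically is sketched below. It assumes Pearson's second coefficient of skewness, 3 × (mean − median) / σ, which is not given in the slides and is used here only as one common choice. All three data sets are hypothetical.

```python
from statistics import mean, median, pstdev

def pearson_skewness(data):
    """Pearson's second coefficient: 3 * (mean - median) / sigma."""
    return 3 * (mean(data) - median(data)) / pstdev(data)

# Hypothetical data sets (illustrative only)
symmetrical = [1, 2, 3, 4, 5]
long_right_tail = [1, 2, 2, 3, 3, 3, 4, 10]
long_left_tail = [1, 7, 8, 8, 8, 9, 9, 10]

sk_symmetrical = pearson_skewness(symmetrical)   # zero
sk_positive = pearson_skewness(long_right_tail)  # greater than zero
sk_negative = pearson_skewness(long_left_tail)   # less than zero

print(sk_symmetrical, sk_positive > 0, sk_negative < 0)
```

A positive value indicates a positively skewed distribution (mean above the median), and a negative value a negatively skewed one.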
Measure of Relationship
The correlation coefficient is commonly used to
measure the relationship. It is mostly used for
prediction. The higher the degree of correlation,
the greater the accuracy with which one can
predict a score. Karl Pearson’s coefficient of
correlation is the most frequently used measure
in the case of statistics of variables, whereas
Yule’s coefficient of association is used in the
case of statistics of attributes.
Three main types of correlation have been identified:
1. Positive correlation: Two variables are positively correlated when an increase
in one variable is accompanied by a rise in the other, and a decrease in one by a
reduction in the other. For example, the amount of money a person has might
positively correlate with the number of cars the person owns.
2. Negative correlation: A negative correlation is the opposite of a positive
one. If there is an increase in one variable, the second variable shows a
decrease, and vice versa. For example, the level of education might negatively
correlate with the crime rate: where the level of education in a country is
higher, crime rates tend to be lower. Please note that this doesn't mean that a
lack of education leads to crime. It may only mean that a lack of education and
crime are believed to have a common cause, such as poverty.
3. No correlation: In this third type, there is no relationship between the two
variables. A change in one variable is not accompanied by any consistent change
in the other variable. For example, being a millionaire and being happy may not
be correlated: an increase in money is not necessarily accompanied by an
increase in happiness.
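
The three cases can be illustrated with Karl Pearson's coefficient of correlation, r, sketched below in Python. The function is a plain implementation of the standard formula (sum of products of deviations divided by the product of the root sums of squared deviations), and every data series is hypothetical.

```python
import math

def pearson_r(xs, ys):
    """Karl Pearson's coefficient of correlation for paired values."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of the products of paired deviations from the means
    covariation = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys))
    return covariation / (spread_x * spread_y)

# Hypothetical paired observations (illustrative only)
x = [1, 2, 3, 4, 5]
r_positive = pearson_r(x, [2, 4, 6, 8, 10])   # rises with x: r close to +1
r_negative = pearson_r(x, [10, 8, 6, 4, 2])   # falls as x rises: r close to -1
r_none = pearson_r(x, [2, 1, 3, 1, 2])        # no linear pattern: r close to 0

print(round(r_positive, 2), round(r_negative, 2), round(r_none, 2))
```

Values of r always lie between -1 and +1; the sign gives the direction of the relationship and the magnitude its strength.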
Other Measures
Index numbers and the analysis of time series are some of
the other tools of data analysis.
ANALYSIS OF INDEX NUMBERS
 Index numbers are indicators which reflect the relative
change in the level of a certain phenomenon in a given
period, called the current period, with respect to its value
in some other period, called the base period, selected
primarily for this comparison.

 Example: An index number can be used to compare the
national income of India in the year 2021 with its level at
independence (1947).
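
A minimal sketch of a simple index number in Python, assuming the common form (current-period value ÷ base-period value) × 100. The function name and the figures are hypothetical; they are not actual national income data.

```python
def simple_index(current_value, base_value):
    """Current-period level expressed as a percentage of the base period."""
    return current_value / base_value * 100

# Hypothetical values (illustrative only)
base_period_value = 120.0
current_period_value = 150.0

index_number = simple_index(current_period_value, base_period_value)

print(index_number)  # 125.0
```

An index of 125 means the level in the current period is 25% higher than in the base period, whose own index is 100 by construction.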
Analysis Of Time series
 A time series is an arrangement of statistical data
in accordance with its time of occurrence. If the
values of a phenomenon are observed at different
periods of time, the values so obtained will show
appreciable variations.

 Example: Weather records, economic indicators
and patient health evolution metrics are all
time series data.
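
As a simple illustration, the sketch below arranges observations by their time of occurrence and measures the variation between successive periods. The years and values are hypothetical.

```python
# Hypothetical (year, value) observations, recorded out of order
observations = [(2021, 142), (2019, 125), (2020, 118), (2022, 150)]

# Arrange the data in accordance with time of occurrence
series = sorted(observations)

# Change in the value from each period to the next
changes = [(later_year, later_value - earlier_value)
           for (_, earlier_value), (later_year, later_value)
           in zip(series, series[1:])]

print(series)   # [(2019, 125), (2020, 118), (2021, 142), (2022, 150)]
print(changes)  # [(2020, -7), (2021, 24), (2022, 8)]
```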
Precautions in Analysis of Data

■ Comprehensive knowledge and proper
perspective
■ Take into account all pertinent
elements.
■ Limitations of the study
■ Proper evaluation of data
Diagrammatic Representation of Data

■ A diagram is a visual form for presentation
of statistical data, highlighting their basic
facts and relationships.

■ If we draw diagrams on the basis of the
data collected, they will easily be
understood and appreciated by all.
Types of Diagram
1. GRAPH
2. BAR DIAGRAM
3. PIE CHART
CONCLUSION
In the research process, data analysis is a very
important and scientific step, especially when
the researcher is conducting quantitative
research. The researcher must understand the
research area comprehensively and carry out the
processing, analysis and finally interpretation
with the help of various techniques and tools
of analysis, depending upon the nature, scope
and aims of the research being conducted.
