
Business Analytics

Assignment 1
Name: Yogender Bansal
Roll No.: 11712303920
________________________________________________________________

Q1. Based on the data summarization methods and techniques discussed in class, compile the
advantages and disadvantages of each technique in tabular form, and mention the data type
suitable for each technique.
Ans.

S.No. | Technique | Advantage | Disadvantage | Data Type
1 | Pie Chart | Shows areas proportional to the number of data points in each category | Can be easily manipulated to yield false impressions | Nominal Data
  |  | Summarizes a large data set in visual form | Fails to describe the attribute, behavior, or condition of interest |
  |  | Is visually simpler than other types of graphs and permits a visual check of the reasonableness or accuracy of calculations | Reveals little about central tendency, dispersion, skew, or kurtosis |
  |  | Displays relative proportions of multiple classes of data | Fails to reveal key assumptions, norms, causes, effects, or patterns |

2 | Histogram | Allows key values to be estimated at a glance | Fails to reveal key assumptions, norms, causes, effects, or patterns | Continuous Data
  |  | Clarifies trends better than tables or arrays do | Can be easily manipulated to yield false impressions |
  |  | Shows each interval in the frequency distribution | Can be inadequate to describe the attribute, behavior, or condition of interest |
  |  | Can closely resemble the bell curve if sufficient data and classes are used | Requires additional written or verbal explanation |

3 | Frequency | Summarizes a large data set in visual form | Loses details on relative numbers and proportions vis-a-vis the histogram | Nominal Data
  |  | Begins to show central tendency, dispersion, and clustering/modality | Can be inadequate to describe the attribute, behavior, or condition of interest |
  |  | Allows key values, especially the mean, to be estimated, and shows skew and kurtosis | Fails to delineate each interval in a frequency distribution |
  |  | Clarifies trends better than tables, arrays, and most other graphs do | Requires additional written or verbal explanation |
4 | Bar Graph | Is visually simpler than other types of graphs | Can be easily manipulated to yield false impressions | Nominal Data
  |  | Shows areas proportional to the number of data points in each category | Fails to reveal key assumptions, norms, causes, effects, or patterns |
  |  | Displays relative proportions of multiple classes of data | Fails to describe the attribute, behavior, or condition of interest |
  |  | Permits a visual check of the reasonableness or accuracy of calculations | Reveals little about central tendency, dispersion, skew, or kurtosis |

5 | Mean | It is based on all the values. | It can give misleading conclusions. | Interval Data
  |  | It is rigidly defined. | It has an upward bias. |
  |  | It is not based on the position in the series. | It is affected by extreme values. |
  |  | It is easy to understand and simple to calculate. | It is not appropriate for nominal or ordinal data. |
  |  | The arithmetic average is easy to understand even if some of the details of the data are lacking. | It is sensitive to extreme outliers. |
  |  |  | It cannot be calculated for open-end classes. |
  |  |  | It cannot be located graphically. |

6 | Median | Can be located graphically | Unrealistic | Ordinal Data
  |  | Can be calculated even if the values of all observations are not known | Difficult to rank a large number of data values |
  |  | Simplicity | Lack of representative character |
  |  | Free from the effect of extreme values | Not based on all the observations |
  |  | Certainty |  |
  |  | Real value |  |
  |  | Possible even when the data are incomplete |  |

7 | Mode | Simple and popular | Not capable of algebraic treatment | Nominal Data
  |  | Less affected by marginal values | Difficult |
  |  | Best representative | Ignores extreme marginal frequencies |
  |  | Can be located graphically | Complex procedure of grouping |
  |  | No need to know all the items or frequencies | Uncertain and vague |

8 | Standard Deviation | Gives a more accurate idea of how the data are distributed | Only used with data where an independent variable is plotted against its frequency | Continuous Data
  |  | Shows how much of the data is clustered around the mean value | Does not give the full range of the data |
  |  | Not as affected by extreme values as the range | Can be hard to calculate |
  |  |  | Assumes a normal distribution pattern |
9 | Range | Easy to compute and understand | Very much affected by extreme values | Ordinal Data
  |  | Communicates information of interest to readers of a report | Cannot be computed for an open-end distribution |
  |  | Best for symmetric data with no outliers | Not based on each and every item of the distribution |
  |  | Good option for ordinal data | Its value is more affected by sampling fluctuations |

10 | Interquartile Range | Important in evaluating outliers | It is a positional measure, based only on the twenty-fifth and seventy-fifth percentiles | Ordinal Data
  |  | Reduces the influence of outliers and extreme scores in expressing variability | The standard deviation and variance remedy this defect, because they take the value associated with each case into account |
  |  | Uses more information than the range |  |
  |  | Appropriate as an index of variability with ordinal measures |  |
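
The measures in items 5 to 10 of the table can be compared on a small data set. The following is a minimal Python sketch and not part of the original answer: the function name `summarize` and the sample values are hypothetical, the standard deviation is the population form (`statistics.pstdev`), and the quartiles use the default "exclusive" method of `statistics.quantiles`, which is only one of several quartile conventions.

```python
import statistics


def summarize(values):
    """Return common measures of central tendency and dispersion."""
    ordered = sorted(values)
    # quantiles(n=4) returns the three cut points Q1, Q2 (median) and Q3.
    q1, _, q3 = statistics.quantiles(ordered, n=4)
    return {
        "mean": statistics.mean(ordered),
        "median": statistics.median(ordered),
        "mode": statistics.mode(ordered),
        "std_dev": statistics.pstdev(ordered),  # population standard deviation
        "range": ordered[-1] - ordered[0],
        "iqr": q3 - q1,
    }


# Hypothetical sample: the same values with and without one extreme value (95).
with_outlier = [12, 15, 15, 18, 20, 22, 95]
without_outlier = [12, 15, 15, 18, 20, 22]

for name, sample in [("with outlier", with_outlier),
                     ("without outlier", without_outlier)]:
    print(name)
    for measure, value in summarize(sample).items():
        print(f"  {measure}: {value:.2f}")
```

In this sample, removing the extreme value moves the mean from about 28.1 to 17 but the median only from 18 to 16.5. The range falls from 83 to 10, while the interquartile range changes from 7 to 6.25 under this quartile convention, which illustrates the sensitivity to extreme values listed in the table.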

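The graphical techniques in items 1 to 4 of the table rest on two kinds of counts: category counts for nominal data (pie chart, bar graph) and interval counts for continuous data (histogram). The Python sketch below is illustrative only: the function names `category_shares` and `histogram_counts` and the sample data are hypothetical, and no plotting library is assumed.

```python
from collections import Counter


def category_shares(labels):
    """Map each category to (count, share of total), the basis of a pie or bar chart."""
    counts = Counter(labels)
    total = sum(counts.values())
    return {label: (count, count / total) for label, count in counts.items()}


def histogram_counts(values, bin_width, start):
    """Count values in equal-width intervals [low, low + bin_width), starting at `start`."""
    counts = Counter()
    for value in values:
        index = int((value - start) // bin_width)  # which interval the value falls in
        counts[index] += 1
    return {
        (start + i * bin_width, start + (i + 1) * bin_width): counts[i]
        for i in sorted(counts)
    }


# Hypothetical nominal data: preferred product per customer.
products = ["A", "B", "A", "C", "A", "B"]
print(category_shares(products))

# Hypothetical continuous data: heights in cm, grouped into 5 cm intervals.
heights = [150.5, 152.0, 158.3, 161.0, 164.9, 165.0, 171.2]
print(histogram_counts(heights, bin_width=5, start=150))
```

The choice of `bin_width` and `start` changes the shape of the resulting histogram, which is one way a chart can be manipulated to give a false impression.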