Unit 1.2

Discrete variable: Set of possible values either is finite or else can be listed in an infinite sequence
Continuous variable: Can take on an unlimited number of values between the
lowest and highest points of measurement
Frequency
Frequency distribution: Tabular summary of the data showing frequency or relative frequencies of items
in each of several non-overlapping classes.
Relative frequency:
Relative frequency of a value = Number of times the value occurs (f) / Number of observations in the data set (N)
R.F. = f/N
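As a minimal sketch of R.F. = f/N, the snippet below counts each distinct value and divides by the number of observations. The data set and the names `values`, `freq` and `rel_freq` are hypothetical and chosen only for illustration.

```python
from collections import Counter

# Hypothetical discrete data (not from the notes): one count per observation.
values = [0, 1, 1, 2, 2, 2, 3, 1, 0, 2]

N = len(values)          # number of observations in the data set
freq = Counter(values)   # f: number of times each value occurs

# Relative frequency of each value: R.F. = f / N
rel_freq = {v: f / N for v, f in sorted(freq.items())}

for v, rf in rel_freq.items():
    print(v, freq[v], rf)
```

The relative frequencies of all values always add up to 1, which is a quick check on the table.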
Histograms: A graphical display of data using bars of different heights

Constructing a Histogram for Discrete Data:
First, make a frequency distribution of the values.
A histogram is a plot that lets you discover, and show, the underlying frequency
distribution (shape) of a set of continuous data. This allows the inspection of the data for
its underlying distribution (e.g., normal distribution), outliers, skewness, etc. The raw
data used for the example histogram are shown below:
36 25 38 46 55 68 72 55 36 38
67 45 22 48 91 46 52 61 58 55
How do you construct a histogram from a continuous variable?
To construct a histogram from a continuous variable you first need to split the data into
intervals, called classes (or bins). In the example above, age has been split into classes,
with each class representing a 10-year period starting at 20 years. Each class contains
the number of occurrences of scores in the data set that fall within that
class. For the above data set, the frequencies in each class have been tabulated
along with the scores that contributed to the frequency in each class (see below):
Class     Frequency   Scores Included in Class
20-30     2           25, 22
30-40     4           36, 38, 36, 38
40-50     4           46, 45, 48, 46
50-60     5           55, 55, 52, 58, 55
60-70     3           68, 67, 61
70-80     1           72
80-90     0           -
90-100    1           91
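The tabulation above can be reproduced with a short sketch. It assumes that a score falling exactly on a boundary is placed in the upper interval (lower limit <= score < upper limit). The notes leave that choice open, and the names `ages`, `width` and `table` are illustrative.

```python
# Raw data from the notes (20 ages).
ages = [36, 25, 38, 46, 55, 68, 72, 55, 36, 38,
        67, 45, 22, 48, 91, 46, 52, 61, 58, 55]

width = 10                             # each interval spans 10 years
lower_limits = range(20, 100, width)   # 20, 30, ..., 90

# Assumption: boundary scores go to the upper interval (lo <= x < lo + width).
table = {}
for lo in lower_limits:
    table[(lo, lo + width)] = [x for x in ages if lo <= x < lo + width]

for (lo, hi), scores in table.items():
    print(f"{lo}-{hi}  {len(scores)}  {scores}")
```

None of the 20 scores lies on a boundary here, so the other rounding convention would give the same frequencies.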
Notice that, unlike a bar chart, there are no "gaps" between the bars (although some
bars might be "absent", reflecting zero frequencies). This is because a histogram
represents a continuous data set, and as such, there are no gaps in the data (although
you will have to decide whether to round scores on class boundaries up or down).
There is no hard and fast rule for the number of classes, but a reasonable rule of thumb is:
Number of classes ≈ √(Number of observations)
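A minimal sketch of this rule of thumb follows. The function name `suggested_classes` is hypothetical, and rounding to the nearest whole number is an assumption, since the rule only gives a rough starting point.

```python
import math

def suggested_classes(n_observations):
    """Rule-of-thumb number of classes: square root of the number of observations.

    Rounding to the nearest integer is an assumption, not part of the rule.
    """
    return max(1, round(math.sqrt(n_observations)))

n_classes = suggested_classes(20)   # sqrt(20) is about 4.47
print(n_classes)
```

For the 20 ages above, this suggests about 4 classes, while the worked example uses eight 10-year classes. The rule is only a guide, and convenient class limits often take priority.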
After determining the frequencies and relative frequencies, calculate the height of each
rectangle:
Rectangle height = Relative frequency of the class / Class width
The resulting rectangle heights are usually called densities, and the vertical scale is the
density scale. This gives a correct picture whether the class widths are equal or unequal.
Histograms when class widths are unequal

In a histogram, it is the area of the bar that indicates the frequency of occurrences for
each class. This means that the height of the bar does not necessarily indicate how
many scores fall within each individual class.
Relative frequency = (Class width)(Density)
= (Rectangle width)(Rectangle height)
= Rectangle area
It is the height of the bar multiplied by the width of the class that indicates the
frequency of occurrences within that class. One reason the height of the bars, rather
than their area, is often wrongly read as the frequency is that many histograms have
equally wide bars (classes), and under these circumstances the height of the bar does
reflect the frequency.
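The sketch below illustrates the area interpretation with unequal class widths. It reuses the age frequencies from the table but merges 60-100 into one wide class. That merge is an assumption made purely for illustration, and the variable names are hypothetical.

```python
# Age classes with 60-100 merged into one wide class (illustrative assumption).
classes = [(20, 30), (30, 40), (40, 50), (50, 60), (60, 100)]
frequencies = [2, 4, 4, 5, 5]
N = sum(frequencies)   # 20 observations

rel_freqs = [f / N for f in frequencies]
widths = [hi - lo for lo, hi in classes]

# Density (rectangle height) = relative frequency / class width
densities = [rf / w for rf, w in zip(rel_freqs, widths)]

# Rectangle area = width * height, which recovers the relative frequency
areas = [d * w for d, w in zip(densities, widths)]

for (lo, hi), f, d, a in zip(classes, frequencies, densities, areas):
    print(f"{lo}-{hi}: frequency={f}, density={d:.5f}, area={a:.2f}")
```

The classes 50-60 and 60-100 both have frequency 5, but the wider class gets a rectangle one quarter as tall. Only the areas are equal, and all the areas together sum to 1.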
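Shapes of histogram
Unimodal: a single peak
Bimodal: two peaks
Multimodal: more than two peaks
Symmetric: the left half is a mirror image of the right half
Positively skewed: the right (upper) tail is stretched out compared with the left tail
Negatively skewed: the left (lower) tail is stretched out compared with the right tail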
Unit 1.3
Measures of location
Mean / simple mean / arithmetic mean: (x1 + x2 + x3 + ... + xN)/N
Median: Middle value of the series
Order the N observations from smallest to largest
((N+1)/2)th item in an odd series
Average of the (N/2)th item and the ((N/2)+1)th item in an even series
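The mean and median rules can be sketched as follows, using the 20 ages from Unit 1.2 as input. The function names are illustrative. The positions in the notes are 1-based, whereas Python lists are 0-based, which accounts for the index shifts in the comments.

```python
def mean(xs):
    """Arithmetic mean: sum of the observations divided by N."""
    return sum(xs) / len(xs)

def median(xs):
    """Middle value of the ordered series."""
    ordered = sorted(xs)               # smallest to largest
    n = len(ordered)
    mid = n // 2
    if n % 2 == 1:
        return ordered[mid]            # ((N+1)/2)th item, 1-based
    # Even series: average of the (N/2)th and ((N/2)+1)th items, 1-based
    return (ordered[mid - 1] + ordered[mid]) / 2

ages = [36, 25, 38, 46, 55, 68, 72, 55, 36, 38,
        67, 45, 22, 48, 91, 46, 52, 61, 58, 55]

print(mean(ages))     # 1014 / 20
print(median(ages))   # average of the 10th and 11th ordered values (48 and 52)
```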
Trimmed Mean
A trimmed mean is a method of averaging that removes a small designated percentage
of the largest and smallest values before calculating the mean. After removing the
specified outlier observations, the trimmed mean is found using a standard arithmetic
averaging formula. The use of a trimmed mean helps eliminate the influence of outliers
or data points on the tails that may unfairly affect the traditional mean.
2.0 2.4 2.5 2.6 2.6 2.7 2.7 2.8 3.0 3.1 3.2 3.3
3.4 3.4 3.6 3.6 3.6 3.6 3.7 4.4 4.6 4.7 4.8 5.3
N= 26
Deleted items on each end = (Trimming percentage / 100)(N)
Trimming percentage = (Number of items deleted on each end / N) × 100
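A minimal sketch of a trimmed mean follows, assuming the trimming percentage is given as a number such as 10 for 10% and that a fractional count of deleted items is rounded down. The sample data and the function name `trimmed_mean` are hypothetical.

```python
def trimmed_mean(xs, trim_percent):
    """Mean after deleting trim_percent % of the observations from each end."""
    ordered = sorted(xs)
    n = len(ordered)
    # Items deleted on each end = (trimming percentage / 100) * N,
    # rounded down when it is not a whole number (an assumption).
    k = int(n * trim_percent // 100)
    if 2 * k >= n:
        raise ValueError("trimming percentage removes every observation")
    kept = ordered[k:n - k]
    return sum(kept) / len(kept)

# Hypothetical sample with one extreme value on the upper tail.
sample = [2, 4, 5, 5, 6, 6, 7, 8, 9, 48]

print(trimmed_mean(sample, 0))    # ordinary mean, nothing deleted
print(trimmed_mean(sample, 10))   # one item deleted from each end
```

With 10% trimming the extreme value 48 is dropped, so the result sits much closer to the bulk of the data than the ordinary mean does.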
