Professional Documents
Culture Documents
Univariate analysis explores each variable in a data set, separately. It looks at the range of values, as well as the central tendency of the values. It describes the
pattern of response to the variable. It describes each variable on its own.
Descriptive statistics describe and summarize data. Univariate descriptive statistics describe individual variables.
1) Raw Data
Obtain a printout of the raw data for all the variables. Raw data resembles a matrix, with the variable names heading the columns, and the information for
each case or record displayed across the rows.
Example: Raw data for a study of injuries among county workers (first 10 cases)
4 County C Fall 4
5 County B Fall 5
6 County A Violence 9
7 County A Auto 3
8 County A Violence 2
9 County A Violence 9
10 County B Auto 3
It is difficult to tell what is going on with each variable in this data set. Raw data is difficult to grasp, especially with large number of cases or records.
Univariate descriptive statistics can summarize large quantities of numerical data and reveal patterns in the raw data. In order to present the information in a
more organized format, start with univariate descriptive statistics for each variable.
Severity of Injury
3
4
6
4
5
9
3
2
9
3
2) Frequency Distribution
Obtain a frequency distribution of the data for the variable. This is done by identifying the lowest and highest values of the variable, and then putting all the
values of the variable in order from lowest to highest. Next, count the number of appearance of each value of the variable. This is a count of the frequency with
which each value occurs in the data set. For example, for the variable "Severity of Injury," the values range from 2 to 9.
https://web.csulb.edu/~msaintg/ppa696/696uni.htm 2/10
1/15/2021 univariate statistics
3) Grouped Data
The severity of injury ratings can be collapsed into just a few categories or groups. Grouped data usually has from 3 to 7 groups. There should be no groups
with a frequency of zero (for example, there are no injuries with a severity rating of 7 or 8).
One way to construct groups is to have equal class intervals (e.g., 1-3, 4-6, 7-9). Another way to construct groups is to have about equal numbers of
observations in each group. Remember that class intervals must be both mutually exclusive and exhaustive.
4) Cumulative Distributions
Cumulative frequency distributions include a third column in the table (this can be done with either simple frequency distributions or with grouped data):
https://web.csulb.edu/~msaintg/ppa696/696uni.htm 3/10
1/15/2021 univariate statistics
3 3 4
4 2 6
5 1 7
6 1 8
9 2 10
A cumulative frequency distribution can answer questions such as, how many of the injuries were at level 5 or lower? Answer=7
5) Percentage Distributions
Frequencies can also be presented in the form of percentage distributions and cumulative percentages.
Graphing is a way of visually presenting the data. Many people can grasp the information presented in a graph better than in a text format. The purpose of
graphing is to:
https://web.csulb.edu/~msaintg/ppa696/696uni.htm 4/10
1/15/2021 univariate statistics
-avoid distortion
-provoke thought about the data
Bar Graphs
Bar graphs are used to display the frequency distributions for variables measured at the nominal and ordinal levels. Bar graphs use the same width for all the
bars on the graph, and there is space between the bars. Label the parts of the graph, including the title, the left (Y) or vertical axis, the right (X) or horizontal
axis, and the bar labels.
Bar graphs can also be rotated so that the bars are parallel to the horizontal orientation of the page. For example,
https://web.csulb.edu/~msaintg/ppa696/696uni.htm 5/10
1/15/2021 univariate statistics
HISTOGRAM
A histogram is a chart that is similar to a bar chart, but it is used for interval and ratio level variables. With a histogram, the width of the bar is important,
since it is the total area under the bar that represents the proportion of the phenomenon accounted for by each category. The bars convey the relationship of one
group or class of the variable to the other(s).
For example, in the case of the counties and employee injuries, we might have information on the rate of injury according to the number of workers in each
county in State X.
County G 1.8
County H 1.7
County I 1.6
County J 1.0
County K 0.9
County L 0.4
If we group the injury rates into three groups, then a low rate of injury would be 0.0-1.9 injuries per 1,000 workers; moderate would be 2.0-3.9; and high
would be 4.0 and above (in this case, up to 5.9). This could be graphed as follows:
FREQUENCY POLYGON
A frequency polygon is another way of displaying information for an interval or ratio level variable. A frequency polygon displays the area under the curve
that is represented by the values of the variable. This type of chart is also used to show time series graphs, or the changes in rates over time.
https://web.csulb.edu/~msaintg/ppa696/696uni.htm 7/10
1/15/2021 univariate statistics
For example, the following table shows the average injury rate per 1,000 employes for counties in State X for the years 1980 to 1990.
Year 1980 1981 1982 1983 1984 1985 1986 1987 1988 1989 1990
Rate 3.6 4.2 3.4 5.5 3.8 3.1 1.7 1.8 1.0 1.6 0.9
A cumulative frequency polygon is used to display the cumulative distribution of values for a variable.
PIE CHART
Another way to show the relationships between classes or categories of a variable is in a pie or circle chart. In a pie chart, each "slice" represents the
proportion of the total phenomenon that is due to each of the classes or groups.
https://web.csulb.edu/~msaintg/ppa696/696uni.htm 8/10
1/15/2021 univariate statistics
Other ways to look at the sub-groups or classes within one variable is by the relation of each sub-group or class to the whole. This can be calculated with a
proportion. A proportion is obtained by dividing the frequency of observations counted for one group or class (written as f) by the total number of observations
counted for the variable (written as N).
A rate is the relationship between two different numbers, for example, the number of injuries among county workers and the population of the county. This can
be calculated as the first number (N1, or injuries) divided by the second number (N2, or population).
https://web.csulb.edu/~msaintg/ppa696/696uni.htm 9/10
1/15/2021 univariate statistics
Many health statistics are expressed as rates, for example, the birth rate is the number of births per some population, such as number of births per 1,000 women.
https://web.csulb.edu/~msaintg/ppa696/696uni.htm 10/10