NOVEMBER 2021
LECTURER: TARAMBAWAMWE P
DURATION: 5 HOURS
Q1 a. For the following data, identify whether they are 1. Categorical [nominal or ordinal], or 2.
Numerical [interval/ratio] [discrete or continuous]. Give examples of possible values for each random
variable. [Example: number of children living in a given home – “interval data [discrete], (0, 1, 2, 3, …)”]
i. marital status
Marital status is a categorical, nominal variable. Any numbers used are shorthand codes for the
categories of the variable. For example
iii. The time a student spends studying for their first statistics test is a numerical continuous
variable; that is, it can take any value within a specified range. For example, a student can study
for anywhere between 0 and 24 hours, such as 1 hour 30 minutes.
iv. The weight loss over the first week of a “fad” diet is a numerical continuous variable.
Nyasha Masese 200182
v. The part on a new computer that breaks during the first year of ownership is a categorical
nominal variable, since the parts have no natural order, for example screen, keyboard or hard drive.
Counting the broken parts instead gives a discrete numerical variable: zero when nothing is broken
on the computer, one when the screen breaks, etc.
b. Given a data set consisting of 75 data values has 109 as the highest value and 29 as the lowest value,
construct the class intervals, showing the class limits of all the classes. [10 marks]
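One possible way to work this out is sketched below. It assumes Sturges' rule for the number of classes and whole-number data values with inclusive class limits; neither assumption is stated in the question, and the function name `class_intervals` is illustrative only. Other rules for choosing the number of classes would give different, equally acceptable intervals.

```python
import math

def class_intervals(n, lowest, highest):
    """Return inclusive (lower, upper) class limits covering lowest..highest.

    Assumptions made for this sketch: Sturges' rule for the number of
    classes, and integer data so that limits can be inclusive.
    """
    # Sturges' rule: k = 1 + 3.322 * log10(n), rounded to a whole number.
    k = round(1 + 3.322 * math.log10(n))
    # Number of distinct integer values to cover, divided over k classes,
    # rounded up so the highest value always falls inside the last class.
    width = math.ceil((highest - lowest + 1) / k)
    limits = []
    lower = lowest
    for _ in range(k):
        upper = lower + width - 1
        limits.append((lower, upper))
        lower = upper + 1
    return limits

# 75 values, lowest 29, highest 109 (figures from the question).
for lower, upper in class_intervals(75, 29, 109):
    print(f"{lower} - {upper}")
```

With these assumptions the range is 109 - 29 = 80, Sturges' rule suggests about 7 classes, and a class width of 12 covers all the data, starting from the class 29 - 40.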
c. With suitable examples, highlight the rules relating to the drawing of cross tables, multiple bar
charts and composite bar charts. Discuss the circumstances which would require the use of each form
of data presentation.
A cross table, also known as a pivot table, is a two-way table with rows and columns that records the
frequency of respondents with specific characteristics. Cross tabulation tables provide a wide range of
information concerning the relationship between variables. A cross table is used when the connection
between variables is not obvious, for example between the hypothetical variables country of residence
and favorite singer. Data can be analyzed several times in a side-by-side sequential format, with column
variables called banners and row variables called stubs. Cross tabulation is most often used on categorical
data, that is, data that can be divided into separate, mutually exclusive groups. It is also used when
analyzing data with relationships that are not obvious, which makes it useful in market
research and for survey responses.
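As an illustration of how such a table is built, the sketch below counts hypothetical survey responses for the two variables mentioned above. The responses, the category labels and the function name `cross_table` are all invented for the example; rows act as stubs and columns as banners.

```python
from collections import Counter

# Hypothetical survey responses: (country of residence, favorite singer).
responses = [
    ("Country A", "Singer X"),
    ("Country A", "Singer Y"),
    ("Country A", "Singer X"),
    ("Country B", "Singer Y"),
    ("Country B", "Singer Y"),
    ("Country B", "Singer X"),
]

def cross_table(pairs):
    """Count how often each (row category, column category) pair occurs."""
    counts = Counter(pairs)
    rows = sorted({r for r, _ in pairs})   # stubs
    cols = sorted({c for _, c in pairs})   # banners
    # A Counter returns 0 for combinations that never occur.
    table = {r: {c: counts[(r, c)] for c in cols} for r in rows}
    return rows, cols, table

rows, cols, table = cross_table(responses)
print(" " * 12 + "".join(f"{c:>10}" for c in cols))
for r in rows:
    print(f"{r:<12}" + "".join(f"{table[r][c]:>10}" for c in cols))
```

Each cell holds the frequency of respondents with that combination of characteristics, which is what makes a relationship between the two variables visible.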
A multiple bar chart shows the relationship between different sets of data values. In a multiple bar
chart, two or more sets of inter-related data are represented side by side. For example, we may want to
represent the imports and exports of a country over several years. The years go on the x axis and the
import and export values on the y axis, with a different color for imports and for exports so that each
is easy to identify. We use a multiple bar chart when we need to compare the same variables across
several groups. It can also be used to compare mini histograms with each other, such that each bar
group represents an interval of a variable.
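The grouping idea can be sketched in plain text. The example below uses made-up trade figures and draws the bars horizontally rather than with years on the x axis, purely so that it runs without a plotting library; the different symbols stand in for the different colors. The function name `multiple_bar_rows` is hypothetical.

```python
# Hypothetical trade figures (arbitrary units), for illustration only.
trade = {
    2019: {"Imports": 8, "Exports": 5},
    2020: {"Imports": 6, "Exports": 7},
}

# One symbol per data series, playing the role of a color in a real chart.
symbols = {"Imports": "#", "Exports": "="}

def multiple_bar_rows(data, symbols):
    """Return one text bar per series, kept together group by group."""
    rows = []
    for group, series in data.items():
        for name, value in series.items():
            rows.append(f"{group} {name:<8}{symbols[name] * value}")
    return rows

for row in multiple_bar_rows(trade, symbols):
    print(row)
```

Bars for the same year sit next to each other, so imports and exports can be compared both within a year and across years.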
A composite bar chart, also known as a stacked bar chart, extends the standard bar chart to show
numeric values across two categorical variables. Each bar is divided into several sub-bars stacked end
to end, each corresponding to a level of the second categorical variable. For example, a clothing
retailer may want to depict revenue for a particular time period across two categorical variables:
location and department. With location as the primary category, locations are shown on the x axis and
revenue on the y axis. The revenues of each department in a given location are then stacked within
that location's bar.
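The stacking step can be sketched as follows, assuming invented revenue figures and department names. For each location, every department's segment starts where the previous one ended, so the top of the last segment is the location's total revenue. The function name `stacked_segments` is illustrative.

```python
# Hypothetical revenue (arbitrary units), keyed by location then department.
revenue = {
    "Location 1": {"Menswear": 40, "Womenswear": 55, "Kids": 20},
    "Location 2": {"Menswear": 30, "Womenswear": 45, "Kids": 35},
}

def stacked_segments(values_by_category):
    """Return (name, bottom, top) for each sub-bar of each stacked bar."""
    segments = {}
    for category, parts in values_by_category.items():
        base = 0
        segments[category] = []
        for name, value in parts.items():
            # Each sub-bar sits directly on top of the previous one.
            segments[category].append((name, base, base + value))
            base += value
    return segments

for location, parts in stacked_segments(revenue).items():
    print(location, parts)
```

A plotting library would draw each tuple as a rectangle from `bottom` to `top` in that department's color.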
Q2a. A quality control manager takes a random sample of 100 packets of biscuits from a production
line in order to check the mean weight of the whole production. The net weights he found are
tabulated below:
Over 253 0
b. Random samples of 1000 persons have been obtained for three countries and their incomes
have been measured. The summary statistics for the per capita income distribution over the three
STATISTIC A B C
Discuss the variation in earnings in the three countries and suggest which country you would
recommend your uncle go to in order to find a job. Use the statistics in the table in your
presentation. [15 marks]
The appropriate measure of variability would be the range, since there are no serious outliers.
= 6000
= 5000
= 3 500
With the given data, I would suggest that my uncle seek employment in country C. Country C has the
lowest standard deviation, which shows that incomes are clustered closer to the mean. This makes the
mean a more reliable guide to what he can expect to earn, unlike country A, whose high standard
deviation means incomes are scattered over a wide wage gap. Country C also has a good average salary,
which makes it a good prospect. The gap between the highest-paying and the lowest-paying jobs is
small, which suggests that general living standards are high.