
Universidad de Castilla La Mancha.

Escuela Superior de Informática


Statistics

FINAL EXAM (May 25, 2016)

2.- 60% of the messages received on a mail server are spam. Of these, a particular filter detects and rejects
90%, although 2% of non-spam messages are rejected by mistake.
a) Compute the percentage of rejected messages. (0.5 points)
b) Compute the probability that a rejected message was non-spam. (0.5 points)
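
One possible way to check the arithmetic of problem 2 is sketched below, using the law of total probability for part a) and Bayes' rule for part b). The variable names are illustrative and not part of the exam statement.

```python
# Problem 2: total probability and Bayes' rule with the figures from the statement.
p_spam = 0.60                    # share of incoming messages that are spam
p_reject_given_spam = 0.90       # filter rejects 90% of spam
p_reject_given_nonspam = 0.02    # 2% of non-spam messages are rejected by mistake

p_nonspam = 1 - p_spam

# a) Law of total probability: P(rejected)
p_rejected = p_spam * p_reject_given_spam + p_nonspam * p_reject_given_nonspam

# b) Bayes' rule: P(non-spam | rejected)
p_nonspam_given_rejected = p_nonspam * p_reject_given_nonspam / p_rejected

print(f"P(rejected) = {p_rejected:.3f}")
print(f"P(non-spam | rejected) = {p_nonspam_given_rejected:.4f}")
```

With these figures, about 54.8% of messages are rejected, and roughly 1.5% of rejected messages are non-spam.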

3.- A computer scientist is investigating the usefulness of two different design languages in improving
programming tasks. Nine expert programmers, familiar with both languages, are asked to code a standard
function in both languages, and the time (in minutes) is recorded. The data is as follows:
Programmer               1   2   3   4   5   6   7   8   9
Time Design Language 1  17  16  21  14  18  24  16  14  21
Time Design Language 2  18  14  19  11  23  21  10  13  19
a) Compute the sample median of each design language. (0.5 points)
b) Find a 95% confidence interval on the difference in mean coding times. (1 point)
c) Is there any indication that one design language is preferable? (0.5 points)
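
A minimal sketch for problem 3 follows. It assumes a paired design, since the same nine programmers code in both languages, and approximately normal differences. The critical value 2.306 is the tabulated t quantile for 8 degrees of freedom at the 0.975 level, typed in by hand rather than computed.

```python
import math
import statistics

# Coding times in minutes, programmers 1 to 9, copied from the table.
lang1 = [17, 16, 21, 14, 18, 24, 16, 14, 21]
lang2 = [18, 14, 19, 11, 23, 21, 10, 13, 19]

# a) Sample medians
median1 = statistics.median(lang1)
median2 = statistics.median(lang2)

# b) Paired differences (language 1 minus language 2), one per programmer
diffs = [a - b for a, b in zip(lang1, lang2)]
n = len(diffs)
mean_diff = statistics.mean(diffs)
sd_diff = statistics.stdev(diffs)   # sample standard deviation (n - 1 divisor)

T_CRIT = 2.306                      # table value t_{0.025, 8}; assumption: taken from a t table
margin = T_CRIT * sd_diff / math.sqrt(n)
ci_lower, ci_upper = mean_diff - margin, mean_diff + margin

print(f"medians: {median1}, {median2}")
print(f"95% CI for the mean difference: ({ci_lower:.3f}, {ci_upper:.3f})")
```

Under these assumptions the interval contains zero, so for part c) the data give no clear indication that either language is preferable.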

4.- The company Iberian Publishers is interested in a maintenance contract for its new word processing
system. Managers believe that the annual maintenance cost, C, in hundreds of euros, is related to weekly use
time, T (in hours), through the regression equation: C = 10.53 + 0.95 T. The goodness of fit of the data to this
equation is 0.85.
a) Compute and interpret the linear correlation coefficient between the annual maintenance cost and the
weekly use time. (0.5 points)
b) What annual maintenance cost can be predicted for a 30-hour weekly use? (0.5 points)
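
The sketch below works through problem 4 under one assumption that the statement does not spell out: that the "goodness of fit" of 0.85 is the coefficient of determination R². In simple linear regression the correlation coefficient is then the square root of R², with the sign of the slope. The function name `predict_cost` is illustrative.

```python
import math

INTERCEPT = 10.53   # hundreds of euros
SLOPE = 0.95        # hundreds of euros per weekly hour of use
R_SQUARED = 0.85    # assumption: the stated goodness of fit is R^2

# a) Correlation coefficient: sqrt(R^2), carrying the sign of the slope
r = math.copysign(math.sqrt(R_SQUARED), SLOPE)

# b) Predicted annual maintenance cost for a given weekly use time
def predict_cost(weekly_hours):
    """Return the predicted annual cost in hundreds of euros."""
    return INTERCEPT + SLOPE * weekly_hours

cost_30h = predict_cost(30)

print(f"r = {r:.3f}")
print(f"predicted cost for 30 h/week = {cost_30h:.2f} hundreds of euros")
```

Under that reading, r is about 0.92, a strong positive linear relationship, and the predicted cost for 30 hours per week is 39.03 hundreds of euros, that is, about 3903 euros.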

5.- The size of the files produced when scanning images with a particular program can be assumed to be normally
distributed. The program has been improved in its latest version (version B) to the extent that the people who
sell it guarantee a decrease in the mean size of the resulting files of more than 6 Kb with respect to the
previous version (version A). The new version was tested in a research center: 42 similar images were
scanned, 21 with each version, with the following results:
            Mean Size   Sample Variance of Size
Version A   70.8        96.04
Version B   63.9        105.063
a) Can equal variances be assumed for the file sizes produced by the two versions? Use a
significance level of 0.01. (1 point)
b) Assuming equal variances for the file size, what can we say to the people who sell version B? Justify
the answer using the critical region and the p-value. (1 point)
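
The test statistics for problem 5 can be sketched as follows. Two assumptions are made here. For part a), the critical value 3.32 is the tabulated upper 0.005 quantile of the F distribution with (20, 20) degrees of freedom, typed in by hand for a two-sided test at level 0.01. For part b), the guarantee is read as the alternative hypothesis mean_A - mean_B > 6, which is one reasonable reading of the statement. The p-value is left to a t table or statistical software.

```python
import math

# Summary statistics from the table (21 images per version).
n_a = n_b = 21
mean_a, var_a = 70.8, 96.04
mean_b, var_b = 63.9, 105.063

# a) F ratio for equality of variances (larger variance in the numerator)
f_stat = max(var_a, var_b) / min(var_a, var_b)
F_CRIT = 3.32   # table value F_{0.005; 20, 20}; assumption: taken from an F table
equal_variances_plausible = f_stat < F_CRIT

# b) Pooled two-sample t statistic for H0: mean_a - mean_b = 6
#    against H1: mean_a - mean_b > 6 (assumed reading of the guarantee)
CLAIMED_DECREASE = 6.0
df = n_a + n_b - 2
pooled_var = ((n_a - 1) * var_a + (n_b - 1) * var_b) / df
std_error = math.sqrt(pooled_var * (1 / n_a + 1 / n_b))
t_stat = (mean_a - mean_b - CLAIMED_DECREASE) / std_error

print(f"F = {f_stat:.3f} (critical value {F_CRIT})")
print(f"t = {t_stat:.3f} with {df} degrees of freedom")
```

Under these assumptions F is close to 1, so equal variances are not rejected, and t is about 0.29 with 40 degrees of freedom, far below the usual one-sided critical values, so the sample does not support a decrease of more than 6 Kb.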
