This statistical project focuses on vehicles. The data are a sample of 93 vehicles drawn from a population of different car models by different manufacturers in the USA. The 93 vehicles are categorized into six types: compact, small, mid-size, large, sporty and van. Two aims guide the analysis of the data set.

The first aim is to compare price and mpg across the six vehicle categories. To achieve it, the price variables (basic price and top price) and the mpg variables (mpg town and mpg best) are grouped by car type. The Analysis ToolPak in Excel is then used to create a table of descriptive statistics for each variable: mean, median, standard deviation, range, skew measure, standard error, sum and count, with the coefficient of variation added manually. Each table is followed by an interpretation covering the differences in mean price or mean mpg across vehicle types and their possible causes, and the shape of the distribution for each vehicle type and what it means. Confidence intervals are used to make statements about prices and mpg in the population of cars. Several difference-of-means tests are carried out on both prices and mpg for pairs of car types whose means do not differ much. Probability, based on a frequency table, is used to find the likelihood of a cheap car or an efficient car across categories, and the apparent pattern is tested using chi-squared tests.

The second aim is to predict mpg. Correlation is used to look for linear association between town or best mpg and the other variables in the data set. A correlation matrix, or tables of comparative correlations of the mpg measures against the other variables, is shown with comments on what the correlation values imply. Scatter diagrams of mpg against the variables useful for prediction are shown together with the r² value and the equation of the line. Comments on what the r² value tells us about the model used to predict mpg, and an explanation of what the intercept and the slope mean in context, are essential.


PART 1, AIM: COMPARING PRICES AND MPG ACROSS VEHICLE CATEGORIES

1. (A) COMPARISONS OF VARIABLE DISTRIBUTIONS

TABLE SHOWING THE DESCRIPTIVE MEASURES OF DIFFERENT CAR TYPES WITH BASIC PRICE
(price measures in $'000, rounded to four decimal places)

| Type of measure | Compact | Small | Mid-size | Large | Sporty | Van |
|---|---|---|---|---|---|---|
| Mean | 15.6938 | 8.4285 | 24.1136 | 22.9364 | 16.8571 | 16.2000 |
| Standard Error | 1.4683 | 0.3258 | 2.1645 | 1.8877 | 2.1101 | 0.6760 |
| Median | 14.05 | 8.2 | 23.05 | 19.9 | 13.7 | 16.6 |
| Standard Deviation | 5.8732 | 1.4930 | 10.1523 | 6.2607 | 7.8953 | 2.0279 |
| Coefficient of variation | 0.3742 | 0.1771 | 0.4210 | 0.2732 | 0.4684 | 0.1252 |
| Skewness | 1.0522 | 1.5958 | 0.6867 | 1.1448 | 1.5386 | 0.5133 |
| Range | 20.5 | 6.2 | 33 | 16.9 | 25.5 | 5.9 |
| Minimum | 8.5 | 6.7 | 12.4 | 17.5 | 9.1 | 13.6 |
| Maximum | 29 | 12.9 | 45.4 | 34.4 | 34.6 | 19.5 |
| Sum | 251.1 | 177 | 530.5 | 252.3 | 236 | 145.8 |
| Count | 16 | 21 | 22 | 11 | 14 | 9 |

Fig. 1 basic price

INTERPRETATION OF DESCRIPTIVE STATISTICS

MEAN
Fig. 1 above shows mean basic prices of $15 694 for compact cars, $8 429 for small cars, $24 114 for mid-size cars, $22 936 for large cars, $16 857 for sporty cars and $16 200 for vans. Mid-size cars have the highest mean price. A possible reason is their average engine size of 3.1 litres, second only to large cars, which have the highest average engine size of 4.1 litres. The high number of airbags in mid-size cars also adds to their cost. Small cars show the lowest mean price, $8 429, which is about $15 685 below that of mid-size cars, a very large gap. This might be because the mean engine size of small cars is 1.6 litres against 3.1 litres for mid-size cars, a difference of about 1.5 litres (1.464 litres before rounding). Another possible cause is that the mid-size cars have a total of 25 airbags whilst the small cars have only 5.

MEDIAN
The figure above shows small cars having the lowest median price, $8 200. They are followed by sporty cars with a median price of $13 700, then compact cars with $14 050. Next in the sequence are vans with $16 600, then large cars with $19 900 and lastly mid-size cars with $23 050. Unlike the mean, the median is not affected by extremely low or high prices.

SKEW MEASURE
Fig. 1 shows the skewness of compact cars as 1.052169706, small cars as 1.595838501, mid-size cars as 0.686663723, large cars as 1.144790306, sporty cars as 1.538621069 and vans as 0.513253608. All six car types therefore have positive skewness: a few relatively expensive models stretch the distribution towards high prices. Vans have the smallest skew measure and are the only type whose mean ($16 200) lies below the median ($16 600). This might be related to the fact that, of all the car types, vans have the lowest number of airbags (3), whereas the other car types have an average of 14.4 airbags.

COEFFICIENT OF VARIATION
Compact cars have a relatively high coefficient of variation (CV) of 0.3742, which means their prices are widely spread around the mean price. Small cars have a CV of 0.1771, mid-size cars 0.4210 and large cars 0.2732. Sporty cars have the highest CV, 0.4684, and vans the lowest, 0.1252. Prices are therefore most dispersed, with some extreme values, for sporty, mid-size and compact cars, while for vans the prices lie close to the mean.

RANGE
In figure 1 above, compact cars have a range in basic price of $20 500, small cars $6 200, mid-size cars $33 000, large cars $16 900, sporty cars $25 500 and vans $5 900. Mid-size cars have the highest range and vans the lowest; the difference between the two ranges is $27 100, which is very large and could be because mid-size cars include far more expensive models than vans. However, the difference in range between mid-size cars and sporty cars is only $7 500, which could be because sporty cars and mid-size cars cover almost the same span of prices.
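The measures interpreted above can be reproduced outside Excel. The sketch below is a minimal illustration, not the procedure used in the project: the function name `describe` and the price list are hypothetical, and the skewness line assumes the adjusted sample formula that Excel documents for its SKEW function.

```python
import math
import statistics


def describe(values):
    """Return the descriptive measures used in the tables for one car type."""
    n = len(values)
    mean = statistics.mean(values)
    sd = statistics.stdev(values)  # sample standard deviation (n - 1 divisor)
    # Adjusted sample skewness (assumed to match Excel's SKEW); needs n >= 3
    skew = n / ((n - 1) * (n - 2)) * sum(((x - mean) / sd) ** 3 for x in values)
    return {
        "mean": mean,
        "se": sd / math.sqrt(n),          # standard error of the mean
        "median": statistics.median(values),
        "sd": sd,
        "cv": sd / mean,                  # coefficient of variation, added manually in the report
        "skewness": skew,
        "range": max(values) - min(values),
        "min": min(values),
        "max": max(values),
        "sum": sum(values),
        "count": n,
    }


# Hypothetical basic prices in $'000 (NOT the project data)
prices = [8.5, 11.1, 13.4, 14.0, 14.1, 16.3, 18.2, 29.0]
for name, value in describe(prices).items():
    print(f"{name:>8}: {value:.4f}")
```

A CV well below the others, as for vans, simply reflects a standard deviation that is small relative to the mean.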

**TABLE SHOWING THE DESCRIPTIVE MEASURES OF DIFFERENT CAR TYPES WITH TOP PRICE**
(price measures in $'000, rounded to four decimal places)

| Type of measure | Compact | Small | Mid-size | Large | Sporty | Van |
|---|---|---|---|---|---|---|
| Mean | 20.7250 | 11.9048 | 30.3136 | 25.6727 | 21.9571 | 22.0333 |
| Standard Error | 1.9902 | 0.6117 | 3.2162 | 2.0107 | 2.2913 | 1.0031 |
| Median | 18.5 | 11.3 | 27.35 | 21.9 | 21.2 | 21.7 |
| Standard Deviation | 7.9609 | 2.8033 | 15.0855 | 6.6687 | 8.5731 | 3.0092 |
| Coefficient of variation | 0.3841 | 0.2355 | 0.4976 | 0.2598 | 0.3904 | 0.1366 |
| Skewness | 0.9496 | 0.9182 | 1.8170 | 0.9124 | 1.0293 | 0.1245 |
| Range | 25.7 | 10.9 | 65.1 | 19.4 | 30.5 | 8.6 |
| Minimum | 11.4 | 7.9 | 14.9 | 18.4 | 11 | 18 |
| Maximum | 37.1 | 18.8 | 80 | 37.8 | 41.5 | 26.6 |
| Sum | 331.6 | 250 | 666.9 | 282.4 | 307.4 | 198.3 |
| Count | 16 | 21 | 22 | 11 | 14 | 9 |

Fig. 2 top price

INTERPRETATION OF DESCRIPTIVE STATISTICS

MEAN
The figure above shows that compact cars have a mean top price of $20 725, small cars $11 905, mid-size cars $30 314, large cars $25 673, sporty cars $21 957 and vans $22 033. Mid-size cars have the highest mean price, which means that they are the most expensive cars. A possible reason is that they have more airbags than any other car type, and this feature adds to their cost: mid-size cars are the only car type with a total of 25 airbags, whereas the other car types have an average of 10 airbags. The figure also shows that small cars have the lowest mean price, which may be because they have the smallest average engine size of just 1.6 litres.

MEDIAN
The median top price of compact cars is $18 500, that of small cars is $11 300 and that of mid-size cars is $27 350. The median for large cars is $21 900, for sporty cars $21 200 and for vans $21 700. Mid-size cars have the highest median price and small cars the lowest; the difference between these two medians is $16 050.

SKEWNESS
Fig. 2 shows the skewness of compact cars as 0.94958771, small cars as 0.918155113, mid-size cars as 1.816956584, large cars as 0.912413115, sporty cars as 1.029276406 and vans as 0.124539188. All car types show positive skewness, which could be because models within a type differ in their features and extras.

COEFFICIENT OF VARIATION
Looking at figure 2, the coefficient of variation values for compact, small, mid-size, sporty and van cars are 0.3841, 0.2355, 0.4976, 0.3904 and 0.1366 respectively. The high values for mid-size, sporty and compact cars mean that their prices are widely spread around the mean, with some prices very high and some very low; small cars and vans are less dispersed. The coefficient of variation for large cars is 0.2598, which shows prices spread fairly closely around the mean price.

RANGE
Figure 2 shows that mid-size cars have the highest range in top price, $65 100, followed by sporty cars with $30 500 and compact cars with $25 700. Large cars follow with a range of $19 400 and small cars with $10 900. Lastly, vans show the lowest range in top price, $8 600. The difference between the range of mid-size cars, which is higher than that of any other car type, and that of vans, which is the lowest, is $56 500. These differences in the range of top price could be due to some prices being extreme.

TABLE SHOWING THE DESCRIPTIVE MEASURES OF DIFFERENT CAR TYPES WITH MPG TOWN
(mpg measures in miles per gallon, rounded to four decimal places)

| Type of measure | Compact | Small | Mid-size | Large | Sporty | Van |
|---|---|---|---|---|---|---|
| Mean | 22.6875 | 29.8571 | 19.5455 | 18.3636 | 21.7857 | 17.0000 |
| Standard Error | 0.4806 | 1.3332 | 0.4041 | 0.4527 | 1.0440 | 0.4082 |
| Median | 23 | 29 | 19 | 19 | 22.5 | 17 |
| Standard Deviation | 1.9225 | 6.1097 | 1.8955 | 1.5015 | 3.9062 | 1.2247 |
| Coefficient of variation | 0.0847 | 0.2046 | 0.0970 | 0.0818 | 0.1793 | 0.0720 |
| Skewness | -0.0058 | 1.2879 | 0.0381 | -0.5460 | 0.4768 | -1.0498 |
| Range | 6 | 24 | 7 | 4 | 13 | 3 |
| Minimum | 20 | 22 | 16 | 16 | 17 | 15 |
| Maximum | 26 | 46 | 23 | 20 | 30 | 18 |
| Sum | 363 | 627 | 430 | 202 | 305 | 153 |
| Count | 16 | 21 | 22 | 11 | 14 | 9 |

Fig. 3 MPG town

INTERPRETATION OF DESCRIPTIVE STATISTICS

MEAN
Figure 3 shows that the mean town mpg is 22.688 for compact cars, 29.857 for small cars, 19.545 for mid-size cars, 18.364 for large cars, 21.786 for sporty cars and 17.000 for vans. Small cars have the highest mean mpg because they have a relatively small engine size, so they travel more miles per gallon than vans, which have the lowest mean mpg as they have bigger engines. Small cars are therefore the most fuel efficient and can travel further than vans on the same amount of fuel.

MEDIAN
Figure 3 shows that the median values of mpg town, arranged in ascending order, are 17 for vans, 19 for large and mid-size cars, 22.5 for sporty cars, 23 for compact cars and 29 for small cars. The middle of these six medians is 20.75 mpg, as half of the values lie above it and half below it. The median is not affected by extremely high or low values.

SKEWNESS
Figure 3 shows that compact cars have a skew measure of -0.00578058, small cars 1.287889433, mid-size cars 0.038136436, large cars -0.546043701, sporty cars 0.476793571 and vans -1.049781318. Small cars show a considerable positive skew and sporty cars a moderate one. Large cars and vans have negative skewness, while compact and mid-size cars are close to symmetric. For vans the mean and median mpg are the same (17), even though the skew measure is negative.

COEFFICIENT OF VARIATION
The coefficient of variation is 0.084736309 for compact cars, 0.204631476 for small cars, 0.096981139 for mid-size cars, 0.081765634 for large cars, 0.179300063 for sporty cars and 0.072043815 for vans. Small and sporty cars are the most dispersed relative to their means, so they include some values that are well above or below the mean mpg; for the other car types the values lie close to the mean.

RANGE
The figure above indicates that compact cars have a range of 6 mpg, small cars 24 mpg, mid-size cars 7 mpg, large cars 4 mpg, sporty cars 13 mpg and vans 3 mpg. Small cars have the highest range in mpg, 24, which could be due to their small engine size and low fuel consumption rate. Vans display the smallest range among all the car types, and therefore we can conclude that there is less dispersion in their mpg.

TABLE SHOWING THE DESCRIPTIVE MEASURES OF DIFFERENT CAR TYPES WITH MPG BEST
(mpg measures in miles per gallon, rounded to four decimal places)

| Type of measure | Compact | Small | Mid-size | Large | Sporty | Van |
|---|---|---|---|---|---|---|
| Mean | 29.8750 | 35.4762 | 26.7273 | 26.7273 | 28.7857 | 21.8889 |
| Standard Error | 0.7353 | 1.2240 | 0.3835 | 0.3835 | 0.9731 | 0.4843 |
| Median | 30 | 33 | 26.5 | 26 | 28.5 | 22 |
| Standard Deviation | 2.9411 | 5.6091 | 1.2721 | 1.2721 | 3.6412 | 1.4530 |
| Coefficient of variation | 0.0984 | 0.1581 | 0.0476 | 0.0476 | 0.1265 | 0.0664 |
| Skewness | 0.5891 | 1.1846 | 0.1216 | 0.0913 | 0.5018 | 0.0712 |
| Range | 10 | 21 | 9 | 3 | 12 | 4 |
| Minimum | 26 | 29 | 22 | 25 | 24 | 20 |
| Maximum | 36 | 50 | 31 | 28 | 36 | 24 |
| Sum | 478 | 745 | 588 | 294 | 403 | 197 |
| Count | 16 | 21 | 22 | 11 | 14 | 9 |

Fig. 4 MPG best

INTERPRETATION OF DESCRIPTIVE STATISTICS

MEAN
Fig. 4 shows that the mean best mpg is 29.875 for compact cars, 35.476 for small cars, 26.727 for mid-size cars, 26.727 for large cars, 28.786 for sporty cars and 21.889 for vans. Small cars have the highest mpg, 35.476: they have a relatively small engine size of 1.6 litres per car and can therefore travel a long distance on a gallon of fuel. In contrast, vans have the lowest mpg, 21.889: they have a large engine size of about 3.2 litres per car, so they cover short distances on a gallon of fuel. Small cars have a lower consumption rate than vans.

MEDIAN
According to the figure above, vans have the smallest median best mpg, 22. Large cars follow with 26, then mid-size cars with 26.5 and sporty cars with 28.5. Compact cars have a median of 30 and lastly small cars have a median of 33. The median mpg values are unaffected by extremely low or high values.

SKEWNESS
According to the figure above, small cars have the highest skew measure, 1.1846, followed by compact cars with 0.5891 and sporty cars with 0.5018. Mid-size cars (0.1216), large cars (0.0913) and vans (0.0712) follow. Skewness is positive across all car types, which indicates a longer tail towards high mpg values. Mid-size cars, large cars and vans are close to symmetric, which might be related to their engine sizes.

COEFFICIENT OF VARIATION
The coefficient of variation is 0.0984 for compact cars, 0.1581 for small cars, 0.0476 for mid-size cars, 0.0476 for large cars, 0.1265 for sporty cars and 0.0664 for vans. Small and sporty cars are the most dispersed relative to their means, so they include some values that are well above or below the mean mpg.

RANGE
The range of compact cars is 10 mpg, small cars 21 mpg, mid-size cars 9 mpg, large cars 3 mpg, sporty cars 12 mpg and vans 4 mpg. Small cars have the highest mpg range, 21, and large cars have the lowest, 3. This means that there is less dispersion in the mpg of large cars than in that of small cars: the mpg of large cars is clustered closely around the mean, whereas that of small cars is dispersed away from the mean.

(B) HYPOTHESIS TEST OF TWO MEANS
We undertook hypothesis testing to test the interesting differences between car types for prices and mpg. The t-test was used as n was less than 30.

BASIC PRICE

COMPACT AND VAN CAR TYPES
We tested the difference in mean basic prices because there was not much difference between their prices.
1. H₀: μ₁ = μ₂; H₁: μ₁ ≠ μ₂
2. α = 0.05
3. We used the t-test because n is less than 30 (n₁ = 16 and n₂ = 9) and the population standard deviation is not known.
4. T-statistic = -0.313191922. Critical value = 2.086. Decision rule: reject H₀ if tc < -2.086 or tc > 2.086.
5. Therefore, at the 5% level of significance, we do not reject the null hypothesis. There is no significant evidence that the two sample means come from different populations.

SPORTY AND VAN CAR TYPES
There is only a slight difference in means between these vehicle types.
1. H₀: μ₁ = μ₂; H₁: μ₁ ≠ μ₂
2. α = 0.05
3. t-test, because n is less than 30 (n₁ = 14 and n₂ = 9) and the population standard deviation is not known.
4. T-statistic = 0.296577999. Critical value = 2.120, therefore reject H₀ if tc < -2.120 or tc > 2.120.
5. Therefore, at the 5% level of significance the null hypothesis is not rejected. There is no significant evidence that the two sample means come from different populations.

TOP PRICE

SPORTY AND VAN CAR TYPES
1. H₀: μ₁ = μ₂; H₁: μ₁ ≠ μ₂
2. α = 0.05
3. t-test, because n is less than 30 (n₁ = 14 and n₂ = 9) and the population standard deviation is not known.
4. Critical value = 2.179. Reject H₀ if tc < -2.179 or tc > 2.179.
5. T statistic = -0.030460311. At the 5% level of significance the null hypothesis is not rejected. There is no significant evidence of a difference, so sporty and van car types can be treated as equal in mean top price.

SPORTY AND COMPACT CAR TYPES
1. H₀: μ₁ = μ₂; H₁: μ₁ ≠ μ₂
2. α = 0.05
3. t-test, because n is less than 30 (n₁ = 16 and n₂ = 14) and the population standard deviation is not known.
4. Critical value = 2.052. Reject H₀ if tc < -2.052 or tc > 2.052.
5. T statistic = -0.405985032. At the 5% level of significance we do not reject the null hypothesis. There is no significant evidence of a difference, so sporty and compact car types can be treated as equal in mean top price.

MPG TOWN

COMPACT AND SPORTY CAR TYPES
A hypothesis test is carried out to test the slight difference in means between these two car types.
1. H₀: μ₁ = μ₂; H₁: μ₁ ≠ μ₂
2. α = 0.05
3. t-test, because n is less than 30 (n₁ = 16 and n₂ = 14) and the population standard deviation is not known.
4. Critical value = 2.101. Reject H₀ if tc < -2.101 or tc > 2.101.
5. T statistic = 0.7846646963. At the 5% level of significance the null hypothesis is not rejected. There is no significant evidence of a difference, so sporty and compact car types can be treated as belonging to the same population.
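The two-sample tests in this section can be checked from the descriptive tables alone. The sketch below is an illustration under an assumption: it uses the unequal-variance (Welch) form of the t statistic, which reproduces the reported value for compact versus van basic price, but the report does not state which Excel option was used. The function name `welch_t` is hypothetical, and the critical value is taken from the report rather than computed.

```python
import math


def welch_t(mean1, se1, n1, mean2, se2, n2):
    """Two-sample t statistic and degrees of freedom, unequal variances assumed."""
    var1, var2 = se1 ** 2, se2 ** 2  # squared standard errors of the two means
    t = (mean1 - mean2) / math.sqrt(var1 + var2)
    # Welch-Satterthwaite approximation to the degrees of freedom
    df = (var1 + var2) ** 2 / (var1 ** 2 / (n1 - 1) + var2 ** 2 / (n2 - 1))
    return t, df


# Basic price, compact vs van: means and standard errors from Fig. 1 ($'000)
t, df = welch_t(15.69375, 1.468288935, 16, 16.2, 0.67597666, 9)
critical = 2.086  # two-tailed critical value at alpha = 0.05, as quoted in the report
reject = abs(t) > critical
print(f"t = {t:.4f}, df = {df:.1f}, reject H0: {reject}")
```

Failing to reject H₀ means only that the sample shows no significant difference; it does not prove that the two population means are equal.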

MID-SIZE AND LARGE CAR TYPES
The difference in means between these two vehicle types was so small that we had to carry out a hypothesis test.
1. H₀: μ₁ = μ₂; H₁: μ₁ ≠ μ₂
2. α = 0.05
3. t-test, because n is less than 30 (n₁ = 22 and n₂ = 11) and the population standard deviation is not known.
4. Critical value = 2.060. Reject H₀ if tc < -2.060 or tc > 2.060.
5. T statistic = 1.947428329. At the 5% level of significance the null hypothesis is not rejected. There is no significant evidence of a difference, so mid-size and large car types can be treated as belonging to the same population.

MPG BEST

MID-SIZE AND LARGE CAR TYPES
We carried out a hypothesis test as follows to test the difference in means for the above car types.
1. H₀: μ₁ = μ₂; H₁: μ₁ ≠ μ₂
2. α = 0.05
3. t-test, because n is less than 30 (n₁ = 22 and n₂ = 11) and the population standard deviation is not known.
4. Critical value = 2.060. Reject H₀ if tc < -2.060 or tc > 2.060.
5. T statistic = 0. At the 5% level of significance the null hypothesis is not rejected. There is no significant evidence of a difference, so mid-size and large car types can be treated as equal and belonging to the same population.

COMPACT AND SPORTY CAR TYPES
A t-test is undertaken to test the difference in means between these two vehicle types.
1. H₀: μ₁ = μ₂; H₁: μ₁ ≠ μ₂
2. α = 0.05
3. t-test, because n is less than 30 (n₁ = 16 and n₂ = 14) and the population standard deviation is not known.
4. Critical value = 2.086. Reject H₀ if tc < -2.086 or tc > 2.086.
5. T statistic = 0.893084498. At the 5% level of significance the null hypothesis is not rejected. There is no significant evidence of a difference, so sporty and compact car types can be treated as belonging to the same population.

(C) CONFIDENCE INTERVAL
Here confidence intervals are used on prices and mpg in the population of cars. A 95% confidence level was used, so each interval extends 1.96 standard errors either side of the sample mean (rather than 2.58, the value for 99% confidence).

A TABLE SHOWING CONFIDENCE INTERVAL ACROSS CAR TYPES

BASIC PRICE

| Car types | Mean price ($'000) | Confidence interval ($'000) |
|---|---|---|
| Compact | 15.694 | 12.816 to 18.572 |
| Small | 8.429 | 7.790 to 9.067 |
| Mid-size | 24.114 | 19.871 to 28.356 |
| Large | 22.936 | 19.237 to 26.636 |
| Sporty | 16.857 | 12.721 to 20.993 |
| Van | 16.200 | 14.875 to 17.525 |

Figure 5

From figure 5, there is 95% confidence that the true population mean for compact cars lies between $12 816 and $18 572, for small cars between $7 790 and $9 067, for mid-size cars between $19 871 and $28 356, for large cars between $19 237 and $26 636, for sporty cars between $12 721 and $20 993 and lastly for vans between $14 875 and $17 525.

TOP PRICE

| Car types | Mean price ($'000) | Confidence interval ($'000) |
|---|---|---|
| Compact | 20.725 | 16.824 to 24.626 |
| Small | 11.905 | 10.706 to 13.104 |
| Mid-size | 30.314 | 24.010 to 36.617 |
| Large | 25.673 | 21.732 to 29.614 |
| Sporty | 21.957 | 17.466 to 26.448 |
| Van | 22.033 | 20.067 to 23.999 |

Figure 6

Figure 6 shows that there is 95% confidence that the population mean for compact cars lies between $16 824 and $24 626, for small cars between $10 706 and $13 104, for mid-size cars between $24 010 and $36 617 and for large cars between $21 732 and $29 614. Again, sporty cars lie between $17 466 and $26 448 and lastly vans lie between $20 067 and $23 999.

MPG TOWN

| Car types | Mean mpg | Confidence interval (mpg) |
|---|---|---|
| Compact | 22.7 | 21.7 to 23.6 |
| Small | 29.9 | 27.2 to 32.5 |
| Mid-size | 19.5 | 18.8 to 20.3 |
| Large | 18.4 | 17.5 to 19.3 |
| Sporty | 21.8 | 19.7 to 23.8 |
| Van | 17.0 | 16.2 to 17.8 |

Figure 7

Figure 7 above shows that there is 95% confidence that the true population mean for compact cars lies between 21.7 and 23.6 mpg, for small cars between 27.2 and 32.5 mpg, for mid-size cars between 18.8 and 20.3 mpg, for large cars between 17.5 and 19.3 mpg and for sporty cars between 19.7 and 23.8 mpg. Lastly, vans lie between 16.2 and 17.8 mpg.

MPG BEST

| Car types | Mean mpg | Confidence interval (mpg) |
|---|---|---|
| Compact | 29.9 | 28.4 to 31.3 |
| Small | 35.5 | 33.1 to 37.9 |
| Mid-size | 26.7 | 26.0 to 27.5 |
| Large | 26.7 | 26.0 to 27.5 |
| Sporty | 28.8 | 26.9 to 30.7 |
| Van | 21.9 | 20.9 to 22.8 |

Figure 8

Figure 8 above reveals that there is 95% confidence that the true population mean for compact cars lies between 28.4 and 31.3 mpg, for small cars between 33.1 and 37.9 mpg, for mid-size cars between 26.0 and 27.5 mpg, for large cars between 26.0 and 27.5 mpg and for sporty cars between 26.9 and 30.7 mpg, whereas vans lie between 20.9 and 22.8 mpg.
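The intervals in this section follow from the mean and standard error in the descriptive tables. As a minimal sketch (the function name `confidence_interval` is hypothetical), the compact basic price interval can be reproduced as mean ± 1.96 × standard error:

```python
def confidence_interval(mean, se, z=1.96):
    """Return (lower, upper) limits of the interval mean +/- z * se."""
    margin = z * se
    return mean - margin, mean + margin


# Compact cars, basic price: mean and standard error from Fig. 1 ($'000)
low, high = confidence_interval(15.69375, 1.468288935)
print(f"95% CI: {low:.3f} to {high:.3f}")
```

Because the samples are small (n < 30), a t multiplier in place of 1.96 would give somewhat wider intervals; the sketch keeps 1.96 to match the figures reported above.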

(D) PROBABILITY

BASIC PRICE (Average price $17 100)

| Car types | High price (cars) | Low price (cars) | Total (cars) |
|---|---|---|---|
| Compact | 5 | 11 | 16 |
| Small | 0 | 21 | 21 |
| Mid-size | 13 | 9 | 22 |
| Large | 11 | 0 | 11 |
| Sporty | 4 | 10 | 14 |
| Van | 2 | 7 | 9 |
| Total | 35 | 58 | 93 |

Figure 9

According to figure 9, the probability that a car:
a) Is compact, given that it is low in basic price, is 11/58
b) Is large, given that it is high in basic price, is 11/35
c) Is a van is 9/93
d) Is of a low price is 58/93

TOP PRICE (Average price $21 900)

| Car types | High price (cars) | Low price (cars) | Total (cars) |
|---|---|---|---|
| Compact | 6 | 10 | 16 |
| Small | 0 | 21 | 21 |
| Mid-size | 14 | 8 | 22 |
| Large | 6 | 5 | 11 |
| Sporty | 6 | 8 | 14 |
| Van | 4 | 5 | 9 |
| Total | 36 | 57 | 93 |

Figure 10

The figure above reveals that the probability that a car:
a) Is mid-sized is 22/93
b) Is low in top price is 57/93
c) Is low in top price, given that it is sporty, is 8/14
d) Is highly priced, given that it is compact, is 6/16

MPG TOWN (Average MPG 22.5)

| Car types | High MPG (cars) | Low MPG (cars) | Total (cars) |
|---|---|---|---|
| Compact | 9 | 7 | 16 |
| Small | 20 | 1 | 21 |
| Mid-size | 1 | 21 | 22 |
| Large | 11 | 0 | 11 |
| Sporty | 7 | 7 | 14 |
| Van | 0 | 9 | 9 |
| Total | 48 | 45 | 93 |

Figure 11

Looking at figure 11, the probability that a car:
a) Is a van, given that it has a low mpg, is 9/45
b) Is sporty is 14/93
c) Is small, given that it has a high mpg, is 20/48
d) Is a van, given that it has a high mpg, is 0/48

MPG BEST (Average MPG 29.1)

| Car types | High MPG (cars) | Low MPG (cars) | Total (cars) |
|---|---|---|---|
| Compact | 9 | 7 | 16 |
| Small | 19 | 2 | 21 |
| Mid-size | 4 | 18 | 22 |
| Large | 0 | 11 | 11 |
| Sporty | 6 | 8 | 14 |
| Van | 0 | 9 | 9 |
| Total | 38 | 55 | 93 |

Figure 12

This figure (12) shows that the probability that a car:
a) Has a high mpg is 38/93
b) Has a low mpg, given that it is small, is 2/21
c) Is sporty, given that it has a high mpg, is 6/38
d) Is compact is 16/93
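The probabilities in this section are ratios of counts from the frequency tables. The sketch below uses the MPG best counts of figure 12 to separate three kinds of probability that are easy to confuse: marginal (one characteristic), joint (both characteristics, out of all 93 cars) and conditional (one characteristic within a subgroup). The variable names are illustrative only.

```python
from fractions import Fraction

# Counts of cars by type from figure 12 (MPG best, average 29.1)
high = {"Compact": 9, "Small": 19, "Mid-size": 4, "Large": 0, "Sporty": 6, "Van": 0}
low = {"Compact": 7, "Small": 2, "Mid-size": 18, "Large": 11, "Sporty": 8, "Van": 9}
total = sum(high.values()) + sum(low.values())

# Marginal probabilities: divide by all cars
p_high = Fraction(sum(high.values()), total)
p_compact = Fraction(high["Compact"] + low["Compact"], total)

# Joint probability: small AND low mpg, out of all cars
p_small_and_low = Fraction(low["Small"], total)

# Conditional probabilities: divide by the size of the subgroup
p_low_given_small = Fraction(low["Small"], high["Small"] + low["Small"])
p_sporty_given_high = Fraction(high["Sporty"], sum(high.values()))

print(p_high, p_compact, p_small_and_low, p_low_given_small, p_sporty_given_high)
```

The denominator identifies the kind of probability: 93 for marginal and joint probabilities, a row or column total for conditional ones.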

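(E) CHI-SQUARED
Chi-squared was used across both prices and mpg to test the apparent pattern of probability. Under the null hypothesis the high-priced (or high-mpg) cars are spread evenly across the six car types, so the expected frequency for each type is the column total divided by six.

BASIC PRICE

| Car types | High price fₒ | fₑ | fₒ - fₑ | (fₒ - fₑ)² | (fₒ - fₑ)²/fₑ |
|---|---|---|---|---|---|
| Compact | 5 | 5.83 | -0.83 | 0.69 | 0.12 |
| Small | 0 | 5.83 | -5.83 | 34.0 | 5.83 |
| Mid-size | 13 | 5.83 | 7.17 | 51.4 | 8.82 |
| Large | 11 | 5.83 | 5.17 | 26.7 | 4.58 |
| Sporty | 4 | 5.83 | -1.83 | 3.35 | 0.57 |
| Van | 2 | 5.83 | -3.83 | 14.67 | 2.52 |

Figure 13
fₒ: observed high price; fₑ: expected high price

As figure 13 shows, the computed χ² value is 22.44. It lies in the rejection region, beyond the critical value of 11.1. At the 0.05 level of significance we reject the null hypothesis and accept the alternative hypothesis. The difference between the observed and expected numbers of high-priced cars is large enough to be considered significant.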

TOP PRICE

| Car types | High price fₒ | fₑ | fₒ - fₑ | (fₒ - fₑ)² | (fₒ - fₑ)²/fₑ |
|---|---|---|---|---|---|
| Compact | 6 | 6 | 0 | 0 | 0 |
| Small | 0 | 6 | -6 | 36 | 6 |
| Mid-size | 14 | 6 | 8 | 64 | 10.67 |
| Large | 6 | 6 | 0 | 0 | 0 |
| Sporty | 6 | 6 | 0 | 0 | 0 |
| Van | 4 | 6 | -2 | 4 | 0.67 |

Figure 14
fₒ: observed high price; fₑ: expected high price

As figure 14 shows, the computed χ² value is 17.34. It lies in the rejection region, beyond the critical value of 11.1. At the 0.05 level of significance we reject the null hypothesis and accept the alternative hypothesis. The difference between the observed and expected numbers of high-priced cars is large enough to be considered significant.

MPG TOWN

| Car types | High mpg fₒ | fₑ | fₒ - fₑ | (fₒ - fₑ)² | (fₒ - fₑ)²/fₑ |
|---|---|---|---|---|---|
| Compact | 9 | 8 | 1 | 1 | 0.125 |
| Small | 20 | 8 | 12 | 144 | 18 |
| Mid-size | 1 | 8 | -7 | 49 | 6.125 |
| Large | 11 | 8 | 3 | 9 | 1.125 |
| Sporty | 7 | 8 | -1 | 1 | 0.125 |
| Van | 0 | 8 | -8 | 64 | 8 |

Figure 15

This figure (15) reveals that the computed χ² value is 33.5. It is in the rejection region, beyond the critical value of 11.1. The null hypothesis is rejected at the 0.05 level of significance and we accept the alternative hypothesis.

MPG BEST

| Car types | High mpg fₒ | fₑ | fₒ - fₑ | (fₒ - fₑ)² | (fₒ - fₑ)²/fₑ |
|---|---|---|---|---|---|
| Compact | 9 | 6.33 | 2.67 | 7.13 | 1.13 |
| Small | 19 | 6.33 | 12.67 | 160.53 | 25.36 |
| Mid-size | 4 | 6.33 | -2.33 | 5.43 | 0.86 |
| Large | 0 | 6.33 | -6.33 | 40.01 | 6.32 |
| Sporty | 6 | 6.33 | -0.33 | 0.11 | 0.02 |
| Van | 0 | 6.33 | -6.33 | 40.01 | 6.32 |

Figure 16

This figure (16) reveals that the computed χ² value is 40.01. It is in the rejection region, beyond the critical value of 11.1. The null hypothesis is rejected at the 0.05 level of significance and we accept the alternative hypothesis.

PART 2 AIM: PREDICTING MPG

(2) a. For this part we examine the relationship between MPG (miles per gallon) and the other variables in the data set. MPG town seemed a suitable and reasonable measure to examine first.

A TABLE SHOWING A CORRELATION MATRIX ACROSS VEHICLE TYPES AND VARIABLES, MPG TOWN
[Figure 17: correlation of MPG town with HP, length, engine size, RPM and weight for compact, small, mid-size, large, sporty and van cars]

From figure 17, the variables which seem to be relevant and reasonable determinants of MPG town are horsepower (hp), engine size (litres), maximum revolutions of engine per minute (RPM) and weight (pounds).

HORSEPOWER (hp)
Figure 17 gives the coefficient of correlation r between MPG town and horsepower for each car type. Sporty cars have the strongest negative correlation (r = -0.776), therefore horsepower can be used to predict MPG town for sporty cars, while compact cars have the weakest positive correlation, with an r value quite close to 0.

ENGINE SIZE (litres)
Figure 17 gives the coefficient of correlation r between MPG town and engine size for each car type; r is negative for all six types. Large cars have the strongest negative correlation (r = -0.960), which means that there is a strong relationship between engine size and MPG town for large cars, so engine size can be used as an MPG determinant for large cars.

MAXIMUM REVOLUTIONS OF ENGINE (per minute)
Figure 17 gives the coefficient of correlation r between MPG town and RPM for each car type. Large cars have the strongest positive correlation and therefore the strongest relationship between RPM and MPG town, so RPM can be used as a predictor of MPG town for large cars, unlike compact cars.

WEIGHT (pounds)
From figure 17, the coefficients of correlation r between MPG town and weight are as follows: compact -0.282, small -0.738, mid-size -0.811, large -0.670, sporty -0.703 and van -0.378. There is a relationship between weight and MPG town for mid-size cars: they show the strongest negative correlation amongst the car types, so weight can be used as an MPG determinant for mid-sized cars.

SCATTER DIAGRAMS TO SHOW PREDICTION OF MPG TOWN AND RELEVANT VARIABLES

Graph 1

Graph 1 shows the coefficient of determination, r², as 0.1121: 11.2% of the variation in MPG town is explained by changes in horsepower. In the equation y = -0.023x + 22.495, y = MPG town and x = horsepower. The graph shows that for a unit increase in horsepower there is a decrease of 0.023 in MPG town.

Graph 2

Graph 2 shows the coefficient of determination, r², as 0.9224: 92% of the variation in MPG town is explained by changes in engine size. In the equation y = -1.8184x + 26.018, y = MPG town and x = engine size. The graph shows that for a unit increase in engine size there is a decrease of 1.8184 in MPG town. Engine size is therefore a very strong predictor of MPG town for large cars.

Graph 3

041. the coefficient of correlation r for MPG best and horsepower for car types are as follows. engine size (litres).685 -. r2 as 0. small -0. The graph shows that for a unit increase in weight there is a decrease in MPG town.803 where y= MPG town and x = weight.548. maximum revolutions of engine per minute (RPM) and weight (pounds). 65. compact -0.053. large cars -0.433 and van cars -0.797 -.654 -.198 HP Length Engine size RPM Weight Graph 18 From figure 18.295 -.620 Mid-size -.317 .355.222.6572. while compact cars have the weakest positive correlation with an r value quite close to 0. HORSEPOWER From figure 18. 30 .014 -.Graph 3 shows the coefficient of determination. ENGINE SIZE (litres) Figure 18 shows that the coefficient of correlation r for MPG best and engine size for car types are as follows. variables which seem to be relevant and reasonable determinants of MPG best are horsepower (hp).355 .549 -.296 -.041 . Sporty cars have the strongest negative correlation therefore horsepower can be used to predict the MPG for sporty cars. compact -0.780.724 and van cars 0. sporty cars -0.176 -. this means that there is a relationship between engine size and large cars since they have the strongest negative correlation thus can be used as an MPG determinant for large cars.546 -.745 Sporty -. mid-size -0. mid-size -0.149 -.187 -.317.548 -.254 -.612 Van . large -0. A TABLE SHOWING A CORRELATION MATRIX ACROSS VEHICLE TYPES AND VARIABLES MPG BEST Compact -.780 .453 -.628.053 .207 -.452 Small -.685. In equation y= -0.294 -. Therefore weight is a perfect predictor of MPG ton for mid-sized cars.628 -. small -0.546.688 Large -.222 -.724 -.7% of variation in MPG town is explained by changes in weight.0048 + 35. sporty -0.

MAXIMUM REVOLUTIONS OF ENGINE (per minute)

Figure 18 shows that the coefficients of correlation r for MPG best and RPM by car type are as follows: compact 0.149, small -0.014, mid-size -0.254, large 0.797, sporty -0.207 and van 0.549. Large cars have the strongest positive correlation and therefore have a strong relationship between RPM and MPG town. RPM can thus be used as a predictor for MPG town for large cars, unlike compact cars.

WEIGHT (pounds)

From figure 18, the coefficients of correlation r for MPG best and weight by car type are as follows: compact -0.452, small -0.620, mid-size -0.688, large -0.745, sporty -0.612 and van -0.198. There is a relationship between weight and MPG best of large cars; they show the strongest negative correlation amongst the car types. Weight can therefore be used as an MPG determinant for large cars, while van cars show the weakest correlation.

SCATTER DIAGRAMS TO SHOW PREDICTION OF MPG BEST AND RELEVANT VARIABLES

Graph 4

Graph 4 shows the coefficient of determination, r2, as 0.609. In the equation y = -1.2518x + 31.996, y = MPG best and x = engine size. The graph shows that for a unit increase in engine size there is a decrease in MPG best. 60.9% of variation in MPG best is explained by changes in engine size. Therefore engine size is a good predictor of MPG best for large cars.
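As a minimal sketch of how the fitted line from Graph 4 could be used for prediction, the snippet below plugs an engine size into the reported equation and recovers the correlation r from r2. The function names and the 4.0 litre input are assumptions made for illustration; only the slope, intercept and r2 come from the report.

```python
import math

# Values reported for Graph 4 (large cars); x is engine size in litres.
SLOPE = -1.2518
INTERCEPT = 31.996
R_SQUARED = 0.609


def predict_mpg_best(engine_size_litres):
    """Predicted MPG best from the fitted line y = SLOPE * x + INTERCEPT."""
    return SLOPE * engine_size_litres + INTERCEPT


def correlation_from_r_squared(r_squared, slope):
    """Recover r from r squared; r takes the sign of the slope."""
    return math.copysign(math.sqrt(r_squared), slope)


# 4.0 litres is a hypothetical engine size chosen for illustration.
print(round(predict_mpg_best(4.0), 2))
print(round(correlation_from_r_squared(R_SQUARED, SLOPE), 3))
```

The recovered r is about -0.78, which is why an r2 of 0.609 corresponds to a strong negative correlation. Predictions are only sensible for engine sizes within the range actually observed for large cars.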

Graph 5

Graph 5 shows the coefficient of determination, r2, as 0.0392. In the equation y = -0.0019x + 29.036, y = MPG best and x = weight. The graph shows that when weight is 0, mpg is 29.036, which is quite impossible. For each one-pound increase in weight, efficiency changes by -0.0019 mpg, that is, it decreases slightly. 3.9% of variation in MPG best is explained by changes in weight. This clearly shows that weight is a bad predictor of MPG best for van cars.

Graph 6

Graph 6 shows the coefficient of determination, r2, as 0.0028. In the equation y = -0.068x + 30.768, y = MPG best and x = horsepower. The graph shows that when horsepower is 0, mpg is 30.768, which is quite impossible. For each one-hp increase in horsepower, efficiency changes by -0.068 mpg, that is, it decreases. 0.28% of variation in MPG best is explained by changes in horsepower. Therefore horsepower cannot be used to determine MPG best for compact cars.
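The slope, intercept and r2 quoted for each scatter diagram come from a least-squares fit. The sketch below shows one way such values could be computed by hand; it is not the Excel procedure used in the report, and the weight and MPG figures in it are invented for illustration only.

```python
import math


def fit_line(xs, ys):
    """Least-squares line y = slope * x + intercept, plus r and r squared."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    r = sxy / math.sqrt(sxx * syy)
    return slope, intercept, r, r * r


# Hypothetical weights (pounds) and MPG best values, not taken from the data set.
hypothetical_weight = [3700, 3800, 3900, 4000, 4100]
hypothetical_mpg_best = [23, 21, 24, 20, 22]

slope, intercept, r, r_squared = fit_line(hypothetical_weight, hypothetical_mpg_best)
print(f"y = {slope:.4f}x + {intercept:.3f}, r = {r:.3f}, r2 = {r_squared:.4f}")
print(f"{100 * r_squared:.1f}% of the variation in y is explained by x")
```

A small r2, as in Graphs 5 and 6, means the fitted line explains very little of the variation, so the slope and intercept should not be used for prediction.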

CONCLUSION

From analyzing the data in the data set of vehicles provided, a conclusion can be drawn that for the first aim, of all the car types, midsize cars have the highest mean prices because they have the most features, like a large number of airbags and large engine sizes. Both median prices are unaffected by extremely low or high prices. All the car types are positively skewed except van cars, due to the fact that these car types have a high average number of 14.4 air bags whereas van cars have a low average number of 3 airbags. The coefficient of variation shows that all car types have variables spread away from the mean with extreme values, except for vans and large cars, which have variables close to the mean. The confidence intervals show the approximate values of where the true population mean might lie in the population.

Across all the car types, small cars have the highest mean mpg, mainly because they have small engine sizes. Both the mpg medians are unaffected by the extremely high or low mile values. The mpg variables have all the different types of skewness, being symmetric, positive and negative skewness; this is due to the small, medium sized and relatively larger sized engines respectively. The coefficient of variation shows that the mpg variables for all the car types are spread away from the mean, that is, there are extremely high and low mpg values. All hypothesis test results from the differences of means tests conducted show sufficient evidence that all these car types are from the same population. Chi-square tests based on probability, used to test the differences in the apparent pattern in probability, show that the difference between the observed and expected frequencies is large enough to be considered significant.

For the second aim, from using the variables in the correlation matrix a conclusion can be drawn that sporty cars have the strongest negative correlation, so horsepower can be used for the prediction of mpg for sporty cars, whereas compact cars have the weakest correlation, with an r value quite close to zero. Engine size can be used as a determinant of mpg for large cars since there is a strong negative correlation between engine size and mpg for large cars. Large cars also have a strong positive correlation between maximum revolutions of engine per minute and mpg town. Another significant relationship which can be used for prediction of mpg town is the one between weight and mpg town for large cars, which is shown as a strong negative correlation. Using the equations from the scatter diagrams showing the relationship between different variables and mpg town, it has been established that 11.2% of variation in mpg town is explained by changes in horsepower for large cars, while 92% of variation in mpg town is explained by engine size for large cars.

APPENDICES

BASIC PRICE

TOP PRICE

MPG TOWN

MPG BEST

The Excel Analysis Toolpak was used to obtain values for the descriptive statistics measures of mean, median, mode and skewness. Coefficient of variation was calculated by hand by applying the formula coefficient of variation = standard deviation / mean.

CONFIDENCE INTERVAL

Confidence interval - mean, 95% confidence level

BASIC PRICE
Type       Mean          Std. dev.      n    z       Half-width   Upper confidence limit   Lower confidence limit
compact    15.69375      5.87315574     16   1.960   2.8778       18.5715                  12.8160
small      11.9047619    2.803297378    21   1.960   1.1990       13.1037                  10.7058
midsize    24.9363636    10.15233004    22   1.960   4.2423       29.1787                  20.6941
large      22.9363634    6.260714452    11   1.960   3.6998       26.6361                  19.2366
sporty     16.85714286   7.895345687    14   1.960   4.1358       20.9929                  12.7214
van        16.2          2.027929979    9    1.960   1.325        17.525                   14.875

TOP PRICE
Type       Mean          Std. dev.      n    z       Half-width   Upper confidence limit   Lower confidence limit
compact    20.725        7.960947       16   1.960   3.901        24.626                   16.824
small      11.90476      2.803297       21   1.960   1.1990       13.1037                  10.7058
mid-size   30.31364      15.08554       22   1.960   6.3037       36.6174                  24.0099
large      25.67273      6.668747       21   1.960   2.8522       28.5249                  22.8205
sporty     21.95714      8.573099       14   1.960   4.4908       26.4479                  17.4664
van        22.03333      3.009153       9    1.960   1.9659       23.9993                  20.0674

MPG TOWN
Type       Mean          Std. dev.      n    z       Half-width   Upper confidence limit   Lower confidence limit
compact    22.6875       1.922455028    16   1.960   0.9420       23.6295                  21.7455
small      29.85714286   6.109711239    21   1.960   2.6131       32.4703                  27.2440
mid-size   19.545455     1.895540449    22   1.960   0.7921       20.3375                  18.7534
large      18.36363636   1.501514387    11   1.960   0.8873       19.2510                  17.4763
sporty     21.78571429   3.906179944    14   1.960   2.0461       23.8319                  19.7396
van        17            1.224744871    9    1.960   0.800        17.800                   16.200

MPG BEST
Type       Mean          Std. dev.      n    z                       Half-width   Upper confidence limit   Lower confidence limit
compact    29.875        2.941088234    16   1.960                   1.441        31.316                   28.434
small      35.47619048   5.60909126     21   1.960                   2.3990       37.8752                  33.0772
mid-size   26.72727273   1.272077756    22   1.960                   0.5316       27.2588                  26.1957
large      26.72727273   1.272077756    11   1.960                   0.7517       27.4790                  25.9755
sporty     28.78571429   3.641186861    14   2.160 (t, df = 13)      2.1024       30.8881                  26.6834
van        21.88888889   1.452966315    9    1.960                   0.9493       22.8381                  20.9396
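The confidence limits in these tables follow from half-width = z x standard deviation / square root of n. The sketch below reproduces that arithmetic for the compact basic price row, together with the coefficient of variation formula stated above. The function names are assumptions for illustration, and the results can differ from the tabulated ones in the last decimal place because a rounded critical value of 1.960 is used here.

```python
import math


def confidence_interval(mean, std_dev, n, critical_value=1.960):
    """Return (half_width, lower_limit, upper_limit) for the mean."""
    half_width = critical_value * std_dev / math.sqrt(n)
    return half_width, mean - half_width, mean + half_width


def coefficient_of_variation(std_dev, mean):
    """Coefficient of variation = standard deviation / mean."""
    return std_dev / mean


# Compact cars, basic price: mean, standard deviation and n from the table above.
half_width, lower, upper = confidence_interval(15.69375, 5.87315574, 16)
print(f"half-width = {half_width:.4f}, upper = {upper:.4f}, lower = {lower:.4f}")
print(f"coefficient of variation = {coefficient_of_variation(5.87315574, 15.69375):.4f}")
```

For a row that uses a t value instead of z, such as sporty cars under MPG best, the same function can be called with critical_value=2.160.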

