**Height, weight, BMI and obesity in children**

Hypothesis 1: The BMI will be less spread for Year 11 females than Year 11 males but will have a similar spread for Year 7 males and females.

Hypothesis 2: The BMI will be normally distributed.

Hypothesis 3: The weights and heights of Years 7-11 males will be more strongly correlated than those of Years 7-11 females.

Author: Joseph Cryan

Course: GCSE Statistics

Date: 9 April 2009


Abstract

This report investigates the relationship between weight, height, body mass index and obesity in

children. It does this by collecting primary data from Heckmondwike Grammar School students

to test the following three hypotheses:

Hypothesis 1: The BMI will be less spread for Year 11 females than Year 11 males but will

have a similar spread for Year 7 males and females.

This hypothesis was proposed because girls face more media pressure to stay thin than boys, since the models and actresses they see are chosen for their looks. The hypothesis was tested by using box plots to look at the spread and by working out the relative spread. The relative spread for Year 11 females was greater than for Year 11 males and so the hypothesis is not supported.

Hypothesis 2: The BMI will be normally distributed

The BMI depends upon weight and height and it is well known that these tend to have a normal distribution. As such, it is expected that the BMI will also have a normal distribution. The hypothesis was tested by drawing the histograms for Year 7 Females and Males and Year 11 Females and Males and then working out the areas within ±σ and ±2σ of the mean. For a normal distribution these should be 68% and 95%. All groups were close to this except Year 11 Males. This is only a basic test and so a more accurate test, the χ² test, was also used. This showed that Year 7 Females and Males and Year 11 Females followed a normal distribution but Year 11 Males did not. Since Year 11 Males do not have a normal distribution the hypothesis is not supported.

Hypothesis 3: The weights and heights of Years 7-11 males will be more strongly correlated

than that of Years 7-11 females.

Because of the pressure on females to have a smaller clothes size, it is expected that there will be less correlation between weight and height for females than for males. To test this either the Pearson or the Spearman coefficient can be used. To decide which one, the weights and heights of both Males and Females were checked to see if they follow a normal distribution, since Pearson requires this. The Pearson coefficient for Males was $r_{\text{Males}} = 0.81$ and for Females $r_{\text{Females}} = 0.665$. As the value for Males was larger there is a stronger correlation. This means the hypothesis is supported.

As for the question, “Is obesity a problem at HGS?”, the percentage of overweight and

obese Year 7 and Year 11 males and females has been worked out and it is a lot less than

the national figures and so obesity is not a problem at HGS.

Table of contents

1 Aims, Design and Strategy
1.1 Aims
1.2 Design and strategy
1.2.1 Collecting data
1.2.2 Hypothesis 1: BMI spread
1.2.3 Hypothesis 2: Normal distributions
1.2.4 Hypothesis 3: Correlation
2 Collecting the data
2.1 Types of data
2.2 Questionnaire design
2.3 Potential problems
2.4 Sampling options and avoiding bias
2.5 Preliminary enquiry
3 Deciding on how many samples
3.1 Generating random samples
3.2 Checking for outliers
4 Hypothesis 1: BMI spread
4.1 Justification for approach
4.2 Testing the hypothesis
4.3 Interpretation of results
5 Hypothesis 2: BMI normally distributed
5.1 Justification for approach
5.2 Testing the hypothesis
5.3 Interpretation of results
5.3.1 test statistic
6 Hypothesis 3: Correlation
6.1 Justification for approach
6.2 Testing the hypothesis
6.3 Should Pearson or Spearman be used?
6.4 Calculating the Pearson coefficient
6.5 Interpretation of results
7 Is obesity a problem at HGS?
8 Conclusions
8.1 How the project could be improved
8.2 Limitations of the project
9 Bibliography
Appendix 1 – Huddersfield Examiner Article
Appendix 2 – The sampled data
Appendix 3 – Chi-squared test
Appendix 4 – Pearson coefficient

1 Aims, Design and Strategy

1.1 Aims

This work aims to investigate the relationship between weight, height, Body Mass Index (BMI) and

obesity in children by using primary data from Heckmondwike Grammar School.

Obesity is when someone is dangerously overweight and it can lead to ill health. The Body Mass

Index is used to determine if someone is obese and this is worked out from the height and weight

values and is defined as

$$\text{Body Mass Index} = \frac{\text{Weight (kg)}}{[\text{Height (m)}]^2}$$

The BMI for boys and girls is shown in Figures 1 and 2. Obese children are defined as those with a BMI ≥ 95th centile and overweight children as those with a BMI ≥ 91st centile.

Obesity amongst children is becoming a major health problem. For example, an article (Appendix

1) published in the Huddersfield Daily Examiner indicates that obesity in children has tripled in

Kirklees.

Secondary statistics are available from the “Health Survey for England 2004: Updating of trend

tables to include childhood obesity data” and these are shown in Table 1.

| | Boys (11-15) | Girls (11-15) |
|---|---|---|
| Overweight | 12.8% | 19.3% |
| Obese | 24.2% | 26.7% |

Table 1 – Children’s (11-15) overweight and obesity prevalence 2004

The National Child Measurement Programme 2006/7 shows that, for Kirklees, 14.5% of Year 6

children are overweight and 16.8% are obese.

Heckmondwike Grammar School is within Kirklees. Primary data will be collected from the School and used to test three hypotheses relating to weight, height and BMI. These are summarised below along with the reasons for selecting them for investigation.

Figure 1 – Male BMI 5-18 years

Figure 2 – Female BMI 5-18 years

The three hypotheses to be tested are:

Hypothesis 1: The BMI will be less spread for Year 11 females than Year 11 males but will

have a similar spread for Year 7 males and females.

This hypothesis was proposed because as they get older the girls have more media pressure to stay

thin than boys because of all of the thin models used in advertising and because being thin is

considered attractive. This would make females stay thinner to look prettier, but as boys do not have

this pressure they do not try to stay thin as much. Because of this pressure, the spread for females

should be less as they get older.

Hypothesis 2: The BMI will be normally distributed

The BMI depends upon weight and height and it is well known that these tend to have a normal

distribution. As BMI is weight divided by the square of the height then this suggests that the BMI

will also be normally distributed.

Hypothesis 3: The weights and heights of Years 7-11 males will be more strongly correlated

than that of Years 7-11 females.

Because of the pressure on females to look like models and have smaller clothes size, it is expected

that there will be less correlation between weight and height for females than for males.

1.2 Design and strategy

The software package Mindjet Mindmanager Pro was used to create a Mind Map to help in the

design and strategy for this project. The Mind Map is shown in Figure 3 and the key stages are

explained below.

1.2.1 Collecting data

Since the Body Mass Index is calculated from weight and height, this project relies on quantitative data. Both primary data and secondary data (from the internet) will be used. A questionnaire will be designed for the primary data collection which will include the gender, weight (kg), height (m) and Year group. This will be anonymous to encourage people to tell the truth.

The different types of sampling techniques will be considered and the most appropriate for the proposed hypotheses selected.

A preliminary enquiry will be carried out with a small sample to confirm the appropriateness of the questionnaire and sampling approach. The primary data will be compared with secondary data to check that the primary data and sample size are representative of known national statistics.

Figure 3 – Mind map of the statistics project

1.2.2 Hypothesis 1: BMI spread

Hypothesis 1 requires an investigation of the spread of BMI for Years 7 and 11 males and females. This could be done by drawing histograms and working out the standard deviations, or by drawing box and whisker diagrams and working out the interquartile range. The box and whisker diagrams allow the four sets of data to be easily compared on a single graph and immediately give a visual representation of the spread, and so these will be used rather than histograms. Since the box and whisker diagrams require the calculation of the quartiles, the interquartile range will be used as the measure of spread and this will be compared for Year 7 and then Year 11 to test the hypothesis.

1.2.3 Hypothesis 2: Normal distributions

There are many choices for presenting data including pictograms, bar charts, choropleth maps, line graphs, stem and leaf diagrams, frequency polygons, pie charts, box and whisker diagrams and histograms. However, Hypothesis 2 requires an investigation into the shape of the distribution to see if it has the symmetrical bell-shaped curve that is associated with the normal distribution. A further test is to work out the percentage of values that lie within ±1 standard deviation of the mean: for a normal distribution this should be about 68% of the values, and about 95% should lie within ±2 standard deviations. Since both the shape and the area are required, the best option for representing the data is the histogram, since this gives a visual representation and the area under the histogram can be fairly easily worked out, unlike a stem and leaf diagram. The standard deviation will be worked out for each set of data and then the area under the histogram calculated to see if it follows a normal distribution. If necessary, the more complicated Chi-squared test will be carried out to test if the distributions are normal.
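The ±1 and ±2 standard deviation check can be sketched in a few lines of Python. This is only an illustration, not the method used in the project (which works from histogram areas): the function name `fraction_within` is made up here, the population standard deviation is assumed, and the ten BMI values are the Year 10 female values from the preliminary enquiry in Table 3, which is far too small a sample to judge normality.

```python
from statistics import mean, pstdev


def fraction_within(values, k):
    """Return the fraction of values lying within k standard deviations of the mean."""
    centre = mean(values)
    spread = pstdev(values)  # population standard deviation (an assumption)
    inside = [v for v in values if centre - k * spread <= v <= centre + k * spread]
    return len(inside) / len(values)


# Year 10 female BMI values from the preliminary enquiry (Table 3)
bmi_values = [22.7, 21.8, 20.4, 22.0, 23.1, 22.9, 20.6, 19.1, 26.3, 22.9]

# For a normal distribution these should be roughly 0.68 and 0.95
print(fraction_within(bmi_values, 1))
print(fraction_within(bmi_values, 2))
```

With a real data set the two fractions would be compared against 68% and 95% before deciding whether the Chi-squared test is needed.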

1.2.4 Hypothesis 3: Correlation

This requires an analysis of the correlation that exists between weight and height for Years 7-11 males and Years 7-11 females. A scatter diagram will be used to see if there is a connection between the weights and heights and a line of best fit will be drawn. If there is positive correlation the scatter graph will show that as the height increases so does the weight. If there is strong correlation the points will lie close to the line of best fit. The degree of correlation will be determined by working out the correlation coefficient. A correlation coefficient of zero would mean no correlation and a correlation coefficient of one would mean perfect positive correlation. A decision will have to be made as to which correlation coefficient is going to be calculated. If the distributions are normal then Pearson’s product-moment coefficient should be used. However, for non-normal distributions Pearson’s coefficient can give misleading results, and it is also sensitive to outliers. For non-normal distributions, Spearman’s rank correlation coefficient should be used and so the shape of the distributions will need to be checked.
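To make the difference between the two coefficients concrete, the sketch below computes both in plain Python. It is a hedged illustration rather than the calculation used in the project: the function names are invented here, tied values are given their average rank (an assumption), and the data are the ten Year 10 male heights and weights from the preliminary enquiry in Table 3, not the full Years 7-11 sample.

```python
from math import sqrt


def pearson(x, y):
    """Pearson product-moment correlation coefficient of two equal-length lists."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def ranks(values):
    """1-based ranks; tied values share the average of their ranks (an assumption)."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        average_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = average_rank
        i = j + 1
    return result


def spearman(x, y):
    """Spearman rank correlation: Pearson's coefficient applied to the ranks."""
    return pearson(ranks(x), ranks(y))


# Year 10 male heights (m) and weights (kg) from the preliminary enquiry (Table 3)
heights = [1.77, 1.66, 1.69, 1.73, 1.65, 1.67, 1.66, 1.73, 1.8, 1.79]
weights = [73, 64, 68, 53, 63, 54, 57.5, 54, 65, 67]

print(pearson(heights, weights))
print(spearman(heights, weights))
```

Because Spearman only uses the order of the values, a single extreme value affects it less than it affects Pearson.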

2 Collecting the data

2.1 Types of data

Qualitative data is data that does not have numerical values, for example, colours or months of the

year. Quantitative data is provided in numerical form for example, the price of things or

measurements such as temperature and speed. In this project, quantitative data will be used as the

values of height and weight will be used to calculate the Body Mass Index.

There are two types of data. These are primary data and secondary data.

Primary data is data that is collected directly by the user, and secondary data is information that has already been collected by someone else and has simply been accessed for use. Rather than have all students collecting primary data, such that each student is asked their height and weight at least 150 times, HGS has collected one set of ‘primary’ data to be used specifically for the statistics project and so this can be considered as ‘primary data for the project team’. The secondary data that will be used is data that has been compiled nationally and is available over the internet, that is, data that has not been collected by the ‘project team’. This will be used to see if the primary data is representative of the national data.

2.2 Questionnaire design

Questions should never be biased. A leading question can persuade people that there is a right answer and a wrong answer, which can give biased results; for example, ‘Do you agree that red is a nice colour?’. Questionnaires are often multiple choice and, if this is the case, the answer options should leave no gaps, since gaps often cause mistakes. Questions should never embarrass or upset people. If there is a need to ask a sensitive question then either a promise of confidentiality should be given or two questions should be used in which one answer can mean two different things. Questions should always be easy to understand, because if someone misunderstands the question then the data received can be different to that required. All questions should have some relevance to the survey.

The questionnaire that was used to produce the data for this project is shown in Figure 4.

Figure 4 – Questionnaire used to collect the height and weight data

GCSE Statistics Data Collection

This data collection sheet is designed to be anonymous. Please fill in the details as

accurately as possible. You do not have to fill in your weight if you do not want to.

Please remove your shoes before being measured and weighed. Fold the completed

slip and pass to your teacher. Thank you.

Year 10

Please circle: Male Female

Height: m

Weight: kg

2.3 Potential problems

The types of problems that might happen in the data collection are:

(a) Not enough data collected because people refuse to fill in the questionnaire.

(b) Missing data where people either accidentally or deliberately do not fill in the questionnaire correctly.

(c) Incorrect data where people accidentally or deliberately enter the wrong information.

(d) Data entry errors where the data has been incorrectly entered into Excel.

As the questionnaire is anonymous it is hoped that people will fill it in, and a decision will be made on whether or not there are enough returns once the data has been collected. If there are not, thought will be given as to how to collect more data by, for example, catching people at lunchtime or in the playground. The data will also be checked for missing information and outliers, both of which will be deleted to give a clean data set. A preliminary enquiry will be carried out on a small sample and comparisons of mean heights, weights and BMI will be made against national secondary statistics to check the validity of the approach and to ensure the type of data collected is realistic and worth using.

2.4 Sampling options and avoiding bias

There are a number of sampling options available and it is important to select an approach that

avoids bias. The options are considered below:

Quota sampling is where an interviewer asks questions of people who fit certain categories, such as a social class, gender or age. The interviewer, however, chooses who to ask. This is cheap but can be biased because some people may choose to avoid the interviewer, and so this method has not been used here.

Cluster sampling divides the population into groups and then uses random sampling to choose the groups. This can be biased because a chosen group may contain only a specific type of person, and so this method has not been used.

Opinion sampling uses open-ended questions to find out how people feel about something. This is not relevant here because all that is of interest is actual height and weight, and so it will not be used.

In systematic sampling, members of the sample are chosen at regular intervals from a list. This can be biased if low or high values occur in a regular pattern, and so this method has not been used.

Convenience sampling is where someone stands in one place and asks a certain number of people certain questions. This is flawed because people from similar social groups, who would give similar answers, are likely to be at the place where the interviewer is, which biases the sample. For example, standing outside a sweet shop or standing outside a gym would give completely different answers for weight values. This method of sampling will therefore not be used.

Stratified sampling ensures that when a population contains separate groups or strata, each group is fairly represented in the sample. In this case, there are 5 year groups and samples are being taken from both males and females and so there are 10 strata. In stratified sampling, the number taken from each stratum is proportional to the size of the stratum and this ensures that all strata are fairly represented.

Random sampling ensures that every member of the population has an equal chance of being selected, which removes bias, and so random sampling will be used.

Having considered the different sampling options, a combination of stratified and random sampling has been selected as the most suitable for this project because it ensures that each stratum is fairly represented and each member of a stratum has an equal chance of being selected, and so there is no bias within the strata.

2.5 Preliminary enquiry

Secondary data on the weight, height and BMI for 2 to 15 year olds is available in the publication

‘Health Survey for England 2007 Latest Trends’ available from the NHS Information Centre for

Health and Social Care. Table 2 summarises this secondary data for 15 year old males and females

for survey year 2007. By comparing a preliminary enquiry sample to this the appropriateness of

both the questionnaire and the samples can be checked.

| | 15 year old Males | 15 year old Females |
|---|---|---|
| Mean Height | 1.728 m | 1.623 m |
| Mean Weight | 63.6 kg | 59.1 kg |
| Mean BMI | 21.3 | 22.4 |

Table 2 – National survey statistics 2007

To check the appropriateness of the questionnaire and the available data a preliminary enquiry was carried out on a sample of 10 Year 10 males and 10 Year 10 females and the results are shown in Table 3. The BMI was calculated using

$$\text{Body Mass Index} = \frac{\text{Weight (kg)}}{[\text{Height (m)}]^2}$$

For example, for the first female the height is 1.64 m and the weight is 61 kg and so the BMI is:

$$\text{Body Mass Index} = \frac{\text{Weight (kg)}}{[\text{Height (m)}]^2} = \frac{61}{1.64^2} = 22.7$$
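The same calculation can be written as a small Python function. This is a minimal sketch: the function name `bmi` is chosen here for illustration, and the inputs are the first two female rows of Table 3.

```python
def bmi(weight_kg, height_m):
    """Body Mass Index: weight in kilograms divided by the square of height in metres."""
    return weight_kg / height_m ** 2


# First female in the preliminary enquiry: 1.64 m and 61 kg
print(round(bmi(61, 1.64), 1))    # 22.7

# Second female: 1.61 m and 56.5 kg
print(round(bmi(56.5, 1.61), 1))  # 21.8
```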

Table 4 compares the preliminary enquiry results with the secondary data of Table 2. All of the preliminary enquiry results are slightly less than the secondary data, with the maximum difference being -2.7%. Since Year 10 is a mix of 14 and 15 year olds and the secondary data is for 15 year olds only, it would be expected that the preliminary enquiry results are lower. The preliminary enquiry results show that the questionnaire is appropriate for gathering the information required on height and weight such that the BMI can be calculated. The preliminary enquiry samples compare well with the secondary data, which gives some confidence in the validity of the information collected, but this is not conclusive because it is just a small sample and the agreement could be a result of fortunate sampling.

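| Year Group | Gender | Height (m) | Weight (kg) | BMI |
|---|---|---|---|---|
| 10 | f | 1.64 | 61 | 22.7 |
| 10 | f | 1.61 | 56.5 | 21.8 |
| 10 | f | 1.67 | 57 | 20.4 |
| 10 | f | 1.65 | 60 | 22.0 |
| 10 | f | 1.69 | 66 | 23.1 |
| 10 | f | 1.59 | 58 | 22.9 |
| 10 | f | 1.65 | 56 | 20.6 |
| 10 | f | 1.62 | 50 | 19.1 |
| 10 | f | 1.46 | 56 | 26.3 |
| 10 | f | 1.55 | 55 | 22.9 |
| Mean | | 1.61 | 57.6 | 22.2 |
| 10 | m | 1.77 | 73 | 23.3 |
| 10 | m | 1.66 | 64 | 23.2 |
| 10 | m | 1.69 | 68 | 23.8 |
| 10 | m | 1.73 | 53 | 17.7 |
| 10 | m | 1.65 | 63 | 23.1 |
| 10 | m | 1.67 | 54 | 19.4 |
| 10 | m | 1.66 | 57.5 | 20.9 |
| 10 | m | 1.73 | 54 | 18.0 |
| 10 | m | 1.8 | 65 | 20.1 |
| 10 | m | 1.79 | 67 | 20.9 |
| Mean | | 1.72 | 61.9 | 21.1 |

Table 3 – Preliminary enquiry BMI calculations for Year 10 Males and Females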

| | Males: secondary data | Males: preliminary enquiry data | Males: % difference | Females: secondary data | Females: preliminary enquiry data | Females: % difference |
|---|---|---|---|---|---|---|
| Mean Height (m) | 1.728 | 1.72 | -0.5% | 1.623 | 1.61 | -0.8% |
| Mean Weight (kg) | 63.6 | 61.9 | -2.7% | 59.1 | 57.6 | -2.5% |
| Mean BMI | 21.3 | 21.1 | -1% | 22.4 | 22.2 | -0.9% |

Table 4 – Comparison of secondary and primary preliminary enquiry data
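The percentage differences in Table 4 appear to be the preliminary enquiry value minus the secondary value, divided by the secondary value. The sketch below assumes that definition; the function name `percent_difference` and the dictionary layout are illustrative only.

```python
def percent_difference(sample_value, reference_value):
    """Percentage by which the sample value differs from the reference value."""
    return (sample_value - reference_value) / reference_value * 100


# (preliminary enquiry value, secondary value) pairs taken from Table 4
comparisons = {
    "Mean height, males (m)": (1.72, 1.728),
    "Mean weight, males (kg)": (61.9, 63.6),
    "Mean height, females (m)": (1.61, 1.623),
    "Mean weight, females (kg)": (57.6, 59.1),
}

for label, (sample_value, reference_value) in comparisons.items():
    print(f"{label}: {percent_difference(sample_value, reference_value):.1f}%")
```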

3 Deciding on how many samples

The size of the samples is important because if too few are used then this could lead to inaccurate results because they will not be representative of the population. Using too many samples means unnecessary calculations.

The breakdown of the overall population data that is available is shown in Table 5.

| | Female | Male | Total |
|---|---|---|---|
| Year 7 | 69 | 79 | 148 |
| Year 8 | 63 | 85 | 148 |
| Year 9 | 66 | 77 | 143 |
| Year 10 | 72 | 71 | 143 |
| Year 11 | 66 | 82 | 148 |
| Total | 336 | 394 | 730 |

Table 5 – The whole population

The overall population is 730 and so a stratified sample of 300 will be taken. This represents

approximately 40% and is a good balance between taking too few and taking too many and should

be representative of the whole population.

The stratified sample is shown in Table 6.

| | Female | Male | Total |
|---|---|---|---|
| Year 7 | 28 | 32 | 60 |
| Year 8 | 26 | 35 | 61 |
| Year 9 | 27 | 32 | 59 |
| Year 10 | 30 | 29 | 59 |
| Year 11 | 27 | 34 | 61 |
| Total | 138 | 162 | 300 |

Table 6 – Stratified sample
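The stratum sizes in Table 6 are consistent with taking each stratum's share of the 730 students, multiplying by 300 and rounding to the nearest whole number. The sketch below assumes that rule; the dictionary and variable names are illustrative. In general, rounding each stratum separately is not guaranteed to add up to the target exactly, although it does for these figures.

```python
# Population of each stratum, from Table 5
population = {
    ("Year 7", "Female"): 69, ("Year 7", "Male"): 79,
    ("Year 8", "Female"): 63, ("Year 8", "Male"): 85,
    ("Year 9", "Female"): 66, ("Year 9", "Male"): 77,
    ("Year 10", "Female"): 72, ("Year 10", "Male"): 71,
    ("Year 11", "Female"): 66, ("Year 11", "Male"): 82,
}

sample_size = 300
total = sum(population.values())  # 730

# Proportional allocation: stratum size / total population * sample size
allocation = {
    stratum: round(sample_size * count / total)
    for stratum, count in population.items()
}

for stratum, count in allocation.items():
    print(stratum, count)
print("Total sampled:", sum(allocation.values()))
```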

3.1 Generating random samples

Excel was used to generate the random samples using the following approach:

1. Firstly a random number was generated using the RAND() function to give a random number greater than or equal to 0 and less than 1 (e.g. 0.5910).

2. This was then multiplied by the total population (e.g. for Year 7 Female this would be 69 × 0.5910 = 40.78).

3. The ROUNDUP(number,0) function was used to round up to the nearest integer so, for example, ROUNDUP(40.78,0) would give 41.

4. This was repeated until the desired number of samples was achieved (e.g. for Year 7 Female this would be 28).

5. The corresponding values were then taken from the population to give the sampled subset.
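The steps above can be sketched in Python as follows. This is not the Excel workbook itself: `draw_sample` is a hypothetical helper, `random.random()` stands in for RAND() and `math.ceil` for ROUNDUP(number,0). The report does not say how repeated positions were handled, so skipping repeats (and the rare case where the random number is exactly 0) is an assumption made here.

```python
import math
import random


def draw_sample(population_size, sample_size, seed=None):
    """Pick distinct 1-based positions by scaling and rounding up random numbers."""
    rng = random.Random(seed)
    chosen = []
    while len(chosen) < sample_size:
        r = rng.random()                           # like RAND(): 0 <= r < 1
        position = math.ceil(r * population_size)  # like ROUNDUP(r * size, 0)
        if position == 0:
            continue  # r was exactly 0, which does not map to a valid position
        if position not in chosen:                 # assumption: no repeated positions
            chosen.append(position)
    return chosen


# Worked figure from the text: 69 x 0.5910 = 40.78, rounded up to 41
print(math.ceil(69 * 0.5910))

# 28 positions out of the 69 Year 7 females
print(draw_sample(69, 28, seed=1))
```

Python's built-in `random.sample(range(1, 70), 28)` would give an equivalent sample without replacement in one call.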

| No. | Random no. between 0 and 1 | Random no. × 69 | Sampling no. |
|---|---|---|---|
| 1 | 0.7981 | 55.07 | 56 |
| 2 | 0.2723 | 18.79 | 19 |
| 3 | 0.4442 | 30.65 | 31 |
| 4 | 0.4304 | 29.69 | 30 |
| 5 | 0.7794 | 53.78 | 54 |
| 6 | 0.0370 | 2.56 | 3 |
| 7 | 0.0033 | 0.23 | 1 |
| 8 | 0.9662 | 66.67 | 67 |
| 9 | 0.5410 | 37.33 | 38 |
| 10 | 0.4047 | 27.92 | 28 |
| 11 | 0.0861 | 5.94 | 6 |
| 12 | 0.6128 | 42.28 | 43 |
| 13 | 0.7647 | 52.76 | 53 |
| 14 | 0.3026 | 20.88 | 21 |
| 15 | 0.8527 | 58.83 | 59 |
| 16 | 0.6026 | 41.58 | 42 |
| 17 | 0.7051 | 48.65 | 49 |
| 18 | 0.3393 | 23.41 | 24 |
| 19 | 0.9764 | 67.37 | 68 |
| 20 | 0.5613 | 38.73 | 39 |
| 21 | 0.1816 | 12.53 | 13 |
| 22 | 0.2891 | 19.95 | 20 |
| 23 | 0.0214 | 1.47 | 2 |
| 24 | 0.1645 | 11.35 | 12 |
| 25 | 0.4139 | 28.56 | 29 |
| 26 | 0.9334 | 64.40 | 65 |
| 27 | 0.0623 | 4.30 | 5 |
| 28 | 0.3149 | 21.72 | 22 |

Table 7 – Generating random samples

Table 7 shows the random numbers used and Table 8 shows the samples used which are highlighted

in green.

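| Number | Year Group | Gender | Height (m) | Weight (kg) | Number | Year Group | Gender | Height (m) | Weight (kg) |
|---|---|---|---|---|---|---|---|---|---|
| 1 | 7 | f | 1.62 | 44.5 | 36 | 7 | F | 1.45 | 37 |
| 2 | 7 | f | 1.53 | | 37 | 7 | F | 1.43 | 32 |
| 3 | 7 | f | 1.535 | 54 | 38 | 7 | F | 1.4 | 34 |
| 4 | 7 | f | 1.58 | 55 | 39 | 7 | F | 1.52 | 43 |
| 5 | 7 | f | 1.57 | 51 | 40 | 7 | F | 1.55 | 41 |
| 6 | 7 | f | 1.48 | 44.5 | 41 | 7 | F | 1.47 | 46 |
| 7 | 7 | f | 1.535 | 51 | 42 | 7 | F | 1.52 | 42 |
| 8 | 7 | f | 1.6 | 46.5 | 43 | 7 | F | 1.44 | 37 |
| 9 | 7 | f | 1.63 | | 44 | 7 | F | 1.65 | 49 |
| 10 | 7 | f | 1.46 | | 45 | 7 | F | 1.5 | 48 |
| 11 | 7 | f | 1.52 | | 46 | 7 | F | 1.66 | 66 |
| 12 | 7 | f | 1.53 | 38 | 47 | 7 | F | 1.57 | 72 |
| 13 | 7 | f | 1.41 | 37.5 | 48 | 7 | F | 1.61 | 50 |
| 14 | 7 | F | 1.57 | 48 | 49 | 7 | F | 1.54 | 61 |
| 15 | 7 | F | 1.43 | 39 | 50 | 7 | F | 1.44 | 42 |
| 16 | 7 | F | 1.56 | 42 | 51 | 7 | F | 1.64 | 68 |
| 17 | 7 | F | 1.44 | 37 | 52 | 7 | F | 1.58 | 47 |
| 18 | 7 | F | 1.38 | 29 | 53 | 7 | F | 1.5 | 41 |
| 19 | 7 | F | 1.68 | 56 | 54 | 7 | F | 1.6 | 37 |
| 20 | 7 | F | 1.54 | 50 | 55 | 7 | F | 1.48 | 38 |
| 21 | 7 | F | 1.38 | 27 | 56 | 7 | F | 1.52 | 49 |
| 22 | 7 | F | 1.6 | 47 | 57 | 7 | F | 1.52 | 44 |
| 23 | 7 | F | 1.52 | 39 | 58 | 7 | f | 1.61 | 44.5 |
| 24 | 7 | F | 1.53 | 48 | 59 | 7 | f | 1.51 | |
| 25 | 7 | F | 1.51 | 36 | 60 | 7 | f | 1.515 | 51 |
| 26 | 7 | F | 1.62 | 42 | 61 | 7 | f | 1.54 | 51 |
| 27 | 7 | F | 1.55 | 45 | 62 | 7 | f | 1.56 | 50.5 |
| 28 | 7 | F | 1.56 | | 63 | 7 | f | 1.46 | 41.5 |
| 29 | 7 | F | 1.51 | 42 | 64 | 7 | f | 1.655 | 52 |
| 30 | 7 | F | 1.51 | 42 | 65 | 7 | f | 1.65 | 44.5 |
| 31 | 7 | F | 1.52 | 39 | 66 | 7 | f | 1.65 | |
| 32 | 7 | F | 1.59 | 45 | 67 | 7 | f | 1.485 | |
| 33 | 7 | F | 1.53 | 43 | 68 | 7 | f | 1.535 | 425 |
| 34 | 7 | F | 1.58 | 65 | 69 | 7 | f | 1.525 | 37.5 |
| 35 | 7 | F | 1.46 | 39 | | | | | |

Table 8 – The sampled subset for Year 7 Females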

A similar approach was taken for the rest of the stratified sample and the results are shown in

Appendix 2.

3.2 Checking for outliers

Some of the samples included non-responses for either height or weight and so these were

eliminated and replaced by other samples.

To check for outliers the interquartile range (IQR) is found and multiplied by 1.5. Any value that is 1.5 × IQR or more above the upper quartile, or 1.5 × IQR or more below the lower quartile, is an outlier. Excel can be used to calculate the interquartile range and the upper and lower outlier values. This has been done for all of the sampled data and the results are shown in Tables 9-12.
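The outlier rule can be sketched in Python as below. The function names are illustrative rather than taken from the Excel workbook, and the quartiles used in the example are the Year 7 female weight and Year 11 male height values reported in Tables 11 and 10.

```python
def outlier_limits(lower_quartile, upper_quartile):
    """Return the (lower, upper) outlier limits using the 1.5 x IQR rule."""
    iqr = upper_quartile - lower_quartile
    return lower_quartile - 1.5 * iqr, upper_quartile + 1.5 * iqr


def is_outlier(value, lower_quartile, upper_quartile):
    """True if the value is 1.5 x IQR or more beyond either quartile."""
    low, high = outlier_limits(lower_quartile, upper_quartile)
    return value <= low or value >= high


# Year 7 female weights (Table 11): LQ = 38.75 kg, UQ = 48.25 kg
print(outlier_limits(38.75, 48.25))   # (24.5, 62.5)

# Year 11 male heights (Table 10): LQ = 1.738 m, UQ = 1.838 m
print(is_outlier(5.7, 1.738, 1.838))  # True
```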

| | Year 7 | Year 8 | Year 9 | Year 10 | Year 11 |
|---|---|---|---|---|---|
| Minimum | 1.380 | 1.050 | 1.540 | 1.460 | 1.550 |
| LQ | 1.528 | 1.580 | 1.585 | 1.590 | 1.585 |
| Median | 1.525 | 1.610 | 1.600 | 1.625 | 1.650 |
| UQ | 1.548 | 1.650 | 1.650 | 1.670 | 1.690 |
| Maximum | 1.680 | 1.740 | 1.740 | 1.830 | 1.740 |
| IQR | 0.020 | 0.070 | 0.065 | 0.080 | 0.105 |
| Lower outlier | 1.498 | 1.475 | 1.488 | 1.470 | 1.428 |
| Upper outlier | 1.578 | 1.755 | 1.748 | 1.790 | 1.848 |

Table 9 – Female Heights

| | Year 7 | Year 8 | Year 9 | Year 10 | Year 11 |
|---|---|---|---|---|---|
| Minimum | 1.380 | 1.430 | 1.310 | 1.370 | 1.640 |
| LQ | 1.450 | 1.515 | 1.610 | 1.650 | 1.738 |
| Median | 1.510 | 1.580 | 1.670 | 1.730 | 1.790 |
| UQ | 1.558 | 1.645 | 1.713 | 1.790 | 1.838 |
| Maximum | 1.760 | 1.720 | 1.880 | 1.880 | 5.700 |
| IQR | 0.108 | 0.130 | 0.103 | 0.140 | 0.100 |
| Lower outlier | 1.289 | 1.320 | 1.456 | 1.440 | 1.588 |
| Upper outlier | 1.719 | 1.840 | 1.866 | 2.000 | 1.988 |

Table 10 – Male Heights

| | Year 7 | Year 8 | Year 9 | Year 10 | Year 11 |
|---|---|---|---|---|---|
| Minimum | 27.00 | 33.00 | 38.50 | 46.00 | 44.50 |
| LQ | 38.75 | 43.50 | 46.00 | 48.25 | 50.50 |
| Median | 42.75 | 49.50 | 50.00 | 55.50 | 56.00 |
| UQ | 48.25 | 60.50 | 59.75 | 58.50 | 60.25 |
| Maximum | 61.00 | 75.50 | 67.50 | 85.00 | 87.00 |
| IQR | 9.5 | 17.00 | 13.75 | 10.25 | 9.75 |
| Lower outlier | 24.5 | 18.00 | 23.375 | 32.875 | 38.875 |
| Upper outlier | 62.5 | 86.00 | 80.375 | 73.875 | 74.875 |

Table 11 – Female Weights

| | Year 7 | Year 8 | Year 9 | Year 10 | Year 11 |
|---|---|---|---|---|---|
| Minimum | 30.00 | 34.00 | 40.00 | 32.00 | 52.00 |
| LQ | 39.00 | 41.50 | 52.25 | 54.00 | 61.50 |
| Median | 44.00 | 48.50 | 60.00 | 62.00 | 66.00 |
| UQ | 50.50 | 57.00 | 67.75 | 66.00 | 73.38 |
| Maximum | 88.00 | 81.00 | 85.00 | 95.00 | 179.00 |
| IQR | 11.50 | 15.50 | 15.50 | 12.00 | 11.875 |
| Lower outlier | 21.75 | 18.25 | 29.00 | 36.00 | 43.688 |
| Upper outlier | 67.75 | 80.25 | 91.00 | 84.00 | 91.188 |

Table 12 – Male Weights

The highlighted values show the outliers and some of these are obvious mistakes. For example, the height of 5.7 m for a Year 11 boy was probably a height entered in feet and inches rather than metres. The weight of 179 kg, which appears in the Year 11 male data of Table 12, was probably a data entry error and should be 79 kg. The highlighted outliers were removed, as were any other data items that were significantly outside the lower and upper outlier range.

This left a clean set of data that can be worked with.

4 Hypothesis 1: BMI spread

4.1 Justification for approach

The hypothesis is:

Hypothesis 1: The BMI will be less spread for Year 11 females than Year 11 males but the spread

will be similar for Year 7 males and females.

This requires an investigation of spread for Year 7 and Year 11 males and females. There are

several choices for presenting the data and there are a number of measures of spread and so an

approach must be selected and justified.

With respect to presenting the data, pie charts are inappropriate because they do not give a visual impression of spread. Both stem and leaf diagrams and histograms could be used since they give a visual representation, but box plots are preferred because they use actual values rather than grouped data and they give an immediate impression of spread through the width of the box and the length of the whiskers. Box and whisker diagrams also make it easier to compare the four sets of data on the same graph and to immediately get a feel for the relative spread between the data sets.

Spread could be measured using the variance or standard deviation but since box plots are being

used to visualise the data it makes sense to use the inter-quartile range as the measure of spread

since the values have already been calculated to draw the box plot. This also has the benefit of

reducing the impact of extreme values.

4.2 Testing the hypothesis

This hypothesis will be tested using box plots and relative spread. Hand calculations will be done

first to show the technique and then Autograph will be used to make the calculations simpler.

To show the calculations, the BMI data for Year 11 females will be used. The data has been

arranged into ascending order such that the box plot information can be found. This is shown in

Table 13.

From Table 13

Minimum = 16.6

Maximum = 29.4

For n data values, (n + 1)/2 gives the position of the median. There are 27 values and so the median is at position 14:

Median = 21

The lower quartile is at position (n + 1)/4, which in this case is position 7, and so the lower quartile is:

Lower quartile = 18.7

The upper quartile is at position 3(n + 1)/4, which in this case is the 21st position, and so

Upper quartile = 22.7

Position Yr 11 F BMI

1 16.6

2 17.5

3 17.6

4 18.1

5 18.2

6 18.4

7 18.7

8 19.6

9 19.7

10 19.9

11 20.6

12 20.7

13 20.8

14 21.0

15 21.0

16 21.0

17 21.2

18 22.2

19 22.4

20 22.5

21 22.7

22 22.8

23 24.7

24 25.4

25 26.4

26 26.9

27 29.4

Table 13 – Year 11 Female BMI
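The position rules used above can be sketched in code as a check on the hand calculation. This is a minimal illustration that assumes at least three data values and, as an added assumption not needed for 27 values, interpolates linearly when a position is not a whole number. The function names are illustrative.

```python
def value_at_position(sorted_values, position):
    """Return the value at a 1-based position, interpolating if it is fractional."""
    whole = int(position)
    fraction = position - whole
    if fraction == 0:
        return sorted_values[whole - 1]
    below = sorted_values[whole - 1]
    above = sorted_values[whole]
    return below + fraction * (above - below)


def five_number_summary(values):
    """Minimum, quartiles and maximum using the (n + 1) position rules."""
    data = sorted(values)
    n = len(data)
    return {
        "min": data[0],
        "lq": value_at_position(data, (n + 1) / 4),
        "median": value_at_position(data, (n + 1) / 2),
        "uq": value_at_position(data, 3 * (n + 1) / 4),
        "max": data[-1],
    }


# Year 11 female BMI values from Table 13
year11_female_bmi = [
    16.6, 17.5, 17.6, 18.1, 18.2, 18.4, 18.7, 19.6, 19.7,
    19.9, 20.6, 20.7, 20.8, 21.0, 21.0, 21.0, 21.2, 22.2,
    22.4, 22.5, 22.7, 22.8, 24.7, 25.4, 26.4, 26.9, 29.4,
]

summary = five_number_summary(year11_female_bmi)
print(summary)
```

With 27 values the positions are exactly 7, 14 and 21, so no interpolation takes place and the values match the hand calculation.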

Using this information, the box plot can now be drawn. Autograph automatically calculates these

values from the raw data and so Autograph will be used to create box plots for Year 7 and 11 males

and females. These are shown in Figure 5.


Figure 5 – Box plots to test the spread of data

Figure 6 – Box plots using UK secondary data


4.3 Interpretation of results

The box plots of Figure 5 show that, for Year 7 the median BMI for females is 18.4 and for males it

is 18.9 and the interquartile ranges are 3.1 for females and 3.5 for males but that the females have a

greater range (whiskers showing minimum to maximum). Both Year 7 box plots are positively

skewed since their medians are closer to the lower quartile.

For Year 11, the median for the females is 21 and the interquartile range is 3.5; for males both are lower, with a median of 20.6 and an interquartile range of 2.9. This suggests that the hypothesis might not be supported because the Year 11 females have a higher interquartile range than the males. However, the comparison is made using relative spread, which expresses the interquartile range as a percentage of the median, and this can differ from the raw interquartile range. For Year 11, the female data is negatively skewed (median nearer the upper quartile) and the male data is positively skewed.

Figure 6 shows the box plots using secondary data taken from the child BMI charts shown in

Figures 1 and 2. The median for Year 7 males is less than that for females and the interquartile

range is less. For Year 11, the median for females is larger than males and the interquartile range is

larger. This does not support the hypothesis.

The relative spread is given by

$$\text{Relative spread} = \frac{\text{interquartile range}}{\text{median}} \times 100\%$$

Calculating the relative spread for each set of data gives the results shown in Table 14:

HGS data National data

Year 7 Female 18.5% 16.8%

Year 7 Male 19% 16.3%

Year 11 Female 18.8% 17.1%

Year 11 Male 15.3% 16%

Table 14 – Comparing relative spread using primary and secondary data
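As a rough check of the relative spread formula, the sketch below applies it to the Year 11 female quartiles read from the sorted raw data of Table 15 (positions 7, 14 and 21). Using these unrounded quartiles is an assumption about how the HGS figure was produced.

```python
def relative_spread(iqr, median):
    """Interquartile range expressed as a percentage of the median."""
    return iqr / median * 100


# Year 11 female quartiles taken from the raw BMI data (Table 15)
lq, median, uq = 18.73, 21.01, 22.68

year11_female = relative_spread(uq - lq, median)
print(f"{year11_female:.1f}%")
```

This gives 18.8%, which agrees with the HGS value for Year 11 females in Table 14.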

The relative spread for Year 7 males and females is very close and so supports the hypothesis.

The relative spread for females in Year 11 is more than that for males both for the HGS data and for

the national data and so Hypothesis 1 is not supported.


5 Hypothesis 2: BMI normally distributed

5.1 Justification for approach

The second hypothesis is:

Hypothesis 2: The BMI for HGS students will be normally distributed

The normal distribution is given by

$$p(x) = \frac{1}{\sigma\sqrt{2\pi}} \exp\left[-\frac{(x-\mu)^2}{2\sigma^2}\right]$$

where µ is the mean and σ is the standard deviation. Autograph has been used to draw the normal distribution when µ = 0 and σ = 1 and the result is shown in Figure 7.

Figure 7 – The normal distribution

The normal distribution is a bell shaped curve and so in deciding how to present the data the

particular format used should be able to demonstrate the shape of the distribution. This rules out

using pie-charts but a stem and leaf diagram would provide a quick visual check of the shape.

However, a more thorough check is to work out the area under the distribution and compare it to

that for a normal distribution and to do this means that a histogram must be used.


Using the ‘Find area’ function on Autograph, the area under a curve can be found. This was done to find the area between ±1 standard deviation (±σ = ±1) and this is shown in Figure 8. The area was found to be 0.683. As the total area under the normal distribution is 1, this is 68.3%.

This was done for ±2σ = ±2 and this is shown in Figure 9. The area was found to be 0.955. As the total area under the normal distribution is 1, this is 95.5%.

Figure 8 – Area under normal distribution between ±σ

Figure 9 – Area under normal distribution between ±2σ

If the BMI for HGS students follows a normal distribution then the histogram should have a normal looking shape and the areas under the histograms should be 68% between ±σ and 95.5% between ±2σ, and so for each case the standard deviation needs to be calculated.
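The areas found with Autograph can be cross-checked numerically. The sketch below is not the Autograph method; it assumes the standard relationship between the normal cumulative distribution and the error function, using `math.erf` from the Python standard library.

```python
import math


def normal_cdf(x, mu=0.0, sigma=1.0):
    """Cumulative probability of a normal distribution up to x."""
    return 0.5 * (1.0 + math.erf((x - mu) / (sigma * math.sqrt(2.0))))


def area_between(lo, hi, mu=0.0, sigma=1.0):
    """Area under the normal curve between lo and hi."""
    return normal_cdf(hi, mu, sigma) - normal_cdf(lo, mu, sigma)


within_1sd = area_between(-1, 1)   # between -σ and +σ
within_2sd = area_between(-2, 2)   # between -2σ and +2σ
print(round(within_1sd, 3), round(within_2sd, 3))
```

The two areas agree with the values of 0.683 and 0.955 quoted above to within rounding.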


5.2 Testing the hypothesis

This hypothesis will be tested by drawing the BMI histogram for HGS males and females for both

Year 7 and Year 11 and checking each to see if it is normally distributed.

As an example, hand calculations will be done for Year 11 Females and then Autograph will be

used to make the calculations easier for the others. The raw data for Year 11 Females is shown in

Table 15.

Sample Yr 11 F BMI Sample Yr 11 F BMI

1 21.02 15 22.55

2 19.73 16 19.89

3 18.73 17 18.37

4 21.23 18 21.01

5 22.77 19 26.37

6 18.17 20 22.21

7 16.63 21 18.05

8 17.51 22 29.41

9 20.80 23 20.69

10 22.68 24 19.60

11 20.64 25 22.43

12 25.43 26 26.89

13 17.63 27 24.65

14 21.01

Table 15 – Raw data for Year 11 Female BMI

From the raw data of Table 15, a frequency table was made and this is shown in Table 16.

BMI Frequency

16 ≤ x < 18 3

18 ≤ x < 20 7

20 ≤ x < 22 7

22 ≤ x < 24 5

24 ≤ x < 26 2

26 ≤ x < 28 2

28 ≤ x < 30 1

Table 16 – Frequency Table for Year 11 Female BMI
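The grouping of the raw data into classes can be sketched as follows. This is an illustrative tally, assuming equal class widths of 2 starting at 16 with each class including its lower boundary; the function name is hypothetical.

```python
def frequency_table(values, start, width, n_classes):
    """Count how many values fall in each class start + i*width <= x < start + (i+1)*width."""
    counts = [0] * n_classes
    for v in values:
        index = int((v - start) // width)
        if 0 <= index < n_classes:
            counts[index] += 1
    return counts


# Raw Year 11 female BMI data from Table 15
year11_female_bmi = [
    21.02, 19.73, 18.73, 21.23, 22.77, 18.17, 16.63, 17.51, 20.80,
    22.68, 20.64, 25.43, 17.63, 21.01, 22.55, 19.89, 18.37, 21.01,
    26.37, 22.21, 18.05, 29.41, 20.69, 19.60, 22.43, 26.89, 24.65,
]

frequencies = frequency_table(year11_female_bmi, start=16, width=2, n_classes=7)
print(frequencies)
```

The counts match the frequencies in Table 16 and add up to the 27 samples.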

Using the frequency table, a histogram was hand drawn and this is shown in Figure 10.


Figure 10 – insert hand drawn histogram.


To check if the histogram is like a normal curve the mean and standard deviation are needed. Excel

was used to calculate these using:

$$\text{mean} = \bar{x} = \frac{\sum x}{n}$$

and

$$\text{Standard deviation} = \sigma = \sqrt{\frac{\sum x^2}{n} - \left(\frac{\sum x}{n}\right)^2}$$

x x²

21.02 441.83

19.73 389.14

18.73 350.83

21.23 450.62

22.77 518.62

18.17 330.01

16.63 276.59

17.51 306.47

20.80 432.50

22.68 514.38

20.64 425.80

25.43 646.81

17.63 310.85

21.01 441.32

22.55 508.35

19.89 395.67

18.37 337.29

21.01 441.32

26.37 695.39

22.21 493.12

18.05 325.93

29.41 864.82

20.69 428.08

19.60 384.02

22.43 503.21

26.89 723.20

24.65 607.86

Σx = 576.08 Σx² = 12544.04

Table 17 – Data for determining mean and standard deviation

Using the data in Table 17, the mean and standard deviation are given by

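$$\text{mean} = \bar{x} = \frac{\sum x}{n} = \frac{576.08}{27} = 21.33$$

$$\text{Standard deviation} = \sigma = \sqrt{\frac{\sum x^2}{n} - \left(\frac{\sum x}{n}\right)^2} = \sqrt{\frac{12544.04}{27} - \left(\frac{576.08}{27}\right)^2} = 3.06$$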
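The same calculation can be sketched in code using the totals from Table 17. This is only a check on the arithmetic and is not the Excel workbook that was actually used; the function name is illustrative.

```python
import math


def mean_and_sd(sum_x, sum_x2, n):
    """Mean and standard deviation from the totals Σx and Σx²."""
    mean = sum_x / n
    variance = sum_x2 / n - mean ** 2
    return mean, math.sqrt(variance)


# Totals from Table 17 for the 27 Year 11 female BMI values
mean, sd = mean_and_sd(sum_x=576.08, sum_x2=12544.04, n=27)
print(mean, sd)
```

The results agree with the values of 21.33 and 3.06 to within rounding in the last decimal place.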


If the histogram follows a normal curve then 68% of the values should lie between a BMI of 21.33 ± 3.06. This has been marked on Figure 10 and the shaded area worked out:

Shaded area = (20 − (21.33 − 3.06))×7 + 2×7 + 2×5 + (21.33 + 3.06 − 24)×2 = 36.9

Total area = 2×3 + 2×7 + 2×7 + 2×5 + 2×2 + 2×2 + 2×1 = 54

The percentage of values = shaded area/total area = 36.9/54 = 68.3%

If the histogram follows a normal curve then 95% of the values should lie between 21.33 ± 2 × 3.06. This has been marked on Figure 11 and the shaded area worked out:

Shaded area = 2×3 + 2×7 + 2×7 + 2×5 + 2×2 + (21.3 + 2×3.06 − 26)×2 = 50.84

Total area = 2×3 + 2×7 + 2×7 + 2×5 + 2×2 + 2×2 + 2×1 = 54

The percentage of values = shaded area/total area = 50.84/54 = 94%

This suggests that the BMI distribution for Year 11 Females at HGS follows a normal curve.
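The shaded-area calculation can be sketched as a small function that adds up the part of each bar lying inside a given BMI range. As in the hand calculation, the area of a bar is taken as class width times frequency. The function name is illustrative, and the mean of 21.33 is used throughout, so the ±2σ figure differs slightly from the hand value, which used 21.3.

```python
def histogram_area(edges, freqs, lo, hi):
    """Area of the histogram bars (width x frequency) lying between lo and hi."""
    area = 0.0
    for left, right, f in zip(edges[:-1], edges[1:], freqs):
        overlap = min(right, hi) - max(left, lo)
        if overlap > 0:
            area += overlap * f
    return area


# Classes and frequencies from Table 16
edges = [16, 18, 20, 22, 24, 26, 28, 30]
freqs = [3, 7, 7, 5, 2, 2, 1]
mean, sd = 21.33, 3.06

total = histogram_area(edges, freqs, edges[0], edges[-1])
pct_1sd = 100 * histogram_area(edges, freqs, mean - sd, mean + sd) / total
pct_2sd = 100 * histogram_area(edges, freqs, mean - 2 * sd, mean + 2 * sd) / total
print(total, pct_1sd, pct_2sd)
```

The total area is 54, about 68.3% of it lies within one standard deviation of the mean, and a little over 94% lies within two.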


Figure 11 – Histograms for Year 7 Female BMI

72.2% are between ±1 standard deviation

95.2% are between ±2 standard deviations


Figure 12 – Histograms for Year 7 Male BMI

62.2% are between ±1 standard deviation

96% are between ±2 standard deviations


Figure 13 – Histograms for Year 11 Male BMI

74.1% are between ±1 standard deviation

91.7% are between ±2 standard deviations


5.3 Interpretation of results

The process was repeated using Autograph and the results are shown in Figures 11-13.

Figure 11 shows the histogram for Year 7 female BMI and visually it has the bell shape associated with a normal distribution. The area under the histogram between ±σ is 72.2% and between ±2σ it is 95.2%, which compare well with the 68% and 95.5% expected for a normal distribution, and so this supports the hypothesis that the BMI is normally distributed.

Figure 12 shows the histogram for Year 7 male BMI and visually it is not quite the bell shape associated with a normal distribution since it has positive skew. The area under the histogram between ±σ is 62.2%, which is a notable difference from the expected 68% and reflects the positive skew. Between ±2σ it is 96%, which compares well with the expected 95.5% for a normal distribution. These results do not totally support the hypothesis and suggest that a more rigorous test is required.

Figure 13 shows the histogram for Year 11 male BMI and visually it does not have the bell shape associated with a normal distribution. The area under the histogram between ±σ is 74.1% and between ±2σ it is 91.7%, which do not compare very well with the 68% and 95.5% expected for a normal distribution, and so this does not support the hypothesis that the BMI is normally distributed.

Overall, the results suggest that the Year 7 female BMI and the Year 11 female BMI are normally distributed, the Year 7 male BMI could be but the Year 11 male BMI is not. The tests carried out are basic and so the χ² statistic will be calculated to give a more accurate test.

5.3.1 χ² test statistic

A more accurate test to see if the distribution is normal is the χ² test statistic that is described in Appendix 3. The χ² statistic is:

$$\chi^2 = \sum \frac{(O-E)^2}{E}$$

Where O is the observed value and E is the expected value. The chi-squared statistic is a measure

of the difference between the observed values and the expected values and can be used to see how

close the distribution is to an expected distribution.

In this case, the test is to see whether or not it is a normal distribution and so the null and alternative hypotheses are:

H₀: BMI follows a normal distribution

H₁: BMI does not follow a normal distribution


As the normal distribution is being tested, the mean and standard deviation are required to determine the expected values. From Appendix 3, the number of degrees of freedom is k − p − 1 where k is the number of classes and p is the number of parameters estimated from the sample data used to generate the hypothesised distribution. In this case, p = 2 because the mean and standard deviation are required to determine a normal distribution.

The normal distribution is given by:

$$p(x) = \frac{1}{\sigma\sqrt{2\pi}} \exp\left[-\frac{(x-\mu)^2}{2\sigma^2}\right]$$

Autograph was used to find the area under the curve. For Year 11 Females, µ = 21.33 and σ = 3.06 and so the expected frequency in a particular class can be found by multiplying the number of samples by the probability of the samples being in that class. The probability is the area under the normal distribution curve which can be found using Autograph. For example, the probability of BMI samples being between 20-22 for Year 11 Females is shown in Figure 14.

Figure 14 – Probability that BMI will be between 20-22 for Year 11 Females

The Expected frequency is the number of samples multiplied by the probability, 27 × 0.255 = 6.88. This was repeated for each of the classes and the results are shown in Table 18 along with the Observed frequencies. The Expected values can now be subtracted from the Observed values to determine the Chi-squared statistic.


Body Mass Index

14-16 16-18 18-20 20-22 22-24 24-26 26-28 28-30

Expected 0.88 2.63 5.23 6.88 5.99 3.46 1.32 0.33

Observed 0 3 7 7 5 2 2 1

(O-E)²/E 0.88 0.05 0.60 0.00 0.16 0.61 0.35 1.34

Σ(O-E)²/E = 3.99 Critical Value = 11.07

Table 18 – χ² test for Year 11 Female BMI
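The expected frequencies and the χ² statistic for Year 11 Females can be sketched as below. The report found the areas with Autograph; this sketch instead assumes the error-function form of the normal cumulative distribution (`math.erf`), and the function names are illustrative.

```python
import math


def normal_cdf(x, mu, sigma):
    """Cumulative probability of a normal distribution up to x."""
    return 0.5 * (1.0 + math.erf((x - mu) / (sigma * math.sqrt(2.0))))


def expected_frequency(lo, hi, mu, sigma, n):
    """Number of samples times the probability of lying between lo and hi."""
    return n * (normal_cdf(hi, mu, sigma) - normal_cdf(lo, mu, sigma))


def chi_squared(observed, expected):
    """Sum of (O - E)^2 / E over all classes."""
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))


# Expected frequency for the 20-22 class, Year 11 Females
expected_20_22 = expected_frequency(20, 22, mu=21.33, sigma=3.06, n=27)

# Expected and Observed rows of Table 18
table18_expected = [0.88, 2.63, 5.23, 6.88, 5.99, 3.46, 1.32, 0.33]
table18_observed = [0, 3, 7, 7, 5, 2, 2, 1]
chi2 = chi_squared(table18_observed, table18_expected)
print(expected_20_22, chi2)
```

The expected frequency for the 20-22 class comes out close to 6.88, and the statistic is close to the tabulated 3.99. Small differences arise because the expected values in Table 18 are rounded to two decimal places.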

For Year 11 Females, χ² = 3.99. The smaller this figure is, the closer the observed frequencies are to the expected frequencies. Chi-squared tables are used to work out the significance and to do this, the number of degrees of freedom is needed. In this case there are eight classes and so k = 8, and the number of degrees of freedom is k − p − 1 = 8 − 2 − 1 = 5. The Chi-squared critical values are shown in Table 19 for the 0.05 significance level.

Degrees of Freedom Critical value

1 3.84

2 5.99

3 7.82

4 9.49

5 11.07

6 12.59

7 14.07

8 15.51

9 16.92

10 18.31

Table 19 – Chi-squared critical values at the 0.05 significance level

If the χ² value is less than the critical value then the null hypothesis is not rejected. In this case, when the number of degrees of freedom is 5, the critical value is 11.07 and so the null hypothesis is not rejected.
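The decision rule can be sketched as a lookup against the critical values of Table 19. The dictionary and function names are illustrative, and the default of p = 2 assumes that the mean and standard deviation are both estimated from the sample, as stated above.

```python
# Chi-squared critical values at the 0.05 significance level (Table 19)
CRITICAL_VALUES_5_PERCENT = {
    1: 3.84, 2: 5.99, 3: 7.82, 4: 9.49, 5: 11.07,
    6: 12.59, 7: 14.07, 8: 15.51, 9: 16.92, 10: 18.31,
}


def degrees_of_freedom(k, p=2):
    """k classes, p parameters estimated from the sample."""
    return k - p - 1


def reject_null(chi2, k, p=2):
    """True if the statistic exceeds the critical value for k - p - 1 degrees of freedom."""
    critical = CRITICAL_VALUES_5_PERCENT[degrees_of_freedom(k, p)]
    return chi2 > critical


# Year 11 Females: eight classes, statistic 3.99
year11_female_rejected = reject_null(3.99, k=8)
print(degrees_of_freedom(8), year11_female_rejected)
```

With eight classes there are 5 degrees of freedom, and since 3.99 is below 11.07 the null hypothesis is not rejected.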

This was repeated for Year 11 Males and Year 7 Females and Males with the results shown in

Tables 20-22.

Body Mass Index

16-18 18-20 20-22 22-24 24-26 26-28 28-30

Bin 2 Bin 3 Bin 4 Bin 5 Bin 6 Bin 7 Bin 8

Expected 3.44 6.68 8.41 6.86 3.62 1.24 0.27

Observed 3 10 11 4 1 1 2

(O-E)²/E 0.06 1.65 0.80 1.19 1.90 0.05 11.08

Σ(O-E)²/E = 16.72 Critical Value = 9.49

Table 20 – χ² test for Year 11 Male BMI


Body Mass Index

12-14 14-16 16-18 18-20 20-22 22-24 24-26 26-28

Expected 0.34 1.97 5.66 8.04 5.66 1.97 0.34 0.02

Observed 0 2 6 9 5 1 1 0

(O-E)²/E 0.34 0.00 0.02 0.11 0.08 0.48 1.31 0.02

Σ(O-E)²/E = 2.03 Critical Value = 11.07

Table 21 – χ² test for Year 7 Female BMI

Body Mass Index

12-14 14-16 16-18 18-20 20-22 22-24 24-26

Expected 0.15 1.42 5.77 10.32 8.09 2.78 0.41

Observed 0 1 7 10 7 4 0

(O-E)²/E 0.15 0.12 0.26 0.01 0.15 0.53 0.41

Σ(O-E)²/E = 1.48 Critical Value = 9.49

Table 22 – χ² test for Year 7 Male BMI

Except for Year 11 Male, the χ² value was less than the critical value and so the null hypothesis cannot be rejected, which suggests that the BMI follows the normal distribution.

For Year 11 Male, χ² = 16.72 which is more than the critical value of 9.49 and so the null hypothesis is rejected and the BMI cannot be said to follow a normal distribution. This could be a result of a rogue sample and so the distribution of all of the Year 11 Male data was checked. The histogram is shown in Figure 15.


Figure 15 – Histogram for Year 11 Male BMI using whole population

This Year 11 Male BMI histogram again has a large upper tail and so visually it does not look as if

it follows a normal distribution but this can be checked more accurately using the Chi-squared test.

The results are shown in Table 23.

16-18 18-20 20-22 22-24 24-26 26-28 28-30

Bin 2 Bin 3 Bin 4 Bin 5 Bin 6 Bin 7 Bin 8

Expected 7 14.15 19.25 17.65 10.9 4.54 1.27

Observed 6 22 23 14 3 5 5

(O-E)²/E 0.14 4.35 0.73 0.75 5.73 0.05 10.96

Σ(O-E)²/E = 22.71 Critical value = 9.49

Table 23 – χ² test for Year 11 Male BMI whole population

For Year 11 Male whole population, χ² = 22.71 which is more than the critical value of 9.49 and so the null hypothesis is rejected and the BMI cannot be said to follow a normal distribution.

Figure 16 shows the difference between the expected and the observed and it is clear that there are

large differences and that the distribution is not normal.


Figure 16 – The Expected and Observed histograms for Year 11 Male BMI

The BMI for Year 7 Males and Females and for Year 11 Females follows a normal distribution.

However, the BMI for Year 11 males does not and so Hypothesis 2 is rejected.


6 Hypothesis 3: Correlation

6.1 Justification for approach

The third hypothesis is:

Hypothesis 3: The weights and heights of males will be more strongly correlated than that of

females.

This requires establishing a connection between the height and weight data for both males and

females. This can be done visually by using a scatter diagram since, if there is a connection, the

weight should increase as the height increases. Once the scatter diagram is drawn a line of best fit

can be constructed and if the points are close to this it suggests strong positive correlation and so the

two scatter diagrams will give an immediate visual representation of whether there is a stronger

correlation between male weights and heights or female weights and heights.

The correlation coefficient gives a numerical indication of the strength of the correlation. A correlation coefficient of zero would mean no correlation and a correlation coefficient of one would mean perfect positive correlation, and so if the hypothesis is correct the correlation coefficient for male weights and heights should be closer to one than that for female weights and heights. The method for calculating the correlation coefficient depends upon the shape of the distribution. If it is a normal distribution then Pearson’s method is used; if it is a non-normal distribution then Spearman’s method is used. Prior to deciding on which method to use, a test will therefore have to be carried out to check the shape of the distribution and this is done in section 6.3.

6.2 Testing the hypothesis

Figure 17 shows a hand drawn scatter graph for the weights and heights of Years 7-11 males using

half of the sample set (every other point was used). Visually, there looks to be a positive correlation

since the weight appears to increase with height. Assuming a linear relationship, a line of best fit

can be estimated and drawn through the (mean height, mean weight) point and the gradient and

intercept calculated.

Mean height for Males Years 7-11 = 1.65 m

Mean weight for Males Years 7-11 = 56.4 kg

Using this point, the line of best fit was drawn and this is shown in Figure 17.

The line has the equation:

$$w = mh + c$$

where w is the weight (kg), h is the height (m), m is the gradient and c is the intercept. The gradient can be found by dividing the change in weight by the corresponding change in height:

$$m = \frac{\text{change in weight}}{\text{change in height}} = \frac{72.5 - 40}{1.9 - 1.4} = 65$$


The intercept can be found by selecting a particular (weight, height) point on the graph. Using (40, 1.4):

$$40 = 65 \times 1.4 + c$$

$$c = 40 - 65 \times 1.4 = -51$$

The equation of the line that relates weight to height in Years 7-11 males is

$$w_{\text{Male}} = 65\,h_{\text{Male}} - 51$$
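The gradient and intercept calculation can be sketched as a line through two points read from the graph. The points (1.4 m, 40 kg) and (1.9 m, 72.5 kg) are the ones used in the hand calculation; the function name is illustrative.

```python
def line_through(point_1, point_2):
    """Gradient m and intercept c of the line w = m*h + c through two (h, w) points."""
    (h1, w1), (h2, w2) = point_1, point_2
    m = (w2 - w1) / (h2 - h1)
    c = w1 - m * h1
    return m, c


# Two (height in m, weight in kg) points read from the line of best fit
m, c = line_through((1.4, 40.0), (1.9, 72.5))

# Weight predicted at the mean male height of 1.65 m
predicted = m * 1.65 + c
print(m, c, predicted)
```

The predicted weight at the mean height is close to the mean weight of 56.4 kg, which is what would be expected from a line drawn through the mean point by eye.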

Autograph can be used to do this automatically by using the ‘y on x regression line’. This has been done for the weight versus height scatter graph for the Years 7-11 females and the result is shown in Figure 18. For the females, the weight is related to the height by:

$$w_{\text{Female}} = 90\,h_{\text{Female}} - 92$$

The gradient for the females is larger than for the males, which means that weight increases more rapidly with increases in height.

Visually, the data points appear more scattered around the line of best fit for females than they do for males and so this suggests a weaker correlation for females. This supports the hypothesis but will be checked more rigorously by working out the correlation coefficient.


Figure 17 Insert Hand drawn scatter graph here:


Figure 18 – Scatter graph of weight versus height for HGS females Years 7-11

6.3 Should Pearson or Spearman be used?

The Pearson product-moment correlation coefficient (Appendix 4) is a measure of the correlation

between two variables. It is calculated from

$$r = \frac{n\sum xy - \sum x \sum y}{\sqrt{\left(n\sum x^2 - \left(\sum x\right)^2\right)\left(n\sum y^2 - \left(\sum y\right)^2\right)}}$$

It reflects the degree of linear relationship between two variables. It ranges from +1 to -1. A

correlation of +1 means that there is a perfect positive relationship between the two variables. A

correlation of -1 means that there is a perfect negative relationship. A correlation of 0 means that

there is no linear relationship between the two variables. Pearson’s coefficient assumes that the two variables are normally distributed. In the case of non-normal distributions, Pearson’s correlation coefficient can give misleading results. Also, Pearson’s coefficient is sensitive to outliers.

In the case of non-normal distributions, Spearman’s rank correlation can be used. Spearman’s

coefficient is given by

$$\rho = 1 - \frac{6\sum d^2}{n\left(n^2 - 1\right)}$$


Basically, it differs from Pearson’s correlation only in that the values are converted to ranks before

computing the coefficient. This has the advantage of reducing the effect of outliers. The

disadvantage of Spearman’s is that it is time consuming to rank the data when there is a lot of data.
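Ranking is tedious by hand but straightforward in code. The sketch below illustrates Spearman’s formula on a small set of hypothetical heights and weights invented purely for illustration; they are not HGS data. It assumes there are no tied values, since the formula above is only exact in that case.

```python
def ranks(values):
    """Rank values from 1 (smallest) upward; assumes no tied values."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0] * len(values)
    for rank, index in enumerate(order, start=1):
        result[index] = rank
    return result


def spearman(x, y):
    """Spearman's rank correlation: 1 - 6*sum(d^2) / (n*(n^2 - 1))."""
    n = len(x)
    d_squared = sum((rx - ry) ** 2 for rx, ry in zip(ranks(x), ranks(y)))
    return 1 - 6 * d_squared / (n * (n ** 2 - 1))


# Hypothetical heights (m) and weights (kg), for illustration only
heights = [1.50, 1.55, 1.62, 1.70, 1.78]
weights = [45.0, 43.0, 58.0, 55.0, 66.0]

rho = spearman(heights, weights)
print(rho)
```

Here the squared rank differences add up to 4, so the coefficient is 1 − 24/120 = 0.8.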

To make a decision on whether to use Pearson or Spearman the data needs to be checked to see if it

follows a normal distribution. To do this, Chi-squared tests will be carried out on the following

hypotheses:

Male Heights:

H₀: Male heights follow a normal distribution

H₁: Male heights do not follow a normal distribution

Male Weights:

H₀: Male weights follow a normal distribution

H₁: Male weights do not follow a normal distribution

Female Heights:

H₀: Female heights follow a normal distribution

H₁: Female heights do not follow a normal distribution

Female Weights:

H₀: Female weights follow a normal distribution

H₁: Female weights do not follow a normal distribution

Autograph was used to get a histogram of the Year 7-11 heights and weights and the ‘mean ± 3 standard deviations’ tool was used to put markers on the histogram that give a feel for the distribution. Figure 19 shows the results and the distributions look like normal distributions.


Figure 19 – Male heights and weights for Yrs 7-11


Male Heights (m)

1.3-1.4 1.4-1.5 1.5-1.6 1.6-1.7 1.7-1.8 1.8-1.9 1.9-2.0

Expected 3.04 13.44 33.51 47.20 37.56 16.89 4.28

Observed 1 20 27 46 40 17 4

(O-E)²/E 1.37 3.20 1.26 0.03 0.16 0.00 0.02

Σ(O-E)²/E = 6.04

Table 24 – Chi-squared test for male heights

Table 24 shows the results of the Chi-squared test for male heights. The number of degrees of freedom is k − p − 1 where k = 7 is the number of classes and p is the number of parameters estimated from the sample data used to generate the hypothesised distribution, which in this case is 2 (µ, σ). From Table 19, the critical value for 7 − 2 − 1 = 4 degrees of freedom is 9.49. As the Chi-squared test gave 6.04 < 9.49, the null hypothesis is not rejected and it can be assumed the male heights follow a normal distribution.

Male Weights (kg)

30-40 40-50 50-60 60-70 70-80 80-90 90-100

Expected 11.83 30.49 45.28 38.79 19.17 5.46 0.89

Observed 15 34 40 43 15 7 1

(O-E)²/E 0.85 0.40 0.62 0.46 0.91 0.43 0.01

Σ(O-E)²/E = 3.68

Table 25 – Chi-squared test for male weights

Table 25 shows the results of the Chi-squared test for male weights. From Table 19, the critical value for 7 − 2 − 1 = 4 degrees of freedom is 9.49. As the Chi-squared test gave 3.68 < 9.49, the null hypothesis is not rejected and it can be assumed the male weights follow a normal distribution.

Figure 20 shows the distributions of the female heights and weights for Years 7-11. The female

heights show a negative skew (mean<mode) and the female weights show a positive skew

(mode<mean). Visually, they both look to have normal type distributions but this will be checked

using the Chi-squared test.


Figure 20 – Female heights and weights for Yrs 7-11


Female Heights (m)

1.3-1.4 1.4-1.5 1.5-1.6 1.6-1.7 1.7-1.8 1.8-1.9

Expected 0.44 9.40 47.28 57.58 17.08 1.20

Observed 1 7 45 68 10 2

(O-E)²/E 0.73 0.61 0.11 1.89 2.93 0.53

Σ(O-E)²/E = 6.81

Table 26 – Chi-squared test for female heights

Table 26 shows the results of the Chi-squared test for female heights. From Table 19, the critical value for 6 − 2 − 1 = 3 degrees of freedom is 7.82. As the Chi-squared test gave 6.81 < 7.82, the null hypothesis is not rejected and it can be assumed the female heights follow a normal distribution.

Female Weights (kg)

20-30 30-40 40-50 50-60 60-70 70-80 80-90

Expected 1.97 13.49 38.93 47.69 24.81 5.46 0.51

Observed 1 8 53 43 21 5 1

(O-E)²/E 0.48 2.23 5.09 0.46 0.59 0.00 0.00

Σ(O-E)²/E = 8.37

Table 27 – Chi-squared test for female weights

Table 27 shows the results of the Chi-squared test for female weights. From Table 19, the critical value for 7 − 2 − 1 = 4 degrees of freedom is 9.49. As the Chi-squared test gave 8.37 < 9.49, the null hypothesis is not rejected and it can be assumed the female weights follow a normal distribution.

As each of the Chi-squared tests show that the weights and heights of Years 7 to 11 males and

females can be assumed to follow a normal distribution then the Pearson coefficient is the most

appropriate.


6.4 Calculating the Pearson coefficient

The Pearson product-moment correlation coefficient is calculated from

$$r = \frac{n\sum xy - \sum x \sum y}{\sqrt{\left(n\sum x^2 - \left(\sum x\right)^2\right)\left(n\sum y^2 - \left(\sum y\right)^2\right)}}$$

By letting the height = x and the weight = y, Excel was used to calculate the various summations. For the males, these are:

Σx = 254.7

Σy = 8684

Σx² = 423.7

Σy² = 513718

Σxy = 14559.1

N = 154

The Pearson coefficient for the males is given by:

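$$\begin{aligned}
r_{\text{Males}} &= \frac{154 \times 14559.1 - 254.7 \times 8684}{\sqrt{\left(154 \times 423.7 - 254.7^2\right)\left(154 \times 513718 - 8684^2\right)}} \\
&= \frac{30286.4}{\sqrt{377.7 \times 3700716}} \\
&= \frac{30286.4}{37386.6} \\
&= 0.81
\end{aligned}$$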

For the females,

Σx = 207.17

Σy = 6749.74

Σx² = 333.4173

Σy² = 365923.38

Σxy = 10903.052

N = 129

The Pearson coefficient for the females is given by:


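$$\begin{aligned}
r_{\text{Females}} &= \frac{129 \times 10903 - 207.2 \times 6749.7}{\sqrt{\left(129 \times 333.42 - 207.2^2\right)\left(129 \times 365923.4 - 6749.7^2\right)}} \\
&= \frac{8150}{\sqrt{91.42 \times 1645126.5}} \\
&= \frac{8150}{12263.7} \\
&= 0.665
\end{aligned}$$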
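Both coefficients can be checked from the summations quoted above. The sketch below applies the Pearson formula directly to those totals; it is a check on the arithmetic and not the Excel workbook itself, and the function name is illustrative.

```python
import math


def pearson_from_sums(n, sum_x, sum_y, sum_x2, sum_y2, sum_xy):
    """Pearson's r from the totals n, Σx, Σy, Σx², Σy² and Σxy."""
    numerator = n * sum_xy - sum_x * sum_y
    denominator = math.sqrt(
        (n * sum_x2 - sum_x ** 2) * (n * sum_y2 - sum_y ** 2)
    )
    return numerator / denominator


# Totals for the males (x = height in m, y = weight in kg)
r_males = pearson_from_sums(
    n=154, sum_x=254.7, sum_y=8684,
    sum_x2=423.7, sum_y2=513718, sum_xy=14559.1,
)

# Totals for the females
r_females = pearson_from_sums(
    n=129, sum_x=207.17, sum_y=6749.74,
    sum_x2=333.4173, sum_y2=365923.38, sum_xy=10903.052,
)

print(round(r_males, 2), round(r_females, 3))
```

The results agree with the hand calculations of 0.81 for males and 0.665 for females to within rounding.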

6.5 Interpretation of results

The scatter diagrams for both males (Figure 17) and females (Figure 18) show positive correlation

and so suggest that the weight increases as the height increases. Lines of best fit were calculated

and are given by:

$$w_{\text{Male}} = 65\,h_{\text{Male}} - 51$$

$$w_{\text{Female}} = 90\,h_{\text{Female}} - 92$$

The gradient for the female weight height relationship is greater than that of the male weight height

relationship which means that the weight increases at a more rapid rate with height for females than

it does for males.

Chi-squared tests were carried out to check whether or not the distributions of weights and heights

were normally distributed. The results of the Chi-squared tests supported in each case that the

distributions were normal and so the Pearson correlation coefficient was used rather than

Spearman’s since Spearman’s is used for non-normal distributions.

The Pearson coefficient for males is $r_{\text{Males}} = 0.81$ and the Pearson coefficient for females is $r_{\text{Females}} = 0.665$. Both are greater than 0.5 and so there is strong positive correlation between the weights and heights. Since $r_{\text{Males}} > r_{\text{Females}}$ it can be concluded that the weights and heights of males are more strongly correlated than the weights and heights of females and this supports Hypothesis 3.


7 Is obesity a problem at HGS?

Using the charts of Figures 1 and 2, the BMI for being obese and overweight can be found. For

Year 7, the average age is assumed to be 11.5 years, for Year 8 it will be 12.5 and so on. From the

charts, Table 28 was produced.

Overweight Obese

Year 7 Female 21.6 24.3

Year 7 Male 23 25.3

Year 11 Female 21.6 23

Year 11 Male 23.5 26.4

Table 28 – Overweight and obesity BMI thresholds

Using the values in Table 28, Autograph can be used to find the area under the histogram and so

work out the percentage of HGS students that are overweight and obese. Figure 21 shows an

example of this for Year 11 Males. By working out the area under the histogram for BMI greater

than 26.4 then the percentage of obese males can be found.

Figure 21 – Using Autograph to work out the percentage of obese Males for Year 11

8.8% obese


Overweight Obese

Year 7 Female 9% 3.5%

Year 7 Male 11.7% 6.9%

Year 11 Female 11.5% 5.6%

Year 11 Male 6.9% 8.8%

Table 29 – Percentage of overweight and obese students at HGS

This was done for Year 7 Males and Females and Year 11 Females and the results are shown in

Table 29. These percentages are a lot less than the national figures shown in Table 1 and so it can

be concluded that overweight/obesity is not a problem at HGS.


8 Conclusions

This report investigated the relationship between weight, height, body mass index and obesity in

children. It did this by making use of primary data collected from Year 7 to Year 11 students at

Heckmondwike Grammar School.

Section 2 looked at collecting data, types of data, questionnaire design and sampling options. A

stratified sample using random sampling was used in this report. This meant equal chances for

being picked for each of the Year 7 to Year 11 students.

Section 3 looked at how many samples to take and how to get rid of outliers. Taking too few

samples will give wrong information, taking too many means more calculations and so 40% of the

population was used. Excel was used to generate random numbers to take the samples. Excel was

also used to find outliers by working out the interquartile range, multiplying it by 1.5 and finding

out if any samples were more than this beyond the upper and lower quartile. In some cases the data

was missing (eg missing weight data), in some cases the data wasn’t correct (eg decimal point

missing or wrong data). The data was cleaned up in section 3.

Section 4 looked at Hypothesis 1

Hypothesis 1: The BMI will be less spread for Year 11 females than Year 11 males but will

have a similar spread for Year 7 males and females.

This hypothesis was proposed because girls face more media pressure to stay thin than boys, for example from the models and actresses they see. The hypothesis was tested by using box plots to look at the spread and working out the relative spread. The relative spread for Year 7 Female and Year 7 Male was 18.5% and 19% respectively and so they were similar. However, the relative spread for Year 11 Female was 18.8%, which was noticeably larger than the relative spread of 15.3% for Year 11 Male, whereas the hypothesis proposed that it would be smaller, and so the hypothesis is not supported.

Section 5 looked at Hypothesis 2.

Hypothesis 2: The BMI will be normally distributed

The BMI depends upon weight and height and it is well known that these tend to have a normal

distribution. As such, it is expected that the BMI will also have a normal distribution. The

hypothesis was tested by drawing the histograms for Year 7 Females and Males and Year 11

Females and Males and then working out the areas within $\pm\sigma$ and $\pm 2\sigma$ of the mean.
For a normal distribution these should be 68% and 95%. All were close to this but Year 11 Males
was not. This is just a basic test and so a more accurate test, the $\chi^2$ test, was used. This
showed that Year

7 Females and Males and Year 11 Females followed a normal distribution but Year 11 males did

not. Since Year 11 Males do not have a normal distribution the hypothesis is not correct.
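The basic 68% and 95% check described above can be sketched in Python. This is a rough illustration only: the function name `fraction_within` is hypothetical, the population standard deviation is assumed (the report does not say which form was used), and the counts are taken directly from the data rather than from histogram areas. The values are the Year 11 female BMIs listed in Appendix 2.

```python
from statistics import mean, pstdev

def fraction_within(values, k):
    """Fraction of values lying within k standard deviations of the mean."""
    m = mean(values)
    s = pstdev(values)  # population standard deviation (an assumption)
    inside = [v for v in values if abs(v - m) <= k * s]
    return len(inside) / len(values)

# Year 11 female BMI values as listed in Appendix 2.
year11_female_bmi = [
    21.0, 19.7, 18.7, 21.2, 22.8, 18.2, 16.6, 17.5, 20.8,
    22.7, 20.6, 25.4, 17.6, 21.0, 22.5, 19.9, 18.4, 21.0,
    26.4, 22.2, 18.1, 29.4, 20.7, 19.6, 22.4, 26.9, 24.7,
]
# For normally distributed data these should be near 0.68 and 0.95.
print(fraction_within(year11_female_bmi, 1))
print(fraction_within(year11_female_bmi, 2))
```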


Section 6 looked at Hypothesis 3.

Hypothesis 3: The weights and heights of Years 7-11 males will be more strongly correlated

than that of Years 7-11 females.

Because of the pressure on females to have a smaller clothes size, it is expected that there will be
less correlation between weight and height for females than for males. Scatter diagrams were drawn
to test this. Both showed positive correlation, which suggests that weight increases with height for
both males and females. A line of best fit was drawn for each case and the points appeared to be
closer to the line of best fit for the males than for the females, suggesting a stronger correlation.
To test the correlation either the Pearson or the Spearman coefficient can be used. To decide which
one, the weights and heights of both Males and Females were checked to see whether they follow a
normal distribution, since this is needed in order to use Pearson.

As the distributions were normal, the Pearson coefficient was calculated, giving
$r_{\text{Males}} = 0.81$ for Males and $r_{\text{Females}} = 0.665$ for Females. As the
coefficient for Males was larger, there is a stronger correlation for Males. This
means the hypothesis is supported.
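The two coefficients can be checked from the column totals listed in Appendix 4. The sketch below uses the standard sums form of the Pearson coefficient; the function name `pearson_from_sums` is hypothetical, and the report itself appears to have done this calculation in a spreadsheet.

```python
from math import sqrt

def pearson_from_sums(n, sum_x, sum_y, sum_x2, sum_y2, sum_xy):
    """Pearson's r from N and the totals of x, y, x squared, y squared and xy."""
    numerator = n * sum_xy - sum_x * sum_y
    denominator = sqrt((n * sum_x2 - sum_x ** 2) * (n * sum_y2 - sum_y ** 2))
    return numerator / denominator

# Totals as listed in Appendix 4 (x = height in m, y = weight in kg).
r_males = pearson_from_sums(154, 254.695, 8684, 423.70938, 513718, 14559.115)
r_females = pearson_from_sums(129, 207.17, 6749.74, 333.4173, 365923.38, 10903.052)
print(round(r_males, 3), round(r_females, 3))
```

Because the totals in the appendix are rounded, the results may differ from the quoted coefficients in the later decimal places.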

As for the question, “Is obesity a problem at HGS?”, the percentage of overweight and obese Year 7
and Year 11 males and females was worked out and found to be well below the national figures, so
obesity does not appear to be a problem at HGS.

8.1 How the project could be improved

The first improvement is the design of the questionnaire. As BMI is being used in this project, a

disadvantage of the questionnaire is that it gives the student the option of not completing their

weight. Without this, the BMI cannot be calculated. The other problem with the questionnaire is it

does not indicate the accuracy required for the height and weight and this can lead to inaccuracies in

the BMI calculation. For example, if someone has a height of 1.64 m and a weight of 57.6 kg then

their BMI would be

$$\text{BMI} = \frac{57.6}{1.64^2} = 21.42$$

But if on the questionnaire they rounded their height down to 1.6 m and their weight up to 58 kg,
the BMI would be calculated as 22.66, which is an overestimate of the BMI by about 5.8%. An
alternative questionnaire that might lead to better data is shown below.

Figure ? – Questionnaire for collecting the data

GCSE Statistics Data Collection

This data collection sheet is designed to be anonymous. Please fill in the details as

accurately as possible. Please remove your shoes before being measured and

weighed. Fold the completed slip and pass to your teacher. Thank you.

Year 10

Please circle: Male Female

Height: m (to the nearest cm eg 1.48 m)

Weight: kg (to the nearest 100 g eg 54.3 kg)


This questionnaire emphasises that it is anonymous, so students know that they can be honest about
their weight. It also gives examples, which will guide students to give more accurate height and
weight information.
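The effect of rounding on the BMI, discussed under the first improvement, can be checked with a short calculation. The function name `bmi` is hypothetical; the heights and weights are the ones used in the example (1.64 m and 57.6 kg, rounded to 1.6 m and 58 kg).

```python
def bmi(weight_kg, height_m):
    """Body mass index: weight in kg divided by the square of height in m."""
    return weight_kg / height_m ** 2

exact = bmi(57.6, 1.64)    # values as measured
rounded = bmi(58, 1.6)     # values as a student might round them
overestimate_pct = (rounded - exact) / exact * 100
print(f"exact {exact:.2f}, rounded {rounded:.2f}, overestimate {overestimate_pct:.1f}%")
```

Rounding the height has the larger effect because the height is squared in the formula.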

The second improvement is the way the second hypothesis was worded. Currently it is:

Hypothesis 2: The BMI will be normally distributed

This was ambiguous and could be interpreted in a number of ways. Does it mean all students?
Does it mean students in each year? What was meant was that the BMI of males and of females in
each year would follow a normal distribution. That is why tests were carried out for Year 7 males
and females and Year 11 males and females: they represent the extremes of the age range. Because
of the wording, an alternative test would have been to group the data for males and females of all
years together and check whether it was normally distributed, and this would also have been valid.
A better wording would be:

Hypothesis 2: The BMI for both male and female students in each year group will be

normally distributed

8.2 Limitations of the project

Each of the hypotheses was tested based on taking a 40% sample of the relevant population of the

730 students at HGS. Although this was considered to be enough to be representative of the whole

population of 730 students it is possible that a different 40% sample could lead to different results.

The samples were taken from Heckmondwike Grammar School and may not be representative of
the national population. If the samples were taken from a different school, the results could be
completely different because of the make-up of that school. For example, Heckmondwike
Grammar School takes students from the top 5% of attainment levels on entry and only has 3% of
students on school meals. At Fartown High School, attainment levels on entry are below the
national average, 50% of students are on school meals and 35% have special educational needs. If
the samples were taken from Fartown High School, the results could be completely different.

The University of Washington (source: ScienceDaily.com 29 August 2007) has shown that
geography determines obesity. Researchers found that in the Seattle Metropolitan area, obesity
levels reached 30% in the most deprived areas but were only around 5% in the most affluent. This
suggests that samples from schools that take their students from deprived areas could give
completely different results from samples from schools that take their students from affluent
areas. This could also explain why, in section 7, the figures show that obesity is not a problem
at HGS.


9 Bibliography

1. Pledger, K., Cole, G., Jolly, P., Newman, G., Petran, J. & Bright, S. (2006). Edexcel GCSE

Mathematics, Heinemann

2. Job, B. & Morley, D. (2003). Key Maths GCSE - Statistics AQA Version, Nelson Thornes

3. Zip codes and property values predict obesity rates,

http://www.sciencedaily.com/releases/2007/08/070829090143.htm

4. Health Survey for England 2004: Updating of trend tables to include childhood obesity data

http://www.ic.nhs.uk/pubs/hsechildobesityupdate


Appendix 1 – Huddersfield Examiner Article


Appendix 2 – The sampled data

Year 7:

Number Random Sample Year Group Gender Height (m) Weight (kg) BMI

1 1 7 f 1.62 44.5 17.0

2 3 7 f 1.535 54 22.9

3 5 7 f 1.57 51 20.7

4 6 7 f 1.48 44.5 20.3

5 12 7 f 1.53 38 16.2

6 13 7 f 1.41 37.5 18.9

7 19 7 F 1.68 56 19.8

8 20 7 F 1.54 50 21.1

9 21 7 F 1.38 27 14.2

10 22 7 F 1.6 47 18.4

11 24 7 F 1.53 48 20.5

12 29 7 F 1.51 42 18.4

13 30 7 F 1.51 42 18.4

14 31 7 F 1.52 39 16.9

15 38 7 F 1.4 34 17.3

16 39 7 F 1.52 43 18.6

17 42 7 F 1.52 42 18.2

18 43 7 F 1.44 37 17.8

19 49 7 F 1.54 61 25.7

20 53 7 F 1.5 41 18.2

21 54 7 F 1.6 37 14.5

22 56 7 F 1.52 49 21.2

23 65 7 f 1.65 44.5 16.3

24 68 7 f 1.535 42.5 18.0

1 1 7 m 1.63 52 19.6

2 4 7 m 1.545 54.5 22.8

3 6 7 m 1.38 30 15.8

4 7 7 m 1.54 50.5 21.3

5 10 7 m 1.45 39 18.5

6 12 7 m 1.54 53.5 22.6

7 13 7 m 1.56 42.5 17.5

8 15 7 m 1.56 53.5 22.0

9 16 7 m 1.47 50.5 23.4

10 17 7 M 1.41 45 22.6

11 19 7 M 1.5 49 21.8

12 21 7 M 1.49 42 18.9

13 24 7 M 1.52 43 18.6

14 25 7 M 1.48 39 17.8

15 29 7 M 1.41 40 20.1

16 31 7 M 1.5 48 21.3

17 32 7 M 1.61 49 18.9

18 36 7 M 1.55 48 20.0

19 42 7 M 1.52 42 18.2

20 44 7 M 1.48 36 16.4

21 45 7 M 1.58 48 19.2

22 47 7 M 1.61 54 20.8

23 51 7 M 1.45 37 17.6

25 53 7 M 1.43 38 18.6

26 55 7 M 1.41 32 16.1

27 56 7 M 1.44 35 16.9

28 57 7 M 1.7 61 21.1

29 58 7 M 1.53 43 18.4


30 62 7 M 1.43 34 16.6

Year 7: Female = 24, mean h = 1.526667, SD = 0.1

Male = 30, mean h = 1.507759, SD = 0.1

Total = 54

Height Weight BMI

Female: Min= 1.380 27.000 14.18

LQ = 1.528 38.750 17.25

Median= 1.525 42.750 18.39

UQ = 1.548 48.250 20.36

Maximum= 1.680 61.000 25.72

IQR = 0.020 9.500 3.11

Lout = 1.498 24.500 12.58

Uout = 1.578 62.500 25.03

Height Weight BMI

Male: Min= 1.380 30.000 15.8

LQ = 1.450 39.000 17.8

Median= 1.500 43.000 18.9

UQ = 1.550 50.500 21.3

Maximum= 1.700 61.000 23.4

IQR = 0.100 11.500 3.5

Lout = 1.300 21.750 12.6

Uout = 1.700 67.750 26.5

Outliers:

24 52 7 M 1.76 88 28.4

Year 8:

Number Random Sample Year Group Gender Height (m) Weight (kg) BMI

1 1 8F 1.65 69 25.3

2 3 8F 1.67 61.5 22.1

3 4 8F 1.74 65 21.5

4 10 8F 1.67 67 24.0

5 11 8F 1.6 45 17.6

6 14 8F 1.64 46 17.1

7 19 8F 1.58 33 13.2

8 24 8F 1.54 43 18.1

9 27 8F 1.62 42 16.0

10 30 8F 1.59 48 19.0

11 32 8F 1.65 49.5 18.2

12 34 8F 1.58 50 20.0

13 35 8F 1.59 56.5 22.3

14 38 8F 1.55 49.5 20.6

16 43 8f 1.69 57 20.0

17 44 8f 1.66 61.5 22.3

18 46 8f 1.61 51 19.7

19 48 8f 1.625 75.5 28.6

20 50 8f 1.59 41 16.2

21 51 8f 1.47 43 19.9

22 53 8F 1.65 46 16.9


23 54 8F 1.55 43.5 18.1

24 57 8F 1.53 42 17.9

25 59 8F 1.65 56.5 20.8

1 1 8M 1.59 39 15.4

2 2 8M 1.45 40 19.0

3 5 8M 1.7 70 24.2

4 9 8M 1.54 42 17.7

5 11 8M 1.72 71 24.0

6 15 8M 1.58 45.5 18.2

7 16 8M 1.65 51 18.7

8 19 8M 1.53 43 18.4

9 20 8M 1.68 68 24.1

10 21 8M 1.44 35.5 17.1

11 22 8M 1.7 81 28.0

12 23 8M 1.72 57.5 19.4

13 25 8M 1.63 48.5 18.3

14 28 8M 1.57 42 17.0

15 29 8M 1.45 37 17.6

16 34 8M 1.62 45 17.1

17 38 8M 1.57 52 21.1

18 40 8M 1.56 54 22.2

19 45 8M 1.55 52 21.6

20 47 8M 1.63 49 18.4

21 50 8m 1.61 55 21.2

22 55 8m 1.495 51 22.8

23 56 8m 1.45 34 16.2

24 57 8m 1.45 47 22.4

25 58 8m 1.62 63 24.0

26 60 8m 1.55 39.5 16.4

27 67 8M 1.71 57 19.5

28 68 8M 1.5 48 21.3

29 70 8M 1.64 57 21.2

30 74 8M 1.48 40 18.3

31 76 8M 1.7 62 21.5

32 78 8M 1.63 46 17.3

33 80 8M 1.69 61.5 21.5

34 81 8M 1.53 41 17.5

35 85 8M 1.43 37 18.1

Year 8: Female = 25, mean h = 1.6122917, SD = 0.1

Male = 35, mean h = 1.5788333, SD = 0.1

Total = 60

Height Weight BMI

Female: Min= 1.470 33.000 13.219

LQ = 1.580 43.375 17.851

Median= 1.615 49.500 19.787

UQ = 1.650 58.125 21.615

Maximum= 1.740 75.500 28.592

IQR = 0.070 14.750 3.76

Lout = 1.475 21.250 12.21

Uout = 1.755 80.250 27.26

Height Weight BMI

Male: Min= 1.430 34.000 15.427


LQ = 1.515 41.500 17.654

Median= 1.580 48.500 19.025

UQ = 1.645 57.000 21.588

Maximum= 1.720 81.000 28.028

IQR = 0.130 15.500 3.9

Lout = 1.320 18.250 11.8

Uout = 1.840 80.250 27.5

Outliers:

15 42 8f 1.05 60.5 54.9

Year 9:

Number Random Sample Year Group Gender Height (m) Weight (kg) BMI

1 3 9F 1.54 41.5 17.5

2 4 9F 1.67 66.5 23.8

3 5 9F 1.6 48 18.8

4 6 9F 1.74 64.5 21.3

5 8 9F 1.6 46 18.0

6 11 9F 1.6 43 16.8

7 12 9F 1.58 50 20.0

8 13 9F 1.59 55 21.8

9 18 9F 1.62 57.5 21.9

10 19 9F 1.63 60.1 22.6

11 22 9F 1.62 44.5 17.0

12 23 9F 1.63 50 18.8

13 24 9F 1.64 46 17.1

14 25 9F 1.6 52 20.3

15 26 9F 1.6 48.5 18.9

16 37 9F 1.7 57.5 19.9

17 38 9F 1.65 67.5 24.8

18 46 9F 1.57 38.5 15.6

19 49 9F 1.57 48 19.5

20 50 9F 1.66 67.5 24.5

21 51 9F 1.55 47 19.6

22 52 9F 1.65 60 22.0

23 55 9f 1.71 59.5 20.3

24 57 9f 1.6 51.5 20.1

25 60 9f 1.65 65 23.9

26 61 9f 1.56 42 17.3

27 63 9f 1.56 41 16.8

1 3 9M 1.69 53.5 18.7

2 9 9M 1.73 71 23.7

3 14 9M 1.59 52.5 20.8

4 20 9M 1.65 50 18.4

5 21 9M 1.6 44 17.2

6 23 9M 1.72 60.5 20.5

7 24 9M 1.66 58.5 21.2

8 26 9M 1.65 55.5 20.4

9 27 9M 1.7 70 24.2

10 28 9M 1.71 75 25.6

11 29 9M 1.77 71.5 22.8

12 33 9M 1.6 56 21.9

13 35 9M 1.57 60 24.3


14 36 9M 1.68 61.5 21.8

15 37 9M 1.79 85 26.5

16 39 9M 1.67 50 17.9

17 41 9M 1.51 40 17.5

18 42 9M 1.69 52 18.2

19 47 9M 1.86 80.5 23.3

20 49 9M 1.62 47 17.9

21 50 9M 1.7 64 22.1

22 51 9M 1.6 57 22.3

24 54 9M 1.67 62 22.2

25 55 9M 1.7 49.5 17.1

26 57 9M 1.55 46 19.1

27 62 9M 1.88 61 17.3

28 66 9m 1.72 67 22.6

29 68 9m 1.73 72 24.1

30 70 9m 1.65 60 22.0

31 74 9m 1.67 54 19.4

32 76 9m 1.64 61 22.7

Year 9: Female = 27, mean h = 1.618148, SD = 0.0498

Male = 32, mean h = 1.674138, SD = 0.0831

Total = 59

Height Weight BMI

Female: Min= 1.540 38.500 15.619

LQ = 1.585 46.000 17.734

Median= 1.600 50.000 19.896

UQ = 1.650 59.750 21.833

Maximum= 1.740 67.500 24.793

IQR = 0.065 13.750 4.10

Lout = 1.488 25.375 11.59

Uout = 1.748 80.375 27.98

Height Weight BMI

Male: Min= 1.510 40.000 17.128

LQ = 1.625 52.125 18.561

Median= 1.670 60.000 21.790

UQ = 1.715 65.500 22.751

Maximum= 1.880 85.000 26.529

IQR = 0.090 13.375 4.2

Lout = 1.490 32.063 12.3

Uout = 1.850 85.563 29.0

Outliers:

23 52 9M 1.31 75 43.7

Year 10:

Number Random Sample Year Group Gender Height (m) Weight (kg) BMI

1 2 10f 1.64 61 22.7

2 3 10f 1.6 47 18.4

3 4 10f 1.55 46 19.1

4 6 10f 1.61 56.5 21.8

5 7 10f 1.55 46 19.1


6 9 10f 1.67 57 20.4

7 12 10f 1.68 52.5 18.6

8 17 10f 1.65 60 22.0

9 18 10f 1.7 69 23.9

10 19 10f 1.69 66 23.1

11 20 10f 1.59 58 22.9

12 22 10f 1.65 56 20.6

13 23 10f 1.79 57 17.8

14 24 10f 1.67 46 16.5

15 28 10f 1.615 48.5 18.6

16 34 10f 1.83 79 23.6

17 38 10f 1.62 50 19.1

18 40 10f 1.62 54 20.6

19 41 10f 1.46 56 26.3

20 43 10f 1.51 52 22.8

21 45 10f 1.55 55 22.9

22 46 10f 1.53 47 20.1

23 48 10f 1.63 48.5 18.3

24 50 10f 1.62 57.5 21.9

25 52 10f 1.63 46 17.3

26 53 10f 1.65 70.14 25.8

27 58 10f 1.59 47.5 18.8

1 5 10m 1.62 44 16.8

3 10 10m 1.79 66 20.6

4 11 10m 1.77 73 23.3

5 12 10m 1.63 46 17.3

6 13 10m 1.76 55 17.8

7 14 10m 1.79 56 17.5

8 18 10m 1.63 56 21.1

9 20 10m 1.66 64 23.2

10 25 10m 1.82 71 21.4

11 26 10m 1.77 65 20.7

12 28 10m 1.69 68 23.8

13 34 10m 1.73 53 17.7

14 37 10m 1.8 65 20.1

15 38 10m 1.745 49.5 16.3

16 42 10m 1.65 63 23.1

17 43 10m 1.67 54 19.4

18 44 10m 1.62 54 20.6

19 45 10m 1.6 46 18.0

20 46 10m 1.81 62 18.9

23 55 10m 1.67 62 22.2

24 56 10m 1.67 60 21.5

25 60 10m 1.66 57.5 20.9

26 62 10m 1.88 81 22.9

27 64 10m 1.73 54 18.0

28 67 10m 1.8 65 20.1

29 71 10m 1.79 67 20.9

Year 10: Female = 28, mean h = 1.628393, SD = 0.0778

Male = 29, mean h = 1.721346, SD = 0.0774

Total = 57

Height Weight BMI

Female: Min= 1.460 46.000 16.494

LQ = 1.590 48.250 18.742

Median= 1.625 55.500 20.573


UQ = 1.670 58.500 22.905

Maximum= 1.830 79.000 29.412

IQR = 0.080 10.250 4.16

Lout = 1.470 32.875 12.50

Uout = 1.790 73.875 29.15

Height Weight BMI

Male: Min= 1.600 44.000 16.256

LQ = 1.660 54.000 17.987

Median= 1.730 61.000 20.587

UQ = 1.790 65.000 21.494

Maximum= 1.880 81.000 23.809

IQR = 0.130 11.000 3.5

Lout = 1.465 37.500 12.7

Uout = 1.985 81.500 26.8

Outliers:

2 6 10m 1.51 32 14.0

28 61 10f 1.7 85 29.4

22 53 10m 1.37 67 35.7

21 51 10m 1.83 95 28.4

Year 11:

Number Random Sample Year Group Gender Height (m) Weight (kg) BMI

1 5 11 F 1.55 50.5 21.0

2 6 11 F 1.6 50.5 19.7

3 8 11 F 1.55 45 18.7

4 11 11 F 1.55 51 21.2

5 13 11 F 1.65 62 22.8

6 17 11 F 1.74 55 18.2

7 23 11 F 1.69 47.5 16.6

8 24 11 F 1.69 50 17.5

9 25 11 F 1.67 58 20.8

10 26 11 F 1.64 61 22.7

11 27 11 F 1.64 55.5 20.6

12 29 11 F 1.7 73.5 25.4

13 30 11 F 1.65 48 17.6

14 31 11 F 1.69 60 21.0

15 32 11 F 1.59 57 22.5

16 38 11 F 1.64 53.5 19.9

17 39 11 F 1.65 50 18.4

18 42 11 F 1.69 60 21.0

19 45 11 F 1.57 65 26.4

20 49 11 F 1.63 59 22.2

21 50 11 F 1.57 44.5 18.1

22 53 11 F 1.72 87 29.4

23 56 11 F 1.71 60.5 20.7

24 58 11 F 1.66 54 19.6

25 62 11 F 1.58 56 22.4

26 63 11 F 1.67 75 26.9

27 65 11 F 1.56 60 24.7

2 4 11 M 1.84 62 18.3


3 7 11 M 1.95 65 17.1

4 8 11 M 1.9 67 18.6

5 9 11 M 1.8 67 20.7

6 10 11 M 1.78 69.5 21.9

7 16 11 M 1.68 80 28.3

8 17 11 M 1.77 60 19.2

9 27 11 M 1.76 83 26.8

10 28 11 M 1.71 52 17.8

11 30 11 M 1.83 66 19.7

12 34 11 M 1.64 76 28.3

13 37 11 M 1.8 70 21.6

14 38 11 M 1.9 61 16.9

15 40 11 M 1.66 61.5 22.3

16 41 11 M 1.78 61.5 19.4

17 42 11 M 1.65 59 21.7

18 45 11 M 1.83 68 20.3

19 46 11 M 1.73 57 19.0

21 51 11 M 1.71 59 20.2

22 53 11 M 1.79 81 25.3

23 55 11 M 1.81 64.5 19.7

24 59 11 M 1.8 73.5 22.7

25 60 11 M 1.86 73 21.1

26 62 11 M 1.7 62 21.5

27 63 11 M 1.78 73.5 23.2

28 64 11 M 1.78 61.5 19.4

29 65 11 M 1.79 65.5 20.4

30 68 11 M 1.79 59 18.4

31 73 11 M 1.68 64 22.7

32 74 11 M 1.9 77.5 21.5

33 79 11 M 1.88 66 18.7

34 82 11 M 1.77 68 21.7

Year 11: Female = 27, mean h = 1.638889, SD = 0.0574

Male = 34, mean h = 1.782813, SD = 0.0782

Total = 61

Height Weight BMI

Female: Min= 1.550 44.500 16.631

LQ = 1.585 50.500 19.163

Median= 1.650 56.000 21.008

UQ = 1.690 60.250 22.613

Maximum= 1.740 87.000 29.408

IQR = 0.105 9.750 3.45

Lout = 1.428 35.875 13.99

Uout = 1.848 74.875 27.79

Height Weight BMI

Male: Min= 1.640 52.000 16.898

LQ = 1.725 61.500 19.125

Median= 1.785 65.750 20.561

UQ = 1.830 70.750 22.031

Maximum= 1.950 83.000 28.345

IQR = 0.105 9.250 2.9

Lout = 1.568 47.625 14.8

Uout = 1.988 84.625 26.4


Outlier:

20 48 11 M 5.7 179 5.5

1 1 11 M 1.85 120 35.1


Appendix 3 – Chi-squared test


Appendix 4 – Pearson coefficient

HGS Male Pearson calculation:

x = Height (m)   y = Weight (kg)   x²   y²   xy

1.38 30 1.9044 900 41.4

1.41 45 1.9881 2025 63.45

1.41 40 1.9881 1600 56.4

1.41 32 1.9881 1024 45.12

1.43 38 2.0449 1444 54.34

1.43 37 2.0449 1369 52.91

1.43 34 2.0449 1156 48.62

1.44 35.5 2.0736 1260.25 51.12

1.44 35 2.0736 1225 50.4

1.45 47 2.1025 2209 68.15

1.45 40 2.1025 1600 58

1.45 39 2.1025 1521 56.55

1.45 37 2.1025 1369 53.65

1.45 37 2.1025 1369 53.65

1.45 34 2.1025 1156 49.3

1.47 50.5 2.1609 2550.25 74.235

1.48 40 2.1904 1600 59.2

1.48 39 2.1904 1521 57.72

1.48 36 2.1904 1296 53.28

1.49 42 2.2201 1764 62.58

1.495 51 2.235025 2601 76.245

1.5 49 2.25 2401 73.5

1.5 48 2.25 2304 72

1.5 48 2.25 2304 72

1.51 40 2.2801 1600 60.4

1.52 43 2.3104 1849 65.36

1.52 42 2.3104 1764 63.84

1.53 43 2.3409 1849 65.79

1.53 43 2.3409 1849 65.79

1.53 41 2.3409 1681 62.73

1.54 53.5 2.3716 2862.25 82.39

1.54 50.5 2.3716 2550.25 77.77

1.54 42 2.3716 1764 64.68

1.545 54.5 2.387025 2970.25 84.2025

1.55 52 2.4025 2704 80.6

1.55 48 2.4025 2304 74.4

1.55 46 2.4025 2116 71.3

1.55 39.5 2.4025 1560.25 61.225

1.56 54 2.4336 2916 84.24

1.56 53.5 2.4336 2862.25 83.46

1.56 42.5 2.4336 1806.25 66.3

1.57 60 2.4649 3600 94.2

1.57 52 2.4649 2704 81.64

1.57 42 2.4649 1764 65.94

1.58 48 2.4964 2304 75.84

1.58 45.5 2.4964 2070.25 71.89

1.59 52.5 2.5281 2756.25 83.475


1.59 39 2.5281 1521 62.01

1.6 57 2.56 3249 91.2

1.6 56 2.56 3136 89.6

1.6 46 2.56 2116 73.6

1.6 44 2.56 1936 70.4

1.61 55 2.5921 3025 88.55

1.61 54 2.5921 2916 86.94

1.61 49 2.5921 2401 78.89

1.62 63 2.6244 3969 102.06

1.62 54 2.6244 2916 87.48

1.62 47 2.6244 2209 76.14

1.62 45 2.6244 2025 72.9

1.62 44 2.6244 1936 71.28

1.63 56 2.6569 3136 91.28

1.63 52 2.6569 2704 84.76

1.63 49 2.6569 2401 79.87

1.63 48.5 2.6569 2352.25 79.055

1.63 46 2.6569 2116 74.98

1.63 46 2.6569 2116 74.98

1.64 76 2.6896 5776 124.64

1.64 61 2.6896 3721 100.04

1.64 57 2.6896 3249 93.48

1.65 63 2.7225 3969 103.95

1.65 60 2.7225 3600 99

1.65 59 2.7225 3481 97.35

1.65 55.5 2.7225 3080.25 91.575

1.65 51 2.7225 2601 84.15

1.65 50 2.7225 2500 82.5

1.66 64 2.7556 4096 106.24

1.66 61.5 2.7556 3782.25 102.09

1.66 58.5 2.7556 3422.25 97.11

1.66 57.5 2.7556 3306.25 95.45

1.67 62 2.7889 3844 103.54

1.67 62 2.7889 3844 103.54

1.67 60 2.7889 3600 100.2

1.67 54 2.7889 2916 90.18

1.67 54 2.7889 2916 90.18

1.67 50 2.7889 2500 83.5

1.68 80 2.8224 6400 134.4

1.68 68 2.8224 4624 114.24

1.68 64 2.8224 4096 107.52

1.68 61.5 2.8224 3782.25 103.32

1.69 68 2.8561 4624 114.92

1.69 61.5 2.8561 3782.25 103.935

1.69 53.5 2.8561 2862.25 90.415

1.69 52 2.8561 2704 87.88

1.7 81 2.89 6561 137.7

1.7 70 2.89 4900 119

1.7 70 2.89 4900 119

1.7 64 2.89 4096 108.8

1.7 62 2.89 3844 105.4

1.7 62 2.89 3844 105.4

1.7 61 2.89 3721 103.7

1.7 49.5 2.89 2450.25 84.15

1.71 75 2.9241 5625 128.25

1.71 59 2.9241 3481 100.89

1.71 57 2.9241 3249 97.47


1.71 52 2.9241 2704 88.92

1.72 71 2.9584 5041 122.12

1.72 67 2.9584 4489 115.24

1.72 60.5 2.9584 3660.25 104.06

1.72 57.5 2.9584 3306.25 98.9

1.73 72 2.9929 5184 124.56

1.73 71 2.9929 5041 122.83

1.73 57 2.9929 3249 98.61

1.73 54 2.9929 2916 93.42

1.73 53 2.9929 2809 91.69

1.745 49.5 3.045025 2450.25 86.3775

1.76 83 3.0976 6889 146.08

1.76 55 3.0976 3025 96.8

1.77 73 3.1329 5329 129.21

1.77 71.5 3.1329 5112.25 126.555

1.77 68 3.1329 4624 120.36

1.77 65 3.1329 4225 115.05

1.77 60 3.1329 3600 106.2

1.78 73.5 3.1684 5402.25 130.83

1.78 69.5 3.1684 4830.25 123.71

1.78 61.5 3.1684 3782.25 109.47

1.78 61.5 3.1684 3782.25 109.47

1.79 85 3.2041 7225 152.15

1.79 81 3.2041 6561 144.99

1.79 67 3.2041 4489 119.93

1.79 66 3.2041 4356 118.14

1.79 65.5 3.2041 4290.25 117.245

1.79 59 3.2041 3481 105.61

1.79 56 3.2041 3136 100.24

1.8 73.5 3.24 5402.25 132.3

1.8 70 3.24 4900 126

1.8 67 3.24 4489 120.6

1.8 65 3.24 4225 117

1.8 65 3.24 4225 117

1.81 64.5 3.2761 4160.25 116.745

1.81 62 3.2761 3844 112.22

1.82 71 3.3124 5041 129.22

1.83 95 3.3489 9025 173.85

1.83 68 3.3489 4624 124.44

1.83 66 3.3489 4356 120.78

1.84 62 3.3856 3844 114.08

1.86 80.5 3.4596 6480.25 149.73

1.86 73 3.4596 5329 135.78

1.88 81 3.5344 6561 152.28

1.88 66 3.5344 4356 124.08

1.88 61 3.5344 3721 114.68

1.9 77.5 3.61 6006.25 147.25

1.9 67 3.61 4489 127.3

1.9 61 3.61 3721 115.9

1.95 65 3.8025 4225 126.75

Σx = 254.695

Σy = 8684

Σx² = 423.70938

Σy² = 513718

Σxy = 14559.115

N = 154

Pearson's r = 0.807051782

HGS Female Pearson calculation:

x = Height (m)   y = Weight (kg)   x²   y²   xy

1.62 44.5 2.6244 1980.25 72.09

1.535 54 2.356225 2916 82.89

1.57 51 2.4649 2601 80.07

1.48 44.5 2.1904 1980.25 65.86

1.53 38 2.3409 1444 58.14

1.41 37.5 1.9881 1406.25 52.875

1.68 56 2.8224 3136 94.08

1.54 50 2.3716 2500 77

1.38 27 1.9044 729 37.26

1.6 47 2.56 2209 75.2

1.53 48 2.3409 2304 73.44

1.51 42 2.2801 1764 63.42

1.51 42 2.2801 1764 63.42

1.52 39 2.3104 1521 59.28

1.4 34 1.96 1156 47.6

1.52 43 2.3104 1849 65.36

1.52 42 2.3104 1764 63.84

1.44 37 2.0736 1369 53.28

1.54 61 2.3716 3721 93.94

1.5 41 2.25 1681 61.5

1.6 37 2.56 1369 59.2

1.52 49 2.3104 2401 74.48

1.65 44.5 2.7225 1980.25 73.425

1.535 42.5 2.356225 1806.25 65.2375

1.65 69 2.7225 4761 113.85

1.67 61.5 2.7889 3782.25 102.705

1.74 65 3.0276 4225 113.1

1.67 67 2.7889 4489 111.89

1.6 45 2.56 2025 72

1.64 46 2.6896 2116 75.44

1.58 33 2.4964 1089 52.14

1.54 43 2.3716 1849 66.22

1.62 42 2.6244 1764 68.04

1.59 48 2.5281 2304 76.32

1.65 49.5 2.7225 2450.25 81.675

1.58 50 2.4964 2500 79

1.59 56.5 2.5281 3192.25 89.835

1.55 49.5 2.4025 2450.25 76.725

1.69 57 2.8561 3249 96.33

1.66 61.5 2.7556 3782.25 102.09

1.61 51 2.5921 2601 82.11

1.625 75.5 2.640625 5700.25 122.6875

1.59 41 2.5281 1681 65.19

1.47 43 2.1609 1849 63.21

1.65 46 2.7225 2116 75.9

1.55 43.5 2.4025 1892.25 67.425

1.53 42 2.3409 1764 64.26


1.65 56.5 2.7225 3192.25 93.225

1.54 41.5 2.3716 1722.25 63.91

1.67 66.5 2.7889 4422.25 111.055

1.6 48 2.56 2304 76.8

1.74 64.5 3.0276 4160.25 112.23

1.6 46 2.56 2116 73.6

1.6 43 2.56 1849 68.8

1.58 50 2.4964 2500 79

1.59 55 2.5281 3025 87.45

1.62 57.5 2.6244 3306.25 93.15

1.63 60.1 2.6569 3612.01 97.963

1.62 44.5 2.6244 1980.25 72.09

1.63 50 2.6569 2500 81.5

1.64 46 2.6896 2116 75.44

1.6 52 2.56 2704 83.2

1.6 48.5 2.56 2352.25 77.6

1.7 57.5 2.89 3306.25 97.75

1.65 67.5 2.7225 4556.25 111.375

1.57 38.5 2.4649 1482.25 60.445

1.57 48 2.4649 2304 75.36

1.66 67.5 2.7556 4556.25 112.05

1.55 47 2.4025 2209 72.85

1.65 60 2.7225 3600 99

1.71 59.5 2.9241 3540.25 101.745

1.6 51.5 2.56 2652.25 82.4

1.65 65 2.7225 4225 107.25

1.56 42 2.4336 1764 65.52

1.56 41 2.4336 1681 63.96

1.64 61 2.6896 3721 100.04

1.6 47 2.56 2209 75.2

1.55 46 2.4025 2116 71.3

1.61 56.5 2.5921 3192.25 90.965

1.55 46 2.4025 2116 71.3

1.67 57 2.7889 3249 95.19

1.68 52.5 2.8224 2756.25 88.2

1.65 60 2.7225 3600 99

1.7 69 2.89 4761 117.3

1.69 66 2.8561 4356 111.54

1.59 58 2.5281 3364 92.22

1.65 56 2.7225 3136 92.4

1.79 57 3.2041 3249 102.03

1.67 46 2.7889 2116 76.82

1.615 48.5 2.608225 2352.25 78.3275

1.83 79 3.3489 6241 144.57

1.62 50 2.6244 2500 81

1.62 54 2.6244 2916 87.48

1.46 56 2.1316 3136 81.76

1.51 52 2.2801 2704 78.52

1.55 55 2.4025 3025 85.25

1.53 47 2.3409 2209 71.91

1.63 48.5 2.6569 2352.25 79.055

1.62 57.5 2.6244 3306.25 93.15

1.63 46 2.6569 2116 74.98

1.65 70.14 2.7225 4919.62 115.731

1.59 47.5 2.5281 2256.25 75.525

1.55 50.5 2.4025 2550.25 78.275

1.6 50.5 2.56 2550.25 80.8


1.55 45 2.4025 2025 69.75

1.55 51 2.4025 2601 79.05

1.65 62 2.7225 3844 102.3

1.74 55 3.0276 3025 95.7

1.69 47.5 2.8561 2256.25 80.275

1.69 50 2.8561 2500 84.5

1.67 58 2.7889 3364 96.86

1.64 61 2.6896 3721 100.04

1.64 55.5 2.6896 3080.25 91.02

1.7 73.5 2.89 5402.25 124.95

1.65 48 2.7225 2304 79.2

1.69 60 2.8561 3600 101.4

1.59 57 2.5281 3249 90.63

1.64 53.5 2.6896 2862.25 87.74

1.65 50 2.7225 2500 82.5

1.69 60 2.8561 3600 101.4

1.57 65 2.4649 4225 102.05

1.63 59 2.6569 3481 96.17

1.57 44.5 2.4649 1980.25 69.865

1.72 87 2.9584 7569 149.64

1.71 60.5 2.9241 3660.25 103.455

1.66 54 2.7556 2916 89.64

1.58 56 2.4964 3136 88.48

1.67 75 2.7889 5625 125.25

1.56 60 2.4336 3600 93.6

Σx = 207.17

Σy = 6749.74

Σx² = 333.4173

Σy² = 365923.38

Σxy = 10903.052

N = 129

Pearson's r = 0.664555
