
Weight-Driven BMI Predictions: Exploring Gender Disparities through Hypothesis Testing

Susmita1, Gaurav2, Nilmoni3, Giriraj4, Ch. Siva Santhosh5


Department of Mathematics1,2,3,4
G S Sanyal School of Telecommunications5,
IIT Kharagpur

November 14, 2023

Abstract

In 2014, P. K. Sarvanam et al. [1] carried out a biometric study; we use the dataset
from that study here. We fit a regression relating BMI to weight in this dataset and
measure their correlation using Pearson’s product-moment method.

1 Introduction

Welcome to an intriguing journey into the realm of human biometrics. In this sta-
tistical project, we’re delving into the intricate details of gender, height, weight,
and body mass index (BMI) within a dataset comprising 200 individuals. Guided
by the assumption of a normal distribution, this analysis employs a suite of fun-
damental statistical operations and tests to unearth nuanced insights from the
data.

Our subjects have been carefully selected to mirror the diversity of the broader pop-
ulation, allowing us to harness the power of statistical tools such as mean, median,
quantiles, and normal distribution parameters. These tools enable us to discern
central tendencies, explore variations, and gauge the distributional characteristics
of the variables under consideration.

The exploration is not limited to descriptive statistics; we also venture into
inferential statistics. Using hypothesis tests, including t-tests and a variance
ratio test, we aim to identify statistically significant relationships within and
between gender, height, weight, and BMI. These tests serve as our statistical
magnifying glass for examining potential patterns and correlations. By combining the
simplicity of statistical measures with the complexity of real-world biometric data,
our objective is to extract meaningful insights. This endeavor isn’t just about
numbers; it’s about unraveling the underlying stories and connections embedded in
the dimensions of human biometrics.

Join us as we apply statistical rigor to decipher the intricacies of our dataset, pro-
viding a nuanced understanding of how gender, height, weight, and BMI interplay
within a population exhibiting the characteristics of a normal distribution. It’s
not just a statistical exploration; it’s a quest to decode the statistical fingerprints
of humanity.

Figure 1: Male vs Female Bar plot

2 Details for the data “Height”

Summary of the data

We compute the main summary measures for Weight, BMI, and Height from the collected
data. Using R, we obtain:

Attribute  Minimum  1st quartile  Median  Mean    3rd quartile  Maximum
Weight     34.73    58.58         69.54   69.93   77.96         129.41
BMI        14.04    22.88         27.23   27.30   31.21         49.02
Height     123.0    148.9         161.7   160.9   172.6         199.3

Table 1: Summary of the Data
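
The six measures in the summary table can be reproduced with a few lines of code. The sketch below is written in Python rather than the R used for the paper's results, and it is not the authors' code. The function name five_number_summary and the toy heights are hypothetical and are not taken from the study's dataset.

```python
import statistics

def five_number_summary(values):
    """Return (min, Q1, median, mean, Q3, max) for a list of numbers."""
    # method="inclusive" interpolates linearly between order statistics,
    # the same rule as R's default quantile type (type 7).
    q1, med, q3 = statistics.quantiles(values, n=4, method="inclusive")
    return (min(values), q1, med, statistics.fmean(values), q3, max(values))

# Hypothetical toy heights in cm (not from the study's dataset).
heights = [150.0, 155.0, 160.0, 165.0, 170.0]
summary = five_number_summary(heights)
print(summary)
```

Applying the same function to each of the Weight, BMI, and Height columns would give one row of the table per variable.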

Histogram for Height

The following graph is the histogram for the data “Height”.

Figure 2: Histogram for Height

Histogram for “Weight”


The following graph is the histogram for the data “Weight”:

Figure 3: Histogram for Weight

Histogram for “BMI”
The following graph is the histogram for the data “BMI”:

Figure 4: Histogram for BMI

3 Relation between “Weight” and “BMI”

Now, we are interested in checking whether there is any dependency between the
two variables, Weight and BMI. Taking Weight as the independent variable (x) and
BMI as the dependent variable (y), we check whether a simple linear regression
describes the relationship between them.

The scatter plot suggests that there may be a linear relationship between BMI and
weight.
Let the relationship be y = β0 + β1 x + ϵ
The regression analysis gives the following:

From these results, we see that there is a significant positive relationship between
BMI and Weight, with a 0.247 unit increase in BMI for every unit increase in Weight.
Also, the correlation coefficient between BMI and weight is 0.61, which indicates a
moderately strong positive linear association between the two.
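
As an illustration of how the slope of y = β0 + β1 x + ϵ and Pearson's product-moment correlation are computed, here is a minimal Python sketch. It is not the R code behind the reported results; the helper fit_line and the toy data are hypothetical.

```python
import math

def fit_line(x, y):
    """Least-squares estimates (b0, b1) for y = b0 + b1*x, plus Pearson's r."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Centered sums of squares and cross-products.
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    syy = sum((yi - mean_y) ** 2 for yi in y)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    b1 = sxy / sxx                   # slope: change in y per unit of x
    b0 = mean_y - b1 * mean_x        # intercept
    r = sxy / math.sqrt(sxx * syy)   # Pearson product-moment correlation
    return b0, b1, r

# Hypothetical toy data (not from the study's dataset).
weights = [50.0, 60.0, 70.0, 80.0, 90.0]
bmis = [20.0, 23.0, 25.0, 28.0, 29.0]
b0, b1, r = fit_line(weights, bmis)
print(b0, b1, r)
```

The slope and the correlation share the numerator sxy, so they always have the same sign; they differ only in how that cross-product is scaled.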

In the residual vs fitted graph of BMI vs Weight, the red line does not differ much
from the dotted line, which is consistent with a linear relationship between BMI and
Weight.
To check whether the dependent variable follows a normal distribution, we plot
histograms of BMI separately for the male and female populations.
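
The residual check described above rests on the residuals of the fitted line. The following Python sketch, using hypothetical toy data and a hypothetical helper ols_residuals, shows how fitted values and residuals are obtained; it is not the authors' R code and it does not reproduce the plot itself.

```python
def ols_residuals(x, y):
    """Fit y = b0 + b1*x by least squares; return (fitted, residuals)."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    b1 = (sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
          / sum((xi - mean_x) ** 2 for xi in x))
    b0 = mean_y - b1 * mean_x
    fitted = [b0 + b1 * xi for xi in x]
    resid = [yi - fi for yi, fi in zip(y, fitted)]
    return fitted, resid

# Hypothetical toy data (not from the study's dataset).
weights = [50.0, 60.0, 70.0, 80.0, 90.0]
bmis = [20.0, 23.0, 25.0, 28.0, 29.0]
fitted, resid = ols_residuals(weights, bmis)
print(list(zip(fitted, resid)))
```

With an intercept in the model, least-squares residuals always sum to zero, so the plot is informative only through their pattern against the fitted values, not their average.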

Now, we are interested in analyzing the dependency of BMI on weight separately for
male and female candidates. Let the two relations, one per group, be
y = β10 + β11 x + ϵ
y = β20 + β21 x + ϵ

We perform a simple linear regression analysis for females and for males. The
results are the following:

From these results, we can say that there is a significant positive relationship
between weight and BMI in both groups: BMI increases by 0.416 units for every unit
increase in weight for females, and by 0.328 units for males. The correlation
coefficients of BMI and weight for females and males are 0.76 and 0.83, respectively.
From this analysis, we can conclude that the expected increase in BMI per unit
increase in weight is higher for females than for males.
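
A per-group fit of this kind can be sketched as follows in Python. The record layout (gender, weight, BMI), the helper slope, and the toy values are all assumptions made for illustration; they are not the study's data or the authors' R code.

```python
def slope(x, y):
    """Least-squares slope of y on x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    return sxy / sxx

# Hypothetical toy records: (gender, weight, BMI).
records = [
    ("Female", 50.0, 21.0), ("Female", 60.0, 25.0), ("Female", 70.0, 29.0),
    ("Male", 60.0, 20.0), ("Male", 70.0, 23.0), ("Male", 80.0, 26.0),
]

# Split the records by gender, then fit one line per group.
groups = {}
for gender, weight, bmi in records:
    xs, ys = groups.setdefault(gender, ([], []))
    xs.append(weight)
    ys.append(bmi)

slopes = {gender: slope(xs, ys) for gender, (xs, ys) in groups.items()}
print(slopes)
```

Comparing the two fitted slopes is a descriptive comparison only; deciding whether they differ significantly would need a further test that this sketch does not perform.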

4 One sample t-test on BMI

From the data, we find that the mean BMI is 28.37 for females and 26.31 for males.
We want to test whether the mean can be taken as µ = 26.50, that is, the null
hypothesis is µ = 26.50. Here, we choose α = 0.05. We perform a one-sample t-test
separately for females and for males; the results are the following:

Ref : Results after performing codes in R language

For females, we get a p-value less than 0.05, so we reject the null hypothesis at
level α = 0.05.
For males, we get a p-value greater than 0.05, so we fail to reject the null
hypothesis that the true mean is 26.50 at level α = 0.05.
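
The statistic behind a one-sample t-test can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the authors' R code: the sample is hypothetical, the alternative is assumed to be two-sided, and the decision uses a tabulated critical value instead of a computed p-value.

```python
import math
import statistics

def one_sample_t(sample, mu0):
    """Return (t statistic, degrees of freedom) for H0: mean == mu0."""
    n = len(sample)
    mean = statistics.fmean(sample)
    sd = statistics.stdev(sample)          # sample sd, n - 1 denominator
    t = (mean - mu0) / (sd / math.sqrt(n))
    return t, n - 1

# Hypothetical toy BMI values (not from the study's dataset).
sample = [27.0, 28.0, 29.0, 30.0, 31.0]
t_stat, df = one_sample_t(sample, 26.5)

# Two-sided 5% critical value of the t distribution with 4 degrees of
# freedom, taken from standard tables (assumption: two-sided alternative).
T_CRIT = 2.776
reject = abs(t_stat) > T_CRIT
print(t_stat, df, reject)
```

Rejecting when |t| exceeds the critical value is equivalent to the p-value falling below α, which is the form of the decision used in the text.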

5 Variance ratio test for equal sample size:

We want to check whether the BMI data for males and for females have the same
variance.
So, we performed a variance ratio test, and the results are the following:

We find that for each test, the p-values are greater than α = 0.05. So we fail to
reject the hypothesis that the variances of BMI for males and females are equal at
significance level α = 0.05.
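
The variance ratio statistic itself is simple to compute. The Python sketch below uses hypothetical toy data and a hypothetical helper variance_ratio; it returns only the F statistic and its degrees of freedom, and leaves the p-value to an F distribution routine (for example R's var.test).

```python
import statistics

def variance_ratio(a, b):
    """Return (F, df1, df2), where F = var(a) / var(b) with sample variances."""
    var_a = statistics.variance(a)   # n - 1 denominator
    var_b = statistics.variance(b)
    return var_a / var_b, len(a) - 1, len(b) - 1

# Hypothetical toy BMI values for two groups (not from the study's dataset).
group_a = [20.0, 22.0, 24.0, 26.0, 28.0]
group_b = [21.0, 23.0, 24.0, 25.0, 27.0]
f_stat, df1, df2 = variance_ratio(group_a, group_b)
print(f_stat, df1, df2)
```

Under the null hypothesis of equal variances and normally distributed data, this ratio follows an F distribution with (df1, df2) degrees of freedom, so values far from 1 in either direction count as evidence against equality.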

6 Two samples paired t-test:

Since the variance test gave no evidence that the variances of BMI for males and
females differ, we compare their means. So we performed a two-sample paired t-test,
and the results are the following:

Ref : Results after performing codes in R language

For this test, the p-value is less than α = 0.05. So we reject the null hypothesis
in favor of the alternative hypothesis, which means that the mean BMI for males and
females is not equal at level α = 0.05.
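
For illustration, the Python sketch below computes a two-sample t statistic with a pooled variance, which is the form that relies on the equal-variance conclusion of the previous section. Note that this is the pooled (unpaired) statistic, swapped in for illustration, and not the paired variant named in the text; a paired test would instead apply a one-sample t-test to the per-pair differences. The helper pooled_t and the data are hypothetical.

```python
import math
import statistics

def pooled_t(a, b):
    """Return (t, df) for a two-sample t-test assuming equal variances."""
    na, nb = len(a), len(b)
    # Pooled variance: the two sample variances weighted by their df.
    sp2 = ((na - 1) * statistics.variance(a)
           + (nb - 1) * statistics.variance(b)) / (na + nb - 2)
    se = math.sqrt(sp2 * (1 / na + 1 / nb))
    t = (statistics.fmean(a) - statistics.fmean(b)) / se
    return t, na + nb - 2

# Hypothetical toy BMI values (not from the study's dataset).
females = [26.0, 28.0, 30.0, 32.0, 34.0]
males = [22.0, 24.0, 26.0, 28.0, 30.0]
t_stat, df = pooled_t(females, males)
print(t_stat, df)
```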

References
[1] S Prasanna Kumar and A Ravikumar. Biometric study of the internal dimensions
of subglottis and upper trachea in adult Indian population. Indian Journal of
Otolaryngology and Head & Neck Surgery, 66:261–266, 2014.
