
The 10th IEEE International Conference on Dependable Systems, Services and Technologies, DESSERT’2019

5-7 June, 2019, Leeds, United Kingdom

A Machine Learning Approach for Predicting Weight Gain Risks in Young Adults

Balbir Singh
School of Computing, Creative Technologies and Engineering
Leeds Beckett University
Leeds, United Kingdom
B.Singh@leedsbeckett.ac.uk

Hissam Tawfik
School of Computing, Creative Technologies and Engineering
Leeds Beckett University
Leeds, United Kingdom
H.Tawfik@leedsbeckett.ac.uk

Abstract— Individuals developing signs of weight gain or obesity are at risk of developing serious illnesses such as type 2 diabetes, respiratory problems, coronary heart disease and stroke. Physical activity and healthy eating are fundamental components of a healthy lifestyle. Therefore, detecting childhood obesity is of paramount importance. This paper utilises the vast amount of data available via the Millennium Cohort Study. Various regression methods and artificial neural network models have been evaluated to predict teenage BMI from earlier BMI values. The results obtained are encouraging and a prediction accuracy of over 90% has been achieved. Various issues relating to data mining and prediction accuracy are discussed.

Keywords— Obesity, BMI, regression, machine-learning, data-science, prediction, neural-networks.

I. INTRODUCTION

A report published by McKinsey Global Institute on overcoming obesity highlighted some alarming facts [1]. The report states that more than 2.1 billion people are overweight or obese worldwide, which amounts to almost 30% of the global population. If the current trend continues, the overweight and obese population is likely to increase to 41% by 2030. The report also states that the cost of addressing global obesity is $2.0 trillion, which is comparable to the impact of smoking or armed conflict, each costing $2.1 trillion. The report recommends that behaviour change interventions can be cost effective for society by saving health care costs and increasing productivity. Such interventions could save around $1.2 billion for the National Health Service (NHS) in the United Kingdom alone. The evidence suggests that behavioural change interventions to combat obesity need further investigation to find workable solutions rather than waiting for a perfect solution. In the United Kingdom, the data presented in the Health Survey for England show that the percentage of obese children between the ages of 2 and 15 has increased significantly since 1995 [2]. 16% of boys and 15% of girls in this age group were classed as obese. 14% of both genders were classed as overweight. This results in 30% of boys and 29% of girls being either overweight or obese. In 1995, 11% of boys and 12% of girls of 2-15 years of age were obese.

However, [3] highlights that the prevalence of childhood overweight seems to be plateauing worldwide. This review included data from 467,294 children from nine countries (Australia, China, England, France, Netherlands, New Zealand, Sweden, Switzerland and USA). Another survey identified 52 obesity studies worldwide from 25 countries and reported some stability in obesity [4]. Similar claims were made by [5] and [6]. Some reports even suggest that childhood obesity may well be declining as a cumulative result of increased physical activity, a decline in television viewing and a reduction in sugary drink consumption [7]. However, [4] also reported that this stability should be observed with caution since previous stable phases were followed by further increases in the prevalence of obesity.

The Health Survey for England 2013 [8] also reported some alarming facts about adult obesity. It was reported that 26% of men and 24% of women were obese. 41% of men and 33% of women were overweight. These combined figures give cause for concern since 67% of men and 57% of women are above the normal weight for their height.

Childhood obesity is of great public concern as 22-90% of the obese childhood population will continue to be obese as adults [9]. Obesity is strongly linked with other negative health conditions such as type 2 diabetes, cardiovascular diseases, cancers and even death [10, 11, 12, 13]. Considering all this, there is a pressing need to identify individuals at risk of developing obesity as early as possible. The purpose of this study is to carry out analysis and evaluation using suitable machine learning algorithms to predict future BMI values for young adults from early childhood data with a high degree of accuracy.

II. RELATED WORK

A. Expert systems and AI in health

Long before the proliferation of machine learning techniques, expert systems, an earlier form of artificial intelligence, were in use for the analysis of medical scans and detection of other medical abnormalities. As soon as it became possible to digitize medical scans and store them in computer memory, medical staff and computer scientists were proactive in building automated systems to analyse them. From the 1970s to the 1990s, scanned medical images were analysed using low level pixel processing (edge and line detection) and mathematical modelling (fitting lines, circles and ellipses) [14]. Systems of this kind have been described as Good Old-Fashioned Artificial Intelligence or GOFAI [15]. Using computational intelligence, there have been numerous other examples of automated disease diagnosis over the last decades [16, 17, 18].

B. Machine Learning applications in health

A variety of Artificial Neural Networks (ANNs) have been employed in the prediction of medical diseases and

978-1-7281-1733-1/19/$31.00 ©2019 IEEE


Authorized licensed use limited to: UNIVERSITY OF TENNESSEE. Downloaded on December 06,2020 at 03:57:48 UTC from IEEE Xplore. Restrictions apply.
solve complex problems based on a set of known parameters. ANNs relate to machine learning (ML) methods and are utilised to correlate input to corresponding output data. Several examples of application of ANNs in health and wellbeing and engineering systems have been reported with a varying degree of success. For example, ANNs have been applied in various non-linear problem-solving scenarios including robotic decision making, swarm intelligence, aviation applications and Artificial Intelligence (AI) in games [19]. ANNs have also been used to identify diseases associated with brain activity such as Parkinson’s, Schizophrenia, and Huntington’s disease from the CNV response in the electroencephalogram [20]. Multilayer perceptron (MLP) and probabilistic neural network (PNN) models have been utilized for the prediction of osteoporosis with bone densitometry [21]. Reference [22] evaluated a range of machine learning algorithms to predict the accurate amount of medication dosage required for patients suffering from sickle cell disease. Administering an accurate amount of medication based on the patient’s condition is of paramount importance. The algorithms investigated included Random Forests, Support Vector Machines and a few variants of Neural Networks. It was reported that Multilayer Perceptron Neural Networks trained with the Levenberg-Marquardt algorithm and Random Forests provided the best results.

C. Machine Learning techniques for tackling obesity

A literature survey shows that only a limited number of studies report the use of machine learning to detect childhood obesity. Because of the complex nature of the problem, machine learning techniques provide much more robust prediction accuracy than simpler techniques such as linear regression or other statistical methods [23]. Reference [24] applied machine learning techniques to data collected from children before the age of 2 to predict future obesity. In this study, they used data collected on children prior to the second birthday using a clinical support system. They reported an accuracy of 85%, sensitivity of 89%, positive predictive value of 84% and negative predictive value of 88% using the ID3 algorithm for decision trees without pruning. The other algorithms tested were Naïve Bayes, Random Trees, Random Forests, C4.5 decision trees with pruning and Bayes Net. Several other studies also reported the use of machine learning algorithms for predicting obesity. Reference [25] suggested that Radial Basis ANNs (RBANNs) are far more efficient than classical Back Propagation ANNs (BPANNs) but very large datasets would be required to train such systems. This study discussed algorithms only; results were not reported. Reference [26] discussed several algorithms for predicting childhood obesity. They recommended the suitability of ANNs, Naïve Bayes and Decision Trees. Several optimisation techniques have also been applied to achieve better prediction accuracy. For example, Genetic Algorithms were employed by [27] to improve the prediction accuracy to 92%. However, it must be noted that they used a very small sample size of 12 subjects. A comprehensive study, possibly the best one identified so far, was carried out by [28] to apply machine learning techniques to predict childhood obesity. They compared the performance metrics of several machine learning prediction algorithms. They compared logistic regression with six data mining techniques: Decision Trees, Association Rules, Neural Networks, Naïve Bayes, Bayesian Networks and Support Vector Machines. They considered prediction sensitivity the most important element in predicting obesity for their study. The highest reported sensitivity for their work was 62% in the case of Naïve Bayes and Bayesian Networks. This research group used a limited range of demographics (gender) and biometrics (weight, height and BMI) and the subjects were 2 year old children. It is envisaged that the prediction accuracy can be further improved by using a different set of parameters, using big data and other machine learning techniques such as deep learning to handle big data. Reference [29] applied machine learning techniques to measure and monitor physical activity in children. They evaluated Multilayer Perceptrons (MLPs), Support Vector Machines, Decision Trees, Naïve Bayes and K=3 Nearest Neighbour algorithms. It was reported that MLPs outperformed all the other algorithms yielding an overall accuracy of 96%, sensitivity of 95% and specificity of 99%. It should be noted that the sample size in this case was also relatively small (22 participants). The investigation of Deep Learning techniques for future work was proposed.

III. METHODS

A. Data

The data for this study are taken from the Millennium Cohort Study (MCS). The MCS follows a cohort of children born in the United Kingdom in 2000 and 2001. A vast amount of data were collected including weight and height. Six sweeps were conducted at various stages of each child’s growth as shown in Table I.

TABLE I. MILLENNIUM COHORT STUDY PROFILES

Sweep   MCS1      MCS2     MCS3     MCS4     MCS5      MCS6
Age     9 months  3 years  5 years  7 years  11 years  14 years

Data mining involved combining data samples from each of the sweeps and discarding samples containing outliers. The study was designed to predict the future value for the BMI of a given sample based on earlier BMI values. Hence, it made sense to use MCS1 through MCS5 as inputs and MCS6 as target. It was, however, discovered that the data contained in MCS1 were not very meaningful since the BMI of a 9-month old baby was not of much significance. Therefore, only MCS2 through MCS5 were used as inputs to result in a meaningful prediction. The datasets are available to researchers in SPSS, STATA and TAB formats.

B. Methodology and results

1) Multivariate linear regression

The study involved the use of multivariate linear regression and a number of other algorithms to carry out a comparative analysis of prediction accuracies. The data were divided into train and test sets using a 75:25 train to test ratio. During the learning phase, the linear regression resulted in the following model parameters:

Intercept: 1.228
Model co-efficient 1: -0.037
Model co-efficient 2: -0.013
Model co-efficient 3: -0.349
Model co-efficient 4: -0.777

These values were "learned" during the model fitting step using the "least squares" criterion. Then, the fitted model was used to make predictions. One important phase of any machine learning study is to measure the prediction accuracy. Mean squared errors or mean absolute errors are most commonly used in determining such accuracies since their scale is directly associated with the scale of the data [30]. The following metrics were considered to measure the accuracy of the prediction models, where $n$ is the number of test samples, $y_i$ is the actual BMI of sample $i$ and $\hat{y}_i$ is its predicted BMI:

Mean Absolute Error (MAE): MAE is the mean of the absolute value of the errors given by the following equation:

$$\mathrm{MAE} = \frac{1}{n}\sum_{i=1}^{n}\left|y_i - \hat{y}_i\right|$$

Mean Squared Error (MSE): MSE is the mean of the squared errors given by the following equation:

$$\mathrm{MSE} = \frac{1}{n}\sum_{i=1}^{n}\left(y_i - \hat{y}_i\right)^2$$

Root Mean Squared Error (RMSE): RMSE is the square root of the mean of the squared errors:

$$\mathrm{RMSE} = \sqrt{\frac{1}{n}\sum_{i=1}^{n}\left(y_i - \hat{y}_i\right)^2}$$

Using these equations, the following results were obtained:

MAE: 1.598
MSE: 4.729
RMSE: 2.175

The Mean Absolute Error metric is far more meaningful since it is in the same units as the target output and therefore relates to it directly. The mean value for the predicted output is 21.258, which results in a prediction error of 7.52% (1.598 / 21.258).

2) Additional regression algorithms

A number of additional regression algorithms were employed to evaluate the effectiveness in terms of prediction accuracy and training time. These included Linear Support Vector Machines, Quadratic Support Vector Machines, Decision Trees and Ensemble algorithms. The results obtained are shown in Table II:

TABLE II. ADDITIONAL REGRESSION ALGORITHMS

Algorithm                                      RMSE   MSE    MAE    Training time
Linear SVM                                     2.262  5.116  1.586  33.21 s
Quadratic SVM                                  2.230  4.970  1.589  97.15 s
Cubic SVM                                      2.233  4.986  1.597  973.90 s
Fine Tree (MLS=8, NL=30)                       2.654  7.045  1.982  1.75 s
Ensemble Bagged Trees (MLS=8, NL=30)           2.172  4.716  1.604  6.92 s
Ensemble Boosted Trees (MLS=8, NL=30, LR=0.1)  2.350  5.520  1.712  7.84 s
Ensemble Bagged Trees (MLS=15, NL=50)          2.160  4.663  1.593  7.53 s

MLS: Minimum Leaf Size; SDS: Surrogate Decision Splits; NL: Number of Learners; LR: Learning Rate

As can be seen from Table II, the lowest MAE of 1.586 was achieved in the case of Linear SVM with a training time of 33.209 seconds. The Cubic SVM took the longest (973.9 s) to train with a slightly worse MAE. The quickest algorithm to train was the Fine Tree but its MAE was the worst.

3) Multi-Layer Perceptron Feed Forward Artificial Neural Network

In this case, the minimum, maximum and mean MAE were calculated after 30 runs using the Levenberg-Marquardt algorithm. A randomly selected 5% of the data was kept for testing for added experimental robustness. The remaining dataset was divided into train, validate and test sets with ratios of 70%, 15% and 15% respectively. The experiment was repeated for 5, 10, 15, 20, 25, 30, 50 and 100 neurons in the hidden layer. It was discovered that the lowest MAE of 1.42 was achieved for 20 neurons in the hidden layer.

TABLE III. MAE AFTER 30 RUNS OF MLP

Neurons  5     10    15    20    25    30    50    100
Min      1.48  1.45  1.48  1.42  1.49  1.53  1.50  1.50
Max      1.72  1.79  1.80  1.79  1.79  1.80  1.79  2.09
Mean     1.60  1.62  1.61  1.59  1.61  1.63  1.62  1.70

Further simulations were carried out using 20 neurons in the hidden layer and Mean Squared Error as a performance function. A value of 0.9 for regularisation was used to avoid overfitting. This parameter is a fraction between 0 and 1 indicating the proportion of performance attributed to weight/bias values. The larger this value, the more the network is penalised for large weights, and the more likely the network function is to avoid overfitting. The predicted and actual BMI values for 100 instances are plotted in Fig. 1. The neural network was trained using 20 neurons in the hidden layer and the Levenberg-Marquardt algorithm. As can be seen from the graph, the predicted values correlate very closely with the actual values of the held-out data.

Fig. 1. Predicted and actual BMI values (x-axis: data instance; y-axis: actual and predicted BMI)
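The least-squares fit, the 75:25 split and the three error metrics described in Section III-B can be illustrated with a minimal sketch. It is not the authors' code: the function names (`error_metrics`, `fit_least_squares`, `relative_error_percent`) are hypothetical, the data are synthetic stand-ins rather than MCS records, and the coefficients used to generate them are arbitrary assumptions. Only the final line uses figures from the paper (MAE 1.598 and mean predicted BMI 21.258).

```python
import math
import random

def error_metrics(actual, predicted):
    """Return (MAE, MSE, RMSE) for paired actual/predicted values."""
    errors = [a - p for a, p in zip(actual, predicted)]
    n = len(errors)
    mae = sum(abs(e) for e in errors) / n
    mse = sum(e * e for e in errors) / n
    return mae, mse, math.sqrt(mse)

def solve_linear_system(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        tail = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - tail) / m[r][r]
    return x

def fit_least_squares(rows, targets):
    """Ordinary least squares via the normal equations.

    Returns [intercept, coef_1, ..., coef_k].
    """
    design = [[1.0] + list(r) for r in rows]
    k = len(design[0])
    xtx = [[sum(d[i] * d[j] for d in design) for j in range(k)] for i in range(k)]
    xty = [sum(d[i] * t for d, t in zip(design, targets)) for i in range(k)]
    return solve_linear_system(xtx, xty)

def predict(params, row):
    """Apply a fitted model to one row of input features."""
    return params[0] + sum(c * v for c, v in zip(params[1:], row))

def relative_error_percent(mae, mean_predicted):
    """MAE expressed as a percentage of the mean predicted value."""
    return 100.0 * mae / mean_predicted

# Synthetic stand-in for four earlier BMI values and one later BMI value.
# The generating coefficients are arbitrary assumptions, not MCS results.
random.seed(0)
rows, targets = [], []
for _ in range(400):
    bmi = [random.gauss(mu, 1.5) for mu in (16.5, 16.0, 16.5, 19.0)]
    rows.append(bmi)
    targets.append(1.0 + 0.1 * bmi[0] + 0.1 * bmi[1] + 0.3 * bmi[2]
                   + 0.6 * bmi[3] + random.gauss(0, 1.5))

# 75:25 train/test split on shuffled indices.
order = list(range(len(rows)))
random.shuffle(order)
cut = int(0.75 * len(order))
train, test = order[:cut], order[cut:]

params = fit_least_squares([rows[i] for i in train], [targets[i] for i in train])
preds = [predict(params, rows[i]) for i in test]
mae, mse, rmse = error_metrics([targets[i] for i in test], preds)
print(f"MAE={mae:.3f} MSE={mse:.3f} RMSE={rmse:.3f}")

# Relative error from the figures reported in the paper.
print(f"{relative_error_percent(1.598, 21.258):.2f}%")  # 7.52%
```

The normal equations are used here only to keep the sketch free of third-party dependencies; a numerical library's least-squares routine would normally be preferred for larger or ill-conditioned problems.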

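The repeated-run protocol described for the neural network (a 5% hold-out, a 70/15/15 split of the remainder, 30 runs with fresh randomisation, and the minimum, maximum and mean MAE) can be sketched as follows. This is an assumption-laden illustration, not the authors' implementation: the paper does not state which test set the reported MAE was measured on, so the sketch assumes the 5% hold-out; `train_fn` is a hypothetical callback; and `train_mean_baseline` is only a placeholder that predicts the training mean, standing in for an MLP trained with Levenberg-Marquardt, which is not implemented here.

```python
import random
import statistics

def split_indices(n, rng):
    """Shuffle 0..n-1 and return (holdout, train, validate, test) index lists.

    5% is held out first; the remainder is split 70/15/15.
    """
    order = list(range(n))
    rng.shuffle(order)
    n_hold = max(1, round(0.05 * n))
    holdout, rest = order[:n_hold], order[n_hold:]
    n_train = round(0.70 * len(rest))
    n_val = round(0.15 * len(rest))
    return (holdout,
            rest[:n_train],
            rest[n_train:n_train + n_val],
            rest[n_train + n_val:])

def mean_absolute_error(actual, predicted):
    """Mean of the absolute differences between paired values."""
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)

def repeated_runs(rows, targets, train_fn, runs=30, seed=0):
    """Re-split and re-train `runs` times; return (min, max, mean) hold-out MAE."""
    rng = random.Random(seed)
    maes = []
    for _ in range(runs):
        holdout, train, validate, _test = split_indices(len(rows), rng)
        model = train_fn([rows[i] for i in train], [targets[i] for i in train],
                         [rows[i] for i in validate], [targets[i] for i in validate],
                         rng)
        preds = [model(rows[i]) for i in holdout]
        maes.append(mean_absolute_error([targets[i] for i in holdout], preds))
    return min(maes), max(maes), statistics.mean(maes)

def train_mean_baseline(x_train, y_train, x_val, y_val, rng):
    """Placeholder trainer: ignores inputs and predicts the training mean.

    A real experiment would train a neural network here and use the
    validation set for early stopping.
    """
    mean_y = statistics.mean(y_train)
    return lambda row: mean_y

# Synthetic stand-in data (not MCS records).
data_rng = random.Random(1)
rows = [[data_rng.gauss(17.0, 1.5) for _ in range(4)] for _ in range(1000)]
targets = [0.5 * sum(r) / 4 + 12.0 + data_rng.gauss(0, 1.5) for r in rows]

lo, hi, avg = repeated_runs(rows, targets, train_mean_baseline, runs=30)
print(f"min={lo:.2f} max={hi:.2f} mean={avg:.2f}")
```

To compare hidden-layer sizes as in Table III, `repeated_runs` would be called once per candidate size with a `train_fn` that builds a network of that size.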
IV. CONCLUSION AND FURTHER WORK

In this paper we have experimented with various multivariate regression algorithms and multi-layer perceptron feed forward artificial neural networks (MLPFFANN). The results obtained indicate that the MLPFFANN algorithm using 20 neurons in the hidden layer results in the lowest mean absolute error after taking the average of 30 runs. The data were randomized for each run and the neural network was initialized with new random parameters. Additional regression algorithms were also tried; however, implementing a more complex algorithm did not achieve any better results. Some of these complex algorithms took a lot longer to train but the mean absolute error did not improve. The best prediction accuracy based on a MAE of 1.42 in the case of MLPFFANN equates to an accuracy of 93.4%. This is very encouraging, but more work needs to be done to ensure the robustness of the prediction network. Further work is proposed by employing additional hidden layers to create a deeper network. Increasing the size of the dataset may also result in higher accuracies. This can be done by implementing multiple imputations to salvage the data samples which were originally discarded because of the presence of outliers.

ACKNOWLEDGEMENTS

The authors are grateful to ‘The Centre for Longitudinal Studies, Institute of Education’ for the use of data used in this study and to the ‘UK Data Archive Service’ for making them available. However, they are in no way responsible for the analysis and interpretation of these data.

REFERENCES

[1] MGI - Overcoming obesity: An initial economic analysis, MGI report available from: http://www.mckinsey.com/~/media/mckinsey/business functions/economic studies temp/our insights/how the world could better fight obesity/mgi_overcoming_obesity_full_report.ashx [accessed 14 August 2019]
[2] Boodha G. Children’s body mass index, overweight and obesity. Chapter 11 in Craig R, Mindell J (eds). Health Survey for England 2013. Health and Social Care Information Centre, Leeds, 2014.
[3] Olds T, Maher C, Zumin S et al. Evidence that the prevalence of childhood overweight is plateauing: data from nine countries. J Pediatr Obes. 2011;6:342-60
[4] Rokholm B, Baker JL, Sørensen TIA. The levelling off of the obesity epidemic since the year 1999 – a review of evidence and perspectives. Obes Rev. 2010;11:835-46.
[5] Blüher, S, Meigen, C, Gausche, R et al. Age-specific stabilization in obesity prevalence in German children: A cross-sectional study from 1999 to 2008. International Journal Of Pediatric Obesity, Volume 6, Issue sup3, 2011, pp. 199-206.
[6] Moss, A., Klenk, J., Simon, K. et al. Declining prevalence rates for overweight and obesity in German children starting school. Eur J Pediatr. 2012; 171: 289.
[7] Wabitsch, M., Moss, A. and Kromeyer-Hauschild, K. Unexpected plateauing of childhood obesity rates in developed countries. BMC Medicine. 2014; 12:17.
[8] Moody, A. Adult anthropometric measures, overweight and obesity. Health Survey for England 2013. Health and Social Care Information Centre, Leeds, 2014.
[9] Singh AS, Mulder C, Twisk JWR, Van Mechelen W, Chinapaw MJM. Tracking of childhood overweight into adulthood: a systematic review of the literature. Obes Rev. 2008;9:474–88.
[10] Dietz WH. Health consequences of obesity in youth: childhood predictors of adult disease. Pediatrics. 1998;101:518–25. doi: 10.1109/JBHI.2016.2636665
[11] Engeland A, Bjorge T, Sogaard AJ, Tverdal A. Body Mass Index in adolescence in relation to total mortality: 32-year follow-up of 227,000 Norwegian boys and girls. Am J Epidemiol. 2003;157:517–23.
[12] Butland B, Jebb S, Kopelman P, McPherson K, Thomas S, Mardell J, et al. Tackling obesities: future choices – project report. In: Government Office for Science. Foresight: London; 2007. p. 1–164.
[13] Freedman DS, Mei Z, Srinivasan SR, Berenson GS, Dietz WH. Cardiovascular risk factors and excess adiposity among overweight children and adolescents: the Bogalusa Heart study. J Pediatr. 2007;150:12–7.
[14] Geert Litjens, Thijs Kooi, Babak Ehteshami Bejnordi, Arnaud Arindra Adiyoso Setio, Francesco Ciompi, Mohsen Ghafoorian, Jeroen A.W.M. van der Laak, Bram van Ginneken, Clara I. Sánchez, A survey on deep learning in medical image analysis, Medical Image Analysis, Volume 42, 2017, Pages 60-88
[15] Haugeland J, 1985. Artificial Intelligence: The Very Idea. The MIT Press, Cambridge, Mass (1985)
[16] Szolovits P, Patil RS, Schwartz WB (1988) Artificial intelligence in medical diagnosis. Ann Intern Med 108(1):80–87
[17] Ishak WHW, Siraj F (2002) Artificial intelligence in medical application: an exploration. Health Inform Eur J 16:1–9
[18] Jarvis-Selinger S, Bates J, Araki Y, Lear SA (2011) Internet-based support for cardiovascular disease management. Int J Telemed Appl 2011:1–9
[19] Kumar K. and Thakur G.S.M., 2012. Advanced Applications of Neural Networks and Artificial Intelligence: A Review, IJITCS, vol.4, no.6, pp.57-68.
[20] Jervis, B.W., Saatchi, M.R., Lacey, A., Roberts, T., Allen, E.M., Hudson, N.R., Oke, S. and Grimsley, M. (1994) ‘Artificial neural network and spectrum analysis methods for detecting brain diseases from the CNV response in the electroencephalogram’, IEEE Proceedings Science, Measurement and Technology, pp.432–440.
[21] Mantzaris, D.H., Anastassopoulos, G.C. and Lymberopoulos, D.K., 2008 ‘Medical disease prediction using artificial neural networks’, 8th IEEE International Conference on BioInformatics and BioEngineering, pp.1–6.
[22] Mohammed Khalaf, Abir Jaafar Hussain, Robert Keight, Dhiya Al-Jumeily, Paul Fergus, Russell Keenan, and Posco Tso. 2017. Machine learning approaches to the application of disease modifying therapy for sickle cell using classification models. Neurocomput. 228, C (March 2017), 154-164. DOI: https://doi.org/10.1016/j.neucom.2016.10.043
[23] Michie D, Spiegelhalter DJ, Taylor CC. Machine learning, neural and statistical classification. 1994.
[24] Dugan TM, Mukhopadhyay S, Carroll AE, Downs SM. Machine learning techniques for prediction of early childhood obesity. Appl Clin Inform 2015; 6: 506–520
[25] B. Novak and M. Bigec, "Application of artificial neural networks for childhood obesity prediction," Proceedings 1995 Second New Zealand International Two-Stream Conference on Artificial Neural Networks and Expert Systems, Dunedin, New Zealand, 1995, pp. 377-380. doi: 10.1109/ANNES.1995.499512
[26] Adnan M, Husain W, Damanhoori F. A survey on utilization of data mining for childhood obesity prediction. Information and Telecommunication Technologies (APSITT) 2010; 1–6.
[27] Adnan MHBM, Husain W, Rashid N. Parameter Identification and Selection for Childhood Obesity Prediction Using Data Mining. 2nd International Conference on Management and Artificial Intelligence 2012.
[28] Zhang S, Tjortjis C, Zeng X, Qiao H, Buchan I, Keane J. Comparing data mining methods with logistic regression in childhood obesity prediction. Information Systems Frontiers 2009; 11: 449–460.
[29] Paul Fergus, Abir J. Hussain, John Hearty, Stuart Fairclough, Lynne Boddy, Kelly Mackintosh, Gareth Stratton, Nicky Ridgers, Dhiya Al-Jumeily, Ahmed J. Aljaaf, Jenet Lunn, A machine learning approach to measure and monitor physical activity in children, Neurocomputing, Volume 228, 2017, Pages 220-230
[30] Hyndman, R. J., & Koehler, A. B., “Another look at measures of forecast accuracy”, International journal of forecasting, Volume 22, Issue 4, 2006, pp. 679-688
