
electronics

Article
SmartFit: Smartphone Application for Garment Fit Detection
Kamrul H. Foysal 1 , Hyo Jung Chang 2 , Francine Bruess 2 and Jo Woon Chong 1, *

1 Department of Electrical and Computer Engineering, Texas Tech University, Lubbock, TX 79409, USA;
kamrul.foysal@ttu.edu
2 Department of Hospitality and Retail Management, Texas Tech University, Lubbock, TX 79409, USA;
julie.chang@ttu.edu (H.J.C.); francine.bruess@ttu.edu (F.B.)
* Correspondence: j.chong@ttu.edu; Tel.: +1-617-800-4583

Abstract: The apparel e-commerce industry is growing rapidly, and consumers increasingly seek an easy and time-saving way to shop for apparel online. The COVID-19 pandemic has further increased the need for an effective and convenient online shopping solution. However, online shopping, particularly online apparel shopping, poses several challenges for consumers, including sizing, fit, return, and cost concerns. The fit issue in particular is one of the cardinal factors causing hesitance and abandonment in online apparel purchases. The conventional method of clothing fit detection based on body shapes relies upon manual body measurements. Since no convenient and easy-to-use method has been proposed for body shape detection, we propose an interactive smartphone application, “SmartFit”, that provides the optimal fitting clothing recommendation to the consumer by detecting their body shape. This optimal recommendation is provided using image processing and machine learning applied solely to smartphone images. Our preliminary assessment of the developed model shows an accuracy of 87.50% for body shape detection, producing a promising solution to the fit detection problem persisting in the digital apparel market.

 Keywords: body shape detection; online shopping; image processing; machine learning; smartphone


1. Introduction

Recently, societal demands for purchasing apparel products online have increased with technological progress. As a result, the apparel e-commerce industry is booming more than ever before [1], and an online presence is necessary for apparel companies to succeed. This is especially true during the COVID-19 pandemic, as online apparel shopping has been increasing rapidly [2]. The revenue in apparel e-commerce in 2020 was USD 110.6 billion and is expected to increase to USD 153.6 billion by 2024 [3]. However, one of the main obstacles on the path to increasing revenue in this industry is the large volume of returns. In addition, the industry faces issues with the cost of merchandising, shipping, and processing [4]. One study showed that around 30% of the apparel purchased online is returned due to poor fit [5]. Current online shopping websites do not provide any means of measuring body size and shape. Tech startups like True Fit (www.truefit.com) and Fits.me (www.fits.me) have developed virtual fitting rooms using virtual mannequins to mimic the body measurements of shoppers and display what a certain garment, i.e., apparel, might look like when worn by the consumer [6]. However, these types of solutions fail in practical size assessment at the industrial level due to technical barriers such as an overcomplicated process, inefficient technology, low user adoption, and the high cost of adopting the technology [7]. Even advanced technologies such as Sizer (www.sizer.me) and MTailor (www.mtailor.com) do not always provide actual fit information and size measurements, failing to completely solve the poor fit issue [8].


Online apparel fit detection has been a burning issue due to the lack of technology that can detect consumer body shapes. Consumers’ garment fit largely depends on their individual body shapes [9]. Therefore, there is a tremendous amount of profit loss due to the poor fit and associated return issues. In fact, 50% of the time, online purchases result in returns, which in turn cost time and money to both the consumer and retailer. Returns may cost around USD 10–15 in total per garment, which is a huge burden to be split among the consumers as well as businesses [10].

As most of the garments are made universally in conventional body shapes and sizes, there is a need to solve fit issues in a personalized way, starting from apparel manufacturers all the way to the retailers and consumers. Aside from tailoring apparel stores, apparel manufacturers are adapting to the patterns of producing garments based on shape-based anthropometric data [11,12]. The four most common traditional body shapes for females [13], namely inverted triangle, pear, hourglass, and rectangle [14], are shown in Figure 1. The body shape plays an important role in understanding the optimal fit of garments for individual consumers [15]. However, current technologies cannot provide the body shape information related to the garment fit [10].

Figure 1. Four common body shapes for females.
In this paper, we propose a novel body shape detection algorithm for garment fit, which extracts consecutive feature points using speeded up robust features (SURF) [16], detects visual words from the extracted feature points using the bag-of-features model [17], and classifies body shapes using a machine learning technique such as the k-nearest neighbor (k-NN) algorithm [18]. We also propose a convolutional neural network (CNN)-based [19] body shape detection algorithm. Specifically, we suggest a unique combination of image processing and machine learning based body shape detection using 2D smartphone images. Notably, SURF feature detection, the bag-of-features model, and a machine learning technique have not previously been combined to differentiate body shapes for garment fit. The conventional method for body shape detection is limited to a physical measurement-based approach [20] or a multiple photo-based 3D reconstruction approach [15], which are mostly inaccurate and impractical due to the expensive computation [21]. Figure 2 describes the overall procedure of the proposed method.

Figure 2. Our proposed body shape detection method, which uses image processing and machine learning algorithms with our database. The proposed detection method first trains itself with training images (see (a) Training), and then the trained machine predicts the body shapes of unknown input images (see (b) Prediction).
This paper is organized as follows: Section 2 describes the data acquisition and pre-processing procedures. Our proposed method, consisting of feature detection and classification algorithms, is explained in Section 3. Results and discussions are presented in Sections 4 and 5, respectively.

2. Data Acquisition and Pre-Processing
Experimental images have been acquired following the IRB protocol (IRB2020-482) at Texas Tech University, which allows for the usage of the images acquired from public online data. A total of 160 images were originally acquired from online sources (Hourglass body shape, available from: www.abeautifulbodyshape.com/hourglass-body-shape-explained/; Pear body shape, available from: www.bodyvisualizer.com/) and the Google search engine, which are publicly available (our acquired dataset, available from: www.youtu.be/204kb29laBo). Then, a total of 80 images were manually selected among them by the expert in apparel design based on the body shape determinant criteria [22]. These images were selected because they fell into the four types of body shapes for females we particularly focused on, namely inverted triangle, pear, hourglass, and rectangle, since females have more issues with fit and these are the most common body shapes for them. Based on the reference [23], the authors chose 80 images (20 for each category) for the proof-of-concept study in this paper. The major vote method [24] was used to define the label of the images, and the voting criteria came from the conventional measurement method [25]. Specifically, we used the major vote among four individuals, including two experts of apparel design and merchandising and two individuals trained by the expert in apparel design. The conventional method we used in this paper is based on the relations among four body measurement areas, i.e., shoulder, bust, waist, and hip [13,22,26]. First, the inverted triangle body shape was identified if the shoulders are wider than the hips. Second, the pear body shape was determined when the hips were wider than the shoulders. Third, the hourglass body shape was classified when the widths of the shoulders and hips were similar, but the waist was significantly smaller. Fourth, the rectangle body shape was determined when the sizes of the shoulder, bust, waist, and hip appear to be similar.
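For illustration, these four labeling rules can be written as a short routine. The following Python sketch is not from the paper; in particular, the relative tolerance tol used to judge two widths as "similar" is an assumed parameter, since the paper gives no numeric thresholds:

def classify_body_shape(shoulder, bust, waist, hip, tol=0.05):
    # tol: assumed relative tolerance for judging two widths as "similar".
    def similar(a, b):
        return abs(a - b) <= tol * max(a, b)

    if not similar(shoulder, hip):
        # Rules 1 and 2: compare shoulder and hip widths.
        return "inverted triangle" if shoulder > hip else "pear"
    if waist < (1.0 - 2.0 * tol) * min(shoulder, hip):
        # Rule 3: shoulders and hips similar, waist significantly smaller.
        return "hourglass"
    # Rule 4: shoulder, bust, waist, and hip are all of similar size.
    return "rectangle"

print(classify_body_shape(shoulder=42, bust=40, waist=30, hip=41))  # hourglass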
For the homogeneous processing of the body shape detection, we first performed preprocessing techniques including image resizing (Image Resize, available from: www.mathworks.com/help/images/ref/imresize.html), edge detection, and smoothing on the acquired images using MATLAB (MATLAB 2020, available from: www.mathworks.com/products/matlab.html). Specifically, the image resizing changes different sizes of images into the same size, i.e., 300 × 600 pixels. To focus on the outer borderline in body shape detection, the edges are detected on the same-sized images using an edge detection algorithm, e.g., the Sobel filter [27]. For smoothing, the Gaussian 2D image filter, whose filter response at the point (x, y) of the image I is given in Equation (1), is applied [28]:

G_σ(x, y) = (1/(2πσ^2)) e^(−(x^2 + y^2)/(2σ^2)), (1)

where σ is the standard deviation of the Gaussian distribution. Using the Gaussian filter, the edge detected image is smoothed for noise reduction [29]. The sigma for the Gaussian image filter is taken as σ = 0.5. Figure 3 shows an example of smoothing using the Gaussian filter on one of the body shape images in our database. Specifically, we applied this Gaussian filter with σ = 0.5 to the edge detected image shown in Figure 3a, and Figure 3b shows its smoothing outcome.
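The authors performed these steps in MATLAB; as an equivalent illustration, the same chain (resize to 300 × 600, Sobel edges, Gaussian smoothing with σ = 0.5) can be sketched in Python with OpenCV. The function choices and the gradient combination below are assumptions, not the paper's exact implementation:

import cv2
import numpy as np

def preprocess(path):
    # Resize every image to the common size used in the paper (300 x 600 px).
    img = cv2.imread(path, cv2.IMREAD_GRAYSCALE)
    img = cv2.resize(img, (300, 600))                 # (width, height)
    # Sobel edge detection to isolate the outer borderline of the body.
    gx = cv2.Sobel(img, cv2.CV_64F, 1, 0, ksize=3)
    gy = cv2.Sobel(img, cv2.CV_64F, 0, 1, ksize=3)
    edges = cv2.convertScaleAbs(np.hypot(gx, gy))
    # 2D Gaussian smoothing with sigma = 0.5 (Equation (1)) for noise reduction.
    return cv2.GaussianBlur(edges, ksize=(0, 0), sigmaX=0.5)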

Figure 3. (a) The edge detected image, and (b) Gaussian smoothed image with σ = 0.5.
In this paper, we considered four types of body shapes: inverted triangle, pear, hourglass, and rectangle. Each category of body shape consists of 20 labeled images. The obtained database was divided into training and testing sets. For validation, both hold-out and leave-one-out approaches were assessed. For the hold-out validation approach, 70% of the total number of images was used to train the model and 30% was used for validation [30] (see Figure 4). Hence, 14 images from each category, resulting in 14 × 4 = 56 images, were used for training the model. On the other hand, 30% of the data were randomly held out for validation [31]. As a result, the remaining 24 images (six for each category) were used to validate the model. The leave-one-out method was also assessed, where the machine learning model is applied once for each image from the database, using all other images as the training data and the selected image as a single test image [32]. Therefore, the algorithm was simulated 80 times, each time taking 79 images as training data and the remaining image as the testing image, consecutively.
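Both validation schemes map directly onto standard tooling. A minimal Python sketch with scikit-learn follows; the feature arrays are placeholders, and the use of scikit-learn (rather than the authors' MATLAB workflow) is an assumption:

import numpy as np
from sklearn.model_selection import train_test_split, LeaveOneOut

X = np.random.rand(80, 64)        # placeholder feature vectors for the 80 images
y = np.repeat(np.arange(4), 20)   # 20 labeled images per body shape category

# Hold-out validation: 70% train / 30% held out, stratified so each
# category keeps the 14 training / 6 testing split described above.
X_tr, X_te, y_tr, y_te = train_test_split(
    X, y, test_size=0.30, stratify=y, random_state=0)

# Leave-one-out validation: the model is fit 80 times, each time
# testing on a single image and training on the remaining 79.
for train_idx, test_idx in LeaveOneOut().split(X):
    pass  # fit on X[train_idx]; evaluate the single image X[test_idx]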

Figure 4. Training and validation datasets: 30% of the data were held out for validation.
3. Methods
Our proposed method first extracts the feature points from the preprocessed images obtained in Section 2, and then uses the extracted features to classify the images into body shape categories. The feature extraction procedure is described in Sections 3.1 and 3.2, and the body shape classification procedure is explained in Section 3.3.
3.1. Feature Points Detection

SURF feature points were extracted from the preprocessed images to use them as input for the classifier. We adopted SURF since it is widely used in computer vision [33] due to its faster and more accurate characteristics compared to other methods, such as SIFT [34] and Harris corner detection [35]. The SURF feature point extracting procedure consists of the generation of the integral image, the approximation of the Hessian determinant, and performing non-maximal suppression [36], as shown in Figure 5.

Figure 5. Block diagram of the SURF feature point detection. The SURF feature point extracting procedure consists of the generation of an integral image, the approximation of the Hessian determinant, and performing non-maximal suppression [36].
The integral image was used for the fast calculation of the sum of the pixel values in a given image or a rectangular subset of the given image. The characteristics of the integral image permit the fast computation of box convolution filters [37]. The box filter was used for detecting blobs or finding points of interest in the image [38,39]. Specifically, corners, edges, and blobs required for body shape detection can be found using the box filter. The box filter is interpolated in the scale (the up-scaling filter size, i.e., 9 × 9, 15 × 15, 21 × 21, 27 × 27, etc.). The scale space model is implemented as up-scaling box filters, with a resulting image convoluted with box filters of different sizes, which are shown in Figure 6. Each detected feature point in the range of filter size scales is detected and stored in a feature vector.
Figure 6. Scale invariance due to the scale space model. The box type convolution filters are upscaled. The filtered images are given. Box filters of size 9 × 9, 15 × 15, and 21 × 21 are applied.

The non-maximal suppression is required to localize the most dominant feature in the scale space model. For the localization of feature points, a non-maximum suppression in a 3 × 3 × 3 neighborhood [40] is used to find the local maxima in an (x, y, σ) space applied at the scale of the detected feature point, as shown in Figure 7a. Figure 7b shows the input image with detected feature points using non-maximal suppression.

Figure 7. Non-maximal suppression in a neighborhood of 26 points: (a) the detected feature point in the center with its corresponding scale space model, and (b) the input image with the detected feature points.

In this paper, a database of training body images with their corresponding SURF features is formed. Figure 8 shows the strongest 40 feature points (see green points) with corresponding scales (shown as green circles) and orientations, which are automatically detected using SURF from an image.

Figure 8. SURF feature points detected (shown as green colored dots) from a sample input image. The neighborhoods of feature points are drawn by green circles.

Feature points are the dominant attribute of an image. Particularly, the feature points are the dominant marks on the outline of the preprocessed images of the subjects. However, they do not provide information about the body shape. These feature points are used in the further procedure to extract body shape information, as explained in Section 3.2 below.

3.2. Bag-of-Features
In our proposed method, the bag-of-features model is utilized to categorize body shape categories. The bag-of-features model is an image patch-based classification method, where the most frequently occurring image patches are used to classify an image [41]. The bag-of-features algorithm first obtains image patches from the feature points detected using the SURF algorithm (described in Section 3.1). Then, from the feature points in the training images, we get feature vectors. Balancing the number of features is done across all image categories to improve clustering. Among the categories, the minimum number n of the strongest features is chosen to calculate the total number of descriptors using Equation (2):

Number of Features = Number of Categories × n, (2)

where n is the number of strongest descriptors in the category that has the fewest features.
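A minimal Python sketch of this feature balancing (Equation (2)), together with the 500-word k-means clustering described in the next paragraph, is given below; the descriptor arrays are placeholders and are assumed to be sorted by feature strength:

import numpy as np
from sklearn.cluster import KMeans

# Placeholder: 4 categories, each with an (N_i, 64) array of SURF descriptors.
descriptors_per_category = {c: np.random.rand(300, 64) for c in range(4)}

# Equation (2): keep the n strongest descriptors per category, where n is
# the size of the smallest category, giving Number of Categories x n in total.
n = min(d.shape[0] for d in descriptors_per_category.values())
balanced = np.vstack([d[:n] for d in descriptors_per_category.values()])

# Cluster the balanced 64-D descriptors into 500 visual words; each
# cluster center is one entry of the visual dictionary.
kmeans = KMeans(n_clusters=500, n_init=10, random_state=0).fit(balanced)
vocabulary = kmeans.cluster_centers_            # shape: (500, 64)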
The descriptors of each feature from all the training image database are then clustered for developing the ‘visual words.’ Each SURF descriptor provides a vector with a length of 64. A total of 500 visual words was obtained, each of which is determined by the center of the clustered feature descriptors in the known category of body shapes (inverted triangle, pear, hourglass, and rectangle) using k-means clustering [32]. In our body detection experiment, the numbers of descriptors and clusters were around 78,000 and 500 words, respectively, and the dimension of each word was 64 [42]. The visual words were then included in a visual dictionary. The number of visual words in the dictionary was taken as an experimental parameter: if the number is too small, the visual words do not represent all patches, and if too large, quantization artifacts and overfitting occur. Figure 9 shows the feature classification of the k-NN algorithm with decision boundaries (feature descriptors are mapped in two dimensions) for all the images of the four types of body shapes in our database.

Figure 9. The k-nearest neighbor algorithm for body shape classification. The bag-of-features descriptors are mapped in 2 dimensions in the figure, each cluster center pointing to one visual word using k-means clustering, and the k-nearest neighbor is used for generating the histogram of visual words.

3.3. Body Shape Classification

3.3.1. k-Nearest Neighborhood (k-NN)

The dictionary of visual words, which is developed from the training images, was used to classify a subject’s body shape category. To classify an input image, the proposed method detects and extracts features from the input image and then, using the k-nearest neighbor algorithm, constructs a histogram of features for each image with 500 words. The visual words (cluster centers of feature descriptors) are always the same and obtained from all the images. The feature histogram uses the k-NN algorithm to calculate the frequency of occurrence of individual visual words. Equation (3) shows the Euclidean distance equation [43] used for accurate classification with k-NN:
d = √(∑_{i=1}^{k} (x_i − y_i)^2), (3)

where k is the number of nearest neighbors, x is the data point, y is the neighboring point
and d is the distance. An input image is classified by determining the maximum occurrence
of the visual words (feature patches) that corresponds to a specific body shape category.
The occurrence of those visual words in that image is calculated using k-NN. Figure 10
shows the histogram of the visual word vocabulary occurring in the input image, with the
bar heights corresponding to the frequency of occurrence.




Figure 10. (a) Input image of body shape hourglass, (b) histogram of visual words occurrence for the input image with a vocabulary of 500 visual words with the 3 most frequently occurring visual words, and (c) the location of the 3 visual words occurring in the input image (index and location marked in the input image). Note that the maximum occurrence of visual words corresponding to a body shape classifies the image to that category.
The bag-of-features model provides a method for counting the visual word occurrences in an image by encoding the image into a feature vector. For classification, each input image is assigned the majority class of the k-nearest visual words in the descriptor space, using the Euclidean distance mentioned in Equation (3). The parameter k controls how many of the nearest neighbors vote for the output class. The histogram bins are incremented based on the distance of the feature descriptor to a particular cluster center. The histogram length corresponds to the number of visual words of the dictionary.

The k-nearest neighbor algorithm was trained on our training dataset. The extracted features from the training images were used to develop the feature map. The k-nearest neighbor was calculated using the Euclidean distance from Equation (3). When a new image is obtained, the feature is extracted from the image and mapped on the feature map. The distance is calculated among the adjacent points for the new image. The distances are then sorted, and the k-nearest neighbors are taken. A major voting system was used to classify the new image to one of the four classes. The block diagram in Figure 11 describes the k-NN algorithm for body shape detection.
Figure 11. The k-nearest neighbor block diagram for classifying body shape.

For the classification, we considered four different types of body shapes: inverted
triangle, pear, hourglass, and rectangle. Hence, a feature point consisting of multiple
dimension parameters is then sorted into the specific class that is closest to the visual word
following the k-NN rule. In this way, our proposed algorithm detects the body shape.
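To make the procedure concrete, the following Python sketch encodes an image as a 500-bin visual word histogram and classifies it with k-NN. The value k = 5 and all placeholder arrays are assumptions, since the paper does not report the chosen k:

import numpy as np
from sklearn.neighbors import KNeighborsClassifier

def encode(descriptors, vocabulary):
    # Assign every descriptor to its nearest visual word (Euclidean
    # distance, Equation (3)) and build the normalized occurrence histogram.
    d = np.linalg.norm(descriptors[:, None, :] - vocabulary[None, :, :], axis=2)
    words = d.argmin(axis=1)
    hist = np.bincount(words, minlength=len(vocabulary)).astype(float)
    return hist / hist.sum()

vocabulary = np.random.rand(500, 64)          # placeholder visual dictionary
train_hists = np.random.rand(56, 500)         # placeholder encoded training set
train_labels = np.repeat(np.arange(4), 14)    # 14 training images per class

knn = KNeighborsClassifier(n_neighbors=5, metric="euclidean")  # k is assumed
knn.fit(train_hists, train_labels)
test_descriptors = np.random.rand(800, 64)    # placeholder SURF descriptors
shape_class = knn.predict([encode(test_descriptors, vocabulary)])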

3.3.2. Convolutional Neural Network (CNN)


The CNN algorithm uses a neural network consisting of input layers, hidden layers,
and output layers with a bank of the convolutional filter and pooling operations. CNN
applies convolutional filters to an input image to create the feature map of summarized
features detected in the input. To reduce dimensionality, a pooling layer was used. Chang-
ing the values of weights of the neurons when applied through the neural network enables
CNN to differentiate between classes. The block diagram of CNN for body shape detection
is given below.
The convolutional layers extract the feature maps using convolution filters, and
fully connected layers are used to adjust the weights of a classifier. Here, the Alexnet
architecture [44] has been adopted, which has eight layers including s convolutional layers
with n depths and m max pooling layers, as shown in Figure 12. The activation function
for each node in the fully connected layers is a sigmoid function which is expressed in
Equation (4):
σ(x) = (1 + e^(−x))^(−1), (4)
where x is the input value and σ ( x ) is the output value between 0 and 1. In each stage,
the convolution function is used to extract features, and the activation function is used
for classification.
Figure 12. Training procedure of the convolutional neural network (CNN).
Figure 13 shows the training and testing process of the CNN. In the proposed method, the CNN model was trained on 56 images taken randomly with labels from the database as training data. For each of the input training images, this operation enables updating the weight values of the CNN. For testing, the remaining 24 images were taken and validated against the labels.

Figure 13. Training and testing procedure of the CNN model using the acquired database.

Figure 14 shows the body shape classification procedure of our proposed CNN algorithm. The trained CNN differentiates the body shapes of input test images, as shown in Figure 14.

Figure 14. The convolutional neural network for body shape detection. The convolution layers are obtained by subsampling and pooling, and the fully connected layer classifies the input image.
input image.


4. Results
In our proposed method, k-NN or CNN is adopted for the body shape detection algorithm due to their relatively high accuracy performance. k-NN is a representative of a group of machine learning classifiers that is widely used in classification tasks in various fields, including handwritten digit recognition from images [45] and tumor classification from images [46], and is simple in procedure, calculating the Euclidean distance between dataset input images [47]. Both k-NN and CNN are implemented on a computer having an Intel(R) Core(TM) i5-8250U CPU @ 1.60 GHz (eight CPUs) ~1.8 GHz (Intel Corporation, Santa Clara, CA, USA) and 8192 MB RAM.
The k-NN was implemented with the input of the bag-of-features model. A minimum
of 252,343 features were taken from each category, resulting in a total of 1,009,372 features
(252,343 features/category × 4 categories) for the whole database (see Equation (2)). The
proposed machine learning method with the bag-of-features model and k-NN classifier results in an 87.50% accuracy, whereas the support vector machine (SVM) [48] and multilayer perceptron (MLP) [49] deliver 72.17% and 62.50% accuracy, respectively. Therefore, our proposed method performs better in terms of accuracy (by 17.52% and 28.57% for SVM and MLP, respectively). The accuracy comparison of the proposed method to SVM and MLP is
given in Table 1.

Table 1. Comparative analysis of the accuracy of the proposed method, support vector machine (SVM), and multilayer perceptron (MLP).

Method Accuracy
SVM 72.17%
MLP 62.50%
Proposed Method 87.50%

Figure 15 shows the prediction accuracy of two body shape categories using the bag-of-features model and k-NN classifier. As shown in Table 2, our classifier detects the inverted triangle and rectangle body shapes with 100% accuracy, and the hourglass and pear with 67% and 83% accuracy, respectively, resulting in an overall 87.50% testing accuracy on average. It takes 133.13 s to train the k-NN with the training datasets.

Figure 15. Prediction accuracy for the example of 2 body shape types using the k-NN classifier with bag-of-features: (a) inverted triangle and (b) hourglass.
Table 2. Confusion matrix for the classification of 4 body shapes using the k-NN classifier with the bag-of-features.

Actual Class \ Predicted Class    Inverted Triangle    Pear    Hourglass    Rectangle
Inverted Triangle                 100%                 0%      0%           0%
Pear                              17%                  83%     0%           0%
Hourglass                         33%                  0%      67%          0%
Rectangle                         0%                   0%      0%           100%

Our simulation results show that the leave-one-out approach gives 88.75% accuracy while the hold-out approach gives 87.50% accuracy. In terms of computational time, the hold-out approach took around 60 times less time than the leave-one-out approach. The CNN model developed for body shape detection was trained on the same database. The training procedure uses nine epochs (the number of complete cycles through the training dataset) with a mini-batch size (the number of sample images processed before the model is updated for the next iteration) of six. The initial learning rate, which is the step size by which the weights are updated during training, is taken as 0.0001. As shown in Table 3, the CNN classifier detects the inverted triangle, pear, hourglass, and rectangle body shapes with 100% accuracy. It takes 59 s for the CNN to train itself with the training datasets.
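These stated hyperparameters (nine epochs, mini-batch size six, learning rate 0.0001) translate into a short training loop. The Python/PyTorch sketch below reuses the model from the sketch in Section 3.3.2; the dataset tensors and the choice of the Adam optimizer are assumptions:

import torch
from torch.utils.data import DataLoader, TensorDataset

images = torch.randn(56, 3, 224, 224)   # placeholder training images
labels = torch.randint(0, 4, (56,))     # placeholder shape labels
loader = DataLoader(TensorDataset(images, labels), batch_size=6, shuffle=True)

criterion = torch.nn.CrossEntropyLoss()
optimizer = torch.optim.Adam(model.parameters(), lr=1e-4)  # optimizer assumed
for epoch in range(9):                  # nine epochs, as stated in the text
    for x, y in loader:
        optimizer.zero_grad()
        loss = criterion(model(x), y)
        loss.backward()
        optimizer.step()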

Table 3. Confusion matrix for the classification of 4 body shapes using the CNN model.

Actual Class \ Predicted Class    Inverted Triangle    Pear    Hourglass    Rectangle
Inverted Triangle                 100%                 0%      0%           0%
Pear                              0%                   100%    0%           0%
Hourglass                         0%                   0%      100%         0%
Rectangle                         0%                   0%      0%           100%

5. Discussion
The conventional technology to detect body shapes relies on body measurements. The three-dimensional body scanning model depends on these measurements to classify body shapes. For example, the inverted triangle is detected by such a scanner when the heaviest part is the upper body and the shoulders are measured to be wider than the hips [50]. In [22], the
body shape detection method based on the parameters of female body measurements is
suggested. However, our proposed method does not depend upon body measurements,
instead using only the acquired image to detect the body shape. Our proposed method is
more convenient as it does not require exact body measurements taken manually from users.
Conventional methods have limitations in that incorrect measurements can be taken for
body shape detection since the actual measurement is necessary to calculate the body shape,
requiring the rigorous measurements of the shoulder, bust, hips and waist of a person.
Collecting this measurement information is usually inconvenient and difficult to acquire
due to the lack of measurement tools or professional tailor measurement skills. Since there
is no universal body measurement table for predicting the body shape of a person, each
method relies on different measurement criteria for the parameters for predicting the body
shape, which is widely inaccurate [11]. However, our approach is an image processing and
machine learning-based method which does not require any measurement input from the
user’s perspective and detects the actual body shape from a smartphone camera image.
The major vote method [51] was used to assess the label of the subject, i.e., the body shape
category. The labels are also supported by the traditional method of measurements [25],
introducing repeatability among different observers. Our proposed technology does not
rely on any precalculated criteria. The comparative evaluation of the proposed method with

the state-of-the-art methods is presented in Table 4. Hidayati et al. [11] introduced a study on
body shape detection and styling using measurements of 3150 female celebrities obtained
from a website (www.bodymeasurements.org). The gold standard of the body shape labels
was not defined in the paper, rather an unsupervised feature learning method was used
by combining different measurements. The intention to use unsupervised learning is to
extract the natural grouping of a set of body measurements. Therefore, the clustering of
measurement data, e.g., affinity propagation, has been presented as a viable solution for
body shape detection. As shown in Table 4, the accuracy of our proposed method is higher
than the method proposed by Hidayati et al. [11].

Table 4. Comparison of the proposed method with the conventional method.

Criteria                          Hidayati et al. [11]                            Proposed Method with Bag-of-Features Model and k-NN    Proposed Method (CNN)
Body Shape Detection Algorithm    Measurement Based on Combination of Criteria    Image Processing and Machine Learning                   Image Processing and Deep Learning
Requires Exact Measurement        Yes                                             No                                                      No
Testing Accuracy                  76.83%                                          87.5%                                                   100%

6. Conclusions
In this paper, we proposed an effective method to detect body shapes using a smartphone, which will solve overall fit issues, increase consumer convenience, and improve overall online apparel shopping experiences for consumers. Our proposed method has been shown to provide 87.50% accuracy on average in classifying four different body shapes: inverted triangle, pear, hourglass, and rectangle. The proposed method is expected to increase the feasibility of smartphone-based online apparel shopping by providing optimal fit garment suggestions in a more accurate and personalized way [52], which will ultimately benefit online apparel retailers by increasing their revenue and decreasing returns due to poor fit. The results of this study show the potential contribution to the textile and apparel industry by providing contact-free body shape detection with high accuracy. Further development of the proposed method will generate a model enabling the detection of other body types, incorporating additional data on other existing body shapes. The next step of this research is to analyze the shape and fit of the garment itself to match it with the body shape detection data, so that it can provide optimal fit garment information for consumers.

7. Limitations and Future Research


There are some limitations of this study which can be explored in future research. First, we only classified the four body shapes of females. Hence, future research should be conducted for body shapes of males as well as additional body shapes for females, such as the round shape. Second, we used 2D images to determine the body shapes. However, height and side silhouette, in addition to the front silhouette examined in this paper, can affect fit issues. Hence, further research on fit detection related to the height and side silhouette is needed. In addition, the examination of body shape detection with a larger sample size can increase the accuracy of the results.

Author Contributions: K.H.F. designed the analysis, developed the software, wrote the original
and revised manuscript, and conducted data analysis and details of the work. H.J.C. designed
the research experiment, verified data, and conducted statistical analysis. F.B. collected the data
and conducted the analysis. J.W.C. conceptualized and designed the research experiment, wrote
the original/revised drafts, designed, re-designed, and verified image data analysis, and guided
direction of the work. All authors have read and agreed to the published version of the manuscript.
Funding: This research received no external funding.

Institutional Review Board Statement: The study was conducted following the Institutional Review
Board (IRB) protocol (IRB2020-482) at Texas Tech University.
Informed Consent Statement: Not applicable.
Data Availability Statement: No new data were created or analyzed in this study. Data sharing is
not applicable to this article.
Conflicts of Interest: The authors declare no conflict of interest.

References
1. Wagner, G.; Schramm-Klein, H.; Steinmann, S. Online retailing across e-channels and e-channel touchpoints: Empirical studies of
consumer behavior in the multichannel e-commerce environment. J. Bus. Res. 2020, 107, 256–270. [CrossRef]
2. Ecola, L.; Lu, H.; Rohr, C. How Is COVID-19 Changing Americans’ Online Shopping Habits? RAND Corporation: Santa Monica, CA,
USA, 2020.
3. Fashion-Ecommerce. Online Apparel Industry Market US. Available online: https://www.statista.com/statistics/278890/us-
apparel-and-accessories-retail-e-commerce-revenue (accessed on 8 December 2020).
4. Tuunainen, V.K.; Rossi, M. eBusiness in apparel retailing industry-critical issues. In Proceedings of the ECIS 2002, Gdańsk, Poland,
6–8 June 2002; p. 136.
5. De Leeuw, S.; Minguela-Rata, B.; Sabet, E.; Boter, J.; Sigurðardóttir, R. Trade-offs in managing commercial consumer returns for
online apparel retail. Int. J. Oper. Prod. Manag. 2016, 36, 710–731. [CrossRef]
6. Dabolina, I.; Silina, L.; Apse-Apsitis, P. Evaluation of Clothing Fit; IOP Publishing: Bristol, UK, 2018; Volume 459, p. 012077.
7. Gunatilake, H.; Hidellaarachchi, D.; Perera, S.; Sandaruwan, D.; Weerasinghe, M. An ICT Based Solution for Virtual Garment
Fitting for Online Market Place. Int. J. Inf. Technol. Comput. Sci. 2018, 10, 60–72. [CrossRef]
8. Kim, D.-E.; Labat, K. An exploratory study of users’ evaluations of the accuracy and fidelity of a three-dimensional garment
simulation. Text. Res. J. 2012, 83, 171–184. [CrossRef]
9. Petrova, A.; Ashdown, S.P. Three-dimensional body scan data analysis: Body size and shape dependence of ease values for pants’
fit. Cloth. Text. Res. J. 2008, 26, 227–252. [CrossRef]
10. Ashdown, S.P.; Calhoun, E.; Lyman-Clarke, L.; Piller, F.T.; Tseng, M.M. Virtual Fit of Apparel on the Internet: Current Technology
and Future Needs. In Handbook of Research in Mass Customization and Personalization; World Scientific Pub Co Pte Ltd.: Singapore,
2009; Volume 2, pp. 731–748.
11. Hidayati, S.C.; Hsu, C.-C.; Chang, Y.-T.; Hua, K.-L.; Fu, J.; Cheng, W.-H. What dress fits me best? Fashion recommendation on the
clothing style for personal body shape. In Proceedings of the 26th ACM International Conference on Multimedia, Seoul, Korea,
22–26 October 2018; pp. 438–446.
12. Zakaria, N.; Ruznan, W.S. Developing apparel sizing system using anthropometric data: Body size and shape analysis, key
dimensions, and data segmentation. In Anthropometry, Apparel Sizing and Design; Woodhead Publishing: Cambridge,
UK, 2020; pp. 91–121.
13. Connell, L.J.; Ulrich, P.V.; Brannon, E.L.; Alexander, M.; Presley, A.B. Body Shape Assessment Scale: Instrument Development
for Analyzing Female Figures. Cloth. Text. Res. J. 2006, 24, 80–95. [CrossRef]
14. Pisut, G.; Connell, L.J. Fit preferences of female consumers in the USA. J. Fash. Mark. Manag. Int. J. 2007, 11, 366–379. [CrossRef]
15. Sattar, H.; Pons-Moll, G.; Fritz, M. Fashion Is Taking Shape: Understanding Clothing Preference Based on Body Shape from Online
Sources. In Proceedings of the 2019 IEEE Winter Conference on Applications of Computer Vision (WACV), Waikoloa Village, HI,
USA, 7–11 January 2019; pp. 968–977.
16. Bay, H.; Ess, A.; Tuytelaars, T.; Van Gool, L. Speeded-Up Robust Features (SURF). Comput. Vis. Image Underst. 2008, 110, 346–359.
[CrossRef]
17. Stanciu, S.G.; Xu, S.; Peng, Q.; Yan, J.; Stanciu, G.A.; Welsch, R.E.; So, P.T.C.; Csucs, G.; Yu, H. Experimenting Liver Fibrosis
Diagnostic by Two Photon Excitation Microscopy and Bag-of-Features Image Classification. Sci. Rep. 2014, 4, 1–12. [CrossRef]
18. Amato, G.; Falchi, F.; Gennaro, C. Geometric consistency checks for kNN based image classification relying on local features.
In Proceedings of the Fourth International Conference on Similarity Search and Applications, Lipari, Italy, 30 June–1 July 2011;
pp. 81–88.
19. Yamashita, R.; Nishio, M.; Do, R.K.G.; Togashi, K. Convolutional neural networks: An overview and application in radiology.
Insights Imaging 2018, 9, 611–629. [CrossRef] [PubMed]
20. Vuruskan, A.; Bulgun, E.Y. Identification of female body shapes based on numerical evaluations. Int. J. Cloth. Sci. Technol. 2011,
23, 46–60. [CrossRef]
21. Hu, P.; Kaashki, N.N.; Dadarlat, V.; Munteanu, A. Learning to Estimate the Body Shape Under Clothing from a Single 3D Scan.
IEEE Trans. Ind. Inform. 2020, 1. [CrossRef]
22. Devarajan, P.; Istook, C.L. Validation of female figure identification technique (FFIT) for apparel software. J. Text. Appar. Technol.
Manag. 2004, 4, 1–23.
23. Manju, B.; Meenakshy, K.; Gopikakumari, R. Prostate Disease Diagnosis from CT Images Using GA Optimized SMRT Based
Texture Features. Procedia Comput. Sci. 2015, 46, 1692–1699. [CrossRef]

24. Chong, J.W.; Dao, D.; Salehizadeh, S.M.A.; McManus, D.D.; Darling, C.E.; Chon, K.H.; Mendelson, Y. Photoplethysmograph
Signal Reconstruction Based on a Novel Hybrid Motion Artifact Detection–Reduction Approach. Part I: Motion and Noise
Artifact Detection. Ann. Biomed. Eng. 2014, 42, 2238–2250. [CrossRef]
25. Yin, L.; Annett-Hitchcock, K. Comparison of body measurements between Chinese and U.S. females. J. Text. Inst. 2019, 110,
1716–1724. [CrossRef]
26. Body Shape Calculator. Available online: https://www.harperloren.com/fashion/calculate-your-body-shape/ (accessed on 16
December 2020).
27. Vincent, O.; Folorunso, O. A Descriptive Algorithm for Sobel Image Edge Detection. In Proceedings of the 2009 InSITE Conference,
Macon, GA, USA, 12–15 June 2009; pp. 97–107.
28. Deng, G.; Cahill, L. An adaptive Gaussian filter for noise reduction and edge detection. In Proceedings of the 1993 IEEE
Conference Record Nuclear Science Symposium and Medical Imaging Conference, San Francisco, CA, USA, 31 October–6
November 1993; pp. 1615–1619.
29. Kharlamov, A.; Podlozhnyuk, V. Image Denoising; NVIDIA: Santa Clara, CA, USA, 2007.
30. Sakr, S.; Elshawi, R.; Ahmed, A.; Qureshi, W.T.; Brawner, C.; Keteyian, S.; Blaha, M.J.; Al-Mallah, M.H. Using machine learning on
cardiorespiratory fitness data for predicting hypertension: The Henry Ford ExercIse Testing (FIT) Project. PLoS ONE 2018, 13,
e0195344. [CrossRef]
31. Hsiao, W.-L.; Grauman, K. ViBE: Dressing for Diverse Body Shapes. In Proceedings of the 2020 IEEE/CVF Conference on
Computer Vision and Pattern Recognition (CVPR), Virtual, Seattle, WA, USA, 14–18 June 2020; pp. 11056–11066.
32. Yadav, S.; Shukla, S. Analysis of k-Fold Cross-Validation over Hold-Out Validation on Colossal Datasets for Quality Classification.
In Proceedings of the 2016 IEEE 6th International Conference on Advanced Computing (IACC), Bhimavaram, India, 27–28
February 2016; pp. 78–83.
33. Pang, Y.; Li, W.; Yuan, Y.; Pan, J. Fully affine invariant SURF for image matching. Neurocomputing 2012, 85, 6–10. [CrossRef]
34. Lindeberg, T. Scale Invariant Feature Transform. Scholarpedia 2012, 7, 10491. [CrossRef]
35. Mistry, S.; Patel, A. Image stitching using Harris feature detection. Int. Res. J. Eng. Technol. 2016, 3, 2220–2226.
36. Bay, H.; Tuytelaars, T.; Van Gool, L. Surf: Speeded up robust features. In Proceedings of the European Conference on Computer
Vision, Graz, Austria, 7–13 May 2006; pp. 404–417.
37. Derpanis, K.G. Integral image-based representations. Dep. Comput. Sci. Eng. York Univ. Pap. 2007, 1, 1–6.
38. Teke, M.; Temizel, A. Multi-spectral Satellite Image Registration Using Scale-Restricted SURF. In Proceedings of the 2010 20th
International Conference on Pattern Recognition, Istanbul, Turkey, 23–26 August 2010; pp. 2310–2313.
39. Fragoso, V.; Srivastava, G.; Nagar, A.; Li, Z.; Park, K.; Turk, M. Cascade of Box (CABOX) Filters for Optimal Scale Space
Approximation. In Proceedings of the 2014 IEEE Conference on Computer Vision and Pattern Recognition Workshops, Columbus,
OH, USA, 24–27 June 2014; pp. 126–131.
40. Sledevic, T.; Serackis, A. SURF algorithm implementation on FPGA. In Proceedings of the 2012 13th Biennial Baltic Electronics
Conference, Tallinn, Estonia, 3–5 October 2012; pp. 291–294.
41. Arora, G.; Dubey, A.K.; Jaffery, Z.A.; Rocha, A. Bag of feature and support vector machine based early diagnosis of skin cancer.
Neural Comput. Appl. 2020, 1–8. [CrossRef]
42. Mukherjee, J.; Mukhopadhyay, J.; Mitra, P. A survey on image retrieval performance of different bag of visual words indexing
techniques. In Proceedings of the 2014 IEEE Students’ Technology Symposium, Kharagpur, India, 28 February–2 March 2014;
pp. 99–104.
43. Hu, L.-Y.; Huang, M.-W.; Ke, S.-W.; Tsai, C.-F. The distance function effect on k-nearest neighbor classification for medical datasets.
SpringerPlus 2016, 5, 1–9. [CrossRef] [PubMed]
44. Alom, M.Z.; Taha, T.M.; Yakopcic, C.; Westberg, S.; Sidike, P.; Nasrin, M.S.; Van Esesn, B.C.; Awwal, A.A.S.; Asari, V.K. The history
began from AlexNet: A comprehensive survey on deep learning approaches. arXiv 2018, arXiv:1803.01164.
45. Makkar, T.; Kumar, Y.; Dubey, A.K.; Rocha, Á.; Goyal, A. Analogizing time complexity of KNN and CNN in recognizing
handwritten digits. In Proceedings of the 2017 Fourth International Conference on Image Information Processing (ICIIP), Shimla,
India, 21–23 December 2017; pp. 1–6.
46. Rani, K.V.; Jawhar, S.J. Novel Technology for Lung Tumor Detection Using Nanoimage. IETE J. Res. 2019, 1–15. [CrossRef]
47. Liu, F.; Yang, S.; Ding, Y.; Xu, F. Single sample face recognition via BoF using multistage KNN collaborative coding. Multimed.
Tools Appl. 2019, 78, 13297–13311. [CrossRef]
48. Noble, W.S. What is a support vector machine? Nat. Biotechnol. 2006, 24, 1565–1567. [CrossRef]
49. Ruck, D.W.; Rogers, S.K.; Kabrisky, M. Feature selection using a multilayer perceptron. J. Neural Netw. Comput. 1990, 2, 40–48.
50. Istook, C.L. Female figure identification technique (FFIT) for apparel, part I: Describing female shapes. J. Text. Appar. Technol. Manag.
2004, 4, 1–16.
51. Tabei, F.; Zaman, R.; Foysal, K.H.; Kumar, R.; Kim, Y.; Chong, J.W. A novel diversity method for smartphone camera-based heart
rhythm signals in the presence of motion and noise artifacts. PLoS ONE 2019, 14, e0218248. [CrossRef] [PubMed]
52. Saarijärvi, H.; Sutinen, U.-M.; Harris, L.C. Uncovering consumers’ returning behaviour: A study of fashion e-commerce. Int. Rev.
Retail. Distrib. Consum. Res. 2017, 27, 284–299. [CrossRef]
