Topsakal 2021

International Journal of Computer Assisted Radiology and Surgery (2021) 16:1381–1391
https://doi.org/10.1007/s11548-021-02423-z
ORIGINAL ARTICLE
Evaluating the agreement and reliability of a web‑based facial analysis tool for rhinoplasty
Oguzhan Topsakal1 · Mustafa İlhan Akbaş2 · Bria Synae Smith1 · Michael Francis Perez1 · Ege Can Guden3 ·
Mehmet Mazhar Celikoyar3
Received: 12 January 2021 / Accepted: 28 May 2021 / Published online: 19 June 2021
© This is a U.S. government work and not under copyright protection in the U.S.; foreign copyright protection may apply 2021
Abstract
Purpose Rhinoplasty is one of the most common and challenging plastic surgery procedures. Facial analysis is a crucial step in planning. Utilizing a three-dimensional (3D) model of a patient’s face is an emerging way of performing facial analysis. This paper evaluates the agreement and reliability of facial measurements taken using a web app, located at digitized-rhinoplasty.com, that utilizes 3D models of the patient’s face.
Methods Eleven measurements were calculated on 16 human subjects. Three methods of measurement were performed: direct measurements on human subjects’ faces, measurements on 2D photographs, and measurements on 3D models of face scans. The Bland–Altman plot is used for testing the agreement between the web app and the well-known Blender 3D modeling software. Intra-rater and inter-rater reliability were calculated and compared for 2D and 3D methods using the intraclass correlation coefficient (ICC) method. The normality and homoscedasticity assumptions of the statistical analysis methods were checked.
Results The results indicate that the web app and Blender software show agreement within 95% confidence limits. The web app performs well in intra-rater and inter-rater reliability statistical analysis. The web app’s reliability scores are consistently better than those of a facial analysis software that was found highly reliable in a previous study. We also compare the methods of measurement in terms of time, ease of use, and cost.
Conclusion The utilization of 3D computer modeling for facial analysis has its advantages and has started to become more common due to recent advances in technology. The web app utilizes 3D face scans for pre-operative planning and post-operative evaluation of facial surgeries. The web app performs well in agreement and inter-/intra-reliability analysis and performs consistently better than software that works with 2D photographs. The web app provides accurate, repeatable, affordable, and fast facial measurements for facial analysis when compared to direct and 2D methods.
Keywords Surgery planning · Facial analysis · Rhinoplasty · Agreement · Reliability · Facial measurements · 3D model ·
Intraclass correlation coefficient · Bland–Altman
* Oguzhan Topsakal, otopsakal@floridapoly.edu
Mustafa İlhan Akbaş, akbasm@erau.edu
Bria Synae Smith, bsmith4696@floridapoly.edu
Michael Francis Perez, mperez7561@floridapoly.edu
Ege Can Guden, egecanguden@gmail.com
Mehmet Mazhar Celikoyar, mazhar.celikoyar@gmail.com
1 Florida Polytechnic University, Lakeland, FL 33805, USA
2 Embry-Riddle Aeronautical University, Daytona Beach, FL 32114, USA
3 Florence Nightingale Hospital, Istanbul, Turkey

Introduction and related work

Rhinoplasty is one of the most common yet complex surgical procedures; it aims to alter the shape of the nose while achieving the aesthetic harmony and functionality of nasal structures [9]. Facial analysis is an important part of facial reconstructive and cosmetic surgery that helps in pre-operative planning and the post-operative evaluation of the outcome.

Facial analysis can be performed directly (manually) on the patient’s face using rulers or calipers. However, direct measurements have several disadvantages, such as discomfort to the patient, inconsistency, difficulty in repetition, and time inefficiency [34]. 2D computer imaging (photography) has been a recommended tool for rhinoplasty facial analysis; however, it has limitations because it represents a 3D structure (the nose) in a 2D medium [14, 21]. Moreover, measurements from 2D imaging have to be recalibrated to actual size by using measurements of known distances (a ruler in the photograph or the expected diameter of the iris and cornea), which potentially introduces errors.

Recent advances in 3D computer modeling address some of the limitations inherent to 2D imaging [26] and provide the ability to perform absolute measurements without the need for a reference object [31]. Besides distances and angles, it is possible to calculate volumes and topographic distances using 3D models [26, 31]. Given these advantages, surgery planning and evaluation have been progressing from 2D images to 3D models [20, 21, 26, 33].

Modern software and hardware solutions have enabled 3D scanning and analysis of a patient’s face. Some examples are 3dMDFace,1 Canfield Vectra,2 Crisalix,3 Di3D,4 and LifeViz.5 Tzou et al. evaluated and compared some of these solutions in terms of software, hardware, speed, cost, etc. [32].

While these new solutions emerge, it is imperative to compare them to an established approach and see whether they agree sufficiently for the new to replace the old in the clinical field [22]. The Bland–Altman plot is the most popular and recommended statistical method used to assess agreement in medicine to ensure that the new method of measurement is as accurate as the current or gold standard [13, 28, 35].

Besides agreement, reliability is an important parameter in determining the quality of an instrument. Reliability measures precision or the extent to which test results can be replicated [36]. The intraclass correlation coefficient (ICC) score is the most frequently used method for measuring reliability [4, 24, 36]. There has been a significant amount of work to evaluate the reliability of 3D solutions, such as the studies that tested 3dMDFace [3, 8, 15, 16, 27, 34], Canfield Vectra [25], Konica Minolta Vivid [17, 29], and LifeViz [5]. Despite their promising results, these new facial analysis and measurement solutions also have disadvantages. For instance, they require complex equipment and software that are not affordable by most healthcare practices, and they provide only a limited number of facial features and measurements.

This paper tested the agreement and reliability of a new web app, Face Analyzer,6 that utilizes 3D face scans for facial analysis [30]. We compared the agreement of measurements taken via the web app against an established 3D modeling and rendering software, Blender [7]. We tested the inter- and intra-reliability of measurements performed on 3D models utilizing the web app and compared the reliability tests against Rhinobase, which was found highly reliable in a previous study [23]. We also report the time needed, effort, and cost of direct, 2D, and 3D measurements.

1 3dMDFace: https://3dmd.com/3dmdface/.
2 Canfield Vectra: https://www.canfieldsci.com/imaging-systems/.
3 Crisalix: https://crisalix.com.
4 Di3D: https://www.di4d.com.
5 Lifeviz Mini: https://www.quantificare.com/3d-photography-systems/lifeviz-mini/.
6 Face Analyzer: https://www.digitized-rhinoplasty.com/app/analyzer.html.

Materials and methods

To test the agreement and reliability of the web app, we conducted an experimental crossover controlled trial. In the following subsections, we first explain the details of the methodology used for the agreement (“Agreement analysis” section) and then the reliability study (“Intra- and inter-reliability analysis” section). We then introduce the independent variables (measurements) (“Facial measurements” section) and the subjects (“Subjects” section) involved in this study. Finally, we explain how we acquired the anthropometric models and performed the measurements for the direct, 2D, and 3D methods (“Direct measurements”, “Getting 2D images and measurements”, and “3D face scans and measurements” sections).

Agreement analysis

An agreement analysis is performed to see whether a measurement tool is good enough when compared to a gold standard. Our goal is to test the correctness of the measurements taken by a web app for facial analysis that works on 3D face models. The 3D face model needs to be created using a 3D scanner. The 3D scanner might not be error-free while creating the 3D models. We do not aim to test the accuracy and precision of 3D scanners. We believe the performance of the low-cost 3D scanners, especially the ones utilizing smartphone cameras, will improve over time. Since our focus is on taking the measurements from a 3D model, we identify our gold-standard measurement tool as Blender software [7]. Blender is an established 3D modeling software with over 60 versions released since 1994. We took the measurements on subjects’ 3D models using Blender and the web app and then ran a statistical analysis to test the agreement between these two types of software. We wanted to minimize the variations of the locations of feature points between the markings made using Blender and the web app (Fig. 1). Therefore, we annotated the texture image of the 3D models to put a red dot for each feature point utilized for measurements using both types of software. This helped us mark the same location for each feature point. Figure 2 shows the texture images that have the red dots. We utilized the Bland–Altman plot, which is a scatter plot of the difference between the two measurements against the average of the two measurements [22]. In the Bland–Altman plot, we place three lines. The middle line (blue line in Fig. 7) indicates the mean difference between the two measurements. The upper and lower lines (red lines in Fig. 7) indicate 95% confidence limits (upper limit = mean + 1.96 × SD, lower limit = mean − 1.96 × SD). An important assumption of the Bland–Altman limits of agreement is that the differences are normally distributed. We checked the normality using the Shapiro–Wilk statistical method. After plotting the Bland–Altman graph, we need to know whether there is a trend between points being above versus below the mean difference. The existence of a trend shows that there is a proportional bias. We performed linear regression analysis by choosing the difference as the dependent variable and the mean as the independent variable to check the proportional bias.

Fig. 1 On the left, a measurement is taken via Blender software, and on the right, the same measurement is taken via the web app

Fig. 2 Facial feature points marked with red dots on the texture image of the 3D model to reduce the marking errors during the agreement analysis study

Intra‑ and inter‑reliability analysis

We used the intraclass correlation coefficient (ICC) test for the intra-rater and inter-rater reliability analysis of the web app. Two raters performed the measurements, and each rater took two sets of measurements at least one day apart to avoid memory bias. The intra-rater reliability was calculated between a rater’s two measurements, and inter-rater reliability was calculated between the two raters using the average

Table 1 Measurements
Measurement | Description
Base bony width | The distance from nasal parenthesis, left (np_l) to nasal parenthesis, right (np_r)
Columellar length | The distance between subnasale (sn) and columella breakpoint (c)
Infralobule | The distance from pronasale (prn) to columella breakpoint (c)
Interalar distance | The distance from alar flare, left (al_l) to alar flare, right (al_r)
Intercanthal distance | The distance from endocanthion, left (en_l) to endocanthion, right (en_r)
Lower facial height | The distance between subnasale (sn) and menton (me)
Mid-facial height | The distance between subnasale (sn) and glabella (g)
Nasal bridge length | The distance from pronasale (prn) to nasion (n)
Premaxilla, left | The distance between subnasale (sn) and alar crease, left (ac_l)
Radix projection | The height at the nasion (n), measured from the anterior corneal plane (cp)
Upper facial height | The distance between trichion (tr) and glabella (g)
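
Most of the definitions in Table 1 are straight-line distances between two landmarks, so once the landmarks are marked on a 3D model the measurements follow from plain Euclidean distances. The sketch below illustrates that idea only; it is not the web app's code. The function and dictionary names are hypothetical, and the coordinates are invented values in millimetres.

```python
import math

# Landmark pairs copied from the point-to-point definitions in Table 1.
MEASUREMENT_LANDMARKS = {
    "Columellar length": ("sn", "c"),
    "Infralobule": ("prn", "c"),
    "Lower facial height": ("sn", "me"),
    "Nasal bridge length": ("prn", "n"),
}


def compute_measurements(landmarks):
    """Return the Euclidean distance (in the unit of the input coordinates)
    for every measurement whose two landmarks are both present."""
    results = {}
    for name, (start, end) in MEASUREMENT_LANDMARKS.items():
        if start in landmarks and end in landmarks:
            results[name] = math.dist(landmarks[start], landmarks[end])
    return results


# Hypothetical coordinates in millimetres, invented for illustration only.
example_landmarks = {
    "sn": (0.0, 0.0, 0.0),
    "c": (0.0, 3.0, 4.0),
    "prn": (0.0, 6.0, 8.0),
}
example_results = compute_measurements(example_landmarks)
```

Radix projection is defined against a plane rather than a second point, so it would need a point-to-plane distance instead of the helper above.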
of their first and second set of measurements. The texture images of the 3D models were not annotated as done for the agreement analysis, and hence, 3D models did not have any red dots for the feature points.

We compared the results from the web app against the ICC analysis results of Rhinobase [2] since it was found highly reliable [23]. To check whether the data conform to the normality and stable variance assumptions of ICC analysis [4, 10], we utilized the Shapiro–Wilk statistical method and Levene’s test, which checks the homogeneity of variances (homoscedasticity). After checking the assumptions, we performed the ICC analysis and reported the values as well as their 95% confidence intervals. The intra-rater reliability was calculated using a single-rater, absolute agreement, two-way mixed-effects model. The inter-rater reliability was calculated using a two-rater, absolute agreement, two-way mixed-effects model [19].

Facial measurements

We selected facial parameters from the pertaining literature [6, 11] with an emphasis on selecting the feature points (landmarks) and measurements that a rhinoplasty surgeon would prefer and that were possible to measure using calipers. The angles were not included since their direct measurement on the patient was difficult and error-prone. Fifteen facial landmarks are marked and used to calculate 11 measurements. The definitions of the measurements are listed in Table 1.

Subjects

The study was conducted on 28 Caucasian volunteer subjects. The sample consisted of 17 women and 11 men, aged between 18 and 60 years. No exclusion criteria were applied. However, due to problems with some of the 3D face scans, which were noticed after the volunteers left, 12 of these subjects were removed and the analyses were performed on the remaining 16 subjects (eight men and eight women). The problems were due to the improper capture of the 3D structures of the face or due to issues with the texture image. Figure 3 shows examples of these issues.

Fig. 3 Face 3D scanning issues: First row shows samples of issues related to the texture image and second row shows issues about the 3D morphology around the subnasale area

Direct measurements

The direct measurements were taken with the aid of a caliper or a ruler, based on what seemed to be more likely to give an accurate result, as shown in Figure 4. The average time spent with the subjects for measurements is reported in Table 3.

Fig. 4 Taking direct measurements

Getting 2D images and measurements

2D images (pictures) were taken with a Nikon D600 camera body, with a Nikon AF Nikkor lens of f2.8 105 mm specifications mounted and flash of Nikon Speedlight SB-R200. Pictures were taken while subjects were seated, with Frankfort Horizontal parallel to the floor. The camera height was adjusted according to the subject’s height so that the subject’s head was horizontal to the camera lens. Each subject held a 10-cm ruler next to their face as a reference object for calibration. 2D images were used to calculate the measurements using Rhinobase, and the process was timed for each subject.

3D face scans and measurements

We selected Bellus3D for its easy-to-use and fast scanning option with satisfactory accuracy [1], after testing other low-cost scanner options such as Intel Realsense7 and Structure Sensor.8 We utilized Bellus3D Face Camera Pro, model FCP01, for scanning 18 volunteers. The camera was mounted on a Samsung Galaxy S7 Edge phone (Samsung Electronics, Seoul, South Korea). The camera is used via an Android smartphone app. Ten more volunteers were scanned using an Apple iPhone X smartphone (Apple Inc., California, USA) and the Bellus3D Face iOS App downloaded from the Apple App Store. iPhone X and later versions have a true-depth camera that enables scanning 3D objects without an external camera. Figure 5 shows a subject scanning his face with the Bellus3D iOS app by rotating his head. Face 3D models generated using the custom Bellus3D camera and the Android app have around 240K triangles, while the face 3D models generated using the iPhone’s true-depth camera and the iOS App have around 200K triangles. A triangle is a simple polygon that forms a 3D model. The more triangles a 3D model has, the more detailed it becomes.

Fig. 5 3D Face scan using Bellus3D iPhone App. The subject rotates his/her head for the 3D scanning

The scanned 3D models were exported from the smartphone app as an .obj file with texture in .jpg format and then transferred to a computer. No processing was performed on the 3D models. The 3D model files were uploaded to the web app at digitized-rhinoplasty.com [30]. The web app uses 3D face scans as input to enable the 3D measurements. Utilizing 3D models eliminates the requirement of a scale reference such as a ruler.

A 3D model uploaded to the web app is shown in Figure 6. As the feature points were marked on the web app, the corresponding measurements were calculated automatically and presented on the screen. The total time needed to calculate all measurements for each subject was noted. The measurements for each subject were entered into an MS Excel file and imported into the statistical analysis software.

Fig. 6 A 3D model is being analyzed using the web app at digitized-rhinoplasty.com

7 Intel Realsense: https://www.intelrealsense.com/depth-camera-d435i/.
8 Structure Sensor: https://structure.io/structure-sensor.

Results

The statistical analyses of agreement and inter-/intra-reliability that are presented in this section were performed using IBM SPSS Statistics, version 26 (IBM Corp., Armonk, NY, USA).
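
The Bland–Altman procedure described under “Agreement analysis” (mean difference, limits at mean ± 1.96 × SD, and a regression of the difference on the mean to look for proportional bias) can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the SPSS procedure used in the study: the function name and the paired sample values are hypothetical, and the significance test on the slope is omitted.

```python
import statistics


def bland_altman(method_a, method_b):
    """Summarise agreement between two paired measurement series."""
    if len(method_a) != len(method_b) or len(method_a) < 3:
        raise ValueError("need two paired series with at least 3 values")
    diffs = [a - b for a, b in zip(method_a, method_b)]
    means = [(a + b) / 2 for a, b in zip(method_a, method_b)]

    bias = statistics.mean(diffs)   # mean difference (middle line of the plot)
    sd = statistics.stdev(diffs)    # sample SD of the differences
    lower = bias - 1.96 * sd
    upper = bias + 1.96 * sd

    # Least-squares slope with the difference as the dependent variable and
    # the mean as the independent variable; a slope near 0 suggests no
    # proportional bias.
    mean_x = statistics.mean(means)
    sxx = sum((x - mean_x) ** 2 for x in means)
    sxy = sum((x - mean_x) * (d - bias) for x, d in zip(means, diffs))
    slope = sxy / sxx if sxx else float("nan")

    return {
        "bias": bias,
        "lower": lower,
        "upper": upper,
        "slope": slope,
        "all_within_limits": all(lower <= d <= upper for d in diffs),
    }


# Hypothetical paired readings in millimetres, invented for illustration only.
web_app_mm = [10.0, 20.0, 30.0, 40.0]
blender_mm = [10.5, 19.5, 30.5, 39.5]
summary = bland_altman(web_app_mm, blender_mm)
```

Because the limits are built from the differences themselves, most points fall inside them by construction; the width of the limits and the size of the bias are what indicate whether the agreement is clinically acceptable.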
Statistical analysis of agreement

Eleven measurements were taken for 16 subjects using both the web app and Blender. Figure 7 shows the Bland–Altman plots for analyzing the agreement between the measurements taken using Blender and the web app. The blue line shows the mean of the differences, and the red lines show the 95% confidence limits for the measures. We see that the data fall within these confidence limits, hence showing an agreement.

The normality assumption was tested using the Shapiro–Wilk test, and it did not show a significant departure from normality. W(16) values are between 0.92 and 0.98, and p-values are well above 0.05. Based on skewness values, the distributions are approximately symmetric or moderately skewed. The kurtosis values range between −1.15 and 0.70. Both skewness and kurtosis values are acceptable for normal distribution [12].

To check the proportional bias, we ran linear regression using SPSS and selected the ‘difference’ as the dependent variable and the ‘mean’ as the independent variable. The unstandardized coefficient beta values (representing the slope of the linear regression line) for the mean are very close to 0.00 (ranging between −2.26E−02 and 1.71E−02), and p-values are not statistically significant. This shows that there is no proportional bias.

Fig. 7 Bland–Altman plots for each measurement

Statistical analysis of intra‑ and inter‑rater reliability

We utilized ICC analysis to test the reliability. To see whether the data conform to normality and stable variance assumptions, we used the Shapiro–Wilk statistical method for the normality check and used Levene’s test for the homoscedasticity check.

All of the Shapiro–Wilk test p-values for the intra-rater measurements with the web app are not significant and hence conform to normality. The Shapiro–Wilk test p-values for some of the inter-rater measurements were significant. Therefore, we checked the skewness and kurtosis for these measurements and concluded that skewness and kurtosis values are in acceptable ranges for normal distribution.

A few of the p-values for the Shapiro–Wilk test of the intra-rater measurements with Rhinobase were significant. Therefore, we checked the skewness and kurtosis and concluded that these values are in acceptable ranges for normal distribution.

Some of the inter-rater measurements taken using Rhinobase have significant p-values (less than 0.05) for the Shapiro–Wilk test. The skewness and kurtosis values were within the acceptable range for all measurements except one (mid-facial height), which had a high kurtosis value. This is an indicator that the data have heavy tails or outliers.

We performed Levene’s test to check the homoscedasticity assumption for the ICC. Levene’s test showed that the variances for the measurements were equal, with one exception: the p-value for the columella length measurements using Rhinobase is 0.04 (slightly below 0.05). Heteroscedasticity indicates the presence of outliers in the data. Our sample size is small (n = 16), and the effect of outliers would decrease as the sample size increases.

Table 2 presents the results of the ICC analysis for the inter- and intra-reliability of the web app and Rhinobase. An ICC of less than 0.5 is considered as poor, 0.50 to 0.75 as fair, 0.75 to 0.90 as good, and 0.90 to 1.00 as excellent reliability [18].

Compared to Rhinobase for the same measurements on the same subjects, the web app has shown not to be inferior in terms of intra-rater and inter-rater reliability.

Table 2 Results of the ICC statistical analysis (n = 16); values are ICC (95% confidence interval)
Measurement | Face Analyzer | Rhinobase
Inter-rater
Base bony width | 0.491 (−0.201 to 0.837) | 0.32 (−0.189 to 0.721)
Columella length | 0.654 (−0.025 to 0.882) | 0.334 (−0.306 to 0.724)
Infratip lobule | 0.678 (0.123 to 0.885) | 0.008 (−0.625 to 0.531)
Interalar distance | 0.762 (−0.099 to 0.933) | 0.597 (−0.185 to 0.889)
Intercanthal distance | 0.402 (−0.239 to 0.777) | 0.509 (−0.243 to 0.84)
Lower facial height | 0.602 (−0.247 to 0.883) | 0.922 (0.75 to 0.974)
Mid-facial height | 0.701 (−0.228 to 0.918) | 0.653 (−0.044 to 0.882)
Nasal bridge length | 0.847 (0.553 to 0.947) | 0.624 (−0.006 to 0.866)
Premaxilla left | 0.719 (0.079 to 0.907) | 0.719 (0.206 to 0.901)
Radix projection | 0.663 (0.051 to 0.882) | 0.791 (0.406 to 0.927)
Upper facial height | 0.852 (0.538 to 0.95) | 0.574 (−0.152 to 0.85)
Intra-rater
Base bony width | 0.927 (0.788 to 0.974) | 0.899 (0.602 to 0.968)
Columella length | 0.634 (−0.096 to 0.874) | 0.435 (−0.467 to 0.795)
Infratip lobule | 0.301 (−0.902 to 0.751) | 0.625 (−0.067 to 0.869)
Interalar distance | 0.989 (0.967 to 0.996) | 0.995 (0.985 to 0.998)
Intercanthal distance | 0.917 (0.769 to 0.971) | 0.9 (0.711 to 0.965)
Lower facial height | 0.787 (0.415 to 0.924) | 0.96 (0.889 to 0.986)
Mid-facial height | 0.965 (0.693 to 0.991) | 0.853 (0.59 to 0.948)
Nasal bridge length | 0.951 (0.861 to 0.983) | 0.829 (0.391 to 0.945)
Premaxilla left | 0.743 (0.268 to 0.91) | 0.434 (−0.269 to 0.781)
Radix projection | 0.924 (0.779 to 0.973) | 0.948 (0.855 to 0.982)
Upper facial height | 0.987 (0.962 to 0.995) | 0.932 (0.805 to 0.976)

Comparison of direct, 2D, and 3D facial analysis methods

We evaluated the direct, 2D (photography), and 3D methods for facial analysis during our study. We timed the calculation of measurements for each method and listed the average values in Table 3. The table also presents the cost, discomfort, and required expertise information for the direct (manual), 2D (photography), and 3D facial analysis methods.

Discussion

The literature emphasizes the advantages of using 3D models for facial analysis [21, 26, 30, 31, 33]. The web app works with 3D models and benefits from these advantages. As reported in Table 3, the time spent taking the measurements is considerably less, and the measurements are repeatable. Also, the methodology causes less discomfort for the patient. Moreover, the time to acquire the image/model is less. The web app is free software publicly available at digitized-rhinoplasty.com. However, there is a cost associated with acquiring 3D models, as listed in Table 3. This cost is expected to decrease as more smartphone apps become available for facial 3D scanning. Current true-depth smartphone cameras generate a 3D mesh with around 200K polygons.9 As the cameras improve, the number of polygons will increase, and hence, the precision will become better. As the precision increases, we will be able to measure smaller distances. Even with the current technology, we are able to measure distances between two feature points as close as 0.3 mm.

The agreement between measurements taken using the web app and the established open-source 3D modeling software, Blender, shows that the measurements taken using the web app are accurate. The intra-reliability scores of the web app for seven measurements are considered excellent, two measurements are good, one measurement is fair, and one measurement is poor, as listed in Table 2 [18]. The inter-reliability score of the web app for three measurements is considered good, that for six measurements is fair, and that for two measurements is close to fair, as listed in Table 2. The reliability scores for the web app are consistently better than those for Rhinobase software and have shown not to be inferior to Rhinobase, which was found highly reliable in a previous study [23]. This non-inferiority conclusion might have been caused by the lack of statistical power of the study due to the small sample size.

One of the main limitations of this study is that it is a small-scale study (n = 16). This makes the statistical analysis sensitive to outliers. Also, randomization was not performed due to the small number of subjects. Due to the nature of the study, applying blinding methods was not possible, and hence, raters were aware of the methodology while taking the measurements.

9 Using an iPhone 11 and Bellus3D FaceApp version 2.0.2.25P.
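
The two absolute-agreement ICC forms named in the reliability methodology (single rater for intra-rater, average of the raters for inter-rater) and the reliability bands quoted from [18] can be sketched from the standard two-way ANOVA mean squares. This is an illustration under assumptions rather than the SPSS output of the study: the function names and the toy ratings are hypothetical, confidence intervals are omitted, and the handling of values that fall exactly on a band boundary is an assumption.

```python
from collections import Counter


def two_way_mean_squares(ratings):
    """ratings[i][j] is the value for subject i in column j (rater or
    session). Returns (MSR, MSC, MSE, n, k) from a two-way ANOVA."""
    n, k = len(ratings), len(ratings[0])
    grand = sum(sum(row) for row in ratings) / (n * k)
    row_means = [sum(row) / k for row in ratings]
    col_means = [sum(row[j] for row in ratings) / n for j in range(k)]
    ss_rows = k * sum((m - grand) ** 2 for m in row_means)
    ss_cols = n * sum((m - grand) ** 2 for m in col_means)
    ss_err = sum(
        (ratings[i][j] - row_means[i] - col_means[j] + grand) ** 2
        for i in range(n) for j in range(k)
    )
    return (ss_rows / (n - 1), ss_cols / (k - 1),
            ss_err / ((n - 1) * (k - 1)), n, k)


def icc_absolute_single(ratings):
    """Two-way, absolute agreement, single measurement."""
    msr, msc, mse, n, k = two_way_mean_squares(ratings)
    return (msr - mse) / (msr + (k - 1) * mse + k * (msc - mse) / n)


def icc_absolute_average(ratings):
    """Two-way, absolute agreement, average of the k columns."""
    msr, msc, mse, n, k = two_way_mean_squares(ratings)
    return (msr - mse) / (msr + (msc - mse) / n)


def reliability_band(icc):
    """Bands quoted from [18]; boundary handling is an assumption."""
    if icc < 0.5:
        return "poor"
    if icc < 0.75:
        return "fair"
    if icc < 0.90:
        return "good"
    return "excellent"


# Toy ratings (hypothetical): the second column is always 1 unit higher, so
# ranking is perfectly consistent but absolute agreement is penalised.
toy = [[1.0, 2.0], [2.0, 3.0], [3.0, 4.0]]
toy_single = icc_absolute_single(toy)
toy_average = icc_absolute_average(toy)

# Inter-rater Face Analyzer ICCs copied from Table 2.
inter_rater_web_app = [0.491, 0.654, 0.678, 0.762, 0.402, 0.602,
                       0.701, 0.847, 0.719, 0.663, 0.852]
band_counts = Counter(reliability_band(v) for v in inter_rater_web_app)
```

Applied to the inter-rater column of Table 2, the bands give three good, six fair, and two values below 0.5, which matches the counts stated in the Discussion.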

Table 3 Comparison of manual, 2D, and 3D methods for facial analysis
Measurement type | Direct (manual) | 2D image (photography) | 3D scan
Time to get the image | N/A | 10 min | 20 s
Cost to acquire the image | Caliper | Minimal if a smartphone is utilized or the cost of the photograph camera | Around $600 if a low-cost 3D camera is used with an Android smartphone or $1 per 3D model using iPhone X or higher [Costs are for a single Bellus3D camera and a single 3D model export from the Bellus3D iOS app as of April 2021.]
Time to measure | 15–20 min | 2 min | 1.5 min
Effort to repeat the measurements | Might be high [It might be difficult to arrange another meeting with the subject to repeat taking the measurements.] | Same as getting the first measurements | Same as getting the first measurements
Discomfort to subject | Considerable | Minimal | Minimal
Requires expertise | Yes | Yes | Minimal
Conclusion

The utilization of 3D computer modeling for facial analysis before and after a facial surgery has its advantages and has started to become more common due to recent advances in 3D technologies. This paper presents an agreement and reliability study to demonstrate the performance of the Face Analyzer web app that is designed for pre-operative planning and post-operative evaluation of facial surgeries using 3D face models [30].

The agreement analysis using the Bland–Altman plot shows that the web app and the Blender software agree for 11 measurements taken from 16 subjects. The web app’s inter- and intra-reliability was analyzed using the ICC test. The reliability of the web app for 11 measurements is mostly good to excellent. Compared to software which was found highly reliable, the web app performs consistently better for reliability and has shown not to be inferior. We also compared the direct (manual), 2D, and 3D measurements in terms of accuracy, ease of use, and cost. The results show that the web app, which is available at https://digitized-rhinoplasty.com, provides accurate, repeatable, affordable, and fast facial analysis.

Acknowledgements We would like to thank Elif Topsakal for her help in statistical analysis. We would like to thank Julian Maniquis for helping with the agreement analysis using Blender. We thank all the volunteers who took part in this study. We also would like to thank the reviewers of IJCARS for their valuable and constructive feedback on the manuscript.

Funding This study is not funded by any grant.

Declarations

Conflict of interest The authors declare that they have no conflict of interest.

Ethical approval The study was approved by the Clinical Research Ethics Committee of Demiroglu Bilim University, Istanbul, Turkey (Decision number 03.03.2020/2020.05.03). Written informed consent for performing the direct measurements, 2D photography, 3D scans, data analysis, and publication of associated results was obtained beforehand from all volunteers.

Informed consent In accordance with the provisions of the General Data Protection Regulation (EU) 2016/679, all subjects shown in the images and the owners of all the personal data shown in this article gave their written consent to its publication. All authors have participated in (a) conception and design, or analysis and interpretation of the data; (b) drafting the article or revising it critically for important intellectual content; and (c) approval of the final version. This manuscript has not been submitted to, nor is under review at, another journal or other publishing venue. The authors have no affiliation with any organization with a direct or indirect financial interest in the subject matter discussed in the manuscript.

References

1. Amornvit P, Sanohkan S (2019) The accuracy of digital face scans obtained from 3d scanners: an in vitro study. Int J Environ Res Public Health 16:5061. https://doi.org/10.3390/ijerph16245061
2. Apaydin F, Akyildiz S, Hecht DA, Toriumi DM (2009) Rhinobase: a comprehensive database, facial analysis, and picture-archiving software for rhinoplasty. Arch Facial Plast Surg 11(3):203–211. https://doi.org/10.1001/archfacial.2009.35
3. Baysal A, Sahan AO, Ozturk MA, Uysal T (2016) Reproducibility and reliability of three-dimensional soft tissue landmark identification using three-dimensional stereophotogrammetry. Angle Orthod 86(6):1004–1009. https://doi.org/10.2319/120715-833.1
4. Bobak C, Barr P, O’Malley A (2018) Estimation of an inter-rater intra-class correlation coefficient that overcomes common assumption violations in the assessment of health measurement scales. BMC Med Res Methodol. https://doi.org/10.1186/s12874-018-0550-6
5. Ceinos R, Tardivo D, Bertrand MF, Lupi-Pegurier L (2016) Inter- and intra-operator reliability of facial and dental measurements using 3d-stereophotogrammetry. J Esthet Restor Dent 28(3):178–189. https://doi.org/10.1111/jerd.12194
6. Celikoyar MM, Perez MF, Akbas MI, Topsakal O (2021) Facial surface anthropometric features and measurements with an emphasis on rhinoplasty. Aesthetic Surg J. https://doi.org/10.1093/asj/sjab190
7. Community BO (2021) Blender - a 3d modelling and rendering package. http://www.blender.org. Accessed 21 Mar 2021
8. Dindaroglu F, Kutlu P, Duran G, Gorgulu S, Aslan E (2015) Accuracy and reliability of 3d stereophotogrammetry: a comparison to direct anthropometry and 2d photogrammetry. Angle Orthod. https://doi.org/10.2319/041415-244.1
9. Dobratz E, Tran V, Hilger P (2010) Comparison of techniques used to support the nasal tip and their long-term effects on tip position. Arch Facial Plast Surg 12(3):172–179
10. Dogan N (2018) Bland-Altman analysis: a paradigm to understand correlation and agreement. Turk J Emerg Med. https://doi.org/10.1016/j.tjem.2018.09.001
11. Farkas L (1994) Examination. In: Anthropometry of the head and face, 2 edn. Raven Press, New York, pp 3–56 (1994)
12. George D, Mallery P (2009) SPSS for windows step by step: a simple study guide and reference, 17.0 Update, 10th edn. Allyn & Bacon, Inc., USA
13. Giavarina D (2015) Understanding bland altman analysis. Biochemia Medica. https://doi.org/10.11613/BM.2015.015
14. Goffart Y (2010) Morphing in rhinoplasty: predictive accuracy and reasons for use. B-ENT 6(Suppl 15):13–19
15. Heike C, Cunningham M, Hing A, Stuhaug E, Starr J (2009) Picture perfect? Reliability of craniofacial anthropometry using three-dimensional digital stereophotogrammetry. Plast Reconstr Surg 124:1261–1272. https://doi.org/10.1097/PRS.0b013e3181b454bd
16. Hong C, Choi K, Kachroo Y, Kwon T, Nguyen A, McComb R, Moon W (2017) Evaluation of the 3dmdface system as a tool for soft tissue analysis. Orthod Craniofac Res 20(S1):119–124
17. Kau C, Richmond S, Zhurov A, Knox J, Chestnutt I, Hartles F, Playle R (2005) Reliability of measuring facial morphology with a 3-dimensional laser scanning system. Am J Orthod Dentofac Orthop 128(4):424–430
18. Koo T, Li M (2016) A guideline of selecting and reporting intraclass correlation coefficients for reliability research. J Chiropr Med 15(2):155–163
19. Landers R (2015) Computing intraclass correlations (ICC) as estimates of interrater reliability in SPSS. Winnower. https://doi.org/10.15200/winn.143518
20. Lekakis G, Claes P, Hamilton GS, Hellings PW (2016) Evolution of preoperative rhinoplasty consult by computer imaging. Facial Plast Surg 32(1):80–87
21. Lekakis G, Hens G, Claes P, Hellings PW (2019) Three-dimensional morphing and its added value in the rhinoplasty consult. Plast Reconstruct Surg Glob Open 7:1
22. Martin Bland J, Altman D (1986) Statistical methods for assessing agreement between two methods of clinical measurement. Lancet 327(8476):307–310 (Originally published as Volume 1, Issue 8476)
23. Meruane M, Ayala M, Garcia-Huidobro M, Andrades P (2016) Reliability of nasofacial analysis using rhinobase software. Aesthetic Plast Surg 40:149–156. https://doi.org/10.1007/s00266-015-0569-6
24. Mokkink L, Prinsen C, Bouter L, De Vet H, Terwee C (2016) The consensus-based standards for the selection of health measurement instruments (COSMIN) and how to select an outcome measurement instrument. Braz J Phys Ther 20:105–113
25. Othman SA, Ahmad R, Merican A, Jamaludin M (2013) Reproducibility of facial soft tissue landmarks on facial images captured on a 3d camera. Aust Orthod J 29:58–65
26. Persing S, Timberlake A, Madari S, Steinbacher D (2018) Three-dimensional imaging in rhinoplasty: a comparison of the simulated versus actual result. Aesthetic Plast Surg 42(5):1331–1335
27. Plooij J, Swennen G, Rangel F, Maal T, Schutyser F, Bronkhorst E, Kuijpers-Jagtman A, Berge S (2009) Evaluation of reproducibility and reliability of 3d soft tissue analysis using 3d stereophotogrammetry. Int J Oral Maxillofac Surg 38:267–273
28. Stralen K, Dekker F, Zoccali C, Jager K (2012) Measuring agreement, more complicated than it seems. Nephron Clin Pract 120:c162–c167. https://doi.org/10.1159/000337798
29. Toma A, Zhurov A, Playle R, Ong E, Richmond S (2009) Reproducibility of facial soft tissue landmarks on 3d laser-scanned facial images. Orthod Craniofac Res 12(1):33–42
30. Topsakal O, Akbas IM, Demirel D, Nunez R, Simith B, Perez M, Celikoyar MM (2020) Digitizing rhinoplasty: a web application with three-dimensional preoperative evaluation to assist rhinoplasty surgeons with surgical planning. Int J Comput Assist Radiol Surg 15(11):1941–1950. https://doi.org/10.1007/s11548-020-02251-7
31. Toriumi DM, Dixon TK (2011) Assessment of rhinoplasty techniques by overlay of before-and-after 3D images. Facial Plast Surg Clin North Am 19(4):711–723
32. Tzou CHJ, Artner NM, Pona I, Hold A, Placheta E, Kropatsch WG, Frey M (2014) Comparison of three-dimensional surface-imaging systems. J Plast Reconstr Aesthet Surg 67(4):489–497. https://doi.org/10.1016/j.bjps.2014.01.003
33. Willaert RV, Opdenakker Y, Sun Y, Politis C, Vermeersch H (2019) New technologies in rhinoplasty: a comprehensive workflow for computer-assisted planning and execution. Plast Reconstruct Surg Glob Open 7:3
34. Wong DJY, Oh DAK, Ohta DE, Hunt DAT, Rogers DGF, Mulliken DJB, Deutsch DCK (2008) Validity and reliability of craniofacial anthropometric measurement of 3d digital photogrammetric images. Cleft Palate Craniofac J 45(3):232–239. https://doi.org/10.1597/06-175 (PMID: 18452351)
35. Zaki R, Bulgiba A, Nordin N, Ismail N (2012) Statistical methods used to test for agreement of medical instruments measuring continuous variables in method comparison studies: A systematic review. PLoS ONE 7(5):37908
36. Zaki R, Bulgiba A, Nordin N, Ismail N (2013) A systematic review of statistical methods used to test for reliability of medical instruments measuring continuous variables. Iran J Basic Med Sci 16:803–807

Publisher’s Note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.