Professional Documents
Culture Documents
of the human skeletal form RESULTS: All skeletal proportions (SPs) are high-
ly heritable (~30 to 50%), and genome-wide as-
Eucharist Kun, Emily M. Javan, Olivia Smith, Faris Gulamali, Javier de la Fuente, Brianna I. Flynn, sociation studies of these traits identified 145
Kushal Vajrala, Zoe Trutner, Prakash Jayakumar, Elliot M. Tucker-Drob, Mashaal Sohail, independent loci. These loci are enriched in genes
Tarjinder Singh*, Vagheesh M. Narasimhan* that regulate skeletal development as well as those
that are associated with rare human skeletal dis-
INTRODUCTION: Humans are the only bipedal RATIONALE: One approach to studying skeletal eases and abnormal mouse skeletal phenotypes.
great apes, owing to our distinctive skeletal form. form is to obtain a map of regions in the ge- Genetic correlation and genomic structural equa-
Morphological changes that contribute to our nome that affect skeletal development and mor- tion modeling indicated that limb proportions
skeletal form have been studied extensively in phology. Previously, this has been examined exhibited strong genetic sharing but were genet-
paleoanthropology. With the exception of stand- mainly through animal models and compara- ically independent of width and torso proportions.
ing height, examining the genetic basis for dif- tive genomics, but these approaches are largely Phenotypic and polygenic risk score analyses
ferential and specific growth of individual bones low throughput. A complementary approach identified specific associations between osteo-
and their evolution has been challenging be- is to examine the genetic basis of variation in arthritis of the hip and knee, which are the lead-
cause of limited sample sizes. skeletal traits in humans. In this work, we ap- ing causes of adult disability in the United States,
and SPs of the corresponding regions. We also
Knee OA (M17)
of knee (M23)
Hip OA (M16)
H
such as osteoarthritis (OA), the leading cause of
umans are the only primates who are been studied extensively. Early work using adult disability in the United States (25, 26),
normally bipedal, owing to our distinc- forward genetic screens in Drosophila iden- are thought to be influenced by a variety of risk
tive skeletal form, which stabilizes the tified homeobox genes as key regulators of factors that range across obesity, mechanical
upright position. Bipedalism is enabled anatomical development in invertebrates (8). stresses, genetic factors, and even the geometric
by specific anatomical properties of the Subsequent experiments in vertebrates, includ- structure of certain bones (27). Although some
human skeleton, including shorter arms rela- ing fish, chickens, and mice, identified addition- small studies have examined the relationship
tive to legs, a narrow body and pelvis, and the al gene families that are crucial in the regulation of certain skeletal element lengths such as
orientation of the vertebral column (1–3). These of skeletal development and form (9, 10). leg-length discrepancy and OA (28), how the
broad changes to skeletal proportions (SPs) Comparative genomic and evolutionary devel- skeletal frame may exacerbate an individual’s
likely began to occur around the separation opmental biology approaches have produced development of osteoarthritic disease has not
of the human and chimpanzee lineages, and several insights into the genetic basis of skel- been fully studied (27, 29).
as a result, may have facilitated the use of tools etal structure, from the underpinnings of conver- In this study, we applied methods in com-
and accelerated cognitive development (4, 5). gent limb loss in snakes and limbless lizards puter vision to derive comprehensive human
Fossil evidence showing major morphological (11, 12) to increased limb lengths in jerboas skeletal measurements from full-body dual-
changes in the length of the limbs, torso, and when compared with mice (13). However, these energy x-ray absorptiometry (DXA) images at
body width suggest that these changes were approaches do not provide an unbiased and biobank scale. We then performed genome-
gradual, with incremental development over comprehensive map of the genetic loci that wide scans on 23 generated phenotypes to
the course of several million years (6, 7). However, regulate SPs and overall body plan. In addition, identify loci associated with variation in the
despite more than a hundred years of effort in many of these approaches largely focus on skeletal form. Using summary statistics from
paleoanthropology documenting morphological examining the impact of loss-of-function muta- these IDPs, we identified biological processes
changes of the skeletal form in human evolution, tions, which often have widespread effects on linked with human SPs and studied the pheno-
evidence of genomic change has been elusive. the entire skeleton. The subset of genes re- typic and genetic correlation between these
In developmental biology, the mechanisms sponsible for differential and specific growth measures and a range of external phenotypes,
and processes underlying animal limb develop- of individual bones remains unknown. with an emphasis on musculoskeletal dis-
ment, morphology, and broad body plan have Genome-wide association studies (GWASs) orders. Finally, we investigated the impact of
of human skeletal traits are a direct and comple- natural selection on these traits to understand
1
mentary approach to characterizing the genetic how skeletal morphology is linked to human
Department of Integrative Biology, The University of Texas
at Austin, Austin, TX, USA. 2Icahn School of Medicine at
basis of traits. Twin studies suggest that the evolution and bipedalism.
Mount Sinai, New York, NY, USA. 3Department of heritability of SPs range between 0.40 and 0.80
Psychology, The University of Texas at Austin, Austin, TX, (14), similar to the heritability of standing Results
USA. 4Department of Surgery and Perioperative Care, The
height (15), a skeletal trait that has served as an A deep-learning approach for quality control and
University of Texas at Austin, Austin, TX, USA. 5Centro de
exemplary quantitative trait in human genetics. quantification of biobank-scale imaging data
Ciencias Genómicas (CCG), Universidad Nacional Autónoma
de México (UNAM), 62209 Cuernavaca, Mexico. 6Department Meta-analysis of more than 5 million individu- To study the genetic basis of human SPs, we
of Psychiatry, Columbia University Irving Medical Center,
als has identified a saturated map of common jointly analyzed DXA and genetic data from
New York, NY, USA. 7The New York Genome Center, New
York, NY, USA. 8Mortimer B. Zuckerman Mind Brain Behavior genetic variants that are associated with stand- 42,284 individuals in the UK Biobank (UKB).
Institute at Columbia University, New York, NY, USA. ing height (16). However, height is among the Individuals from this dataset are between 40
9
Department of Statistics and Data Science, The University most straightforward and accurate of quantita- and 80 years old and reflect adult skeletal
of Texas at Austin, Austin, TX, USA.
*Corresponding author. tsingh@nygenome.org (T.S.); tive traits to measure. Other skeletal elements, morphology. We report baseline information
vagheesh@utexas.edu (V.M.N.) such as limb, torso, and shoulder lengths, are about our analyzed cohort in (30) and in table
S1. We acquired 328,854 DXA scan images Validation of human skeletal length estimates overall height, we also computed ratios of seg-
across eight imaging modalities comprising After training and validating the deep-learning ments with each other and obtained a total of
full-body transparent images, full-body opaque model on the 297 manually annotated images, 21 different ratio IDPs along with the angle
images, anteroposterior (AP) views of the left we applied this model to predict the 14 land- measure (TFA) (table S3). These ratios are
and right knees, AP views of the hips, and AP marks on the rest of the 39,172 full-body DXA referred to in the text as Segment:Segment (Hip
and lateral views of the spine. For quality con- images. We then calculated pixel distances Width:Height, Torso Length:Legs, and so on).
trol (QC), we first developed a deep learning– between pairs of landmarks that corresponded We then examined differences in SPs across
based multiclass predictor to select full-body to seven bone and body-length segments (30) sex and age. In line with well-known observations,
transparent images from the pool of eight total (Fig. 1B and table S3). We also computed an Hip Width:Height (Student’s t test p < 10−15)
imaging modalities. We developed a second angle measure between the tibia and the femur and Torso Length:Height (Student’s t test p <
deep-learning classifier to remove cropping (tibiofemoral angle, or TFA) (Fig. 1B). To stan- 10−15) were significantly larger in women than in
artifacts. Finally, we excluded images with atyp- dardize images with different aspect ratios, we men (36), but we also observed that Humerus:
ical aspect ratios and padded them to uniform rescaled pixels into centimeters for each image Height was also significantly larger in women
sizes (30) (Fig. 1A). After our QC process, we resolution by regressing the height in pixels than in men (Student’s t test p = 1.45 × 10−5) (30)
were left with 39,469 images for analysis. against standing height in centimeters as mea- (table S8). In addition, we found that all body
After image QC, we manually labeled 14 sured by the UKB assessment (30). We then proportions vary slightly but significantly as a
landmarks at pixel-level resolution on 297 removed individuals with any skeletal measure- function of age (30) (table S9). We also exam-
images for use as training data. These labels ments that were more than four standard de- ined how body proportions vary as a function of
were independently validated by an orthope- viations from the mean. overall height and found that Torso Length:Legs
dic team. The 14 landmarks include the major After outlier removal, we validated the accu- decreases with height [Pearson correlation (r) =
joints—the wrist, elbow, shoulder, hip, knee, racy of our measurements on the remaining −0.21], suggesting that increases in height are
and ankle—and the position of each eye. The samples in four ways. First, the error rate for driven more by increasing leg length rather than
segments connecting these landmarks reflect segment length from the model compared with torso length (Fig. 2A). Arms:Legs also decreases
A
280K DXA images 42,284 Full body 39,469 padded
DXA images full body DXA
images post-QC
Full body
transparent
Full body
opaque ResNet-152 ResNet-152
Multi-class classifier Incorrect Additional classifiers
AP
to select skeletal cropping to remove cropping
spine
full body images and other artifacts
Lateral
spine
AP
femur
5px
AP Amputation 5px
HUMERUS
TORSO LENGTH
FOREARM
HRNet
Landmark estimation
to detect 14 keypoints
on full body images HIP WIDTH
FEMUR
TIBIOFEMORAL ANGLE
TIBIA
HEIGHT
*length measures analyzed as
proportions of overall height
C D E F
70
Second visit measurements (cm)
200 60
65 Skeletal element
Right side measurements
55 Femur
190 60
50 Forearm
55
180 45 Hip Width
50
Height
40 45 Humerus
170
35 40 Shoulder Width
160 30 35 Tibia
25 30 Torso Length
150
r = 0.877 20 r = 0.996 25 r = 0.998
20 30 40 50 60 70 20 25 30 35 40 45 50 55 60 20 30 40 50 60 70
Tibia length: 40.79 ± 0.37 cm Phenotype measurements (cm) Left side measurements First visit measurements (cm)
Fig. 1. Deep learning–based image quantification. (A) QC process. Deep model to perform automatic annotation of landmarks on the rest of images
learning–based classifiers were used to select full-body images from a pool of in the dataset from which measurements of skeletal length and other
DXA images of different body parts, as well as to remove images with artifacts, measurements were calculated. (C) Average HRNet measurement error when
resolution, or cropping issues. Full-body images were then padded to standardize compared with human-derived measurements of the tibia across 100 validation
image pixel size before phenotyping (the image presented here shows padding of images. (D) Correlation of length measurements and height. (E) Correlation
5 pixels on each side). (B) Image quantification. Deep learning–based image between left- and right-side measurements of the femur, humerus, forearm,
landmark estimation using the HRNet architecture is shown. During this process, and tibia. (F) Correlation of lengths measured from the first and second
297 training images annotated with specific landmarks were used to train the imaging visits for the same individual.
Shoulder Width
Torso Length
Hip Width
Humerus
p-value
Forearm
Femur
1e-09
Tibia
Height
1e-08
Arms:Torso Length 1e-07
Forearm:Humerus 1e-06
Bonferroni 1e-05
Shoulder Width:Torso Length corrected 0.0001
Tibia:Femur
Arms:Legs
Shoulder Width:Arms Humerus
Arms
Shoulder Width:Legs
Torso Length:Legs Forearm
Phenotypic Correlations
Hip Width:Torso Length
Shoulder Width
Hip Width:Shoulder Width
Torso
Hip Width:Arms Hip Width
Hip Width:Legs
Torso Length
-0.4 -0.3 -0.2 -0.1 0.0 0.1 0.2
Phenotypic correlation Femur
Legs
Tibia
Genetic Correlations
C 0.21 0.26 1
0.95 0.97
Skeletal
Arms Hip Shoulder Torso factor
Femur
Forearm
Hip Width
Humerus
Shoulder Width
Tibia
Torso Length
Fig. 2. Genetic architecture. (A) Correlation of SPs and overall height. Bars show ±2 SE. (B) Genotype (lower-left triangle) and phenotype (upper-right triangle)
correlation of SPs. Overall correlation is shown in color, and the p value of the correlation is visualized by size. A Bonferroni-corrected threshold is also shown.
(C) Solution for a genomic SEM model for the genetic covariance structure shown in (B) shows one common factor loading for arms, an additional factor for legs,
and independent factors for each of the torso-related traits (hip width, shoulder width, and torso length). (D) Sex-specific analysis showing the ratio of the
standardized effect size of the polygenic score on each trait (±2 SE) in males to the effect in females in a hold-out dataset.
loci, 145 are independently significant [linkage ment and found that 95% of genome-wide Estimates from LDSC and GCTA-REML were
disequilibrium (r2) < 0.1] across all eight pheno- significant loci had the same direction of effect virtually identical (fig. S10); in this work, we
types (92 after Bonferroni correction for eight when carrying out GWASs in these alternate report estimates from GCTA-REML. Limb pro-
traits). Of the 145 independent loci, 37 loci are ways (30). portions had positive genetic correlations with
only significant in SPs after conditioning on each other (rg = 0.34 to 0.55). Upper arms and
all SNPs discovered in a saturated GWAS for Genetic correlations and factor analysis of SPs legs (Humerus:Height–Femur:Height rg = 0.55,
height (16, 30) (table S13). As a sensitivity anal- We calculated the genetic correlation between p = 1.59 × 10−66) and lower arms and legs
ysis, we also examined the genetic effect of each pair of traits to investigate the degree of (Forearm:Height–Tibia:Height rg = 0.51, p =
skeletal lengths before and after height adjust- genetic sharing between each skeletal measure. 6.01 × 10−50) were significantly more correlated
Skeletal proportion
A Femur
Forearm
1
Hip Width
X
PA
Humerus
60 Shoulder Width
Tibia
Torso Length
S1
20
30
LE
E1 8
C
AD MT
AC 1
AS
H BX
AB
1
TE G
C
T
C
A R3
L1 1
7
2
FB P3B
H DIR N1 TB
P
log10 p
X
N
F SA
1
IT ISP 12 3
N F1
1A
O
TE EB
PR P
20
6
2
L1
3
AC C3
G RS
M
W OC GA 3
TS
M
S M IG
O
X9
SC S 2
O
D
C
AM
H LR
F2
EC
SO
TG M3
D 2
EI M C
3
AD
EF KD 2
PH
2
M E C
C
B8
1
P B
C
N
P7
D
S1 P
S
F
ER
D
G
D
BM
3
IZ
M
C
5
51
A1
N
1
M
ZM
SY
U
AD
EY
LI
10
O
LE
PD
FT
R
D
0
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22
Chromosome
Fig. 3. Genome-wide association results. (A) Manhattan plot of a GWAS performed across seven SPs and TFA; the lowest p value for any trait at each SNP is annotated.
Loci over the genome-wide significance threshold that are close to only a single gene are annotated. (B) Shown are mean values of proportion and angle traits across
individuals, the total number of genome-wide significant loci per trait, heritability (GCTA-REML), l (from LDSC), and associated genes of loci that are specific to each skeletal
trait (again annotating only loci that map to a region with a protein-coding gene within 20 kb of each clumped region). Illustration was created with BioRender.com.
than upper arms and lower legs (Humerus: der Width:Height, were largely uncorrelated was ≥0.0022, which was above our Bonferroni
Height–Tibia:Height rg = 0.38, p = 5.18 × 10−23) with limb-length proportions (30). No corre- threshold). Correlations between leg and width
or lower arms and upper legs (Forearm:Height– lations involving any pairwise combination of traits were marginally significant in three out of
Femur:Height rg = 0.34, p = 1.49 × 10−18). Body- arm and width traits were significant (the four comparisons, with the maximal correla-
width proportions, Hip Width:Height and Shoul- minimum p value across all such correlations tion (Hip Width:Height–Tibia:Height) being
0.23 (Fig. 2B and table S13). In addition, we To test for pervasive differences in the mag- with increased Forearm:Height. Mouse models
also computed phenotypic correlations between nitude of genetic effects, we performed sex- of MEIS−/− mice are specifically associated with
our traits, which were highly concordant with specific GWASs of all the skeletal traits and abnormal forelimb development (47). Similarly,
genetic correlations (r = 0.98). evaluated these polygenic scores in both sexes a common variant (rs1891308) near ADGRG6,
We used genomic structural equation model- in a hold-out dataset (30). This method had which encodes a G protein–coupled receptor, is
ing (genomic SEM) to produce an empirically recently been applied to examine sex-specific associated with increased torso length. Mice
derived low-dimensional representation of the effects in biobank traits (42). Across all SPs with conditional knockouts in ADGRG6 have
genetic covariance structure of the individual that we tested, polygenic scores had a signi- spine abnormalities that reduce torso length
SPs (41). We performed exploratory factor analy- ficantly larger standardized effect size (standard- (48). Thus, our GWAS of SPs identifies genes
sis to identify the likely number of factors and ized in males and females separately) in males that were previously associated with skeletal
built confirmatory models using odd-numbered compared with females (Student’s t test p < 1 × developmental biology and Mendelian skeletal
chromosomes for model building and even- 10−3 for all comparisons) (Fig. 2D). These results phenotypes, demonstrating the potential for
numbered chromosomes for validation, which are in line with previous work suggesting that future functional and knockout studies.
we compared using a range of model fit indices SPs, like other anthropometric traits, have clear Next, we conducted a transcriptome-wide
(30). Our preferred model of the genetic co- differences in the magnitude of sex-specific ef- association study (TWAS) that linked predicted
variance structure revealed five main factors fects when compared with other quantitative gene expression in skeletal muscle [based on
that governed SPs. First, we identified a single traits in the UKB (42). the Genotype-Tissue Expression project (GTEx v.7)
broad factor (Skeletal factor) that represents (49)] with our SP GWAS. In total, we identified
dimensions of genetic variation that are sta- Biological insights from skeletal associations 30 genes that were significantly associated with
tistically pleiotropic; that is, genetic variation We performed gene set enrichment analyses any one of our skeletal traits at a Bonferroni-
represented in each factor contributes to varia- in 10,678 gene sets using functional mapping corrected significance threshold across the total
tion in not just one phenotype but to variation and annotation (FUMA) of GWASs to identify number of gene and trait combinations (30)
in multiple phenotypes (30). All limb traits [both biological processes and pathways enriched in (table S21). Among the strongest TWAS asso-
Internal derangement
Internal derangement
0.80 1.20
Dorsalgia (M54)
Dorsalgia (M54)
Knee OA (M17)
Knee OA (M17)
of knee (M23)
of knee (M23)
Hip OA (M16)
Hip OA (M16)
Knee pain
Knee pain
Back pain
Back pain
Hip pain
Hip pain
p-value
1.0e-05
5.0e-05
Bonferroni 1.0e-04
Skeletal Skeletal
phenotype phenotype corrected ≥ 5.0e-04
Height Height
Humerus Humerus
Forearm Forearm
Femur Femur
Tibia Tibia
TFA TFA
risk based on polygenic scores derived from with an increased risk of arthritis and pain cardiovascular disease, cancer, and overall
our GWAS on SPs on the imaged set of indi- phenotypes in those specific areas. height (Fig. 5A).
viduals (Fig. 4B). We generated polygenic scores Second, we examined heritability enrich-
with Bayesian regression and continuous shrink- Evolutionary analysis ment using LDSC on genomic annotations that
age priors (51) using the significantly associated As human SPs are an important part of our reflect divergence at different time points in
SNPs and ran a phenome-wide association study transformation to bipedalism, we next inves- human evolution (Fig. 5B) following an approach
of the generated risk scores and traits, adjusting tigated whether variants associated with SPs outlined in Sohail (56) and Hujoel et al. (57).
for the first 20 principal components of ancestry have undergone accelerated evolution in hu- These annotations include regions that differ
and imputed sex (30). Polygenic scores of Hip mans in two ways. First, following a proce- in gene regulation between humans and pri-
Width:Height and TFA were associated with dure by Richard et al. (52) and Xu et al. (53), mates through stages of early development
an increased incidence of hip and knee OA, we examined whether genes associated with (58), regions that differ in expression between
respectively (p = 7.92 × 10−5, OR = 1.04; p = 1.73 × SPs overlapped human accelerated regions adult humans and macaques (59), and regions
10−4, OR = 1.04), in line with the phenotypic (HARs) more than expectation. HARs are seg- that are enriched and depleted of ancestry from
associations. In addition, we also saw signif- ments of the genome that are conserved through- archaic humans (60, 61). We then computed
icant association between back pain [both out vertebrate and great ape evolution but are heritability enrichment, h2(C), which measures
recorded on the ICD-10 (International Classi- notably different in humans (54). We gener- the proportion of heritability in an annotation
fication of Diseases, Tenth Revision) code and ated a null distribution by randomly sampling set divided by the proportion of SNPs in the
self-reported] and Torso Length:Height ( p = regions matched for overall gene length (30) annotation. In our analysis, we also simulta-
5.59 × 10−5, OR = 1.05; p = 5.71 × 10−6, OR = (Fig. 5A). For comparison, we also performed neously incorporated other regulatory elements,
1.02) (table S23). Neither the OA nor the mus- the same analysis on summary statistics from measures of selective constraint, and linkage
culoskeletal pain phenotypes that we tested the ENIGMA Consortium (55) and several com- statistics (baseline LDv2.2 with 97 annotations)
were significantly associated with overall height mon quantitative and disease traits from the (57, 62–64) to estimate h2(C) while minimizing
in this analysis [phenotypic associations: 1.10 × UKB (table S25). Genetic signals from several bias due to model misspecification (30).
10−2 < p < 8.51 × 10−1; polygenic risk score of the SP traits, in particular arm or leg length, Meta-analyzing across all our SP traits, we
associations: 2.17 × 10−3 < p < 3.88 × 10−1] ex- were significantly enriched in HARs (Arms: found enrichment in fetal human–gained en-
cept for polygenic risk scores of height and Legs, Humerus:Height, Arms:Height, Hips: hancers and promoters at early time points
back pain (p = 5.76 × 10−10) (tables S22 and Legs, Tibia:Femur, and Hip Width:Height had [7, 8.5, and 12 postconception weeks (pcw):
S23). In genomic SEM analyses, we observed FDR-adjusted p < 0.05). We also observed nom- h2(C) = 8.08, p = 5.91 × 10−44; h2(C) = 3.60, p =
similar patterns of genetic associations with inal enrichment for traits related to hair pig- 2.55 × 10−4; h2(C) = 3.65, p = 3.55 × 10−4; table
musculoskeletal diseases at the level of gen- mentation (FDR-adjusted p = 0.013), which has S26] but not in adults, suggesting that genes
eral genetic factors (30) (fig. S13 and table S24). also changed substantially in humans com- associated with SPs are differentially expressed
Taken together, these analyses suggest that in- pared with the great apes, and for schizophre- in early development between apes and hu-
creases in the length of skeletal elements that nia (FDR-adjusted p = 1.61 × 10−34). However, mans. Although we acknowledge that the anno-
are associated with the hip, knee, and back as a no enrichment (FDR-adjusted p > 0.05) was tations of differentially regulated elements are
ratio of overall height are exclusively associated observed for HARs in autoimmune disorders, from developing brain and not skeletal tissues,
Primates Apes
A B
Category Phenotypes
Skeletal Arms:Legs
Rhesus Homo Neanderthals
Humerus:Height Significance macaque Chimpanzee erectus
Arms:Height FDR<0.05 Denisovans
Hip Width:Legs FDR>0.05
Tibia:Femur Modern humans
Hip Width:Height 25 MYA 5 MYA 2 MYA ~500 KYA
Tibia:Height 10
Forearm:Height 9 *
Shoulder Width:Legs
Shoulder Width:Arms 8
Heritability Enrichment
Hip Width:Arms *
Shoulder Width:Height 7
Hip Width:Shoulder Width
Arms:Torso Length 6
Torso Length:Legs *
Torso Length:Height 5 *
Legs:Height
4
Femur:Height
Height 3
Dermatological Black Hair
Skin Pigmentation 2
Endocrine Diabetes Mellitus Type 2
Neurological Schizophrenia 1 * *
Caudate Volume
0
Brain Stem Volume
Neanderthal depleted
Neanderthal introgressed
Ancient selective sweeps
Brain Surface Area
Cancer Breast Cancer
Metabolic Vitamin D
Lymphocyte Count
Calcium
LDL-C
Autoimmune Rheumatoid Arthritis
Multiple Sclerosis
Gastrointestinal Inflammatory Bowel Disease
0.0 0.5 1.0 1.5 2.0 2.5 >3
-log(FDR-adjusted p-values)
Fig. 5. Evolutionary analyses. (A) Shown are p values of enrichment for Blue and orange intervals mark epigenetic annotations, whereas the other
overlap of HARs with genes associated with SP, autoimmune, dermatological, color intervals mark genetic annotations. Asterisks show significance at
neurological, endocrine, gastrointestinal, metabolic, psychiatric, and cancer- FDR < 0.05. A dashed line is drawn at y = 1 (no heritability enrichment). This
related traits compared with randomly sampled genes of comparable length. analysis was jointly performed with all genomic annotations in the baseline
Traits below the FDR-corrected threshold (0.05) are shown in orange, and LDv2.2 model. KYA, thousand years ago; MYA, million years ago. (C) Heritability
nonsignificant traits are shown in blue. (B) Meta-analysis of LDSC heritability enrichment analysis in human-gained enhancers and promoters at 7 pcw for each
enrichment across 21 SP traits for different evolutionary annotations that trait analyzed. Asterisks show significance at FDR < 0.05 across all genomic
represent different divergence points in human evolution. Annotations repre- annotations and traits analyzed in this study. A dashed line is drawn at x = 1 (no
sented in colors refer to fetal human–gained enhancers and promoters (blue), heritability enrichment). Error bars show 1 SE around each estimate. (D) Arm:Leg
adult human–gained enhancers and promoters (orange), ancient selective ratio and Hip Width:Height are the only two skeletal traits that show significant
sweeps (purple), putatively introgressed variants from Neanderthals (teal), and enrichment in both types (HARs and heritability across differentially regulated regions
genomic regions depleted in Neanderthal and Denisovan ancestry (teal). at 7 pcw) of evolutionary analysis. Illustration was created with BioRender.com.
fetal human–gained brain regulatory elements the different annotations, controlling for multi- depletion in regions of the genome that were
and adult human skeletal regulatory elements ple hypothesis correction at the level of FDR < depleted for Neanderthal and Denisovan an-
are correlated at 58% (56, 65). Moreover, we only 0.05. Out of 21 of our SP traits (Hip Width: cestry, particularly for overall leg length [h2(C) =
observed enrichment in developing, but not Height, Hip Width:Shoulder Width, Arms:Legs, 0.44, p = 5.89 × 10−5] (table S27). These results
adult, tissues, suggesting that the enrichment Shoulder Width:Torso Length, Hip Width:Arms, were consistent with another analysis that
is not driven by confounders of tissue type but Shoulder Width:Height, Hip Width:Legs, Shoul- showed a depletion of Neanderthal informa-
by differences in development between the two der Width:Legs, Shoulder Width:Arms), 9 were tive markers in contrast with modern human
species. As a second line of analysis, we also ex- significantly enriched at 7 pcw at FDR < 0.05 mutations, particularly for anthropometric traits
amined enrichment of individual traits across (Fig. 5C and table S27). In addition, we saw (66), and are suggestive of purifying selection.
The proportion traits that were significantly lation scale, enabling their utility for automated arise from pleiotropic increases or decreases
enriched across both types of evolutionary anal- phenotyping as imaging data becomes more in other skeletal elements that affect overall
ysis were associated with Arms:Legs and Hip integrated into large population biobanks. height. Thus, in interpreting our results, it is
Width ratios (Fig. 5D). These results suggest Beyond methodological improvements for important to only view each of our phenotypes
that specific SPs, but not overall height or sev- biobank-scale analysis, our results provide new as proportions of height rather than directly
eral other quantitative and disease traits exam- insights into musculoskeletal biology. Despite associated with individual skeletal element
ined by us or Sohail (56), underwent human more than a century of work in genetics in- lengths themselves.
lineage-specific evolution since the separation vestigating the development of limbs and the Epidemiological studies indicate that OA of
of humans from the great apes. overall body plan, a comprehensive genetic map the hip and the knee frequently do not occur
of variation that shapes the overall skeletal form together or in combination with OA in other
Discussion has been absent. Specifically, which genes and large joints, suggesting that local factors are
In this study, we used deep learning to under- how their expression regulates modular devel- important in OA pathogenesis (75–80). Speci-
stand the genetic basis of skeletal elements opment of the forelimb, hindlimb, and other fic abnormalities in skeletal morphology are
that make up the human skeletal form using long bones have not been fully characterized. now recognized as major biomechanical risk
DXA imaging data in a large population-based Additionally, whether natural selection has factors for the development of OA (81–86).
biobank. We carried out genetic correlation acted on these genes to alter the development The findings presented here of the association
and factor analysis to characterize the joint of limb proportions, thus allowing us to walk between specific SPs, but not overall height,
genetic architecture of these skeletal traits. upright, is still unknown. Our work provides and joint-specific OA highlight the biomechanical
We identified 145 independent genetic loci asso- a genotype-to-phenotype map of SPs and lays role that these proportions play in shaping
ciated with SPs. We then showed that OA of the the foundation for future assays of the genes stresses on the joints themselves and highlight
hip and knee are associated with specific SPs discovered to understand how they contrib- specific risk factors of clinical relevance.
that comprise each of those joints. Lastly, we ute functionally to overall phenotype. Across both types of evolutionary analyses,
performed an analysis to link SPs with regions The moderate genetic correlations (a maxi- the most significant SP traits were those asso-
deep-learning models (31, 35) used for classi- 370, 20140063 (2015). doi: 10.1098/rstb.2014.0063; 25. Centers for Disease Control and Prevention (CDC), Prevalence
fication and landmark estimation by adding pmid: 25602067 and most common causes of disability among adults—United
2. L. Aiello, C. Dean, An Introduction to Human Evolutionary States, 2005. MMWR Morb. Mortal. Wkly. Rep. 58, 421–426
final additional training layers with limited Anatomy (Academic Press, 1990). (2009). pmid: 19407734
manual annotation. We used classification mod- 3. N. M. Young, G. P. Wagner, B. Hallgrímsson, Development and 26. A. A. Guccione et al., The effects of specific medical conditions
els to filter images that were poor in quality the evolvability of human limbs. Proc. Natl. Acad. Sci. U.S.A. on the functional limitations of elders in the Framingham
107, 3400–3405 (2010). doi: 10.1073/pnas.0911856107; Study. Am. J. Public Health 84, 351–358 (1994). doi: 10.2105/
or incorrectly cropped, and we used the land- AJPH.84.3.351; pmid: 8129049
pmid: 20133636
mark estimation model to extract 23 different 4. D. J. Futuyma, M. Kirkpatrick, Evolution (Sinauer Associates, 27. D. Chen et al., Osteoarthritis: Toward a comprehensive
IDPs that include all long-bone lengths as well ed. 4, 2017). understanding of pathological mechanism. Bone Res. 5, 16044
5. G. A. Orban, F. Caruana, The neural basis of human tool use. (2017). doi: 10.1038/boneres.2016.44; pmid: 28149655
as hip and shoulder width, which we analyzed
Front. Psychol. 5, 310 (2014). doi: 10.3389/fpsyg.2014.00310; 28. K. J. Murray, M. F. Azari, Leg length discrepancy and
while controlling for height. pmid: 24782809 osteoarthritis in the knee, hip and lumbar spine. J. Can.
After filtering UKB participants and genotype 6. W. E. H. Harcourt-Smith, L. C. Aiello, Fossils, feet and the evolution Chiropr. Assoc. 59, 226–237 (2015). pmid: 26500356
data for QC, we ran GWASs using BOLT-LMM of human bipedal locomotion. J. Anat. 204, 403–416 (2004). 29. J. C. Baker-LePain, N. E. Lane, Role of bone architecture
doi: 10.1111/j.0021-8782.2004.00296.x; pmid: 15198703 and anatomy in osteoarthritis. Bone 51, 197–203 (2012).
(90) for each phenotype and estimated the 7. F. E. Grine, C. S. Mongle, J. G. Fleagle, A. S. Hammond, The doi: 10.1016/j.bone.2012.01.008; pmid: 22401752
heritability and genetic correlations of these taxonomic attribution of African hominin postcrania from the 30. See supplementary materials.
traits with each other using GCTA (91). To fur- Miocene through the Pleistocene: Associations and 31. K. Sun, B. Xiao, D. Liu, J. Wang, “Deep high-resolution representation
assumptions. J. Hum. Evol. 173, 103255 (2022). doi: 10.1016/ learning for human pose estimation” in Proceedings of the IEEE
ther investigate the joint genetic architecture of Computer Society Conference on Computer Vision and Pattern
j.jhevol.2022.103255; pmid: 36375243
skeletal traits, we used genomic SEM to analyze 8. J. Garcia-Fernàndez, The genesis and evolution of Recognition (IEEE, 2019), pp. 5686–5696.
the genetic factor structure of the limb and body homeobox gene clusters. Nat. Rev. Genet. 6, 881–892 (2005). 32. J. Deng et al., “ImageNet: A large-scale hierarchical image
doi: 10.1038/nrg1723; pmid: 16341069 database” in 2009 IEEE Conference on Computer Vision and
measurements independent of height. Moving Pattern Recognition (IEEE, 2010), pp. 248–255.
9. R. L. Johnson, C. J. Tabin, Molecular models for vertebrate
forward, we focused our remaining analyses limb development. Cell 90, 979–990 (1997). doi: 10.1016/ 33. T.-Y. Lin et al., “Microsoft COCO: Common Objects in Context” in
on limb and body measurements as ratios of S0092-8674(00)80364-5; pmid: 9323126 Computer Vision – ECCV 2014, Lecture Notes in Computer Science,
10. G. Struhl, A homoeotic mutation transforming leg to antenna in vol. 8693, D. Fleet, T. Pajdla, B. Schiele, T. Tuytelaars, Eds.
height (30). (Springer, 2014), pp. 740–755.
Drosophila. Nature 292, 635–638 (1981). doi: 10.1038/
We used GCTA-COJO (92) followed by link- 34. M. Andriluka, L. Pishchulin, P. Gehler, B. Schiele, “2D human
48. Z. Liu et al., An adhesion G protein-coupled receptor is 69. M. Frysz et al., Machine learning-derived acetabular dysplasia experimental assessment of Allen’s rule. J. Hum. Evol. 53, 286–291
required in cartilaginous and dense connective tissues to and cam morphology are features of severe hip osteoarthritis: (2007). doi: 10.1016/j.jhevol.2007.04.005; pmid: 17706271
maintain spine alignment. eLife 10, e67781 (2021). Findings from UK Biobank. J. Bone Miner. Res. 37, 1720–1732 90. P. R. Loh et al., Efficient Bayesian mixed-model analysis
doi: 10.7554/eLife.67781; pmid: 34318745 (2022). doi: 10.1002/jbmr.4649; pmid: 35811326 increases association power in large cohorts. Nat. Genet. 47,
49. GTEx Consortium, Genetic effects on gene expression across 70. R. Robinson et al., Automated quality control in image 284–290 (2015). doi: 10.1038/ng.3190; pmid: 25642633
human tissues. Nature 550, 204–213 (2017). doi: 10.1038/ segmentation: Application to the UK Biobank cardiovascular 91. J. Yang, S. H. Lee, M. E. Goddard, P. M. Visscher, GCTA:
nature24277; pmid: 29022597 magnetic resonance imaging study. J. Cardiovasc. Magn. Reson. 21, A tool for genome-wide complex trait analysis. Am. J. Hum.
50. T. Funck-Brentano, M. Nethander, S. Movérare-Skrtic, 18 (2019). doi: 10.1186/s12968-019-0523-x; pmid: 30866968 Genet. 88, 76–82 (2011). doi: 10.1016/j.ajhg.2010.11.011; pmid:
P. Richette, C. Ohlsson, Causal factors for knee, hip, and hand 71. F. Alfaro-Almagro et al., Image processing and Quality 21167468
osteoarthritis: A Mendelian randomization study in the UK Control for the first 10,000 brain imaging datasets 92. J. Yang et al., Conditional and joint multiple-SNP analysis of
Biobank. Arthritis Rheumatol. 71, 1634–1641 (2019). from UK Biobank. Neuroimage 166, 400–424 (2018). GWAS summary statistics identifies additional variants
doi: 10.1002/art.40928; pmid: 31099188 doi: 10.1016/j.neuroimage.2017.10.034; pmid: 29079522 influencing complex traits. Nat. Genet. 44, 369–375 (2012).
51. T. Ge, C. Y. Chen, Y. Ni, Y. A. Feng, J. W. Smoller, Polygenic 72. M. Marchini et al., Impacts of genetic correlation on the doi: 10.1038/ng.2213; pmid: 22426310
prediction via Bayesian regression and continuous shrinkage independent evolution of body mass and skeletal size in 93. S. Purcell et al., PLINK: A tool set for whole-genome association
priors. Nat. Commun. 10, 1776 (2019). doi: 10.1038/s41467- mammals. BMC Evol. Biol. 14, 258 (2014). doi: 10.1186/s12862- and population-based linkage analyses. Am. J. Hum. Genet. 81,
019-09718-5; pmid: 30992449 014-0258-0; pmid: 25496561 559–575 (2007). doi: 10.1086/519795; pmid: 17701901
52. D. Richard et al., Evolutionary selection and constraint on human 73. H. Aschard, B. J. Vilhjálmsson, A. D. Joshi, A. L. Price, P. Kraft, 94. C. A. de Leeuw, J. M. Mooij, T. Heskes, D. Posthuma, MAGMA:
knee chondrocyte regulation impacts osteoarthritis risk. Cell 181, Adjusting for heritable covariates can bias effect estimates in Generalized gene-set analysis of GWAS data. PLOS Comput.
362–381.e28 (2020). doi: 10.1016/j.cell.2020.02.057; pmid: 32220312 genome-wide association studies. Am. J. Hum. Genet. 96, 329–339 Biol. 11, e1004219 (2015). doi: 10.1371/journal.pcbi.1004219;
53. K. Xu, E. E. Schadt, K. S. Pollard, P. Roussos, J. T. Dudley, (2015). doi: 10.1016/j.ajhg.2014.12.021; pmid: 25640676 pmid: 25885710
Genomic and network patterns of schizophrenia genetic 74. F. R. Day, P. R. Loh, R. A. Scott, K. K. Ong, J. R. B. Perry, A 95. C. Palazzo, C. Nguyen, M. M. Lefevre-Colau, F. Rannou,
variation in human evolutionary accelerated regions. Mol. Biol. robust example of collider bias in a genetic association study. S. Poiraudeau, Risk factors and burden of osteoarthritis. Ann.
Evol. 32, 1148–1160 (2015). doi: 10.1093/molbev/msv031; Am. J. Hum. Genet. 98, 392–393 (2016). doi: 10.1016/j. Phys. Rehabil. Med. 59, 134–138 (2016). doi: 10.1016/j.rehab.
pmid: 25681384 ajhg.2015.12.019; pmid: 26849114 2016.01.006; pmid: 26904959
54. K. S. Pollard et al., Forces shaping the fastest evolving 75. A. S. Nicholls et al., The association between hip morphology 96. E. Kun, Human-skeletal-form. Zenodo (2023); https://doi.org/
regions in the human genome. PLOS Genet. 2, e168 (2006). parameters and nineteen-year risk of end-stage osteoarthritis 10.5281/zenodo.7787839.
doi: 10.1371/journal.pgen.0020168; pmid: 17040131 of the hip: A nested case-control study. Arthritis Rheum. 63, 97. O. Smith, HARE. Zenodo (2023); https://doi.org/10.5281/
55. P. M. Thompson et al., ENIGMA and global neuroscience: A 3392–3400 (2011). doi: 10.1002/art.30523; pmid: 21739424 zenodo.7793834.
decade of large-scale studies of the brain in health and disease 76. T. M. Ecker, M. Tannast, M. Puls, K. A. Siebenrock, S. B. Murphy,
AC KNOWLED GME NTS
Editor’s summary
Many skeletal changes occurred on the path to modern humans, resulting in bipedalism but also susceptibility
to musculoskeletal diseases. Kun et al. used imaging data from more than 30,000 UK Biobank participants to
characterize skeletal proportions, assessing the genetic basis of these features, as well as their relationships to each
other. They found that limb proportions are uncorrelated with body width proportions, that there are associations
Science (ISSN ) is published by the American Association for the Advancement of Science. 1200 New York Avenue NW, Washington, DC
20005. The title Science is a registered trademark of AAAS.
Copyright © 2023 The Authors, some rights reserved; exclusive licensee American Association for the Advancement of Science. No claim
to original U.S. Government Works