Author for correspondence: Christopher N. Topp
Tel: +1 314 587 1609
Email: ctopp@danforthcenter.org
Received: 9 December 2019
Accepted: 23 February 2020
New Phytologist (2020) 226: 1873–1885
doi: 10.1111/nph.16533
Key words: inflorescences, panicle, phenomics, phenotyping, sorghum, X-ray.
Summary
Inflorescence architecture in plants is often complex and challenging to quantify, particularly for inflorescences of cereal grasses. Methods for capturing inflorescence architecture and for analyzing the resulting data are limited to a few easily captured parameters that may miss the rich underlying diversity.
Here, we apply X-ray computed tomography combined with detailed morphometrics, offering new imaging and computational tools to analyze three-dimensional inflorescence architecture. To show the power of this approach, we focus on the panicles of Sorghum bicolor, which vary extensively in numbers, lengths, and angles of primary branches, as well as the three-dimensional shape, size, and distribution of the seed.
We imaged and comprehensively evaluated the panicle morphology of 55 sorghum accessions that represent the five botanical races in the most common classification system of the species, defined by genetic data. We used our data to determine the reliability of the morphological characters for assigning specimens to race and found that seed features were particularly informative.
However, the extensive overlap between botanical races in multivariate trait space indicates that the phenotypic range of each group extends well beyond its overall genetic background, indicating unexpectedly weak correlation between morphology, genetic identity, and domestication history.
[Fig. 1 panel (a) labels: X-ray imaging; 3D reconstruction; 2D projection; Seeds; Seedless panicle; Feature extraction (3D panicle features, 2D panicle features, Seed morphology, Branch features).]
Fig. 1 Pipeline to computationally dissect and quantify Sorghum bicolor panicles from three-dimensional (3D) X-ray imaging. (a) From X-ray imaging, a 3D
reconstruction and a composite side-view two-dimensional (2D) projection of the panicle were generated. Each individual seed is then identified and
digitally removed to form a seedless panicle. Five categories of features can be extracted: 3D panicle features, 2D panicle features, seed morphology, seed
spatial features, and branch features. (b) Pipeline for extracting seed profile: based on the intensity of the 3D X-ray imaging, each individual seed can be
captured and quantified by fitting an ellipsoid and computing the histogram of the radius along uniformly and densely distributed directions. Seed spatial
features, such as seed number distributed along the panicle, can be measured by a curve. (c) Skeleton-based pipeline for obtaining branch features:
thresholding the X-ray imaging generates a shape that is thinned to its curve skeleton. Then, geometric algorithms are used to extract the main stalk and
primary branches, and their corresponding measurements.
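
The radius histogram mentioned in panel (b) can be illustrated with a minimal Python sketch. It is not the authors' implementation: the Fibonacci-lattice direction sampling, the use of an axis-aligned ellipsoid as a stand-in for a segmented seed, the normalization by the largest radius, and all function names are assumptions made for illustration only.

```python
import math

def fibonacci_directions(n):
    """Return n roughly uniform unit vectors on the sphere (Fibonacci lattice)."""
    golden_angle = math.pi * (3.0 - math.sqrt(5.0))
    directions = []
    for i in range(n):
        z = 1.0 - 2.0 * (i + 0.5) / n
        r = math.sqrt(max(0.0, 1.0 - z * z))
        phi = i * golden_angle
        directions.append((r * math.cos(phi), r * math.sin(phi), z))
    return directions

def ellipsoid_radius(direction, semi_axes):
    """Center-to-surface distance of an axis-aligned ellipsoid along a unit direction."""
    return 1.0 / math.sqrt(sum((d / s) ** 2 for d, s in zip(direction, semi_axes)))

def radius_histogram(radii, bins=10):
    """Fraction of directions per bin, with radii scaled by the largest radius."""
    r_max = max(radii)
    counts = [0] * bins
    for r in radii:
        # Scaled radii lie in (0, 1]; the value 1.0 is clamped into the last bin.
        counts[min(int(r / r_max * bins), bins - 1)] += 1
    return [c / len(radii) for c in counts]

def seed_shape_histogram(semi_axes, n_directions=2000, bins=10):
    """Hypothetical 10-bin shape descriptor for an ellipsoidal stand-in seed."""
    radii = [ellipsoid_radius(d, semi_axes) for d in fibonacci_directions(n_directions)]
    return radius_histogram(radii, bins)
```

Under these assumptions a perfectly spherical seed places all of its mass in the last bin, whereas an elongated seed spreads mass over lower bins, which is the kind of fine shape difference the histogram is meant to expose.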
Panicle2DAspectRatio, Panicle2DConvexHullArea, Panicle2DSolidity, Panicle2DPerimeter, Panicle2DDepth, and Panicle2DCircularity. This 2D projection contains similar information to that in conventional 2D photograph imaging, except color (Table S2).
To extract seed morphological features, a fixed and learned upper threshold value was used to segment the seeds. Any small piece less than one-eighth of the mean seed volume was treated as noise and removed. The shape of each seed was also used to remove outliers, such as very long pieces. When seeds touched each other, an erosion and dilation step in which seeds were shrunk and re-expanded (also known as morphological opening; Fig. S2b) was applied to identify each individual seed. Seed morphological features included SeedNumber, SeedNumber/Depth, SeedTotalVolume, and SeedAvgVolume. Other seed morphology features were computed from 100 sampled seeds (Table S2). For this, each panicle was divided into 10 bins along the rachis, and within each bin 10 seeds were randomly picked from those whose volume was larger than the 25th percentile (to avoid measuring underdeveloped seeds). SeedElongation and SeedFlatness were measured in two different ways: (1) computing the ratio of the voxel variation, and (2) fitting an ellipsoid. We also treated the distance between the ellipsoid and seed (SeedEllipsoidError) as a seed morphology feature (Table S2). Since seeds are nearly ball-like shapes, we computed the histogram (SeedshapeRhist1-10) of the radius along uniformly and densely distributed directions (Figs 1b, S3). We identified and recorded the centroid position of each seed, allowing us to calculate the 3D spatial distribution. The distributions of seed number, total seed volume, and average seed volume along the panicle are discretized as SeedNumbervhist1-10, SeedBiomassvhist1-10, and SeedSizevhist1-10 (Figs 1b, S3). The distances between seeds and the main stalk permitted computation of three derived traits: MinDistanceSeedMainStalk, AvgDistanceSeedMainStalk and MaxDistanceSeedMainStalk.
Rachis, primary branch, and node features were extracted through a computational pipeline involving the medial axis. Each seedless panicle volume was downsampled by 2, and then an intensity threshold was applied in order to eliminate noise and generate a 3D shape (Fig. S4). A hole-filling algorithm was applied to the shape in order to consolidate the stem into one single skeleton curve when the medial axis is computed. The trait MainStalkDiameter is computed as a by-product during this step. This shape was thinned to its 2D medial axis using the Voxelcore method (Yan et al., 2018) and further down to a one-dimensional (1D), noise-pruned, curve skeleton using the erosion thickness method (Yan et al., 2016). The resulting skeleton is also equipped with a radius measure. The stem is identified by applying hysteresis thresholding of the radius measure to the skeleton to obtain a thick region, and then finding a high radius path through this region (Fig. S5). The junctions along the stem form nodes, from which a search algorithm is applied to identify candidate primary branches (Fig. S6). From the stem, node, and branch identification steps are derived seven traits: 1°BranchNumber, 1°BranchNumber/Depth, First1°BranchLength, 1°BranchAvgLength, LongestInternodeLength, 2ndLongestInternodeLength, and 1°BranchTipAngle (Table S2). Internodes were defined as the distance between two adjacent primary branch nodes, regardless of whether these internode sections are elongated (e.g. panicles with more evenly distributed nodes, or panicles with clustered nodes resembling whorls).
Additional information and detailed feature calculations are described in Methods S1. Raw extracted measurements from every sample are available in Table S3.
Statistical analysis
For statistical analysis, phenotypic values of replicates were first averaged for each accession (unless otherwise indicated). For differences in means in univariate traits between botanical races, a Kruskal–Wallis test was first performed; traits with significant differences between any of the botanical races were then tested between every pairwise combination of races using a Mann–Whitney U test. Similarly, for homogeneity of variances between botanical races in univariate traits, a Bartlett test was first performed; traits with significant differences in variance between any of the botanical races were then tested between every pairwise combination of races with the Brown–Forsythe test using the leveneTest function with center = ‘median’ in the R package car (Fox & Weisberg, 2019). For each statistical test, P-values across all phenotypes were adjusted together using the Benjamini–Hochberg method. For distribution (binned) traits, each distribution was treated as a 10-dimensional vector. To determine statistical significance between the distributions of any two races, 1000 random permutations were run. At each permutation, the mean vector was calculated for each race, and the distance (X_i, where i is a single permutation) between mean vectors of two races was calculated by using the L2 metric. The P-value was calculated as the proportion of sampled permutations with X_i > X_0, where X_0 is the original difference in means between the two races.
PCA and hierarchical cluster analysis were performed in MATLAB using functions pca and clustergram. The R function cor.mtest and package corrplot (Wei & Simko, 2017) were used for Spearman correlation significance tests and correlation matrix visualization. The function lda in R package MASS (Venables & Ripley, 2002) was used for linear discriminant analysis (LDA) with a jackknifed ‘leave-one-out’ cross-validation (LOOCV) method on a per-accession basis to calculate classification accuracy. The function mantel in R package vegan (Oksanen et al., 2018) was used to perform Mantel tests, which compare the correlation between two matrices. This test was used to detect possible genotype–phenotype associations, with genotype being represented by the kinship matrix from Shakoor et al. (2016).
For the other classification methods, parameter tuning and LOOCV were performed using the R package caret (Kuhn et al., 2019). For a support vector machine (SVM), the classification function within the R package kernlab (Karatzoglou et al., 2004) was used with method = ‘svmRadial’ (which outperformed ‘svmLinear’ or ‘svmPoly’) and a parameter search including sigma = seq(0.001, 0.1, by = 0.001) and C = 1:30. For random forest, the classification function within the R package
randomForest (Liaw & Wiener, 2002) was used with a parameter search including mtry = 1:20 and ntree = 1000. For naive Bayes, the classification function within the R package naivebayes (Majka, 2019) was used with a parameter search including laplace = c(0,1) and usekernel = c(T,F). For k-nearest neighbors, the classification function within the R package kknn (Schliep et al., 2016) was used with a parameter search including kmax = 1:30, distance = 1:10, and kernel = ‘optimal’.
The k-means clustering for two, three, four, or five clusters was performed using the base R kmeans function, with the parameter nstart = 25. Average cluster silhouette values and visualization of k-means clustering were generated using the fviz_nbclust and fviz_cluster functions, respectively, from the R package factoextra (Kassambara & Mundt, 2017).
Workstation, computational time, and code availability
We ran our pipeline on two different workstations. Depending on the dimension of the image volumes, it took between c. 20 min and several hours to calculate all the digitally measured traits, excluding branch traits, on an x64-based PC with an Intel Xeon E5-2630 CPU at 2.20 GHz. For primary branch and rachis detection and trait extraction, the running time ranged between 2 and 5 min on an x64-based PC with an Intel Xeon W3690 CPU at 3.47 GHz.
The full 3D imaging dataset for this work can be downloaded from: https://www.danforthcenter.org/scientists-research/principal-investigators/chris-topp/resources. All code used for image processing, digital feature extraction, and statistical analysis from this study can be found at the following GitHub repository: https://github.com/Topp-Roots-Lab/3D-Sorghum-Inflorescence.
Results
Three-dimensional imaging and digital phenotyping of sorghum inflorescences using X-ray tomography
To quantify inflorescence traits and their variation among the five main botanical races of sorghum, we imaged individual panicles using XRT (Fig. S1). The average scan time was approximately 10 min per sample, making this approach amenable to moderate throughput. From the Sorghum Association Panel (Casa et al., 2008), 34 accessions were selected for imaging with two field replicates each. Subsequently, an additional 21 accessions were imaged with a single replicate each, for a total of 55 sorghum accessions included in this study, consisting of nine or more accessions from each botanical race. Accessions were chosen solely based on high genetic support for identity within their respective botanical races (Table S1), not by any phenotypic criteria, thereby selecting genotypes that should be unambiguously among the best genetic representations of each botanical race.
Three-dimensional panicle features, including global shape and size, were then computed in MATLAB using the image volumes generated from an automated reconstruction algorithm after adjusting the grayscale intensity (Fig. S2a). For comparison, a composite side-view 2D projected image was generated from the 3D reconstruction to measure 2D shape and size, features that more closely resemble traits typically measured in conventional photographic imaging of sorghum panicles (Fig. 1a).
A major advantage of using XRT for inflorescence phenotyping is the simultaneous capture of seeds and their 3D shape profile. To identify individual seeds, a threshold (a grayscale value that is generally larger than branch values and smaller than seed values) was first used to segment the seeds relative to branch segmentation. When seeds touched each other, seeds were computationally shrunk and re-expanded to identify a morphological gap separating individual seeds. Seed morphological features were calculated by various approaches, including fitting an ellipsoid to the 3D image (see details in the Materials and Methods section). Since seeds are roughly spherical, we computed the histogram of the radius along uniformly and densely distributed directions to extract finer details in seed shape than 2D images can capture (Figs 1b, S3a). Because the positions of all the seeds were recorded, we were also able to compute seed spatial features, such as the distribution of seed number, total seed volume, and average seed volume along the panicle (Figs 1b, S3b), and the distances between seeds and the main stalk (rachis).
Next, we developed a skeleton-based method based upon the medial axis (Blum, 1967) to measure features related to the rachis, primary branches, and nodes after digitally removing the seeds (seedless panicle; Figs 1c, S7). The thin, homotopy-preserving properties of the skeleton allow our method to efficiently obtain morphological branch traits while preserving the topology of the digital panicle shape; the thickness measure along the skeleton was then used to identify the main stalk, along which nodes were identified. The nodes, which form junctions within the skeleton, were used as starting points for a search algorithm to identify primary branches. Branch features were then derived from the identified stem, nodes, and primary branches.
In total, our methods retrieved 77 inflorescence and seed features (37 stand-alone and 40 distributional), more than any other quantitative study on sorghum inflorescence phenotype to date; a full list of features, detailed descriptions, and sample measurements are shown in Methods S1 (Tables S2, S3).
Phenotypic validation and relationships
To compare and validate the traits that could be estimated manually, we measured 10 traits by hand for 20 individual panicle samples (c. 1.5 h per sample, Methods S1) and calculated the coefficient of determination R² with their digital counterparts (Fig. 2). Three traits commonly and easily measured in breeding and genetic studies (seed number, panicle depth, and panicle width) were highly consistent between manual and digital methods, with R² values of 0.98, 0.94, and 0.92, respectively. Main stalk (rachis) diameter had a weaker correspondence between manual and digital methods (R² = 0.81), likely due to the rachis cross-section often being more elliptical or irregular rather than a perfect circle, creating uncertainty in the manual measurements. The remaining manually measured features related to primary branch or internode traits and had R² values between 0.64 and
[Fig. 2 panels: scatter plots of manual vs digital trait measurements, each annotated with R² and RMSEn. Top row: Seed number (R² = 0.98); Panicle depth (mm) (R² = 0.94); Panicle width (mm) (R² = 0.92); Main stalk diameter (mm) (R² = 0.81). Bottom row: 1° Branch number; First 1° Branch length (mm); 1° Branch avg length (mm); 1°BranchTipAngle (°). Axis label: Manual traits.]
0.92. Here again, traits that could more easily be measured by hand (e.g. primary branch tip angle) showed stronger correspondence between manual and digital methods (R² = 0.92), whereas more tedious or difficult measurements such as longest internode length or second longest internode length (where exhaustive manual searches over the entire rachis are infeasible) showed lower correspondence, with R² values of 0.85 and 0.84, respectively. Indeed, most of the digital measurements are likely more accurate than manual measurements, except for primary branch number and first primary branch length, where the digital measurement has relatively higher uncertainty due to challenges in identifying and segmenting all primary branches. Similarly, average primary branch length showed a lower correspondence (R² = 0.64) possibly due to difficulties in both digital and manual measurements, and manual measurements took the average of nine random branches as opposed to the larger proportion of branches measured by the digital method. Overall, however, the consistency between digital traits and manually measured traits is high enough to give confidence in the measurements, with most instances of lower correspondence being explainable by the aforementioned challenges, primarily among manual methods.
Next, we computed the relationship between each pair of the 77 digital traits using Spearman’s rank correlation coefficient ρ, with each accession being an observation. From this, we confirmed several intuitive or expected correlations, such as between 3D panicle width and 3D panicle convex hull volume (ρ = 0.95, P = 1.18 × 10⁻²⁸), and between panicle flatness and panicle width (ρ = 0.49, P = 1.32 × 10⁻⁴), which implies flat, wide panicles are more common than round, broad ones (Fig. S8). Total seed number was positively correlated with total seed volume (ρ = 0.75, P = 3.16 × 10⁻¹¹), but slightly negatively correlated to average seed volume (ρ = −0.31, P = 0.02), indicating a generally inverse relationship between seed number and seed size in sorghum. On average, seeds also tended to be farther from the rachis when the panicle is longer (ρ = 0.71, P = 1.02 × 10⁻⁹). We also found relationships between traits that, though not surprising, are difficult to measure manually, such as between the longest internode length and average seed-to-rachis distance (ρ = 0.47, P = 3.35 × 10⁻⁴), and the inverse correlation between 3D panicle solidity and 3D panicle convex hull volume (ρ = −0.83, P = 9.26 × 10⁻¹⁵). Finally, our measurements also indicate that seed shape is generally independent of other panicle features.
Giving confidence to our digital measurements, correlations between 3D features and the closest analogous 2D features were high, such as in the case of 3D convex hull volume and 2D convex area (ρ = 0.95, P = 5.45 × 10⁻²⁹), or 3D panicle solidity and 2D panicle solidity (ρ = 0.95, P = 1.78 × 10⁻²⁹, Fig. S8). A representative variety of topological structures is shown in Fig. S7. The primary branches and rachis, shown in different colors, are computed using our skeleton-based method. The negative correlation between primary branch density and panicle depth (ρ = −0.62, P = 3.56 × 10⁻⁷) is verified by the common observation of panicles that were either short and dense or long and sparse. Denser primary branches correlated with shorter internode lengths (ρ = −0.70, P = 2 × 10⁻⁹, between primary branch density and longest internode length), whereas wider panicles tend to have greater tip angles and lower primary branch densities along the stem. This suggests that open branches would result in wider and sparser panicles, and that panicles with greater tip angles and branch density also have lower internode lengths. Note that our primary branch detection algorithm performs better for wider panicles that have fewer intersections between branches.
Archetypal panicle morphology for major sorghum botanical races
Each of the five major botanical races is commonly associated with a stereotypical inflorescence phenotype, sometimes related to its most prevalent growing conditions (Harlan & De Wet, 1972). For example, Guinea panicles are frequently ‘open’ in overall architecture and thus presumably more resistant to mold or pests in wet conditions, whereas Caudatum and Durra panicles are frequently compact due to drier conditions and therefore potentially more amenable to selection purely for high yield. To provide a comprehensive empirical evaluation of these associations, we aggregated the genotypes by genetically defined botanical race and compared trait averages and variances (Figs 3, S9) for the 37 stand-alone traits (i.e. not a distribution across the panicle or seed sections). For 14 of these, at least one of the five botanical races was significantly different from the others (Kruskal–Wallis test and Mann–Whitney U test, P < 0.05; Table S4). For example, within 3D panicle traits, convex hull volume differed significantly between Guinea and Caudatum or Durra; solidity differed significantly between Durra and Bicolor, Guinea, or Kafir, and between Caudatum and Bicolor or Guinea (Fig. 3). Branch traits also differed between botanical races; for example, Durra was significantly different from the other four races in primary branch density, first primary branch length, and longest internode length. In seed traits, notable differences were identified, including differences in seed number density (Caudatum vs Durra or
[Fig. 3 panels. Boxplots by race (Bicolor, Caudatum, Durra, Guinea, Kafir): Panicle convex hull volume (mm³); Panicle solidity; 1° Branch number; 1° Branch avg length (mm); Seed number; Seed avg volume (mm³); Seed flatness; Longest internode length (mm). Rightmost panel: Seed number distribution, plotted against position along panicle as a percentage (%) from apex to base, with significance letters Bicolor abc, Caudatum ac, Durra bc, Guinea a, Kafir b.]
Fig. 3 Distribution and variation for selected traits among the five botanical races in Sorghum bicolor. For boxplots, each dot indicates an outlier lying more than
1.5 times the interquartile range beyond the first or third quartile. The central horizontal line indicates the median, the bottom and top edges of the box represent
the 25th and 75th percentiles, respectively, and the whiskers extend to the most extreme nonoutlier data point. Each curve in the rightmost column
represents the average seed number distributed along the panicle from apex to the base for each race. The shadow shows the region within one SD. Colors
correspond to races; different letters indicate groups with statistically significant differences at P < 0.05.
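
For the distribution panel, the significance letters rest on the permutation procedure described under Statistical analysis: each binned distribution is treated as a 10-dimensional vector, race labels are shuffled, and the L2 distance between the permuted mean vectors is compared with the observed distance. The Python sketch below follows that description; the function names, the fixed random seed, and the pooling-and-shuffling detail are assumptions, and it is not the authors' code.

```python
import math
import random

def mean_vector(vectors):
    """Component-wise mean of equally long vectors."""
    n = len(vectors)
    return [sum(column) / n for column in zip(*vectors)]

def l2_distance(u, v):
    """Euclidean (L2) distance between two vectors."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))

def permutation_p_value(group_a, group_b, n_permutations=1000, seed=0):
    """Proportion of label permutations whose mean-vector distance exceeds the observed one."""
    rng = random.Random(seed)  # fixed seed is an assumption, for reproducibility only
    observed = l2_distance(mean_vector(group_a), mean_vector(group_b))  # X_0
    pooled = list(group_a) + list(group_b)
    n_a = len(group_a)
    exceed = 0
    for _ in range(n_permutations):
        rng.shuffle(pooled)  # reassign accessions to the two races at random
        permuted = l2_distance(mean_vector(pooled[:n_a]), mean_vector(pooled[n_a:]))  # X_i
        if permuted > observed:
            exceed += 1
    return exceed / n_permutations
```

Each element of `group_a` and `group_b` would be one accession's 10-bin distribution (for example SeedNumbervhist1-10); the strict inequality mirrors the X_i > X_0 rule in the text.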
[Fig. 4 panels. (a) Heat map of morphological features (rows) by race (columns), colored from −σ to σ, with pie charts of feature categories for each feature group: 3D panicle (3D), 2D panicle (2D), seed morphology (SM), seed spatial feature (SS), branch feature (B). (b) PCA plot, PC1 (21.52%) vs PC2 (17.95%), colored by race (Bicolor, Caudatum, Durra, Guinea, Kafir). (c) LDA plot, LD1 vs LD2, with a confusion matrix of actual race vs predicted race; 70.9% classification accuracy rate.]
Fig. 4 Two-way hierarchical cluster analysis, principal component analysis (PCA), and linear discriminant analysis (LDA). (a) Two-way hierarchical cluster
analysis based on the mean value of morphological features (rows) among the five botanical races of Sorghum bicolor (columns). The heat map indicates
an increase (red) or a decrease (blue) with respect to the overall mean value of five races. The features are clustered into four main groups. The pie charts
show the composition of feature categories for each group. (b) PCA plot of all morphological features with 85% confidence ellipses. The percentage
variance for each principal component (PC) is shown in parentheses. Colors indicate botanical race. (c) LDA plot on the first eight PCs (84.3% variance)
with 85% confidence ellipses. Colors are the same as in (b). The confusion matrix of actual vs predicted race is shown in the upper left corner. Jackknifed ‘leave-one-out’ cross-validation found 70.9% classification accuracy.
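
The leave-one-out accuracy and confusion matrix in panel (c) can be illustrated with a short Python sketch. To stay dependency-free it swaps in a nearest-centroid classifier for the PCA–LDA model used in the study, so it shows only the cross-validation bookkeeping, not the authors' classifier; all function names are hypothetical. With 55 accessions, an accuracy of 70.9% would correspond to 39 correct assignments.

```python
import math

def centroid(rows):
    """Component-wise mean of a list of feature vectors."""
    n = len(rows)
    return [sum(column) / n for column in zip(*rows)]

def nearest_centroid_predict(train_x, train_y, x):
    """Assign x to the class whose training centroid is closest (stand-in for LDA)."""
    by_label = {}
    for row, label in zip(train_x, train_y):
        by_label.setdefault(label, []).append(row)
    best_label, best_dist = None, math.inf
    for label in sorted(by_label):
        dist = math.dist(centroid(by_label[label]), x)
        if dist < best_dist:
            best_label, best_dist = label, dist
    return best_label

def leave_one_out(features, labels):
    """Return (accuracy, confusion) where confusion maps (actual, predicted) to a count."""
    confusion = {}
    correct = 0
    for i in range(len(features)):
        # Hold out accession i and train on all remaining accessions.
        train_x = features[:i] + features[i + 1:]
        train_y = labels[:i] + labels[i + 1:]
        predicted = nearest_centroid_predict(train_x, train_y, features[i])
        key = (labels[i], predicted)
        confusion[key] = confusion.get(key, 0) + 1
        if predicted == labels[i]:
            correct += 1
    return correct / len(features), confusion
```

In this sketch `features` would hold one row of principal component scores per accession and `labels` the corresponding botanical race.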
relatively low, correlation (r = 0.2333, P = 0.001). To determine whether this can be affected by feature selection, we performed similar comparisons using increasingly stringent subsets of traits for constructing the morphological matrix, based on the coefficient of determination R² between accession replicates (Table S7). Phenotypic differences among accession replicates may be due to innate biological (i.e. developmental) variation, as well as environmental variation present within the field sampling. However, the kinship-to-morphology Mantel test was not significantly changed either from removing the features that had lower between-replicate consistency or by repeating the comparison using only replicate 1 or replicate 2 phenotypic data (Table S8). In conjunction with the results of PCA and LDA, this result indicates that morphological differences between either the botanical races or overall genetic background are quantitative rather than qualitative, and relatively modest. Instead, panicle morphology is more accurately described as a continuum of trait variation without obvious categorical distinctions.
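
The kinship-to-morphology comparison above relies on a Mantel test, which correlates the off-diagonal entries of two square matrices and assesses significance by jointly permuting the rows and columns of one of them. A minimal Python sketch follows. The study used the mantel function in the R package vegan; the Pearson statistic, the one-sided (hits + 1)/(permutations + 1) P-value, the default of 999 permutations, and the fixed seed used here are assumptions for illustration and should be checked against that package's documentation.

```python
import math
import random

def upper_triangle(matrix):
    """Flatten the strict upper triangle of a square matrix."""
    n = len(matrix)
    return [matrix[i][j] for i in range(n) for j in range(i + 1, n)]

def pearson(x, y):
    """Pearson correlation coefficient of two equally long sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def mantel_test(matrix_a, matrix_b, n_permutations=999, seed=0):
    """Return (r, p) for the association between two symmetric matrices."""
    rng = random.Random(seed)
    n = len(matrix_a)
    flat_a = upper_triangle(matrix_a)
    observed = pearson(flat_a, upper_triangle(matrix_b))
    order = list(range(n))
    hits = 0
    for _ in range(n_permutations):
        rng.shuffle(order)
        # Permute rows and columns of matrix_b together so it stays a valid matrix.
        permuted = [[matrix_b[order[i]][order[j]] for j in range(n)] for i in range(n)]
        if pearson(flat_a, upper_triangle(permuted)) >= observed:
            hits += 1
    return observed, (hits + 1) / (n_permutations + 1)
```

Here `matrix_a` would be the kinship matrix and `matrix_b` a pairwise morphological distance matrix over the same accessions in the same order.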
Table 1 Results of principal component analysis–linear discriminant analysis (PCA–LDA) classification for the five botanical races in Sorghum bicolor using
different categories of features, and Mantel tests for correlation between genotype and phenotype.
Trait category No. of traits 8 PCs variance (%) Accuracy rate (%) Correlation r P-value
[Figure panels: (a) Kinship matrix; (b) All morphological distance matrix; (c) Seed morphological distance matrix. Rows and columns are grouped by race: Bicolor, Caudatum, Durra, Guinea, Kafir.]
could reflect local genetic introgression rather than broader genomic identity corresponding to race, potentially due to the complex history of breeding and local adaptation in domesticated sorghum. Although 2D image analysis may have come to a similar conclusion for some overall shape features (such as panicle area), our analysis using XRT 3D imaging additionally demonstrates a phenotypic gradient even for specific inflorescence features (such as branch lengths) that cannot be reliably measured using 2D imaging due to the occlusion of branches, which prevents accurate representation of the panicle branching hierarchy.
When compared with manual branch phenotyping methods, our skeleton-based pipeline calculates morphological traits from branches that more completely and accurately represent each panicle. Our computational method nondestructively captures branches evenly distributed throughout the panicle, as opposed to hand-picked branches that may be biased. By using the thinned, 1D curve structure of the skeleton, the topology of the shape and the semi-automatic annotation of the stem and primary branches are more amenable for characterization, while retaining information of value. An additional advantage of digital characterization is improved objectivity, as manual measurement of traits such as branch angle, selection of the nine primary branches for average lengths, and node distances can be subjective. Although digital methods also need individual users’ choice of parameters, the space of such choices is significantly smaller than that in manual measurements.
In panicles with large tip angles, the longest-path approach of our skeleton-based method performs exceptionally well because of the lack of intersections between different branches within the shape. However, for compact panicles, the path of a single branch may appear to intersect with several others in the image. This may cause the detected branch to wind between several actual branches. Although branch length, curvature, tortuosity, and tip angle constrain the predicted branch path to be approximately correct, small deviations will inevitably cause parts of multiple branches to be included in the same path. In future work, we plan to explore methods from combinatorial optimization to better address such topological issues, potentially by clustering ‘branch-like’ parts of the skeletons together.
Another advantage of XRT imaging for inflorescence phenotyping is that all the seeds within a panicle can be segmented and isolated with their entire 3D shape, a task that would otherwise be laborious, if not impossible, by optical imaging without the aid of robotics. Our analysis found that seed shape and size are among the more distinctive traits of the sorghum botanical races, thus illustrating the potential for high-resolution mapping of seed traits. Additionally, within the genotypes analyzed, we found some support for a trade-off between seed number and seed size, a long-standing concept in biology that is often difficult to confirm empirically due to confounding factors (Smith & Fretwell, 1974; Paul-Victor & Turnbull, 2009). However, because XRT imaging relies on object density, information such as the color of the seeds or glumes is not captured. Although there is little evidence that seed color is associated with botanical race, it may be of interest in other applications, such as predicting seed tannin [...] or absence of awns may be useful for distinguishing between sorghum botanical races (Harlan & De Wet, 1972). As they present special challenges for segmentation, we did not include them for analysis here, although future efforts may be able to incorporate these features.
Aside from these limitations, sorghum inflorescence phenotyping via XRT represents a more powerful, objective, and quantitative way of measuring panicle features compared with previous approaches. From the combined methods yielding 77 total inflorescence traits, we find interesting relationships between numerous features, but ultimately no compelling evidence that panicle architecture reflects botanical race. Indeed, in sorghum, botanical race is not a formal taxonomic rank and has no standing in a Linnaean classification, despite some support for genetic separation. In any case, our results suggest that the complex history and interbreeding of sorghum germplasm has resulted in panicle architecture overall being mostly dissociated from botanical race. This is potentially advantageous for applications such as genome-wide association studies, in which population structure would be predicted to have less of an impact on genetic mapping. Likewise, we found no support for alternative categorical schemes (such as the common classification of panicle shapes into ‘open’, ‘closed’, ‘intermediate’, etc.) across our samples, indicating that, despite their convenience, they do not accurately reflect the quantitative nature of inflorescence variation present in sorghum. Future studies can be scaled to diversity panels or other mapping populations with reasonable expectations for finding potentially novel quantitative trait loci controlling the myriad of sorghum inflorescence features we have described here.
Acknowledgements
We thank Keith Duncan for guidance with XRT and Greg Ziegler for sharing the kinship matrix data. This project was supported by NSF grants IOS-1638507 to CNT, DBI-1759796 to CNT and TJ, RI-1618685 to TJ, and DEB-1457748 and IOS-1822330 to EAK, and NIH grant CA233303-1 to TJ. DZ was supported in part by an Imaging Sciences Pathway fellowship from WUSTL. Plant growth and field support were provided by Felix Fritschi at University of Missouri Columbia and Todd Mockler at the DDPSC.
Author contributions
CNT, EAK and TJ designed the experiment. M-RS performed sample collection and imaging. ML, DZ and M-RS performed feature extraction and analysis. M-RS, ML and DZ wrote the manuscript with contributions from CNT, EAK and TJ. ML and M-RS contributed equally to this work.
ORCID
Tao Ju https://orcid.org/0000-0002-1848-1012
Elizabeth A. Kellogg https://orcid.org/0000-0003-1671-7447
Mao Li https://orcid.org/0000-0002-5964-1764
content. On the other hand, glume morphology and the presence Mon-Ray Shao https://orcid.org/0000-0003-0773-7024
Christopher N. Topp https://orcid.org/0000-0001-9228- Harlan JR, De Wet JMJ. 1971. Toward a rational classification of cultivated
6752 plants. Taxon 20: 509–517.
Harlan JR, De Wet JMJ. 1972. A simplified classification of cultivated sorghum.
Dan Zeng https://orcid.org/0000-0002-2636-3729 Crop Science 12: 172–176.
Hmon KPW, Shehzad T, Okuno K. 2014. QTLs underlying inflorescence
architecture in sorghum (Sorghum bicolor (L.) Moench) as detected by
association analysis. Genetic Resources and Crop Evolution 61: 1545–1564.
References Hughes N, Askew K, Scotson CP, Williams K, Sauze C, Corke F, Doonan JH,
Barkworth ME, Capels KM, Long S, Anderton LK. 2007. Magnoliophyta: Nibau C. 2017. Non-destructive, high-content analysis of wheat grain traits
Commelinidae (in part): Poaceae (part 1). In: Flora of North America Editorial using X-ray micro computed tomography. Plant Methods 13: e76.
Committee, eds. Flora of North America, vol. 24. New York, NY, USA: Oxford Jiang N, Floro E, Bray AL, Laws B, Duncan KE, Topp CN. 2019. Three-
University Press, 1–908. dimensional time-lapse analysis reveals multiscale relationships in maize root
Barkworth ME, Capels KM, Long S, Piep M. 2003. Magnoliophyta: systems with contrasting architectures. The Plant Cell 31: 1708–1722.
Commelinidae (in part): Poaceae (part 2). In: Flora of North America Editorial Karatzoglou A, Smola A, Hornik K, Zeileis A. 2004. KERNLAB – an S4 package
Committee, eds. Flora of North America, vol. 25. New York, NY, USA: Oxford for kernel methods in R. Journal of Statistical Software 11: 1–20.
University Press, 1–781. Kassambara A, Mundt F. 2017. FACTOEXTRA: extract and visualize the results of
Bartlett ME, Thompson B. 2014. Meristem identity and phyllotaxis in multivariate data analyses. R package v.1.0.5. [WWW document] URL https://
inflorescence development. Frontiers in Plant Science 5: e508. CRAN.R-project.org/package=factoextra [accessed 6 April 2020].
Bell AD. 2008. Plant form: an illustrated guide to flowering plant morphology. Kellogg EA. 2000. The grasses: a case study in macroevolution. Annual Review of
Portland, OR, USA: Timber Press Inc. Ecology and Systematics 31: 217–238.
Blum H. 1967. A transformation for extracting new descriptors of form. In: Kellogg EA. 2007. Floral displays: genetic control of grass inflorescences. Current
Wathen-Dunn W, ed. Models for the perception of speech and visual form. Opinion in Plant Biology 10: 26–31.
Cambridge, MA, USA: MIT Press, 362–380. Kellogg EA, Camara PE, Rudall PJ, Ladd P, Malcomber ST, Whipple CJ,
Bommert P, Satoh-Nagasawa N, Jackson D, Hirano HY. 2005. Genetics and Doust AN. 2013. Early inflorescence development in the grasses (Poaceae).
evolution of inflorescence and flower development in grasses. Plant and Cell Frontiers in Plant Science 4: e250.
Physiology 46: 69–78. Koppolu R, Schnurbusch T. 2019. Developmental pathways for shaping spike
Bougourd S, Marrison J, Haseloff J. 2000. Technical advance: an aniline blue inflorescence architecture in barley and wheat. Journal of Integrative Plant
staining procedure for confocal microscopy and 3D imaging of normal and Biology 61: 278–295.
perturbed cellular phenotypes in mature Arabidopsis embryos. The Plant Kuhn M, Wing J, Weston S, Williams A, Keefer C, Engelhardt A, Cooper
Journal 24: 543–550. T, Mayer Z, Kenkel B, Core Team R, Benesty M, Lescarbeau R, Ziem A,
Brown PJ, Klein PE, Bortiri E, Acharya CB, Rooney WL, Kresovich S. 2006. Scrucca L, Tang Y, Candan C, Hunt T. 2019. CARET: classification and
Inheritance of inflorescence architecture in sorghum. TAG. Theoretical and regression training. R package v.6.0-84. [WWW document] URL https://
Applied Genetics. 113: 931–942. CRAN.R-project.org/package=caret [accessed 6 April 2020].
Buzgo M, Soltis PS, Soltis DE. 2004. Floral developmental morphology of Kyozuka J. 2014. Grass inflorescence: basic structure and diversity. Advances in
Amborella trichopoda (Amborellaceae). International Journal of Plant Sciences Botanical Research 72: 191–219.
165: 925–947. Li M, Duncan K, Topp CN, Chitwood DH. 2017. Persistent homology and the
Casa AM, Pressoir G, Brown PJ, Mitchell SE, Rooney WL, Tuinstra MR, branching topologies of plants. American Journal of Botany 104: 349–353.
Franks CD, Kresovich S. 2008. Community resources and strategies for Li M, Klein LL, Duncan KE, Jiang N, Chitwood DH, Londo J, Miller AJ, Topp
association mapping in Sorghum. Crop Science 48: 30–40. CN. 2019. Characterizing 3D inflorescence architecture in grapevine using X-
Chen D, Neumann K, Friedel S, Kilian B, Chen M, Altmann T, Klukas C. ray imaging and advanced morphometrics: implications for understanding
2014. Dissecting the phenotypic components of crop plant growth and cluster density. Journal of Experimental Botany 70: 6261–6276.
drought responses based on high-throughput image analysis. The Plant Cell 26: Liaw A, Wiener M. 2002. Classification and regression by RANDOMFOREST. R
4636–4655. News 2: 18–22.
De Wet JMJ. 1978. Systematics and evolution of Sorghum sect. Sorghum Majka M. 2019. NAIVEBAYES: high performance implementation of the naive Bayes
(Gramineae). American Journal of Botany 65: 477–484. algorithm in R. R package v.0.9.6. [WWW document] URL https://CRAN.R-
Denk W, Horstmann H. 2004. Serial block-face scanning electron microscopy to project.org/package=naivebayes [accessed 6 April 2020].
reconstruct three-dimensional tissue nanostructure. PLoS Biology 2: e329. Malambo L, Popescu SC, Murray SC, Putman E, Pugh NA, Horne DW,
Deu M, Gonzalez-de-Leon D, Glaszmann JC, Degremont I, Chantereau J, Richardson G, Sheridan R, Rooney WL, Avant R et al. 2018. Multitemporal
Lanaud C, Hamon P. 1994. RFLP diversity in cultivated sorghum in relation field-based plant height estimation using 3D point clouds generated from small
to racial differentiation. TAG. Theoretical and Applied Genetics. 88: 838–844. unmanned aerial systems high-resolution imagery. International Journal of
Dhondt S, Vanhaeren H, Van Loo D, Cnudde V, Inze D. 2010. Plant structure Applied Earth Observation and Geoinformation 64: 31–42.
visualization by high-resolution X-ray computed tomography. Trends in Plant Malcomber ST, Preston JC, Reinheimer R, Kossuth J, Kellogg EA. 2006.
Science 15: 419–422. Developmental gene evolution and the origin of grass inflorescence diversity.
Doust AN, Devos KM, Gadberry M, Gale MD, Kellogg EA. 2005. The genetic Advances in Botanical Research 44: 425–481.
basis for inflorescence variation between foxtail and green millet (Poaceae). Morris GP, Ramu P, Deshpande SP, Hash CT, Shah T, Upadhyaya HD, Riera-
Genetics 169: 1659–1672. Lizarazu O, Brown PJ, Acharya CB, Mitchell SE et al. 2013. Population
Esau K. 1977. Anatomy of seed plants. New York, NY, USA: John Wiley & Sons Inc. genomic and genome-wide association studies of agroclimatic traits in
Fox J, Weisberg S. 2019. An R companion to applied regression, 3rd edn. Thousand sorghum. Proceedings of the National Academy of Sciences, USA 110: 453–458.
Oaks, CA, USA: SAGE Publications. Oksanen J, Blanchet FG, Friendly M, Kindt R, Legendre P, McGlinn D,
Friedli M, Kirchgessner N, Grieder C, Liebisch F, Mannale M, Walter A. 2016. Minchin PR, O’Hara RB, Simpson GL, Solymos Pet al. 2018. VEGAN:
Terrestrial 3D laser scanning to track the increase in canopy height of both community ecology package. R package v.2.5-3. [WWW document] URL https://
monocot and dicot crop species under field conditions. Plant Methods. 12: e9. CRAN.R-project.org/package=vegan [accessed 6 April 2020].
Gehan MA, Kellogg EA. 2017. High-throughput phenotyping. American Journal Paul-Victor C, Turnbull LA. 2009. The effect of growth conditions on the seed
of Botany 104: 505–508. size/number trade-off. PLoS ONE 4: e6917.
Goddard TD, Huang CC, Meng EC, Pettersen EF, Couch GS, Morris JH, Perez-Torres E, Kirchgessner N, Pfeifer J, Walter A. 2015. Assessing potato
Ferrin TE. 2017. UCSF CHIMERA: meeting modern challenges in visualization tuber diel growth by means of X-ray computed tomography. Plant, Cell &
and analysis. Protein Science. 27: 14–25. Environment 38: 2318–2326.
Perreta MG, Ramos JC, Vegetti AC. 2009. Development and structure of the grass inflorescence. Botanical Review 75: 377–396.
Pettersen EF, Goddard TD, Huang CC, Couch GS, Greenblatt DM, Meng EC, Ferrin TE. 2004. UCSF CHIMERA – a visualization system for exploratory research and analysis. Journal of Computational Chemistry 25: 1605–1612.
Pilatti V, Muchut SE, Uberti-Manassero NG, Vegetti AC, Reinheimer R. 2019. Comparative study of the inflorescence, spikelet and flower development in species of Cynodonteae (Chloridoideae, Poaceae). Botanical Journal of the Linnean Society 189: 353–377.
Prenner G, Rudall PJ. 2007. Comparative ontogeny of the cyathium in Euphorbia (Euphorbiaceae) and its allies: exploring the organ–flower–inflorescence boundary. American Journal of Botany 94: 1612–1629.
Reinheimer R, Zuloaga FO, Vegetti A, Pozner R. 2009. Diversification of the inflorescence development in the PCK clade (Poaceae: Panicoideae: Paniceae). American Journal of Botany 96: 549–564.
Rogers ED, Monaenkova D, Mijar M, Nori A, Goldman DI, Benfey PN. 2016. X-ray computed tomography reveals the response of root system architecture to soil texture. Plant Physiology 171: 2028–2040.
Schliep K, Hechenbichler K, Lizee A. 2016. KKNN: weighted k-nearest neighbors. R package v.1.3.1. [WWW document] URL https://CRAN.R-project.org/package=kknn [accessed 6 April 2020].
Shakoor N, Ziegler G, Dilkes BP, Brenton Z, Boyles R, Connolly EL, Kresovich S, Baxter I. 2016. Integration of experiments across diverse environments identifies the genetic determinants of variation in Sorghum bicolor seed element composition. Plant Physiology 170: 1989–1998.
Smith CC, Fretwell SD. 1974. The optimal balance between size and number of offspring. American Naturalist 108: 499–506.
Snowden JD. 1936. The cultivated races of Sorghum. London, UK: Allard and Son.
Snowden JD. 1955. The wild fodder sorghums of the section Eu-sorghum. Journal of the Linnean Society of London, Botany 55: 191–260.
Stuppy WH, Maisano JA, Colbert MW, Rudall PJ, Rowe TB. 2003. Three-dimensional analysis of plant structure using high-resolution X-ray computed tomography. Trends in Plant Science 8: 2–6.
Vegetti AC, Anton A. 1995. Some evolution trends in the inflorescence of Poaceae. Flora 190: 225–228.
Venables WN, Ripley BD. 2002. Modern applied statistics with S. New York, NY, USA: Springer-Verlag.
Vollbrecht E, Springer PS, Goh L, Buckler ES, Martienssen R. 2005. Architecture of floral branch systems in maize and related grasses. Nature 436: 1119–1126.
Wei T, Simko V. 2017. R package ‘CORRPLOT’: visualization of a correlation matrix (v.0.84). [WWW document] URL https://github.com/taiyun/corrplot [accessed 6 April 2020].
Whipple CJ. 2017. Grass inflorescence architecture and evolution: the origin of novel signaling centers. New Phytologist 216: 367–372.
Wu Z, Raven PH, Hong D, eds. 2006. Flora of China. Beijing, China: Science Press.
Wu Z, Raven PH, Hong D, eds. 2007. Flora of China illustrations. Beijing, China: Science Press.
Yan Y, Letscher D, Ju T. 2018. Voxel cores: efficient, robust, and provably good approximation of 3D medial axes. ACM Transactions on Graphics 37: e44.
Yan Y, Sykes K, Chambers E, Letscher D, Ju T. 2016. Erosion thickness on medial axes of 3D shapes. ACM Transactions on Graphics 35: e38.
Zhang D, Li J, Compton RO, Robertson J, Goff VH, Epps E, Kong W, Kim C, Paterson AH. 2015. Comparative genetics of seed size traits in divergent cereal lineages represented by sorghum (Panicoidae) and rice (Oryzoidae). G3: Genes, Genomes, Genetics 5: 1117–1128.
Zhou Y, Srinivasan S, Mirnezami SV, Kusmec A, Fu Q, Attigala L, Salas Fernandez MG, Ganapathysubramanian B, Schnable PS. 2019. Semiautomated feature extraction from RGB images for sorghum panicle architecture GWAS. Plant Physiology 179: 24–37.
Supporting Information
Additional Supporting Information may be found online in the Supporting Information section at the end of the article.
Fig. S1 Examples of sorghum panicle morphology.
Fig. S2 Additional details on X-ray imaging workflow.
Fig. S3 Diagrammatic description of seed distribution features.
Fig. S4 Thresholding considerations with primary branches.
Fig. S5 Stem identification process.
Fig. S6 Branch identification process.
Fig. S7 Primary branches and rachises of representative sorghum panicles.
Fig. S8 Trait correlation plot.
Fig. S9 Distribution and variation for all traits among the five races.
Fig. S10 Principal component analysis and linear discriminant analysis loadings.
Fig. S11 Isometric feature mapping with all traits.
Fig. S12 K-mean clustering.
Methods S1 Additional methods on primary branch and rachis trait extraction, skeleton generation, stem identification, primary branch identification and measurements, and manual panicle measurements.
Table S1 Germplasm and genotypes.
Table S2 Trait descriptions.
Table S3 Complete trait value data.
Table S4 P-values for univariate traits.
Table S5 P-values for distribution traits.
Table S6 Classification results from supervised methods.
Table S7 Coefficient of determination between replicates.
Table S8 Kinship-to-morphology Mantel test.
Please note: Wiley Blackwell are not responsible for the content or functionality of any Supporting Information supplied by the authors. Any queries (other than missing material) should be directed to the New Phytologist Central Office.