Tree Genetics & Genomes (2020) 16:10

https://doi.org/10.1007/s11295-019-1406-x

ORIGINAL ARTICLE

Genome-wide SNP discovery through genotyping by sequencing, population structure, and linkage disequilibrium in Brazilian peach breeding germplasm
Liane Bahr Thurow 1,2 & Ksenija Gasic 3 & Maria do Carmo Bassols Raseira 2 & Sandro Bonow 2 &
Caroline Marques Castro 2

Received: 28 May 2019 / Revised: 27 September 2019 / Accepted: 3 December 2019


© Springer-Verlag GmbH Germany, part of Springer Nature 2019

Abstract
Genotyping by sequencing (GBS) is a flexible and cost-effective strategy for genome-wide SNP discovery and high-throughput genotyping. Here, we employed the GBS approach to explore population structure, genetic variability, and patterns of linkage disequilibrium (LD) among 220 peach genotypes (Prunus persica), representative of the Brazilian breeding germplasm. This selected panel is mainly composed of locally adapted peaches developed in mild winter and high relative humidity conditions, and represents a worldwide reference of peach germplasm for low-chilling areas. A total of 93,353 SNP markers were discovered, and after filtering, 18,373 high-quality SNPs were used in analyses. Thirty-four percent of selected SNPs were located in genic regions and about 70% of these in the coding sequence. Principal coordinate analysis (PCoA) and Bayesian clustering (fastSTRUCTURE) successfully detected population genetic structure in our germplasm panel, supporting the identification of three main distinct subpopulations. The distribution of the genotypes within subpopulations reflected their fruit-related traits: melting and non-melting flesh cultivars. LD patterns revealed a medium LD level in peach, with the extent of LD highly dependent on the subpopulation and genome regions. The use of GBS-SNPs to perform genome-wide association studies (GWAS) was also exploited by using phenotypic information from five fruit quality traits. The observed significant SNP-trait associations were always in the regions previously predicted by linkage and association mapping analysis, providing valuable insights to incorporate the use of marker-assisted selection in the breeding program.

Keywords Prunus persica . GBS . Diversity analysis . Population genetics . LD patterns . GWAS

Introduction

Peach [Prunus persica (L.) Batsch] belongs to the Rosaceae family and is one of the best genetically characterized fruit tree species (Byrne et al. 2012). It was domesticated in China more than 4000 years ago, from where it was dispersed throughout the world, becoming adapted to a wide range of climates (Faust and Timon 1995).
From the late 1950s, peach breeding started in southern Brazil (Pelotas, RS at lat. 31° 42′ S, long. 52° 24′ W) with the aim of developing peaches adapted to the mild winter and high relative humidity conditions (Raseira and Nakasu 2006). The raw germplasm used as the basic genetic material for the breeding program originated from seedling selections of locally adapted varieties, enriched by thousands of open-pollinated and hybrid seeds from the North American breeding programs (Raseira et al. 2008). At that time, only two cultivars were grown in southern Brazil, with a harvest season spanning only 15 days (Raseira et al. 1992). Since then, considerable improvements have been made, extending this period to more than 100 days and improving disease resistance, productivity, fruit quality, and adaptation (Raseira and Nakasu 2012).

Communicated by C. Dardick
Electronic supplementary material The online version of this article (https://doi.org/10.1007/s11295-019-1406-x) contains supplementary material, which is available to authorized users.

* Caroline Marques Castro
caroline.castro@embrapa.br
1 Programa de Pós-graduação em Agronomia, Centro de Genômica e Fitomelhoramento, Universidade Federal de Pelotas, Pelotas, RS, Brazil
2 Embrapa Clima Temperado, Brazilian Agricultural Research Corporation (Embrapa), Pelotas, RS, Brazil
3 Peach Breeding and Genetics, Department of Plant and Environmental Sciences, Clemson University, Clemson, SC, USA
The program has evaluated and crossed a wide range of local and foreign accessions. Since its inception, several cultivars were released, which are being grown extensively in the south and southeast regions of Brazil and also being used for breeding purposes. Hence, exploring this low-chill genetic variability is crucial to breed new peach cultivars adapted to current challenges. Therefore, this germplasm needs to be evaluated and used to a greater extent, for development of DNA-based information to support breeding decisions.
The advances in development of genomics resources, such as the availability of the whole-genome sequence (Verde et al. 2013; Verde et al. 2017) and the rapid development of next-generation sequencing (NGS) technologies, have greatly improved the understanding of the genetic base of important agronomic traits in peach. DNA-based genetic markers have been extensively used in peach to characterize different germplasm panels, evaluate diversity within breeding programs, analyze population structure, and perform QTL mapping. Valuable genetic tools, such as the 9K SNP peach array (Verde et al. 2012), are being developed and used in whole-genome diversity and QTL mapping studies (Yang et al. 2013; Frett et al. 2014; Micheletti et al. 2015; Fresnedo-Ramírez et al. 2015; Fresnedo-Ramírez et al. 2016), providing resources that will facilitate the development of new varieties.
Advances in NGS technologies continuously improved the throughput and cost-efficiency, offering the possibility of shifting from pre-defined SNP panels to direct sequencing of populations of interest, producing unbiased markers across the entire genome (Barabaschi et al. 2016). In this regard, genotyping by sequencing (GBS) has become a promising approach for comprehensive genotyping on a genome-wide scale (Elshire et al. 2011; Peterson et al. 2014). GBS uses enzyme-based complexity reduction and allows simultaneous marker discovery and genotyping across a whole germplasm set of interest, with and without reference genomes (Elshire et al. 2011; Poland and Rife 2012). It has been used in a wide number of crop species, providing new opportunities for breeders (Kim et al. 2016) and successfully applied in perennial tree fruit species such as apple (Gardner et al. 2014), sweet cherry (Guajardo et al. 2015), peach (Bielenberg et al. 2015), apricot (Gürcan et al. 2016), Japanese plum (Salazar et al. 2017; Carrasco et al. 2018), and almond (Goonetilleke et al. 2018).
In the current study, we explored suitability of GBS as a genome scan to discover and genotype SNPs in a panel composed of 220 diverse peach cultivars and advanced selections from the Brazilian Agricultural Research Corporation (Embrapa) peach breeding program. We provide a comprehensive overview of population structure, variability parameters, and detailed estimation of LD decay patterns. The use of GBS-SNPs to perform genome-wide association studies (GWAS) was also exploited for five peach fruit traits, using phenotypic data accumulated across many years by the breeding program.

Materials and methods

Plant material

A panel of 220 diverse peach cultivars and advanced selections [Prunus persica (L.) Batsch] conserved at Embrapa (Pelotas-RS, located at lat. 31° 42′ S, long. 52° 24′ W and an altitude of 57 m above sea level) was used in this study (Table S1). All peach trees were grafted on seedling rootstocks of P. persica, spaced 3 m between trees and 5 m between the rows, and trained as an open center system; standard horticultural practices were applied.
This panel was selected to represent the whole germplasm available for the breeding program, based on prior knowledge of contrasting phenotypes to bacterial spot and brown rot resistance and tolerance to abiotic stresses such as heat tolerance at the flowering stage and chilling requirement.
Accessions adapted to the low and medium chill zones mainly composed this selected panel. In addition, cultivars from different peach breeding programs and some foreign cultivars (USA, Bolivia, Spain, Italy, Canary Islands, Mexico, Japan, Uruguay, and Taiwan) were also included in this study.
Phenotypic data of almost all 220 accessions in the panel was available for five qualitatively inherited traits: 220 for fruit hairiness (peach/nectarine), 214 for fruit flesh color (white/yellow), 217 for fruit flesh texture (melting/non-melting), 214 for fruit shape (round/flat), and 206 for fruit flesh adhesion (clingstone/semi-clingstone/freestone) (Table S1).

DNA isolation, library preparation, and sequencing

Genomic DNA was isolated from powdered, freeze-dried young leaves of all the 220 P. persica accessions, following the miniprep method based on Dellaporta et al. (1983). DNA samples were first quality tested with a NanoDrop ND-1000 Spectrophotometer (NanoDrop Technologies; Wilmington, DE, USA) and then quantified using Hoechst 33258 (Sigma-Aldrich; St. Louis, MO, USA), against a λ standard DNA dilution series, with a Synergy H-T fluorimeter (BioTek; Winooski, VT, USA). Finally, DNA concentrations were normalized to 10 ng/μl and subsequently used for library preparation.
Ninety-six-plex libraries, each comprising 95 peach DNA samples and a negative control (no DNA), were prepared according to the “genotyping by sequencing” (GBS) method, as described by Elshire et al. (2011). High-quality genomic DNA (100 ng) from individual samples was digested with the ApeKI methylation-sensitive restriction enzyme, and barcode adapters were ligated to the ends of genomic DNA fragments. The adapters comprised a set of 96 forward adapters, each with a unique barcode, and a single common reverse adapter. Oligonucleotide sequences of the ApeKI barcode adapters, used for multiplex sequencing, were those provided in Elshire et al. (2011).
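The normalization step described above is plain C1 x V1 = C2 x V2 arithmetic. The Python sketch below is illustrative only: the 10 ng/μl target and the 100 ng digestion input come from the text, whereas the stock concentration, the final volume, and the function name are hypothetical.

```python
TARGET_NG_PER_UL = 10.0   # normalized concentration stated in the text
DIGEST_INPUT_NG = 100.0   # DNA mass digested per sample, from the text


def dilution_volumes(stock_ng_per_ul, final_volume_ul,
                     target_ng_per_ul=TARGET_NG_PER_UL):
    """Return (stock volume, diluent volume) in microlitres, from C1*V1 = C2*V2."""
    if stock_ng_per_ul < target_ng_per_ul:
        raise ValueError("stock is already more dilute than the target")
    stock_ul = target_ng_per_ul * final_volume_ul / stock_ng_per_ul
    return stock_ul, final_volume_ul - stock_ul


# Hypothetical sample: an 80 ng/ul stock diluted to 40 ul at 10 ng/ul.
stock_ul, diluent_ul = dilution_volumes(stock_ng_per_ul=80.0, final_volume_ul=40.0)

# Volume of normalized DNA that carries the 100 ng digestion input.
digest_volume_ul = DIGEST_INPUT_NG / TARGET_NG_PER_UL

print(stock_ul, diluent_ul, digest_volume_ul)
```

Under these assumed values, each digestion would draw 10 μl of the normalized sample.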
After adapter ligation, DNA samples were pooled by plate into a single library and cleaned up with the QIAquick PCR Purification Kit (Qiagen; Valencia, CA, USA), according to the manufacturer’s instructions. Each library in duplicate was amplified by PCR (polymerase chain reaction), to selectively enrich for those fragments in the library with adapters at both ends; PCR products were purified again, as above, and combined to get a more concentrated library. Finally, pooled, amplified libraries were submitted to David H. Murdock Research Institute (DHMRI; Kannapolis, NC, USA) for sequencing.
Aliquots of all libraries were run on the Agilent BioAnalyzer/High Sensitivity DNA Kit (Agilent Technologies, Santa Clara, CA, USA) to check for library fragment sizes and presence/absence of unwanted peaks of adapter and primer dimers. Libraries were then sequenced with single-end reads on the Illumina HiSeq2000 platform.

Processing of raw sequence data and SNP calling

The multiple steps of the GBS analysis pipeline were carried out using default parameters implemented in the TASSEL v4.0 GBS pipeline (Bradbury et al. 2007; Glaubitz et al. 2014). Filtered sequence reads were aligned to the peach genome (Peach v2.0.a1, Verde et al. 2017) (https://www.rosaceae.org/species/prunus_persica/genome_v2.0.a1) as a reference using Bowtie v2.1 (Langmead and Salzberg 2012), and SNP calls were exported as vcf files. Further, raw SNPs were imported into TASSEL v5.0 GUI (www.maizegenetics.net) for SNP and taxa filtering based upon coverage and SNP filtering based on minor allele frequency (MAF). The pipeline parameters used to filter the raw SNPs were as follows: MAF > 0.05 and a 25% threshold for maximum missing data.

Summary statistics and genetic variability

Distribution of SNP markers across the peach genome was calculated using the Geno Summary analysis tool in TASSEL v5.0 software (Bradbury et al. 2007), as well as filtering for genotype quality and missing data using taxa and site filter tools. Deviations from Hardy-Weinberg equilibrium (HWE) and MAF were assessed for each SNP using PLINK v.1.9 (Purcell et al. 2007).
Estimations of observed heterozygosity (Ho), expected heterozygosity (He), and inbreeding coefficient (F) were performed for each genotype using the “--het” option in PLINK, as described in Anderson et al. (2010).

Structural annotation of SNPs

The SNP physical position in base pairs (bp) identified by the GBS approach was correlated with the GFF3 file containing the peach genome annotation (Peach v2.0.a1, Verde et al. 2017). Analysis based on the genomic distribution of each SNP within genic (exons, introns, and untranslated regions, UTRs) and intergenic regions (“non-coding DNA” surrounding genes containing promoters and other regulatory elements) was performed using custom scripts written in Perl language.

Population genetic structure

To investigate patterns of population structure, both principal coordinate analysis (PCoA) and a Bayesian clustering approach were used. Multidimensional scaling (MDS) through PCoA was performed on the distance matrix (1-IBS) generated by PLINK v.1.9 (Purcell et al. 2007) using the “cmdscale” function from R (v 3.3.1).
In a second approach, population stratification was inferred using fastSTRUCTURE (Raj et al. 2014). Due to the program assumptions, SNPs with more than 25% missing data and with a MAF < 0.05 were removed before the analysis. To avoid the strong influence of tightly linked SNPs, LD-based SNP pruning in PLINK 1.9 (Purcell et al. 2007) was also used to generate a final subset of SNP markers that are in approximate linkage equilibrium with each other. Pruning was performed using the “indep” function by defining a window of 50 SNPs, with a step size of five SNPs to shift the window and a variance inflation factor (VIF) threshold of 2 (which recursively removes SNPs within the sliding window if R² > 0.5). LD-pruned SNPs were converted to “plink bed” format using the “make-bed” option in PLINK v.1.9 (Purcell et al. 2007) and output files (.bed, .bim, .fam) used as input in fastSTRUCTURE. This software implements a Bayesian framework similar to STRUCTURE with variational algorithms that allow inferring population structure in large SNP data sets. FastSTRUCTURE was run with a “simple prior” option and remaining default parameters. K (number of populations) values, ranging from 1 to 10, were tested, and the most probable number of populations was chosen by running the built-in script for multiple choices of K (Raj et al. 2014). The admixture proportions of each genotype, estimated by fastSTRUCTURE, were visualized using DISTRUCT plots (Rosenberg 2004). Accessions were assigned to a specific subpopulation when the estimated membership coefficients (Q) were above 0.75.
Pairwise fixation index (Fst) estimates among the subpopulations identified by fastSTRUCTURE were calculated with R package adegenet 2.0.1 (Jombart and Ahmed 2011), using Nei’s (1972) distance.

Linkage disequilibrium (LD)

Genome-wide LD analysis was performed in the whole panel, as well as separately in the subpopulations inferred by fastSTRUCTURE and in the admixed genotypes. To assess LD, SNPs with more than 25% missing data and with a MAF < 0.05 were discarded for each of the genotype datasets.
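The same two site filters (MAF and missing data) recur throughout the methods above. The following Python sketch shows one way such a filter could be expressed; it is not the TASSEL or PLINK implementation. The genotype encoding (0, 1, or 2 copies of the alternate allele, `None` for a missing call), the function names, and the choice to treat the MAF bound as strict and the missing-data bound as inclusive are assumptions made for illustration.

```python
def maf_and_missing(calls):
    """Return (minor allele frequency, missing rate) for one SNP.

    `calls` holds one value per genotype: 0, 1 or 2 copies of the
    alternate allele, or None when the genotype call is missing.
    """
    observed = [c for c in calls if c is not None]
    missing_rate = 1.0 - len(observed) / len(calls)
    if not observed:
        return 0.0, missing_rate
    alt_freq = sum(observed) / (2.0 * len(observed))
    return min(alt_freq, 1.0 - alt_freq), missing_rate


def keep_snp(calls, min_maf=0.05, max_missing=0.25):
    """Keep a SNP when MAF > 0.05 and at most 25% of calls are missing."""
    maf, missing_rate = maf_and_missing(calls)
    return maf > min_maf and missing_rate <= max_missing


# Toy SNPs (made-up data): a common variant, a rare variant,
# and a variant with too many missing calls.
common = [0, 1, 2, 1, 0, 1, 2, None]
rare = [0] * 19 + [1]
sparse = [0, 2, None, None, 1, None, 2, 0]

for name, calls in (("common", common), ("rare", rare), ("sparse", sparse)):
    print(name, maf_and_missing(calls), keep_snp(calls))
```

In this toy input only the first SNP passes: the second fails on MAF (0.025) and the third on missing data (37.5%).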
Additionally, the extent of LD was calculated across each of the eight peach chromosomes using whole panel genotype data.
LD was estimated using the squared correlation (r²) based on genotype allele counts between pairs of SNP markers as implemented in PLINK v.1.9 (Purcell et al. 2007). The plink command was set to measure pairwise LD between 50 subsequent SNPs within a distance of 2000 kb. Intra-chromosomal r² values were then plotted against the physical distance (Kb) and LD decay fitted using a locally weighted polynomial regression (LOESS) curve obtained with R software (v 3.3.1).

Genome-wide association

A genome-wide marker-trait association analysis was conducted to test for associations between SNPs and five peach fruit characteristics using the software TASSEL v5.2 (Bradbury et al. 2007).
Before the analysis, SNPs with more than 25% missing data in all accessions and with a MAF lower than 0.05 were removed. Inferred subpopulation memberships (Q matrix) of each genotype for the identified population structure were used as covariate in association analysis. The genetic relatedness of the genotypes (K matrix) in the panel was computed using the default “Centered_IBS” (identical by state) kinship method (Endelman and Jannink 2012), implemented in TASSEL v5.2.
Two types of methods were employed, a general linear model (GLM) and a mixed linear model (MLM). In the GLM method, the Q matrix comprising the subpopulation membership estimates was used as covariate in the model to avoid spurious associations. Multiple testing corrections were implemented, in order to control the experiment-wise error rate, by running 1000 permutations. To improve statistical power, in the MLM method, along with the phenotypic data, the kinship matrix (K matrix) was included as a random effect within the model, in addition to the genotypic and Q matrices, which were considered as fixed effects. Q + K matrices correct the bias for both population structure and relatedness. The genome-wide significance threshold was set at the probability value p < 10⁻⁸.
The comparison of GLM and MLM analysis was made using quantile-quantile (QQ) plots by showing the deviation of the observed p values against expected values (null hypothesis). To provide a complementary summary, Manhattan plots were generated using R software (v 3.3.1).

Results

Sequencing and SNP identification

Within our 220 Prunus persica accessions, Illumina sequencing generated a total of 314.8 MB of sequence, with an average 1.45 million single-end 100-bp reads per genotype. Three genotypes (“Linda”, “Ingo,” and c.2009.173.33) showed low initial read numbers and were removed for further analysis, resulting in a final germplasm set composed of 217 peach genotypes with all accessions having between 524,092 and 3,663,145 reads.
Out of the 1,324,595 high-quality filtered GBS sequence tags, the alignment analysis indicated that 388,803 (29.35%) had no match within the peach reference genome (Peach v.2.0). Of the 70.65% GBS fragments that aligned perfectly to the peach genome, 602,068 (45.45%) aligned to single genomic locations while 333,724 (25.20%) aligned to multiple positions due to the presence of repetitive DNA.
The TASSEL GBS pipeline initially generated a total of 93,353 SNP markers, evenly distributed through the eight major scaffolds of the peach genome, across all 217 peach genotypes analyzed. The number of SNPs on each scaffold ranged from 18,323 on scaffold 1 to 8447 on scaffold 5. The proportion of missing SNP markers ranged from 42.88% on scaffold 5 to 45.69% on scaffold 8, with a mean value of 44.16%, whereas the proportion of heterozygous SNP markers ranged from 13.23% on scaffold 2 to 15.63% on scaffold 7, with a mean value of 14.42% (Table 1).
Of the 93,353 SNPs, after depth and quality filtering, 18,373 SNP markers were selected, with MAF > 0.05 and a call rate of 75%. These 18,373 SNPs were used for subsequent analysis and correspond to an average of 80.8 SNPs/Mb (considering peach whole-genome size ~ 227.4 Mb) which is equivalent to 35.4 SNPs/cM of the Prunus reference map (519 cM; according to Dirlewanger et al. 2004). The greatest number of SNPs was observed on scaffold 1 (3513 SNPs), while the lowest was detected on scaffold 5 (1780 SNPs). Chromosome 2 showed the lowest heterozygosity rate (Ho = 0.267), while the highest heterozygosity was detected on chromosome 7 (Ho = 0.313). Detailed information of the 18,373 SNP markers is provided in Table 1. After applying stringent filtering quality scores, the proportion of missing SNP markers decreased from 44.16 to 7.54% and the proportion of heterozygous SNPs increased from 14.42 to 29.36%.
Deviations from Hardy-Weinberg equilibrium (HWE) were tested for all 93,353 SNP markers and 60.13% failed the HWE test at the p < 0.001 cutoff. For the 18K SNP panel, 48.54% of the markers showed significant deviation from HWE.
The site frequency analyses for all 217 peach genotypes revealed 40% of SNPs in the 93K panel with MAF < 0.1, and 25% of those SNPs being rare (MAF < 0.05) (Fig. 1a). The 18K SNP panel, which accounted for 27% of the total SNPs, exhibited a similar distribution of SNPs per MAF category after removal of SNPs with MAF < 0.05 (Fig. 1b).
Considering all 217 peach genotypes analyzed with 18,373 SNP markers, the observed mean heterozygosity (Ho) rate per genotype was 0.291, ranging from 0.133 in “Cristal-Taquari”
Table 1 Distribution of single-nucleotide polymorphisms (SNPs) across the 8 major scaffolds of the peach genome, missing and heterozygosity rate including SNP panels with 93,353 (93K) and 18,373 (18K) SNPs, respectively

            Panel of 93,353 SNPs (93K)                                  Panel of 18,373 SNPs (18K)
Scaffold    Number of SNPs   Missing rate (%)   Heterozygosity rate (%)   Number of SNPs   Missing rate (%)   Heterozygosity rate (%)
1           18,323           43.08              14.19                     3513             7.19               29.86
2           13,525           44.98              13.23                     2568             7.32               26.70
3           10,963           44.83              14.67                     2264             8.18               28.83
4           11,432           45.14              14.87                     2261             7.32               30.03
5           8447             42.88              14.81                     1780             7.29               30.35
6           12,037           43.31              13.89                     2290             7.80               29.66
7           8950             43.38              15.63                     1837             7.38               31.27
8           9676             45.69              14.05                     1860             7.85               28.14
Total       93,353           –                  –                         18,373           –                  –
Average     –                44.16              14.42                     –                7.54               29.36
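As a quick consistency check on Table 1, the Python sketch below recomputes the panel totals from the per-scaffold counts and derives the marker densities quoted in the text. The genome size (227.4 Mb) and map length (519 cM) are the values cited in the article; the variable names are arbitrary.

```python
# Per-scaffold SNP counts copied from Table 1.
snps_93k = {1: 18323, 2: 13525, 3: 10963, 4: 11432,
            5: 8447, 6: 12037, 7: 8950, 8: 9676}
snps_18k = {1: 3513, 2: 2568, 3: 2264, 4: 2261,
            5: 1780, 6: 2290, 7: 1837, 8: 1860}

GENOME_SIZE_MB = 227.4   # peach whole-genome size used in the text
MAP_LENGTH_CM = 519.0    # Prunus reference map length used in the text

total_93k = sum(snps_93k.values())
total_18k = sum(snps_18k.values())

# Marker density of the filtered panel per megabase and per centimorgan.
density_per_mb = total_18k / GENOME_SIZE_MB
density_per_cm = total_18k / MAP_LENGTH_CM

print(total_93k, total_18k)
print(round(density_per_mb, 1), round(density_per_cm, 1))
```

The totals and both densities agree with the values reported in the article (93,353 and 18,373 SNPs; 80.8 SNPs/Mb and 35.4 SNPs/cM).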

to 0.462 in “Turmalina.” The average expected heterozygosity (He) was 0.305, with an average inbreeding coefficient per individual (F = 1 − (Ho/He)) of 0.045, ranging from −0.519 to 0.562.

Structural annotation of SNPs

The structural annotation of 18,373 genome-based SNPs was determined by comparing the physical position of each SNP against the annotated peach genome (Peach v2.0.a1). Results showed 6268 SNPs (34.12%) covering 2127 genes, with an average frequency of 2.9 SNPs/gene, and 12,105 SNPs (65.88%) in the intergenic regions (Fig. 2).
The annotation of SNPs based on their structural components of genes revealed a higher percentage of SNPs in exonic regions (23.99%), followed by intronic regions (5.03%). The remaining 5.1% SNPs were situated in untranslated regions, 5′UTR (2.98%) and 3′UTR (2.12%) (Fig. 2). Most of the allelic variation was structurally annotated in non-coding sequence components with only 23.99% discovered in coding gene regions.

Population structure

Population structure and correlation among the 217 peach genotypes was addressed using two different approaches (MDS and fastSTRUCTURE). First, approximation of population stratification was obtained using multidimensional scaling through PCoA for all 93,353 SNPs, which provided evidence of genetic variation among genotypes in accordance with morphological traits related to the fruit flesh type: melting and non-melting genotypes (Fig. 3).
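The structural annotation percentages reported above can be cross-checked with a few lines of arithmetic. The Python sketch below uses only counts and percentages quoted in the text; the variable names are illustrative, and the last quantity (the exonic share of genic SNPs) is derived here to show how the "about 70%" figure in the abstract follows from them.

```python
TOTAL_SNPS = 18373
GENIC_SNPS = 6268
INTERGENIC_SNPS = 12105
GENES_COVERED = 2127

# Percent of all 18,373 SNPs per genic component, as reported in the text.
component_pct = {"exon": 23.99, "intron": 5.03, "5'UTR": 2.98, "3'UTR": 2.12}

genic_pct = 100.0 * GENIC_SNPS / TOTAL_SNPS
intergenic_pct = 100.0 * INTERGENIC_SNPS / TOTAL_SNPS
snps_per_gene = GENIC_SNPS / GENES_COVERED

# Share of genic SNPs that fall in exons.
exon_share_of_genic = 100.0 * component_pct["exon"] / sum(component_pct.values())

print(round(genic_pct, 2), round(intergenic_pct, 2), round(snps_per_gene, 1))
print(round(sum(component_pct.values()), 2), round(exon_share_of_genic, 1))
```

The four genic components sum to the reported 34.12%, and exonic SNPs make up roughly 70% of the genic SNPs.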

Fig. 1 Frequency distribution of the minor alleles (MAF) in 217 peach genotypes based on a 93,353 SNPs and b 18,373 SNPs genotype datasets,
respectively
Fig. 2 Distribution of SNPs on the basis of their structural occurrence in different genomic regions. SNPs were categorized using the physical location of each SNP on Peach v2.0.a1

In a second approach, to obtain a more detailed stratification in the germplasm panel, we used the software fastSTRUCTURE. Patterns of population structure were evaluated using 5378 genome-wide and unlinked SNPs which, based on model complexity that maximizes marginal likelihood, estimated the most likely number of populations at K = 3 (Fig. 4).
Considering the grouping threshold of Q > 0.75, the three inferred subpopulations were POP I (78 genotypes), POP II (1 genotype), and POP III (62 genotypes). The remaining 76 genotypes were classified as admixed (ADM), since they had less than 75% shared ancestry with one of the three main distinct subpopulations. The assignment of individuals to these subpopulations is provided in Table S1.
Both PCoA and fastSTRUCTURE analyses revealed population grouping based mainly on the fruit traits of accessions: melting and non-melting flesh (Fig. 3b). POP I accounts for the majority of the melting genotypes. Such cultivars are used mainly for fresh market and include advanced selections and cultivars released by the Embrapa breeding program (51.3%), cultivars from the Agronomic Institute of Campinas (IAC) (3.8%), and the majority of the introductions from North American peach breeding programs (30.8%), as well as a few additional cultivars from Japan, Mexico, Spain, Italy, and Taiwan (10.2% in total) and genotypes with unknown origin (3.8%). POP II included only “Mollares Hierro,” the melting peach from the Canary Islands, while POP III comprises the majority of the non-melting cultivars and advanced selections bred for processing purposes by Embrapa’s program (90.3%), a few accessions from IAC (4.8%), and a few introduced from Bolivia (4.8%).
At K = 3, it was clear that the distribution of the genotypes in populations reflected the fruit-type characteristics. Increasing the number of populations (K) from three to five made almost no difference between K values and maintained the membership in the initial three populations almost invariable, with additional populations empty under a membership coefficient above 0.75.
Genetic differentiation between the populations inferred by fastSTRUCTURE was tested using Fst statistics estimated from pairwise analysis. Pairwise Fst values ranged from 0.011 (between POP I and ADM genotypes) to 0.107 (between POP I and POP II) (Table S2). Summarized statistics of cultivar genetic variability, including observed and expected heterozygosity rates and inbreeding coefficients, were also provided for each population and for the entire germplasm panel, with all 217 genotypes (Table S3). POP II was removed before analysis, since it had just one genotype assigned. POP I exhibited a higher level of heterozygosity (Ho = 0.286) than POP III (Ho = 0.259). The average inbreeding coefficient (F) was 0.061, 0.15, and −0.06 for POP I, POP III, and ADM, respectively.

Linkage disequilibrium (LD)

The LD estimates (measured as r²) and extent of LD decay were calculated using SNP markers with MAF > 0.05 and less than 25% missing data. Pairwise r² was measured using 17,505 SNP markers for POP I, 14,098 for POP III and 24,276 for ADM genotypes. LD was also calculated for the entire panel using 18,373 SNP markers and within chromosomes of the entire panel. LD was not estimated for POP II due to the limited number of genotypes assigned to this population.
The average estimated value of intra-chromosomal r² was 0.093 in POP I, 0.101 in POP III, and 0.099 in ADM genotypes. On the other hand, the average value for inter-chromosomal r² was smaller, showing values of 0.020 and 0.027 for POP I and POP III, respectively.
On average, intra-chromosomal LD displayed differential patterns among different populations, declining below 0.2 at around 46.5 Kb in POP I, at 63.2 Kb in POP III, and at 34.3 Kb in ADM genotypes (Fig. 5). For the entire germplasm panel, this critical r² value was observed within a distance of about 38 Kb (Fig. 5d).
Different LD patterns have also been observed between the eight peach chromosomes (Fig. S1). LD declined below 0.2 over the shortest distance on chromosome 5 (around 23 Kb) while the greatest distance for LD decay was observed on chromosome 4 (76 Kb) which decayed to its half value (r² = 0.1) at around 240 Kb.

Genome-wide association (GWAS)

GWAS was carried out using 18,373 SNP markers for the following five fruit quality traits: fruit hairiness, flesh color, flesh texture, flesh adhesion, and fruit shape.
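The LD decay distances reported in the linkage disequilibrium section above are the physical distances at which the fitted r² curve drops below 0.2. The Python sketch below illustrates only that last step, and under stated assumptions: it takes an already smoothed curve as input and locates the crossing by linear interpolation, which stands in for reading the value off the authors' LOESS fit. The curve values are made up, and the function name is hypothetical.

```python
def decay_distance(distances_kb, r2_values, threshold=0.2):
    """Return the first distance at which the curve falls below `threshold`.

    The crossing point is found by linear interpolation between the two
    neighbouring points of the curve. Returns None when the curve never
    crosses the threshold from above.
    """
    points = list(zip(distances_kb, r2_values))
    for (d0, r0), (d1, r1) in zip(points, points[1:]):
        if r0 >= threshold > r1:
            return d0 + (r0 - threshold) * (d1 - d0) / (r0 - r1)
    return None


# Made-up smoothed curve: r2 at 0, 10 and 20 Kb.
toy_distances = [0.0, 10.0, 20.0]
toy_r2 = [0.5, 0.3, 0.1]

crossing_kb = decay_distance(toy_distances, toy_r2)
print(crossing_kb)
```

With this toy curve the crossing falls halfway between 10 and 20 Kb; a curve that stays above the threshold yields `None`.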
Fig. 3 Principal coordinate analysis (PCoA) based on the 93,353 genome-wide SNPs among 217 Prunus persica genotypes. a Principal coordinate plot overlaid with fruit morphology traits: red represents melting genotypes and blue represents non-melting genotypes, based on phenotypic evaluations by the peach breeding program, in black genotypes with unknown fruit flesh texture. b The genotypes are colored with respect to the three subpopulations inferred by fastSTRUCTURE analysis (membership coefficient > 0.75): POP I (red), POP II (green), and POP III (blue). Light gray indicates unstructured genotypes (ADM)

Two models were tested: (1) the GLM model with correction for population structure (Q) and (2) the MLM model including the genetic relatedness and the genetic structure information (K + Q). Both models showed good fit between observed and expected p values for all traits tested (Fig. S2), except the MLM model for the fruit shape trait. Taking into account the concordant performance of both models, only GWAS results from MLM will be discussed here, since this model corrects for both population structure and kinship relationships. SNP markers detected in concordance by both GLM and MLM models were considered as high-confidence marker-trait associations. Significant associations at a level of genome-wide significance (p < 10⁻⁸) were found for all the traits (Fig. 6; Fig. S3).
Fruit hairiness is one of the commercial characteristics used to classify peach fruits. The recessive glabrous phenotype (nectarine) has been associated with a retrotransposon insertion in the MYB gene (PpeMYB25) located on chromosome 5 (15,897,836 to 15,899,002 bp interval) resulting in a non-functional form of the MYB transcription factor which plays a central role in trichome formation (Vendramin et al. 2014). Of the 217 genotypes sequenced by GBS, 185 were hairy (peach) and 32 glabrous (nectarine). Four SNPs on chromosome 5 were significantly associated with fruit hairiness in our study (Fig. 6; Table S4). The two SNP markers with the strongest association were located at 16,748,685 bp (p value

Fig. 4 Genome-wide SNP-based population genetic structure among 217 Prunus persica genotypes, inferred by fastSTRUCTURE at K = 3. Each genotype is shown as a vertical bar. POP I (red), POP II (green), and POP III (blue)
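Each vertical bar in Fig. 4 corresponds to one genotype's membership coefficients, and the grouping rule used in this study (membership coefficient above 0.75, otherwise admixed) is easy to state in code. The Python sketch below is illustrative only: the Q values and genotype names are made up and are not taken from Table S1, and the function name is hypothetical.

```python
from collections import Counter

LABELS = ("POP I", "POP II", "POP III")


def assign_subpopulation(q_row, threshold=0.75):
    """Assign a genotype to the subpopulation whose Q exceeds the threshold.

    Genotypes whose largest membership coefficient does not exceed the
    threshold are labelled admixed ("ADM").
    """
    best = max(range(len(q_row)), key=lambda k: q_row[k])
    return LABELS[best] if q_row[best] > threshold else "ADM"


# Made-up membership coefficients (each row sums to 1).
toy_q = {
    "genotype_A": (0.92, 0.01, 0.07),
    "genotype_B": (0.10, 0.02, 0.88),
    "genotype_C": (0.55, 0.05, 0.40),
    "genotype_D": (0.00, 1.00, 0.00),
    "genotype_E": (0.75, 0.00, 0.25),  # exactly 0.75 is not above the threshold
}

assignments = {name: assign_subpopulation(q) for name, q in toy_q.items()}
counts = Counter(assignments.values())
print(assignments)
print(counts)
```

Applied to the real Q matrix, a rule of this kind would be expected to reproduce the reported split of 78, 1, 62, and 76 genotypes for POP I, POP II, POP III, and ADM, provided the threshold is handled the same way.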
Fig. 5 Linkage disequilibrium measures (r²) against physical distance between pairs of SNP markers for a POP I, b POP III, c ADM genotypes, and d all 217 genotypes (left-hand side). A zoom-in figure of LD decay until a 100 Kb physical distance (right-hand side). The red line represents the LOESS fitting curve of LD decay. The horizontal dashed line indicates a fixed r² value of 0.2

<10⁻²⁶) and 16,747,980 bp (p value <10⁻²³), at a distance of about 0.85 Mb from PpeMYB25. Of the 32 nectarines, 29 were found to have a T/T genotype at position 16,748,685 bp and a C/C genotype at 16,747,980 bp, while those heterozygous or homozygous for the most frequent allele were present almost exclusively in peach.
Another major trait of interest was fruit flesh color. At least three distinct mutational mechanisms acting on the carotenoid cleavage dioxygenase single gene (CCD4) on chromosome 1 (25,639,445 to 25,641,500 bp interval) support the origin of the yellow flesh phenotype in peach (Falchi et al. 2013). Mutations disrupting CCD4 function prevented carotenoid degradation, determining the yellow flesh color. In the present study, a total of 68 genotypes presented white fruit flesh and 144 showed yellow fruit flesh. We identified 19 SNP markers with strong association (p value <10⁻⁸) to flesh color. The SNPs were located in a 4.1-Mb region on scaffold 1, flanking the CCD4 gene, with the most associated SNP (p value <10⁻¹⁵) located at 25,721,095 bp, only about 79.6 Kb from the CCD4 gene.
The traits flesh texture and flesh adhesion were previously described as associated with copy number variants of two endopolygalacturonase genes (endoPG) duplicated in tandem and located at the distal region on chromosome 4 (Arús et al. 2012; Gu et al. 2016). In our germplasm collection, 124 genotypes were classified as melting and 91 as non-melting. Association peaks related to both traits were identified at the bottom of chromosome 4. For fruit flesh texture, the two SNPs with the strongest association (p value < 10⁻⁸) were located at the 19,904,250 bp and 19,904,264 bp positions on scaffold 4

Fig. 5 (continued)

(Fig. 6; Table S4), while the highest association peak for fruit flesh adhesion was identified at 18,133,590 bp on the same scaffold.

Flat fruit shape is controlled by a single gene (S/s) mapped to the distal end of chromosome 6, with the flat phenotype caused by the dominant allele in obligate heterozygosity (Dirlewanger et al. 2006; Picañol et al. 2013). In our germplasm panel, only eight genotypes had flat fruit, whereas 204 had round fruit. The small number of flat genotypes probably hampered the MLM analysis, which proved too stringent in the present study and detected only one association signal, at 20,340,046 bp on scaffold 6. The GLM method was less stringent and detected the same marker plus 13 other SNP markers with p value <10⁻¹¹ in the same genomic region, between 20,340,046 and 22,738,626 bp (Fig. S3). Less significant signals were also detected by the GLM method in other genomic regions.

Discussion

This study provides the first attempt to evaluate genetic variation among Brazilian peach germplasm on a genome-wide scale. Here, we report the use of GBS, a rapid, high-throughput, and cost-effective tool to increase the breeding efficiency in perennial tree fruit species (Badenes et al. 2016). Furthermore, GBS offers several advantages, since no preliminary sequence information is required and all newly discovered markers originate from the germplasm being genotyped, thus removing ascertainment bias (Deschamps et al. 2012). Despite its benefits, GBS is a low-coverage sequencing technology, which results in a high missing rate and a large number of SNPs with very low frequency. These issues can be addressed by increasing sequencing depth or by using data filtration tools.
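
The kind of data filtration mentioned above can be sketched as follows. This is a minimal illustration, assuming genotypes are coded as alternate-allele counts (0, 1, 2) with None for missing calls; the function name and the thresholds (at most 20% missing calls, minor allele frequency of at least 0.05) are hypothetical defaults chosen for the example and are not the filtering criteria used in this study.

```python
def filter_snps(genotypes, max_missing=0.2, min_maf=0.05):
    """Keep SNPs with an acceptable missing rate and minor allele frequency.

    genotypes: dict mapping SNP id -> list of per-sample calls
               (0, 1, 2 = alternate-allele count; None = missing).
    max_missing, min_maf: hypothetical thresholds for illustration.
    """
    kept = []
    for snp_id, calls in genotypes.items():
        observed = [g for g in calls if g is not None]
        if not calls or not observed:
            continue  # nothing to evaluate for this SNP
        missing_rate = (len(calls) - len(observed)) / len(calls)
        if missing_rate > max_missing:
            continue  # too many missing calls
        # Diploid samples carry two alleles each.
        alt_freq = sum(observed) / (2 * len(observed))
        maf = min(alt_freq, 1 - alt_freq)
        if maf >= min_maf:
            kept.append(snp_id)
    return kept


# Toy genotype table, not data from the study.
example_genotypes = {
    "snp1": [0, 1, 2, 1, 0],           # complete, common alleles: kept
    "snp2": [0, 0, None, None, 0],     # 40% missing: dropped
    "snp3": [0, 0, 0, 0, 0],           # monomorphic: dropped
    "snp4": [2, 2, 1, 2, None],        # 20% missing, MAF 0.125: kept
}
kept_snps = filter_snps(example_genotypes)
```

Dedicated tools apply the same two criteria at scale; the sketch only makes the logic of the missing-rate and low-frequency filters explicit.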

Fig. 6 Genome-wide association results for fruit hairiness, flesh color, flesh texture, and flesh adhesion in Brazilian peach germplasm. Accessions were scanned with 18K SNPs using an MLM approach taking into account both population structure and genetic relatedness (Q + K). The vertical axis plots the -log10(p) values of the association between the SNP markers and the respective fruit quality trait. The horizontal red line represents a genome-wide significance threshold at the probability value p < 10⁻⁸ and the blue line a p value <10⁻⁵

We successfully applied the GBS approach to generate high-density SNP coverage across the entire peach genome. The 93,353 SNPs obtained among the 217 peach genotypes constitute an important genomic resource and will facilitate peach breeding efforts. Additionally, the 18,373 high-quality SNPs obtained after filtering will enable genome-wide association studies, marker-assisted selection, and genetic diversity analysis.

Structural annotation of the 18K SNPs showed approximately three times fewer SNPs in genic regions than in intergenic regions, probably because genic regions are evolutionarily more conserved than intergenic regions, which evolve faster and accumulate a higher level of polymorphism. However, SNPs located in intergenic regions can also be functional, since these regions harbor promoters and regulatory elements.

Complementary information obtained from PCoA and fastSTRUCTURE indicated that three distinct populations define the genetic variation of the P. persica germplasm analyzed and available at Embrapa. Population stratification clearly supported the separation between melting and non-melting flesh cultivars/selections. Previous studies with SSR markers (Aranzana et al. 2010; Li et al. 2013; Chavez et al. 2014; Thurow et al. 2017) and, more recently, using SNPs (Micheletti et al. 2015), also reported strong population stratification in Prunus persica with similar grouping.

Our findings are consistent with the peach breeding history in Southern Brazil, reinforcing the use of different genetic resources for the development of peach cultivars for fresh consumption and canning purposes in early breeding (Byrne et al. 2000; Byrne 2003; Raseira et al. 2003). The use of a limited number of principal founders in the breeding program has driven the concomitant formation of the distinct subpopulations seen in our germplasm panel.

POP I represents the melting peach genotypes derived from locally adapted cultivars, such as the founder “Delicioso,” and most of the germplasm introduced from North American breeding programs, including the nectarine founder “Panamint,” while POP II includes only “Mollares Hierro,” a melting peach introduced from the Canary Islands and not often used by the breeding program because of its long cycle from blooming to fruit harvest and its high susceptibility to fungal diseases. POP III, on the other hand, groups essentially the germplasm of non-melting peaches derived from locally adapted cultivars, including the founder varieties “Aldrighi” and “Abóbora.”

The fact that the melting-flesh advanced selections and cultivars released by the Embrapa program grouped with the majority of the cultivars introduced from North American breeding programs suggests a common gene pool for the development of fresh consumption cultivars. These results are in agreement with the active exchange of germplasm between the breeding programs of these two countries and the use of North American cultivars as a basis in Embrapa breeding efforts (Raseira and Nakasu 2006). In addition, the majority of the nectarines (20 of 32 nectarines in total) were also included in the same population as melting flesh peaches (POP I),

suggesting no separate breeding efforts were maintained for these two fruit types. One of the reasons might be the small fruit size of nectarine under subtropical conditions, suggesting that the design of crosses accounted for useful variability from both fruit types.

The low mean observed heterozygosity per genotype for the whole germplasm panel (Ho = 0.29) was similar to that reported for a larger collection of 1580 Occidental and Oriental peach accessions (Ho = 0.28; Micheletti et al. 2015). Furthermore, genetic variability calculated for each of the populations showed relevant differences in observed heterozygosity between POP I (Ho = 0.286) and POP III (Ho = 0.259). This could be due to their genetic background. At the establishment of the Embrapa program, breeding of melting peach cultivars used several introductions from different countries and breeding programs as its basis, whereas the non-melting peach breeding efforts utilized mainly local varieties. Thus, the gene pool available to select genotypes for processing was probably smaller than the gene pool available to select peaches for fresh consumption. In addition, worldwide, only a few peach breeding programs breed non-melting flesh types adapted to low-chill areas compared to melting fresh market peach breeding programs, limiting germplasm exchange.

Pairwise Fst indicated substantial genetic differences between the populations, especially in pairwise comparisons with POP II. Our results confirm that this introduction from the Canary Islands represents a different germplasm source from those commonly used for breeding in Brazil, since this genotype did not cluster with any other genotype.

The degree of population structure influences LD patterns within the genome. A rapid LD decay was observed in all populations. As expected, LD decayed faster with distance in the ADM population, since those genotypes retain variability from both melting and non-melting founders with different allele frequencies and recombination rates that weaken LD (Flint-Garcia et al. 2003). POP III displayed a much slower LD decay than POP I, probably due to the lower genetic variability of POP III. Inbreeding contributes to LD maintenance over distance, limiting the effective recombination rates.

Previous studies have measured LD decay over distance in different peach germplasm collections with different low-to-medium density markers. Several studies estimated a high level of LD conservation in peach (Aranzana et al. 2010; Cao et al. 2012; Font i Forcada et al. 2013; Li et al. 2013; Chavez et al. 2014). However, those authors used SSR markers; therefore, their results are not comparable to those reported in this study. Micheletti et al. (2015) reported average LD distance values varying from 0.8 to 1.8 Mb in different populations using SNP markers, with an average decay of 1.4 Mb in Occidental accessions used in modern breeding programs.

LD estimates in this study, based on 18K SNPs, indicated a more rapid decay than previously reported in peach, with r² declining below 0.2 within a distance of about 38 Kb in the entire germplasm panel. One reason for the divergence in LD patterns could be the much larger number of gene regions covered in our study. Differences can also be partly explained by the different germplasm sets analyzed and the methods used for estimating LD decay distances.

Similar results of LD decay were achieved in a large-scale sequencing study of 84 Chinese Prunus accessions, representing the majority of the ecotypes in the world (Cao et al. 2014). The LD decay was slower in both groups belonging to P. persica, ornamental peach (56 Kb) and edible peach (14 Kb), when compared to the wild species group (5 Kb).

The number of SNPs required for GWAS is determined by the LD decay over distance. Considering the LD decay of our germplasm panel (38 Kb), about 5984 SNPs covering the total peach genome (227.4 Mb / 38 Kb ≈ 5984) should be sufficient to carry out association analysis. However, domestication regions containing key genes exhibit faster LD decay than the genome as a whole, and more SNPs might need to be identified (Cao et al. 2016).

Furthermore, the observed variable level of LD in different chromosomes might be due to artificial selection that led to the fixation of a higher number of LD blocks, especially around genes carrying important agronomic traits (Soto-Cerda and Cloutier 2012).

GWAS takes advantage of the LD present between SNPs, as well as historical recombinations within the gene pool available, to identify significant associations between DNA polymorphisms and trait variation (Khan and Korban 2012; Varshney et al. 2014). In the present study, we performed GWAS analysis to validate the high quality of the GBS-SNPs identified in our peach germplasm. SNPs co-segregating with the traits were found for all five fruit quality traits analyzed, located in the expected regions, close to candidate genes previously identified by linkage analysis and/or GWAS.

The highest associations found in the present study, related to fruit hairiness and fruit flesh color, were not only consistent with previous GWAS results in peach germplasm (Micheletti et al. 2015; Cao et al. 2016) but also located more closely to the MYB gene, which plays a central role in trichome formation (Vendramin et al. 2014), and to the CCD4 gene, which determines the flesh color in peach (Falchi et al. 2013). Our results showed that the most significant SNP-trait associations related to fruit flesh adhesion and fruit flesh texture were also consistent with the previously mapped endoPG genes at the end of chromosome 4, which control these traits (Arús et al. 2012).

Although the most significant association signals related to fruit shape discovered by our study were in the same genomic region reported by previous studies (Micheletti et al. 2015; Cao et al. 2016; Tan et al. 2019), there were some differences between the loci associated with fruit shape variance. Our GWAS

results found the highest 13 SNP-trait associations located between 20,340,046 and 22,738,626 bp on scaffold 6, while previous studies found differences in fruit shape strongly associated with loci at 25,060,196 bp (Cao et al. 2016) and at 26,924,482 bp (Tan et al. 2019) on the same scaffold. Micheletti et al. (2015) also found an evident association peak at the end of chromosome 6 (23,101,004…26,601,733 interval) related to the fruit shape trait. These results, found in distinct germplasms with different genetic backgrounds, provide strong evidence that this genomic region regulates fruit shape in peach.

Conclusion

In the present study, we successfully applied the GBS approach for the identification of high-quality genome-wide SNPs in Brazilian peach germplasm. Our results detected strong population genetic structure, with the distribution of genotypes within populations based mainly on fruit-related traits: melting and non-melting flesh. LD patterns suggested a medium LD level in peach germplasm, with the extent of LD being highly dependent on the populations and genome regions.

The high-quality identified SNPs and the described diverse peach panel can be efficiently used for GWAS, taking into account the population structure and LD-based information. This study provides valuable insights relevant to the rapid identification of genomic regions associated with peach fruit quality using GWAS and will accelerate future breeding efforts.

Acknowledgments The authors are thankful to the Brazilian Agricultural Research Corporation (Embrapa) for the financial support provided for this research and the National Council for Scientific and Technological Development (CNPq) and the Coordination of Improvement of Higher Education Personnel (CAPES) for granting the doctoral scholarship of the first author.

Data archiving statement The authors declare that all the work described in this manuscript followed the standard Tree Genetics and Genomes policy. Raw sequence data is currently being submitted to NCBI SRA and accession number will be supplied once available.

References

Anderson CA, Pettersson FH, Clarke GM, Cardon LR, Morris AP, Zondervan KT (2010) Data quality control in genetic case-control association studies. Nat Protoc 5:1564–1573. https://doi.org/10.1038/nprot.2010.116
Aranzana MJ, Abbassi E-K, Howad W, Arús P (2010) Genetic variation, population structure and linkage disequilibrium in peach commercial varieties. BMC Genet 11:69. https://doi.org/10.1186/1471-2156-11-69
Arús P, Verde I, Sosinski B, Zhebentyayeva T, Abbott AG (2012) The peach genome. Tree Genet Genomes 8:531–547. https://doi.org/10.1007/s11295-012-0493-8
Badenes ML, Fernández i Martí A, Ríos G, Rubio-Cabetas MJ (2016) Application of genomic technologies to the breeding of trees. Front Genet 7:198. https://doi.org/10.3389/fgene.2016.00198
Barabaschi D, Tondelli A, Desiderio F, Volante A, Vaccino P, Valè G, Cattivelli L (2016) Next generation breeding. Plant Sci 242:3–13. https://doi.org/10.1016/j.plantsci.2015.07.010
Bielenberg DG, Rauh B, Fan S, Gasic K, Abbott AG, Reighard GL et al (2015) Genotyping by sequencing for SNP-based linkage map construction and QTL analysis of chilling requirement and bloom date in peach [Prunus persica (L.) Batsch]. PLoS One 10:e0139406. https://doi.org/10.1371/journal.pone.0139406
Bradbury PJ, Zhang Z, Kroon DE, Casstevens TM, Ramdoss Y, Buckler ES (2007) TASSEL: software for association mapping of complex traits in diverse samples. Bioinformatics 23:2633–2635. https://doi.org/10.1093/bioinformatics/btm308
Byrne DH (2003) Founding clones of low chilling fresh market peach germplasm developed in the USA and Brazil. Acta Hortic 606:17–21
Byrne DH, Raseira MB, Bassi D, Piagnani MC, Gasic K, Reighard GL, Moreno MA, Pérez S (2012) Peach. In: Badenes ML, Byrne DH (eds) Fruit breeding: handbook of plant breeding. Springer, New York, pp 505–569
Byrne DH, Sherman WB, Bacon TA (2000) Stone fruit genetic pool and its exploitation for growing under warm winter conditions. In: Erez A (ed) Temperate fruit crops in warm climates. Kluwer Academic Publishers, Dordrecht, pp 157–230
Cao K, Wang L, Zhu G, Fang W, Chen C, Luo J (2012) Genetic diversity, linkage disequilibrium, and association mapping analyses of peach (Prunus persica) landraces in China. Tree Genet Genomes 8:975–990. https://doi.org/10.1007/s11295-012-0477-8
Cao K, Zheng Z, Wang L, Liu X, Zhu G, Fang W, Cheng S, Zeng P, Chen C, Wang X, Xie M, Zhong X, Wang X, Zhao P, Bian C, Zhu Y, Zhang J, Ma G, Chen C, Li Y, Hao F, Li Y, Huang G, Li Y, Li H, Guo J, Xu X, Wang J (2014) Comparative population genomics reveals the domestication history of the peach, Prunus persica, and human influences on perennial fruit crops. Genome Biol 15:415. https://doi.org/10.1186/s13059-014-0415-1
Cao K, Zhou Z, Wang Q, Guo J, Zhao P, Zhu G, Fang W, Chen C, Wang X, Wang X, Tian Z, Wang L (2016) Genome-wide association study of 12 agronomic traits in peach. Nat Commun 7:13246. https://doi.org/10.1038/ncomms13246
Carrasco B, González M, Gebauer M, García-González R, Maldonado J, Silva H (2018) Construction of a highly saturated linkage map in Japanese plum (Prunus salicina L.) using GBS for SNP marker calling. PLoS One 13:e0208032. https://doi.org/10.1371/journal.pone.0208032
Chavez DJ, Beckman TG, Werner DJ, Chaparro JX (2014) Genetic diversity in peach [Prunus persica (L.) Batsch] at the University of Florida: past, present and future. Tree Genet Genomes 10:1399–1417. https://doi.org/10.1007/s11295-014-0769-2
Deschamps S, Llaca V, May GD (2012) Genotyping-by-sequencing in plants. Biology 1:460–483. https://doi.org/10.3390/biology1030460
Dellaporta SL, Wood J, Hicks JB (1983) A plant DNA minipreparation: version II. Plant Mol Biol Report 1:19–21
Dirlewanger E, Cosson P, Boudehri K, Renaud C, Capdeville G, Tauzin Y, Laigret F, Moing A (2006) Development of a second-generation genetic linkage map for peach [Prunus persica (L.) Batsch] and characterization of morphological traits affecting flower and fruit. Tree Genet Genomes 3:1–13. https://doi.org/10.1007/s11295-006-0053-1

Dirlewanger E, Graziano E, Joobeur T, Garriga-Calderé F, Cosson P, Howad W, Arús P (2004) Comparative mapping and marker-assisted selection in Rosaceae fruit crops. Proc Natl Acad Sci U S A 101:9891–9896. https://doi.org/10.1073/pnas.0307937101
Elshire RJ, Glaubitz JC, Sun Q, Poland JA, Kawamoto K, Buckler ES, Mitchell SE (2011) A robust, simple genotyping-by-sequencing (GBS) approach for high diversity species. PLoS One 6:e19379. https://doi.org/10.1371/journal.pone.0019379
Endelman JB, Jannink J-L (2012) Shrinkage estimation of the realized relationship matrix. G3 (Bethesda) 2:1405–1413. https://doi.org/10.1534/g3.112.004259
Falchi R, Vendramin E, Zanon L, Scalabrin S, Cipriani G, Verde I, Vizzotto G, Morgante M (2013) Three distinct mutational mechanisms acting on a single gene underpin the origin of yellow flesh in peach. Plant J 76:175–187. https://doi.org/10.1111/tpj.12283
Faust M, Timon B (1995) Origin and dissemination of peach. Hortic Rev 17:331–379
Flint-Garcia SA, Thornsberry JM, Buckler ES (2003) Structure of linkage disequilibrium in plants. Annu Rev Plant Biol 54:357–374. https://doi.org/10.1146/annurev.arplant.54.031902.134907
Font i Forcada C, Oraguzie N, Igartua E, Moreno MA, Gogorcena Y (2013) Population structure and marker-trait associations for pomological traits in peach and nectarine cultivars. Tree Genet Genomes 9:331–349. https://doi.org/10.1007/s11295-012-0553-0
Fresnedo-Ramírez J, Bink MCAM, van de Weg E, Famula TR, Crisosto CH, Frett TJ, Gasic K, Peace CP, Gradziel TM (2015) QTL mapping of pomological traits in peach and related species breeding germplasm. Mol Breed 35:166. https://doi.org/10.1007/s11032-015-0357-7
Fresnedo-Ramírez J, Frett TJ, Sandefur PJ, Salgado-Rojas A, Clark JR, Gasic K, Peace CP, Anderson N, Hartmann TP, Byrne DH, Bink MCAM, van de Weg E, Crisosto CH, Gradziel TM (2016) QTL mapping and breeding value estimation through pedigree-based analysis of fruit size and weight in four diverse peach breeding programs. Tree Genet Genomes 12:25. https://doi.org/10.1007/s11295-016-0985-z
Frett TJ, Reighard GL, Okie WR, Gasic K (2014) Mapping quantitative trait loci associated with blush in peach [Prunus persica (L.) Batsch]. Tree Genet Genomes 10:377–381. https://doi.org/10.1007/s11295-013-0692-y
Gardner KM, Brown P, Cooke TF, Cann S, Costa F, Bustamante C et al (2014) Fast and cost-effective genetic mapping in apple using next-generation sequencing. G3 (Bethesda) 4:1681–1687. https://doi.org/10.1534/g3.114.011023
Glaubitz JC, Casstevens TM, Lu F, Harriman J, Elshire RJ, Sun Q, Buckler ES (2014) TASSEL-GBS: a high capacity genotyping by sequencing analysis pipeline. PLoS One 9:e90346. https://doi.org/10.1371/journal.pone.0090346
Goonetilleke SN, March TJ, Wirthensohn MG, Arús P, Walker AR, Mather DE (2018) Genotyping by sequencing in almond: SNP discovery, linkage mapping, and marker design. G3 (Bethesda) 8:161–172. https://doi.org/10.1534/g3.117.300376
Gu C, Wang L, Wang W, Zhou H, Ma B, Zheng H, Fang T, Ogutu C, Vimolmangkang S, Han Y (2016) Copy number variation of a gene cluster encoding endopolygalacturonase mediates flesh texture and stone adhesion in peach. J Exp Bot 67:1993–2005. https://doi.org/10.1093/jxb/erw021
Guajardo V, Solís S, Sagredo B, Gainza F, Muñoz C, Gasic K et al (2015) Construction of high density sweet cherry (Prunus avium L.) linkage maps using microsatellite markers and SNPs detected by genotyping-by-sequencing (GBS). PLoS One 10:e0127750. https://doi.org/10.1371/journal.pone.0127750
Gürcan K, Teber S, Ercisli S, Yilmaz KU (2016) Genotyping by sequencing (GBS) in apricots and genetic diversity assessment with GBS-derived single-nucleotide polymorphisms (SNPs). Biochem Genet 54:854–885. https://doi.org/10.1007/s10528-016-9762-9
Jombart T, Ahmed I (2011) adegenet 1.3-1: new tools for the analysis of genome-wide SNP data. Bioinformatics 27:3070–3071. https://doi.org/10.1093/bioinformatics/btr521
Khan MK, Korban SS (2012) Association mapping in forest trees and fruit crops. J Exp Bot 63:4045–4060. https://doi.org/10.1093/jxb/ers105
Kim C, Guo H, Kong W, Chandnani R, Shuang L-S, Paterson AH (2016) Application of genotyping by sequencing technology to a variety of crop breeding programs. Plant Sci 242:14–22. https://doi.org/10.1016/j.plantsci.2015.04.016
Langmead B, Salzberg SL (2012) Fast gapped-read alignment with Bowtie 2. Nat Methods 9:357–359. https://doi.org/10.1038/nmeth.1923
Li X-w, Meng X-q, Jia H-j, Yu M-l, Ma R-j, Wang L-r et al (2013) Peach genetic resources: diversity, population structure and linkage disequilibrium. BMC Genet 14:84. https://doi.org/10.1186/1471-2156-14-84
Micheletti D, Dettori MT, Micali S, Aramini V, Pacheco I, Linge CS et al (2015) Whole-genome analysis of diversity and SNP-major gene association in peach germplasm. PLoS One 10:e0136803. https://doi.org/10.1371/journal.pone.0136803
Nei M (1972) Genetic distances between populations. Am Nat 106:283–292
Peterson GW, Dong Y, Horbach C, Fu Y-B (2014) Genotyping-by-sequencing for plant genetic diversity analysis: a lab guide for SNP genotyping. Diversity 6:665–680. https://doi.org/10.3390/d6040665
Picañol R, Eduardo I, Aranzana MJ, Howad W, Batlle I, Iglesias I, Alonso JM, Arús P (2013) Combining linkage and association mapping to search for markers linked to the flat fruit character in peach. Euphytica 190:279–288. https://doi.org/10.1007/s10681-012-0844-4
Poland JA, Rife TW (2012) Genotyping-by-sequencing for plant breeding and genetics. Plant Genome 5:92–102. https://doi.org/10.3835/plantgenome2012.05.0005
Purcell S, Neale B, Todd-Brown K, Thomas L, Ferreira MAR, Bender D, Maller J, Sklar P, de Bakker PIW, Daly MJ, Sham PC (2007) PLINK: a tool set for whole-genome association and population-based linkage analyses. Am J Hum Genet 81:559–575. https://doi.org/10.1086/519795
Raj A, Stephens M, Pritchard JK (2014) fastSTRUCTURE: variational inference of population structure in large SNP data sets. Genetics 197:573–589. https://doi.org/10.1534/genetics.114.164350
Raseira MCB, Byrne DH, Franzon RC (2008) Pessegueiro: tradição e poesia. In: Barbieri RL, Stumpf ST (eds) Origem e evolução de plantas cultivadas. Embrapa Informação Tecnológica, Brasília, pp 679–705
Raseira MCB, Herter F, Posser CAS (2003) The EMBRAPA/Clima Temperado peach breeding program and adaptation to subtropical regions. Acta Hortic 606:45–50
Raseira MCB, Nakasu BH (2006) Peach breeding program in Southern Brazil. Acta Hortic 713:93–97
Raseira MCB, Nakasu BH (2012) Breeding peaches for mild winters: recent results of the non-melting peach breeding program of Embrapa, in Southern Brazil. Acta Hortic 962:29–34
Raseira MCB, Nakasu BH, Santos AM, Fortes JF, Martins OM, Raseira A, Bernardi J (1992) The CNPFT/EMBRAPA fruit breeding program in Brazil. HortScience 27:1154–1157
Rosenberg NA (2004) DISTRUCT: a program for the graphical display of population structure. Mol Ecol Notes 4:137–138. https://doi.org/10.1046/j.1471-8286.2003.00566.x
Salazar JA, Pacheco I, Shinya P, Zapata P, Silva C, Aradhya M et al (2017) Genotyping by sequencing for SNP-based linkage analysis and identification of QTLs linked to fruit quality traits in Japanese plum (Prunus salicina Lindl.). Front Plant Sci 8:476. https://doi.org/10.3389/fpls.2017.00476

Soto-Cerda BJ, Cloutier S (2012) Association mapping in plant genomes. In: Caliskan M (ed) Genetic diversity in plants. InTech, Rijeka, pp 29–54
Tan Q, Liu X, Gao H, Xiao W, Chen X, Fu X, Li L, Li D, Gao D (2019) Comparison between flat and round peaches, genomic evidences of heterozygosity events. Front Plant Sci 10:592. https://doi.org/10.3389/fpls.2019.00592
Thurow LB, Raseira MCB, Bonow S, Arge LWP, Castro CM (2017) Population genetic analysis of Brazilian peach breeding germplasm. Revista Brasileira de Fruticultura 39:e-686. https://doi.org/10.1590/0100-29452017166
Varshney RK, Terauchi R, McCouch SR (2014) Harvesting the promising fruits of genomics: applying genome sequencing technologies to crop breeding. PLoS Biol 12:e1001883. https://doi.org/10.1371/journal.pbio.1001883
Verde I, Abbott AG, Scalabrin S, Jung S, Shu SQ, Marroni F et al (2013) The high-quality draft genome of peach (Prunus persica) identifies unique patterns of genetic diversity, domestication and genome evolution. Nat Genet 45:487–494. https://doi.org/10.1038/ng.2586
Verde I, Bassil N, Scalabrin S, Gilmore B, Lawley CT, Gasic K, Micheletti D, Rosyara UR, Cattonaro F, Vendramin E, Main D, Aramini V, Blas AL, Mockler TC, Bryant DW, Wilhelm L, Troggio M, Sosinski B, Aranzana MJ, Arús P, Iezzoni A, Morgante M, Peace C (2012) Development and evaluation of a 9K SNP array for peach by internationally coordinated SNP detection and validation in breeding germplasm. PLoS One 7:e35668. https://doi.org/10.1371/journal.pone.0035668
Verde I, Jenkins J, Dondini L, Micali S, Pagliarani G, Vendramin E, Paris R, Aramini V, Gazza L, Rossini L, Bassi D, Troggio M, Shu S, Grimwood J, Tartarini S, Dettori MT, Schmutz J (2017) The Peach v2.0 release: high-resolution linkage mapping and deep resequencing improve chromosome-scale assembly and contiguity. BMC Genomics 18:225. https://doi.org/10.1186/s12864-017-3606-9
Vendramin E, Pea G, Dondini L, Pacheco I, Dettori MT, Gazza L, Scalabrin S, Strozzi F, Tartarini S, Bassi D, Verde I, Rossini L (2014) A unique mutation in a MYB gene cosegregates with the nectarine phenotype in peach. PLoS One 9:e90574. https://doi.org/10.1371/journal.pone.0090574
Yang N, Reighard G, Ritchie D, Okie W, Gasic K (2013) Mapping quantitative trait loci associated with resistance to bacterial spot (Xanthomonas arboricola pv. pruni) in peach. Tree Genet Genomes 9:573–586. https://doi.org/10.1007/s11295-012-0580-x

Publisher’s note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
