Professional Documents
Culture Documents
https://doi.org/10.1093/g3journal/jkad138
Advance Access Publication Date: 19 June 2023
Genome Report
Abstract
Atlantic salmon (Salmo salar) in Northeastern US and Eastern Canada has high economic value for the sport fishing and aquaculture in-
dustries. Large differences exist between the genomes of Atlantic salmon of European origin and North American (N.A.) origin. Given the
genetic and genomic differences between the 2 lineages, it is crucial to develop unique genomic resources for N.A. Atlantic salmon.
Here, we describe the resources that we recently developed for genomic and genetic research in N.A. Atlantic salmon aquaculture.
Firstly, a new single nucleotide polymorphism (SNP) database for N.A. Atlantic salmon consisting of 3.1 million putative SNPs was gen-
erated using data from whole-genome resequencing of 80 N.A. Atlantic salmon individuals. Secondly, a high-density 50K SNP array en-
riched for the genic regions of the genome and containing 3 sex determination and 61 putative continent of origin markers was
developed and validated. Thirdly, a genetic map composed of 27 linkage groups with 36K SNP markers was generated from 2,512 in-
dividuals in 141 full-sib families. Finally, a chromosome-level de novo genome assembly from a male N.A. Atlantic salmon from the
St. John River aquaculture strain was generated using PacBio long reads. Information from Hi-C proximity ligation sequences and
Bionano optical mapping was used to concatenate the contigs into scaffolds. The assembly contains 1,755 scaffolds and only 1,253
gaps, with a total length of 2.83 Gb and N50 of 17.2 Mb. A BUSCO analysis detected 96.2% of the conserved Actinopterygii genes
in the assembly, and the genetic linkage information was used to guide the formation of 27 chromosome sequences. Comparative ana-
lysis with the reference genome assembly of the European Atlantic salmon confirmed that the karyotype differences between the 2
lineages are caused by a fission in chromosome Ssa01 and 3 chromosome fusions including the p arm of chromosome Ssa01 with
Ssa23, Ssa08 with Ssa29, and Ssa26 with Ssa28. The genomic resources we have generated for Atlantic salmon provide a crucial boost
for genetic research and for management of farmed and wild populations in this highly valued species.
Keywords: genome map, SNP array, Atlantic salmon, North American, lineage, karyotype
Introduction 2004). Evidence from microsatellite DNA markers has clearly de-
monstrated that Atlantic salmon of N.A. origin is genetically dis-
Atlantic salmon (Salmo salar) in Northeastern US and Eastern
tinct from European Atlantic salmon (King et al. 2001; Council
Canada has high economic value for the sport fishing and aqua-
2002; King et al. 2005). This led to a judicial ruling that strictly regu-
culture industries and is of significant ecological and social im- lated farming of Atlantic salmon in Maine and Eastern Canada,
portance (Council 2004). Atlantic salmon of North American banned the farming of European and genetically modified salmon
(N.A). origin is also being farmed in Tasmania, Australia (Kijas strains in these areas, and required routine genetic certification of
et al. 2017), and Chile (Yáñez et al. 2016). The historic declines of the N.A. aquaculture stocks (Council 2004).
wild Atlantic salmon populations in the state of Maine, USA, led Large differences exist between the genomes of the 2 lineages
to the listing of wild or natural Atlantic salmon from 8 rivers as en- or subspecies. The haploid chromosome number of the N.A. and
dangered under the Federal Endangered Species Act (Council European Atlantic salmon is N = 27 and N = 29, respectively
(Lubieniecki et al. 2010; Brenna-Hansen et al. 2012). Recent studies Quebec (Nash and Waknitz 2003), and propagated for
that used whole-genome scans with high-density single nucleo- Aquaculture on the west US coast before being added as an aqua-
tide polymorphism (SNP) assays have provided further evidence culture stock to the USDA breeding program in Maine. The
for the genomic differentiation between the N.A. and European Penobscot River strain is thought to have originated from a wild
lineages (Lehnert et al. 2020). At the time we started this project, population of N.A. Atlantic salmon in Maine that was spawned
there was only a reference genome for the European lineage at the Craig brook National Fish Hatchery. The St. John River
(Lien et al. 2016). A reference genome is required for discovering (SJR) aquaculture strain is widely used for Atlantic salmon farm-
DNA sequence variants among individuals and between popula- ing in North America. Wild fish from tributaries above the
tions and for using those variants to develop high-density geno- Mactaquac Dam (New Brunswick province, Canada) were used
the King 7 microsatellites (King et al. 2001; King et al. 2005) and for COO classification, including 44 nuclear markers and 20 mito-
were certified to have genotypes consistent of the N.A. COO. chondrial markers identified previously (Gao et al. 2020) and 8 nu-
clear markers that were previously found to be informative for
SNP discovery COO classification by the Center for Aquaculture Technologies
The same DNA sequence data reported in our previous SNP dis- (unpublished data). Four additional sequence probes from the
covery effort (Gao et al. 2020) were used in a new SNP discovery sdY gene were also added to the array for sex determination ana-
analysis to generate the source dataset for the design of a 50K lysis. Those probes were previously included and tested in other
SNP array. The main difference was that here, we used the N.A. Axiom SNP arrays for Atlantic salmon (Kijas et al. 2018). The pres-
Atlantic salmon reference genome assembly from a fish from ence of the sequences compatible with the sdY probes would indi-
Genome assembly: DNA extraction and Bionano assembly guided by the contigs of the PacBio sequence
sequencing assembly. About 1.9 million molecules with N50 of ∼305 kb
DNA from a single male (sample NCWMAC SJR YC14-15 Fam3) (∼150× genome coverage) and average label density of 10 sites
from the USDA St. John River stock of N.A. Atlantic salmon was ex- per 100 kb were selected from the raw data after the filtering
tracted using the Qiagen DNeasy kit from a fresh blood sample step, and 1,881 Bionano genome maps were generated with a total
collected in tri-potassium EDTA. The blood was stored on ice until length of ∼3.09 Gb and N50 of ∼5.1 Mb. The Bionano genome maps
DNA extraction within 24 hours to enhance the yield of high- and the 14,240 primary contigs and haplotigs were then scaffolded
molecular weight DNA for long-read sequencing. The DNA sam- using the Bionano hybrid scaffold pipeline. This resulted in the
joining of 2,174 contigs (∼2.61 Gb) into 817 scaffolds (∼2.65 Gb)
Linkage information from the new genetic maps was used to Table 1. Validation of the SNPs with DNA samples from N.A.
generate chromosome sequences from the Hi-C assembly scaf- Atlantic salmon (N = 2,512). The marker categories are based on
the Affymetrix Axiom array cluster properties. Number of SNP
folds. The program NovoAlign (http://www.novocraft.com/
markers assigned to each category are listed next to the
products/novoalign/) was used to map the flanking sequences of percentage of total number of markers in parenthese.
each SNP marker to the scaffolds, and the genetic linkage infor-
mation was used to anchor the scaffolds to chromosomes and Category SNP markers
then to order, orient, and concatenate the scaffolds into 27 chro- Total SNPs on the array 49,880 (100%)
mosomes. With the guidance of the linkage information, 56 Hi-C Call rate < 97% 5,421 (10.9%)
scaffolds were broken into 150 smaller scaffolds at 94 Hi-C join No minor homozygotes 797 (1.6%)
One of the N.A. origin fish from Gander—Jonathan’s Brook linkage group ranged from 712 to 2,224. The average genetic
(sample NF2-015) had European mitochondria genotypes and length per chromosome in cM was 63 and 91 with a minimum
N.A. nuclear genotypes, indicating possible introgression of the length of 36 and 60 and maximum length of 129 and 127 for the
European mitochondrial genome due to ancestral hybridization. male and female maps, respectively. The distribution of markers
Evidence for introgression of European genotypes into N.A. popu- in linkage groups labeled to match the nomenclature of the
lations was previously reported for Atlantic salmon in European Atlantic salmon karyotype is shown in Fig. 3. Similar
Newfoundland (Bradbury et al. 2015). Also, European mtDNA al- to previous reports for Atlantic salmon and rainbow trout (Lien
leles have been confirmed in several Newfoundland populations et al. 2016; Pearse et al. 2019), the pattern of genetic distance in
of Atlantic salmon (Ian Bradbury, personal communication). the paternal map was composed of high levels of recombination
The results of the principal component analysis we conducted in the telomeric regions and high interference in the centromeric
using genotypes from the 61 validated COO markers are shown in region and throughout most of the chromosome. Conversely, the
Fig. 2. The reference samples were clearly separated into 2 clus- typical maternal pattern of recombination in acrocentric chromo-
ters in concordance with their COO (Fig. 2a), except for the single somes demonstrated higher rates of recombination near the
sample from Gander—Jonathan’s Brook described above due to its centromere and then a steady level of recombination throughout
nucleus markers being with N.A. genotypes and mitochondria the rest of the chromosome that gradually decreased to a plateau
markers with European genotypes. As expected, the 2,512 sam- in the telomere region. In the metacentric chromosome, the ma-
ples from the aquaculture SJR strain also clustered with the refer- ternal pattern demonstrated strong recombination interference
ence samples from N.A. origin (Fig. 2b). Among the reference in the centromeric region and then a similar pattern to the acro-
samples, the primary component that differentiated the centric chromosome in the direction of either telomere whereas
European origin salmon from the N.A. origin accounted for 94% the paternal chromosome was demonstrated to have high levels
of the total variance, and after adding the 2,512 SJR fish to the ana- of interference except near both telomeres (Fig. 4).
lysis, the primary component still explained ∼83% of the variance.
Genome assembly
Sex determination markers The genome assembly presented in the present research is based
Three markers derived from Atlantic salmon sdY sequences (Kijas on DNA from a single male from the USDA St. John River stock of
et al. 2018) were placed on the array in duplicates (i.e. 2 Probset IDs N.A. Atlantic salmon that was sequenced using PacBio long-read
for each SNP ID). The positions of the flanking sequences in the sequencing technology. The PacBio sequences were assembled
sdY gene sequence, Probset IDs, and the flanking sequences are into contigs with Canu, and those contigs were joined into scaf-
listed in Supplementary Table 3. The alternative alleles for SNP folds using Bionano optical mapping and Hi-C proximity ligation
ID Affx-158802265, Affx-158802258, and Affx-158802275 were C/ sequence data. Using genetic linkage information, the scaffolds
T, C/T, and G/T, respectively. Homozygous T alleles correlated from the Hi-C assembly were then anchored to and ordered on
with the absence of the sdY sequence (female), and homozygous chromosomes to generate the chromosome sequences of the
alternative alleles or heterozygous genotypes correlated with USDA_NASsal_1.1 assembly (GCA_021399835.1).
sdY presence (male). To evaluate the accuracy of the sex deter- The Canu sequence assembly was composed of 14,845 contigs
mination markers, we genotyped 135 female and 95 male parents with a total length of ∼3.97 Gb, N50 of 1.45 Mb, and BUSCO score of
of the families used for the linkage analysis and found 100% con- 95.8% (using actinopterygii_odb9 as the lineage dataset). While
cordance of the sex phenotype with the sdY marker genotypes. the estimated genome size of Atlantic salmon is ∼3.0 Gb (Lien
et al. 2016), it was clear that heterozygosity in this individual
Linkage map genome caused duplication of haplotypes for some of the hetero-
A total of 36,158 polymorphic high-resolution markers were zygous genomic regions. After the purging steps removed dupli-
mapped to 27 linkage groups with an average of 1,339 markers cated haplotypes resulting from heterozygous sequences, the
per group (Supplementary File 2). The number of markers per BUSCO score was improved slightly to 96.0% but the total length
G. Gao et al. | 7
of the assembly contigs at ∼3.89 Gb was still greater than the ex- therefore 1,755, total length is ∼2.83 Gb, total ungapped sequence
pected genome size. More detailed statistics for each step in the length is ∼2.79 Gb, and scaffold N50 and L50 are 16.2 Mb and 42,
assembly pipeline are shown in Table 2. The Bionano hybrid scaf- respectively.
folding pipeline used the optical map to join neighboring or over- A comparison of the assembly statistics between this assembly
lapping Canu contigs and to break chimeric sequence contigs that and the 2 annotated reference genome assemblies for the
were misassembled by Canu. The final Bionano scaffolded assem- European Atlantic salmon lineage is shown in Supplementary
bly included 2,012 scaffolds of joined primary contigs and haplo- Table 4. The contiguity of this assembly is much better than the
tigs and primary contigs that could not be joined into scaffolds. previous assembly (GCA_000233375.4) (Lien et al. 2016) which
After this step, the total length of the assembly was finally re- had scaffold N50 and L50 of ∼0.6 Mb and 594, respectively.
duced to within the expected genome size at ∼2.83 Gb and the However, the contiguity of the new reference for the European
N50 increased to 11.9 Mb (Table 2). The Hi-C ligation proximity Atlantic salmon lineage genome assembly (GCA_905237065.2)
mapping further reduced the number of scaffolds to 1,633 and in- with scaffold N50 and L50 of 28 Mb and 33, respectively, is better
creased the N50 to 37.9 Mb and the BUSCO score to 96.2%, with than the assembly we present here. It also has a nearly perfect
negligible impact on the total length of the assembly (Table 2). BUSCO score showing 99.3% of complete protein-coding genes de-
Using information from the new linkage map to anchor the tected compared to 96.2% in this genome assembly for the N.A.
scaffolds to karyotype chromosomes broke some of the largest Atlantic salmon lineage.
scaffolds due to their alignment with different linkage groups.
Overall, 633 scaffolds were anchored to 27 linkage groups with SNP array positions on the chromosome
914 spanned gaps and 606 unspanned gaps to generate 27 sequences
chromosome sequences in a total length of ∼2.53 Gb, or 89% of The genome positions of the SNPs from the Axiom 50K array that
the total genome assembly length. The distribution of chromo- were mapped to unique positions of the N.A. Atlantic salmon
somes by physical length in base pairs from this assembly is chromosome sequences with no indels and including 60 bp flank-
shown in comparison with the chromosomes of the European ing sequences for each SNP are available in Supplementary File 3.
Atlantic salmon reference genome assembly in Fig. 5. The total The distribution of the 50K SNPs on the chromosomes of the gen-
length of the 1,122 unplaced scaffolds that are not anchored to a ome assembly is shown in Supplementary Fig. 1. The average
chromosome is ∼300 Mb. The total number of scaffolds in the number of SNP markers per chromosome is 1,725 with a min-
chromosome-level assembly for the N.A. Atlantic salmon is imum of 920 and a maximum of 2,885. Overall, the total number
8 | G3, 2023, Vol. 00, No. 0
of SNPs per chromosome is highly correlated with the physical the genes that they carry as illustrated in Supplementary Fig. 2.
size of the chromosomes (Pearson r = 0.96) in the new genome as- The rearrangements we found in the remaining 4 chromosomes
sembly we generated for N.A. Atlantic salmon. An additional 188 can be explained by 4 major separate events. The first was fission
SNPs were mapped to unplaced scaffolds or contigs. The SNP in the centromere region of chromosome Ssa01 resulting in
marker density per 1 Mb in each of the 27 chromosomes is dia- the separation of chromosome arms Ssa01q and Ssa01p. The
grammed in Fig. 6. The marker density range is from zero to 40 second was the translocation and fusion of Ssa01p with Ssa23
per 1 Mb. The marker density fluctuates within the chromosomes resulting in Chr 23 of the N.A. lineage genome assembly.
with lower density typically near the centromeres and telomeres. The evidence from sequence alignments for those 2 chromo-
This detailed characterization of marker density on each chromo- somal rearrangements is illustrated in Fig. 7. According to
some is important for assessing the utility of the array in genome Brenna-Hansen et al. (2012), this is the most common rearrange-
scans for QTL and for other genomic analyses. The overall distri- ment between salmon of European and N.A. origins as all the
bution of the array SNPs on chromosomes is highly correlated chromosome karyotypes that they have observed from N.A. sal-
with the chromosomes’ length, which is similar to what was re- mon were homozygous for the translocation of Ssa01p to Chr 23.
ported for high-density arrays that were developed for the However, more recent studies have documented differences in
European linage (Houston et al. 2014; Yáñez et al. 2016). the Ssa01/23 translocation in North America in specific geo-
graphic regions where secondary contact between European
Comparison between European and N.A. genome and N.A. Atlantic salmon has occurred (Lehnert et al. 2019;
assemblies: Watson et al. 2022). The third rearrangement event is the forma-
Similar to rainbow trout’s plasticity in chromosome number be- tion of Chr 08 in the N.A. genome from the fusion of the acro-
tween populations which correlates with geographic distribution centric chromosomes Ssa08 and Ssa29 (Fig. 8). The fourth is
(Thorgaard 1983; Gao et al. 2021), the Atlantic salmon genome the fusion of Ssa26 and Ssa28 to form Chr 26 in the N.A. genome
also exhibits tolerance to chromosomal rearrangements. While assembly (Fig. 9). Karyotype preparations and other evidence
other karyotypes have been reported for N.A. Atlantic salmon from molecular markers revealed that variation in the latter 2
from other populations with chromosome numbers varying chromosomal fusions (Ssa08/29 and Ssa26/28) can be found in
from 2N = 54 to 2N = 58 (see discussion in Brenna-Hansen et al. a wider range of Atlantic salmon populations from the N.A. lin-
2012), here, we found unequivocal evidence that supports 2N = eage (Brenna-Hansen et al. 2012; Lehnert et al. 2019; Wellband
54 in the SJR strain individual fish that we used for the genome et al. 2019). Hence, it is likely that those 2 chromosomal fusion
assembly presented in this report. The results of our linkage ana- events are more recent than the translocation event of Ssa01p
lysis coupled with the de novo assembly of large sequence to Chr 23 in N.A. Atlantic salmon.
scaffolds confirmed the chromosomal rearrangements between Smaller scale rearrangements in the form of inversions were
N.A. and European Atlantic salmon proposed previously by recently reported in Atlantic salmon (Stenløkk et al. 2022). Based
Brenna-Hansen et al. based on their own linkage analysis and kar- on our DNA sequence alignments between the genome assembly
yotyping (Brenna-Hansen et al. 2012). reported here and the European reference genome assembly, we
The chromosome sequences from this assembly of the N.A. sal- detected 2 of those inversions and found them to be larger than
mon genome (USDA_NASsal_1.1; GCA_021399835.1, N = 27) were what was reported by Stenløkk et al. (2022), with the size of 7
aligned with the chromosome sequences of the reference genome and 4 Mb for the inversions on Ssa09 (just below 100M) and
from the European lineage (Ssal_v3.1; GCA_905237065.2, N = 29) to Ssa18 (just below 80M), respectively (Supplementary File 4). We
identify and confirm the chromosomal rearrangements. Graphic were able to confirm that those 2 inversions between the 2 gen-
representations of the alignments for all 27 chromosomes can ome assemblies are likely to be real as they are encompassed
be found in Supplementary File 4. For 23 of the 27 chromosomes, within a single contig in both assemblies, and thus not likely to
we found simple collinear sequence alignments indicating highly be due to scaffolding errors. In addition, there is an optical appear-
similar order and direction of those chromosome sequences and ance of a very small inversion on Ssa16 at ∼60 Mb as shown in
G. Gao et al. | 9
Supplementary File 4. However, this is not a true inversion, but ra- than the rest of the genome because of those nearly identical
ther an optical illusion caused by the proximity (∼150 kb) of 2 du- stretches of sequence that could not be resolved with short reads.
plicated sequences on both genomes. Hence, the expectation would be that the average number of scaf-
folds in those hard to resolve chromosome arms will be higher
Contiguity of scaffolds in homeologous than the rest of the genome. However, the average number of
chromosome arms scaffolds in the 18 duplicated chromosome arms in this assembly
The Atlantic salmon genome contain 9 pairs of chromosome arms was 15 (Supplementary Table 5) compared to the overall average
with very high sequence similarity due to residual tetraploidy of 18 for all chromosome arms in the assembly, suggesting similar
(Lien et al. 2016). It is nearly impossible to distinguish alleles contiguity with the rest of the genome. The total number of scaf-
from homeologous tetraploid regions, and all salmonid genomes folds in chromosome sequences in this assembly was 633, and the
share this problem (Allendorf et al. 2015). In past de novo genome overall average was calculated based on the number of 50 chromo-
assemblies of salmonid genomes that were based on short reads some arms identified by Lien et al. (2016). Therefore, it appears that
(Berthelot et al. 2014; Lien et al. 2016; Pearse et al. 2019), the home- the problem of assembling those homeologous genome regions
ologous or duplicated chromosome arms were more fragmented can largely overcome with reads that are long enough to cover
10 | G3, 2023, Vol. 00, No. 0
Table 2. Statistics of the N.A. Atlantic salmon genome assembly at each step of the assembly pipeline.
Feature Canu contigs Purge duplicates Bionano scaffolds Hi-C scaffolds Chr. sequencesa
Fig. 5. Distribution of chromosome lengths in base pairs in the assembly from N.A. Atlantic salmon (accession GCA_021399835.1) presented in this report
a) vs. b); the distribution in the reference genome assembly from the European lineage (Ssal_v3.1; GCA_905237065.2).
stretches of DNA sequence that contain enough sequence variants Blast searches with the Refseq-annotated version of the
to distinguish between those nearly identical genome regions. European lineage genome assembly (GCF_905237065.1). For ex-
ample, the T cell receptor beta 01 region which is found on
Gene annotation Ssa01p in the European lineage reference genome has been shown
Gene annotation for this genome assembly accession is not avail- to be translocated to Chr 23 (Ssa01p_Ssa23) in the assembly of the
able currently from NCBI. Chromosome sequences or specific re- N.A. Atlantic salmon genome presented here (Grimholt et al.
gions of interests from the genome assembly can be aligned by 2022). In addition, the alignments of USDA_NASsal_1.1 (GCA_
G. Gao et al. | 11
Fig. 7. Chromosome fusions and fissions between the European and N.A. Atlantic salmon lineages. Fission of the European chromosome Ssa01q and p
arms and fusion of Ssa01p with Ssa23 in the N.A. Atlantic salmon genome. Sequences were compared using data from the reference assembly for the
European lineage in NCBI (Ssal_v3.1; GCA_905237065.2).
021399835.1) to the annotated Ssal_v3.1 (GCF_905237065.1) gen- can choose the output format of interest. The output file in gen-
ome assembly are now available in the NCBI remap menu at ome workbench format can be used to view the aligned chromo-
https://www.ncbi.nlm.nih.gov/genome/tools/remap. The USDA some region or regions of choice in a genome browser. In
assembly can be used as the source assembly and the Ssal_v3.1 addition, a first pass gene annotation can be downloaded from
as the target assembly. After pasting or uploading the chromo- Ensembl Rapid Release at https://rapid.ensembl.org/Salmo_salar_
some coordinates of interest from the source assembly, the user GCA_021399835.1/Info/Index.
12 | G3, 2023, Vol. 00, No. 0
Fig. 9. Fusion of the European lineage chromosomes Ssa26 and Ssa28. Sequences were compared using data from the reference assembly for the
European lineage in NCBI (Ssal_v3.1; GCA_905237065.2).
Conclusions Acknowledgements
We report here on the de novo assembly of the first chromosome- We thank Kristy Shewbridge for her technical assistance with the
level reference genome and the development and validation of preparation of the DNA sample used for genome sequencing. We
the first publicly available high-density SNP array for the N.A. thank Dr. Serap Gonen and Dr. Matthew Baranski from Mowi for
lineage of Atlantic salmon. These genomic resources are re- sharing SNP polymorphism information from the European SNP
quired for implementing genomic selection for aquaculture pro- array and for contributing samples of farmed salmon from
duction and are crucial to current research on the genetics, European origin. We thank Dr. Matthew Kent for sharing the
biology, evolution, and ecology of this subspecies that holds flanking sequence data of the markers used in the linkage map
high economic and social value in Eastern United States, published by Brenna-Hansen et al. (2012). Mention of trade names
Canada, Chile, and Australia. or commercial products in this publication is solely for the
G. Gao et al. | 13
purpose of providing specific information and does not imply Council NR. Genetic Status of Atlantic Salmon in Maine: Interim
recommendation or endorsement by the U.S. Department of Report from the Committee on Atlantic Salmon in Maine.
Agriculture. USDA is an equal opportunity provider and employer. Washington (DC): The National Academies Press; 2002.
Council NR. Atlantic Salmon in Maine. Washington (DC): The
National Academies Press; 2004.
Funding Gao G, Magadan S, Waldbieser GC, Youngblood RC, Wheeler PA,
Scheffler BE, Thorgaard GH, Palti Y. A long reads-based de-novo
This study was supported by the USDA Agricultural Research
assembly of the genome of the Arlee homozygous line reveals
Service in-house project numbers 8030-31000-004/005 and
chromosomal rearrangements in rainbow trout. G3 (Bethesda).
perspective from microsatellite DNA variation. Mol Ecol. 2001; Roazen D, et al. Scaling accurate genetic variant discovery to
10(4):807–821. doi:10.1046/j.1365-294X.2001.01231.x. tens of thousands of samples. bioRxiv:201178. https://doi.org/
Koren S, Walenz BP, Berlin K, Miller JR, Bergman NH, Phillippy AM. 10.1101/201178. 2018. preprint: not peer reviewed.
Canu: scalable and accurate long-read assembly via adaptive Purcell S, Neale B, Todd-Brown K, Thomas L, Ferreira MAR, Bender
k-mer weighting and repeat separation. Genome Res. 2017; D, Maller J, Sklar P, de Bakker PIW, Daly MJ, et al. PLINK: a toolset
27(5):722–736. doi:10.1101/gr.215087.116. for whole-genome association and population-based linkage
Lehnert SJ, Bentzen P, Kess T, Lien S, Horne JB, Clément M, Bradbury analysis. Am J Hum Genet. 2007;81(3):559–575. doi:10.1086/
IR. Chromosome polymorphisms track trans-Atlantic divergence 519795.
and secondary contact in Atlantic salmon. Mol Ecol. 2019;28(8): Rastas P. Lep-MAP3: robust linkage mapping even for low-coverage