
Genomics 113 (2021) 2944–2952

Contents lists available at ScienceDirect

Genomics
journal homepage: www.elsevier.com/locate/ygeno

Chromosome-level assembly of the Hypophthalmichthys molitrix (Cypriniformes: Cyprinidae) genome provides insights into its ecological adaptation
Yi Zhou a, Weiling Qin a, Huan Zhong b, *, Hong Zhang c, Luojing Zhou d
a State Key Laboratory of Developmental Biology of Freshwater Fish, Life Science College, Hunan Normal University, Changsha, Hunan, PR China
b Hunan Research Center of Engineering Technology for Utilization of Distinctive Aquatic Resource, College of Animal Science and Technology, Hunan Agricultural University, Changsha, China
c Guangxi Key Laboratory of Beibu Gulf Marine Biodiversity Conservation, Beibu Gulf University, Qinzhou, China
d Hunan Provincial Key Laboratory of Nutrition and Quality Control of Aquatic Animals, Changsha University, Changsha, China

ARTICLE INFO

Keywords:
Whole genome sequencing
Chromosomal-level
Hypophthalmichthys molitrix
Positive selection
Ecological adaptation

ABSTRACT

Hypophthalmichthys molitrix (silver carp) is phytoplanktivorous and is an economically and ecologically important fish species. Although it is a well-known invasive species, a number of factors associated with its ecological adaptations are largely unknown. Here, we present a chromosomal-level assembly of the species based on the PacBio Sequel II platform and Hi-C scaffolding technology. Based on the high-quality genome sequences and previous genome sequencing projects, a number of genes that were probably subject to positive selection in H. molitrix and in the last common ancestor of H. molitrix and H. nobilis were identified. Some of these genes may partially explain the mechanisms of H. molitrix for surviving damage due to toxic algae. Demographic history estimation suggests that the effective population size (EPS) of the species may have constantly increased along with the uplift of the Qinghai-Tibet Plateau, started to decline when Quaternary glaciation started, and further declined during the Younger Dryas Period. Moreover, the introgression from H. nobilis to H. molitrix in North America was corroborated based on the whole-genome sequencing data, and the proportion of introgressed regions was estimated to be approximately 5.8%. Based on the high-quality assembly, the possible mechanisms by which H. molitrix adapts to its endemic and invaded locations were profiled.

1. Introduction

Hypophthalmichthys molitrix is one of the three species included in the genus Hypophthalmichthys, Cyprinidae, which is the largest family of freshwater fishes [1]. H. molitrix is endemic to East Asia and is an important source of animal protein for local people. The cultivation of the species can be traced back to the Tang dynasty (618–907 CE), and its production ranked second in worldwide aquaculture in 2016 [2]. H. molitrix is phytoplanktivorous and can feed on cyanophytes, euglenales, cryptomonas, etc. Thus, H. molitrix has been used to control algal blooms resulting from eutrophication, that is, so-called “nontraditional biomanipulation” [3]. As a result, H. molitrix, together with its sibling H. nobilis, has been introduced to more than 70 countries for the control of phytoplankton and/or the development of aquaculture [4]. In North America, H. molitrix was introduced in the early 1970s, and it was found in the wild in the 1980s and has since been expanding its territories and threatening native species [4]. Moreover, the hybridization of H. molitrix and H. nobilis in the wild has been reported based on morphological features, allozyme loci, and nucleotide and mitochondrial markers [5,6] and has been indirectly corroborated by whole genome sequencing [7]. However, the extent of introgression and the introgressed regions are largely unknown.

A number of phytoplankton, for example, blue-green algae, can generate toxic secondary metabolites that are harmful to the health of humans and other animals [8]. The hepatotoxic microcystin generated by a number of blue-green algae is probably the most widespread cyanobacterial toxin and may lead to acute death in humans [9]. In addition, three other kinds of toxins, neurotoxins, cytotoxins, and dermatoxins, can also be generated by phytoplankton [8]. Thus, it is thought that H. molitrix may have adapted to survive based on its feeding habits. However,

* Corresponding author at: Hunan Research Center of Engineering Technology for Utilization of Distinctive Aquatic Resource, College of Animal Science and
Technology, Hunan Agricultural University, Changsha, China.
E-mail address: zhonghuanzh@126.com (H. Zhong).

https://doi.org/10.1016/j.ygeno.2021.06.024
Received 28 October 2020; Received in revised form 13 June 2021; Accepted 16 June 2021
Available online 19 June 2021
0888-7543/© 2021 Published by Elsevier Inc. This article is made available under the Elsevier license (http://www.elsevier.com/open-access/userlicense/1.0/).

the molecular events that happened from ancient times to present times are largely unknown.

With the advancement of sequencing technology, the sequencing of the whole genome of a species to investigate its ecological adaptations is frequently reported. During the preparation of this report, the chromosomal-level genome sequences of H. molitrix were reported [10]. However, the assembly was based on next-generation sequencing technology, and contigs were anchored according to a genetic map. Thus, the contigs of the assembly were short, and a large number of contigs could not be anchored [10]. Third-generation sequencing technology can generate much longer contigs, and the Hi-C scaffolding method can anchor almost all of the contigs onto chromosomes. In the present study, using both PacBio single molecule real-time sequencing (SMRT) and Hi-C scaffolding technologies, we sequenced and de novo assembled the whole genome of an H. molitrix individual. Based on the newly generated and nearly chromosomal assembly and the whole genome sequencing data generated in previous reports, the possible positively selected genes (PSGs) residing in the H. molitrix genome that may explain its adaptation to the environment and the demographic history that may represent its adaptation to ancient environments were deduced. Moreover, the introgression from the sibling species H. nobilis to H. molitrix in the exotic environments of North America was also profiled.

2. Materials and methods

2.1. Sampling

A live sample of H. molitrix was purchased from an aquatic product market in Changsha, China. The classification of the fish was corroborated before sampling. The fish was dissected after injection with tricaine methanesulfonate (MS-222), and samples of white muscle, heart, brain, kidney, liver, and spleen were frozen in liquid nitrogen immediately after dissection and stored until use.

2.2. Library constructions and sequencing

Genomic DNA was extracted from white muscle using the standard phenol/chloroform method, and the integrity was checked by agarose gel electrophoresis. Ten micrograms of DNA was used to construct the library for PacBio SMRT Sequencing using the SMRTbell Express Template Prep Kit (PacBio, Menlo Park, CA, USA), and the library was sequenced using the PacBio Sequel II System with CLR mode. Genomic DNA was also used to construct a library with 350 bp insert length according to Illumina standard procedures (Illumina, San Diego, CA, USA) and was sequenced using a HiSeq 2500 Sequencing System with 150-bp paired-end mode.

To assist the annotation of the gene models, total RNA was extracted from five tissues, including the heart, brain, kidney, liver, and spleen, using TRIzol reagent (Invitrogen, Carlsbad, CA) following the manufacturer’s protocol, and the libraries were prepared and sequenced based on Illumina technology, similar to the protocol performed for genomic DNA as aforementioned.

For scaffolding the assembly, the white muscle sample was crosslinked using formaldehyde, and then, the crosslinked DNA was extracted, digested, and used to construct the high-throughput chromosome conformation capture (Hi-C) library following the protocol described elsewhere [11,12]. Ultimately, the library was sequenced using the HiSeq 2500 Sequencing System in 150-bp paired-end mode.

2.3. Genome assembly and assessments

All the Illumina short reads were quality controlled using fastp [13] with default settings. The cleaned short reads were used to count the 17-mer spectrum using Jellyfish-2 [14], which was then used to survey the complexity of the genome including the size, repeat contents, and heterozygosity of the genome, using GenomeScope 2.0 [15]. The PacBio long reads were used to de novo assemble the genome using FALCON 0.3.0 [16] with the following parameters “length_cutoff = 30000, length_cutoff_pr = 28000, pa_HPCdaligner_option = -v -B188 -t12 -w8 -M24 -e.75 -k18 -h280 -l2800 -s1000, overlap_filtering_setting = --max_diff 60 --max_cov 90 --min_cov 2 --bestn 10”; then, the reads were mapped back to the new assembly using blasr v5.3.1 [17] with default settings, to improve the assembly using arrow (https://github.com/PacificBiosciences/GenomicConsensus). The improved assembly was also polished using pilon v1.8 [18], based on the alignments of genomic short reads mapped back to the assembly using BWA-MEM with default settings [19]. The process was repeated 3 times iteratively to increase the accuracy of the assembly. To remove possible redundant sequences, all the long reads were mapped back to the genome using minimap2 with the parameter “-x map-pb” [20], followed by the removal of possible redundant sequences according to the purge_haplotigs pipeline [21].

To scaffold the genome, the cleaned short reads from the Hi-C library were mapped to the new assembly using bowtie2 [22] with parameters “--end-to-end --very-sensitive -L 30”, and the interaction information between each pair of positions of the genome was estimated and used to locate and orient the contigs using LACHESIS [23], given that the karyotype of the species is 2n = 48 [24]. The gaps between any two neighboring contigs were filled with 100 Ns.

A number of strategies were used to estimate the accuracy of the new assembly. First, evolutionarily informed expectations of the gene content of near-universal single-copy orthologs within Actinopterygii were estimated using Benchmarking Universal Single-Copy Orthologs (BUSCO 3) software [25] to represent the completeness of the assembly. Second, all the cleaned short reads from each of the RNA-Seq libraries were mapped back to the genome using hisat2 with default settings [26], and the mapping rate was estimated. Third, the interactions between each pair of positions of the genome were estimated and used to check the correctness of the scaffolding, that is, to check if stronger interactions exist in closer locations. Finally, the scaffolds of H. molitrix were compared with chromosomes of the model species belonging to the same family, Danio rerio (zebrafish), using MCScanX [27], and the results were displayed with Circos v0.69.8 [28] to check their collinearity.

2.4. Genome annotation

Repeat elements (REs) residing in the genome were identified using two strategies: de novo-based and homolog-based methods. The former includes the identification of SSRs, tandem repeats, and LTRs based on their structural characteristics using MISA [29], TRF v4.09 [30], and LTR_retriever v2.8.7 [31], respectively. Moreover, possible REs were also de novo identified using RepeatModeler (http://www.repeatmasker.org) by recruiting RECON V1.05 [32] and RepeatScout v1.0.5 [33] and were combined with the consensus sequences of known RE deposits in Repbase [34] to form a library; this library was used to detect all possible REs based on the homolog-based strategy implemented in RepeatMasker version open-4.0.9 (http://www.repeatmasker.org). All these results were combined, and the nonredundant list of REs residing in the H. molitrix genome was retained.

Three strategies, including RNA-Seq-based, de novo, and homolog-based methods, were used to predict all the possible protein-coding gene models of the genome. The cleaned RNA-Seq data were combined and assembled using Trinity v2.8.2 [35,36] with a genome-guided model and were then used to predict possible coding regions following the PASA2 + TransDecoder pipeline [36,37]. The possible coding sequences were then used to train the models for AUGUSTUS v2.5.5 [38], Glimmer v3.02b [39], and SNAP [40], and the de novo prediction of gene models was implemented based on the repeat-masked genome. The proteomes of five related fish species, Ctenopharyngodon idellus, D. rerio, Ictalurus punctatus, Megalobrama amblycephala, and Triplophysa tibetana, were downloaded from NCBI (https://www.ncbi.nlm.nih.gov/) and were used to predict possible coding regions based on homology. For


each species, the homologous regions of each protein sequence were extracted using genBlastA v1.0.4 [41], and the possible coding regions were then deduced using GeneWise v2.4.1 [42]. Ultimately, all these results were combined and were used to generate a nonredundant gene set for the species using EVidenceModeler v1.1.1 [37] with the weights “RNA_Seq > homology > de novo”. Functional descriptions for the gene models were obtained from the top hit protein sequences deposited in NR and Swissprot using blastp with the e-value set at 1e-5 [43]. KOG classification and KEGG pathway assignments were performed using blastp against protein sequences deposited in the KOG [44] and KEGG databases [45], respectively. GO terms were associated by querying the proteins deposited in InterPro [46] using InterProScan5 [47].

2.5. Identification of possible PSGs

Genes subject to positive selection may indicate the ecological adaptation of a species. To identify possible PSGs in H. molitrix, coding sequences of the siblings M. amblycephala, C. idellus, and Anabarilius grahami were downloaded from NCBI (https://www.ncbi.nlm.nih.gov/), and the coding sequences of H. nobilis were predicted using AUGUSTUS [38]. Orthologs for each of the species were identified and aligned, and candidates that were probably subject to positive selection were identified based on the branch-site model implemented in codeml [48] using PosiGene [49]. The analyses were repeated two times with different foreground branches specified: the first time, H. molitrix was specified, and the second time, the last common ancestor (LCA) of H. molitrix and H. nobilis was used as the foreground branch. The P-values for multiple tests were adjusted using the Bonferroni and Benjamini–Hochberg methods [50], and the functional annotations were associated using the online tool STRING v11.0 [51].

2.6. Demographic analyses using psmc

Based on the short reads generated from the genomic library, the dynamics of effective population sizes (EPSs) were deduced for the individual in the present study (hereafter called HM1) and the individual collected from North America in a previous report (hereafter called HM2) [7]. Briefly, for each individual, the cleaned short reads were first mapped back to the present assembly using BWA-MEM with default settings [19]. Then, gapped alignments were identified and realigned using GATK3 [52] to improve the accuracy. Possible PCR duplicates were marked using GATK4 [52]. Then, possible SNPs were called using bcftools v1.10.2 [53] with default settings, and all the called SNPs were converted into fq format using vcfutils.pl with the mode vcf2fq. According to the recommendations of the manual, the minimum and maximum read depths were set to one-third and triple the mean depth, respectively. Finally, the psmc-compatible format of mutations was generated using fq2psmcfa with min_qual set to 20. PSMC v0.6.5 [54] was run for each of the individuals, with a pattern of parameters set as “4 + 24*2 + 3 + 6” to ensure that more than 10 recombinations are inferred to occur in the intervals each parameter spans. Moreover, 100 bootstrap replicates were performed to test the robustness of the estimation on the basis of randomly selected 500 kb segments extracted from the chromosomes. For comparison, the dynamics of effective population sizes for the individual of H. nobilis collected from North America [7] were also deduced as aforementioned, but no bootstrap was performed.

2.7. Introgression

The aforementioned SNPs were all filtered using bcftools v1.10.2 with the criterion of “-g 3 QUAL<30” and minimum and maximum depths of one-third and triple the mean depth, respectively. For the ABBA-BABA test, M. amblycephala was selected as the outgroup species [55], the raw reads of genomic shotgun sequencing were downloaded from the NCBI Sequence Read Archive (SRA), and possible SNPs were identified and filtered as for the ingroup species H. molitrix and H. nobilis. To perform the test, all passed SNP sites for each individual were extracted and merged using bcftools v1.10.2, and the merged set was then converted into eigenstrat format using vcf2eigenstrat.py (https://github.com/mathii/gdc/blob/master/vcf2eigenstrat.py). The test was performed using admixr [56], which provides a convenient interface for the traditional ADMIXTOOLS workflow [57]. To estimate the fraction of the genome shared through introgression, we duplicated the SNPs called for H. nobilis and used them as fully introgressed individuals, and the possible number of total introgressed sites (T) was estimated. The ratio of the number of possible introgressed sites residing in the HM2 genome to T was regarded as the possible fraction of introgressed regions [58].

3. Results

3.1. Raw data

Several raw datasets were generated in the present study. Using the PacBio Sequel II sequencing system in CLR mode, a single cell generated nearly 8.5 M sequences, totaling approximately 145.4 Gb and representing approximately 169× of the estimated genome, and the mean and N50 lengths of the subreads were 17.1 kb and 26.9 kb, respectively (Table 1). The Illumina sequencing of the genomic shotgun library generated more than 997 M raw sequences, totaling nearly 150 Gb and representing approximately 175× of the estimated genome, with Q20 and Q30 rates of 96% and 90%, respectively (Table 1). After filtering, more than 984 M reads totaling more than 147 Gb remained. The Hi-C library generated nearly 550 M raw reads, totaling more than 82 Gb and representing approximately 95× of the estimated genome, and after filtering, more than 545 M reads, totaling nearly 82 Gb, remained. For each of the five RNA-Seq libraries, approximately 40 M raw reads totaling 6 Gb were generated (Table 1).

Table 1
Raw data generated for H. molitrix in the present study.

Sequencing type   Application         Tissue         Platform              Number of raw sequences   Size of raw data (Gb)
Genome-Seq        Survey              White muscle   Illumina HiSeq 2500   997,285,638               149.6
Genome-Seq        De novo             White muscle   PacBio Sequel II      8,499,972                 145.4
Genome-Seq        Hi-C                White muscle   Illumina HiSeq 2500   549,739,284               82.5
RNA-Seq           Assist annotation   brain          Illumina HiSeq 2500   36,436,828                5.5
RNA-Seq           Assist annotation   heart          Illumina HiSeq 2500   39,085,910                5.9
RNA-Seq           Assist annotation   kidney         Illumina HiSeq 2500   48,506,166                7.3
RNA-Seq           Assist annotation   liver          Illumina HiSeq 2500   53,224,480                8.0
RNA-Seq           Assist annotation   spleen         Illumina HiSeq 2500   45,050,516                6.8

3.2. De novo assembly of the genome

The distribution of the frequency spectrum showed that the maximum frequency of 17-mers was 62 (Supplemental Fig. S1), and the total number of 17-mers was approximately 53 G; thus, the estimated genome size of H. molitrix was approximately 856 Mb. The estimated heterozygosity and repeat contents were approximately 0.5% and approximately 30%, respectively. Based on the estimated genome size, subreads longer than 28 kb were used to de novo assemble the genome, and 916 contigs totaling 1135 Mb were obtained, with a contig N50 length of 4.6 Mb, and the longest contig reached more than 21 Mb. After rounds of polishing, the total length of the genome reached 1136.7 Mb, which was significantly larger than the estimated genome size. This


result may have occurred because of the relatively high level of heterozygosity. After removing possible redundant sequences, the final version of the genome was composed of 215 contigs, totaling 856.5 Mb, with a contig N50 length of 6.6 Mb (Supplemental Table S1). Based on the de novo assembly, estimation showed that approximately 58.9% of the cleaned short reads from the Hi-C library were uniquely mapped paired-end reads, and approximately 84.0% of them were valid. On the basis of the valid short reads, 214 contigs, totaling 856.5 Mb, were located and oriented to the 24 chromosomes of the species, with a scaffold N50 length of 35 Mb (Supplemental Table S2). That is, nearly all the bases were located on the chromosomes, and the assembly reached the chromosomal level.

The new assembly was assessed from different perspectives. First, BUSCO assessment showed that among the 4584 conserved single-copy orthologous genes of Actinopterygii, a total of 4358, accounting for 95.0%, were found in the assembly, with single-copy and duplicate rates of 86.1% and 8.9%, respectively (Supplemental Fig. S2). Moreover, fragments of another 1.9% could also be found in the assembly. Second, more than 91% of the short reads generated from each of the five RNA-Seq libraries could be mapped back to the assembly. Third, for each of the chromosomes, the interaction intensity was stronger in the nearby regions than in the distant regions (Fig. 1), which was consistent with the presupposition of Hi-C scaffolding. Last, the good collinearity between homologous chromosomes of H. molitrix and D. rerio also suggested the accuracy of the assembly (Fig. 2).

3.3. Annotations of the genome

Approximately 46% of the genome is composed of repeats (Table 2). DNA transposons were the most represented type of RE residing in the genome, accounting for 26.8% of the genome. Among these REs, TcMar-Tc1 accounted for 3.5% and was the most represented, followed by CMC-EnSpm, which accounted for 3.4%, and Kolobok, which accounted for 2.9% of the genome. RNA-mediated transposons are also well represented in the genome. Among these REs, LTRs are the most represented, accounting for 8.2% of the genome. LINEs, accounting for 2.5% of the genome, were also well represented, of which the top represented type was L2, which accounted for 1.5% of the genome. Other well-represented REs residing in the genome are simple repeats, accounting for 3.1% of the genome, followed by satellites, accounting for 0.9% of the genome.

The final gene set included 29,279 genes, of which 25,809 genes, accounting for 88.2%, could be annotated by at least one database (Table 3). More specifically, gene descriptions were obtained for 23,006 genes, accounting for 78.6%, from Swissprot and for 25,664 genes, accounting for 87.7%, from NR. KEGG pathways and GO terms were associated with 15,376 genes, accounting for 52.5%, and 14,737 genes, accounting for 50.3%, of the gene set, respectively.
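
The 17-mer survey and the N50 values quoted in Section 3.2 can be illustrated with a few lines of Python. This is only a sketch under stated assumptions: it uses the simple relation genome size ≈ total k-mer count / peak k-mer depth rather than the GenomeScope 2.0 model applied in the study, and the function names and the toy contig lengths are hypothetical.

```python
def estimate_genome_size(total_kmers: float, peak_depth: float) -> float:
    """Rough genome size (bp): total k-mer count divided by the main-peak depth.

    Assumption: a simplification of what model-based tools such as GenomeScope fit.
    """
    if peak_depth <= 0:
        raise ValueError("peak_depth must be positive")
    return total_kmers / peak_depth


def n50(lengths: list[int]) -> int:
    """Length L such that sequences of length >= L cover at least half of the total."""
    if not lengths:
        raise ValueError("lengths must not be empty")
    half_total = sum(lengths) / 2
    running = 0
    for length in sorted(lengths, reverse=True):
        running += length
        if running >= half_total:
            return length
    raise AssertionError("unreachable for non-empty input")


# Values from the text: about 53 G 17-mers in total, main peak at depth 62.
size_bp = estimate_genome_size(53e9, 62)
print(f"estimated genome size: {size_bp / 1e6:.0f} Mb")

# Hypothetical contig lengths, only to show how N50 is read off.
toy_contigs = [10, 8, 6, 4, 2]
print("toy N50:", n50(toy_contigs))
```

With the rounded inputs from the text, the simple ratio lands within about 1% of the reported 856 Mb; the small difference is expected because 53 G is itself an approximation.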

Fig. 1. Intensity of interactions between any two chromosomal regions across the genome.
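
The check summarized in Fig. 1, that contact intensity should be stronger between nearby regions than between distant ones, can be sketched on a toy contact matrix. The matrices and function names below are hypothetical and are not derived from the Hi-C data of this study; the sketch only shows the logic of the check.

```python
def mean_contact_by_distance(matrix: list[list[float]]) -> list[float]:
    """Mean contact count for each bin separation d = |i - j| along one chromosome."""
    n = len(matrix)
    means = []
    for d in range(n):
        values = [matrix[i][i + d] for i in range(n - d)]
        means.append(sum(values) / len(values))
    return means


def decays_with_distance(matrix: list[list[float]]) -> bool:
    """True if the mean contact count never increases as bin separation grows."""
    means = mean_contact_by_distance(matrix)
    return all(near >= far for near, far in zip(means, means[1:]))


# Hypothetical 4-bin contact matrix with the expected distance decay.
well_ordered = [
    [50, 20, 8, 2],
    [20, 48, 18, 6],
    [8, 18, 52, 22],
    [2, 6, 22, 49],
]
# Hypothetical matrix in which distant bins interact strongly, as a misjoin might.
suspicious = [
    [10, 1, 9],
    [1, 10, 1],
    [9, 1, 10],
]

print(mean_contact_by_distance(well_ordered))
print(decays_with_distance(well_ordered), decays_with_distance(suspicious))
```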



Fig. 2. The high collinearity between chromosomes of H. molitrix and D. rerio.
The Circos plot demonstrates the high collinearity between chromosomes of H. molitrix and D. rerio. Chromosomes of H. molitrix begin with “hm”, and chromosomes of D. rerio begin with “dr”. Each arc line represents a collinear region.

Table 2
Total length and percentage of the whole genome for each type of repeat element residing in the H. molitrix genome.

Type      Length (Mb)   % in genome
DNA       229.6         26.8
LINE      21.6          2.5
LTR       70.2          8.2
SINE      2.6           0.3
Other     25.3          2.9
Unknown   44.6          5.2
Total     394.0         46.0

Table 3
Summary statistics of the functional annotations for the predicted gene models of H. molitrix against each of the databases.

             Database    Number   Percent (%)
Annotation   Swissprot   23,006   78.6
             NR          25,664   87.7
             KEGG        15,376   52.5
             COG         16,300   55.7
             GO          14,737   50.3
Total        Annotated   25,809   88.2
             Gene        29,279   –
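
The percentages in Table 2 can be cross-checked against the assembly length given in the text. The following is a minimal sketch, assuming the percentages were computed relative to the 856.5 Mb final assembly; the variable and function names are hypothetical.

```python
ASSEMBLY_MB = 856.5  # final assembly length reported in the text (assumed denominator)

# Lengths (Mb) and reported percentages copied from Table 2.
repeats = {
    "DNA": (229.6, 26.8),
    "LINE": (21.6, 2.5),
    "LTR": (70.2, 8.2),
    "SINE": (2.6, 0.3),
    "Other": (25.3, 2.9),
    "Unknown": (44.6, 5.2),
}


def percent_of_genome(length_mb: float, genome_mb: float = ASSEMBLY_MB) -> float:
    """Share of the genome occupied by a repeat class, in percent."""
    return 100.0 * length_mb / genome_mb


for name, (length_mb, reported) in repeats.items():
    computed = percent_of_genome(length_mb)
    print(f"{name:8s} computed {computed:5.2f}%  reported {reported:4.1f}%")

total_mb = sum(length for length, _ in repeats.values())
print(f"sum of classes: {total_mb:.1f} Mb, {percent_of_genome(total_mb):.1f}% of the genome")
```

Small differences in the last digit are expected because the tabulated lengths are themselves rounded.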

3.4. Possible PSGs

Possible PSGs that may be pivotal during the adaptation to new environments for H. molitrix and the LCA of H. molitrix and H. nobilis were identified, respectively (Supplemental Tables S3 and S4). During the adaptation process of the LCA of H. molitrix and H. nobilis, a total of 56 possible PSGs were identified (FDR < 0.1). These genes include NAPSA and tfr1b, which are included in the “Lysosome” pathway, and itgb4, which participates in the “Regulation of actin cytoskeleton” pathway, etc. During the adaptation process of H. molitrix, thirty-six genes that were probably subjected to positive selection were identified (FDR < 0.1). These include ccr6b, CXCR5, and tnfb, which are included in the “Cytokine-cytokine receptor interaction” pathway, and pfn2, which is included in the “Regulation of actin cytoskeleton” pathway, etc. Interestingly, three genes, hpx, CXCR5, and TRIM59, were constantly subjected to positive selection during the adaptation process from the LCA of H. molitrix and H. nobilis to H. molitrix.
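
The FDR threshold used above rests on the Benjamini–Hochberg adjustment named in Section 2.5. A minimal sketch of that adjustment follows; the P-values are hypothetical placeholders, not results from this study, and the function name is illustrative rather than part of PosiGene.

```python
def benjamini_hochberg(p_values: list[float]) -> list[float]:
    """Return Benjamini-Hochberg adjusted P-values in the original input order."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])  # indices by ascending P
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest P-value down, enforcing monotonicity of adjusted values.
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, p_values[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted


# Hypothetical raw P-values for four candidate genes.
raw = [0.01, 0.04, 0.03, 0.005]
adj = benjamini_hochberg(raw)
selected = [i for i, q in enumerate(adj) if q < 0.1]  # FDR < 0.1, as in the text
print(adj, selected)
```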

3.5. Demographic history of the species

The demographic histories of both HM1 and HM2 were deduced (Fig. 3). Both analyses showed that the EPS of H. molitrix was approximately 110 k more than 4 million years ago (Mya). For HM1, the EPS continuously increased until 1.5 Mya, and the peak value was approximately 195 k. Thereafter, the EPS decreased to approximately 90 k until approximately 0.7 Mya and stabilized for approximately 40 k years, after which the EPS suddenly decreased to 27.4 k. The trend of the demographic history of HM2 is similar. However, HM2 reached a peak value of 165 k at approximately 2 Mya and the stable value of 88 k at approximately 0.9 Mya, and started to decrease approximately 60 k years ago, with an ultimate EPS of 12.9 k.

Fig. 3. The deduced demographic history of H. molitrix and H. nobilis.
Curves marked with BM1 and BM2 denote the possible demographic history of the two individuals of H. molitrix collected in East Asia and North America, respectively. One hundred bootstrap replicates were performed for BM1 and BM2 and were plotted in light-colored lines. The possible demographic history of the individual of H. nobilis collected in North America is also shown.

3.6. Introgression

Using the closely related species M. amblycephala as an outgroup, the introgression from H. nobilis to H. molitrix was estimated. Given that the three ingroups are HM1, HM2, and H. nobilis in sequence, the ABBA-BABA test showed that the numbers of BABA and ABBA sites were 489,089 and 139,933, respectively (Supplemental Table S5). The deduced D value was 0.556, with a Z-score of 32.6. This result corroborated the introgression from H. nobilis to HM2 (Fig. 4). Given that the number of sites resulting from incomplete lineage sorting (ILS) may be approximately 139,933, the number of introgressed sites from H. nobilis to HM2 may be 349,156. Moreover, using the H. nobilis individual to represent hypothetical complete introgression, the deduced number of total introgressed sites was 6,030,357; thus, the possible proportion of introgressed sites was approximately 5.8%.

Fig. 4. Possible genetic introgression from H. nobilis to H. molitrix in North America.
The arrow indicates the phylogenetic position of significant D statistics. HM1, HM2, HN, and MA denote the individuals of H. molitrix collected in East Asia and North America, the individual of H. nobilis collected in North America, and the outgroup M. amblycephala, respectively.

4. Discussion

In the present study, the raw data used for de novo assembly were generated by the PacBio Sequel II platform in CLR mode, and this newest technology generated more than 145.35 Gb of data, representing 169× of the estimated genome. The coverage of raw data was much higher than that in previous studies of the same order [7,10,55,59–61] (Supplemental Table S1). To some extent, the higher coverage and longer sequencing reads may lead to more continuous contigs. Thus, compared to previous studies of whole genome sequencing of cyprinids, the contig N50 length of the present species was much longer (Supplemental Table S1). The accuracy of the assembly was confirmed by a number of assessments. It was interesting that the estimated genome size of Jian et al. [10] was much less than ours (782 Mb vs 856 Mb), although the two assemblies had a similar size (837 Mb vs 856.5 Mb). We supposed that the two individuals were collected from different local populations and that genome sizes may vary significantly between individuals, since large-scale genome size variation between- and even within-populations has been reported before [62]. Further pan-genome investigations comparing individuals from more local populations may resolve this more clearly. Moreover, the similar distribution of the length of genes, CDSs, exons, and introns, and the number of exons and introns per gene between H. molitrix and related species may suggest the accuracy of its gene model predictions (Supplemental Fig. S3). Based on the newly generated whole genome sequences, a number of mysteries related to the ecological adaptations of H. molitrix were solved.

4.1. Pivotal genes possibly related to the feeding habits

Both H. molitrix and H. nobilis are planktivorous, with the former tending to feed on phytoplankton and the latter preferring zooplankton. However, the difference is not absolute, and most of the time, their diets overlap. Blue-green algae are phytoplankton and can generate cyanotoxins. The algae can multiply exponentially when environmental factors are suitable, such as high temperature and water eutrophication, and can form cyanobacterial blooms. Microcystin is common in most algal blooms, and more than 200 kinds have been identified; of these kinds, microcystin-LR (MCLR) is the most common, and its toxicology is well studied [63]. MCLR may inhibit protein phosphatase 1 (PP1) and protein phosphatase 2A (PP2A) and induce oxidative stress, which may lead to hepatic hemorrhage, necrosis, inflammation, apoptosis, and cytoskeletal and DNA destruction [63,64]. Thus, it is pivotal to address these consequences for fishes that mainly feed on phytoplankton, such as H. molitrix, H. nobilis, and their LCA.

We hypothesized that genes participating in the pathways that address the consequences of MCLR may have been adapted. In their LCA, Napsin A aspartic peptidase (NAPSA) participates in the Autophagy, Apoptosis, and Lysosome pathways [65], and the Solute carrier family 25 (slc25a4) participates in the Necroptosis and Cellular senescence pathways and was predicted to be subject to positive selection [66]. Moreover, genes such as Integrin beta 4 (itgb4) participate in the Regulation of actin


cytoskeleton pathway [67], and Keratin 97 belongs to the intermediate filament family and may be adapted to survive cytoskeletal and DNA destruction [68].

In H. molitrix, two more genes related to the process, Tubulin-folding cofactor B (tbcb) and Profilin 2 (pfn2), a member of the profilin family, both of which are included in the Regulation of actin cytoskeleton pathway [69,70], were also subject to positive selection. X-ray repair complementing defective repair in Chinese hamster cells 6 (xrcc6), which participates in the Non-homologous end-joining pathway, may also be related to this process [71]. Four genes were predicted by STRING to possibly interact: the Chemokine (C-X-C motif) receptor 5 (CXCR5) and Tumor necrosis factor b (tnfb), which are involved in the Cytokine-cytokine receptor interaction pathway; C5a anaphylatoxin chemotactic receptor 1 (c5ar1), which participates in the Neuroactive ligand-receptor interaction pathway; and G protein-coupled receptor 55a (gpr55a). These genes may aid in the adaptation to the consequences induced by microcystin, such as necrosis and inflammation [72–74]. The susceptibility of Endonuclease G (endog) to positive selection may result from the adaptation to Apoptosis [75].

Three genes, Hemopexin (hpx), Tripartite motif-containing 59 (TRIM59) and CXCR5, were continuously subject to positive selection, which may suggest their importance. CXCR5 may be specifically expressed in lymphatic tissues and play important roles in B cell migration [76], and such immune-related genes are prone to positive selection [77].

Many TRIM proteins are ubiquitin E3 ligases and play central roles in the host defense against viral infection [78], and TRIM59 is involved in a series of cellular processes, including autophagy [79]. The function of hpx is to transport heme to the liver and recover iron, and proteomic studies have shown that this protein is one of those specifically affected by MCLR in mice, rats, and zebrafish [64]. Thus, the adaptation of the gene may result directly from the consequences of MCLR.

4.2. Demographic history

The demographic history of H. molitrix may provide insights into its adaptation to ancient environments. The eggs of H. molitrix float, and their incubation must occur in the water current. In East Asia, the summer monsoon can bring water vapor from the ocean, which can generate rainfall, intensify the water currents, and eventually result in the propagation of H. molitrix. The continuous uplift of the Qinghai-Tibet Plateau can reinforce and extend the process to high-latitude areas [80,81]. Thus, until approximately 2 Mya, along with the uplift of the plateau, the distribution area of the species may have expanded, and the EPS increased continuously (Fig. 3). The process stopped when Quaternary glaciation started at approximately 2 Mya, since the lowered temperature may have resulted in the shrinkage of its distribution area. Finally, the EPS stabilized at 90 k until approximately 30 k years ago, and we hypothesized that the possible causes of the Younger Dryas Period [82] may have led to the sharp decline in its EPS.

The demographic history for HM2 was also deduced. The trend for the two individuals is similar, but the deduced time of the peak value and the absolute number of EPS are not the same (Fig. 3). To some extent, the deduced EPS for HM2 is closer to that of H. nobilis, especially prior to 1 Mya (Fig. 3). Given that HM2 may be subject to introgression from H. nobilis in North America [2,5,7], we thought that the introgressed regions may have influenced the accuracy of the deduced EPS, and further analyses should try to avoid using this kind of sample.

4.3. Introgression

The introgression from H. nobilis to HM2 was assessed. Generally, the two species seldom hybridize in their native regions. However, based on whole-genome resequencing data, a few introgressions from H. molitrix to H. nobilis have been identified [10]. In sharp contrast, in the Mississippi and Illinois rivers in North America, a high rate of hybridization between the fish species was detected [2,5]. Interspecific hybridization between the two species may occur because a more homogeneous environment in the exotic locations breaks down their reproductive isolation barriers [2]. Previous studies verified their hybridization based on both morphological traits and a few nuclear markers [2]. Using the ABBA-BABA test [57,83], this hybridization was further directly confirmed by whole-genome data. Further estimation of the fraction of gene flow [58] showed that 5.8% of the H. molitrix genome may have originated from H. nobilis through introgression (Fig. 4). Scrutinizing the possible introgressed sites showed that the introgression had no obvious bias toward specific chromosomes (data not shown). However, the H. nobilis individuals collected from North America may have introgressed regions from H. molitrix, and only one individual was used for each of the lineages, which may have resulted in the omission of some informative sites. Thus, the present conclusion may be biased, and further investigation should use samples of more individuals chosen purposefully.

To date, high-quality genome sequences of Cyprinidae have been scarce. This may hinder the evolutionary and comparative genomics analyses of the large family. The chromosomal-level genome sequences presented here may slightly improve the situation. Moreover, the high-quality assembly may provide a better reference for studies of the invasion biology of the species in North America and many other places.

Supplementary data to this article can be found online at https://doi.org/10.1016/j.ygeno.2021.06.024.

Author contributions

Yi Zhou, Weiling Qin and Huan Zhong conceived and managed the project; Yi Zhou and Hong Zhang collected the samples; Weiling Qin and Huan Zhong performed the bioinformatic analysis; Yi Zhou, Huan Zhong and Luojing Zhou wrote the manuscript; Weiling Qin and Hong Zhang revised the manuscript. All authors commented on the manuscript.

Code availability

No specific code was used in this work.

Data accessibility

Raw reads generated in the present study are deposited in the NCBI SRA database under accession no. PRJNA631443. The genome model files and functional annotation files are available on Figshare at https://doi.org/10.6084/m9.figshare.12618884.v1.

Declaration of Competing Interest

All authors declare that they have no competing interests.

Acknowledgement

This work was supported by the National Natural Science Foundation of China (grant no. 31672627, 31760756), the Natural Science Foundation of Guangxi (grant no. 2017GXNSFFA198001), the Hunan Provincial Key Laboratory of Nutrition and Quality Control of Aquatic Animals (No. 2018TP1027) and the Guangxi Key Laboratory of Beibu Gulf Marine Biodiversity Conservation (No. 2020KB02).

References

[1] J.S. Nelson, Fishes of the World, the Fourth Edition, John Wiley & Sons, Inc., Hoboken, New Jersey, 2006.
[2] G. Lu, C. Wang, J. Zhao, X. Liao, J. Wang, M. Luo, L. Zhu, L. Bernatchez, S. Li, Evolution and Genetics of Bighead and Silver Carps: Native Population Conservation Versus Invasive Species Control, Evolutionary Applications, n/a, 2020.
[3] X. Zhang, P. Xie, X. Huang, A review of nontraditional biomanipulation, Sci. World J. 8 (2008) 1184–1196.
[4] C.S. Kolar, D.C. Chapman, W.R. Courtenay Jr., C.M. Housel, J.D. Williams, D.P. Jennings, Bigheaded carps: a biological synopsis and environmental risk assessment, Bethesda (Maryland): Am. Fish. Soc. 33 (2007).
[5] J.T. Lamer, C.R. Dolan, J.L. Petersen, J.H. Chick, J.M. Epifanio, Introgressive hybridization between bighead carp and silver carp in the Mississippi and Illinois Rivers, N. Am. J. Fish Manag. 30 (2010) 1452–1461.
[6] C.A. Stepien, M.R. Snyder, A.E. Elz, Invasion genetics of the silver carp Hypophthalmichthys molitrix across North America: differentiation of fronts, introgression, and eDNA metabarcode detection, PLoS One 14 (2019), e0203012.
[7] J. Wang, S. Gaughan, J.T. Lamer, C. Deng, W. Hu, M. Wachholtz, S. Qin, H. Nie, X. Liao, Q. Ling, W. Li, L. Zhu, L. Bernatchez, C. Wang, G. Lu, Resolving the genetic paradox of invasions: preadapted genomes and postintroduction hybridization of bigheaded carps in the Mississippi River basin, Evol. Appl. 13 (2020) 263–277.
[8] L. Pearson, T. Mihali, M. Moffitt, R. Kellmann, B. Neilan, On the chemistry, toxicology and genetics of the cyanobacterial toxins, microcystin, nodularin, saxitoxin and cylindrospermopsin, Mar. Drugs 8 (2010) 1650–1680.
[9] E.M. Jochimsen, W.W. Carmichael, J.S. An, D.M. Cardo, S.T. Cookson, C.E. Holmes, M.B. Antunes, D.A. de Melo Filho, T.M. Lyra, V.S. Barreto, S.M. Azevedo, W.R. Jarvis, Liver failure and death after exposure to microcystins at a hemodialysis center in Brazil, N. Engl. J. Med. 338 (1998) 873–878.
[10] J. Jian, L. Yang, X. Gan, B. Wu, L. Gao, H. Zeng, X. Wang, Z. Liang, Y. Wang, L. Fang, J. Li, S. Jiang, K. Du, B. Fu, M. Bai, M. Chen, X. Fang, H. Liu, S. He, Whole genome sequencing of silver carp (Hypophthalmichthys molitrix) and bighead carp (Hypophthalmichthys nobilis) provide novel insights into their evolution and speciation, Mol. Ecol. Resour. 21 (2020) 912–923.
[11] E. Lieberman-Aiden, N.L. van Berkum, L. Williams, M. Imakaev, T. Ragoczy, A. Telling, I. Amit, B.R. Lajoie, P.J. Sabo, M.O. Dorschner, R. Sandstrom, B. Bernstein, M.A. Bender, M. Groudine, A. Gnirke, J. Stamatoyannopoulos, L.A. Mirny, E.S. Lander, J. Dekker, Comprehensive mapping of long-range interactions reveals folding principles of the human genome, Science 326 (2009) 289–293.
[12] S.S. Rao, M.H. Huntley, N.C. Durand, E.K. Stamenova, I.D. Bochkov, J.T. Robinson, A.L. Sanborn, I. Machol, A.D. Omer, E.S. Lander, E.L. Aiden, A 3D map of the human genome at kilobase resolution reveals principles of chromatin looping, Cell 159 (2014) 1665–1680.
[13] S. Chen, Y. Zhou, Y. Chen, J. Gu, Fastp: an ultra-fast all-in-one FASTQ preprocessor, Bioinformatics 34 (2018) i884–i890.
[14] G. Marcais, C. Kingsford, A fast, lock-free approach for efficient parallel counting of occurrences of k-mers, Bioinformatics 27 (2011) 764–770.
[15] T.R. Ranallo-Benavidez, K.S. Jaron, M.C. Schatz, GenomeScope 2.0 and Smudgeplot for reference-free profiling of polyploid genomes, Nat. Commun. 11 (2020) 1432.
[16] C.S. Chin, P. Peluso, F.J. Sedlazeck, M. Nattestad, G.T. Concepcion, A. Clum, C. Dunn, R. O’Malley, R. Figueroa-Balderas, A. Morales-Cruz, G.R. Cramer, M. Delledonne, C. Luo, J.R. Ecker, D. Cantu, D.R. Rank, M.C. Schatz, Phased diploid genome assembly with single-molecule real-time sequencing, Nat. Methods 13 (2016) 1050–1054.
[17] M.J. Chaisson, G. Tesler, Mapping single molecule sequencing reads using basic local alignment with successive refinement (BLASR): application and theory, BMC Bioinforma. 13 (2012) 238.
[18] B.J. Walker, T. Abeel, T. Shea, M. Priest, A. Abouelliel, S. Sakthikumar, C.A. Cuomo, Q. Zeng, J. Wortman, S.K. Young, A.M. Earl, Pilon: an integrated tool for comprehensive microbial variant detection and genome assembly improvement, PLoS One 9 (2014), e112963.
[19] H. Li, Aligning sequence reads, clone sequences and assembly contigs with BWA-MEM, arXiv (2013), 1303.3997.
[20] H. Li, Minimap2: pairwise alignment for nucleotide sequences, Bioinformatics 34 (2018) 3094–3100.
[21] M.J. Roach, S.A. Schmidt, A.R. Borneman, Purge haplotigs: allelic contig reassignment for third-gen diploid genome assemblies, BMC Bioinforma. 19 (2018) 460.
[22] B. Langmead, S.L. Salzberg, Fast gapped-read alignment with Bowtie 2, Nat. Methods 9 (2012) 357–359.
[23] J.N. Burton, A. Adey, R.P. Patwardhan, R. Qiu, J.O. Kitzman, J. Shendure, Chromosome-scale scaffolding of de novo genome assemblies based on chromatin interactions, Nat. Biotechnol. 31 (2013) 1119–1125.
[24] A. Varasteh, M. Hossienzadeh Mogaddam, M. Pourkazemi, M.R. Norooz Fashkhami, Karyotyping and number of chromosomes of silver carp (Hypophthalmichthys molitrix), Iran. Fish. Sci. J. 11 (2002) 107–115.
[25] M. Seppey, M. Manni, E.M. Zdobnov, BUSCO: assessing genome assembly and annotation completeness, Methods Mol. Biol. 1962 (2019) 227–245.
[26] D. Kim, B. Langmead, S.L. Salzberg, HISAT: a fast spliced aligner with low memory requirements, Nat. Methods 12 (2015) 357–360.
[27] Y. Wang, H. Tang, J.D. Debarry, X. Tan, J. Li, X. Wang, T.H. Lee, H. Jin, B. Marler, H. Guo, J.C. Kissinger, A.H. Paterson, MCScanX: a toolkit for detection and evolutionary analysis of gene synteny and collinearity, Nucleic Acids Res. 40 (2012), e49.
[28] M. Krzywinski, J. Schein, I. Birol, J. Connors, R. Gascoyne, D. Horsman, S.J. Jones, M.A. Marra, Circos: an information aesthetic for comparative genomics, Genome Res. 19 (2009) 1639–1645.
[29] S. Beier, T. Thiel, T. Munch, U. Scholz, M. Mascher, MISA-web: a web server for microsatellite prediction, Bioinformatics 33 (2017) 2583–2585.
[30] G. Benson, Tandem repeats finder: a program to analyze DNA sequences, Nucleic Acids Res. 27 (1999) 573–580.
[31] S. Ou, N. Jiang, LTR_retriever: a highly accurate and sensitive program for identification of long terminal repeat retrotransposons, Plant Physiol. 176 (2018) 1410–1422.
[32] Z. Bao, S.R. Eddy, Automated de novo identification of repeat sequence families in sequenced genomes, Genome Res. 12 (2002) 1269–1276.
[33] A.L. Price, N.C. Jones, P.A. Pevzner, De novo identification of repeat families in large genomes, Bioinformatics 21 (Suppl. 1) (2005) i351–i358.
[34] W. Bao, K.K. Kojima, O. Kohany, Repbase update, a database of repetitive elements in eukaryotic genomes, Mob. DNA 6 (2015) 11.
[35] M.G. Grabherr, B.J. Haas, M. Yassour, J.Z. Levin, D.A. Thompson, I. Amit, X. Adiconis, L. Fan, R. Raychowdhury, Q. Zeng, Z. Chen, E. Mauceli, N. Hacohen, A. Gnirke, N. Rhind, F. di Palma, B.W. Birren, C. Nusbaum, K. Lindblad-Toh, N. Friedman, A. Regev, Full-length transcriptome assembly from RNA-Seq data without a reference genome, Nat. Biotechnol. 29 (2011) 644–652.
[36] B.J. Haas, A. Papanicolaou, M. Yassour, M. Grabherr, P.D. Blood, J. Bowden, M.B. Couger, D. Eccles, B. Li, M. Lieber, M.D. MacManes, M. Ott, J. Orvis, N. Pochet, F. Strozzi, N. Weeks, R. Westerman, T. William, C.N. Dewey, R. Henschel, R.D. LeDuc, N. Friedman, A. Regev, De novo transcript sequence reconstruction from RNA-seq using the trinity platform for reference generation and analysis, Nat. Protoc. 8 (2013) 1494–1512.
[37] B.J. Haas, S.L. Salzberg, W. Zhu, M. Pertea, J.E. Allen, J. Orvis, O. White, C.R. Buell, J.R. Wortman, Automated eukaryotic gene structure annotation using EVidenceModeler and the program to assemble spliced alignments, Genome Biol. 9 (2008) R7.
[38] M. Stanke, B. Morgenstern, AUGUSTUS: a web server for gene prediction in eukaryotes that allows user-defined constraints, Nucleic Acids Res. 33 (2005) W465–W467.
[39] S.L. Salzberg, A.L. Delcher, S. Kasif, O. White, Microbial gene identification using interpolated Markov models, Nucleic Acids Res. 26 (1998) 544–548.
[40] I. Korf, Gene finding in novel genomes, BMC Bioinforma. 5 (2004) 59.
[41] R. She, J.S. Chu, K. Wang, J. Pei, N. Chen, GenBlastA: enabling BLAST to identify homologous gene sequences, Genome Res. 19 (2009) 143–149.
[42] E. Birney, M. Clamp, R. Durbin, GeneWise and genomewise, Genome Res. 14 (2004) 988–995.
[43] C. Camacho, G. Coulouris, V. Avagyan, N. Ma, J. Papadopoulos, K. Bealer, T.L. Madden, BLAST+: architecture and applications, BMC Bioinforma. 10 (2009) 421.
[44] E.V. Koonin, N.D. Fedorova, J.D. Jackson, A.R. Jacobs, D.M. Krylov, K.S. Makarova, R. Mazumder, S.L. Mekhedov, A.N. Nikolskaya, B.S. Rao, I.B. Rogozin, S. Smirnov, A.V. Sorokin, A.V. Sverdlov, S. Vasudevan, Y.I. Wolf, J.J. Yin, D.A. Natale, A comprehensive evolutionary classification of proteins encoded in complete eukaryotic genomes, Genome Biol. 5 (2004) R7.
[45] M. Kanehisa, Y. Sato, M. Kawashima, M. Furumichi, M. Tanabe, KEGG as a reference resource for gene and protein annotation, Nucleic Acids Res. 44 (2015) D457–D462.
[46] S. Hunter, R. Apweiler, T.K. Attwood, A. Bairoch, A. Bateman, D. Binns, P. Bork, U. Das, L. Daugherty, L. Duquenne, R.D. Finn, J. Gough, D. Haft, N. Hulo, D. Kahn, E. Kelly, A. Laugraud, I. Letunic, D. Lonsdale, R. Lopez, M. Madera, J. Maslen, C. McAnulla, J. McDowall, J. Mistry, A. Mitchell, N. Mulder, D. Natale, C. Orengo, A.F. Quinn, J.D. Selengut, C.J.A. Sigrist, M. Thimma, P.D. Thomas, F. Valentin, D. Wilson, C.H. Wu, C. Yeats, InterPro: the integrative protein signature database, Nucleic Acids Res. 37 (2009) D211–D215.
[47] P. Jones, D. Binns, H.Y. Chang, M. Fraser, W. Li, C. McAnulla, H. McWilliam, J. Maslen, A. Mitchell, G. Nuka, S. Pesseat, A.F. Quinn, A. Sangrador-Vegas, M. Scheremetjew, S.Y. Yong, R. Lopez, S. Hunter, InterProScan 5: genome-scale protein function classification, Bioinformatics 30 (2014) 1236–1240.
[48] Z. Yang, PAML: a program package for phylogenetic analysis by maximum likelihood, Comput. Appl. Biosci. 13 (1997) 555–556.
[49] A. Sahm, M. Bens, M. Platzer, K. Szafranski, PosiGene: automated and easy-to-use pipeline for genome-wide detection of positively selected genes, Nucleic Acids Res. 45 (2017), e100.
[50] Y. Benjamini, Y. Hochberg, Controlling the false discovery rate: a practical and powerful approach to multiple testing, J. R. Stat. Soc. Ser. B Methodol. 57 (1995) 289–330.
[51] D. Szklarczyk, A.L. Gable, D. Lyon, A. Junge, S. Wyder, J. Huerta-Cepas, M. Simonovic, N.T. Doncheva, J.H. Morris, P. Bork, L.J. Jensen, C.V. Mering, STRING v11: protein-protein association networks with increased coverage, supporting functional discovery in genome-wide experimental datasets, Nucleic Acids Res. 47 (2019) D607–D613.
[52] A. McKenna, M. Hanna, E. Banks, A. Sivachenko, K. Cibulskis, A. Kernytsky, K. Garimella, D. Altshuler, S. Gabriel, M. Daly, M.A. DePristo, The genome analysis toolkit: a MapReduce framework for analyzing next-generation DNA sequencing data, Genome Res. 20 (2010) 1297–1303.
[53] H. Li, A statistical framework for SNP calling, mutation discovery, association mapping and population genetical parameter estimation from sequencing data, Bioinformatics 27 (2011) 2987–2993.
[54] H. Li, R. Durbin, Inference of human population history from individual whole-genome sequences, Nature 475 (2011) 493–496.
[55] H. Liu, C. Chen, Z. Gao, J. Min, Y. Gu, J. Jian, X. Jiang, H. Cai, I. Ebersberger, M. Xu, X. Zhang, J. Chen, W. Luo, B. Chen, J. Chen, H. Liu, J. Li, R. Lai, M. Bai, J. Wei, S. Yi, H. Wang, X. Cao, X. Zhou, Y. Zhao, K. Wei, R. Yang, B. Liu, S. Zhao, X. Fang, M. Schartl, X. Qian, W. Wang, The draft genome of blunt snout bream (Megalobrama amblycephala) reveals the development of intermuscular bone and adaptation to herbivorous diet, Gigascience 6 (2017) 1–13.
[56] M. Petr, B. Vernot, J. Kelso, Admixr: R package for reproducible analyses using ADMIXTOOLS, Bioinformatics 35 (2019) 3194–3195.
[57] N. Patterson, P. Moorjani, Y. Luo, S. Mallick, N. Rohland, Y. Zhan, T. Genschoreck, T. Webster, D. Reich, Ancient admixture in human history, Genetics 192 (2012) 1065–1093.
[58] Y. Zheng, A. Janke, Gene flow analysis method, the D-statistic, is robust in a wide parameter space, BMC Bioinforma. 19 (2018) 10.
[59] L. Yang, Y. Wang, T. Wang, S. Duan, Y. Dong, Y. Zhang, S. He, A chromosome-scale reference assembly of a Tibetan loach, Triplophysa siluroides, Front. Genet. 10 (2019) 991.
[60] X. Yang, H. Liu, Z. Ma, Y. Zou, M. Zou, Y. Mao, X. Li, H. Wang, T. Chen, W. Wang, R. Yang, Chromosome-level genome assembly of Triplophysa tibetana, a fish adapted to the harsh high-altitude environment of the Tibetan plateau, Mol. Ecol. Resour. 19 (2019) 1027–1036.
[61] Y. Wang, Y. Lu, Y. Zhang, Z. Ning, Y. Li, Q. Zhao, H. Lu, R. Huang, X. Xia, Q. Feng, X. Liang, K. Liu, L. Zhang, T. Lu, T. Huang, D. Fan, Q. Weng, C. Zhu, Y. Lu, W. Li, Z. Wen, C. Zhou, Q. Tian, X. Kang, M. Shi, W. Zhang, S. Jang, F. Du, S. He, L. Liao, Y. Li, B. Gui, H. He, Z. Ning, C. Yang, L. He, L. Luo, R. Yang, Q. Luo, X. Liu, S. Li, W. Huang, L. Xiao, H. Lin, B. Han, Z. Zhu, The draft genome of the grass carp (Ctenopharyngodon idellus) provides insights into its evolution and vegetarian adaptation, Nat. Genet. 47 (2015) 625–631.
[62] C.-P. Stelzer, M. Pichler, P. Stadler, A. Hatheuer, S. Riss, Within-population genome size variation is mediated by multiple genomic elements that segregate independently during meiosis, Genome Biol. Evol. 11 (2019) 3424–3435.
[63] I.Y. Massey, F. Yang, A mini review on microcystins and bacterial degradation, Toxins (Basel) 12 (2020).
[64] R.D. Welten, J.P. Meneely, C.T. Elliott, A comparative review of the effect of microcystin-LR on the proteome, Expo Health 12 (2020) 111–129.
[65] Y. Chuman, A. Bergman, T. Ueno, S. Saito, K. Sakaguchi, A.A. Alaiya, B. Franzén, T. Bergman, D. Arnott, G. Auer, E. Appella, H. Jörnvall, S. Linder, Napsin a, a member of the aspartic protease family, is abundantly expressed in normal lung and kidney tissue and is expressed in lung adenocarcinomas, FEBS Lett. 462 (1999) 129–134.
[66] Y.H. Kim, G. Haidl, M. Schaefer, U. Egner, A. Mandal, J.C. Herr, Compartmentalization of a unique ADP/ATP carrier protein SFEC (Sperm Flagellar Energy Carrier, AAC4) with glycolytic enzymes in the fibrous sheath of the human sperm flagellar principal piece, Dev. Biol. 302 (2007) 463–476.
[67] J. Koster, D. Geerts, B. Favre, L. Borradori, A. Sonnenberg, Analysis of the interactions between BP180, BP230, plectin and the integrin alpha6beta4 important for hemidesmosome assembly, J. Cell Sci. 116 (2003) 387–399.
[68] L. Polari, C.M. Alam, J.H. Nyström, T. Heikkilä, M. Tayyab, S. Baghestani, D.M. Toivola, Keratin intermediate filaments in the colon: guardians of epithelial homeostasis, Int. J. Biochem. Cell Biol. 129 (2020) 105878.
[69] R.K. Vadlamudi, C.J. Barnes, S. Rayala, F. Li, S. Balasenthil, S. Marcus, H.V. Goodson, A.A. Sahin, R. Kumar, p21-activated kinase 1 regulates microtubule dynamics by phosphorylating tubulin cofactor B, Mol. Cell. Biol. 25 (2005) 3726–3736.
[70] J. Shao, W.J. Welch, N.A. Diprospero, M.I. Diamond, Phosphorylation of profilin by ROCK1 regulates polyglutamine aggregation, Mol. Cell. Biol. 28 (2008) 5196–5208.
[71] S.A. Roberts, N. Strande, M.D. Burkhalter, C. Strom, J.M. Havener, P. Hasty, D.A. Ramsden, Ku is a 5’-dRP/AP lyase that excises nucleotide damage near broken ends, Nature 464 (2010) 1214–1217.
[72] T. Dobner, I. Wolf, T. Emrich, M. Lipp, Differentiation-specific expression of a novel G protein-coupled receptor from Burkitt’s lymphoma, Eur. J. Immunol. 22 (1992) 2795–2799.
[73] Y. Kobayashi, D. Miyamoto, M. Asada, M. Obinata, T. Osawa, Cloning and expression of human lymphotoxin mRNA derived from a human T cell hybridoma, J. Biochem. 100 (1986) 727–733.
[74] L. Cheng, H. Bu, J.A. Portillo, Y. Li, C.S. Subauste, S.S. Huang, T.S. Kern, F. Lin, Modulation of retinal Müller cells by complement receptor C5aR, Invest. Ophthalmol. Vis. Sci. 54 (2013) 8191–8198.
[75] R.L. Low, Mitochondrial endonuclease G function in apoptosis and mtDNA metabolism: a historical perspective, Mitochondrion 2 (2003) 225–236.
[76] R. Forster, A.E. Mattis, E. Kremmer, E. Wolf, G. Brem, M. Lipp, A putative chemokine receptor, BLR1, directs B cell migration to defined lymphoid organs and specific anatomic compartments of the spleen, Cell 87 (1996) 1037–1047.
[77] A. Demogines, J. Abraham, H. Choe, M. Farzan, S.L. Sawyer, Dual host-virus arms races shape an essential housekeeping protein, PLoS Biol. 11 (2013), e1001571.
[78] M. van Gent, K.M.J. Sparrer, M.U. Gack, TRIM proteins and their roles in antiviral host defenses, Annu. Rev. Virol. 5 (2018) 385–405.
[79] P. Tan, Y. Ye, L. He, J. Xie, J. Jing, G. Ma, H. Pan, L. Han, W. Han, Y. Zhou, TRIM59 promotes breast cancer motility by suppressing p62-selective autophagic degradation of PDCD10, PLoS Biol. 16 (2018), e3000051.
[80] D. Zheng, T. Yao, Uplifting of Tibetan Plateau with its environmental effects, Adv. Earth Sci. 21 (2006).
[81] X. Li, Z. Pan, X. Liu, Numerical simulation of influence of Tibetan plateau uplift on winter dust cycle in Asian arid regions, Environ. Earth Sci. 75 (2016) 601.
[82] H. Renssen, A. Mairesse, H. Goosse, P. Mathiot, O. Heiri, D.M. Roche, K.H. Nisancioglu, P.J. Valdes, Multiple causes of the younger Dryas cold period, Nat. Geosci. 8 (2015) 946–949.
[83] R.E. Green, J. Krause, A.W. Briggs, T. Maricic, U. Stenzel, M. Kircher, N. Patterson, H. Li, W. Zhai, M.H. Fritz, N.F. Hansen, E.Y. Durand, A.S. Malaspinas, J.D. Jensen, T. Marques-Bonet, C. Alkan, K. Prufer, M. Meyer, H.A. Burbano, J.M. Good, R. Schultz, A. Aximu-Petri, A. Butthof, B. Hober, B. Hoffner, M. Siegemund, A. Weihmann, C. Nusbaum, E.S. Lander, C. Russ, N. Novod, J. Affourtit, M. Egholm, C. Verna, P. Rudan, D. Brajkovic, Z. Kucan, I. Gusic, V.B. Doronichev, L.V. Golovanova, C. Lalueza-Fox, M. de la Rasilla, J. Fortea, A. Rosas, R.W. Schmitz, P.L.F. Johnson, E.E. Eichler, D. Falush, E. Birney, J.C. Mullikin, M. Slatkin, R. Nielsen, J. Kelso, M. Lachmann, D. Reich, S. Paabo, A draft sequence of the Neandertal genome, Science 328 (2010) 710–722.
