
Genomics 114 (2022) 110518

Contents lists available at ScienceDirect

Genomics
journal homepage: www.elsevier.com/locate/ygeno

Chromosome-level genome assembly of the Muscovy duck provides insight into fatty liver susceptibility
Ming-Min Xu a, b, 1, Li-Hong Gu c, 1, Wan-Yue Lv a, d, 1, Sheng-Chang Duan e, 1, Lian-Wei Li b, f,
Yuan Du e, Li-Zhi Lu g, Tao Zeng g, Zhuo-Cheng Hou h, Zhanshan Sam Ma b, f, Wei Chen i, j,
Adeniyi C. Adeola a, Jian-Lin Han k, l, Tie-Shan Xu m, *, Yang Dong i, j, **, Ya-Ping Zhang a, b, d, n, ***,
Min-Sheng Peng a, b, n, ***
a State Key Laboratory of Genetic Resources and Evolution & Yunnan Laboratory of Molecular Biology of Domestic Animals, Kunming Institute of Zoology, Chinese Academy of Sciences, Kunming 650223, China
b Kunming College of Life Science, University of Chinese Academy of Sciences, Kunming 650204, China
c Institute of Animal Science & Veterinary Medicine, Hainan Academy of Agricultural Sciences, Haikou 571100, China
d State Key Laboratory for Conservation and Utilization of Bio-Resources in Yunnan, Yunnan University, Kunming 650091, China
e Nowbio Biotechnology Company, Kunming 650201, China
f Computational Biology and Medical Ecology Lab, State Key Laboratory of Genetic Resources and Evolution, Kunming Institute of Zoology, Chinese Academy of Sciences, Kunming 650223, China
g Institute of Animal Husbandry and Veterinary Science, Zhejiang Academy of Agricultural Sciences, Hangzhou 310021, China
h National Engineering Laboratory for Animal Breeding and Key Laboratory of Animal Genetics, Breeding and Reproduction, MARA; College of Animal Science and Technology, China Agricultural University, Beijing 100193, China
i State Key Laboratory for Conservation and Utilization of Bio-Resources in Yunnan, Yunnan Agricultural University, Kunming 650201, China
j Key Laboratory for Agro-Biodiversity and Pest Control of Ministry of Education, Yunnan Agricultural University, Kunming 650201, China
k CAAS-ILRI Joint Laboratory on Livestock and Forage Genetic Resources, Institute of Animal Science, Chinese Academy of Agricultural Sciences (CAAS), Beijing 100193, China
l Livestock Genetics Program, International Livestock Research Institute (ILRI), Nairobi 00100, Kenya
m Tropical Crops Genetic Resources Research Institute, Chinese Academy of Tropical Agricultural Sciences, Haikou 571101, China
n KIZ-CUHK Joint Laboratory of Bioresources and Molecular Research in Common Diseases, Kunming Institute of Zoology, Chinese Academy of Sciences, Kunming 650223, China

ARTICLE INFO

Keywords: Fatty liver; Muscovy duck; Genome assembly; Evolutionary by-product; Accelerated CNEs

ABSTRACT

The Muscovy duck (Cairina moschata) is an economically important poultry species, which is susceptible to fatty liver. Thus, the Muscovy duck may serve as an excellent candidate animal model of non-alcoholic fatty liver disease. However, the mechanisms underlying fatty liver development in this species are poorly understood. In this study, we report a chromosome-level genome assembly of the Muscovy duck, with a contig N50 of 11.8 Mb and scaffold N50 of 83.16 Mb. The susceptibility of Muscovy duck to fatty liver was mainly attributed to weak lipid catabolism capabilities (fatty acid β-oxidation and lipolysis). Furthermore, conserved noncoding elements (CNEs) showing accelerated evolution contributed to fatty liver formation by down-regulating the expression of genes involved in hepatic lipid catabolism. We propose that the susceptibility of Muscovy duck to fatty liver is an evolutionary by-product. In conclusion, this study revealed the potential mechanisms underlying the susceptibility of Muscovy duck to fatty liver.

* Corresponding author.
** Correspondence to: Y. Dong, State Key Laboratory for Conservation and Utilization of Bio-Resources in Yunnan, Yunnan Agricultural University, Kunming 650201, China.
*** Correspondence to: Y.-P. Zhang and M.-S. Peng, State Key Laboratory of Genetic Resources and Evolution & Yunnan Laboratory of Molecular Biology of Domestic Animals, Kunming Institute of Zoology, Chinese Academy of Sciences, Kunming 650223, China.
E-mail addresses: xutieshan760412@163.com (T.-S. Xu), loyalyang@163.com (Y. Dong), zhangyp@mail.kiz.ac.cn (Y.-P. Zhang), pengminsheng@mail.kiz.ac.cn (M.-S. Peng).
1 These authors contributed equally to this work.

https://doi.org/10.1016/j.ygeno.2022.110518
Received 14 August 2022; Received in revised form 1 November 2022; Accepted 4 November 2022
Available online 5 November 2022
0888-7543/© 2022 Published by Elsevier Inc. This is an open access article under the CC BY-NC-ND license (http://creativecommons.org/licenses/by-nc-nd/4.0/).
M.-M. Xu et al. Genomics 114 (2022) 110518

1. Introduction

The Muscovy duck (Cairina moschata) is a tropical bird native to Central and South America [1]. The drake is characterized by a nuchal crest and fleshy caruncles distributed around the orbital region to the base of the bill [2]. Wild Muscovy ducks prefer to live in marshy forests, roosting in trees and nesting in tree hollows [3], and are highly opportunistic omnivores, feeding primarily on native vegetation and corn as well as small fish, reptiles, invertebrates, crustaceans, and insects [4,5]. The Muscovy duck is one of only two large birds (the other being the turkey) that were originally domesticated in the New World [6]. During the second voyage of Columbus, Muscovy ducks were brought back to Europe, with subsequent dispersal to Africa, Asia, and Australia [2,6]. As an economically important poultry breed, Muscovy ducks are now farmed worldwide for meat consumption due to their leanness, tenderness, and taste [7,8], and are an essential source of income in many rural communities, especially in developing countries in Africa [9,10].

The liver is a primary site of lipid metabolism in birds [11]. In waterfowl, non-pathological hepatic steatosis occurs spontaneously for energy storage prior to migration [12]. Muscovy and Peking ducks (Anas platyrhynchos) are two common domesticated waterfowl, which diverged from their common ancestor ~13.8 million years ago [13] and show marked differences in fatty liver susceptibility. Notably, Muscovy ducks are prone to develop severe fatty liver when overfed high-carbohydrate diets, while Peking ducks tend to develop mild fatty liver [14]. These differences in fatty liver susceptibility are mainly attributed to genetic differences between the two species [15,16]. However, the genetic mechanism underlying the susceptibility of Muscovy ducks to fatty liver remains unclear, hindered by the lack of a high-quality genome. Nevertheless, this unique susceptibility to fatty liver is also put to use: mule ducks produced from crossing male Muscovy ducks with female Peking ducks have replaced geese in foie gras production [17]. Moreover, due to the shared fatty liver phenotype, the Muscovy duck is proposed as a potential animal model for studying non-alcoholic fatty liver disease (NAFLD) in humans. With an estimated global prevalence of 25%, NAFLD is regarded as a hepatic manifestation of metabolic syndrome [18,19], and is the fastest growing cause of hepatocellular carcinoma, liver transplantation, and liver-related mortality [20,21]. Fatty liver is characterized by lipid accumulation and deposition in hepatocytes and can evolve into NAFLD [22]. Hence, understanding the susceptibility of Muscovy ducks to fatty liver may provide insight into the study of NAFLD in humans.

Herein, we de novo assembled a chromosome-level genome of the Muscovy duck. We performed comparative genomic and transcriptomic

Fig. 1. Overview of the Muscovy duck genome assembly. (a) Pipeline used for assembly of the Muscovy duck genome. (b) Hi-C heat map of Muscovy duck showing
chromosome interactions. (c, d) Dot plots of whole-genome alignment between Muscovy and Peking duck genomes show a high degree of synteny, including
macrochromosomes (c) and microchromosomes (d).


analyses to investigate the genetic mechanism underlying susceptibility to fatty liver in Muscovy duck, emphasizing the contribution of accelerated evolution of conserved noncoding elements (CNEs). This study provides novel insights into the genetic mechanism underlying fatty liver from an evolutionary medicine perspective and highlights the potential of the Muscovy duck as a model animal for studying NAFLD.

2. Results

2.1. Genome assembly and annotation

Genome assemblies with a high degree of completeness and contiguity are desirable for comparative genomic analyses. The recently published draft Muscovy duck genome was assembled exclusively by linked-read sequencing, a cost-effective approach at the expense of genome contiguity [13]. Therefore, we de novo assembled a Muscovy duck genome combining Oxford Nanopore long reads, Hi-C reads, linked-reads (10× Genomics), and Illumina short reads (Table S1). The assembled contigs were scaffolded into 40 pseudochromosomes (Figs. 1a, b and 2) according to a previous study [23], resulting in a chromosome-level genome assembly (KizCaiMos1.0) with a contig N50 of 11.81 Mb and scaffold N50 of 83.16 Mb. The contig N50 was ~37 times higher than that of the recently published assembly (ASM1810499v1) [13]. The KizCaiMos1.0 assembly showed excellent collinearity with both the chicken and duck genome references [24] (Fig. 1c, d, and S3). Syntenic analyses revealed misplacements and inversions in the recently published draft Muscovy duck genome [13], which we aligned to the KizCaiMos1.0 assembly (Fig. S4). Results showed that 94.3% of the Aves Benchmarking Universal Single-Copy Orthologs (BUSCO) could be aligned to the Muscovy duck genome (Table 1). Compared with the previous assemblies [13], the KizCaiMos1.0 genome exhibited higher continuity and completeness (Table 1) and was therefore used for subsequent analyses.

We performed de novo and homology-based gene prediction of the Muscovy duck genome (Fig. S5; Table S2). In total, 15,829 protein-coding genes were annotated in the KizCaiMos1.0 assembly. We

Fig. 2. Landscape of the Muscovy duck genome. Tracks (from outer to inner circles) indicate: (a) contigs, with gaps shown as blanks; (b) GC content (window size of 100 kb); (c) mobile element content (window size of 100 kb with a step size of 20 kb); (d, e) DNA methylation level (window size of 100 kb; darker colour indicates higher methylation), including 5mC (d) and 6mA (e); (f) gene density (number of genes per 100 kb); (g) gene expression level (FPKM; the highest expression among the 15 sequenced tissues is shown; darker colour indicates higher expression level); and (h, i) variant density (window size of 100 kb), including single nucleotide polymorphisms (SNPs) (h) and indels (i). The outer gray circle represents Muscovy duck chromosome length.
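
The windowed tracks in Fig. 2 can be illustrated with a minimal sketch of sliding-window GC content. This is not the authors' pipeline: the function name `windowed_gc` and the toy sequence are assumptions made for illustration, and the window and step arguments simply mirror the idea of "window size" and "step size" used in the caption (the real tracks use 100-kb windows).

```python
def windowed_gc(seq, window, step):
    """Return (start, GC fraction) for each window along a sequence.

    Ambiguous bases such as N are ignored; a window without any
    A/C/G/T base yields None instead of a fraction.
    """
    seq = seq.upper()
    results = []
    last_start = max(len(seq) - window, 0)
    for start in range(0, last_start + 1, step):
        chunk = seq[start:start + window]
        acgt = sum(chunk.count(base) for base in "ACGT")
        gc = chunk.count("G") + chunk.count("C")
        results.append((start, gc / acgt if acgt else None))
    return results


# Toy example with a 4-bp window and a 2-bp step (hypothetical sequence).
for start, fraction in windowed_gc("GGCCAATT", window=4, step=2):
    print(start, fraction)
```

Setting `step` equal to `window` gives non-overlapping windows, whereas a smaller step gives the overlapping layout described for the mobile element track.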


Table 1
Quality comparison for different versions of Muscovy duck assemblies.

| Assembly | Largest contig (Mb) a | N50 contigs (Mb) a | Largest scaffold (Mb) a | N50 scaffolds (Mb) a | "N" gaps (Mb) b | BUSCO assessment c |
|---|---|---|---|---|---|---|
| Cairina moschata (Muscovy duck) [CaiMos1.0] | 0.87 | 0.09 | 177.54 | 58.54 | 28.77 | C:94.5% [S:93.1%, D:1.4%], F:3.3%, M:2.2%, n:4915 |
| Cairina moschata (Muscovy duck) [ASM1810499v1] | 2.05 | 0.32 | 194.81 | 77.35 | 15.03 | C:93.7% [S:92.0%, D:1.7%], F:3.1%, M:3.2%, n:4915 |
| Cairina moschata (Muscovy duck) [KizCaiMos1.0] | 53.95 | 11.81 | 200.48 | 83.16 | 0.34 | C:94.3% [S:93.2%, D:1.1%], F:2.8%, M:2.9%, n:4915 |
| Anas platyrhynchos (mallard) [ZJU1.0] | 28.50 | 5.68 | 207.24 | 76.27 | 4.23 | C:94.3% [S:92.6%, D:1.7%], F:3.4%, M:2.3%, n:4915 |
| Gallus gallus (chicken) [GRCg6a] | 65.77 | 17.49 | 197.60 | 91.31 | 9.78 | C:91.1% [S:90.0%, D:1.1%], F:5.4%, M:3.5%, n:4915 |

a Statistics were calculated using the stats.sh script in BBMap (v.38.45).
b Sum of all "N" nucleotides present in the genome assembly.
c BUSCO assessment of genome assembly. C: Complete BUSCOs; S: Complete and single-copy BUSCOs; D: Complete and duplicated BUSCOs; F: Fragmented BUSCOs; M: Missing BUSCOs; n: Total BUSCO groups searched.
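
As a reading aid for the N50 columns of Table 1, the following minimal sketch computes N50 under its common definition: the length of the shortest sequence among the longest sequences that together cover at least half of the total assembly length. The function name `n50` and the toy lengths are assumptions for illustration. The published values were produced with BBMap's stats.sh, whose reporting conventions may differ in detail from this sketch.

```python
def n50(lengths):
    """Return the N50 of a collection of contig or scaffold lengths."""
    ordered = sorted(lengths, reverse=True)
    half_total = sum(ordered) / 2
    running = 0
    for length in ordered:
        running += length
        # The first length at which the running sum reaches half of the
        # total assembly size is the N50.
        if running >= half_total:
            return length
    raise ValueError("lengths must not be empty")


# Toy contig lengths (hypothetical, in arbitrary units).
print(n50([2, 3, 4, 5, 6]))  # 5

# Fold improvement in contig N50 using the values reported in Table 1
# (KizCaiMos1.0: 11.81 Mb; ASM1810499v1: 0.32 Mb).
print(round(11.81 / 0.32, 1))  # 36.9
```

The second calculation reproduces the roughly 37-fold difference in contig N50 between KizCaiMos1.0 and ASM1810499v1 stated in Section 2.1.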

further annotated non-coding RNA genes, including 176 microRNAs (miRNAs), 270 ribosomal RNAs (rRNAs), 227 small-nucleolar RNAs (snRNAs), and 360 transfer RNAs (tRNAs) (Table S3). Repeat sequences were also identified (Table S4). Transposable elements comprised 10.3% of the genome, 5% of which were long interspersed nuclear elements. In addition, C5-methylcytosine (5mC) and N6-methyldeoxyadenosine (6mA) sites, two important types of DNA methylation, were first detected in the genome by decoding the raw electric signal during Nanopore sequencing. The average levels of genome methylation for 5mC and 6mA were 10.73% and 0.076%, respectively (Figs. 2 and S6).

2.2. Genomic selection and expansion of gene families in Muscovy ducks

To systematically investigate the genomic variations underlying fatty liver susceptibility in Muscovy ducks, we conducted comparative genomic analysis of 10 species (Fig. S7). No positively selected genes (PSGs) and only five rapidly evolving genes (REGs) were retained according to the false discovery rate (FDR)-adjusted P-value <0.05. Thus, we used the threshold of P-value <0.05 (chi-square test without FDR correction) and identified 38 putative PSGs and 190 putative REGs (Tables S5 and S6). However, only a few PSG and REG hits were found in lipid metabolism items catalogued in the Reactome database (Fig. S8). To some extent, this implies only weak signals of selection on the coding regions of lipid metabolism genes in the Muscovy duck. Analysis of gene family evolution detected 170 expanded and 1396 contracted gene families in the Muscovy duck genome (Fig. S9a). Functional enrichment analysis revealed that the expanded gene families were enriched in carbohydrate digestion and absorption (Fig. 3a; Table S7), suggesting improved carbohydrate digestion and absorption capacity for adaptation to a plant-based diet. However, functional enrichment analysis of the rapidly expanded gene families specific to Muscovy ducks (Fig. S9b; Table S8) failed to detect any enriched signals directly related to lipid metabolism. Taken together, these findings suggest that selection on coding regions and expanded gene families is unlikely to be the major evolutionary force driving susceptibility to fatty liver in Muscovy ducks.

2.3. Evolution of CNEs in Muscovy ducks

CNEs are conserved DNA regions that evolved under purifying selection and do not encode proteins [25–27]. CNEs act as cis-regulatory elements, including enhancers, insulators, and repressors, to regulate the expression of target genes [28,29]. Divergence and loss of CNEs during evolution account for a large portion of phenotypic diversity among species [27,30–32]. Here, we conducted a genome-wide screening of CNEs that were specifically lost or underwent rapid evolution in the Muscovy duck. We identified 272,251 CNEs from whole-genome alignment of 10 species. Among these CNEs, 669 were considered lost during the Muscovy duck evolution. Interestingly, a CNE located in an intronic region of CDH13 was specifically lost in the Muscovy duck (Fig. 3b). The CDH13 gene encodes cadherin 13, a vascular adiponectin receptor [33]. Furthermore, in humans, CDH13 is reported to influence adiponectin levels in different ethnic populations [34–36]. Adiponectin is an adipokine secreted by adipocytes and can promote fatty acid β-oxidation [37,38]. Thus, we speculated that the loss of the CNE in CDH13 may alter fatty acid β-oxidation in the Muscovy duck by regulating adiponectin levels. We also identified 1379 CNEs showing accelerated evolution specific to the Muscovy duck lineage (hereafter referred to as Muscovy-accelerated CNEs; Table S9). The typical cases of these Muscovy-accelerated CNEs will be elucidated later through combination of other analyses.

2.4. Distinctive lipid metabolism in Muscovy ducks

Given their different fatty liver susceptibility, we considered that the Muscovy duck may be characterized by a unique lipid metabolism profile compared to the Peking duck. To investigate differences in lipid metabolism between these species, we retrieved publicly available transcriptomic data of liver tissues from both ducks, fed under two conditions (ad libitum vs. overfeeding) [14]. The downloaded RNA sequencing (RNA-seq) data were used in the following analyses. Inter-specific transcriptomic analysis was performed to characterize the hepatic gene expression profiles of the Muscovy duck, using Peking duck as a reference, under the two feeding conditions (ad libitum vs. overfeeding), based on 13,417 orthologous genes (Table S10).

Overall, we identified 3780 differentially expressed genes (DEGs) between the two species independent of feeding condition (Fig. S10; Table S11). The significant differences in gene expression were mainly ascribed to genetic divergences between the two duck species. Among the DEGs, 2183 (58%) were lowly expressed (hereafter referred to as Muscovy-low DEGs; Table S12) and 1597 were highly expressed (Muscovy-high DEGs; Table S13) in the Muscovy duck liver compared to the Peking duck liver. However, no functional enrichments related to lipid or fatty acid metabolism were detected in the Muscovy-high DEGs (Table S14). In contrast, functional enrichment analysis revealed that the Muscovy-low DEGs were significantly enriched in fatty acid biosynthesis (Fig. S11; Tables S15 and S16). In addition, several Muscovy-low DEGs were functionally associated with lipid catabolism (fatty acid β-oxidation and lipolysis) (Figs. 4a and S12).

Gene set enrichment analysis (GSEA) revealed that fatty acid biosynthesis was up-regulated in Peking ducks, while fatty acid β-oxidation was down-regulated in Muscovy ducks in response to overfeeding (Fig. 4b and c; Tables S17 and S18). Weighted gene coexpression network analysis (WGCNA) was applied to characterize the transcriptomes of the two species. We identified seven modules with gene sizes ranging from 117 to 3405, among which the brown module


[Figure 3 graphic. (a) Dot plot of enriched terms; x-axis: GeneRatio (0.025–0.125); point colour: p.adjust (0.01–0.04); point size: Count (10–30). Terms shown: Biosynthesis of secondary metabolites; Neutrophil extracellular trap formation; Shigellosis; Human papillomavirus infection; Systemic lupus erythematosus; Focal adhesion; cAMP signaling pathway; Biosynthesis of amino acids; Lysine degradation; Amoebiasis; Arginine and proline metabolism; Alcoholism; Adrenergic signaling in cardiomyocytes; Transcriptional misregulation in cancer; Toxoplasmosis; Phospholipase D signaling pathway; Small cell lung cancer; Herpes simplex virus 1 infection; ECM−receptor interaction; Necroptosis; Dopaminergic synapse; Glucagon signaling pathway; Aldosterone synthesis and secretion; Glioma; Insulin secretion; Phototransduction−fly; Carbohydrate digestion and absorption. (b) CDH13 (chr11:16,261,486-16,271,486); y-axis: nucleotide sequence similarity to the chicken counterpart (50–100%); tracks: Muscovy duck, Peking duck, Goose, Turkey, Guineafowl, Collared flycatcher, Zebra finch, Emu, Alligator.]

Fig. 3. Comparative genomic analysis. (a) Functional enrichment analysis of expanded gene families in Muscovy duck genome. (b) VISTA sequence conservation plot
of specific CNE loss in Muscovy duck. Using the chicken genome as a reference, sequence alignment for nine species was performed to predict conserved regions with
≥80% identity and ≥ 50 bp window size. Exonic region is indicated by blue shadowed column. Lost CNE was located in the 7th intronic region of the CDH13 gene
indicated by gray shadowed column. (For interpretation of the references to colour in this figure legend, the reader is referred to the web version of this article.)
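
The conservation criterion quoted in the caption (≥80% identity over a ≥50 bp window) can be sketched as a sliding-window scan over a pairwise alignment. This is a simplified illustration and not the VISTA implementation: the function name `conserved_windows`, the use of "-" for gaps, and the merging of overlapping windows are all assumptions.

```python
def conserved_windows(ref, query, window=50, min_identity=0.8):
    """Return merged (start, end) alignment intervals that pass the
    identity threshold. Both inputs must be aligned, equal-length strings;
    gap characters ("-") never count as matches."""
    if len(ref) != len(query):
        raise ValueError("aligned sequences must have equal length")
    matches = [
        int(a == b and a != "-")
        for a, b in zip(ref.upper(), query.upper())
    ]
    regions = []
    running = sum(matches[:window])
    for start in range(0, len(matches) - window + 1):
        if start > 0:
            # Slide the window by one column: add the new column, drop the old.
            running += matches[start + window - 1] - matches[start - 1]
        if running / window >= min_identity:
            end = start + window
            if regions and start <= regions[-1][1]:
                regions[-1] = (regions[-1][0], end)  # extend previous region
            else:
                regions.append((start, end))
    return regions


# Toy alignment with a 4-column window and a 75% threshold (hypothetical).
print(conserved_windows("ACGTACGTAC", "ACGTACGTTT", window=4, min_identity=0.75))
# A query consisting only of gaps, as for a lost element, yields no region.
print(conserved_windows("AAAAAAAA", "--------", window=4))
```

Under this simplification, an element that is absent from one species appears as a stretch without any qualifying window, which corresponds to the blank segment that the caption describes for the lost CNE in CDH13.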


[Figure 4 graphic. (a) Heat map; colour scale from 2 to −2; annotation bars: condition (fed ad libitum, overfed) and species (Peking duck, Muscovy duck); gene groups: fatty acid oxidation (HSD17B12, ABCD3, CROT, ACSL1, CPT1A, CDH13, ADIPOR2), lipolysis (ATGL, ABHD5, MGLL), and FOXOs (FOXO1, FOXO3, FOXO4, FOXO6). (b) Peking duck (fed ad libitum vs. overfed); gene sets: fatty-acyl-CoA biosynthetic process and unsaturated fatty acid biosynthetic process. (c) Muscovy duck (fed ad libitum vs. overfed); gene sets: fatty acid oxidation and fatty acid beta-oxidation. Axes in (b) and (c): Running Enrichment Score and Ranked List Metric against Rank in Ordered Dataset. (d) Module-trait relationships; colour scale from 1 to −1; values are correlation coefficients with P-values in brackets:]

| Module | Muscovy duck fed ad libitum | Muscovy duck overfed | Peking duck fed ad libitum | Peking duck overfed |
|---|---|---|---|---|
| MEgreen | −0.22 (0.2) | 0.86 (6e−12) | −0.7 (1e−06) | 0.055 (0.7) |
| MEyellow | −0.62 (4e−05) | 0.52 (9e−04) | −0.47 (0.003) | 0.57 (2e−04) |
| MEturquoise | 0.55 (4e−04) | 0.6 (7e−05) | −0.6 (6e−05) | −0.54 (5e−04) |
| MEblack | 0.41 (0.01) | 0.14 (0.4) | 0.26 (0.1) | −0.83 (1e−10) |
| MEred | 0.83 (2e−10) | −0.27 (0.1) | 0.11 (0.5) | −0.66 (7e−06) |
| MEblue | −0.53 (6e−04) | −0.61 (5e−05) | 0.6 (6e−05) | 0.54 (5e−04) |
| MEbrown | 0.33 (0.04) | −0.79 (4e−09) | 0.64 (1e−05) | −0.18 (0.3) |
| MEgrey | −0.57 (2e−04) | 0.52 (9e−04) | −0.51 (0.001) | 0.57 (2e−04) |



Fig. 4. Comparative transcriptomic analysis. (a) Heat map of gene expression for Muscovy-low DEGs. (b, c) Gene set enrichment analysis (GSEA) of lipid metabolism
pathway in livers of Peking (b) and Muscovy ducks (c). (d) Relationships between module and traits. Every cell contains the corresponding correlation coefficient and
P-value shown in brackets. Cell is colored by correlation value according to colour legend. (e) Gene expression network for brown module. Edges with weight > 0.1
are plotted. Circles in network represent genes. Hub genes are represented by solid circles and labelled by gene names. Red solid circles indicate hub genes also
characterized as Muscovy-low DEGs. (For interpretation of the references to colour in this figure legend, the reader is referred to the web version of this article.)
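
The module-trait relationships in panel (d) are correlation coefficients between a module summary profile and a sample grouping. A minimal sketch of that idea is given below, assuming a Pearson correlation against a 0/1 group indicator; the eigengene values are hypothetical and the WGCNA software itself is not reproduced here. The second part uses the coefficients printed in the "Muscovy duck overfed" column of panel (d) to show how the most negatively associated module is selected.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two sequences of equal length (at least 2 values)")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / sqrt(var_x * var_y)


# Hypothetical eigengene values for six samples (not taken from the study).
toy_eigengene = [0.4, 0.5, -0.6, -0.5, 0.1, 0.1]
# 1 marks samples in the group of interest, 0 marks all other samples.
toy_trait = [0, 0, 1, 1, 0, 0]
print(round(pearson(toy_eigengene, toy_trait), 2))

# Coefficients for the "Muscovy duck overfed" column as printed in panel (d).
muscovy_overfed = {
    "green": 0.86, "yellow": 0.52, "turquoise": 0.6, "black": 0.14,
    "red": -0.27, "blue": -0.61, "brown": -0.79, "grey": 0.52,
}
most_negative = min(muscovy_overfed, key=muscovy_overfed.get)
print(most_negative)  # brown
```

The selection step agrees with the text, which identifies the brown module as the one most negatively associated with overfed Muscovy ducks.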

was the most negatively associated with overfed Muscovy ducks with severe fatty liver (Fig. 4d). Genes in the brown module were enriched in fatty acid β-oxidation (Table S19). We detected 37 hub genes in the brown module, several of which functionally promote fatty acid β-oxidation (PPARGC1A [39], CROT [40], ACSL1 [41], ACOT13 [42], CPT1A [43], and ACADM [44]), with CPT1A, CROT, and ACSL1 also characterized as Muscovy-low DEGs (Figs. 4e and 6b; Table S20). These results suggest marked differences in fatty acid metabolism, especially in fatty acid synthesis and β-oxidation, between the two species. Compared to the Peking duck, the Muscovy duck showed insufficient fatty acid biosynthesis and β-oxidation abilities.

2.5. Rapidly evolving CNEs shape lipid metabolism in Muscovy ducks

To explore and compare the potential functions of Muscovy-accelerated CNEs of DEGs in Muscovy and Peking ducks, we collected a Muscovy duck liver sample to generate Hi-C sequencing, assay for transposase-accessible chromatin sequencing (ATAC-seq), and RNA-seq data (Table S1). Comparative genomic and transcriptomic analyses were performed using Hi-C, ATAC-seq, and RNA-seq data. We first generated a three-dimensional (3D) genome map of the Muscovy duck with a resolution of 3.35 kb. We identified 835 topologically associating domains (TADs) and 11,740 loops. Additionally, 107,367 ATAC-seq peaks and 10,254 commonly expressed genes (TPM ≥ 1) were detected in the Muscovy duck liver tissue.

In total, 90.14% (1243 of 1379) of the Muscovy-accelerated CNEs were associated with 7803 potential target genes by colocalization within the same TADs, of which 71.15% (5552 of 7803) were commonly expressed in the Muscovy duck liver (Table S21). A total of 1538 potential target genes were characterized as DEGs and commonly expressed in the Muscovy duck liver (Fig. 5a). These results indicate that the accelerated CNEs were more likely to account for the distinctive hepatic gene expression profiles in the Muscovy duck, treating CNEs as a proxy for cis-regulatory elements. Furthermore, 59.82% (920 of 1538) of potential target genes were Muscovy-low DEGs and were enriched in biological processes relevant to fatty acid synthesis and oxidation (Fig. 5b; Table S22). However, limited enrichment in fatty acid or lipid metabolism was detected for potential target genes characterized as Muscovy-high DEGs (Table S23).

For Muscovy-low DEGs, we identified a Muscovy-accelerated CNE located in the intronic region of MGLL (Fig. 5c, d, and e), a key gene that catalyzes the final step of lipolysis [45]. Mus.CNE.2447 was proximal to significant ATAC-seq peaks and interacted with the initial region of MGLL in the loop within the same TAD (Fig. 5f). The reporter assay results confirmed that Mus.CNE.2447 may exhibit enhancer activity, but the enhancer activity was weaker in the Muscovy duck than in the Peking duck (Fig. 5g), implying weaker lipolysis ability. Collectively, our findings suggest that the accelerated evolution of CNEs may have shaped the distinctive lipid metabolism of Muscovy ducks by down-regulating genes involved in fatty acid synthesis, fatty acid oxidation, and lipolysis. The rapidly evolving CNEs may account for the susceptibility to fatty liver in Muscovy ducks.

3. Discussion

3.1. Weak lipid catabolism contributes to fatty liver susceptibility in Muscovy ducks

Reduced fatty acid oxidation (especially β-oxidation) leads to excess lipid storage in the liver and hepatic steatosis [46–48]. Here, several hub genes in the brown module known to promote fatty acid oxidation (e.g., CROT, ACSL1, and CPT1A) were also characterized as Muscovy-low DEGs (Fig. 4a, e, and S12). For instance, the protein encoded by CPT1A transfers fatty acids into the mitochondria, which is a rate-limiting step in subsequent fatty acid β-oxidation [43]. Importantly, we also found that fatty acid β-oxidation was exclusively down-regulated in overfed Muscovy ducks (Fig. 4c), consistent with previous research showing low expression of fatty acid β-oxidation-related genes in the livers of overfed Muscovy ducks [14]. Hepatic steatosis develops when there is an imbalance between lipid storage and lipolysis in lipid droplets, resulting in triglyceride accumulation in hepatocytes [49]. The ATGL gene initiates the first step in lipolysis, with enzymatic activity activated by the protein product of CGI-58 (also known as ABHD5) [50,51]. The enzyme encoded by the MGLL gene catalyzes the last step in triglyceride hydrolysis [45] (Fig. 6b). In our study, ATGL, ABHD5, and MGLL were identified as Muscovy-low DEGs (Figs. 4a, 5e, and S12). In conclusion, reduced fatty acid β-oxidation increased intrahepatic fatty acid accumulation, and impaired lipolysis increased the storage of accumulated lipids in lipid droplets, resulting in greater lipid retention in the liver of Muscovy ducks (Fig. 6a).

3.2. Accelerated evolution of CNEs drives lipid metabolism in Muscovy ducks

Previous studies in birds have shown that CNEs are associated with developmental genes and that convergent evolution of CNEs likely contributed to loss of powered flight [32,52]. Our work provides additional evidence that accelerated CNEs also played a dominant role in shaping avian lipid metabolism. Notably, accelerated CNEs promoted fatty liver in Muscovy ducks by down-regulating a series of genes involved in fatty acid β-oxidation and lipolysis (Figs. 6b and S13). Interestingly, the CPT1A gene, a potential target gene of Muscovy-accelerated CNEs and characterized as a Muscovy-low DEG in the liver (Figs. 4a and S12), showed no expression differences between the two duck species in the muscle [53]. This suggests that accelerated CNEs shape avian lipid metabolism in a tissue-specific manner. In contrast, our results show only weak signals of selection on the coding regions of lipid metabolism genes (Tables S5 and S6; Fig. S8). This further supports the hypothesis that cis-regulatory mutations causing morphological mutations are dominant in long-term evolution, as compared to coding regions [54].

3.3. Fatty liver susceptibility is an evolutionary by-product in Muscovy ducks

Fatty acids, as a high-energy fuel, are generally preferred by migratory birds that require high energy expenditure for extended flight [11,55]. The Muscovy duck is non-migratory throughout its wild range [5,56]. In this species, reduced lipolysis leads to lower free fatty acid production, while attenuated fatty acid β-oxidation results in lower acyl-CoA for energy generation. Therefore, we hypothesize that Muscovy ducks require less fatty acid for energy consumption due to their non-migratory habits, in contrast to migratory mallards (Anas platyrhynchos, wild ancestor of Peking ducks) [57,58]. The digestion and absorption capabilities of animals typically reflect the main nutrients and energy sources (such as carbohydrates, proteins, and fats) in their diets [59–61]. In the wild, Muscovy ducks mainly feed on plant food, which is generally rich in carbohydrates [4,61]. In the current study,


Fig. 5. Integrations of comparative genomic and transcriptomic analyses. (a) Venn diagram showing overlap in DEGs between Peking and Muscovy ducks,
commonly expressed genes (TPM ≥ 1) in Muscovy duck liver, and potential target genes of Muscovy-accelerated CNEs. (b) Selected enriched GO terms relevant to
lipid metabolism for potential target genes of Muscovy-accelerated CNEs, which were also Muscovy-low DEGs and commonly expressed in Muscovy duck liver. (c)
VISTA visualization of Muscovy-accelerated CNE (Mus.CNE.2447). Using the chicken genome as a reference, sequence alignment for nine species was performed to
predict conserved regions with ≥80% identity and ≥ 50 bp window size. Mus.CNE.2447 was located in the 2nd intronic region of the MGLL gene, indicated by gray
shadowed column. (d) Sequence alignment for Mus.CNE.2447 among nine species. Sequences in each row represent CNEs for species in the same order as those in
VISTA visualization. (e) Expression of the MGLL gene between Peking and Muscovy ducks under two feeding conditions. Gene expression was measured by
normalized gene counts. P-values for comparisons were determined by Wilcoxon test. (f) Integrative map of Mus.CNE.2447 and MGLL locus. TAD and loop are shown
with triangles and red arcs, respectively. (g) Results of dual-luciferase reporter assays for Mus.CNE.2447. Significance was tested by Student's t-test and data are
represented as mean ± standard error of the mean (SEM). (For interpretation of the references to colour in this figure legend, the reader is referred to the web version
of this article.)
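
The expression filter mentioned in the caption (TPM ≥ 1) relies on the standard transcripts-per-million normalisation, which can be sketched as follows. The gene names, read counts, and gene lengths are hypothetical, and the function `tpm` illustrates the standard formula rather than the quantification software used in the study.

```python
def tpm(counts, lengths_bp):
    """Convert raw read counts to transcripts per million (TPM)."""
    # Reads per kilobase: normalise each count by gene length.
    rpk = {gene: counts[gene] / (lengths_bp[gene] / 1000) for gene in counts}
    total = sum(rpk.values())
    if total == 0:
        raise ValueError("all counts are zero; TPM is undefined")
    # Rescale so that the values of one sample sum to one million.
    return {gene: value / total * 1_000_000 for gene, value in rpk.items()}


# Hypothetical counts and gene lengths for a single sample.
toy_counts = {"geneA": 100, "geneB": 300, "geneC": 0}
toy_lengths = {"geneA": 1000, "geneB": 3000, "geneC": 2000}

values = tpm(toy_counts, toy_lengths)
expressed = sorted(gene for gene, value in values.items() if value >= 1)
print(expressed)  # ['geneA', 'geneB']
```

Because TPM values within a sample sum to one million, the same threshold can be applied across samples with different sequencing depths.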

gene families enriched in carbohydrate digestion and absorption were significantly expanded (Fig. 3a), suggesting that Muscovy ducks prefer carbohydrates to fatty acids as their major energy source, in part due to dietary adaptation. This appears to be a common strategy adopted by birds. For example, the ATGL protein, which catalyzes the first step in lipolysis, shows reduced hydrolytic activity in flight-degenerate bird species, which tend to utilize carbohydrates as their main energy source [62]. In our study, both the ATGL and MGLL genes were characterized as Muscovy-low DEGs (Figs. 4a and S12). Thus, we propose that the adaptation of Muscovy ducks to carbohydrate-dominated diets and non-migratory habits led to weak lipid catabolism, and that fatty liver susceptibility is an evolutionary by-product.

3.4. Muscovy ducks as a potential animal model of NAFLD

Muscovy ducks share several molecular characteristics of hepatic steatosis with human patients and genetically manipulated model animals. Notably, several genes characterized as Muscovy-low DEGs are associated with hepatic steatosis (Figs. 4a and S12). For example, CPT1A expression is significantly decreased (by 50%) in NAFLD patients [63] and liver-specific CPT1A knockout mice display severe hepatic steatosis [64]. Liver-specific ATGL or ABHD5 knockout mice show triglyceride accumulation and steatosis in the liver [51,65] and liver-specific FOXO1/3/4 triple knockout mice develop severe hepatic steatosis [66]. Muscovy ducks can also develop hepatic steatosis after only two weeks of overfeeding [17], which is helpful for experimental research. Thus, our study highlights the potential of Muscovy ducks as a model animal for NAFLD.

4. Conclusions

We presented a high-quality chromosome-level genome and high-resolution 3D genome map for the Muscovy duck, providing an important genetic resource for evolutionary and functional genomic studies in waterfowl. By integrating RNA-seq, Hi-C, and ATAC-seq data with the newly assembled genome, we revealed that the weak fatty acid β-oxidation and lipolysis ability of Muscovy ducks likely contributes to their fatty liver susceptibility. The Muscovy duck-specific accelerated CNEs down-regulated the expression levels of many hepatic genes related to fatty acid β-oxidation and lipolysis. Importantly, the susceptibility of Muscovy ducks to fatty liver may be an evolutionary by-product of their adaptation to a carbohydrate-dominated diet and non-migratory habit.

5. Materials and methods

5.1. Sample collection and sequencing

All animal experiments were approved by the Ethics and Experimental Animal Committee of the Kunming Institute of Zoology, Chinese Academy of Sciences, China (Approval ID: IACUC-OE-2021-06-001).

Genomic DNA was extracted from whole blood and muscle of a male Muscovy duck collected in Yunnan in southwestern China (Fig. S1a). For 10× Genomics data, sample-indexed and barcoded libraries were prepared using a Chromium Genome Reagent Kit. Paired-end libraries with insert sizes ranging from 500 to 700 bp were sequenced on the Illumina HiSeq X Ten platform. For genome sequencing, short-read data were obtained on the Illumina HiSeq X platform (400-bp insert size). Long-read data were sequenced on the Oxford Nanopore GridION platform. For Hi-C data, the Hi-C library was prepared following standard procedures and sequenced using the Illumina HiSeq X Ten platform (400-bp insert size).

For RNA-seq used in assembly annotation, we extracted RNA from a total of 15 tissue or organ samples from the same Muscovy duck, including the cerebellar vermis, cerebral cortex, mesencephalon, pons, striatum, eye, tongue, heart, kidney, liver, lung, spleen, stomach, rectum and testis, using TRIzol Reagent. Total RNA-seq libraries were prepared using a TruSeq RNA Library Prep Kit v2 according to the manufacturer's instructions and subsequently sequenced on the HiSeq 4000 platform.

We sampled liver tissues from another male Muscovy duck collected in Jilin in northeastern China to investigate regulation of gene expression profiles (Fig. S1b). For ATAC-seq, the libraries of three biological replicates were prepared according to standard protocols [67] and sequenced on the Illumina HiSeq 4000 platform. The Hi-C library was constructed, and sequencing data were obtained using the MGI DNBSEQ-T platform. The prepared RNA-seq library was sequenced on the MGI DNBSEQ-T platform according to the recommended protocols.

5.2. Genome de novo assembly

The Muscovy duck genome size was estimated based on k-mer analysis (Fig. S2). We followed the recommended protocols and used the Supernova software package (10× Genomics, CA) for de novo assembly of the Chromium-prepared sequencing linked-reads. Scaffolding of the produced contigs was then completed using fragScaff software.

Integrity of the primary assembly was further improved using a hybrid strategy, as reported in a previous study ([68] Fig. 1). In brief, DBG2OLC software [69] was used to perform hybrid assembly as follows: a) The DBG contigs were mapped to the long reads. Each long read was converted into a compressed read, an ordered list of contigs. b) The compressed reads were chained together based on the best calculated overlaps. c) The assembly backbone was constructed from the best overlaps. Finally, we aligned all long reads and 10× contigs to the backbone and calculated the most likely sequence as output using Sparc [70]. DBG2OLC was executed using the following command: DBG2OLC k 17 KmerCovTh 10 MinOverlap 150 AdaptiveTh 0.005 LD1 0 Contigs Contigs.fa RemoveChimera 1 f Nanopore.fasta; the consensus step was run with the following parameters: split_and_run_sparc.sh backbone.fasta DBG2OLC_Consensus_info.txt tg_na.fasta consensus_batch_dir 2 10.

The contigs generated by hybrid assembly were polished using NextPolish (v1.0.2) [71], with three rounds of alignment with long reads and three rounds of alignment with short reads. To generate an assembly with chromosome-length scaffolds, clean Hi-C reads were mapped to the draft assembly with Juicer (v1.5.6) [72], and a candidate chromosome-length assembly was then generated automatically using the 3D-DNA (v180114) [73] pipeline. Manual review and refinement of the candidate assembly was performed in Juicebox Assembly Tools [72] for quality control and interactive correction. The genome was then re-

M.-M. Xu et al. Genomics 114 (2022) 110518

Fig. 6. Formation of fatty liver in Muscovy duck. (a) Schema of metabolic causes of fatty liver of Muscovy duck. (b) Metabolic pathway of weak lipid catabolism in
Muscovy duck liver. Genes in red indicate Muscovy-low DEGs. FA: fatty acid; LFA: long-chain fatty acid; VLFA: very long-chain fatty acid; FA acyl-CoA: fatty acyl-
CoA; M/L FA acyl-CoA: medium/long-chain fatty acyl-CoA; MAG: monoacylglycerol; DAG: diacylglycerol; TAG: triacylglycerol; HSD17B12: hydroxysteroid 17-beta
dehydrogenase 12; ACSL1: acyl-CoA synthetase long chain family member 1; ACOT13: acyl-CoA thioesterase 13; ABCD3: ATP binding cassette subfamily D member
3; CROT: carnitine O-octanoyltransferase; CPT1A: carnitine palmitoyltransferase 1A; ACADM: acyl-CoA dehydrogenase medium chain; LDAH: lipid droplet associated
hydrolase; ATGL: adipose triacylglycerol lipase; MGLL: monoglyceride lipase; AKT3: AKT serine/threonine kinase 3; PGC-1α: peroxisome proliferator-activated
receptor-gamma coactivator 1 alpha; FOXO1: forkhead box O proteins 1; ADIPOR2: adiponectin receptor 2. (For interpretation of the references to colour in this
figure legend, the reader is referred to the web version of this article.)

assembled using 3D-DNA [73] according to manual adjustment. The chromosomes were anchored using the 3D-DNA and Juicebox workflow.

5.3. Genome annotation and evaluation

Transposable elements (TEs) in the genome were identified by a combination of de novo and homology-based approaches. For homology-based prediction, known repeats were identified using RepeatMasker (v4.0.9) [74] and RepeatProteinMask [74] against Repbase [75] (Repbase Release 20181026). RepeatMasker was applied for DNA-level identification using a custom library. At the protein level, RepeatProteinMask was used to perform an RMBLAST search against the TE protein database. For de novo prediction, RepeatModeler (http://repeatmasker.org/) and LTR FINDER (v1.07) [76] were used to identify de novo


evolved repeats inferred from the assembled genome.

For gene prediction and functional annotation, we employed EVidence Modeler (EVM, v1.1.0) [77] to consolidate RNA-seq data, protein alignments with ab initio gene predictions and homology-based annotations into a final gene set. For RNA-seq data, reads were cleaned with Trimmomatic (v0.32) [78] and aligned to the genome with Tophat2 (v2.1.1) [79]. Alignments were then assembled independently with Cufflinks (v2.0.0) [80] and de novo assembled with Trinity (vr20140413p1) [81]. RNA-seq assemblies were combined and further refined using PASA (v2.2.0) [77]. The ab initio predictions were performed using AUGUSTUS (v3.3.1) [82] with a chicken training set. Protein sequences of Anas platyrhynchos, Gallus gallus, Meleagris gallopavo, and Taeniopygia guttata were used in the homology-based gene annotation. In brief, tBLASTn was first performed to map protein sequences of the aforementioned species to the genome with the parameter “E-value = 1e-5”. For a single protein, all matching DNA sequences in the reference genome were concatenated by Solar after low-quality records were filtered. A protein-coding region was obtained by extending 2000 bp upstream and downstream of the concatenated sequence. GeneWise (v2.2.0) [83] was then used to predict gene structure within each protein-coding region. All lines of evidence were then fed into EVM using intuitive weighting (RNA-seq > protein > ab initio gene predictions). Finally, the EVM models were updated into PASA. Gene functions were assigned according to the best match alignment using BLASTP against the Swiss-Prot [84], TrEMBL and Kyoto Encyclopedia of Genes and Genomes (KEGG) [85] databases. InterProScan functional analysis and Gene Ontology (GO) IDs were obtained using InterProScan [86]. The pathway to which the gene may belong was derived from the matching genes in KEGG.

For non-coding gene annotation, tRNAscan-SE [87] was configured for eukaryotic tRNAs and used for tRNA annotation. We used the homology-based method to identify rRNA. The rRNA sequence data downloaded from the Rfam [88] database were used as a reference. INFERNAL [89] was used to identify snRNA and miRNA.

To evaluate genome quality, the completeness of the Muscovy duck assembly was assessed with BUSCO (v3.0.2) [90] against the avian lineage. The syntenic relationship between the Muscovy duck, chicken, and Peking duck genomes was analyzed using Mummer (v3.23) [91] with default parameters and visualized using gnuplot (v5.2.8) (http://www.gnuplot.info). The genome assembly statistics were calculated using the stats.sh script in BBMap (v38.45) (https://sourceforge.net/projects/bbmap/).

5.4. Phylogenetic tree construction

The Muscovy duck phylogenetic tree was constructed based on nine additional species (Anas platyrhynchos, Anser cygnoides, Gallus gallus, Meleagris gallopavo, Numida meleagris, Ficedula albicollis, Taeniopygia guttata, Dromaius novaehollandiae, and Alligator mississippiensis). Except for Anser cygnoides [92], Numida meleagris [93], and Gallus gallus (downloaded from ENSEMBL v104), the genomes of the other species were downloaded from the NCBI database. The single-copy orthologous genes of 10 species were identified by the reciprocal best hit (RBH) method. The coding sequences of each orthologous pair were aligned using prank (v170427) [94] and trimmed using Gblocks (v 0.91b) [95]. The filtered sequences were joined into supergenes and used as input in RAxML (v8.2.12) [96] to infer a maximum likelihood (ML) tree (Fig. S6), with parameters: -f a -m GTRGAMMA -p 2021 -x 20210808 -N 100 -T 10 -n ex.

5.5. Gene family evolution analysis

The protein sequences of the 10 species were clustered with OrthoFinder (v2.3.11) [97]. The r8s (v1.81) [98] program was used to obtain an ultrametric tree of the 10 species with fossil records from the TIMETREE website (http://www.timetree.org/). Expanded and contracted gene families were identified by CAFE (v4.2) [99]. Rapidly evolving gene families for each species were determined based on P-value ≤0.01. GO and KEGG enrichment analyses of expanded and contracted gene families were conducted using clusterProfiler (v 4.0.5) [100]. GO and KEGG items with P-value <0.05 and q-value <0.05 were regarded as significantly enriched.

5.6. Detection of gene evolution

We applied the same method described in a previous study [101] to detect gene evolution signals. Single-copy orthologous genes of the 10 species (same as used for phylogenetic tree construction) were extracted, aligned, and trimmed as described above. PSGs and REGs were analyzed using the Codeml program in the PAML package (v 4.9e) [102]. A positive selection signal on a gene within the Muscovy duck lineage was detected using the branch-site model. The branch model was used to detect fast evolution signals of genes. P-values were computed based on Chi-square statistics, and genes with P-values <0.05 were treated as candidates undergoing positive selection or rapid evolution. The PSGs and REGs were cross-referenced against the Reactome database [103] (Release 78, Pathway browser 3.7) based on seven metabolism-related categories, including lipid metabolism, carbohydrate metabolism, metabolism of amino acids and derivatives, pyruvate metabolism and citric acid cycle, metabolism of vitamins and cofactors, and integration of energy metabolism.

5.7. Identification of conserved noncoding elements (CNEs)

Pairwise whole genome alignments (WGAs) were constructed for the 10 species above using LAST (v983) [104] (parameter: -E0.05), with the chicken as a reference genome. The pairwise WGAs were merged and used to build multiple alignments with the roast program in Multiz (v11.2) [105].

To identify CNEs, fourfold degenerate sites were extracted from the whole-genome alignments and used to estimate a neutral model with the phylogenetic tree described above as input using phyloFit (v1.4) [106]. Conserved elements were detected using the phastCons (v1.4) package [107]. Conserved elements within 5 bp were merged into single elements and subsequently removed when their lengths were < 50 bp. After filtering misassembled CNEs that split into two or more discrete regions on the Muscovy duck genome, a final set of CNEs was obtained that did not intersect the coding regions of the annotated genes in the chicken genome.

Muscovy-accelerated CNEs were screened using two methods. First, phyloP (v1.4) was used to test each CNE for lineage-specific accelerated evolution in the Muscovy duck branch. CNEs showing false discovery rate (FDR)-adjusted P-values <0.05 were considered to show significantly accelerated evolution. Second, CNEs for each species were concatenated into super-sequences as input in PhyloAcc (v1.0) [108], a new Bayesian method to detect genomic regions with rate shifts and identify genomic elements accelerated in target species from a set of conserved elements. The accelerated CNEs in the Muscovy duck were obtained with practical thresholds [32]: BF ≥ 10 and BF2 ≥ 1. After removing misassembled CNEs, the Muscovy-accelerated CNEs were identified by the intersection of results of the two methods.

To identify CNEs specifically lost in the Muscovy duck, we searched the chicken CNEs against the genomes of the remaining species using BLASTN (v.2.9.0) (E-value <1e-10; ≥ 80% identity; ≥ 30% coverage). CNEs that lacked a significant match only in the Muscovy duck genome were considered lost. In addition, we used AVID (v 2.1) [109] to perform alignments between each reference and query sequence with the parameter “-nm = both”. The input sequences were prepared by extending 50 kb upstream and downstream of the genes. VISTA (v1.4.26) [110] was used to visualize and plot the Muscovy-accelerated CNEs and lost CNEs after alignment.
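As an illustration of the merging and length filter described in Section 5.7, the following is a minimal Python sketch, not the authors' pipeline. It assumes half-open intervals on a single scaffold and reads "within 5 bp" as a gap of at most 5 bp between neighbouring elements; the function name `merge_and_filter` and its parameter names are hypothetical.

```python
from typing import List, Tuple

Interval = Tuple[int, int]  # half-open [start, end) on one scaffold (assumption)


def merge_and_filter(elements: List[Interval],
                     max_gap: int = 5,
                     min_len: int = 50) -> List[Interval]:
    """Merge conserved elements separated by <= max_gap bp, then drop short ones."""
    merged: List[Interval] = []
    for start, end in sorted(elements):
        if merged and start - merged[-1][1] <= max_gap:
            # Close enough to the previous element: extend it.
            prev_start, prev_end = merged[-1]
            merged[-1] = (prev_start, max(prev_end, end))
        else:
            merged.append((start, end))
    # Elements shorter than min_len are removed, as stated in the text.
    return [(s, e) for s, e in merged if e - s >= min_len]


if __name__ == "__main__":
    # The first two elements are 3 bp apart and merge into one 60-bp element;
    # the isolated 20-bp element is discarded.
    print(merge_and_filter([(0, 30), (33, 60), (100, 120)]))
```

Merging before filtering matters here: two short neighbouring elements can jointly pass the 50-bp cutoff even though neither would pass alone.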

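The two-method screen for Muscovy-accelerated CNEs in Section 5.7 can be sketched in the same spirit. This is an illustration only: it assumes the FDR adjustment is the Benjamini-Hochberg procedure (the text says only "FDR-adjusted"), it takes per-CNE phyloP P-values and PhyloAcc Bayes factors as plain dictionaries, and the names `bh_adjust` and `accelerated_cnes` are hypothetical. Whether the Bayes factors are on a log scale is not stated in the text, so the thresholds are applied to the values as given.

```python
from typing import Dict, List, Set, Tuple


def bh_adjust(pvalues: List[float]) -> List[float]:
    """Benjamini-Hochberg adjusted P-values, returned in the input order."""
    n = len(pvalues)
    order = sorted(range(n), key=lambda i: pvalues[i])
    adjusted = [0.0] * n
    running_min = 1.0
    # Walk from the largest P-value down, enforcing monotonicity.
    for rank in range(n, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, pvalues[idx] * n / rank)
        adjusted[idx] = running_min
    return adjusted


def accelerated_cnes(phylop_p: Dict[str, float],
                     phyloacc_bf: Dict[str, Tuple[float, float]],
                     fdr: float = 0.05,
                     bf_min: float = 10.0,
                     bf2_min: float = 1.0) -> Set[str]:
    """Intersect CNEs passing the phyloP FDR cutoff with those passing BF/BF2."""
    ids = list(phylop_p)
    qvalues = dict(zip(ids, bh_adjust([phylop_p[i] for i in ids])))
    by_phylop = {i for i in ids if qvalues[i] < fdr}
    by_phyloacc = {i for i, (bf, bf2) in phyloacc_bf.items()
                   if bf >= bf_min and bf2 >= bf2_min}
    return by_phylop & by_phyloacc
```

Taking the intersection keeps only elements supported by both tests, which is the conservative choice the text describes.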

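The lost-CNE call in Section 5.7 can likewise be sketched from parsed BLASTN hits. This is a hedged illustration: it assumes "lost" means a significant match in every other queried species and none in the target, and that coverage is measured as the percentage of the query CNE covered. The `Hit` type, the function names, and the species labels used in any example input are hypothetical.

```python
from typing import Dict, Iterable, List, NamedTuple, Set


class Hit(NamedTuple):
    evalue: float
    identity: float  # percent identity of the alignment
    coverage: float  # percent of the query CNE covered (assumption)


def is_significant(hit: Hit) -> bool:
    """Thresholds from the text: E-value < 1e-10, >= 80% identity, >= 30% coverage."""
    return hit.evalue < 1e-10 and hit.identity >= 80.0 and hit.coverage >= 30.0


def lost_in_target(hits: Dict[str, Dict[str, List[Hit]]],
                   target: str,
                   species: Iterable[str]) -> Set[str]:
    """Return CNEs with a significant match in all other species but not in target."""
    others = set(species) - {target}
    lost: Set[str] = set()
    for cne, per_species in hits.items():
        matched = {sp for sp, sp_hits in per_species.items()
                   if any(is_significant(h) for h in sp_hits)}
        if target not in matched and others <= matched:
            lost.add(cne)
    return lost
```

Requiring matches in all other species is the strictest reading of the sentence; a looser reading would accept a CNE that is missing in the target and present in most, but not all, of the others.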
5.8. ATAC analysis

The data were trimmed and adapters were removed using Cutadapt (v3.5) [111] (parameters: -a CTGTCTCTTATA -A CTGTCTCTTATA -q 10 --trim-n --minimum-length 30). The clean data were aligned to the Muscovy duck genome, excluding mitogenome sequences, using bowtie2 (v2.3.5.1) [112] (parameters: --very-sensitive -X 2000 -p 5). Polymerase chain reaction (PCR) duplicated reads were filtered by Picard (https://broadinstitute.github.io/picard/, v1.131) and low-quality mapped reads were removed. SAMtools (v1.9) [113] was used to merge the aligned bam files of the three replicates. The merged bam files were fed into MACS2 (v2.2.7.1) [114] (parameters: -f BAMPE --nomodel --shift -100 --extsize 200 -B --SPMR --keep-dup all -g 1091823599) for narrow peak calling.

5.9. HiC analysis

The Hi-C data were processed using the HiC-Pro (v2.11.1) [115] pipeline to generate a contact matrix. TADs were called at 40-kb resolution using hicFindTADs wrapped in HiCExplorer (v3.6) [116] (parameters: --minDepth 300000 --maxDepth 3000000 --step 300000 --minBoundaryDistance 400000 --correctForMultipleTesting fdr --thresholdComparisons 0.01 --numberOfProcessors 5). Loop calling was conducted by FitHiC2 (v 2.0.8) [117] with default parameters at 5-kb resolution.

5.10. Transcriptional analysis

One-to-one orthologous genes between Muscovy and Peking ducks were identified by RBH using BLASTN. The quantification of gene expression was limited to orthologous genes between the two species. Transcriptional sequencing data of the liver were downloaded from a previous study [14]. RNA-seq reads were aligned to the reference genome with HISAT2 (v2.1.0) [50]. The read counts were calculated using featureCounts (v1.6.2) [51]. The expression matrix was loaded into R and DEGs were selected using the DESeq2 [52] package (v1.32.0) with thresholds: padj <0.05 and |log2FoldChange| ≥ 1. GSEA was performed using the gseGO function in the clusterProfiler package. The GSEA results were filtered with parameters: q-values <0.25 and P-values <0.05.

The WGCNA (v1.68) package [118] was used to conduct weighted gene coexpression network analyses. The significant module was identified and imported into Cytoscape (v3.8.2) [119] for network construction and visualization. Hub genes were selected with the highest degree of connectivity within a module. GO enrichment analysis was completed using the clusterProfiler package [100].

5.11. Dual-luciferase assay

The CNE sequences of Peking duck-MGLL (NCBI: NC_051784.1: 12,092,780-12,092,874) and Muscovy duck-MGLL (HiC_scaffold_11: 10,559,921-10,560,015) were synthesized chemically and inserted into the pGL4.23 vector (Promega, USA) and sequenced (Tsingke Biotech, China). Duck embryo hepatocytes (DEHCs) were seeded in 24-well plates with six replicates for every plasmid. The constructed pGL4.23 (500 ng) vectors were co-transfected with 25 ng of pRL-TK (internal control) into the DEHCs using Turbofect reagent (ThermoFisher Scientific, USA). Cellular lysates were collected 24–48 h after transfection using passive lysis buffer. Luciferase activity was measured using the Dual-Luciferase Reporter Assay System (Promega, USA) with a multimode microplate reader (Spark Tecan, Switzerland). Light output from transcriptional activity was divided by the output from Renilla luciferase activity to normalize the samples.

Data availability

The assembled genome and annotation were deposited in the Genome Warehouse of the National Genomics Data Center (https://ngdc.cncb.ac.cn/gwh/) under BioProject accession code: PRJCA009398 (accession code: GWHBJBF0000000). All raw data were submitted to the Genome Sequences Archive (GSA) of the National Genomics Data Center under BioProject accession code: PRJCA007991. In addition, transcriptomic data of the livers from Peking and Muscovy ducks under two feeding conditions were downloaded from the Sequence Read Archive (SRA) database of the NCBI under accession number SRP144764 (https://www.ncbi.nlm.nih.gov/sra/SRP144764).

Acknowledgments

This study was supported by the Spring City Plan: the High-level Talent Promotion and Training Project of Kunming (2022SCP001), Chinese Modern Technology System of Agricultural Industry (CARS-42-50), Yunnan Revitalization Talent Support Program, and the Animal Branch of the Germplasm Bank of Wild Species, Chinese Academy of Sciences (the Large Research Infrastructure Funding). We thank Prof. Qiang Qiu and Cheng-Long Zhu (School for Ecological and Environmental Sciences, Northwestern Polytechnical University, Xi'an, China) for their helpful advice on comparative genomic analysis. We are also grateful to all volunteers who assisted in sampling.

Author contributions

M.S.P. and Y.P.Z. led the project, and designed and conceived the study. M.M.X. and M.S.P. prepared the manuscript. Y.P.Z., Y.D., T.S.X., and A.C.A. revised the paper. L.W.L., S.C.D., and Y.D. assembled and annotated the genome assembly. M.M.X. and L.H.G. performed subsequent data analysis. W.Y.L. conducted experiments. L.Z.L., T.Z., Z.C.H., Z.S.M., W.C., and J.L.H. provided technical assistance. All authors read and approved the final manuscript. The authors declare that they have no competing interests.

Appendix A. Supplementary data

Supplementary data to this article can be found online at https://doi.org/10.1016/j.ygeno.2022.110518.

References

[1] T.B. Rodenburg, M.B.M. Bracke, J. Berk, J. Cooper, J.M. Faure, D. Guémené, G. Guy, A. Harlander, T. Jones, U. Knierim, K. Kuhnt, H. Pingel, K. Reiter, J. Servière, M.A.W. Ruis, Welfare of ducks in European duck husbandry systems, World's Poult. Sci. J. 61 (2019) 633–646.
[2] P.W. Stahl, An exploratory osteological study of the muscovy duck (Cairina moschata) (Aves: Anatidae) with implications for neotropical archaeology, J. Archaeol. Sci. 32 (2005) 915–929.
[3] D. Gade, Muscovy ducks, in: K.F. Kiple, K.C. Ornelas (Eds.), The Cambridge World History of Food, Cambridge University Press, Cambridge, 2000, pp. 559–561.
[4] E.R. Woodyard, E.G. Bolen, Ecological studies of Muscovy ducks in Mexico, Southwest. Nat. 29 (1984) 453–461.
[5] J. Gamboa, The modern ontological natures of the Cairina moschata (Linnaeus, 1758) duck. Cases from Perú, the northern hemisphere, and digital communities, Anthropozoologica 54 (2019).
[6] P.W. Stahl, Animal domestication in South America, in: H. Silverman, W.H. Isbell (Eds.), The Handbook of South American Archaeology, Springer, New York, New York, NY, 2008, pp. 121–130.
[7] A.P. Aronal, N. Huda, R. Ahmad, Amino acid and fatty acid profiles of Peking and Muscovy duck meat, Int. J. Poult. Sci. 11 (2012) 229–236.
[8] T. Zeng, L. Zhang, J. Li, D. Wang, Y. Tian, L. Lu, De novo assembly and characterization of Muscovy duck liver transcriptome and analysis of differentially regulated genes in response to heat stress, Cell Stress Chaperones 20 (2015) 483–493.
[9] A. Yakubu, Characterisation of the local Muscovy duck in Nigeria and its potential for egg and meat production, World's Poult. Sci. J. 69 (2013) 931–938.
[10] L.A. Arias-Sosa, A.L. Rojas, A review on the productive potential of the Muscovy duck, World's Poult. Sci. J. 77 (2021) 565–588.


[11] C.G. Scanes, E. Braun, Avian metabolism: its control and evolution, Front. Biol. 8 (2012) 134–159.
[12] D. Hermier, A. Saadoun, M.R. Salichon, N. Sellier, D. Rousselot-Paillet, M.J. Chapman, Plasma lipoproteins and liver lipids in two breeds of geese with different susceptibility to hepatic steatosis: changes induced by development and force-feeding, Lipids 26 (1991) 331–339.
[13] F. Jiang, Y. Jiang, W. Wang, C. Xiao, R. Lin, T. Xie, W.K. Sung, S. Li, I. Jakovlic, J. Chen, X. Du, A chromosome-level genome assembly of Cairina moschata and comparative genomic analyses, BMC Genomics 22 (2021) 581.
[14] F. Hérault, M. Houée-Bigot, E. Baéza, O. Bouchez, D. Esquerré, C. Klopp, C. Diot, RNA-seq analysis of hepatic gene expression of common Pekin, Muscovy, mule and hinny ducks fed ad libitum or overfed, BMC Genomics 20 (2019) 13.
[15] P. Chartrin, M.D. Bernadet, G. Guy, J. Mourot, J.F. Hocquette, N. Rideau, M.J. Duclos, E. Baeza, Does overfeeding enhance genotype effects on liver ability for lipogenesis and lipid secretion in ducks? Comp. Biochem. Physiol. A Mol. Integr. Physiol. 145 (2006) 390–396.
[16] D. Hermier, G. Guy, S. Guillaumin, S. Davail, J.-M. André, R. Hoo-Paris, Differential channelling of liver lipids in relation to susceptibility to hepatic steatosis in two species of ducks, Comp. Biochem. Physiol. B Biochem. Mol. Biol. 135 (2003) 663–675.
[17] W. Skippon, The animal health and welfare consequences of foie gras production, Can. Vet. J. 54 (2013) 403–404.
[18] Z. Younossi, Q.M. Anstee, M. Marietti, T. Hardy, L. Henry, M. Eslam, J. George, E. Bugianesi, Global burden of NAFLD and NASH: trends, predictions, risk factors and prevention, Nat. Rev. Gastroenterol. Hepatol. 15 (2018) 11–20.
[19] Z.M. Younossi, A.B. Koenig, D. Abdelatif, Y. Fazel, L. Henry, M. Wymer, Global epidemiology of nonalcoholic fatty liver disease-Meta-analytic assessment of prevalence, incidence, and outcomes, Hepatology 64 (2016) 73–84.
[20] G.A. Michelotti, M.V. Machado, A.M. Diehl, NAFLD, NASH and liver cancer, Nat. Rev. Gastroenterol. Hepatol. 10 (2013) 656–665.
[21] R.J. Wong, R. Cheung, A. Ahmed, Nonalcoholic steatohepatitis is the most rapidly growing indication for liver transplantation in patients with hepatocellular carcinoma in the U.S, Hepatology 59 (2014) 2188–2195.
[22] S. McPherson, T. Hardy, E. Henderson, A.D. Burt, C.P. Day, Q.M. Anstee, Evidence of NAFLD progression from steatosis to fibrosing-steatohepatitis using paired biopsies: implications for prognosis and clinical management, J. Hepatol. 62 (2015) 1148–1155.
[23] F.B. Islam, Y. Uno, M. Nunome, O. Nishimura, H. Tarui, K. Agata, Y. Matsuda, Comparison of the chromosome structures between the chicken and three Anserid species, the domestic duck (Anas platyrhynchos), Muscovy duck (Cairina moschata), and Chinese goose (Anser cygnoides), and the delineation of their karyotype evolution by comparative chromosome mapping, J. Poult. Sci. 51 (2014) 1–13.
[24] J. Li, J. Zhang, J. Liu, Y. Zhou, C. Cai, L. Xu, X. Dai, S. Feng, C. Guo, J. Rao, K. Wei, E.D. Jarvis, Y. Jiang, Z. Zhou, G. Zhang, Q. Zhou, A new duck genome reveals conserved and convergently evolved chromosome architectures of birds and mammals, Gigascience 10 (2021) giaa142.
[25] A. Woolfe, M. Goodson, D.K. Goode, P. Snell, G.K. McEwen, T. Vavouri, S.F. Smith, P. North, H. Callaway, K. Kelly, K. Walter, I. Abnizova, W. Gilks, Y.J. Edwards, J.E. Cooke, G. Elgar, Highly conserved non-coding sequences are associated with vertebrate development, PLoS Biol. 3 (2005), e7.
[26] K. Lindblad-Toh, M. Garber, O. Zuk, M.F. Lin, B.J. Parker, S. Washietl, P. Kheradpour, J. Ernst, G. Jordan, E. Mauceli, L.D. Ward, C.B. Lowe, A.K. Holloway, M. Clamp, S. Gnerre, J. Alfoldi, K. Beal, J. Chang, H. Clawson, J. Cuff, F. Di Palma, S. Fitzgerald, P. Flicek, M. Guttman, M.J. Hubisz, D.B. Jaffe, I. Jungreis, W.J. Kent, D. Kostka, M. Lara, A.L. Martins, T. Massingham, I. Moltke, B.J. Raney, M.D. Rasmussen, J. Robinson, A. Stark, A.J. Vilella, J. Wen, X. Xie, M.C. Zody, Broad Institute Sequencing, T. Whole Genome Assembly, J. Baldwin, T. Bloom, C.W. Chin, D. Heiman, R. Nicol, C. Nusbaum, S. Young, J. Wilkinson, K.C. Worley, C.L. Kovar, D.M. Muzny, R.A. Gibbs, T. Baylor College of Medicine Human Genome Sequencing Center Sequencing, A. Cree, H.H. Dihn, G. Fowler, S. Jhangiani, V. Joshi, S. Lee, L.R. Lewis, L.V. Nazareth, G. Okwuonu, J. Santibanez, W.C. Warren, E.R. Mardis, G.M. Weinstock, R.K. Wilson, U. Genome Institute at Washington, K. Delehaunty, D. Dooling, C. Fronik, L. Fulton, B. Fulton, T. Graves, P. Minx, E. Sodergren, E. Birney, E.H. Margulies, J. Herrero, E.D. Green, D. Haussler, A. Siepel, N. Goldman, K.S. Pollard, J.S. Pedersen, E.S. Lander, M. Kellis, A high-resolution map of human evolutionary constraint using 29 mammals, Nature 478 (2011) 476–482.
[27] J.G. Roscito, K. Sameith, B.M. Kirilenko, N. Hecker, S. Winkler, A. Dahl, M.T. Rodrigues, M. Hiller, Convergent and lineage-specific genomic differences in limb regulatory elements in limbless reptile lineages, Cell Rep. 38 (2022), 110280.
[28] P. Navratilova, D. Fredman, T.A. Hawkins, K. Turner, B. Lenhard, T.S. Becker, Systematic human/zebrafish comparative identification of cis-regulatory activity around vertebrate developmental transcription factor genes, Dev. Biol. 327 (2009) 526–540.
[29] A. Visel, S. Prabhakar, J.A. Akiyama, M. Shoukry, K.D. Lewis, A. Holt, I. Plajzer-Frick, V. Afzal, E.M. Rubin, L.A. Pennacchio, Ultraconservation identifies a small subset of extremely constrained developmental enhancers, Nat. Genet. 40 (2008) 158–160.
[30] J.G. Roscito, K. Sameith, G. Parra, B.E. Langer, A. Petzold, C. Moebius, M. Bickle, M.T. Rodrigues, M. Hiller, Phenotype loss is associated with widespread divergence of the gene regulatory landscape in evolution, Nat. Commun. 9 (2018) 4737.
[31] K. Wang, J. Wang, C. Zhu, L. Yang, Y. Ren, J. Ruan, G. Fan, J. Hu, W. Xu, X. Bi, Y. Zhu, Y. Song, H. Chen, T. Ma, R. Zhao, H. Jiang, B. Zhang, C. Feng, Y. Yuan, X. Gan, Y. Li, H. Zeng, Q. Liu, Y. Zhang, F. Shao, S. Hao, H. Zhang, X. Xu, X. Liu, D. Wang, M. Zhu, G. Zhang, W. Zhao, Q. Qiu, S. He, W. Wang, African lungfish genome sheds light on the vertebrate water-to-land transition, Cell 184 (2021) 1362–1376.
[32] T.B. Sackton, P. Grayson, A. Cloutier, Z. Hu, J.S. Liu, N.E. Wheeler, P.P. Gardner, J.A. Clarke, A.J. Baker, M. Clamp, S.V. Edwards, Convergent regulatory evolution and loss of flight in paleognathous birds, Science 364 (2019) 74–78.
[33] T. Takeuchi, Y. Adachi, Y. Ohtsuki, M. Furihata, Adiponectin receptors, with special focus on the role of the third receptor, T-cadherin, in vascular disease, Med. Mol. Morphol. 40 (2007) 115–120.
[34] H. Morisaki, I. Yamanaka, N. Iwai, Y. Miyamoto, Y. Kokubo, T. Okamura, A. Okayama, T. Morisaki, CDH13 gene coding T-cadherin influences variations in plasma adiponectin levels in the Japanese population, Hum. Mutat. 33 (2012) 402–410.
[35] Y. Wu, Y. Li, E.M. Lange, D.C. Croteau-Chonka, C.W. Kuzawa, T.W. McDade, L. Qin, G. Curocichin, J.B. Borja, L.A. Lange, L.S. Adair, K.L. Mohlke, Genome-wide association study for adiponectin levels in Filipino women identifies CDH13 and a novel uncommon haplotype at KNG1-ADIPOQ, Hum. Mol. Genet. 19 (2010) 4955–4964.
[36] C.M. Chung, T.H. Lin, J.W. Chen, H.B. Leu, H.C. Yang, H.Y. Ho, C.T. Ting, S.H. Sheu, W.C. Tsai, J.H. Chen, S.J. Lin, Y.T. Chen, W.H. Pan, A genome-wide association study reveals a quantitative trait locus of adiponectin on CDH13 that predicts cardiometabolic outcomes, Diabetes 60 (2011) 2417–2423.
[37] T. Yamauchi, J. Kamon, Y. Minokoshi, Y. Ito, H. Waki, S. Uchida, S. Yamashita, M. Noda, S. Kita, K. Ueki, K. Eto, Y. Akanuma, P. Froguel, F. Foufelle, P. Ferre, D. Carling, S. Kimura, R. Nagai, B.B. Kahn, T. Kadowaki, Adiponectin stimulates glucose utilization and fatty-acid oxidation by activating AMP-activated protein kinase, Nat. Med. 8 (2002) 1288–1295.
[38] M.J. Yoon, G.Y. Lee, J.J. Chung, Y.H. Ahn, S.H. Hong, J.B. Kim, Adiponectin increases fatty acid oxidation in skeletal muscle cells by sequential activation of AMP-activated protein kinase, p38 mitogen-activated protein kinase, and peroxisome proliferator-activated receptor alpha, Diabetes 55 (2006) 2562–2570.
[39] T.Y. Huang, D. Zheng, J.A. Houmard, J.J. Brault, R.C. Hickner, R.N. Cortright, Overexpression of PGC-1α increases peroxisomal activity and mitochondrial fatty acid oxidation in human primary myotubes, Am. J. Physiol. Endocrinol. Metab. 312 (2017) E253–E263.
[40] F.R. van der Leij, N.C. Huijkman, C. Boomsma, J.R. Kuipers, B. Bartelds, Genomics of the human carnitine acyltransferase genes, Mol. Genet. Metab. 71 (2000) 139–153.
[41] A. Ohkuni, Y. Ohno, A. Kihara, Identification of acyl-CoA synthetases involved in the mammalian sphingosine 1-phosphate metabolic pathway, Biochem. Biophys. Res. Commun. 442 (2013) 195–201.
[42] J. Cao, H. Xu, H. Zhao, W. Gong, D. Dunaway-Mariano, The mechanisms of human hotdog-fold thioesterase 2 (hTHEM2) substrate recognition and catalysis illuminated by a structure and function based analysis, Biochemistry 48 (2009) 1293–1304.
[43] I.J. L, H. Mandel, W. Oostheim, J.P. Ruiter, A. Gutman, R.J. Wanders, Molecular basis of hepatic carnitine palmitoyltransferase I deficiency, J. Clin. Invest. 102 (1998) 527–531.
[44] A. Nandy, V. Kieweg, F.G. Kräutle, P. Vock, B. Küchler, P. Bross, J.J. Kim, I. Rasched, S. Ghisla, Medium-long-chain chimeric human Acyl-CoA dehydrogenase: medium-chain enzyme with the active center base arrangement of long-chain Acyl-CoA dehydrogenase, Biochemistry 35 (1996) 12402–12411.
[45] M. Vaughan, J.E. Berger, D. Steinberg, Hormone-sensitive lipase and monoglyceride lipase activities in adipose tissue, J. Biol. Chem. 239 (1964) 401–409.
[46] J.K. Reddy, M.S. Rao, Lipid metabolism and liver inflammation. II. Fatty liver disease and fatty acid oxidation, Am. J. Physiol. Gastrointest. Liver Physiol. 290 (2006) G852–G858.
[47] J.P. Camporez, Y. Wang, K. Faarkrog, N. Chukijrungroat, K.F. Petersen, G.I. Shulman, Mechanism by which arylamine N-acetyltransferase 1 ablation causes insulin resistance in mice, Proc. Natl. Acad. Sci. U. S. A. 114 (2017) E11285–E11292.
[48] I. Chennamsetty, M. Coronado, K. Contrepois, M.P. Keller, I. Carcamo-Orive, J. Sandin, G. Fajardo, A.J. Whittle, M. Fathzadeh, M. Snyder, G. Reaven, A.D. Attie, D. Bernstein, T. Quertermous, J.W. Knowles, Nat1 deficiency is associated with mitochondrial dysfunction and exercise intolerance in mice, Cell Rep. 17 (2016) 527–540.
[49] N.L. Gluchowski, M. Becuwe, T.C. Walther, R.V.J. Farese, Lipid droplets and liver disease: from basic biology to clinical implications, Nat. Rev. Gastroenterol. Hepatol. 14 (2017) 343–355.
[50] R. Zimmermann, J.G. Strauss, G. Haemmerle, G. Schoiswohl, R. Birner-Gruenberger, M. Riederer, A. Lass, G. Neuberger, F. Eisenhaber, A. Hermetter, R. Zechner, Fat mobilization in adipose tissue is promoted by adipose triglyceride lipase, Science 306 (2004) 1383–1386.
[51] A. Lass, R. Zimmermann, G. Haemmerle, M. Riederer, G. Schoiswohl, M. Schweiger, P. Kienesberger, J.G. Strauss, G. Gorkiewicz, R. Zechner, Adipose triglyceride lipase-mediated lipolysis of cellular fat stores is activated by CGI-58 and defective in Chanarin-Dorfman Syndrome, Cell Metab. 3 (2006) 309–319.
[52] R. Seki, C. Li, Q. Fang, S. Hayashi, S. Egawa, J. Hu, L. Xu, H. Pan, M. Kondo, T. Sato, H. Matsubara, N. Kamiyama, K. Kitajima, D. Saito, Y. Liu, M.T. Gilbert, Q. Zhou, X. Xu, T. Shiroishi, N. Irie, K. Tamura, G. Zhang, Functional roles of Aves class-specific cis-regulatory elements on macroevolution of bird-specific features, Nat. Commun. 8 (2017) 14229.


[53] G. Saez, S. Davail, G. Gentes, J.F. Hocquette, T. Jourdan, P. Degrace, E. Baeza, Gene expression and protein content in relation to intramuscular fat content in Muscovy and Pekin ducks, Poult. Sci. 88 (2009) 2382–2391.
[54] D.L. Stern, V. Orgogozo, Is genetic evolution predictable? Science 323 (2009) 746–751.
[55] L. Jenni, S. Jenni-Eiermann, Fuel supply and metabolic constraints in migrating birds, J. Avian Biol. 29 (1998).
[56] E.R. Woodyard, Some Aspects in the Ecology of Muscovy Ducks in Mexico, Texas Tech University, 1982.
[57] B. Kaimal, R. Johnson, R. Hannigan, Distinguishing breeding populations of mallards (Anas platyrhynchos) using trace elements, J. Geochem. Explor. 102 (2009) 176–180.
[58] Z. Zhou, M. Li, H. Cheng, W. Fan, Z. Yuan, Q. Gao, Y. Xu, Z. Guo, Y. Zhang, J. Hu, H. Liu, D. Liu, W. Chen, Z. Zheng, Y. Jiang, Z. Wen, Y. Liu, H. Chen, M. Xie, Q. Zhang, W. Huang, W. Wang, S. Hou, Y. Jiang, An intercross population study reveals genes associated with body size and plumage color in ducks, Nat. Commun. 9 (2018) 2648.
[59] M.C. Hidalgo, E. Urea, A. Sanz, Comparative study of digestive enzymes in fish with different nutritional habits. Proteolytic and amylase activities, Aquaculture 170 (1999) 267–283.
[60] T. Corring, The adaptation of digestive enzymes to the diet: its physiological significance, Reprod. Nutr. Dev. 20 (1980) 1217–1235.
[61] D. Józefiak, A. Rutkowski, S.A. Martin, Carbohydrate fermentation in the avian ceca: a review, Anim. Feed Sci. Technol. 113 (2004) 1–15.
[62] S. Pan, Y. Lin, Q. Liu, J. Duan, Z. Lin, Y. Wang, X. Wang, S.M. Lam, Z. Zou, G. Shui, Y. Zhang, Z. Zhang, X. Zhan, Convergent genomic signatures of flight loss in birds suggest a switch of main fuel, Nat. Commun. 10 (2019) 2756.
[63] M. Kohjima, M. Enjoji, N. Higuchi, M. Kato, K. Kotoh, T. Yoshimoto, T. Fujino, M. Yada, R. Yada, N. Harada, R. Takayanagi, M. Nakamuta, Re-evaluation of fatty acid metabolism-related gene expression in nonalcoholic fatty liver disease, Int. J. Mol. Med. 20 (2007) 351–358.
[64] W. Sun, T. Nie, K. Li, W. Wu, Q. Long, T. Feng, L. Mao, Y. Gao, Q. Liu, X. Gao, D. Ye, K. Yan, P. Gu, Y. Xu, X. Zhao, K. Chen, K.M. Loomes, S. Lin, D. Wu, X. Hui, Hepatic CPT1A facilitates liver–adipose cross talk via induction of FGF21 in mice, Diabetes 71 (2022) 31–42.
[65] J.W. Wu, S.P. Wang, F. Alvarez, S. Casavant, N. Gauthier, L. Abed, K.G. Soni, G. Yang, G.A. Mitchell, Deficiency of liver adipose triglyceride lipase in mice causes progressive hepatic steatosis, Hepatology 54 (2011) 122–132.
[66] X. Pan, Y. Zhang, H.G. Kim, S. Liangpunsakul, X.C. Dong, FOXO transcription factors protect against the diet-induced fatty liver disease, Sci. Rep. 7 (2017) 44597.
[67] J.D. Buenrostro, P.G. Giresi, L.C. Zaba, H.Y. Chang, W.J. Greenleaf, Transposition of native chromatin for fast and sensitive epigenomic profiling of open chromatin, DNA-binding proteins and nucleosome position, Nat. Methods 10 (2013) 1213–1218.
[68] Z.S. Ma, L. Li, C. Ye, M. Peng, Y.P. Zhang, Hybrid assembly of ultra-long nanopore reads augmented with 10x-genomics contigs: demonstrated with a human genome, Genomics 111 (2019) 1896–1901.
[69] C. Ye, C.M. Hill, S. Wu, J. Ruan, Z.S. Ma, DBG2OLC: efficient assembly of large genomes using long erroneous reads of the third generation sequencing technologies, Sci. Rep. 6 (2016) 31900.
[70] C. Ye, Z.S. Ma, Sparc: a sparsity-based consensus algorithm for long erroneous sequencing reads, PeerJ 4 (2016), e2016.
[71] J. Hu, J. Fan, Z. Sun, S. Liu, NextPolish: a fast and efficient genome polishing tool for long-read assembly, Bioinformatics 36 (2020) 2253–2255.
[72] N.C. Durand, J.T. Robinson, M.S. Shamim, I. Machol, J.P. Mesirov, E.S. Lander, E.L. Aiden, Juicebox provides a visualization system for Hi-C contact maps with unlimited zoom, Cell Syst. 3 (2016) 99–101.
[73] O. Dudchenko, S.S. Batra, A.D. Omer, S.K. Nyquist, M. Hoeger, N.C. Durand, M.S. Shamim, I. Machol, E.S. Lander, A.P. Aiden, E.L. Aiden, De novo assembly of the Aedes aegypti genome using Hi-C yields chromosome-length scaffolds, Science 356 (2017) 92–95.
[74] M. Tarailo-Graovac, N. Chen, Using RepeatMasker to identify repetitive elements in genomic sequences, Curr. Protoc. Bioinformatics 25 (2009), 4.10.1–4.10.14.
[75] W. Bao, K.K. Kojima, O. Kohany, Repbase update, a database of repetitive elements in eukaryotic genomes, Mob. DNA 6 (2015) 11.
[76] Z. Xu, H. Wang, LTR_FINDER: an efficient tool for the prediction of full-length LTR retrotransposons, Nucleic Acids Res. 35 (2007) W265–W268.
[77] B.J. Haas, S.L. Salzberg, W. Zhu, M. Pertea, J.E. Allen, J. Orvis, O. White, C.R. Buell, J.R. Wortman, Automated eukaryotic gene structure annotation using EVidenceModeler and the program to assemble spliced alignments, Genome Biol. 9 (2008) R7.
[78] A.M. Bolger, M. Lohse, B. Usadel, Trimmomatic: a flexible trimmer for Illumina sequence data, Bioinformatics 30 (2014) 2114–2120.
[79] D. Kim, G. Pertea, C. Trapnell, H. Pimentel, R. Kelley, S.L. Salzberg, TopHat2: accurate alignment of transcriptomes in the presence of insertions, deletions and gene fusions, Genome Biol. 14 (2013) R36.
[80] C. Trapnell, A. Roberts, L. Goff, G. Pertea, D. Kim, D.R. Kelley, H. Pimentel, S.L. Salzberg, J.L. Rinn, L. Pachter, Differential gene and transcript expression analysis of RNA-seq experiments with TopHat and Cufflinks, Nat. Protoc. 7 (2012) 562–578.
[81] M.G. Grabherr, B.J. Haas, M. Yassour, J.Z. Levin, D.A. Thompson, I. Amit, X. Adiconis, L. Fan, R. Raychowdhury, Q. Zeng, Z. Chen, E. Mauceli, N. Hacohen, A. Gnirke, N. Rhind, F. di Palma, B.W. Birren, C. Nusbaum, K. Lindblad-Toh, N. Friedman, A. Regev, Full-length transcriptome assembly from RNA-Seq data without a reference genome, Nat. Biotechnol. 29 (2011) 644–652.
[82] M. Stanke, M. Diekhans, R. Baertsch, D. Haussler, Using native and syntenically mapped cDNA alignments to improve de novo gene finding, Bioinformatics 24 (2008) 637–644.
[83] E. Birney, R. Durbin, Using GeneWise in the Drosophila annotation experiment, Genome Res. 10 (2000) 547–548.
[84] B. Boeckmann, A. Bairoch, R. Apweiler, M.C. Blatter, A. Estreicher, E. Gasteiger, M.J. Martin, K. Michoud, C. O’Donovan, I. Phan, S. Pilbout, M. Schneider, The SWISS-PROT protein knowledgebase and its supplement TrEMBL in 2003, Nucleic Acids Res. 31 (2003) 365–370.
[85] M. Kanehisa, S. Goto, KEGG: Kyoto encyclopedia of genes and genomes, Nucleic Acids Res. 28 (2000) 27–30.
[86] E.M. Zdobnov, R. Apweiler, InterProScan–an integration platform for the signature-recognition methods in InterPro, Bioinformatics 17 (2001) 847–848.
[87] T.M. Lowe, S.R. Eddy, tRNAscan-SE: a program for improved detection of transfer RNA genes in genomic sequence, Nucleic Acids Res. 25 (1997) 955–964.
[88] S.W. Burge, J. Daub, R. Eberhardt, J. Tate, L. Barquist, E.P. Nawrocki, S.R. Eddy, P.P. Gardner, A. Bateman, Rfam 11.0: 10 years of RNA families, Nucleic Acids Res. 41 (2013) D226–D232.
[89] E.P. Nawrocki, S.R. Eddy, Infernal 1.1: 100-fold faster RNA homology searches, Bioinformatics 29 (2013) 2933–2935.
[90] F.A. Simão, R.M. Waterhouse, P. Ioannidis, E.V. Kriventseva, E.M. Zdobnov, BUSCO: assessing genome assembly and annotation completeness with single-copy orthologs, Bioinformatics 31 (2015) 3210–3212.
[91] S. Kurtz, A. Phillippy, A.L. Delcher, M. Smoot, M. Shumway, C. Antonescu, S.L. Salzberg, Versatile and open software for comparing large genomes, Genome Biol. 5 (2004) R12.
[92] Y. Li, G. Gao, Y. Lin, S. Hu, Y. Luo, G. Wang, L. Jin, Q. Wang, J. Wang, Q. Tang, M. Li, Pacific Biosciences assembly with Hi-C mapping generates an improved, chromosome-level goose genome, Gigascience 9 (2020) giaa114.
[93] Q.K. Shen, M.S. Peng, A.C. Adeola, L. Kui, S. Duan, Y.W. Miao, N.M. Eltayeb, J.K. Lichoti, N.O. Otecko, M.G. Strillacci, E. Gorla, A. Bagnato, O.S. Charles, O.J. Sanke, P.M. Dawuda, A.O. Okeyoyin, J. Musina, P. Njoroge, B. Agwanda, S. Kusza, H.A. Nanaei, R. Pedar, M.M. Xu, Y. Du, L.M. Nneji, R.W. Murphy, M.S. Wang, A. Esmailizadeh, Y. Dong, S.C. Ommeh, Y.P. Zhang, Genomic analyses unveil helmeted guinea fowl (Numida meleagris) domestication in West Africa, Genome Biol. Evol. 13 (2021), evab090.
[94] A. Löytynoja, Phylogeny-aware alignment with PRANK, Methods Mol. Biol. 1079 (2014) 155–170.
[95] J. Castresana, Selection of conserved blocks from multiple alignments for their use in phylogenetic analysis, Mol. Biol. Evol. 17 (2000) 540–552.
[96] A. Stamatakis, RAxML version 8: a tool for phylogenetic analysis and post-analysis of large phylogenies, Bioinformatics 30 (2014) 1312–1313.
[97] D.M. Emms, S. Kelly, OrthoFinder: phylogenetic orthology inference for comparative genomics, Genome Biol. 20 (2019) 238.
[98] M.J. Sanderson, r8s: inferring absolute rates of molecular evolution and divergence times in the absence of a molecular clock, Bioinformatics 19 (2003) 301–302.
[99] M.V. Han, G.W. Thomas, J. Lugo-Martinez, M.W. Hahn, Estimating gene gain and loss rates in the presence of error in genome assembly and annotation using CAFE 3, Mol. Biol. Evol. 30 (2013) 1987–1997.
[100] G. Yu, L.G. Wang, Y. Han, Q.Y. He, clusterProfiler: an R package for comparing biological themes among gene clusters, OMICS 16 (2012) 284–287.
[101] L. Chen, Q. Qiu, Y. Jiang, K. Wang, Z. Lin, Z. Li, F. Bibi, Y. Yang, J. Wang, W. Nie, W. Su, G. Liu, Q. Li, W. Fu, X. Pan, C. Liu, J. Yang, C. Zhang, Y. Yin, Y. Wang, Y. Zhao, C. Zhang, Z. Wang, Y. Qin, W. Liu, B. Wang, Y. Ren, R. Zhang, Y. Zeng, R.R. da Fonseca, B. Wei, R. Li, W. Wan, R. Zhao, W. Zhu, Y. Wang, S. Duan, Y. Gao, Y.E. Zhang, C. Chen, C. Hvilsom, C.W. Epps, L.G. Chemnick, Y. Dong, S. Mirarab, H.R. Siegismund, O.A. Ryder, M.T.P. Gilbert, H.A. Lewin, G. Zhang, R. Heller, W. Wang, Large-scale ruminant genome sequencing provides insights into their evolution and distinct traits, Science 364 (2019) eaav6202.
[102] Z. Yang, PAML 4: phylogenetic analysis by maximum likelihood, Mol. Biol. Evol. 24 (2007) 1586–1591.
[103] B. Jassal, L. Matthews, G. Viteri, C. Gong, P. Lorente, A. Fabregat, K. Sidiropoulos, J. Cook, M. Gillespie, R. Haw, F. Loney, B. May, M. Milacic, K. Rothfels, C. Sevilla, V. Shamovsky, S. Shorser, T. Varusai, J. Weiser, G. Wu, L. Stein, H. Hermjakob, P. D’Eustachio, The reactome pathway knowledgebase, Nucleic Acids Res. 48 (2020) D498–D503.
[104] S.M. Kielbasa, R. Wan, K. Sato, P. Horton, M.C. Frith, Adaptive seeds tame genomic sequence comparison, Genome Res. 21 (2011) 487–493.
[105] M. Blanchette, W.J. Kent, C. Riemer, L. Elnitski, A.F. Smit, K.M. Roskin, R. Baertsch, K. Rosenbloom, H. Clawson, E.D. Green, D. Haussler, W. Miller, Aligning multiple genomic sequences with the threaded blockset aligner, Genome Res. 14 (2004) 708–715.
[106] M.J. Hubisz, K.S. Pollard, A. Siepel, PHAST and RPHAST: phylogenetic analysis with space/time models, Brief. Bioinform. 12 (2011) 41–51.
[107] A. Siepel, G. Bejerano, J.S. Pedersen, A.S. Hinrichs, M. Hou, K. Rosenbloom, H. Clawson, J. Spieth, L.W. Hillier, S. Richards, G.M. Weinstock, R.K. Wilson, R.A. Gibbs, W.J. Kent, W. Miller, D. Haussler, Evolutionarily conserved elements in vertebrate, insect, worm, and yeast genomes, Genome Res. 15 (2005) 1034–1050.
[108] Z. Hu, T.B. Sackton, S.V. Edwards, J.S. Liu, Bayesian detection of convergent rate changes of conserved noncoding elements on phylogenetic trees, Mol. Biol. Evol. 36 (2019) 1086–1100.
[109] N. Bray, I. Dubchak, L. Pachter, AVID: a global alignment program, Genome Res. 13 (2003) 97–102.


[110] K.A. Frazer, L. Pachter, A. Poliakov, E.M. Rubin, I. Dubchak, VISTA: computational tools for comparative genomics, Nucleic Acids Res. 32 (2004) W273–W279.
[111] M. Martin, Cutadapt removes adapter sequences from high-throughput sequencing reads, EMBnet.journal 17 (2011) 10–12.
[112] B. Langmead, S.L. Salzberg, Fast gapped-read alignment with Bowtie 2, Nat. Methods 9 (2012) 357–359.
[113] P. Danecek, J.K. Bonfield, J. Liddle, J. Marshall, V. Ohan, M.O. Pollard, A. Whitwham, T. Keane, S.A. McCarthy, R.M. Davies, H. Li, Twelve years of SAMtools and BCFtools, Gigascience 10 (2021) giab008.
[114] Y. Zhang, T. Liu, C.A. Meyer, J. Eeckhoute, D.S. Johnson, B.E. Bernstein, C. Nusbaum, R.M. Myers, M. Brown, W. Li, X.S. Liu, Model-based analysis of ChIP-Seq (MACS), Genome Biol. 9 (2008) R137.
[115] N. Servant, N. Varoquaux, B.R. Lajoie, E. Viara, C.J. Chen, J.P. Vert, E. Heard, J. Dekker, E. Barillot, HiC-Pro: an optimized and flexible pipeline for Hi-C data processing, Genome Biol. 16 (2015) 259.
[116] F. Ramírez, V. Bhardwaj, L. Arrigoni, K.C. Lam, B.A. Grüning, J. Villaveces, B. Habermann, A. Akhtar, T. Manke, High-resolution TADs reveal DNA sequences underlying genome organization in flies, Nat. Commun. 9 (2018) 189.
[117] A. Kaul, S. Bhattacharyya, F. Ay, Identifying statistically significant chromatin contacts from Hi-C data with FitHiC2, Nat. Protoc. 15 (2020) 991–1012.
[118] P. Langfelder, S. Horvath, WGCNA: an R package for weighted correlation network analysis, BMC Bioinform. 9 (2008) 559.
[119] P. Shannon, A. Markiel, O. Ozier, N.S. Baliga, J.T. Wang, D. Ramage, N. Amin, B. Schwikowski, T. Ideker, Cytoscape: a software environment for integrated models of biomolecular interaction networks, Genome Res. 13 (2003) 2498–2504.
