ARTICLE INFO

Keywords: Circadian rhythm; Early wake-up; Dagging; Maximum-relevance-minimum-redundancy

ABSTRACT

Circadian rhythms are endogenous 24-hour rhythmic oscillations affecting human behaviors, such as sleep, blood pressure and other biological processes, the disturbance of which leads to circadian rhythm sleep disorders (CRSDs). In this study, based on the data from genome-wide association studies (GWASs) and expression quantitative trait loci (eQTLs), we tried to identify novel gene expression patterns in brain tissues that were associated with early wake-up. First, the maximum-relevance-minimum-redundancy (mRMR) method was adopted to analyze the involved gene expression patterns, yielding a feature list. Second, the incremental feature selection (IFS) method and the Dagging algorithm were applied to extract the important gene expression patterns that yield the best performance for Dagging. As a result, 4374 gene expression patterns were obtained, and they were further used to build an optimal classifier with a Matthews's correlation coefficient of 0.933. Furthermore, the most important 49 gene expression patterns were extensively analyzed. Four genes were found to be related to circadian rhythm, as reported in previous studies. As a first attempt at identifying the target genes whose expression levels are associated with sleep-wake rhythms through integrating GWAS and eQTL results, this study can motivate more investigations in this regard.

This article is part of a Special Issue entitled: Accelerating Precision Medicine through Genetic and Genomic Big Data Analysis edited by Yudong Cai & Tao Huang.
1. Introduction

Circadian rhythms, controlled by endogenous circadian clocks, are rhythmic oscillations in our behavior and physiological processes with a period close to 24 h, and they exist in diverse organisms on the Earth, ranging from bacteria and fungi to plants and animals [1]. Circadian rhythms are generated by the suprachiasmatic nucleus (SCN), located in the anterior hypothalamus [2], and are synchronized with the earth's rotation by daily adjustments in the timing of the SCN, following exposure to stimuli that signal the time of day. The SCN generates rhythmic cues that entrain the circadian clocks of peripheral organs and cells in the body by orchestrating hormonal, body temperature, neural, feeding, metabolic, and locomotor activity rhythms [3]. The sleep-wake rhythm is one of the most important and observable circadian rhythms [4], and a disturbance of the sleep-wake cycle causes circadian rhythm sleep disorders (CRSDs) [5], which include advanced sleep phase disorder (ASPD), delayed sleep phase disorder (DSPD), free-running disorder (FRD), and irregular sleep-wake rhythm (ISWD). Of these, only ASPD has the phenotype of early wake-up [6].

As one of the intrinsic CRSDs, ASPD leads to the early wake-up phenotype, and patients with ASPD have chronic or recurrent difficulty staying awake until the desired or socially acceptable bedtime, together with an earlier than desired wake-up time [5]. The estimated prevalence of ASPD is 1% in the general population, which is likely an underestimate since many individuals successfully adapt their social and work schedules to the advanced sleep phase [7]. ASPD is believed to result from the dysfunction of the circadian clock or its afferent and efferent pathways [5]. A previous study demonstrated that increased retinal sensitivity to light was one of the reasons leading to ASPD [8]. Based on this knowledge, early evening light therapy is the most commonly used treatment for ASPD, which is effective for some patients but not uniformly positive [9]. This has triggered more efforts to identify the genetic and molecular basis of ASPD in order to shed light on novel diagnoses and therapies. Promisingly, studies in the past few decades have established that the circadian rhythm is determined by a core set of clock genes [10], and two causative gene mutations of ASPD were identified through target gene studies [11]. However, an abundant number of transcripts were reported to
⁎ Corresponding author at: Institute of Health Sciences, Shanghai Institutes for Biological Sciences, Chinese Academy of Sciences, People's Republic of China.
E-mail address: huangtao@sibs.ac.cn (T. Huang).
http://dx.doi.org/10.1016/j.bbadis.2017.10.036
Received 4 September 2017; Received in revised form 19 October 2017; Accepted 30 October 2017
Available online 03 November 2017
0925-4439/ © 2017 Elsevier B.V. All rights reserved.
J. Li, T. Huang BBA - Molecular Basis of Disease 1864 (2018) 2241–2246
fluctuate in their expression level with the circadian rhythm in both the hypothalamus and peripheral organs [12]. Meanwhile, different genes are associated with different CRSDs [13], suggesting the individual roles of the circadian genes in CRSDs and the complexity of the disease. Because of technological advances, especially the emergence of microarray studies and next-generation sequencing, genome-wide association studies (GWASs) have become more affordable and have been widely performed to study complex traits, identifying many disease-associated loci and providing insights into the allelic architecture of complex traits [14]. To dissect the genetic basis of the circadian rhythm, such a study was recently applied to a large cohort with self-reported morningness, which identified an abundant number of genetic polymorphisms associated with morningness [15], including seven that are near well-known circadian genes. However, as with other GWASs, this study identified many morningness-associated genetic polymorphisms that were located in non-coding regions, apart from the genes with established functions, and could not be correlated with biological processes. Thus, it is still unclear how these polymorphic loci contribute to sleep-wake timing variations. A systematic identification of expression quantitative trait loci (eQTLs) combined with GWAS was demonstrated as one of the approaches for unveiling biological me-

Fig. 1. The IFS curve for the prediction performance of Dagging on different feature sets, with X-values representing the number of features used and Y-values indicating the MCC value. The maximum MCC value is marked with a red dot on the curve.
Table 1
The selected top 49 features in mRMR feature list.
mRMR feature list can be built in a way that the first selected feature takes the first place in the list, followed by the second selected feature, the third selected feature and so forth. For formulation, the obtained mRMR feature list is denoted as

$$F = [f_1, f_2, \ldots, f_N] \qquad (2)$$

the best performance was identified. This feature set was termed the optimal feature set, and the features in this set were called the optimal features, which capture the essential differences between the positive and negative SNPs. In addition, an optimal classifier was built, which used the optimal features to represent the SNPs and the classification algorithm as the prediction engine.
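The incremental selection over the ranked list F can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the function name `incremental_feature_selection`, the `evaluate` callback, and the toy scoring rule are all introduced here for illustration, and the feature sets are assumed to grow one feature at a time. In the study's setting the callback would presumably return the MCC obtained by Dagging on the given feature subset; a synthetic score stands in for it below.

```python
from typing import Callable, List, Sequence, Tuple


def incremental_feature_selection(
    ranked_features: Sequence[str],
    evaluate: Callable[[Sequence[str]], float],
) -> Tuple[int, float, List[float]]:
    """Score the nested sets F_1, F_2, ..., F_N and return the best size.

    ranked_features: features ordered as in the mRMR list F = [f1, ..., fN].
    evaluate: hypothetical callback returning a performance score
              (e.g. an MCC) for a given feature subset.
    """
    curve = []
    for k in range(1, len(ranked_features) + 1):
        # F_k consists of the first k features of the ranked list.
        curve.append(evaluate(ranked_features[:k]))
    best_score = max(curve)
    best_k = curve.index(best_score) + 1  # smallest set reaching the maximum
    return best_k, best_score, curve


def toy_evaluate(subset: Sequence[str]) -> float:
    """Synthetic stand-in for a classifier score (assumption, not real data)."""
    informative = {"f1", "f2", "f3"}
    hits = sum(1 for f in subset if f in informative)
    noise = len(subset) - hits
    return hits / 3 - 0.05 * noise


ranked = ["f1", "f2", "f3", "f4", "f5"]
best_k, best_score, curve = incremental_feature_selection(ranked, toy_evaluate)
print(best_k, best_score)
```

Plotting `curve` against the subset size gives an IFS curve of the kind shown in Fig. 1; the first `best_k` entries of the ranked list then play the role of the optimal feature set.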
Fig. 2. The top 49 features, including 25 genes expressed in ten brain tissues, identified to be related to early wake-up. The gene in red was reported to be associated with narcolepsy. Genes in cyan have clues suggesting a correlation with circadian rhythm. The 25 genes were sorted based on the highest mRMR score in the brain tissues.

Brain tissues shown in the figure: Nucleus_accumbens_basal_ganglia, Anterior_cingulate_cortex_BA24, Putamen_basal_ganglia, Cerebellar_Hemisphere, Caudate_basal_ganglia, Frontal_Cortex_BA9, Hypothalamus, Hippocampus, Cerebellum, Cortex.

ENSEMBL Gene ID    Gene Name         Score
ENSG00000115524    SF3B1             0.091
ENSG00000132911    NMUR2             0.061
ENSG00000255098    RP11-481A20.11    0.046
ENSG00000116786    PLEKHM2           0.046
ENSG00000145888    GLRA1             0.024
ENSG00000203817    FAM72C            0.022
ENSG00000226747    AC007966.1        0.019
ENSG00000247828    TMEM161B-AS1      0.018
ENSG00000227888    FAM66A            0.013
ENSG00000119711    ALDH6A1           0.013
ENSG00000173295    FAM86B3P          0.013
ENSG00000184905    TCEAL2            0.013
ENSG00000255020    AF131216.5        0.012
ENSG00000242353    RPL12P30          0.012
ENSG00000254507    RP11-481A20.10    0.011
ENSG00000132436    FIGNL1            0.011
ENSG00000255310    AF131215.2        0.010
ENSG00000255556    RP11-351I21.6     0.007
ENSG00000254532    RP11-624D11.2     0.007
ENSG00000076003    MCM6              0.006
ENSG00000254423    RP11-351I21.7     0.006
ENSG00000179344    HLA-DQB1          0.005
ENSG00000114735    HEMK1             0.005
ENSG00000253893    FAM85B            0.005
ENSG00000149485    FADS1             0.004

method, like the jackknife test [35], this method takes less time and always yields similar results.

In the dataset Sd, only two types of SNPs were involved. Thus, it is a binary classification problem. The predicted results of this type of problem can always be summarized by four measurements: sensitivity (SN), specificity (SP), accuracy (ACC) [37–40], and the Matthews's correlation coefficient (MCC) [41], which are computed by the following equations:

$$SN = \frac{TP}{TP + FN} \qquad (3)$$

$$SP = \frac{TN}{TN + FP} \qquad (4)$$

$$ACC = \frac{TP + TN}{TP + TN + FP + FN} \qquad (5)$$

$$MCC = \frac{TP \times TN - FP \times FN}{\sqrt{(TP + FP)(TP + FN)(TN + FP)(TN + FN)}} \qquad (6)$$

where TP, TN, FP, and FN refer to the number of positive samples that are predicted correctly, the number of negative samples that are predicted correctly, the number of negative samples that are predicted incorrectly, and the number of positive samples that are predicted incorrectly, respectively.

Among the above four measurements, MCC is deemed a balanced measurement that can give a fair evaluation even if the sizes of the classes differ greatly. MCC was first proposed by Matthews in 1975 [41]. Its value ranges from −1 to 1. In detail, 1 means a perfect classification, 0 indicates that the predicted results are no better than random predictions, and −1 represents a total misclassification. In this study, it was used as the key measurement, i.e., the performance of Dagging on the different feature sets is mainly measured by MCC. The other three measurements were provided as references.
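For a concrete reading of Eqs. (3)–(6), the four measurements can be computed from the confusion-matrix counts as in the short Python sketch below. The function name `binary_metrics` and the example counts are hypothetical and are not taken from the study; returning 0 for MCC when its denominator vanishes is a common convention that is assumed here rather than stated in the text.

```python
import math


def binary_metrics(tp: int, tn: int, fp: int, fn: int) -> dict:
    """Compute SN, SP, ACC and MCC following Eqs. (3)-(6).

    Assumes both classes are present (tp + fn > 0 and tn + fp > 0);
    otherwise SN or SP would be undefined.
    """
    sn = tp / (tp + fn)                      # Eq. (3): sensitivity
    sp = tn / (tn + fp)                      # Eq. (4): specificity
    acc = (tp + tn) / (tp + tn + fp + fn)    # Eq. (5): accuracy
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    # Eq. (6); MCC is set to 0 by convention if the denominator is zero.
    mcc = (tp * tn - fp * fn) / denom if denom else 0.0
    return {"SN": sn, "SP": sp, "ACC": acc, "MCC": mcc}


# Hypothetical counts for illustration only (not data from the study).
m = binary_metrics(tp=90, tn=95, fp=5, fn=10)
print({name: round(value, 3) for name, value in m.items()})

# A perfect classifier reaches the upper bound of MCC.
perfect = binary_metrics(tp=50, tn=50, fp=0, fn=0)
```

Because MCC uses all four counts, it stays informative when one class is much larger than the other, which is the property the text relies on when choosing it as the key measurement.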
3. Results
ACCs and MCCs were obtained, which are available in Supplementary material S2. As described in Section 2.5, the MCC was selected as the major measurement. Thus, we tried to find the feature set yielding the maximum MCC. For easy observation, a curve, namely an IFS curve, was plotted and is shown in Fig. 1, in which the MCC was set to the Y-axis, and the number of features used was set to the X-axis. It was observed that the IFS curve first follows a sharp increasing trend and then becomes stable. The maximum MCC was 0.933 when the number of features was 4374, meaning the top 4374 features in the mRMR feature list yield the best performance for Dagging, which indicates that these features have a strong association with sleep-wake time variations. The obtained 4374 features were called optimal features and comprised the optimal feature set. In addition, an optimal classifier was built, which used the optimal 4374 features to represent SNPs and Dagging as the prediction engine. The SN, SP and ACC yielded by this classifier were 0.943, 0.990, and 0.966, respectively, suggesting it is a good classifier.

4. Discussion

In total, 4374 optimal eQTL features were extracted in Section 3. These features are deemed to be highly related to sleep-wake time variations. However, it is impractical to analyze them one by one. As mentioned above, the IFS curve follows a sharp increasing trend in the beginning, which means that some of the top features are more important than others. By carefully checking the IFS curve, we found that the MCC first exceeds 0.850 when 49 features are used (SN, SP, ACC and MCC are 0.852, 0.994, 0.923, and 0.855, respectively), meaning the top 49 features in the mRMR feature list are more important; they are listed in Table 1. Thus, in this study, we only analyzed these features to identify their functional roles in the sleep-wake cycle, as follows. Four genes had been linked to circadian rhythm in previous studies.

The 49 tissue-gene expression patterns of the 49 features included 25 genes expressed in ten tissues of the human brain. We hypothesized that those 25 key genes were related to circadian rhythm through their expression in brain tissues. To validate this hypothesis, we investigated these genes in terms of biological functions, pathways and processes.

We first performed the functional annotation of the 25 genes, including the GO terms, the KEGG pathway, InterPro, etc., through the online database and tool DAVID [42]. The result showed that none of these 25 genes were related to circadian rhythm or sleep disorder. However, since the information in the database is usually incomplete, we did a further literature review of the 25 genes. Promisingly, we found one gene that was reported to be directly associated with a sleep disorder, narcolepsy, while another three had clues suggesting potential roles in circadian rhythm (Fig. 2).

Narcolepsy is a disorder of the regulation of sleep and wakefulness, resulting in a variety of symptoms, such as excessive daytime sleepiness (EDS), cataplexy, hypnagogic hallucinations, sleep paralysis and disturbed nocturnal sleep [43]. Narcolepsy is tightly associated with human leukocyte antigen (HLA), or in other words a specific HLA allele, i.e., HLA-DQB1*06:02 [44]. A functional HLA-DQ molecule originates from the binding of an α chain (DQA1) with a β chain (DQB1). Worldwide, approximately 85–95% of the narcolepsy with cataplexy patients carry a specific haplotype DQB1*06:02-DQA1*01:02, compared to 12–38% of the general population [45]. Another haplotype DRB1*1501-DQB1*0602 is suggested as almost necessary but not sufficient for developing narcolepsy [46,47]. Further studies suggest that the dosage of HLA also affects the development of narcolepsy [48]. In our study, we predicted 25 key genes, including HLA-DQB1. This result suggests an important role of the expression of this gene as well as HLA in sleep control.

There are another three genes without direct experimental evidence but with some clues indicating their potential roles in the circadian clock. The first gene FADS1 is one member of the fatty acid desaturase (FADS) gene cluster (11q12-13.1), which mediates the biosynthesis of long-chain polyunsaturated fatty acids (LCPUFAs) [49]. Since a low proportion of LCPUFAs may cause sleep problems [50], the expression of this gene is likely to affect the sleep-wake cycle. The second gene NMUR2 shows a circadian expression in rat [51] and encodes a receptor for neuromedin S (NMS) [52], which is also reported to be expressed in the suprachiasmatic nucleus (SCN, which is believed to control the circadian rhythm [13]) and might play a role in the circadian rhythm [53]. The last gene GLRA1 is reported to have a mutation leading to hyperekplexia, which has the symptom of periodic limb movements in sleep. These findings indicated the potential influence of all three genes in the sleep-wake cycle.

In summary, these four genes are functionally related to the circadian rhythm and sleep regulation via direct or indirect evidence, while there is a lack of evidence for the other 21 to support their relationship with the circadian rhythm. This result indicated the significance of our method in identifying the morningness-related gene expressions affected by previously reported SNPs. Interestingly, we found most of these 25 genes showed significant expression changes in the cerebellar hemisphere and cerebellum, but not the hypothalamus, in which the SCN controls the circadian rhythm [13]. The distribution of the 4374 optimal features also shows the same enrichment (Fig. 3). This phenomenon could be attributed to three reasons. First, the significant expression changes could be cues generated by the SCN or the behaviors of the peripheral organs and cells. Second, circadian gene expression might be observed not only in the SCN but also in peripheral tissues, such as the liver [54]. Third, the tissue samples were from post-mortem donors, which might not contain all of the expression changes related to sleep/wake cycle control.

5. Conclusions

In this study, we tried to extract morningness-associated gene expression patterns. Based on the results, four genes were functionally related to circadian rhythm and sleep regulation according to previous studies, while the other 21 genes could also be associated with early wake-up. However, there is currently a lack of evidence for this. We believe that this study, as a pioneering investigation into the mechanisms by which the morningness-associated SNPs affect the sleep-wake cycle, identified through the downstream gene expression patterns, will shed light on further research involving circadian rhythm.

Supplementary data to this article can be found online at https://doi.org/10.1016/j.bbadis.2017.10.036.

Transparency document

The Transparency document associated with this article can be found in the online version.

Acknowledgements

This study was supported by the National Natural Science Foundation of China (31371335, 31701151), the Natural Science Foundation of Shanghai (17ZR1412500), the Shanghai Sailing Program, the Youth Innovation Promotion Association of Chinese Academy of Sciences (CAS) (2016245), and the Training and Assistance Plan of Shanghai Young College Teacher.

References

[1] J.C. Dunlap, Molecular bases for circadian clocks, Cell 96 (1999) 271–290.
[2] U. Schibler, P. Sassone-Corsi, A web of circadian pacemakers, Cell 111 (2002) 919–922.
[3] C. Dibner, U. Schibler, U. Albrecht, The mammalian circadian timing system: organization and coordination of central and peripheral clocks, Annu. Rev. Physiol. 72 (2010) 517–549.
[4] K.G. Baron, K.J. Reid, Circadian misalignment and health, Int. Rev. Psychiatry 26 (2014) 139–154.
[5] P.C. Zee, H. Attarian, A. Videnovic, Circadian rhythm abnormalities, Continuum 19 (2013) 132–147.
[6] R.L. Sack, D. Auckley, R.R. Auger, M.A. Carskadon, K.P. Wright Jr., M.V. Vitiello, I.V. Zhdanova, M. American Academy of Sleep, Circadian rhythm sleep disorders: part II, advanced sleep phase disorder, delayed sleep phase disorder, free-running disorder, and irregular sleep-wake rhythm. An American Academy of Sleep Medicine review, Sleep 30 (2007) 1484–1501.
[7] K. Ando, D.F. Kripke, S. Ancoli-Israel, Delayed and advanced sleep phase symptoms, Isr. J. Psychiatry Relat. Sci. 39 (2002) 11–18.
[8] K.J. Reid, P.C. Zee, Circadian rhythm sleep disorders, Handb. Clin. Neurol. 99 (2011) 963–977.
[9] T.I. Morgenthaler, T. Lee-Chiong, C. Alessi, L. Friedman, R.N. Aurora, B. Boehlecke, T. Brown, A.L. Chesson Jr., V. Kapur, R. Maganti, J. Owens, J. Pancer, T.J. Swick, R. Zak, M. Standards of Practice Committee of the American Academy of Sleep, Practice parameters for the clinical evaluation and treatment of circadian rhythm sleep disorders. An American Academy of Sleep Medicine report, Sleep 30 (2007) 1445–1459.
[10] J.S. Takahashi, H.K. Hong, C.H. Ko, E.L. McDearmon, The genetics of mammalian circadian order and disorder: implications for physiology and disease, Nat. Rev. Genet. 9 (2008) 764–775.
[11] Y. Xu, Q.S. Padiath, R.E. Shapiro, C.R. Jones, S.C. Wu, N. Saigoh, K. Saigoh, L.J. Ptacek, Y.H. Fu, Functional consequences of a CKIdelta mutation causing familial advanced sleep phase syndrome, Nature 434 (2005) 640–644.
[12] R. Zhang, N.F. Lahens, H.I. Ballance, M.E. Hughes, J.B. Hogenesch, A circadian gene expression atlas in mammals: implications for biology and medicine, Proc. Natl. Acad. Sci. U. S. A. 111 (2014) 16219–16224.
[13] J.M. Parish, Genetic and immunologic aspects of sleep and sleep disorders, Chest 143 (2013) 1489–1499.
[14] P.M. Visscher, M.A. Brown, M.I. McCarthy, J. Yang, Five years of GWAS discovery, Am. J. Hum. Genet. 90 (2012) 7–24.
[15] Y. Hu, A. Shmygelska, D. Tran, N. Eriksson, J.Y. Tung, D.A. Hinds, GWAS of 89,283 individuals identifies genetic variants associated with self-reporting of being a morning person, Nat. Commun. 7 (2016) 10448.
[16] W. Cookson, L. Liang, G. Abecasis, M. Moffatt, M. Lathrop, Mapping complex disease traits with global gene expression, Nat. Rev. Genet. 10 (2009) 184–194.
[17] P. Li, M. Guo, C. Wang, X. Liu, Q. Zou, An overview of SNP interactions in genome-wide association studies, Brief. Funct. Genomics 14 (2015) 143–155.
[18] G.T. Consortium, Human genomics. The genotype-tissue expression (GTEx) pilot analysis: multitissue gene regulation in humans, Science 348 (2015) 648–660.
[19] H. Peng, F. Long, C. Ding, Feature selection based on mutual information: criteria of max-dependency, max-relevance, and min-redundancy, IEEE Trans. Pattern Anal. Mach. Intell. 27 (2005) 1226–1238.
[20] K.M. Ting, I.H. Witten, Stacking bagged and dagged models, In Fourteenth International Conference on Machine Learning, 1997 (San Francisco, CA.).
[21] Human genomics. The genotype-tissue expression (GTEx) pilot analysis: multitissue gene regulation in humans, Science 348 (2015) 648–660.
[22] L. Chen, Y.-H. Zhang, G. Lu, T. Huang, Y.-D. Cai, Analysis of cancer-related lncRNAs using gene ontology and KEGG pathways, Artif. Intell. Med. 76 (2017) 27–36.
[23] L. Liu, L. Chen, Y.-H. Zhang, L. Wei, S. Cheng, X.-Y. Kong, M. Zheng, T. Huang, Y.-D. Cai, Analysis and prediction of drug-drug interaction by minimum redundancy maximum relevance and incremental feature selection, J. Biomol. Struct. Dyn. 35 (2017) 312–329.
[24] H. Mohabatkar, M. Mohammad Beigi, K. Abdolahi, S. Mohsenzadeh, Prediction of allergenic proteins by means of the concept of Chous pseudo amino acid composition and a machine learning approach, Med. Chem. 9 (2013) 133–137.
[25] L. Chen, C. Chu, K. Feng, Predicting the types of metabolic pathway of compounds using molecular fragments and sequential minimal optimization, Comb. Chem. High Throughput Screen. 19 (2016) 136–143.
[26] Q. Ni, L. Chen, A feature and algorithm selection method for improving the prediction of protein structural classes, Comb. Chem. High Throughput Screen (2017), http://dx.doi.org/10.2174/1386207320666170314103147 (E-pub ahead of print).
[27] Z. Li, X. Zhou, Z. Dai, X. Zou, Classification of G-protein coupled receptors based on support vector machine with maximum relevance minimum redundancy and genetic algorithm, BMC Bioinf. 11 (2010) 325.
[28] L. Chen, Y.H. Zhang, M. Zheng, T. Huang, Y.D. Cai, Identification of compound-protein interactions through the analysis of gene ontology, KEGG enrichment for proteins and molecular fragments of compounds, Mol Gen Genet 291 (2016) 2065–2079.
[29] Y. Zhang, C. Ding, T. Li, Gene selection algorithm by combining reliefF and mRMR, BMC Genomics 9 (2008) S27.
[30] L. Chen, Y.-H. Zhang, T. Huang, Y.-D. Cai, Gene expression profiling gut microbiota in different races of humans, Sci. Rep. 6 (2016) 23075.
[31] Z. Cai, D. Xu, Q. Zhang, J. Zhang, S.-M. Ngai, J. Shao, Classification of lung cancer using ensemble-based feature selection and machine learning methods, Mol. BioSyst. 11 (2015) 791–800.
[32] L. He, Z. Cao, Y. Wang, W. Du, Y. Liang, An ensemble feature selection method based on mRMR for paired microarray data, J. Comput. Inf. Syst. 10 (2014) 4875–4882.
[33] L. Chen, C. Chu, T. Huang, X. Kong, Y.-D. Cai, Prediction and analysis of cell-penetrating peptides using pseudo-amino acid composition and random forest models, Amino Acids 47 (2015) 1485–1493.
[34] R. Kohavi, A study of cross-validation and bootstrap for accuracy estimation and model selection, International Joint Conference on Artificial Intelligence, vol. 14, Lawrence Erlbaum Associates Ltd, 1995, pp. 1137–1145.
[35] L. Chen, W.M. Zeng, Y.D. Cai, K.Y. Feng, K.C. Chou, Predicting anatomical therapeutic chemical (ATC) classification of drugs by integrating chemical-chemical interactions and similarities, PLoS One 7 (2012) e35254.
[36] I.H. Witten, E. Frank, Data Mining: Practical Machine Learning Tools and Techniques, 2nd edn, Morgan Kaufmann, San Francisco, 2005.
[37] J. Lu, S. Wang, Y.D. Cai, Q. Zhang, Analysis and prediction of nitrated tyrosine sites with mRMR method and support vector machine algorithm, Curr. Bioinforma. 11 (2016), http://dx.doi.org/10.2174/1574893611666160608075753 (E-pub ahead of print).
[38] B.Q. Li, Y.D. Cai, K.Y. Feng, G.J. Zhao, Prediction of protein cleavage site with feature selection by random forest, PLoS One 7 (2012) e45854.
[39] Y. Cai, J. He, L. Lu, Predicting sumoylation site by feature selection method, J. Biomol. Struct. Dyn. 28 (2011) 797–804.
[40] L. Chen, K.Y. Feng, Y.D. Cai, K.C. Chou, H.P. Li, Predicting the network of substrate-enzyme-product triads by combining compound similarity and functional domain composition, BMC Bioinf. 11 (2010) 293.
[41] B.W. Matthews, Comparison of the predicted and observed secondary structure of T4 phage lysozyme, Biochim. Biophys. Acta 405 (1975) 442–451.
[42] W. Huang da, B.T. Sherman, R.A. Lempicki, Systematic and integrative analysis of large gene lists using DAVID bioinformatics resources, Nat. Protoc. 4 (2009) 44–57.
[43] Y. Dauvilliers, I. Arnulf, E. Mignot, Narcolepsy with cataplexy, Lancet 369 (2007) 499–511.
[44] E. Thorsby, Invited anniversary review: HLA associated diseases, Hum. Immunol. 53 (1997) 1–11.
[45] E. Mignot, R. Hayduk, J. Black, F.C. Grumet, C. Guilleminault, HLA DQB1*0602 is associated with cataplexy in 509 narcoleptic patients, Sleep 20 (1997) 1012–1020.
[46] H. Hor, Z. Kutalik, Y. Dauvilliers, A. Valsesia, G.J. Lammers, C.E. Donjacour, A. Iranzo, J. Santamaria, R. Peraita Adrados, J.L. Vicario, S. Overeem, I. Arnulf, I. Theodorou, P. Jennum, S. Knudsen, C. Bassetti, J. Mathis, M. Lecendreux, G. Mayer, P. Geisler, A. Beneto, B. Petit, C. Pfister, J.V. Burki, G. Didelot, M. Billiard, G. Ercilla, W. Verduijn, F.H. Claas, P. Vollenweider, G. Waeber, D.M. Waterworth, V. Mooser, R. Heinzer, J.S. Beckmann, S. Bergmann, M. Tafti, Genome-wide association study identifies new HLA class II haplotypes strongly protective against narcolepsy, Nat. Genet. 42 (2010) 786–789.
[47] E. Mignot, L. Lin, W. Rogers, Y. Honda, X. Qiu, X. Lin, M. Okun, H. Hohjoh, T. Miki, S. Hsu, M. Leffell, F. Grumet, M. Fernandez-Vina, M. Honda, N. Risch, Complex HLA-DR and -DQ interactions confer risk of narcolepsy-cataplexy in three ethnic groups, Am. J. Hum. Genet. 68 (2001) 686–699.
[48] A. van der Heide, W. Verduijn, G.W. Haasnoot, J.J. Drabbels, G.J. Lammers, F.H. Claas, HLA dosage effect in narcolepsy with cataplexy, Immunogenetics 67 (2015) 1–6.
[49] J.Y. Zhang, K.S. Kothapalli, J.T. Brenna, Desaturase and elongase-limiting endogenous long-chain polyunsaturated fatty acid biosynthesis, Curr. Opin. Clin. Nutr. Metab. Care 19 (2016) 103–110.
[50] J.R. Burgess, L. Stevens, W. Zhang, L. Peck, Long-chain polyunsaturated fatty acids in children with attention-deficit hyperactivity disorder, Am. J. Clin. Nutr. 71 (2000) 327S–330S.
[51] S. Aizawa, I. Sakata, M. Nagasaka, Y. Higaki, T. Sakai, Negative regulation of neuromedin U mRNA expression in the rat pars tuberalis by melatonin, PLoS One 8 (2013) e67118.
[52] P.J. Brighton, P.G. Szekeres, G.B. Willars, Neuromedin U and its receptors: structure, function, and physiological roles, Pharmacol. Rev. 56 (2004) 231–248.
[53] K. Mori, M. Miyazato, T. Ida, N. Murakami, R. Serino, Y. Ueta, M. Kojima, K. Kangawa, Identification of neuromedin S and its possible role in the mammalian circadian oscillator system, EMBO J. 24 (2005) 325–335.
[54] S. Luck, P.O. Westermark, Circadian mRNA expression: insights from modeling and transcriptomics, Cell. Mol. Life Sci. 73 (2016) 497–521.