
2019 IEEE International Conference on Bioinformatics and Biomedicine (BIBM)

Predicting gene-disease associations from the heterogeneous network using graph embedding

Xiaochan Wang, Yuchong Gong, Jing Yi, Wen Zhang

College of Informatics, Huazhong Agricultural University, Wuhan, China
School of Computer Science, Wuhan University, Wuhan, China
wangxiaochan@whu.edu.cn, gongyuchong@whu.edu.cn, yijingv@whu.edu.cn, zhangwen@mail.hzau.edu.cn
corresponding author

Abstract: The discovery of gene-disease associations is important for the prevention, diagnosis and treatment of diseases. Studies on gene-disease associations have produced diverse data, which can facilitate gene-disease association prediction. Integrating diverse information is critical for developing high-accuracy prediction models. In this paper, we propose a heterogeneous network-based method that enhances gene-disease association prediction by using graph embedding and ensemble learning, abbreviated as "HNEEM". A heterogeneous network is constructed from gene-disease associations, gene-chemical associations and disease-chemical associations to combine diverse information. The network uses genes, diseases and chemicals as nodes and their associations as edges. Graph embedding methods are used to extract representation vectors of nodes in the heterogeneous network, the feature vectors of genes and diseases are merged to represent gene-disease pairs, and a random forest is employed to build the prediction model on these pairs. We consider six graph embedding methods, build one prediction model from the features generated by each method, use these models as base predictors, and then combine the base predictors into the ensemble learning model HNEEM. We comprehensively compare the graph embedding methods; the results demonstrate that they produce satisfying results in gene-disease association prediction and that integrating different graph embedding methods brings further improvements. In computational experiments, HNEEM produces better results than state-of-the-art gene-disease prediction methods and is robust to data richness. Moreover, the usefulness of HNEEM is validated by case studies. In conclusion, HNEEM is a promising method for predicting gene-disease associations.

Keywords: gene-disease association; heterogeneous network; graph embedding

I. INTRODUCTION

Understanding the genetic mechanism of diseases helps both the completion of the human genome and the development of Next-Generation Sequencing techniques, and provides assistance for disease prevention, diagnosis and therapeutics []. Since wet-lab methods require much time and labor, there is a pressing need for computational methods for gene-disease association prediction.

Recently, a great number of machine learning-based computational methods have been proposed to predict gene-disease associations. Inspired by social network analyses, U. Martin Singh-Blom et al. [] presented the Katz method and the CATAPULT method to predict gene-disease associations based on functional gene associations and gene-phenotype associations. The Katz method integrated functional gene interactions and phenotype data to develop a heterogeneous network, and computed walks of different lengths between gene and phenotype. CATAPULT is a supervised learning method which uses feature vectors of gene-phenotype pairs derived from hybrid walks on the heterogeneous network. Natarajan et al. [] proposed an inductive matrix completion-based method, which combined multiple types of features for diseases and genes to learn latent factors that explain observed gene-disease associations. Zhou et al. [] proposed a knowledge-based approach called Know-GENE to prioritize candidate genes associated with a given disease. Know-GENE first calculated gene-gene mutual information by utilizing the co-occurrence of genes in known gene-disease association data, and the mutual information was subsequently combined with known protein-protein interaction networks by a boosted tree regression method. Zeng et al. [] proposed a latent factor method with heterogeneous similarity regularization to predict gene-disease associations. Chen et al. [] built a context-sensitive, network-based, phenome-driven approach for gene-disease association prediction. Zeng et al. [] proposed a probability-based collaborative filtering model to predict disease-causing genes. Meng et al. [] proposed an iterative self-updating approach based on the heterogeneous information network to predict gene-disease associations.

There are two critical points for developing high-accuracy gene-disease association prediction models. One is to utilize diverse features of genes and diseases, and the other is to integrate these features in a reasonable manner. A heterogeneous network usually consists of different types of biological objects and their associations, and thus naturally contains comprehensive information. In recent years, representation learning has become a hot topic in the machine learning community because of its powerful capability of extracting information about nodes from heterogeneous networks.

In this paper, we propose a heterogeneous network-based method that enhances gene-disease association prediction by using graph embedding and ensemble learning, abbreviated as "HNEEM". A heterogeneous network is constructed with gene-disease associations, gene-chemical associations and disease-chemical associations. The network uses genes, diseases and chemicals as nodes, and uses their associations as edges. Representation learning methods are utilized to extract representation vectors of nodes in the heterogeneous network, the feature vectors of genes and diseases are merged to represent gene-disease pairs, and the random forest is employed to build the prediction models based on gene-disease pairs. Further, we consider six types of representative graph embedding methods, take the individual graph embedding-based models as base predictors, and then develop
978-1-7281-1867-3/19/$31.00 ©2019 IEEE 504
Authorized licensed use limited to: Technische Informationsbibliothek (TIB). Downloaded on November 29,2023 at 10:36:27 UTC from IEEE Xplore. Restrictions apply.
the ensemble learning model. We comprehensively compare several popular graph embedding methods, and the results demonstrate that the graph embedding methods produce satisfying results for gene-disease association prediction, and that integrating different graph embedding methods can make further improvements. In the computational experiments, HNEEM produces better results when compared with state-of-the-art gene-disease prediction methods, and is robust to data richness as well. Moreover, the usefulness of the proposed method HNEEM is validated by the case studies. In conclusion, HNEEM is a promising method for predicting gene-disease associations.

II. MATERIALS AND METHODS

A. Datasets

Researchers have constructed several databases, such as CTD, HumanNet, OMIM and BioGRID, which facilitate gene-disease association prediction. The Comparative Toxicogenomics Database (CTD) [] is a comprehensive database which supplies the associations among genes, diseases and chemicals. There are [...] gene-disease interactions, [...] gene-chemical interactions and [...] disease-chemical interactions in CTD's raw data. HumanNet [] is a probabilistic functional gene network of [...] validated protein-encoding genes of Homo sapiens (by NCBI, March [...]), constructed by a modified Bayesian integration of [...] types of 'omics' data from multiple organisms, with each data type weighted according to how well it links genes that are known to function together in H. sapiens. Each interaction in HumanNet has an associated log-likelihood score that measures the probability of an interaction representing a true functional linkage between two genes. OMIM [] is a comprehensive, authoritative compendium of human genes and genetic phenotypes. OMIM contains information on all known mendelian disorders and over [...] genes, and focuses on the relations between phenotype and genotype. BioGRID [] is an interaction repository with data compiled through searching [...] publications for [...] protein and genetic interactions, [...] chemical associations and [...] post-translational modifications from major model organism species.

We compile our dataset in several steps. First, we collect gene-disease associations, gene-chemical associations and disease-chemical associations from the Comparative Toxicogenomics Database (CTD), and only experimentally determined associations are adopted. Second, we collect gene-gene associations from the HumanNet database. These data are shown in Table I. We then construct a heterogeneous network of genes, diseases and chemicals and their relations. The network consists of three types of nodes (representing genes, diseases, and chemicals) and four types of links (representing disease-chemical interactions, gene-chemical interactions, gene-disease interactions, and gene-gene interactions). The heterogeneous network has [...] gene nodes, [...] disease nodes and [...] chemical nodes, [...] gene-disease interactions, [...] gene-chemical interactions, [...] gene-gene interactions and [...] disease-chemical interactions.
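The network construction described in the Datasets subsection can be illustrated with a minimal sketch. It assumes the associations are available as lists of identifier pairs grouped by edge type; the function names (`build_heterogeneous_network`, `count_nodes_by_type`) and the toy identifiers are hypothetical and are not taken from CTD, HumanNet, or the authors' code.

```python
from collections import defaultdict


def build_heterogeneous_network(associations):
    """Return an undirected adjacency map built from typed association lists.

    `associations` maps an edge type such as "gene-disease" to a list of
    (left_id, right_id) pairs. Node names are prefixed with their type so
    that a gene and a chemical sharing a raw identifier stay distinct.
    """
    adjacency = defaultdict(set)
    for edge_type, pairs in associations.items():
        left_type, right_type = edge_type.split("-")
        for left_id, right_id in pairs:
            u = f"{left_type}:{left_id}"
            v = f"{right_type}:{right_id}"
            adjacency[u].add(v)
            adjacency[v].add(u)
    return dict(adjacency)


def count_nodes_by_type(adjacency):
    """Count how many nodes of each type the network contains."""
    counts = defaultdict(int)
    for node in adjacency:
        counts[node.split(":", 1)[0]] += 1
    return dict(counts)


# Toy input with made-up identifiers (an assumption for illustration only).
toy_associations = {
    "gene-disease": [("G1", "D1"), ("G2", "D1")],
    "gene-chemical": [("G1", "C1")],
    "disease-chemical": [("D1", "C1")],
    "gene-gene": [("G1", "G2")],
}
network = build_heterogeneous_network(toy_associations)
node_counts = count_nodes_by_type(network)
print(node_counts)
```

An adjacency map of sets is used here only because it keeps the sketch dependency-free; the adjacency matrix required by the factorization-based embedding methods can be derived from it by fixing an ordering of the nodes.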

TABLE I. THE SUMMARY OF OUR DATASET
Data | Source | Description
disease-chemical interactions | CTD | interactions between diseases and chemicals
gene-chemical interactions | CTD | interactions between genes and chemicals
gene-disease interactions | CTD | interactions between genes and diseases
gene-gene interactions | HumanNet | interactions between genes

The data from Zeng et al. [] are also adopted for comparison. Zeng's data contain ten types of nodes: genes, diseases, compounds, gene families, substructures, side effects, pathways, tissues, gene ontology (GO), and cardiac outputs (CO). Zeng's data have eleven types of associations. Five types of associations are between compounds and genes, gene families, substructures, side effects and COs. The other six types are gene-gene associations, gene-family associations, gene-pathway associations, gene-GO associations, gene-disease associations and gene-tissue associations. Similarly, a heterogeneous network can be constructed on Zeng's dataset.

B. Graph Embedding Methods

A graph is an important data representation for many real-world problems. Graph embedding aims to learn a vector representation for each node in a graph. In recent years, graph embedding has gained more and more attention and has many useful applications, such as node classification, node recommendation and link prediction. We introduce several mathematical notations, shown in Table II. Given a graph $G = (V, E)$, a graph embedding method is a function $f$ which maps the graph's nodes to a low-dimensional feature space, $V_i \rightarrow U_i \in \mathbb{R}^d$, $d \ll |V|$.

TABLE II. MATHEMATICAL NOTATIONS FOR GRAPHS
Notations | Meaning
$G$ | The given graph; it refers to the representation of the heterogeneous network
$V$ | Set of vertices in the given graph
$E$ | Set of edges in the given graph
$W$ | The adjacent matrix of the graph $G$
$U$ | The representation matrix of the graph $G$
$V_i$ | The vertex numbered as $i$
$U_i$ | The representation vector of $V_i$


The graph embedding methods are generally categorized as factorization-based methods such as Laplacian Eigenmaps (LE) [], Graph Factorization (GF) [] and Higher-Order Proximity Preserved Embedding (HOPE) [], random walk-based methods such as DeepWalk [] and node2vec [], and deep learning-based methods such as Structural Deep

Network Embedding (SDNE) []. We briefly introduce these representative methods as follows.

A network or graph is represented by an adjacency matrix, whose rows and columns correspond to nodes. Factorization-based methods are based on matrix factorization approaches, and aim to map row vectors and column vectors into low-rank spaces. LE preserves the graph property based on the pairwise node similarities by minimizing the objective function

$$\frac{1}{2} \sum_{i,j} \left| U_i - U_j \right|^2 W_{ij}.$$

GF factorizes the adjacency matrix of the graph, and minimizes the following loss function:

$$\frac{1}{2} \sum_{(i,j) \in E} \left( W_{ij} - \langle U_i, U_j \rangle \right)^2 + \frac{\lambda}{2} \sum_{i} \left\| U_i \right\|^2,$$

where $\langle U_i, U_j \rangle$ is the inner product of $U_i$ and $U_j$, and $\lambda$ is a regularization coefficient. High-Order Proximity preserved Embedding (HOPE) defines a proximity matrix $S$ and a representation matrix $U = [U^s, U^t]$, where $U^s$ and $U^t$ are the source embedding vectors and target embedding vectors. The goal of HOPE is minimizing the objective function $\| S - U^s \cdot {U^t}^{\top} \|_F^2$, where $S = (I - \beta W)^{-1} \beta W$, $\beta$ is a decay parameter and $I$ is an identity matrix.

The random walk-based methods utilize random walks on graphs to approximate properties (such as centrality and similarity) of the graphs, and they are suitable for large-scale graphs. DeepWalk preserves higher-order proximity between nodes by maximizing the probability of observing the last $k$ nodes and the next $k$ nodes in the random walk centered at $V_i$. Similarly, node2vec preserves higher-order proximity between nodes by maximizing the probability of occurrence of subsequent nodes in fixed-length random walks, but node2vec employs a biased random walk that provides a trade-off between breadth-first (BFS) and depth-first (DFS) graph searches.

Deep learning-based methods apply deep neural networks to graphs. Structural Deep Network Embedding (SDNE) uses an autoencoder with $N$ layers to preserve the first- and second-order network proximities. The goal of the autoencoder is to minimize the reconstruction error of the output by jointly optimizing the two proximities, and the objective function is

$$\left\| (\hat{X} - X) \odot B \right\|_F^2 + \alpha \sum_{i,j=1}^{n} W_{ij} \left\| y_i - y_j \right\|_2^2 + \frac{\nu}{2} \sum_{k=1}^{N} \left( \left\| K^{k} \right\|_F^2 + \left\| \hat{K}^{k} \right\|_F^2 \right),$$

where $K^{k}$ and $\hat{K}^{k}$ are the $k$-th layer weight matrices of the autoencoder, $X = W$ and each row $x_i = G_{i,:}$ corresponds to the initial representation of the node $i$, and $\hat{X}$ represents the reconstructed vectors for nodes in $X$. $y_i$ is the low-rank representation of node $i$. $\odot$ is the Hadamard product. $B$ is an $n \times n$ matrix with $B_{ij} = 1$ if $W_{ij} = 0$, else $B_{ij} = \beta$, where $\beta$ is a free parameter and $\beta > 1$.
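To make the two simplest factorization objectives concrete, the following sketch evaluates the LE objective and the GF loss for a given adjacency matrix and a given embedding. It only computes the objective values and does not perform the optimization; the helper names, the toy matrix, the toy embedding, and the value of the regularization coefficient are all assumptions chosen for illustration.

```python
def squared_distance(a, b):
    """Squared Euclidean distance between two vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def dot(a, b):
    """Inner product of two vectors."""
    return sum(x * y for x, y in zip(a, b))


def le_objective(W, U):
    """LE objective: one half of the sum over i, j of W_ij * |U_i - U_j|^2."""
    n = len(W)
    return 0.5 * sum(
        W[i][j] * squared_distance(U[i], U[j])
        for i in range(n)
        for j in range(n)
    )


def gf_loss(W, U, edges, lam):
    """GF loss: squared fit over the edges plus an L2 penalty on the embeddings."""
    fit = 0.5 * sum((W[i][j] - dot(U[i], U[j])) ** 2 for i, j in edges)
    penalty = 0.5 * lam * sum(dot(u, u) for u in U)
    return fit + penalty


# Toy path graph 0 - 1 - 2 with a hand-picked 2-dimensional embedding.
W = [[0, 1, 0],
     [1, 0, 1],
     [0, 1, 0]]
U = [[1.0, 0.0],
     [0.0, 1.0],
     [1.0, 1.0]]
edges = [(0, 1), (1, 2)]

le_value = le_objective(W, U)
gf_value = gf_loss(W, U, edges, lam=0.1)
print(le_value, gf_value)
```

An actual embedding method would search for the matrix `U` that minimizes these quantities (LE additionally needs a scale constraint to exclude the trivial all-equal solution); the sketch is meant only to show how each term of the formulas maps onto the data.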



Fig. 1. Flowchart of the Heterogeneous Network-based Embedding Ensemble Method


C. Heterogeneous Network-based Representation Learning Ensemble Method for the Gene-disease Association Prediction

The heterogeneous network-based embedding ensemble method (HNEEM) for gene-disease association prediction is demonstrated in Fig. 1. HNEEM takes several steps to construct a prediction model, including the heterogeneous network construction, the graph embedding-based vector learning, the construction of base predictors and the ensemble of base predictors.

First, a heterogeneous network is constructed by combining gene-disease associations, chemical-disease associations and gene-chemical associations. The heterogeneous network uses genes, chemicals and diseases as nodes, and uses the associations as edges. Clearly, the heterogeneous network contains diverse information and different relations.

Second, six state-of-the-art embedding methods, including LE, GF, HOPE (factorization-based methods), DeepWalk, node2vec (random walk-based methods) and SDNE (deep learning-based method), are considered to learn the embedding vectors of nodes in the heterogeneous network. Since they are representative graph embedding methods, we can compare their performances on gene-disease association prediction. We merge the embedding vectors of gene nodes and disease nodes to represent gene-disease pairs.

Third, the known gene-disease pairs are taken as positive instances, and other gene-disease pairs are taken as negative instances. The classification models can be constructed based on the two classes of instances. Considering accuracy and efficiency, we chose random forest as the classification engine. Random forest is an ensemble learning method containing multiple classification trees. Each tree is constructed by using a bootstrap sample. For each node within each tree, a randomly selected subset of the input features is used. The embedding vectors produced by different graph embedding methods reflect different characteristics of nodes and provide diverse information. In this way, we construct six random forest-

based prediction models by using six types of learned embedding vectors for gene-disease pairs. These methods could perform differently when dealing with data with noise and heteroscedasticity.

At last, all base predictors are combined to develop the ensemble method HNEEM. Previous studies [] have demonstrated that combining diverse features could lead to high-accuracy performances. When making predictions for a gene and a disease, this gene-disease pair is first represented by six representation vectors, and then it is scored by six base predictors that take the representation vectors as inputs. The final prediction is yielded by averaging the six scores from the base predictors.

III. RESULTS AND DISCUSSIONS

A. Evaluation Metrics

Here, we conduct five-fold cross-validation (CV) to evaluate the performances of prediction models. All gene-disease associations are used as positive instances, and the same number of non-associated gene-disease pairs are randomly selected as negative instances. All positive instances and negative instances are divided into five mutually exclusive subsets. In each fold, four subsets are used as the training set, and the one left out is used as the testing set. The embedding vectors are learned from the heterogeneous network, which uses associations in the training set as edges. The classification-based models are constructed on the positive instances and negative instances from the training set, and then make predictions for instances in the testing set. The training-testing procedure is repeated until every subset has been used for testing. Several evaluation metrics are adopted to score prediction results, including the area under the precision-recall curve (AUPR), the area under the receiver-operating characteristic curve (AUC), Matthews correlation coefficient (MCC), accuracy (ACC), F-measure (F), recall (REC) and precision (PRE). To avoid data bias, we conduct [...] independent runs of CV for each prediction model, and adopt the average performances.

B. Performances of Graph Embedding Methods

HNEEM makes use of different embedding methods (LE, GF, HOPE, DeepWalk, node2vec and SDNE) to extract feature vectors and construct base predictors, and then ensembles them to develop the prediction model. Graph embedding methods are key components of HNEEM. Here, we evaluate the base predictors based on individual graph embedding methods, and compare their performances on gene-disease association prediction.

Here, we mainly discuss the influence of the dimension of embedding vectors, and set other parameters according to the descriptions in the original publications. We consider dimensions of embedding vectors ranging from 32 to 512 ($2^i$, $i = 5, \ldots, 9$), and the AUC scores and AUPR scores of the corresponding models evaluated by CV on our dataset are demonstrated in Fig. 2. Clearly, these embedding methods produce high-accuracy results (AUPR more than [...] and AUC more than [...]), but have different performances. Some embedding methods (GF and HOPE) produce better results when the dimension of embedding vectors increases; others (DeepWalk, node2vec and SDNE) produce lower results when the dimension increases. LE is likely to produce similar results when varying the dimension of embedding vectors.
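The evaluation protocol of the Evaluation Metrics subsection (balanced negative sampling followed by five mutually exclusive folds) can be sketched as follows. This is a simplified illustration under stated assumptions: the gene and disease identifiers are made up, the helper names are hypothetical, and the embedding and classification steps inside each fold are left out.

```python
import random


def sample_negatives(genes, diseases, positives, rng):
    """Draw as many non-associated gene-disease pairs as there are positives."""
    positive_set = set(positives)
    candidates = [(g, d) for g in genes for d in diseases
                  if (g, d) not in positive_set]
    return rng.sample(candidates, len(positive_set))


def k_fold_splits(instances, rng, k=5):
    """Yield (train, test) lists for k mutually exclusive folds."""
    shuffled = list(instances)
    rng.shuffle(shuffled)
    folds = [shuffled[i::k] for i in range(k)]
    for i in range(k):
        train = [x for j, fold in enumerate(folds) if j != i for x in fold]
        yield train, folds[i]


rng = random.Random(0)  # fixed seed so the sketch is repeatable
genes = [f"G{i}" for i in range(5)]
diseases = [f"D{i}" for i in range(4)]
positives = [(g, d) for g in genes for d in diseases][::2]  # 10 toy positives
negatives = sample_negatives(genes, diseases, positives, rng)
instances = [(pair, 1) for pair in positives] + [(pair, 0) for pair in negatives]

splits = list(k_fold_splits(instances, rng))
for train, test in splits:
    # Only the positive pairs in `train` would be kept as gene-disease edges
    # when the embeddings for this fold are learned, so that test
    # associations do not leak into the network.
    train_edges = [pair for pair, label in train if label == 1]
    print(len(train), len(test), len(train_edges))
```

Repeating the whole procedure with different seeds and averaging the metrics would correspond to the independent runs of CV mentioned in the text.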


Fig. 2. Dimensions of embedding vectors and performances of corresponding models


Since the dimension of embedding vectors has a different influence on each graph embedding method, we consistently set the dimension of embedding vectors to [...] for all embedding methods in the following study, to avoid overestimated performances.

C. Performances of HNEEM

In this section, we evaluate and discuss HNEEM, which ensembles the results of multiple graph embedding methods. All individual graph embedding-based prediction models and HNEEM models are evaluated by CV under the same conditions.

Table III shows the results of individual graph embedding-based prediction models and HNEEM models on our dataset. Among these models, SDNE, whose AUPR score is [...] and AUC score is [...], produces the best performances, followed by DeepWalk, whose AUPR score is [...] and AUC score is [...]. In decreasing order of AUC score, the other methods are ranked as node2vec, GF, LE and HOPE. Clearly, HNEEM produces an AUPR score of [...] and an AUC score of [...], which are higher than those of each base predictor. Since we repeat [...] runs of CV for each model, we conduct a t-test to test the difference between HNEEM and the base predictors in terms of AUPR scores. The results demonstrate that the results of HNEEM are significantly different (p [...]) from those of the HOPE-based predictor ([...]), LE-based predictor ([...]), DeepWalk-based predictor ([...]), node2vec-based predictor ([...]), SDNE-based predictor ([...]) and GF-based predictor ([...]).
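The ensemble step evaluated in this subsection (one base predictor per embedding method, final score obtained by averaging) can be sketched as below. Two points are assumptions rather than details confirmed by the text: the gene and disease vectors are merged by concatenation, and the base predictors are represented by arbitrary callables that stand in for trained random forests. All names and toy values are hypothetical.

```python
def pair_features(gene_vector, disease_vector):
    """Merge a gene vector and a disease vector (concatenation is assumed)."""
    return list(gene_vector) + list(disease_vector)


def ensemble_score(gene, disease, embeddings, predictors):
    """Average the scores of the base predictors for one gene-disease pair.

    `embeddings[method]` maps a node name to its embedding vector and
    `predictors[method]` maps a pair feature vector to a score.
    """
    scores = []
    for method, predictor in predictors.items():
        vectors = embeddings[method]
        features = pair_features(vectors[gene], vectors[disease])
        scores.append(predictor(features))
    return sum(scores) / len(scores)


# Toy embeddings for two of the six methods, with made-up values.
embeddings = {
    "LE": {"G1": [0.2, 0.4], "D1": [0.6, 0.8]},
    "GF": {"G1": [1.0, 0.0], "D1": [0.0, 1.0]},
}
# Stand-ins for trained random forests; each returns a score for the pair.
predictors = {
    "LE": lambda features: max(features),
    "GF": lambda features: min(features),
}
score = ensemble_score("G1", "D1", embeddings, predictors)
print(score)
```

With trained classifiers, each callable would typically return the predicted probability of the positive class, so that the averaged value stays in the interval from 0 to 1.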

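TABLE III. PERFORMANCES OF INDIVIDUAL GRAPH EMBEDDING-BASED MODELS AND HNEEM ON OUR DATASET EVALUATED BY [...] RUNS OF CV
Methods | AUPR | AUC | PRE | REC | ACC | MCC | F
HOPE | [...] | [...] | [...] | [...] | [...] | [...] | [...]
LE | [...] | [...] | [...] | [...] | [...] | [...] | [...]
DeepWalk | [...] | [...] | [...] | [...] | [...] | [...] | [...]
Node2vec | [...] | [...] | [...] | [...] | [...] | [...] | [...]
SDNE | [...] | [...] | [...] | [...] | [...] | [...] | [...]
GF | [...] | [...] | [...] | [...] | [...] | [...] | [...]
HNEEM | [...] | [...] | [...] | [...] | [...] | [...] | [...]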

Moreover, we also evaluate the individual graph embedding-based models and the ensemble model HNEEM on Zeng's dataset, and the results are shown in Table IV. Clearly, individual graph embedding-based models produce different performances on Zeng's dataset. A heterogeneous network can also be constructed from Zeng's dataset. The network has [...] nodes with about [...] edges, while more than [...] nodes have no edge at all. Therefore, we observe that the results on Zeng's dataset are usually worse than the results on our dataset. Among individual graph embedding-based models, GF produces the best performances (AUPR [...], AUC [...]), and LE produces the worst performances (AUPR [...], AUC [...]). HNEEM again produces better results (AUPR [...], AUC [...]) than all individual graph embedding-based models.

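TABLE IV. PERFORMANCES OF INDIVIDUAL GRAPH EMBEDDING-BASED MODELS AND HNEEM ON ZENG'S DATASET EVALUATED BY [...] RUNS OF CV
Methods | AUPR | AUC | PRE | REC | ACC | MCC | F
HOPE | [...] | [...] | [...] | [...] | [...] | [...] | [...]
LE | [...] | [...] | [...] | [...] | [...] | [...] | [...]
DeepWalk | [...] | [...] | [...] | [...] | [...] | [...] | [...]
Node2vec | [...] | [...] | [...] | [...] | [...] | [...] | [...]
SDNE | [...] | [...] | [...] | [...] | [...] | [...] | [...]
GF | [...] | [...] | [...] | [...] | [...] | [...] | [...]
HNEEM | [...] | [...] | [...] | [...] | [...] | [...] | [...]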

The number of edges in a heterogeneous network may influence the embedding vectors, and thus has an impact on the follow-up prediction models. To test the robustness of HNEEM to the number of network edges, we remove [...] of the gene-disease associations from our dataset, respectively, and then conduct CV to evaluate HNEEM on these datasets with fewer associations. As shown in Table V, the performances of HNEEM will increase as more associations are removed. For example, the AUPR scores increase from [...] to [...] by removing [...] to [...] of the associations in the heterogeneous network, and the AUC scores increase from [...] to [...] by removing [...] to [...] of the associations. This result demonstrates that HNEEM also produces robust results on datasets with fewer associations.
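The robustness experiment relies on producing reduced copies of the gene-disease association list. A minimal sketch of that step is shown below; the helper name `remove_fraction`, the toy association list, and the two fractions are assumptions for illustration and are not the percentages used in the study.

```python
import random


def remove_fraction(associations, fraction, rng):
    """Return a copy of `associations` with a random fraction of pairs removed."""
    if not 0.0 <= fraction < 1.0:
        raise ValueError("fraction must be in [0, 1)")
    n_remove = int(round(len(associations) * fraction))
    removed = set(rng.sample(range(len(associations)), n_remove))
    return [pair for index, pair in enumerate(associations)
            if index not in removed]


rng = random.Random(0)  # fixed seed so the sketch is repeatable
# 100 toy gene-disease pairs with made-up identifiers.
associations = [(f"G{i}", f"D{i % 7}") for i in range(100)]

# Hypothetical fractions, chosen only to illustrate the procedure.
reduced = {fraction: remove_fraction(associations, fraction, rng)
           for fraction in (0.1, 0.2)}
for fraction, kept in reduced.items():
    print(fraction, len(kept))
```

Each reduced list would then replace the full gene-disease edge set before the network is rebuilt and the cross-validation is rerun.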

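TABLE V. PERFORMANCES OF HNEEM ON OUR DATASET WHEN REMOVING LESS GENE-DISEASE ASSOCIATIONS
Percentage | AUPR | AUC | PRE | REC | ACC | MCC | F
[...] | [...] | [...] | [...] | [...] | [...] | [...] | [...]
[...] | [...] | [...] | [...] | [...] | [...] | [...] | [...]
[...] | [...] | [...] | [...] | [...] | [...] | [...] | [...]
[...] | [...] | [...] | [...] | [...] | [...] | [...] | [...]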

D. Comparison with Other State-of-the-art Methods

In this section, we compare HNEEM with several state-of-the-art gene-disease association prediction methods,

including CATAPULT [], Katz [], ISL [] and Know-GENE []. All these methods extract feature vectors for gene-disease pairs from the heterogeneous network, so we can fairly compare them.

CATAPULT, Katz, ISL and Know-GENE are all heterogeneous network-based methods. CATAPULT considered walks of specified lengths on a heterogeneous network to extract features, and then adopted SVM to build prediction models. The Katz method calculated the lengths of walks from one node to another and used them to measure gene-disease distances, which can be taken as scores for the gene-disease associations. ISL defined specified paths for a random walk and calculated the nodes' similarities in the heterogeneous network, and the scores for gene-disease associations are merged as feature vectors to train the SVM-based prediction model. Know-GENE calculated gene-gene similarity and disease-disease similarity by counting common neighbors of genes and diseases, and the similarities are merged as feature vectors for the decision tree to build prediction models. All methods are evaluated by five-fold cross-validation on our dataset. According to the results in Table VI, HNEEM produces better results than the four benchmark methods in terms of different evaluation metrics. Clearly, HNEEM and the four compared methods obtain feature vectors from different aspects, and the superiority of HNEEM lies in the graph embedding, which can better reflect the characteristics of the heterogeneous network.

Further, we conduct a statistical analysis to test the differences between HNEEM and the compared methods. The results demonstrate that the results of HNEEM are significantly different (p [...]) from those of CATAPULT ([...]), Katz ([...]), ISL ([...]) and Know-GENE ([...]). Therefore, HNEEM produces significantly better results than other state-of-the-art methods.
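Of the benchmark methods described in this subsection, the Katz method is the easiest to illustrate. The sketch below computes the standard truncated Katz index, in which the number of walks of each length between two nodes is weighted by a decay factor raised to that length. This is the textbook form of the measure and is only assumed to approximate the variant used in the compared method; the function names, the decay value and the toy graph are hypothetical.

```python
def matmul(A, B):
    """Multiply two square matrices given as lists of lists."""
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def truncated_katz(A, beta, max_length):
    """Sum of beta**l * (A**l) over walk lengths l = 1 .. max_length."""
    n = len(A)
    scores = [[0.0] * n for _ in range(n)]
    power = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    for length in range(1, max_length + 1):
        power = matmul(power, A)  # power[i][j] counts walks of this length
        weight = beta ** length
        for i in range(n):
            for j in range(n):
                scores[i][j] += weight * power[i][j]
    return scores


# Toy path graph: gene 0 - gene 1 - disease 2 (made-up example).
A = [[0, 1, 0],
     [1, 0, 1],
     [0, 1, 0]]
katz = truncated_katz(A, beta=0.5, max_length=3)
print(katz[0][2], katz[0][1])
```

In the toy graph, gene 0 has no direct link to disease 2, yet it receives a non-zero score through the length-2 walk via gene 1, which is the kind of indirect evidence such walk-based scores are meant to capture.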

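TABLE VI. COMPARISON BETWEEN HNEEM AND OTHER STATE-OF-THE-ART METHODS
Methods | AUPR | AUC | PRE | REC | ACC | MCC/SPEC | F
HNEEM | [...] | [...] | [...] | [...] | [...] | [...] | [...]
CATAPULT | [...] | [...] | [...] | [...] | [...] | [...] | [...]
Katz | [...] | [...] | [...] | [...] | [...] | [...] | [...]
ISL | [...] | [...] | [...] | [...] | [...] | [...] | [...]
Know-GENE | [...] | [...] | [...] | [...] | [...] | [...] | [...]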

E. Case study

In this section, we use case studies to demonstrate the capability of our ensemble model in predicting unknown gene-disease associations. We construct the HNEEM model using all associations in our dataset, and then make predictions for all non-associated gene-disease pairs.

We check the top 10 gene-disease pairs scored by HNEEM, and try to find evidence to support our findings. As shown in Table VII, all 10 gene-disease pairs are confirmed to be novel associations, and supporting materials are also included. For example, IL[...] was discovered to be involved in the pathogenesis of inflammatory bowel disease and colon cancer []; it was reported [] that the expression of CDKN[...]A is significant in breast cancer and its prognosis; the study [] revealed associations between SOD[...] and lung cancer. The 5th and 8th associations are also related to IL[...], showing that liver cirrhosis or arsenic poisoning might be connected with this gene's existence or changes. The other predicted associations are mainly about other cancers, such as prostatic neoplasms, liver cancer and breast cancer.
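The case-study procedure (score every non-associated gene-disease pair with the trained model and inspect the highest-ranked ones) can be sketched as follows. The scoring function here is a lookup table that stands in for the trained HNEEM model; all identifiers, scores and the helper name `top_k_novel_pairs` are hypothetical.

```python
def top_k_novel_pairs(genes, diseases, known_pairs, score_fn, k):
    """Rank all gene-disease pairs not in `known_pairs` by descending score."""
    known = set(known_pairs)
    candidates = [(g, d) for g in genes for d in diseases
                  if (g, d) not in known]
    ranked = sorted(candidates, key=lambda pair: score_fn(*pair), reverse=True)
    return ranked[:k]


genes = ["G1", "G2", "G3"]
diseases = ["D1", "D2"]
known_pairs = [("G1", "D1"), ("G2", "D2")]

# Made-up scores standing in for the output of the trained ensemble model.
toy_scores = {
    ("G1", "D2"): 0.91,
    ("G2", "D1"): 0.35,
    ("G3", "D1"): 0.78,
    ("G3", "D2"): 0.12,
}
top_pairs = top_k_novel_pairs(
    genes, diseases, known_pairs,
    score_fn=lambda g, d: toy_scores[(g, d)],
    k=2,
)
print(top_pairs)
```

The returned pairs are the candidates that would then be checked against the literature, as done for the entries of Table VII.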

TABLE VII. TOP 10 GENE-DISEASE ASSOCIATIONS PREDICTED BY HNEEM
No. | Genes | Diseases | Evidence
1 | IL[...] | Colonic Neoplasms | Involvement of IL[...] in the pathogenesis of inflammatory bowel disease and colon cancer []
2 | CDKN[...]A | Breast Neoplasms | Expression of CDKN[...]A/p[...] and TGFBR[...] in breast cancer and their prognostic significance []
3 | SOD[...] | Lung Neoplasms | Association between SOD[...] C[...]T polymorphism and lung cancer susceptibility: a meta-analysis []
4 | IFNG | Prostatic Neoplasms | IFN-gamma (IFNG) sensitization of prostate cancer cells to Fas-mediated death: a gene therapy approach []
5 | IL[...] | Liver Cirrhosis, Experimental | Impaired hepatic removal of interleukin-[...] (IL[...]) in patients with liver cirrhosis []
6 | TGFB[...] | Breast Neoplasms | High serum transforming growth factor beta [...] (TGFB[...]) level predicts better survival in breast cancer []
7 | TNF | Prostatic Neoplasms | Paradoxical Roles of Tumour Necrosis Factor-Alpha (TNFα) in Prostate Cancer Biology []
8 | IL[...] | Arsenic Poisoning | Implications of oxidative stress and hepatic cytokine (TNF-α and IL-[...]) response in the pathogenesis of hepatic collagenesis in chronic arsenic toxicity []
9 | TNF | Carcinoma, Squamous Cell | Salivary concentration of TNFalpha, IL[...] alpha, IL[...], and IL[...] in oral squamous cell carcinoma []
10 | IL[...]B | Carcinoma, Hepatocellular | The TNF-α, IL-[...]B and IL-[...] polymorphisms and risk for hepatocellular carcinoma: a meta-analysis []

Moreover, we pay attention to the predicted gene associations with diseases of wide interest. Lung cancer is a malignant lung tumor characterized by uncontrolled cell growth in tissues of the lung. In [...], lung cancer occurred in [...] million people and resulted in [...] million deaths. Lung cancer is the most common cause of cancer-related death in men and the second most common in women. Overall, [...] of people in the United States survive five years after the

diagnosis. Breast cancer is the leading type of cancer in women, accounting for [...] of all cases, and it is most common in women over age [...]. Cardiomyopathy is a group of diseases that affect the heart muscle. Those diseases may increase the risk of sudden cardiac death. We list the top 5 genes predicted by HNEEM to be associated with these three diseases in Table VIII, and the evidence about these associations is also provided. Among the predictions for Lung Neoplasms, three associated genes can be confirmed; all predicted genes have been discovered to be associated with Breast Neoplasms; four genes have been confirmed for Cardiomyopathies.

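TABLE VIII. TOP 5 GENES ASSOCIATED WITH THREE DISEASES PREDICTED BY HNEEM
Disease | Gene | Evidence
Lung Neoplasms | SOD[...] | Association between SOD[...] C[...]T polymorphism and lung cancer susceptibility: a meta-analysis []
Lung Neoplasms | PTGS[...] | A genetic polymorphism in prostaglandin synthase [...] ([...] T→C) and the risk of lung cancer []
Lung Neoplasms | CXCL[...] | N/A
Lung Neoplasms | PPARG | N/A
Lung Neoplasms | STAT[...] | Orally bioavailable small-molecule inhibitor of transcription factor Stat[...] regresses human breast and lung cancer xenografts []
Breast Neoplasms | CDKN[...]A | Expression of CDKN[...]A/p[...] and TGFBR[...] in breast cancer and their prognostic significance []
Breast Neoplasms | TGFB[...] | Association of TGFB[...] [...]C>T polymorphism with breast cancer: evidence from a meta-analysis involving [...] subjects []
Breast Neoplasms | MYC | Alterations to either c-erbB-[...] (neu) or c-myc proto-oncogenes in breast carcinomas correlate with poor short-term prognosis []
Breast Neoplasms | VEGFA | Tumor-Specific VEGF-A and VEGFR[...]/KDR Protein are Co-expressed in Breast Cancer []
Breast Neoplasms | AKR[...]C[...] | Expression of progesterone metabolizing enzyme genes (AKR[...]C[...], AKR[...]C[...], AKR[...]C[...], SRD[...]A[...], SRD[...]A[...]) is altered in human breast carcinoma []
Cardiomyopathies | BCL[...] | Expression of Bcl-[...] and microRNAs in cardiac tissues of patients with dilated cardiomyopathy []
Cardiomyopathies | CCL[...] | MCP-[...]/CCL[...] as a therapeutic target in myocardial infarction and ischemic cardiomyopathy []
Cardiomyopathies | VEGFA | Post-transcriptional modifications of VEGF-A mRNA in non-ischemic dilated cardiomyopathy []
Cardiomyopathies | TP[...] | N/A
Cardiomyopathies | TGFB[...] | Transforming growth factor (TGF)-β signaling in cardiac remodeling []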


IV. CONCLUSION

Predicting gene-disease associations attracts great attention from the scientific community. In this work, we take advantage of graph embedding learning methods, and propose a graph embedding-based ensemble model to predict gene-disease associations. A heterogeneous network is constructed by integrating gene-disease associations, gene-chemical associations and disease-chemical associations, and the heterogeneous network contains comprehensive information about biological objects and their relations. The embedding learning methods learn the representation vectors of genes and diseases from the graph, which integrates multiple data sources. Since there are a variety of popular graph embedding learning methods, we compare their performances in gene-disease association prediction, and then integrate them to develop the ensemble learning model for improved performance. The experimental results demonstrate that the embedding methods have great potential in analyzing network data, and can be applied to other similar tasks, such as miRNA-disease association prediction, drug-disease association prediction and drug-drug interaction prediction.

ACKNOWLEDGMENT

This work is supported by the National Natural Science Foundation of China ([...]), the National Key Research and Development Program ([...]YFC[...]), and the Huazhong Agricultural University Scientific & Technological Self-innovation Foundation. The funders have no role in study design, data collection, data analysis, data interpretation, or writing of the manuscript.

REFERENCES

Boycott KM, Vanstone MR, Bulman DE, MacKenzie AE: Rare-disease genetics in the era of next-generation sequencing: discovery to translation. Nature Reviews Genetics.
Singh-Blom UM, Natarajan N, Tewari A, Woods JO, Dhillon IS, Marcotte EM: Prediction and validation of gene-disease associations using methods inspired by social network analyses. PloS one.
Natarajan N, Dhillon IS: Inductive matrix completion for predicting gene-disease associations. Bioinformatics.
Zhou H, Skolnick J: A knowledge-based approach for predicting gene-disease associations. Bioinformatics.
Zeng XX, Ding NX, Zou Q: Latent factor model with heterogeneous similarity regularization for predicting gene-disease associations. Ieee Int C Bioinform.
Chen Y, Xu R: Context-sensitive network-based disease genetics prediction and its implications in drug discovery. Bioinformatics.
Zeng X, Ding N, Rodriguez-Paton A, Zou Q: Probability-based collaborative filtering model for predicting gene-disease associations. BMC medical genomics (Suppl).
Zhou J, Fu BQ: The research on gene-disease association based on text-mining of PubMed. Bmc Bioinformatics.
Davis AP, Grondin CJ, Johnson RJ, Sciaky D, McMorran R, Wiegers J, Wiegers TC, Mattingly CJ: The Comparative Toxicogenomics Database: update [...]. Nucleic acids research.
Lee I, Blom UM, Wang PI, Shim JE, Marcotte EM: Prioritizing candidate disease genes by network-based boosting of genome-wide association data. Genome research.
Amberger JS, Hamosh A: Searching Online Mendelian Inheritance in Man (OMIM): A Knowledgebase of Human Genes and Genetic Phenotypes. Curr Protoc Bioinformatics.
Oughtred R, Stark C, Breitkreutz BJ, Rust J, Boucher L, Chang C, Kolas N, O'Donnell L, Leung G, McAdam R et al: The BioGRID interaction database: [...] update. Nucleic acids research.
Mikhail B, Partha N: Laplacian eigenmaps and spectral techniques for embedding and clustering. In: Proceedings of

Authorized licensed use limited to: Technische Informationsbibliothek (TIB). Downloaded on November 29,2023 at 10:36:27 UTC from IEEE Xplore. Restrictions apply.