Global Sequence Alignment and Local Sequence Alignment
Pham Trung Dung, Military Technical Academy; Tran Hoai Linh, Hanoi University of Science and Technology; Dang Thuy Hang, Military Technical Academy. e-Mail: Hangdtys@gmail.com
Abstract
The Human Genome Project was established (1997), and the sequencing of all 24 human chromosomes was completed by the end of 2000, making biological information increasingly rich and diverse. Understanding this information contributes greatly to human health care, for example in diagnosis, prevention, and treatment, and thereby to improving quality of life and protecting the natural environment. When a gene is discovered, one important requirement is to determine its function [6]. This paper presents an approach that assesses the similarity (homology) of nucleotide sequences based on an evaluation of the corresponding amino acid sequences.
1. Introduction
Human evolution is a process of diverse change: under evolutionary pressure, an ancestral gene (a DNA sequence) is modified and comes to differ from the original gene. Judging the similarity of gene segments and gene sequences is therefore a major problem. In research on the structure and function of genes and proteins, sequence analysis (of DNA and protein sequences) plays an important role. To simplify the study, DNA and protein sequences are serialized and handled as strings of characters [1]. A new DNA or protein sequence is then compared with known DNA or protein sequences and their similarity (homology) is assessed, which makes it possible to predict the function and structure of newly discovered genes. The pairwise sequence alignment (PSA) problem is posed to address this. From a biological standpoint, sequence comparison reflects the process of change and natural selection acting on sequences, which allows biologists to draw conclusions about the origin of gene segments, DNA, RNA, or proteins.
Figure 1. Molecular structure of DNA. The bases on the two strands pair with each other through hydrogen bonds.
Meanwhile, protein is the expression of living matter: it takes part in almost every biological process and underlies the structural and functional diversity of all organisms. In living systems, a protein is produced by translation from an expressed gene segment that carries the genetic information in DNA. A protein is a sequence of amino acids joined by peptide bonds to form a structure, which is described at several levels: primary and secondary structure, and the spatial structures of levels 3, 4, and 5.
VCCA-2011
There are 20 kinds of amino acid in the protein molecules of organisms, but only 4 different nucleotide bases in an RNA molecule. Nature therefore cannot simply use a single base to encode each amino acid during protein synthesis. During translation, the bases of the mRNA are read in groups of three, known as triplet codes (codons). Each codon stands for one specific amino acid. Because there are 4 nucleotide bases, there are 4 x 4 x 4 = 64 possible codons. However, there are only 20 amino acids, each abbreviated by a single letter and each encoded by triplets of nucleotides. The set of amino acid symbols is AA = {A, C, D, E, F, G, H, I, K, L, M, N, P, Q, R, S, T, V, W, Y}.
At least two further amino acids are encoded by DNA in a different (non-standard) way. Selenocysteine is incorporated into some proteins at the UGA codon, which normally acts as a stop codon. Pyrrolysine is used by some methanogens in the enzymes they use to produce methane. It is encoded in a way similar to selenocysteine, but by the codon UAG. Other amino acids found in proteins are usually formed by post-translational modification, which is often essential for the function of the protein.
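The codon arithmetic above can be checked with a short sketch. The example below builds the standard genetic code table from a compact string and translates a made-up mRNA fragment. The function name `translate` and the example sequence are illustrative assumptions, not taken from the paper, and the sketch covers only the standard code, so UGA and UAG are treated as plain stop codons.

```python
from itertools import product

BASES = "UCAG"  # the four RNA bases, in the conventional table order

# Standard genetic code, listed in the same U, C, A, G order for each
# codon position; '*' marks a stop codon.
AMINO_ACIDS = (
    "FFLLSSSSYY**CC*W"
    "LLLLPPPPHHQQRRRR"
    "IIIMTTTTNNKKSSRR"
    "VVVVAAAADDEEGGGG"
)

# Map every triplet (UUU, UUC, ..., GGG) to its one-letter amino acid.
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(mrna: str) -> str:
    """Translate an mRNA string codon by codon, stopping at the first stop codon.

    Hypothetical helper for illustration; it reads from position 0 and
    ignores any trailing bases that do not form a full codon.
    """
    protein = []
    for i in range(0, len(mrna) - 2, 3):
        aa = CODON_TABLE[mrna[i:i + 3]]
        if aa == "*":
            break
        protein.append(aa)
    return "".join(protein)


if __name__ == "__main__":
    print(len(CODON_TABLE))                       # number of codons
    print(len(set(CODON_TABLE.values()) - {"*"}))  # number of amino acids
    print(translate("AUGGCCUGGUAA"))              # made-up example fragment
```

Because 64 codons map onto 20 amino acids plus the stop signal, most amino acids are encoded by more than one codon, which is one reason comparing amino acid sequences can reveal similarity that a base-by-base comparison understates.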
During the evolution of organisms, sequences can gain or lose elements, so closely related organisms may have sequences that differ where material has been inserted in the middle of the sequence.
When searching for similarity, the case found to be most similar (the one with the highest score) is chosen. There are generally two ways to compare sequences:
- Global alignment: usually used when the sequences being compared are of nearly equal size and are highly similar. In this case, similarity is evaluated over the entire length of the two sequences being compared.
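As a minimal sketch of "choosing the highest-scoring case", the code below scores two sequences with the standard Needleman-Wunsch recurrence for global alignment and the standard Smith-Waterman recurrence for local alignment, the two kinds of comparison named in the title. The scoring values (match +1, mismatch -1, linear gap -1) and the function names are assumptions chosen for illustration. The paper's own scoring scheme is not given in the surviving text, and in practice amino acid sequences are usually scored with a substitution matrix instead.

```python
def global_score(a: str, b: str, match: int = 1, mismatch: int = -1, gap: int = -1) -> int:
    """Needleman-Wunsch: best score when all of a is aligned against all of b."""
    # Row 0: aligning the empty prefix of a with b[:j] costs j gaps.
    prev = [j * gap for j in range(len(b) + 1)]
    for i in range(1, len(a) + 1):
        curr = [i * gap]  # column 0: a[:i] against the empty prefix of b
        for j in range(1, len(b) + 1):
            sub = match if a[i - 1] == b[j - 1] else mismatch
            curr.append(max(prev[j - 1] + sub,   # align a[i-1] with b[j-1]
                            prev[j] + gap,       # gap in b
                            curr[j - 1] + gap))  # gap in a
        prev = curr
    return prev[-1]


def local_score(a: str, b: str, match: int = 1, mismatch: int = -1, gap: int = -1) -> int:
    """Smith-Waterman: best score over any pair of substrings of a and b."""
    best = 0
    prev = [0] * (len(b) + 1)
    for i in range(1, len(a) + 1):
        curr = [0]
        for j in range(1, len(b) + 1):
            sub = match if a[i - 1] == b[j - 1] else mismatch
            # The extra 0 lets an alignment restart anywhere.
            cell = max(0, prev[j - 1] + sub, prev[j] + gap, curr[j - 1] + gap)
            curr.append(cell)
            best = max(best, cell)
        prev = curr
    return best


if __name__ == "__main__":
    # Made-up sequences: a short shared core inside unrelated flanks.
    x, y = "TTTACGTTT", "GGACGGG"
    print("global:", global_score(x, y))
    print("local: ", local_score(x, y))
```

The only difference between the two recurrences is the floor at zero and taking the maximum over the whole table, which is why local alignment suits sequences that share only a region while global alignment suits sequences of similar length and high overall similarity.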
Figure 5. Results table for the polyposis syndrome samples under the two alignment types
1839   60%   83%
1865   61%   83%
1824   20%   27%
1840   60%   83%
1922   41%   53%
Figure 6. Results table for the hereditary hemorrhagic telangiectasia samples under the two alignment types
The results table shows that, compared with global and local comparison, choosing a sequence made up of the amino acids within a single protein gives a higher similarity result.
4. Conclusion
This paper gives a general overview of pairwise sequence comparison and presents some results comparing the evaluation methods on data from NCBI for juvenile polyposis syndrome and hereditary hemorrhagic telangiectasia. Pairwise sequence comparison is of great value in identifying the features that distinguish genes and in forming hypotheses about gene function, through algorithms that assess the similarity (homology) between sequences.
[5]
[6]
[7]