GA-ANN based Dominant Gene Prediction in Microarray Dataset
Lecturer, P.G. Department of Information and Communication Technology, Fakir Mohan University, Orissa, India. E-mail: firstname.lastname@example.org
Dr. B. Mittra
Reader, School of Biotechnology, Fakir Mohan University, Orissa, India
Dr. Sabyasachi Pattnaik
Reader, P.G. Department of Information and Communication Technology, Fakir Mohan University, Orissa, India.
Dr. Ranjit Kumar Sahu
Assistant Surgeon, Post Doctoral Department of Plastic and Reconstructive Surgery, S.C.B. Medical College, Cuttack, Orissa, India. E-mail: email@example.com
Genome analysis of a human being offers useful insight into that person's ancestry and helps determine the person's weaknesses and susceptibilities to inherited diseases. The amount of accumulated genome data is increasing at a tremendous rate with the rapid development of genome sequencing technologies, and gene prediction is one of the most challenging tasks in genome analysis. Many tools have been developed for gene prediction, which remains an active research area. Gene prediction involves analyzing the entire genomic data accumulated in the database, so scrutinizing the predicted genes takes a great deal of time. However, the computational time can be reduced and the process made more effective by selecting dominant genes. In this paper, a novel method is presented to predict the dominant genes of ALL/AML cancer. First, to train a feed-forward artificial neural network (FF-ANN), combinational data are generated from the input dataset and their dimensionality is reduced through Probabilistic Principal Component Analysis (PPCA). Then, the classified database of ALL/AML cancer is given as the training dataset to design the FF-ANN. After the FF-ANN is designed, the genetic algorithm is applied to the test input sequence and the fitness function is computed using the designed FF-ANN. Next, the genetic operations of crossover, mutation and selection are carried out. Finally, through analysis, the optimal dominant genes are predicted.
Keywords- gene prediction, Microarray gene expression data, Probabilistic PCA (PPCA), dimensionality reduction, Artificial Neural Network (ANN), Back propagation (BP), dominant gene, genetic algorithm.
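The genetic-algorithm stage outlined in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the PPCA and FF-ANN training steps are omitted, the data are synthetic, and the fitness function is a simple nearest-centroid accuracy used as a stand-in for the score of the trained FF-ANN. All names (`make_toy_data`, `centroid_accuracy`, `run_ga`) and all parameter values are hypothetical and chosen only for illustration.

```python
import random

def make_toy_data(n_samples=40, n_genes=30, informative=(3, 7, 19), seed=0):
    """Synthetic stand-in for a two-class expression matrix (not real data)."""
    rng = random.Random(seed)
    X, y = [], []
    for i in range(n_samples):
        label = i % 2
        row = [rng.gauss(0.0, 1.0) for _ in range(n_genes)]
        for g in informative:
            row[g] += 3.0 * label  # only these genes separate the classes
        X.append(row)
        y.append(label)
    return X, y

def centroid_accuracy(X, y, genes):
    """Stand-in fitness: nearest-centroid training accuracy using only `genes`.
    The paper computes fitness with its trained FF-ANN instead."""
    cents = {}
    for c in (0, 1):
        rows = [x for x, t in zip(X, y) if t == c]
        cents[c] = [sum(r[g] for r in rows) / len(rows) for g in genes]
    correct = 0
    for x, t in zip(X, y):
        d = {c: sum((x[g] - m) ** 2 for g, m in zip(genes, cents[c]))
             for c in cents}
        correct += (min(d, key=d.get) == t)
    return correct / len(X)

def run_ga(fitness, n_genes, k=3, pop_size=20, generations=30,
           p_mut=0.2, seed=1):
    """Evolve chromosomes of k gene indices (k >= 2) toward higher fitness."""
    rng = random.Random(seed)
    pop = [rng.sample(range(n_genes), k) for _ in range(pop_size)]
    for _ in range(generations):
        ranked = sorted(pop, key=fitness, reverse=True)
        parents = ranked[: pop_size // 2]      # selection: keep the fitter half
        children = []
        while len(parents) + len(children) < pop_size:
            a, b = rng.sample(parents, 2)
            cut = rng.randrange(1, k)          # single-point crossover
            child = a[:cut] + b[cut:]
            if rng.random() < p_mut:           # mutation: swap in a random gene
                child[rng.randrange(k)] = rng.randrange(n_genes)
            children.append(child)
        pop = parents + children
    best = max(pop, key=fitness)
    return sorted(set(best)), fitness(best)

X, y = make_toy_data()
# Duplicate indices produced by crossover are collapsed before scoring.
fit = lambda genes: centroid_accuracy(X, y, sorted(set(genes)))
best_genes, best_score = run_ga(fit, n_genes=len(X[0]))
print(best_genes, best_score)
```

Because the fitter half of each generation is carried over unchanged, the best score never decreases from one generation to the next. Replacing `fit` with a function that queries a trained network would bring the sketch closer to the pipeline the abstract describes.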
I. INTRODUCTION
A huge quantity of genomic and proteomic data is accessible in the public domain. The capability to process this information in ways that are helpful to humankind is becoming more and more significant.
A fundamental step in the understanding of a genome is the computational recognition of genes, and it is one of the challenges in the analysis of newly sequenced genomes. Accurate and speedy tools are essential for analyzing genomic sequences and interpreting genes. In such circumstances, conventional and modern signal processing techniques play a vital part in these fields. Genomic signal processing (GSP) is a comparatively novel area in bioinformatics. It deals with the use of traditional digital signal processing (DSP) techniques in the representation and analysis of genomic data.

The code for the chemical composition of a particular protein is enclosed in a gene, which is a segment of DNA. Genes function as the pattern for proteins and some other products, and mRNA is the main intermediary that translates gene information into the production of genetically encoded molecules. Genomic information is usually represented by the sequences of nucleotide symbols in the strands of DNA molecules, by symbolic codons (triplets of nucleotides), or by symbolic sequences of amino acids in the corresponding polypeptide chains.
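As a small illustration of these symbolic representations, the sketch below splits a nucleotide string into codons and maps nucleotides to integers so that signal processing techniques can be applied. The function names and the particular integer mapping are hypothetical, chosen only for this example; the text does not prescribe a specific encoding.

```python
def to_codons(seq):
    """Split a nucleotide string into complete triplets, dropping any tail."""
    seq = seq.upper()
    usable = len(seq) - len(seq) % 3
    return [seq[i:i + 3] for i in range(0, usable, 3)]

# Hypothetical numeric mapping, used here only for illustration.
NUCLEOTIDE_TO_INT = {"A": 0, "C": 1, "G": 2, "T": 3}

def to_numeric(seq):
    """Map each nucleotide symbol to an integer, giving a numeric sequence."""
    return [NUCLEOTIDE_TO_INT[n] for n in seq.upper()]

codons = to_codons("ATGGCCTA")
signal = to_numeric("ACGT")
```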
The gene expression microchip, which is perhaps the most rapidly expanding tool of genome analysis, enables simultaneous monitoring of the expression levels of tens of thousands of genes under diverse experimental conditions. It is an influential tool for studying the collective reaction of genes to changes in their environments, and it also offers indications about the structures of the gene networks involved. Nowadays, by employing microarrays, the expression levels of thousands of genes, possibly all genes in an organism, can be measured simultaneously in a single experiment. Microarray technology has become a requisite tool for monitoring genome-wide expression levels of genes. The evaluation of the gene expression profiles in a
(IJCSIS) International Journal of Computer Science and Information Security, Vol. 8, No. 8, November 2010, p. 83, http://sites.google.com/site/ijcsis/, ISSN 1947-5500