NAME: Lavan Kumar R USN: 1AY08IS032 E-MAIL: firstname.lastname@example.org PHONE: 7829125446
The advent of new high-throughput sequencing technologies has led to a flood of genomic data that overwhelms the capabilities of single-processor machines. This document presents a MapReduce pipeline called Howdah that supports the analysis of genomic sequence data, allowing multiple tests to be plugged into a single MapReduce job. The pipeline is used to detect chromosomal abnormalities such as insertions, deletions, and translocations, as well as single nucleotide polymorphisms (SNPs). The Howdah framework allows multiple data streams (pipelines) to be extracted from a single Hadoop job flow. The design assumes that the data may reside outside of HDFS (the Hadoop Distributed File System) before the job starts, and that the job flow consists of moving the data into HDFS, running a sequence of MapReduce jobs until every sub-flow has terminated, and then moving the results from HDFS back into the local file system and perhaps into appropriate data storage such as a database.
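To make the described job flow concrete, the sketch below shows a minimal Hadoop driver that stages data into HDFS, runs a MapReduce job, and copies the results back to the local file system. This is an illustrative assumption, not the actual Howdah code: the class name PipelineDriver, the file paths, and the use of Hadoop's default identity mapper and reducer are all placeholders, and Howdah's pluggable analysis tests would replace the identity stages.

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;

// Hypothetical driver illustrating the three-step job flow from the
// abstract: stage data into HDFS, run a MapReduce job, move results out.
// All names and paths are placeholders, not part of Howdah itself.
public class PipelineDriver {
    public static void main(String[] args) throws Exception {
        Configuration conf = new Configuration();
        FileSystem fs = FileSystem.get(conf);

        Path localIn  = new Path("file:///data/reads.fastq"); // placeholder input
        Path hdfsIn   = new Path("/pipeline/input");
        Path hdfsOut  = new Path("/pipeline/output");
        Path localOut = new Path("file:///data/results");     // placeholder output

        // Step 1: move the data from the local file system into HDFS
        // before the job starts.
        fs.copyFromLocalFile(localIn, hdfsIn);

        // Step 2: run a MapReduce job. With no mapper or reducer set,
        // Hadoop uses its identity classes; Howdah would plug its
        // analysis tests in here. TextInputFormat (the default) emits
        // (LongWritable, Text) pairs, so the output types match below.
        Job job = Job.getInstance(conf, "sequence-analysis");
        job.setJarByClass(PipelineDriver.class);
        job.setOutputKeyClass(LongWritable.class);
        job.setOutputValueClass(Text.class);
        FileInputFormat.addInputPath(job, hdfsIn);
        FileOutputFormat.setOutputPath(job, hdfsOut);
        if (!job.waitForCompletion(true)) {
            System.exit(1);
        }

        // Step 3: move the results from HDFS back into the local
        // file system, from where they could be loaded into a database.
        fs.copyToLocalFile(hdfsOut, localOut);
    }
}

A real multi-stage flow would repeat step 2 for each sub-flow, chaining the output path of one job into the input path of the next until every sub-flow has terminated.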
Seminar Coordinators Prof. Ravichandra M/Prof. Chayapathi A R
Project Guide Prof. Mahesh G