
Bogor Agricultural University (IPB)

Introduction to Bioinformatics
(FMP400)
Session 7: Big Data in Bioinformatics
Introduction

• Dr Mushthofa, SKom, MSc
• Bachelor's degree: IPB (2005)
• Master's degree: TU Vienna (2009)
• Doctorate: Ghent University (2018)
• Research interests:
  • Machine learning
  • AI
  • Bioinformatics
Big Data
Application Domain
Value Chain
Big Data Context
Big Data in Bioinformatics

• Biology is big and complex!
• Data measurement capability is increasing
• Multi-omics data integration
• “Integrative” analysis
Applications of Big Data
Personalized Medicine

• Iceland deCODE Project: medical history records and genome data of 150,000 people
• Led to the discovery of:
  • Genetic risk factors for:
    • Breast cancer
    • Alzheimer’s disease
  • About 10,000 people who are missing both copies of a gene, across 1,500 different genes
• Drug responsiveness varies widely: ADHD medicine works for only one in ten preschoolers, cancer drugs are effective for 25% of patients, and depression drugs work for 6 in 10 patients.
Drug modeling/repurposing
Big Data Frameworks

• MapReduce
• Hadoop
• Spark
• Hive
• Flink
• Kudu
…..
MapReduce

• Large-scale data processing was difficult!
  • Managing hundreds or thousands of processors
  • Managing parallelization and distribution
  • I/O scheduling
  • Status and monitoring
  • Fault/crash tolerance
• MapReduce was created to solve these problems
MapReduce

• Programming model used by Google
• A combination of the Map and Reduce models with an associated implementation
• Used for processing and generating large data sets
MapReduce Workflow
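
• The map, shuffle, and reduce steps can be illustrated with a small single-machine sketch. It counts k-mers in DNA reads, a bioinformatics analogue of the usual word-count example. The function names (map_phase, shuffle_phase, reduce_phase, kmer_count) and the sample reads are hypothetical and chosen only for illustration. A real framework such as Hadoop would run the same phases across many machines.

```python
from collections import defaultdict
from itertools import chain


def map_phase(read, k=3):
    """Map: emit a (k-mer, 1) pair for every k-mer in one read."""
    return [(read[i:i + k], 1) for i in range(len(read) - k + 1)]


def shuffle_phase(pairs):
    """Shuffle: group all emitted values by their key."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(key, values):
    """Reduce: combine all values for one key into a single count."""
    return key, sum(values)


def kmer_count(reads, k=3):
    """Run the three phases in sequence (illustrative, not distributed)."""
    mapped = chain.from_iterable(map_phase(read, k) for read in reads)
    grouped = shuffle_phase(mapped)
    return dict(reduce_phase(key, values) for key, values in grouped.items())


# Hypothetical toy input: two short reads.
reads = ["ACGTAC", "GTACGT"]
counts = kmer_count(reads, k=3)
print(counts)
```

• Each call to map_phase depends on a single read, and each call to reduce_phase depends on a single key, so both phases can be split across workers without coordination. Only the shuffle step needs to move data between them.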
Apache Hadoop
Apache Spark
