Storage Device Characteristics
    scalability
    redundancy & availability
    fast access
    long-term storage
    schema-less storage
    inexpensive storage

On-Disk Storage
    distributed file system
    database
        RDBMS
        NoSQL
            key-value
            column-family
            document
            graph
        NewSQL

MapReduce Algorithms
    map task
        map
        combine (optional)
        partition
    reduce task
        shuffle & sort
        reduce
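
The MapReduce stages listed above (map, optional combine, partition, shuffle & sort, reduce) can be illustrated with a single-process word count. This is a minimal sketch for orientation only, not how any particular framework implements them: the function names (`map_fn`, `combine`, `partition`, `reduce_fn`, `run_job`), the CRC32-based partitioner, and the sample input are all assumptions made for this example.

```python
import zlib
from collections import defaultdict
from itertools import groupby


def map_fn(line):
    """map: emit a (word, 1) pair for every word in an input line."""
    for word in line.lower().split():
        yield word, 1


def combine(pairs):
    """combine (optional): pre-aggregate pairs on the map side."""
    totals = defaultdict(int)
    for key, value in pairs:
        totals[key] += value
    return list(totals.items())


def partition(key, num_reducers):
    """partition: route a key to a reducer with a stable hash (assumed CRC32)."""
    return zlib.crc32(key.encode("utf-8")) % num_reducers


def reduce_fn(key, values):
    """reduce: sum all counts collected for one key."""
    return key, sum(values)


def run_job(splits, num_reducers=2):
    # Map task: map, combine, then partition each input split.
    buckets = [[] for _ in range(num_reducers)]
    for split in splits:
        mapped = [pair for line in split for pair in map_fn(line)]
        for key, value in combine(mapped):
            buckets[partition(key, num_reducers)].append((key, value))

    # Reduce task: sort each reducer's bucket by key (shuffle & sort),
    # then reduce every group of values that share a key.
    result = {}
    for bucket in buckets:
        bucket.sort(key=lambda kv: kv[0])
        for key, group in groupby(bucket, key=lambda kv: kv[0]):
            word, total = reduce_fn(key, [value for _, value in group])
            result[word] = total
    return result


# Hypothetical input: two splits, as if read from two blocks of a file.
splits = [["big data big"], ["data storage", "big cluster"]]
counts = run_job(splits)
```

In a real cluster each split would be handled by a separate map task and each bucket by a separate reduce task on different nodes; here they run sequentially so the data flow between stages is easy to follow.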

Module 7
Fundamental Big Data Engineering

Fundamental Big Data Processing
    cluster
    batch mode
    realtime mode
    Processing Engine Characteristics
        distributed/parallel data processing
        schema-less data processing
        multi-workload support
        scalability
        redundancy & fault-tolerance
        low cost

Big Data Storage Terminology & Concepts
    replication
        master-slave
        peer-to-peer
    sharding
    CAP theorem
        consistency
        availability
        partition tolerance
    ACID
        atomicity
        consistency
        isolation
        durability
    BASE
        basically available
        soft state
        eventual consistency
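
Sharding and replication from the storage terms above can be combined in a small placement sketch: a key is hashed to a home shard, and copies are placed on the following nodes. This is only an illustrative assumption about one possible scheme; the helper names (`shard_for`, `replicas_for`), the MD5-based hash, the node names, and the replication factor are hypothetical and do not come from the mind map.

```python
import hashlib


def shard_for(key, num_shards):
    """sharding: map a key to a shard index with a stable hash (assumed MD5)."""
    digest = hashlib.md5(key.encode("utf-8")).hexdigest()
    return int(digest, 16) % num_shards


def replicas_for(key, nodes, replication_factor=2):
    """replication: pick the key's home node plus the next nodes in the list."""
    start = shard_for(key, len(nodes))
    return [nodes[(start + i) % len(nodes)] for i in range(replication_factor)]


# Hypothetical three-node cluster.
nodes = ["node-a", "node-b", "node-c"]
placement = replicas_for("user:42", nodes)
```

Because the hash is deterministic, every client computes the same placement for a given key without consulting a central directory. Note that a plain modulo scheme like this one remaps most keys when the node count changes.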

Module 7: Fundamental Big Data Engineering
Big Data Science Certified Professional (BDSCP) Program
Official Mind Map Supplement
Copyright © Arcitura Education Inc. www.arcitura.com