Increasing the number of systems & operating them in parallel.
Vertical scalability
Increasing the disk size & RAM of a single machine.
HDFS -> used for storage -> a distributed file system -> based on one NameNode [master] & multiple DataNodes [slaves], scaled according to the data size -> HDFS fault tolerance is based on 2 factors -> Replication Factor & Block Size -> Block Size by default is 64 MB (128 MB from Hadoop 2.x onwards) -> total blocks required = ceil(File Size / Block Size) -> each block is distributed based on the Replication Factor -> used for replicating the same block over N DataNodes, where N is the Replication Factor.
MAP REDUCE -> native support for Java -> framework used for processing data -> a Mapper & Reducer combination -> the Mapper is used for parallel processing of the instructions input to the MapReduce framework -> it distributes the instruction set among the DataNodes for parallel processing -> the Reducer merges the results obtained from the parallel-processed instructions from the different DataNodes & aggregates them.
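The two ideas above can be sketched without a cluster: the block count is just a ceiling division, and the Mapper/Reducer split can be simulated in plain Python (a toy word count, not the real Hadoop Java API; file sizes and input lines here are made-up examples):

```python
import math
from collections import defaultdict

def blocks_required(file_size_mb, block_size_mb=64):
    """Number of HDFS blocks needed for a file: ceil(file size / block size)."""
    return math.ceil(file_size_mb / block_size_mb)

def mapper(line):
    """Mapper sketch: emit one (word, 1) pair per word in the input line."""
    return [(word, 1) for word in line.split()]

def reducer(pairs):
    """Reducer sketch: merge all (key, value) pairs by summing per key."""
    counts = defaultdict(int)
    for key, value in pairs:
        counts[key] += value
    return dict(counts)

# A 200 MB file with the default 64 MB block size needs 4 blocks;
# with a Replication Factor of 3, that is 4 * 3 = 12 block copies on the cluster.
print(blocks_required(200))            # -> 4

# Map phase runs per input line (in Hadoop, in parallel across DataNodes),
# reduce phase aggregates the intermediate pairs.
lines = ["big data big cluster", "big data"]
intermediate = [pair for line in lines for pair in mapper(line)]
print(reducer(intermediate))           # -> {'big': 3, 'data': 2, 'cluster': 1}
```

In the real framework the intermediate pairs are shuffled and sorted by key between the map and reduce phases; the list comprehension above stands in for that step.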
HIVE -> SQL-like query (HiveQL) support for analytics
PIG -> scripting (Pig Latin) & User Defined Functions (UDF) support for analytics
SQOOP -> for importing/exporting data between RDBMS systems & HDFS
FLUME -> for importing streaming data (e.g. log data) into HDFS
HBASE -> a NoSQL database -> column-oriented storage -> the main database of the Hadoop ecosystem, running on top of HDFS
APACHE OOZIE -> a scheduler used to control the workflow of all these processes (jobs)