
BIG DATA

BIG DATA is a collection of data sets so large and complex that they are difficult to process using on-hand database management tools.

WHY IS IT INTRODUCED
Lots of data is being collected and warehoused.
Processing exceeds traditional database system capacity.
Handles both structured and unstructured data.
Separation of data from application.
Understanding data analytics.
Faster development, faster runtime.
Elastic, feature-level scalability.

APACHE HADOOP
Provides massively scalable storage; it is not a database but a data processing platform.
HDFS provides fault-tolerant storage.
Stores data in its native format.
Reduces cost and lowers risk.
Extracts business value from data and delivers new insights.
Automatically handles software and hardware failures.
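To make the storage model concrete, here is a minimal Python sketch of how an HDFS-style system splits a file into fixed-size blocks and replicates each block across several nodes. The block size, node names, and `place_blocks` function are illustrative assumptions, not Hadoop APIs; real HDFS defaults to much larger blocks (128 MB) and a replication factor of 3.

```python
# Sketch (not Hadoop code): split data into fixed-size blocks and
# assign each block to `replication` distinct nodes, as HDFS does.

def place_blocks(data: bytes, nodes: list, block_size: int = 4, replication: int = 3):
    placement = {}
    for start in range(0, len(data), block_size):
        block_id = start // block_size
        block = data[start:start + block_size]
        # Round-robin over the nodes so each replica of a block
        # lands on a different machine (survives single-node loss).
        chosen = [nodes[(block_id + r) % len(nodes)] for r in range(replication)]
        placement[block_id] = (block, chosen)
    return placement

if __name__ == "__main__":
    layout = place_blocks(b"hello hdfs", ["node1", "node2", "node3", "node4"])
    for block_id, (block, replicas) in layout.items():
        print(block_id, block, replicas)
```

Because every block lives on multiple machines, losing one disk or node leaves at least two readable copies, which is how the platform handles hardware failures automatically.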

HDFS
Fault-tolerant storage that survives disk, network, and network-interface failures.
Processes data with MapReduce programs.
Creates clusters of machines and coordinates work among them.
Stores data on the cluster as blocks.
Requires no special hardware, unlike RAID.
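The MapReduce programs mentioned above follow a map/shuffle/reduce pattern. This is a minimal in-memory Python sketch of that model using word counting, the classic example; the function names are illustrative, not Hadoop API calls, and a real job would run the map and reduce phases in parallel across the cluster, near the HDFS blocks holding the data.

```python
from collections import defaultdict

def map_phase(lines):
    # Map: emit a (word, 1) pair for every word in every input line.
    for line in lines:
        for word in line.split():
            yield word, 1

def shuffle(pairs):
    # Shuffle: group all emitted values by key, as the framework
    # does between the map and reduce phases.
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped

def reduce_phase(grouped):
    # Reduce: combine the grouped values; here, sum counts per word.
    return {word: sum(counts) for word, counts in grouped.items()}

if __name__ == "__main__":
    lines = ["big data big", "data platform"]
    print(reduce_phase(shuffle(map_phase(lines))))
    # {'big': 2, 'data': 2, 'platform': 1}
```

Because each map call touches only its own input split and each reduce call only its own key group, the framework can scatter this work across many machines without the functions changing.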

PROBLEMS WITH BIG DATA


Organizations can be overwhelmed by the volume of data.
Costs escalate too fast.
Storage is consumed three times over due to replication.
Timeliness of analysis.
Poor data locality.
Incompatible and replicated data.

CONCLUSION
Big Data will replace the approaches, tools, and systems that underpin development work.
Enables better analysis of large volumes of data.
Has the potential to advance many scientific disciplines.
Improves profitability.
Technical challenges still need to be addressed.


Today's Big Data Is Not Tomorrow's Big Data

THANK YOU
