Professional Documents
Culture Documents
(https://www.guru99.com/)
Data Lake is like a large container which is very similar to real lake and rivers. Just like in a lake
you have multiple tributaries coming in, a data lake has structured data, unstructured data,
machine to machine, logs flowing through in real-time.
2. Data Cleaning
3. Data Transformation
4. Data Loading and Refreshing
(/images/1/022218_0517_DataLakevsD1.png)
Here are key di erences between the two data associated terms in the mentioned aspects:
Storage In the data lake, all data is kept A data warehouse will consist of data
irrespective of the source and its that is extracted from transactional
structure. Data is kept in its raw form. It systems or data which consists of
is only transformed when it is ready to quantitative metrics with their
be used. attributes. The data is cleaned and
transformed
History Big data technologies used in data lakes Data warehouse concept, unlike big
is relatively new. data, had been used for decades.
Data Captures all kinds of data and Captures structured information and
Capturing structures, semi-structured and organizes them in schemas as
unstructured in their original form from defined for data warehouse
source systems. purposes
https://www.guru99.com/data-lake-vs-data-warehouse.html 2/7
7/3/2019 Data Lake vs Data Warehouse: Know the Difference
Data Data lakes can retain all data. This In the data warehouse development
Timeline includes not only the data that is in use process, significant time is spent on
but also data that it might use in the analyzing various data sources.
future. Also, data is kept for all time, to
go back in time and do an analysis.
Users Data lake is ideal for the users who The data warehouse is ideal for
indulge in deep analysis. Such users operational users because of being
include data scientists who need well structured, easy to use and
advanced analytical tools with understand.
capabilities such as predictive modeling
and statistical analysis.
Storage Data storing in big data technologies Storing data in Data warehouse is
Costs are relatively inexpensive then storing costlier and time-consuming.
data in a data warehouse.
Task Data lakes can contain all data and data Data warehouses can provide
types; it empowers users to access data insights into pre-defined questions
prior the process of transformed, for pre-defined data types.
cleansed and structured.
Processing Data lakes empower users to access Data warehouses o er insights into
time data before it has been transformed, pre-defined questions for pre-
cleansed and structured. Thus, it allows defined data types. So, any changes
users to get to their result more quickly to the data warehouse needed more
compares to the traditional data time.
warehouse.
Data Data Lakes use of the ELT (Extract Load Data warehouse uses a traditional
processing Transform) process. ETL (Extract Transform Load)
process.
Complain Data is kept in its raw form. It is only The chief complaint against data
transformed when it is ready to be used. warehouses is the inability, or the
problem faced when trying to make
change in in them.
Key They integrate di erent types of data to Most users in an organization are
Benefits come up with entirely new questions as operational. These type of users only
these users not likely to use data care about reports and key
warehouses because they may need to performance metrics.
go beyond its capabilities.
https://www.guru99.com/data-lake-vs-data-warehouse.html 3/7
7/3/2019 Data Lake vs Data Warehouse: Know the Difference
Summary:
A Data Warehouse is a blend of technologies and components which allows the strategic use
of data.
A Data Lake is a storage repository that can store a large amount of structured, semi-
structured, and unstructured data.
Data Warehouse stores data in schemas and tables which helps to organize and use the data
to take strategic decisions.
A Data Lake is a large size storage repository that holds a large amount of raw data in its
original format until the time it is needed.
Data warehouse concept, unlike big data, had been used for decades.
Big data technologies integrate with the use of data lakes is relatively new.
Next (/business-intelligence-definition-example.html)
https://www.guru99.com/data-lake-vs-data-warehouse.html 5/7
7/3/2019 Data Lake vs Data Warehouse: Know the Difference
(https://www.facebook.com/guru99com/)
(https://twitter.com/guru99com)
(https://www.youtube.com/channel/UC19i1XD6k88KqHlET8atqFQ)
https://www.guru99.com/data-lake-vs-data-warehouse.html 6/7
7/3/2019
Data Lake vs Data Warehouse: Know the Difference
(https://forms.aweber.com/form/46/724807646.htm)
About
About US (/about-us.html)
Advertise with Us (/advertise-us.html)
Write For Us (/become-an-instructor.html)
Contact US (/contact-us.html)
Career Suggestion
SAP Career Suggestion Tool (/best-sap-module.html)
So ware Testing as a Career (/so ware-testing-career-
complete-guide.html)
Certificates (/certificate-it-professional.html)
Interesting
Books to Read! (/books.html)
Suggest a Tutorial
Blog (/blog/)
Quiz (/tests.html)
eBook (/ebook-pdf.html)
Execute online
Execute Java Online (/try-java-editor.html)
Execute Javascript (/execute-javascript-online.html)
Execute HTML (/execute-html-online.html)
Execute Python (/execute-python-online.html)
https://www.guru99.com/data-lake-vs-data-warehouse.html 7/7