
DATA

WAREHOUSE
Presented by:

Vidhi
Sanjana
Nikita
Sahil
Arpit
WHAT IS A DATA WAREHOUSE
A data warehouse is a large, central location where data is managed and stored for analytical
processing. The data is accumulated from various sources and storage locations within an
organization.
COMPONENTS
1. Data Collection: Data warehousing starts with gathering data from different operational systems
within an organization:
- structured data (tables, Excel sheets, graphs)
- unstructured data (text, documents, emails, images)

2. Data Integration: Once data is collected, it needs to be integrated into a single, unified format
that is compatible and can be analyzed together.

3. Data Storage: The integrated data is then stored in a central repository known as a data
warehouse.

4. Data Management: Data warehouses require ongoing maintenance to ensure data quality,
consistency, and security. This includes tasks such as data cleaning, data validation, data governance,
and access control. Data management practices help ensure that the data stored in the warehouse
remains accurate, relevant, and trustworthy.
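
The four components above can be sketched as a tiny pipeline. This is only an illustration under stated assumptions: the two source systems, their field names, and the `sales` table are hypothetical, and an in-memory SQLite database stands in for the warehouse.

```python
import sqlite3

# Data collection: records from two hypothetical operational systems,
# each with its own layout and units.
store_orders = [{"order_id": 1, "amount_usd": "19.99", "date": "2024-01-05"}]
web_orders = [{"id": "W-7", "total_cents": 4550, "day": "2024-01-06"}]

def integrate(store_rows, web_rows):
    """Data integration: map both layouts onto one unified (key, amount, date) format."""
    unified = []
    for r in store_rows:
        unified.append((f"store-{r['order_id']}", float(r["amount_usd"]), r["date"]))
    for r in web_rows:
        unified.append((f"web-{r['id']}", r["total_cents"] / 100, r["day"]))
    return unified

def validate(rows):
    """Data management: a simple quality rule (non-negative amount, date present)."""
    return [row for row in rows if row[1] >= 0 and row[2]]

# Data storage: load the cleaned, unified rows into the central repository.
warehouse = sqlite3.connect(":memory:")
warehouse.execute(
    "CREATE TABLE sales (order_key TEXT PRIMARY KEY, amount REAL, order_date TEXT)"
)
warehouse.executemany(
    "INSERT INTO sales VALUES (?, ?, ?)",
    validate(integrate(store_orders, web_orders)),
)
warehouse.commit()

# Once loaded, data from both systems can be analyzed together.
total = warehouse.execute("SELECT ROUND(SUM(amount), 2) FROM sales").fetchone()[0]
print(total)
```

Real warehouses add scheduling, error handling, governance, and access control on top of these steps; the sketch only shows how they fit together.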
CHARACTERISTICS
"A data warehouse is a subject-oriented, integrated, time-variant, and nonvolatile collection of
data in support of management’s decision-making process." - W. H. Inmon

SUBJECT-ORIENTED
A data warehouse can be used to analyze a particular subject area. For example, "sales" can be a
particular subject.

INTEGRATED
A data warehouse integrates data from multiple data sources. For example, source A and source B
may have different ways of identifying a product, but in a data warehouse, there will be only a
single way of identifying a product.
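
As a sketch of the product example above: sources A and B use different identifiers for the same product, and a mapping table resolves both to a single warehouse key. All identifiers and the mapping itself are hypothetical.

```python
# Hypothetical mapping from (source, source-specific id) to one warehouse product key.
PRODUCT_KEY_MAP = {
    ("A", "SKU-1001"): "P-1",
    ("B", "blue-shirt-m"): "P-1",  # same product, identified differently by source B
    ("A", "SKU-2002"): "P-2",
}

def to_warehouse_key(source, source_id):
    """Return the single warehouse identifier for a source-specific product id."""
    try:
        return PRODUCT_KEY_MAP[(source, source_id)]
    except KeyError:
        raise ValueError(f"unmapped product {source_id!r} from source {source!r}") from None

# Hypothetical sales rows: (source-specific product id, units sold).
source_a_sales = [("SKU-1001", 3), ("SKU-2002", 1)]
source_b_sales = [("blue-shirt-m", 2)]

# After integration, units are totalled per warehouse key, regardless of source.
units_by_product = {}
for source, rows in (("A", source_a_sales), ("B", source_b_sales)):
    for source_id, units in rows:
        key = to_warehouse_key(source, source_id)
        units_by_product[key] = units_by_product.get(key, 0) + units

print(units_by_product)
```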

TIME-VARIANT
Historical data can be kept in a data
warehouse. For example, one can
retrieve data from 3 months, 6 months,
12 months, or even older data from a
data warehouse.

NON-VOLATILE
Once data is in the data warehouse, it
will not change. So, historical data in a
data warehouse should never be
altered.
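
The two properties above can be illustrated together. In this sketch (table and column names are assumptions, with SQLite standing in for a warehouse), a price change is recorded by appending a new dated row rather than updating the old one, so the value in effect on any earlier date stays retrievable.

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE product_price (product TEXT, price REAL, valid_from TEXT)")

def record_price(product, price, valid_from):
    # Non-volatile: a change is appended as a new row; existing rows are never updated.
    con.execute(
        "INSERT INTO product_price VALUES (?, ?, ?)", (product, price, valid_from)
    )

def price_as_of(product, date):
    # Time-variant: find the price that was in effect on the given ISO date.
    row = con.execute(
        "SELECT price FROM product_price "
        "WHERE product = ? AND valid_from <= ? "
        "ORDER BY valid_from DESC LIMIT 1",
        (product, date),
    ).fetchone()
    return row[0] if row else None

record_price("shirt", 20.0, "2024-01-01")
record_price("shirt", 25.0, "2024-07-01")

print(price_as_of("shirt", "2024-03-15"))  # price before the change
print(price_as_of("shirt", "2024-09-01"))  # price after the change
```

Dates are stored as ISO strings (YYYY-MM-DD) so that plain string comparison orders them correctly.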
Why Organizations Need It:

Centralized Data Storage
Stores data from various sources in one place, simplifying management and access for
decision-making.

Integrated Data
Combines diverse data into a unified format, providing a comprehensive view for informed decisions.

Scalable Data Storage
Designed to handle large data volumes and adapt to growing needs for storage, processing, and
analysis.

Advanced Analytics
Supports data mining and predictive analytics, driving business growth and innovation.
When Organizations Need It:

Data Complexity
Data-Driven Decision-Making
Business Growth
Real-Time Analytics

Regulatory Rules
DATA WAREHOUSE VS
DATABASE
Data warehouses and databases both act as data storage and management tools. However,
there are a few key differences to acknowledge. Most importantly, data warehouses have
analytical capabilities: they enable companies to run analytical queries that track certain
variables over time for business intelligence.

In contrast, a database is a simple collection of data in one place. Databases’ main purpose
is to store data securely and allow users to access it easily.
Organizations often need both databases and data warehouses to manage the massive
amounts of data they produce daily.

For example, a clothing company may use one database to store customer information and
another to track website traffic. They can use a data warehouse to compare both databases
on a historical scale to reveal insight into consumer trends.
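
The clothing-company example could look roughly like this. The sketch assumes that records from the two hypothetical operational databases (customers and website traffic) have already been copied into warehouse tables; all names and figures are invented for illustration, and SQLite stands in for the warehouse.

```python
import sqlite3

wh = sqlite3.connect(":memory:")
wh.executescript("""
CREATE TABLE customers (customer_id INTEGER, signup_date TEXT);
CREATE TABLE web_traffic (visit_date TEXT, visits INTEGER);
""")
wh.executemany("INSERT INTO customers VALUES (?, ?)", [
    (1, "2024-01-10"), (2, "2024-01-22"), (3, "2024-02-03"),
])
wh.executemany("INSERT INTO web_traffic VALUES (?, ?)", [
    ("2024-01-15", 400), ("2024-01-30", 600), ("2024-02-14", 500),
])

# Analytical query: compare monthly site visits with monthly new customers.
# substr(date, 1, 7) extracts the YYYY-MM part of an ISO date string.
rows = wh.execute("""
    SELECT t.month, t.visits, COALESCE(c.signups, 0) AS signups
    FROM (SELECT substr(visit_date, 1, 7) AS month, SUM(visits) AS visits
          FROM web_traffic
          GROUP BY substr(visit_date, 1, 7)) AS t
    LEFT JOIN (SELECT substr(signup_date, 1, 7) AS month, COUNT(*) AS signups
               FROM customers
               GROUP BY substr(signup_date, 1, 7)) AS c
      ON c.month = t.month
    ORDER BY t.month
""").fetchall()

for month, visits, signups in rows:
    print(month, visits, signups)
```

Each operational database answers questions about its own records; the combined, historical view across both is what the warehouse adds.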
TYPES OF DATA WAREHOUSES
Cloud-based warehouse

A cloud-based data warehouse is a virtual place on the cloud where companies can store their data.
Businesses set up these types of warehouses on external cloud servers. Companies typically need to
export their internal data to a centralised server system hosted by a cloud service provider.
Cloud-based data warehouses are flexible, and companies can choose the amount of storage space based
on their needs. These warehouses are also often cost-efficient. They may be preferable for companies
that are just starting out and need flexible data warehouses on a controlled budget.

On-premises warehouse

On-premises data warehouses are typically physically located on the company premises. Unlike with a
cloud-based warehouse, companies do not need to export their data to external servers. They can
upload the required data to the warehouse directly from the company systems, which also gives
them the advantage of faster data processing. This type of data warehouse can be beneficial
because it allows a company to have complete control over its data without the data leaving the
company's system. Another key benefit of having an on-premises data warehouse is that if there are
any problems with internet connectivity, there are likely no interruptions in data accessibility.
Case Study On Amazon
Introduction
Amazon is one of the world's largest and
most successful companies with a
diversified business: cloud computing,
digital content, and more. As a company
that generates vast amounts of data, and
that itself offers data warehousing services,
Amazon needs to manage and analyze its
data effectively.

The global e-commerce giant leverages data warehousing to drive innovation, optimize
operations, and enhance customer experiences. In this case study, we'll delve into the
features, applications, and benefits of data warehousing in the context of Amazon's vast
and complex ecosystem.
What: Understanding Data
Warehousing at Amazon
Data Integration:
Amazon collects massive amounts of data from various sources,
including customer transactions, website interactions, inventory
management systems, and more.

Query and Analysis:


With millions of products and customers worldwide, Amazon
relies on robust querying and analysis capabilities to extract
actionable insights. Data warehousing allows Amazon to analyze
sales trends, customer preferences, and operational efficiency.

Scalability:
Amazon's business is constantly growing, generating vast
volumes of data each day. Data warehousing provides scalability,
allowing Amazon to accommodate this exponential data growth
while maintaining performance and reliability.
Amazon Web Services

Amazon Web Services, Inc. (AWS) is an Amazon subsidiary that offers governments,
businesses, and individuals metered, pay-as-you-go cloud computing platforms.

Amazon Web Services (AWS) is a comprehensive, evolving cloud computing
platform provided by Amazon. Launched in 2006, AWS offers a vast array of
services including computing power, storage options, networking, databases,
machine learning, analytics, and more.

While traditional on-premises data warehouses often struggle to handle the
exponential growth in volume of data, with AWS, businesses can seamlessly scale
their infrastructure while ensuring optimal performance.
Amazon Redshift: Data Warehouse
Solution
Amazon Redshift is a managed, petabyte-scale cloud data warehouse service that makes up part of the
larger cloud platform, Amazon Web Services (AWS). In simple terms, it's a platform that lets you
store and analyze all of your data in the cloud for deeper business insights. It allows businesses
to efficiently analyze large datasets using SQL queries.

It is designed for large-scale data analytics and processing, enabling businesses to
analyze vast amounts of data quickly and cost-effectively.

It supports standard SQL queries, making it accessible to users familiar with SQL.
Organizations across various industries leverage Redshift for business intelligence,
analytics, and data warehousing tasks.
Its pay-as-you-go pricing model and flexible scaling options make it cost-effective for
businesses of all sizes. Redshift enables users to derive insights from their data quickly
and efficiently, driving data-driven decision-making processes.
When: Applications of Data
Warehousing at Amazon
Personalized Recommendations:
By understanding individual shopping patterns, Amazon
can offer personalized product recommendations.

Inventory Management and Forecasting:


By analyzing historical sales data and demand patterns,
Amazon can anticipate future demand and optimize
inventory levels.

Operational Efficiency:
Amazon leverages data warehousing to streamline its
operations and logistics.
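
To make the forecasting idea above concrete, here is a deliberately simple sketch. It is not Amazon's actual method: it swaps in a plain moving average over historical sales as the forecast, and the function names, the sales figures, and the safety-stock rule are all assumptions made for illustration.

```python
def moving_average_forecast(history, window=3):
    """Forecast next period's demand as the mean of the last `window` periods."""
    if window <= 0 or len(history) < window:
        raise ValueError("need at least `window` periods of history")
    return sum(history[-window:]) / window

def reorder_quantity(forecast, on_hand, safety_stock=0):
    """Units to order so that stock covers the forecast plus a safety margin."""
    return max(0, round(forecast + safety_stock - on_hand))

# Hypothetical monthly units sold for one product, oldest first.
monthly_units_sold = [120, 135, 150, 160, 170]

forecast = moving_average_forecast(monthly_units_sold)
order = reorder_quantity(forecast, on_hand=100, safety_stock=20)
print(forecast, order)
```

The point of the sketch is the dependency on historical data: without the sales history kept in the warehouse, there is nothing to compute the forecast from.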
Why: Benefits of Data
Warehousing for Amazon
Customer-Centric Innovation:
Data warehousing empowers Amazon to innovate
and develop customer-centric solutions.

Data-Driven Decision-Making:
Amazon leverages data-driven decision-making to
stay agile and responsive in a dynamic marketplace.

Competitive Advantage:
Data warehousing provides Amazon with a
competitive edge in the e-commerce industry.
CONCLUSION
The case study underscores the importance of embracing
innovative technologies to meet the evolving demands of
modern businesses.

Moreover, Amazon's experience highlights the benefits of
cloud-based data warehousing solutions, including
increased agility, reduced infrastructure complexity, and
improved scalability.
THANK YOU
