
365 Data Science


What Is a Data Warehouse?


Data warehousing is one of the hottest topics both in business and in data science.
But if you’re new to the field, you’re probably wondering what a data warehouse is,
why we need it, and how it works. Don’t worry because, in this article, you’ll find
the answers to all these questions.

First, let’s start with a definition: the meaning of the phrase ‘single source of
truth’.

What Is the Single Source of Truth?


In information systems theory, the ‘single source of truth’ is the practice of
structuring all the best quality data in one place.

Here’s a very simple example.

You have surely worked on a file and ended up creating many different versions of
it.

How do you name such a file?


Well, once you are ready, you often place the word ‘final’ at the end of the file
name. This results in a bunch of files with suffixes such as:

‘final’
‘final, final’
‘final, final, final’
Or my favorite:

‘really final’… ‘final’


If this is you, you are not alone. It seems that even corporations never know where
the most recent or most appropriate file is.



But what if you knew that there was one single place where you would always find
the single source of information?
That would be quite helpful, wouldn’t it?

Well, a data warehouse exists to fill that need.

So, what is a data warehouse exactly?


It is the place where companies store their valuable data assets, including
customer data, sales data, employee data, and so on.

In short, a data warehouse is the de facto ‘single source of data truth’ for an
organization. It is usually created and used primarily for data reporting and
analysis purposes.

There are several defining features of a data warehouse.


It is:

subject-oriented
integrated
time-variant
nonvolatile
summarized
Let’s quickly go through these, one by one.

Subject-oriented means that the information in a data warehouse revolves around
some subject.
Therefore, it does not contain all company data ever, but only the subject matters
of interest. For instance, data on your competitors need not appear in a data
warehouse. Your own sales data, however, will most certainly be there.



Integrated corresponds to the file-naming example from the beginning of this article.
Each database, or each team, or even each person has their own preferences when it
comes to naming conventions. That is why common standards are developed to make
sure that the data warehouse picks the best quality data from everywhere. This
relates to ‘master data governance’, but that is a topic for another time.
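
To make the idea of common standards more concrete, here is a minimal sketch in Python. It assumes two hypothetical source systems (a CRM and a billing tool) that name the same fields differently. The alias table, the field names, and the `standardize` helper are all made up for illustration and are not taken from any particular warehouse product.

```python
# Hypothetical mapping from source-specific column names to the
# warehouse's agreed-upon names (an assumption for this sketch).
COLUMN_ALIASES = {
    "CustomerID": "customer_id",
    "cust_id": "customer_id",
    "Amount": "amount",
    "amt": "amount",
}


def standardize(record: dict) -> dict:
    """Rename source-specific keys to the common names; keep unknown keys as-is."""
    return {COLUMN_ALIASES.get(key, key): value for key, value in record.items()}


# The same customer, as two different teams might record it.
crm_row = {"CustomerID": 17, "Amount": 250.0}
billing_row = {"cust_id": 17, "amt": 99.5}

integrated = [standardize(crm_row), standardize(billing_row)]
print(integrated)
```

After standardization, both rows use the same keys, so they can sit side by side in one table. A real integration step would also reconcile units, data types, and duplicates, which this sketch leaves out.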



Time-variant relates to the fact that a data warehouse contains historical data,
too.
As said before, we mainly use a data warehouse for analysis and reporting, which
implies we need to know what happened 5 or 10 years ago.
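
As a rough illustration, the sketch below keeps a date on every record so that a question about any past year can still be answered. The records, the field names, and the `total_for_year` helper are hypothetical and exist only for this example.

```python
from datetime import date

# Hypothetical dated sales records; the values are invented for illustration.
sales_history = [
    {"sale_date": date(2012, 6, 30), "amount": 100.0},
    {"sale_date": date(2017, 6, 30), "amount": 150.0},
    {"sale_date": date(2022, 6, 30), "amount": 210.0},
]


def total_for_year(rows, year):
    """Sum the amounts of all records whose sale date falls in the given year."""
    return sum(row["amount"] for row in rows if row["sale_date"].year == year)


print(total_for_year(sales_history, 2012))
```

Because old records stay in place next to new ones, the same function works for last year and for a decade ago.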



Nonvolatile implies that data only flows into the data warehouse, as is.
Once there, it cannot be changed or deleted.
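
One way to picture this is an append-only table: rows can be loaded and read, but there is no operation to update or delete them. The `AppendOnlyTable` class below is a hypothetical sketch of that idea, not how any specific data warehouse enforces it.

```python
class AppendOnlyTable:
    """A toy table that accepts new rows but offers no update or delete."""

    def __init__(self):
        self._rows = []

    def load(self, row: dict) -> None:
        # Store a copy so later changes to the caller's dict do not leak in.
        self._rows.append(dict(row))

    def rows(self) -> tuple:
        # Hand out copies so readers cannot modify what is stored.
        return tuple(dict(row) for row in self._rows)


sales = AppendOnlyTable()
sales.load({"order_id": 1, "amount": 250.0})

snapshot = sales.rows()
snapshot[0]["amount"] = 0.0  # changes only the reader's copy

print(sales.rows())
```

The stored row keeps its original amount, because readers only ever receive copies.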



Summarized once again touches upon the fact that the data is used for data
analytics.
Often it is aggregated or segmented in some ways, in order to facilitate analysis
and reporting.
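
For example, individual sales rows might be rolled up into one total per region. The sketch below shows such an aggregation in plain Python; the sample rows and the `total_by` helper are assumptions made for illustration.

```python
from collections import defaultdict

# Hypothetical detailed sales rows; the values are invented for illustration.
sales = [
    {"region": "North", "amount": 120.0},
    {"region": "South", "amount": 80.0},
    {"region": "North", "amount": 30.0},
]


def total_by(rows, key):
    """Aggregate the 'amount' field into one total per distinct value of `key`."""
    totals = defaultdict(float)
    for row in rows:
        totals[row[key]] += row["amount"]
    return dict(totals)


summary = total_by(sales, "region")
print(summary)
```

A report can then read the small summary instead of scanning every detailed row.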



So, that’s what a data warehouse is – a very well structured and nonvolatile, ‘de
facto’, single source of truth for a company.


© 2021 365 Data Science. All Rights Reserved

Pricing
Courses
Resources
Scholarship
Explainer videos
Events
Explainer videos
Excel
Statistics
Programming
Machine learning
Blog
Tutorials
Career Advice
Interviews
About us
Contact us
Privacy Policy
When you browse on this site, cookies and other technologies collect data to
enhance your experience and personalize the content and advertising you see. Ok,
got it! Read More
Privacy Overview
This website uses cookies to improve your experience while you navigate through the
website. Out of these cookies, the cookies that are categorized as necessary are
stored on your browser as they are essential for the working of basic
functionalities...
Necessary
Always Enabled
Non-Necessary
JanuaryPromo
×
Complete Data Science Education
00 days
08 hours
56 minutes
52 seconds
Get 50% OFF

You might also like