REDUNDANCY
The diagram raises many issues. One of them is redundant data. At first
glance, redundant data appear to be everywhere in the diagram.
In fact, what the diagram shows is data that have been transformed. If a value
remains the same after transformation, you may want to consider the data to be
redundant. Then again, you may not.
FIG. 8.4.1
A high level architecture.
Consider redundancy in the real world. Take the time of day. You can find the
time of day on the Internet, on the telephone, on the radio, on television, and
in many other places, for that matter. Does the fact that the time of day appears
redundantly in many places become a bother? It becomes a bother only if
there is no way to determine what the accurate time is. If there were no
definitive source of time, then having time appear redundantly would be a problem.
But as long as there is some definitive source somewhere and as long as most
redundant sources adhere to that definitive source, then there is no problem. In
fact, having redundant sources of time is actually quite helpful, as long as there
is no problem with the integrity of that time.
Therefore, having redundant data across the enterprise as seen in Fig. 8.4.1 is
not an issue as long as the integrity of the data is not an issue.
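The point about a definitive source can be illustrated with a minimal sketch. It assumes one definitive value and several redundant copies of it; the `Copy` class, the location names, and the balances are hypothetical and do not come from the diagram.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Copy:
    location: str   # hypothetical name of a place that holds a redundant copy
    value: float    # the copied value, for example an account balance


def find_discrepancies(definitive_value, copies, tolerance=0.0):
    """Return the locations whose copy disagrees with the definitive source."""
    return [c.location for c in copies
            if abs(c.value - definitive_value) > tolerance]


# Illustrative data only: three redundant copies of one balance.
copies = [
    Copy("online_system", 1250.00),
    Copy("data_warehouse", 1250.00),
    Copy("marketing_data_mart", 1190.00),  # a copy that has drifted
]

print(find_discrepancies(1250.00, copies))  # ['marketing_data_mart']
```

Redundancy is harmless here precisely because every copy can be checked against one definitive value. Without that value, there would be no way to say which of the three copies is wrong.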
FIG. 8.4.2
The system of record.
[...] established and managed. Your bank account balance may appear in many places
throughout the bank. But there is only one place where the system of record is kept.
The system of record moves throughout the data architecture that has been
described. Fig. 8.4.2 depicts this movement.
As data are captured, especially in the online environment, they have their
first occurrence in the system of record. Location 1 shows that
the system of record for current-valued data is found in the online environment.
Think of calling the bank and asking for your current account balance: the bank
looks into its online transaction processing environment to find the balance
as it stands right now.
Then one day, you have an issue with a bank transaction that occurred 2 years ago.
Your lawyer requires you to go back and prove that you made a payment 2 years
ago. You can’t go to the online transaction processing environment. Instead, you
go to your record in the data warehouse. As data age, the system of record
for older data moves to the data warehouse. That is location 2 in the diagram.
CHAPTER 8.4: Data Architecture: A High-Level Perspective
Time passes and you get audited by the IRS. This time, you have to go back
10 years to prove what financial activity you had a decade ago.
Now, you go to the archival store in big data. That is location 3 in the diagram.
So, as time passes, the location of the system of record for a piece of data
changes within the data architecture.
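The movement of the system of record can be sketched as a lookup that depends on the age of the data. The text gives no exact boundaries between locations 1, 2, and 3, so the two cutoffs below are assumptions chosen only for illustration, and the function name `system_of_record` is hypothetical.

```python
from datetime import date, timedelta

# Assumed cutoffs; the text does not state when data move between locations.
WAREHOUSE_AFTER = timedelta(days=90)
ARCHIVE_AFTER = timedelta(days=5 * 365)


def system_of_record(transaction_date, today):
    """Name the store that holds the system of record for a transaction."""
    age = today - transaction_date
    if age < timedelta(0):
        raise ValueError("transaction_date is in the future")
    if age >= ARCHIVE_AFTER:
        return "archival store (location 3)"
    if age >= WAREHOUSE_AFTER:
        return "data warehouse (location 2)"
    return "online environment (location 1)"


today = date(2024, 6, 1)
print(system_of_record(today, today))             # current balance
print(system_of_record(date(2022, 6, 1), today))  # payment made 2 years ago
print(system_of_record(date(2014, 6, 1), today))  # activity from 10 years ago
```

With these assumed cutoffs, the three calls resolve to the online environment, the data warehouse, and the archival store, which matches the three bank scenarios described above.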
FIG. 8.4.3
Answering different questions throughout the architecture.
Location 4 holds the data marts. The data marts are where bank
management combines your account information with thousands of other
accounts and looks at the information from the perspective of a department.
One department looks at the data in the data marts from an accounting
perspective. Another department looks at the data from the perspective of
marketing and so forth.
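The idea that departments view the same accounts from different perspectives can be sketched as two summaries built from one set of records. The field names, the sample accounts, and the two functions are hypothetical; real data marts would be far larger and would be fed from the data warehouse.

```python
from collections import defaultdict

# Hypothetical account records, invented for illustration.
accounts = [
    {"account_id": "A1", "branch": "north", "segment": "retail", "balance": 1200.0},
    {"account_id": "A2", "branch": "north", "segment": "business", "balance": 5400.0},
    {"account_id": "A3", "branch": "south", "segment": "retail", "balance": 300.0},
]


def accounting_mart(rows):
    """Accounting perspective: total balance per branch."""
    totals = defaultdict(float)
    for row in rows:
        totals[row["branch"]] += row["balance"]
    return dict(totals)


def marketing_mart(rows):
    """Marketing perspective: number of accounts per customer segment."""
    counts = defaultdict(int)
    for row in rows:
        counts[row["segment"]] += 1
    return dict(counts)


print(accounting_mart(accounts))  # {'north': 6600.0, 'south': 300.0}
print(marketing_mart(accounts))   # {'retail': 2, 'business': 1}
```

Both summaries draw on the same underlying accounts. Only the grouping and the measure change with the department asking the question.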
The data found in location 5 afford yet another perspective. Location 5 is
where big data is found. There is deep history there, along with a variety of
other data. The kinds of analysis that can be done in location 5 are
diverse.
Of course, the data and the types of analysis that can be done differ from
industry to industry. A bank has been used here to make the example clear,
but other industries have other types of usage information.
DIFFERENT COMMUNITIES
Different communities use the information found in data architecture. In
general, the clerical community uses information found in locations 1 and 2.
Everyone uses the data found in location 3. The data warehouse serves as a
crossroads for information throughout the organization. Different functional
departments use the information found in location 4. And location 5 serves
as an omnibus for the entire organization.