
The Information Value Chain:

Connecting Internal and External Data AND Data Warehousing Driving Quality & Integration

30-05-2012

DATA WAREHOUSE FOR DUMMIES- CH 19 & 20

Identification Of Data Needed From Other People


What else do we need that we don't already have on the list of data sources? The following list can be used as a starting point for identifying external data:
The competitors' unit and revenue sales results from the regions in which you both compete.
Historical demographic data, such as population trends, per-household and per-capita income, and local and regional unemployment data.
Economic forecasts.
Information about your customers' activities and their behaviour with other companies.
In addition to comparative data, be certain to look into partnerships. You might want to determine whether an information value chain has emerged because of key relationships with suppliers and vendors within your company's supply chain.

Determining What External Data We Really Need


You have to apply your business needs analysis before you begin to consider what data to acquire. It's not desirable to collect data that wasn't required and proves to be irrelevant. To avoid that, follow these steps:
Revalidate your list of total users.
For each person on the list, answer this question: To perform his or her assigned business functions most effectively, does this person need any data that's not available from the company's internal computer systems?
Using the results from the interviews, create a consolidated list of external data needs, the sources from which you can obtain the data, prices and fees, restrictions, and contact information.
Talk to the project sponsor about budget approval.

Ensuring The Quality Of Incoming External Data


Restocking Your External Data


Finding external information

Acquiring External Data
Gathering general information
Cruising the internet


Knowing What To Do With Historical External Data

When we begin building a data warehouse, we probably have to use as a data source a couple of years' worth of backups that are sitting in some archived, offline form such as tapes, CDs, or DVDs, collecting dust.

Be prepared to analyse each archived medium for its structure and content before loading the data into the warehouse. You can't stop your data analysis after you study only the current format and content guidelines, because at least a few things have probably changed over the period of time that those tapes, CDs, or DVDs represent.
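
Part of this analysis can be automated. The following is a minimal sketch, assuming the archived backups have already been restored to comma-delimited text with a header row. The function name find_layout_changes, the column names, and the backup years are hypothetical and serve only as illustration. It compares each archive's header with the current layout and reports columns that were added or dropped over time.

```python
import csv
import io


def find_layout_changes(current_columns, archives):
    """Compare each archive's header row with the current column layout.

    `archives` maps a label (for example, a backup year) to the text of a
    restored, comma-delimited extract. Returns a dict keyed by label that
    lists columns missing from, or extra to, the current layout.
    """
    current = set(current_columns)
    report = {}
    for label, text in archives.items():
        reader = csv.reader(io.StringIO(text))
        header = next(reader, [])  # first row is assumed to be the header
        found = {name.strip() for name in header}
        missing = sorted(current - found)
        extra = sorted(found - current)
        if missing or extra:
            report[label] = {"missing": missing, "extra": extra}
    return report


# Hypothetical sample data: the 2009 backup predates the "region" column
# and still carries a since-retired "fax" column.
archives = {
    "2009": "customer_id,name,fax\n1,Acme,555-0100\n",
    "2011": "customer_id,name,region\n1,Acme,West\n",
}
changes = find_layout_changes(["customer_id", "name", "region"], archives)
print(changes)
```

A header comparison only catches structural drift. Changes in the meaning or coding of a column's values still need a manual look at the content.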


Switching From One External Data Provider To Another


The Infrastructure Challenges


When you develop any application or system, whether data warehousing or a more traditional transaction-processing application, you have significant dependencies on pieces of the overall environment over which you have no direct control. Here are some examples specific to data warehousing:
You design a data warehouse that, based on business requirements, must receive gigabytes of new and updated data extracted from various sources each evening and sent over the network to the hardware platform on which the data warehouse is running.
During the data warehousing project's scope phase, you determine that a push strategy for updating the data warehouse is the most appropriate model to follow. To implement a push strategy, though, you must modify each source application to include code that detects when that application must push (send) data to the data warehouse.
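
To make the push model concrete, here is a minimal sketch of the kind of detection code a source application would need. It assumes each record carries a last-modified timestamp and that the application remembers when it last pushed. The class PushingSource, its methods, and the send callback are hypothetical names used for illustration only.

```python
from datetime import datetime


class PushingSource:
    """Toy source application that pushes changed rows to a warehouse."""

    def __init__(self, send):
        self.rows = {}                 # key -> (payload, last_modified)
        self.last_push = datetime.min  # nothing has been pushed yet
        self.send = send               # callable that delivers rows downstream

    def upsert(self, key, payload, modified_at):
        """Insert or update a row and record when it changed."""
        self.rows[key] = (payload, modified_at)

    def push_changes(self, now):
        """Send every row modified since the previous push."""
        changed = {
            key: payload
            for key, (payload, modified_at) in self.rows.items()
            if modified_at > self.last_push
        }
        if changed:
            self.send(changed)
        self.last_push = now
        return len(changed)


# Hypothetical usage: a list stands in for the warehouse's load interface.
received = []
source = PushingSource(send=received.append)
source.upsert("order-1", {"amount": 100}, datetime(2012, 5, 29, 9, 0))
source.upsert("order-2", {"amount": 250}, datetime(2012, 5, 29, 17, 30))
first = source.push_changes(now=datetime(2012, 5, 29, 23, 0))

source.upsert("order-2", {"amount": 275}, datetime(2012, 5, 30, 10, 15))
second = source.push_changes(now=datetime(2012, 5, 30, 23, 0))
print(first, second, received)
```

The sketch illustrates the dependency described above: every source application needs its own version of this change-tracking logic, and the teams that own those applications have to agree to add it.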

Data Warehouse Data Stores


A data warehouse is, by its very nature, a distributed physical data store. Distributing your information assets improves performance and usability across systems and across the enterprise. Make this level of usability the cornerstone of your data warehousing mission and objective. The figure shows how the important data stores of a data warehousing architecture incorporate sources of data, the data warehouse, an operational data store, data marts, and master data.

Source Data Feeds


Source data feeds are the inputs that feed the data warehouse: typically, your run-the-business application databases, as well as external data sources, such as credit rating data or market segment information. Although the data warehousing team doesn't manage the data and architecture associated with these data stores, the team needs to understand the data feeds. Just like a horse without hooves can't function properly, a data warehouse without sources can't get the job done. The most difficult task you face in data warehousing is choosing the right source, or system of record, for data that moves into the data warehouse. If the data is of low quality or isn't readily available, you have a hard time supporting a high-quality data warehouse.


Operational Data Store (ODS)


Here's my definition of an ODS (it's a long one): an informational and analytical environment that reflects at any point the current operational state of its subject matter, even though data that makes up that operational state is managed in different applications elsewhere in the enterprise. This list explains each part of the preceding definition:
Informational and analytical environment
Reflects at any time the current operational state
Subject matter: Like with a data warehouse, create an ODS with a specific business mission in mind for a manageable set of subject areas.
Data managed in different applications elsewhere in the enterprise

Master Data Management (MDM)


In recent years, ODS-style feedback systems defined for a specific purpose, reference data, have emerged. All systems are packed with reference data. This data can include, for example, the set of values you use to describe the stage of a sales opportunity. At a basic level, MDM seeks to ensure that an organization doesn't use multiple (potentially inconsistent) versions of the same reference (or master) data in different application systems or parts of its operations. Processes commonly seen in MDM solutions include source identification, data collection, data transformation, normalization, rule administration, error detection and correction, data consolidation, data storage, data distribution, and data governance. These processes are also used in the logical evolution of the ODS feedback loop.
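
As a small illustration of the normalization and error-detection steps, the sketch below assumes two application systems each keep their own copy of sales-stage reference data. The system names, stage codes, labels, and the function find_conflicts are hypothetical. It normalizes the labels and then reports any code that means different things in different systems.

```python
def normalize(label):
    """Normalize a reference label so cosmetic differences are ignored."""
    return " ".join(label.strip().lower().split())


def find_conflicts(systems):
    """Return codes whose normalized label differs between systems.

    `systems` maps a system name to its {code: label} reference table.
    """
    seen = {}  # code -> {normalized label: [system names]}
    for system, table in systems.items():
        for code, label in table.items():
            by_label = seen.setdefault(code, {})
            by_label.setdefault(normalize(label), []).append(system)
    return {code: labels for code, labels in seen.items() if len(labels) > 1}


# Hypothetical reference tables held by two application systems.
systems = {
    "crm": {"S1": "Prospect", "S2": "Proposal Sent", "S3": "Closed Won"},
    "billing": {"S1": " prospect ", "S2": "Negotiation", "S3": "Closed  Won"},
}
conflicts = find_conflicts(systems)
print(conflicts)
```

Here only code S2 is reported, because the other differences are cosmetic (case and spacing). Deciding which version becomes the master record is a governance question that code alone does not settle.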


Service-oriented Architecture
SOA is a method for systems development and integration in which functionality is grouped around business processes and packaged as interoperable services. SOA also describes IT infrastructure that allows different applications to exchange data with one another while they participate in business processes. An SOA aims to loosely couple services with the operating systems, programming languages, and other technologies that underlie applications. This process is very similar to what happened with audio-visual equipment as it evolved.


Dealing With Conflict: Special Challenges To Your Data Warehousing Environment


The data warehouse, if done properly, becomes the heart of your systems, managing the data assets and assuring their quality and accessibility.
Others will minimize the importance of the data warehouse, defining it merely as a dumb reporting solution.

Why would someone define it this way? Consider these factors:

Because your data warehousing project affects every application from which you plan to extract data, no matter how unintrusively, the project also affects the staff members who support those applications and their databases (or files).



Cont.
The network administration staff, already struggling with major companywide networking initiatives, now must help you figure out whether the network has enough bandwidth available to support your regular data extraction, movement, and loading procedures and, furthermore, whether you can secure the data from outside intrusion.

The database administration group, the people who handle all database structure creation and modification, are appalled that you want to make frequent changes to your data warehouse's database definitions by yourself, without their involvement.


Cont.
The program management office (PMO) is wondering why you haven't filled out their mandatory set of templates to be Sarbanes-Oxley (SOX) compliant, because the PMO doesn't realize that you're not using their long, drawn-out waterfall technique for delivery (instead going with a more apt agile-delivery technique).

The business organization, whose members today use data extracts along the lines of "they aren't perfect, but they give us mostly what we need," suddenly finds out that they're losing their report-writing tools. To get their regular reports, they must figure out how to use the new business intelligence suite of tools that you're deploying with the data warehouse.


You need these people; make no mistake about it. Here's how to handle these situations:
