Workshop BI Acquisition

Data Acquisition Fundamentals

Scope Part 2.
The first lesson describes the flow of data between BI and the source systems that contain the data. The second lesson shows the procedure for loading master data (attributes and texts) from an SAP system. In the third lesson we discuss the data transfer process in more depth and detail: we cover the available transformation rule types and more advanced start and end routines, and, upon completion, we visualize our data in the InfoCube.

Generic Data Warehouse: Positioning of the Data Flow

The ETL process, sometimes called the data flow, is the list of steps that raw (source) data must follow to be extracted, transformed, and loaded into targets in the BI system.

BI Architecture: Positioning of the ETL Process

BI Data Flow Details

Source Systems and DataSources

A source system is any system that is available to BI for data extraction and transfer purposes. Examples include mySAP ERP, mySAP CRM, PeopleSoft, a custom Oracle DB-based system, and many others. DataSources are BI objects used to extract and stage data from source systems. A DataSource contains a number of logically related fields that are arranged in a flat structure and contain data to be transferred into BI. DataSources subdivide the data provided by a source system into self-contained business areas. Our cost center example includes cost center text, master data, and Cost Center Transaction DataSources from two different source systems.
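To picture what "logically related fields in a flat structure" means, here is a minimal sketch of a hypothetical cost center text field list written as an ABAP structure; the field names and lengths are illustrative assumptions, not the Business Content originals.

* Hypothetical flat row type of a cost center text DataSource.
* There is no nesting: one flat record type, staged 1:1 in the PSA.
TYPES: BEGIN OF ty_costcenter_text,
         costcenter TYPE c LENGTH 10, " cost center key
         langu      TYPE c LENGTH 1,  " text language
         datefrom   TYPE d,           " valid-from date
         txtsh      TYPE c LENGTH 20, " short text
         txtmd      TYPE c LENGTH 40, " medium text
       END OF ty_costcenter_text.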

Source System Types and Interfaces

Persistent Staging Area

Persistent Staging Area (PSA) is an industry term, but not everyone agrees on an exact definition. In response to a posting on Ask the Experts at DMreview.com, Evan Levy defines a PSA as:
1. The storage and processing to support the transformation of data.
2. Specifically built to provide working (or scratch) space for ETL processing.
3. Typically temporary.
4. Not constructed to support end-user or tool access.

BI 7.0 Transformation

Once the data arrives in the PSA, you then cleanse / transform it prior to physical storage in your targets. These targets include InfoObjects (master data), InfoCubes, and DataStore Objects.

Optional BI InfoSources

InfoPackages and Data Transfer Processes 1

The design of the data flow uses metadata objects such as DataSources, Transformations, InfoSources, and InfoProviders. Once the data flow is designed, the InfoPackages and the Data Transfer Processes take over to manage the execution and scheduling of the actual data transfer. As you can see from the figure below, there are two processes that need to be scheduled.

InfoPackages and Data Transfer Processes 2

The first process is loading the data from the source system. This involves multiple steps that differ depending on which source system is involved. For example, if it is an SAP source system, a function call must be made to the other system, and an extractor program associated with the DataSource might be initiated. An InfoPackage is the BI object that contains all the settings directing exactly how this data should be uploaded from the source system. The target of the InfoPackage is the PSA table tied to the specific DataSource associated with the InfoPackage. In a production environment, the same data in the same source system should only be extracted once, with one InfoPackage; from there, as many data transfer processes as necessary can push this data to as many InfoProviders as necessary.

InfoPackages and Data Transfer Processes Initiate the Data Flow

InfoPackages and Data Transfer Processes 3

The second process identified in the figure is the data transfer process. It is this object that controls the actual data flow (filters, update mode delta or full) for a specific transformation. You might have more than one data transfer process if you have more than one transformation step or target in the ETL flow; note that if you involve more than one InfoProvider, you need more than one data transfer process. Sometimes necessity drives very complex architectures. This more complex situation is shown below.

More Complex ETL: Multiple InfoProviders and InfoSource Use

Loading SAP Source System Master Data Scenario

Global Transfer Routines

Cleansing or transforming the data is accomplished in a dedicated BI transformation. Each time you want to convert incoming fields from your source system to InfoObjects on your BI InfoProviders, you create a dedicated transformation, consisting of one transformation rule for each object.
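A global transfer routine attached directly to an InfoObject (as done in the scenario below) applies the same cleansing on every load of that InfoObject. The sketch below shows the kind of logic such a routine might carry; the FORM interface is a simplified assumption for illustration, not the skeleton the system actually generates in InfoObject maintenance.

* Simplified sketch of global cleansing logic for an InfoObject.
* The real skeleton is generated by InfoObject maintenance; this
* interface is an illustrative assumption.
FORM convert_costcenter
  USING    iv_source     TYPE c         " raw value from the source field
  CHANGING cv_result     TYPE c         " cleansed InfoObject value
           cv_returncode LIKE sy-subrc. " non-zero flags the record as bad
  cv_result = iv_source.
  TRANSLATE cv_result TO UPPER CASE.    " hypothetical cleansing step
  IF cv_result IS INITIAL.
    cv_returncode = 4.                  " reject empty keys
  ELSE.
    cv_returncode = 0.
  ENDIF.
ENDFORM.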

SAP Source System Extraction

DataSource Creation Access and the Generic Extractor

Replication

In order to access DataSources and map them to your InfoProviders in BI, you must inform BI of the name and fields provided by the DataSource. This process is called replication, or replicating the DataSource metadata. It is accomplished from the context menu on the folder where the DataSource is located. Once the DataSource has been replicated into BI, the final step is to activate it. As of the newest version of BI, you can activate Business Content data flows entirely from within the Data Warehousing Workbench; during this process, Business Content DataSource activation in the SAP source system and replication to SAP NetWeaver BI take place using a Remote Function Call (RFC).

DataSource in BI After Replication

Access Path to Create a Transformation

In this first load process, we are trying to keep it simple. Since we added some custom global transfer logic directly to our InfoObject, we just need field-to-field mapping for our third step: the transformation.

Transformation GUI: Master Data

InfoPackage: Loading Source Data to the PSA

Creation and Monitoring of the Data Transfer Process

Complete Scenario: Transaction Load from mySAP ERP

Emulated DataSources

Issues Relating to 3.x DataSources

Using the Graphical Transformation GUI

The Transformation Process: Technical Perspective

Start Routine 1

Start Routine 2
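The Start Routine slides show the routine in the transformation editor. For orientation, here is a minimal sketch of the skeleton the BI 7.0 editor generates for a start routine; the generated type names (_ty_t_SC_1 / _ty_s_SC_1) vary per transformation, and COSTCENTER is a hypothetical source field.

* A start routine runs once per data package, before any
* transformation rule, and may modify or delete source records.
METHOD start_routine.
*   SOURCE_PACKAGE (type _ty_t_SC_1) holds the whole data package.

*   Typical use: drop records the targets never need, so the
*   individual rules process fewer rows.
    DELETE SOURCE_PACKAGE WHERE costcenter IS INITIAL.

*   Records can also be corrected in place via a field-symbol loop.
    FIELD-SYMBOLS <source_fields> TYPE _ty_s_sc_1.
    LOOP AT SOURCE_PACKAGE ASSIGNING <source_fields>.
      TRANSLATE <source_fields>-costcenter TO UPPER CASE.
    ENDLOOP.

*   To abort the whole request: RAISE EXCEPTION TYPE cx_rsrout_abort.
ENDMETHOD.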

Transformation Rules: Rule Detail

Transformation Rules: Options and Features

Transformation: Rule Groups

A rule group is a group of transformation rules. It contains one transformation rule for each key field of the target. A transformation can contain multiple rule groups, which allow you to combine various rules. This means that you can create different rules for different key figures for a characteristic: for example, one rule group can map a source amount to a plan key figure, while a second rule group maps the same source field to an actual key figure.

Transformation Groups: Details

End Routine
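As a counterpart, a minimal sketch of the generated end routine skeleton: it runs after all transformation rules, against RESULT_PACKAGE in the target format. The _ty_s_TG_1 type name is generated per transformation, and AMOUNT / HIGHVALUE_FLAG are hypothetical target fields.

* An end routine runs once per data package, after all rules, and is
* typically used to derive or validate fields in target format.
METHOD end_routine.
    FIELD-SYMBOLS <result_fields> TYPE _ty_s_tg_1.

    LOOP AT RESULT_PACKAGE ASSIGNING <result_fields>.
*     Hypothetical derivation: flag high-value records.
      IF <result_fields>-amount > 10000.
        <result_fields>-highvalue_flag = 'X'.
      ENDIF.
    ENDLOOP.

*   Records deleted here never reach the target.
    DELETE RESULT_PACKAGE WHERE amount IS INITIAL.
ENDMETHOD.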

Data Acquisition Layer

Extraction using DB Connect and UD Connect

UD Connect Extraction Highlights

DB Connect Extraction

Technical View of DB Connect

XML Extraction

XML Purchase Order Example

XML Extraction Highlights

Loading Data from Flat Files: Complete Scenario

Features of the BI File Adapter and File-Based DataSources

Basically, a DataSource based on a flat file is an object that contains all the settings necessary to load and parse the file when it is initiated by the InfoPackage. Some of the features of the BI file adapter are listed below.
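As a concrete illustration, here is the kind of file such a DataSource might parse (a hypothetical semicolon-separated sample; the Extraction tab covered on the following slides is where the data separator, escape sign, and header-row settings that drive the parsing are maintained):

COSTCENTER;LANGU;TXTMD
0000004711;E;Facilities Management
0000004712;E;Corporate Marketing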

File System DataSource: Extraction Tab

File System DataSource: Proposal Tab

File System DataSource: Fields Tab

File System DataSource: Preview Tab

BI Flexible InfoSources

A New BI InfoSource in the Data Flow

Complex ETL: DataSource Objects and InfoSources

DTP: Filtering Data

Error Handling

The data transfer process supports you in handling data records with errors, and you can determine how the system responds if errors occur. At runtime, the incorrect data records are sorted and can be written to an error stack (a request-based database table). In addition, another feature, called temporary storage, supports debugging bad transformations. The data transfer process also supports error handling for DataStore objects.

Error Processing

Features of Error Processing

More Error Handling Features

DTP Temporary Storage Features

Access to the Error Stack and Temporary Storage via the DTP Monitor

Loading and Activation in DataStore Objects

A standard DataStore Object has three tables. Previously, we described the three tables and the purpose of each, but we only explained that a data transfer process is used to load the first one. In the following section, we will look at an example to illustrate exactly what happens when data is uploaded and subsequently activated in a DataStore Object. In addition, we will examine DataStore Object activation, which is the technical term used to describe how these tables get their data. Let us assume that two requests, REQU1 and REQU2, are loaded into the DataStore Object. The load process posts both requests into the activation queue; this can occur sequentially or in parallel.

Loading Data into the Activation Queue of a Standard DataStore Object

Activation Example: First Load Activated

Activation Example: Offsetting Data Created by Activation Process 1

Activation Example: Offsetting Data Created by Activation Process 2

If the DataStore Object was not in the flow of data in this example, and the source data flowed directly to an InfoCube, the InfoCube would add the 10 to the 30 and get an incorrect value of 40. If, instead, we feed the change log data to the InfoCube, the -10, 10, and 30 add up to the correct value of 30. In this example, a DataStore Object was required in the data flow before the InfoCube. It is not always required, but many times it is desired.
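To make the arithmetic concrete, here is the record flow for this single hypothetical key figure value:

Load 1 activated:            active table = 10, change log gets +10
Load 2 (corrected value 30)
activated:                   active table = 30, change log adds -10 (before image)
                             and +30 (after image)
InfoCube fed from change log:  +10 - 10 + 30 = 30   (correct)
InfoCube fed directly:          10 + 30      = 40   (incorrect)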

Integrating a New Target

MultiProviders

A MultiProvider is a special InfoProvider that combines data from several InfoProviders, providing it for reporting. The MultiProvider itself (like InfoSets and VirtualProviders) does not contain any data; its data comes exclusively from the InfoProviders on which it is based. A MultiProvider can be made up of various combinations of the following InfoProviders:
- InfoCubes
- DataStore Objects
- InfoObjects
- InfoSets
- Aggregation levels (slices of an InfoCube to support BI Integrated Planning)
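Because a MultiProvider forms a union rather than a join (as the later slide title notes), each record keeps its origin and a query then aggregates across them. A hypothetical plan/actual illustration:

Plan InfoCube:    COSTCENTER 4711 | PLAN_AMT 100
Actual InfoCube:  COSTCENTER 4711 | ACT_AMT   90

MultiProvider (union of both):
  COSTCENTER 4711 | PLAN_AMT 100 | ACT_AMT   0    (from the plan cube)
  COSTCENTER 4711 | PLAN_AMT   0 | ACT_AMT  90    (from the actual cube)

Query result after aggregation: COSTCENTER 4711 | PLAN_AMT 100 | ACT_AMT 90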

MultiProvider Concept

Advantages of the MultiProvider

Simplified design: the MultiProvider concept provides you with advanced analysis options without you having to fill new and extremely large InfoCubes with data. You can construct simpler BasicCubes with smaller tables and with less redundancy. Individual InfoCubes and DataStore Objects can be partitioned separately; "partitioned separately" can relate to the concept of splitting cubes and DataStore Objects into smaller ones, with performance gains through parallel execution of subqueries.

MultiProviders Are Unions of Providers

Example: Plan and Actual Cost Center Transactions

MultiProvider Queries

Selecting Relevant InfoProviders for a MultiProvider

MultiProvider Design GUI

Characteristic Identification in a MultiProvider

Key Figure Selection

Centralized Administration Tasks

Process Chains: Automating Warehouse Tasks

Summary of Dedicated BI Task Monitors

Administration / Managing InfoCubes

Select the InfoCube that you want to manage and choose Manage from the context menu. Six tab pages appear:
- Contents
- Performance
- Requests
- Roll-Up
- Compress
- Reconstruct (only valid with 3.x data flow objects)

The Manage function allows you to display the contents of the fact table, or the contents with selected characteristic values (through a view of the tables provided by the Data Browser). You can also repair and reconstruct indexes, delete requests that have been loaded with errors, roll up requests in the aggregates, and compress the contents of the fact table.

Managing InfoCubes

Requests in InfoCubes

Compressing InfoCubes

Management Functions of DataStore Objects

The functions on the Manage tab are used to manage standard DataStore Objects. The three tabs under the Manage option for DataStore Objects are: Contents, Requests, and Reconstruction. There are not as many tabs for managing DataStore Objects as in the equivalent task for InfoCubes; the functions for InfoCubes are more complex.

DataStore Object Administration

Contents and Selective Deletion

DataStore Object Administration: Requests Tab

The Query icon, indicating readability by BEx queries, is set when activation is started for a request. The system does not check whether the data has been successfully activated.

DataStore Object Change Log: Maintenance Required
