


SAP Community Network Wiki - Enterprise Information Management - EIM Home


EIM Home
Added by Moshe Naveh, last edited by Vicky Bolster on Mar 26, 2012
Moderators: Kendra Van Gundy | Brandon Jacobson | Kris Sorenson

How to contribute: Welcome to the SAP Enterprise Information Management (short: EIM) topic. Feel free to create new entries or add to existing ones.

Empower your people to make better decisions, drive operational excellence, ensure regulatory compliance, and minimize IT costs based on trusted information. SAP solutions for Enterprise Information Management help you deliver integrated, accurate, and timely data, both structured and unstructured, across your enterprise.

Click here to submit content. Please use Page Template for all Submissions.

SCN Discussions (formerly called Forums): Postalsoft | Data Integration and Data Quality

EIM Staging Area

Data Services Getting Started with DataServices Compare DataServices with... Comparing DataServices with hand coding Challenges with Scripts Development goodies in DataServices Reducing the cost of ownership with DataServices Scenario 1 - Copy Scenario 1 - Copy via SQL Scenario 1 - Copy via DataServices Scenario 2 - SCD2 Scenario 2 - SCD2 via SQL Scenario 2 - SCD2 via DataServices Scenario 2 - SCD2 via SSIS Scenario 3 - Fact Load Scenario 3 - Fact Load via SQL Scenario 3 - Fact Load via DataServices Video Tutorials Supported Platforms documentation How to download a new release DataServices and the RampUp program How to get a new License Key I am new to DS, where to start? Training and certification of ACE

Postalsoft Business Edition and DeskTop Mailer DataRight IQ Frequently Used USPS and Industry Links Label Studio Match Consolidate Postal File Preparation Tool Presort PrintForm Information Steward Business Terms Glossary (Metapedia) Cleansing Package Builder Data Insight Information Steward Installation Metadata Management Metapedia User Management and Security Data Quality Management for Enterprise Apps Data Quality Management for Informatica Data Quality Management for SAP Data Quality Management for Siebel RapidMarts Introduction To RapidMarts Data Services Postalsoft

Welcome to EIM!

Data Quality Management for Enterprise Apps Information Steward Data Quality Management SDK Enterprise Information Management Use Case Wiki RapidMarts

Submit your content! Click here to submit content. Please use Page Template for all Submissions. (All new content is to be created in the Staging Area until Moderation and Point Assignment.)

Watch out! RSS Feeds - Coming Soon! Click here to watch when content is updated

Help! Our Wiki User Guide - Coming Soon! Contact Moderators via contact information at top of page

DataServices and other EIM products How to file a support case Where is a DS and DQ FAQ Where to find documentation Setup scripts How To - typical DataServices questions Access to Previous Row Values Previous row processing via custom function Previous row processing via self join Previous row processing via User Defined Transform Adapter SDK Tutorial Integration of Adapters into DI Types of Adapters Document Source-Target Table Source-Target FunctionCall Adapter Operations Poll Operation - Realtime Service bridge Listener Operation - Outbound Message A simple Table Read Adapter Interfaces to Implement The Adapter Interface The Session Interface The MetadataBrowsing Interface The MetadataNode for the RootNode The MetadataNode for the FileNode The MetadataImport Interface ImportByName Interface The TableSource Interface The GUI Interfaces For the ImportByName For the DataStore Example II - the eMail Adapter The Coding Installation Using the Adapter The XML and processing options Reading eMails in Batch Processing eMails in Realtime Installing the Adapter Adding and starting the SFAdapter Creating a new DataStore Execute a DataFlow ANSI 92 joins in DataServices ANSI 92 right outer join in DataServices Auditing and Validation Audit Points Data Validation Build a "where exists" query Complex transformation rules Routing via the Case Transform Lookup_ext() function lookup_ext() with pattern lookup_ext() with return expressions Control the Commit points Create a Designer Desktop Icon Creating a last insert and last update date Consuming an external Web Service Cumulative Sum Database Session parameters Datastore Configurations Using configurations for porting Datatype conversions Debugging jobs using log files Delta Load Implementation Timestamp based delta Logtable based delta Database Transactionlog based delta

Recently Updated
test updated by Shari Wennes (May 01)
girls.jpg attached by Shari Wennes (May 01)
Report Generation updated by Brandon Law (May 01)
CGUL Tips and Tricks for Entity Extraction updated by Julie Oliver (May 01)
Data Services User Resource limits on Unix systems updated by Lina Encinales (May 01)




Replication based delta Messaging based delta Table Comparison based delta Initial load as delta Ignore data not possible to be modified for delta Error Handling ETL Project Guidelines Goals Global Variables Initialize-End Script AW_EndJob AW_JOBEXECUTION AW_StartJob Components Sections Dimension Tables Fact Tables PreLoad Stored Procedure PreLoad Stored Procedure for Oracle PostLoad Stored Procedure PostLoad Stored Procedure for Oracle Initial vs. Delta Load Other ETL Project Rules Restartability for Initial Loads Restartability for Delta Loads Supports the Recovery Feature Testing Flat Files Change the Row Delimiter Errorhandling in file formats Excel- Save to DI How to create a flat file format from a Query Multirecord Files Reading a large XML file Reading multiple files at once File Group Reader Selective Reading and Postprocessing Shared Directory access Writing a large XML file Help, my Dataflow consumes so much memory! How to call a function returning multiple parameters How to delete records? How to split a comma separated String into multiple rows? Identify a Bottleneck in a Dataflow Display Optimized SQL Monitor Log File The lookup tuning The Stepped Execution Carrying Attributes Where is the bottleneck now? Thread Names in the Monitor Log Identify long running dataflow Installation and Architecture Example 1 - Three parallel projects Example 2 - One project with three developers Example 3 - Remote Development Architecture Details Where to put the Jobserver Where to place the Repos Naming the database accounts Moving to Prod via Designer - Push into Target Repo Moving to Prod via Designer Export into ATL File Central Repository - Yes or No Creating a Central Repo



Using a Central Repo Installation Checklists

Prerequisites when installing Designer Installing Designer Creating a Repository Prerequisites for Windows Jobserver Prerequisites for Unix Jobserver Installing Windows Jobserver Installing Unix Jobserver Loading log files (Error, Trace, Monitor) into a table Multiple Codepages Terminology The different Software Layers Data Integrator example ODBC connections from a Linux (or UNIX) jobserver To configure Teradata ODBC on Linux and Unix To configure DataDirect ODBC on Linux and Unix Oracle CDC The Publisher and the Subscriber Setting Up a CDC Environment Create a CDC Datastore Creating the Job Oracle Hints and DI Parallel vs sequential Execution Read from Excel Realtime at a Glance Batch or Realtime? Batch vs. RealTime Flows The Nested Relational Data Model NRDM What is it (good for)? Master-Detail Table to NRDM An additional Hierarchy Level Unnest or NRDM join One table to NRDM Separating a NRDM node Building an XML String Memory requirements of NRDM Realtime Objects in batch and vice versa The RealTime DataFlow Acting as Server Building the DTDs The Job and the DataFlow Caching and RealTime How to test Realtime Jobs Setting up the Service Sending Messages to the Service Clienttest Utility Connecting to a WebServer Setting up as WebService The Client Flow Setting up the WebserviceAdapter (pre DI 12.1) Creating a datastore of type WebService (DI 12.1 and higher) Calling Webservices Guaranteed Delivery Recovery Scheduling Using other schedulers Building Job Chains Event based scheduling Scheduling via WebServices Using the SAP scheduler Sharing Caches SQL Server 2008 CDC SQL Server 2008 CDC - Transactional DataFlow




SQL Server 2008 CDC - Delta SQL Server 2008 CDC - Impact on source System SQL Server DeadLocks SQL Server Identity column Staging tables automated Cases where an additional Data_Transfer is added automatically Performance of DataServices Caching in DI DI Caching Example Pageable Cache and DSConfig Pageable Cache and Sort Order Caching and Execute as separate Process DataServices Performance example DataServices Performance example ETL Speed DataServices Performance example - Details DataServices Performance example - DWH tasks missing DataServices Performance example - Why does it not scale? Performance of the Customer Dimension example Customer SCD2 Initial Load Customer SCD2 Delta Load Customer SCD2 Initial Load with SQL Customer SCD2 Delta Load with SQL Performance of the Material Dimension example Material Dimension Initial Load Material Dimension Delta Load Material Dimension Initial Load with SQL Material Dimension Delta Load with SQL Performance of the Order Fact example Fact Initial Load Fact Delta Load Fact Initial Load with SQL Fact Delta Load with SQL Performance when DataServices is on a separate server Customer SCD2 Initial Load DataServices separate Fact Initial Load DataServices separate Data Quality Performance example Address Cleanse and Geocoding Data Quality Performance example - Address Cleanse Details Data Quality Performance example Match and Data Cleanse How to use the DataServices sizing dashboard Degree of Parallelism DoP and Partitions High Performance Loads with Oracle Inserts vs. Updates Speed up Updates Speed up Inserts Speed up Inserts Part 2 Loads and Indexes Parallel processing Putting it all together How to lookup a row Database Join DI Join




I want to lookup in a selected dataset, not just a table lookup_ext and constants lookup, lookup_ext, lookup_seq what is the difference?? multiple lookups sql function lookup_ext() inside a custom function Calling a stored procedure Dynamic Lookups Joining tables in the engine Monitor Sample Rate Myths about ELT tools is fastest If there is no database link Database is faster for joining data PL-SQL scripts are faster than any ETL tool Database Links Implementing lookups Nested SQL SQL for a Slow Changing Dimension Having two target tables Performance characteristics at customers Installation steps for the Benchmark Installation for Oracle Installation for SQL Server Installation for others Monitoring Results Results Version 1.0 Results with different DI Versions Test Details DF_Benchmark_read DF_Benchmark_API_bulkloader DF_Benchmark_regular_load DF_Benchmark_single_thread DF_Benchmark_lookup_DOP1 DF_Benchmark_lookup_DOP10 Performance of Functions Performance of nesting and unnesting Performance of Reader, Engine and Loader Source-Query-Target Without any options With Bulkloader turned on Ignoring Reader and Loader Ignoring the Loader With API Bulkloader turned on Performance of Selfjoins Performance of Transforms Generation Transforms Date Generation Transform Row Generation Transform Streamline Transforms Case Transform History Preserving Key Generation Map CDC Operation Map Operation Merge Transform Pivot Query (simple) Validation Streamline Transforms with (SQL) overhead SQL Transform Table Comparison (row by row setting) Table Comparison (sorted input) Cached Transforms Hierarchy Flattening Query with distinct


Query with group by Query with order by Table Comparison (cache mode) Other Transforms Effective Date Reverse Pivot Multiple Transforms working in conjunction Loading a table with surrogate key Slow Changing Dimension Type 2 Slow Changing Dimensions and Deletes Data Quality Transforms CountryID - all in one line CountryID - City centric CountryID - Country centric CountryID - Multilines Data Cleanse - Name Parsing Global Address Cleanse - EMEA Engine Global Address Cleanse - Global Engine Global Address Cleanse - US Engine Global Suggestion - Lookup City Match Consolidate Transform Match Consolidate - Household Data User Defined Transform - Python Pushdown not working Row creation time The Impact of Number of Loaders Impact of Number of Loaders (Oracle) Impact of Number of Loaders (SQL Server) The impact of the CommitSize The impact of the CommitSize (Oracle) The impact of the CommitSize (SQL Server) What is better Table Comparison or AutoCorrect Load? Autocorrect Load Pushdown Example SAP Topics Overview SAP Interfaces Direct SQL ABAPs RFC ReadTable Extractors RFC-BAPI IDOCs Connecting to SAP Choosing the Transport Method direct_download transport method ftp transport method shared_directory transport method custom_transfer transport method Reading via ABAP How to read the ABAP How to execute the ABAP Moving to Production Moving ABAP to Production (DI 12.1) Common Questions Custom ABAP Transform Calling functions inside the ABAP Reading R3 Hierarchies Reading via RFC Read Table Using Extractors as Source (Data Services 4.0) Releasing Extractors for use by the ODP API Importing Extractors into the Datastore What Extractors to use Identify the type of Extractor Building Dataflows with Extractors General considerations about Extractor based delta dataflows




Extractor based delta dataflows The Extractor date-time field When does the Extractor start collecting the delta? Extractor RecordMode Dataflows for each Extractor Delta Process Type Dataflow for Extractor Delta Process Type A Dataflow for Extractor Delta Process Type ABR Dataflow for Extractor Delta Process Type ABR1 Dataflow for Extractor Delta Process Type ADD Dataflow for Extractor Delta Process Type ADDD Dataflow for Extractor Delta Process Type AIE Dataflow for Extractor Delta Process Type AIED Dataflow for Extractor Delta Process Type AIM Dataflow for Extractor Delta Process Type AIMD Dataflow for Extractor Delta Process Type CUBE Dataflow for Extractor Delta Process Type FULL Dataflow for Extractor Delta Process Type NEWD Dataflow for Extractor Delta Process Type NEWE The Extractor does not contain all the data needed No ODP API and Extractors are shown still?? How is this possible? Debugging DataServices issues with Extractors (SAP internal only) Extractors with DataServices Monitoring Administration of Extractors (Data Services 4.0) Calling RFCs-BAPIs Reading SAP BW Configuring SAP BW Open Hub Destination Configuring SAP BW Open Hub ProcessChain Reading from an Open Hub Destination Openhub Common Questions Loading BW Setup BW - DataServices communication Run the DataServices 3.2 RFC Server Run the DataServices RFC Server Prepare a BW InfoSource for Loading via DataServices Build the BW Load Job Configure the Load Job in BW (DS 3.2) Configure the Load Job in BW BW Load Job and datatypes Loading BW 7.x DataSources Receiving IDOCs Configure SAP to send IDOCs to DI Building the RealTime DataFlow Configure WebAdmin for IDOCs Testing the IDOCs Sending IDOCs Function Example READ_TEXT Function Example READ_TEXT ABAP wrapper function Function Example READ_TEXT RFC enabled Data Quality Continuous Monitoring Data Assessment



Data Cleansing

Address Reference Data Installing the Address Dictionaries CountryID Transform Data Cleanse Transform Data Cleanse Transform (Data Services 3.x) Global Address Cleanse Transform Installing and using the EMEA Engine (Data Services 3.x) Installing and using the US Engine Chinese Pinyin Fuzzy Search feature in Global Address Cleansing Name, Title & Firm Cleansing Packages Real-Time Address Validation GAC Suggestion Lists (Data Services 4.x or higher version) Global Suggestion Lists Transform (Data Services 3.x) US Regulatory Address Cleanse Transform DSF2 Walk Sequencer Enhance Geocoder Reference Data Geocoder Transform Directory Data Services 4.x Geocoder Transform work with US Tomtom Directories US 2010 CENSUS DATA Upgrade in SAP GEO Directories US GEO Directories Download and Setup Geocoder Labs Add Location Awareness Perform Address Geocoding Perform Geo Spatial Search Geocoder Options Input fields Output fields POI and address geocoding Geocoding scenario1 Geocoding scenario2 POI and address reverse geocoding Reverse geocoding scenario1 Reverse geocoding scenario2 POI Types Understanding your output What's new - Data Services 4.1 Geocoder transform features Match and Consolidate Associate Transform Consumer Householding Match Strategy Corporate Householding Match Strategy Match Transform Multinational Consumer Match Strategy FIM HANA Prerequisites SAP Data Services - SAP Business Suite Data Extraction Options ABAP Application Layer Data Sources (Extractors) Extractors - Full Refresh Extractors - Source-Based CDC Extractors - Target-Based CDC Tables Insert Only Insert Only - Full Refresh Insert Only - Target-Based CDC Insert Only - Timestamp-Based CDC




Updateable Updateable - Full Refresh Updateable - Target-Based CDC Updateable - Timestamp-Based CDC Direct RDBMS Connection RDBMS - Full Refresh RDBMS - Source-Based Change Data Capture (CDC) Native RDBMS CDC Timestamp-Based CDC RDBMS - Target-Based CDC SAP HANA Bulk Loading Table Creation Before Data Services Job Execution Import New Table Editor SQL Statement During Data Services Job Execution Metadata Text Data Processing Configuring Extraction Options How to Customize Rules Mapping Input and Output Fields Recommendations on Best Practices Enterprise Information Management Use Case Wiki Data Integration and Data Quality Use Case Data Migration Use Case Data Services (DI, DQM) and Information Steward Webinar Series EIM Use Case Submission guidelines Enterprise Content Management Use Case Event Processing Use Case Information Governance Use Case Information Lifecycle Management Master Data Management Use Case Data Quality Management SDK Documentation Information and Downloads DQM SDK Code Samples DQM SDK Sample Transform Configurations Using Customized Cleansing Packages with DQM SDK

Child Pages (10)

Copy of RapidMarts Data Quality Management for Enterprise Apps Data Quality Management SDK Data Services Data Services Wiki - Template EIM Staging Area Enterprise Information Management Use Case Wiki Information Steward Postalsoft RapidMarts
