Informatica PowerCenter Data Profiling

The Informatica® PowerCenter® Data Profiling option furnishes thorough, accurate information about the content, quality, and structure of the data in virtually any operational system to enable effective data integration. With the Data Profiling option, development teams leverage Informatica’s unified data integration architecture and codeless development environment to discover and understand source data and easily create and leverage historic profiling performance metrics. By using the Data Profiling option, organizations can lower costs, speed development, and increase data quality across their data integration initiatives.

Benefits of Informatica PowerCenter Profiling
• Improve data quality over time with historical dashboards and reports
• Reduce development iterations by leveraging reusable development assets
• Ensure data is fit for its intended purpose by easily profiling source data and eliminating unfounded assumptions
• Speed time-to-benefit through a single, fully integrated GUI and data integration environment that requires little training

Assumptions Lead to Poor Data Quality
Understanding the content, quality, and structure of source data is critical to the success of any data integration effort. Manual steps to distill information about source data require exhaustive searches through external documentation, data dictionary definitions, copybooks, and tens of thousands of rows of actual data, only to result in out-of-date and incorrect information. Rather than spending considerable time to understand “the state of the data,” many project teams make quality assumptions and then simply dump data into the new ERP system, data warehouse, or other target system, often resulting in problem data. Reliance on assumptions or manually derived information about the data in operational systems leads to poor data quality, a target system unfit for its intended use, extended time to value, and even failed data integration projects. Complete understanding of source data and of the quality of the data in operational systems is therefore critical for the success of any data integration or conversion effort. Data integration development teams need an automated method for exposing valuable information about their underlying source systems to eliminate assumptions. The solution should automatically generate reusable development assets to fuel data integration projects. In addition, a data profiling solution that enables ongoing profiling, captures historical trends and statistical information, and provides visibility into this information through reports and dashboards can help organizations dramatically improve data quality over time.
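To make the idea of an automated method for exposing information about source data more concrete, the following is a minimal sketch in Python. It is not PowerCenter functionality and does not reflect how the Data Profiling option is implemented; the function names `profile_column` and `_infer_type`, the dictionary-per-row layout, and the sample customer rows are all assumptions made for illustration.

```python
from collections import Counter


def _infer_type(value):
    """Classify a raw value as integer, decimal, or text (illustrative rule only)."""
    text = str(value).strip()
    try:
        int(text)
        return "integer"
    except ValueError:
        pass
    try:
        float(text)
        return "decimal"
    except ValueError:
        return "text"


def profile_column(rows, column):
    """Summarize one column: row count, nulls, distinct values, types, lengths."""
    values = [row.get(column) for row in rows]
    # Treat missing keys, None, and empty strings as nulls (an assumption).
    non_null = [v for v in values if v not in (None, "")]
    lengths = [len(str(v)) for v in non_null]
    return {
        "column": column,
        "rows": len(values),
        "nulls": len(values) - len(non_null),
        "distinct": len(set(non_null)),
        "types": dict(Counter(_infer_type(v) for v in non_null)),
        "min_length": min(lengths, default=0),
        "max_length": max(lengths, default=0),
    }


# Hypothetical sample data standing in for a customer source table.
sample_rows = [
    {"customer_id": "1001", "postal_code": "94063"},
    {"customer_id": "1002", "postal_code": ""},
    {"customer_id": "1002", "postal_code": "9406A"},
]

profile = profile_column(sample_rows, "postal_code")
print(profile)
```

Even on three rows, a summary like this surfaces facts that an assumption would miss: one postal code is empty and one is not numeric. Running `profile_column(sample_rows, "customer_id")` would likewise show fewer distinct values than rows, which hints at a duplicate key.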

“Understanding the data elements in their operational systems is particularly critical for enterprises starting e-business initiatives, customer relationship management (CRM) strategies, data warehouse implementations, or enterprise resource planning (ERP) migrations, yet it is often ignored.”
- Gartner Research

Ensure Data Quality With Comprehensive, Automatic Data Profiling
Unless enterprises take the time to understand the data available in their current and planned applications, they will end up with far too many databases to manage and will have problems with their data quality. The PowerCenter Data Profiling option offers the capabilities necessary to implement fast, accurate, lower cost data integration, eliminating upfront assumptions by building quality-based statistics. The Data Profiling option automatically profiles any data accessible to PowerCenter, including those sources supported by PowerExchange™. The Data Profiling option then autogenerates reusable development assets such as mappings and objects, which can be used for the initial data integration project and re-used for subsequent projects.

Figure 1: “Right Click” creation of an automatic profile of customer data

Improve Data Quality Over Time With Historical Dashboards and Reports
Because data is not static, enterprises must understand key metrics and trends in their data over time. The Data Profiling option offers an Interactive Profiling mode that displays results immediately as well as a Batch mode that provides ongoing data profiling and quality metrics. The Data Profiling option stores the metadata statistics for each run in a time-aware data warehouse. Users can view results in line or through a business intelligence tool, such as Informatica PowerAnalyzer™, which provides reports and dashboards to illustrate changes in data content, structure, quality, and values over time. This capability is critical for any organization that wants to understand the effectiveness of data quality improvement efforts over time. By allowing organizations to profile data on an ongoing basis and to analyze changes to source data over time, the Data Profiling option supports ongoing efforts to improve data quality and ensures end user data confidence. It also ensures the ongoing accuracy of data integration logic as source data evolves.
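The idea of keeping the statistics from each profiling run so that changes can be analyzed over time can be sketched as follows. This is an illustrative Python example using the standard `sqlite3` module, not the schema or mechanism of the Data Profiling warehouse; the table name `profile_run`, its columns, the helper functions, and the sample figures are all assumptions.

```python
import sqlite3

# In-memory database standing in for a time-aware store of profiling results.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE profile_run ("
    " run_date TEXT, source TEXT, column_name TEXT,"
    " row_count INTEGER, null_count INTEGER)"
)


def record_run(run_date, source, column_name, row_count, null_count):
    """Store the statistics gathered by one profiling run."""
    conn.execute(
        "INSERT INTO profile_run VALUES (?, ?, ?, ?, ?)",
        (run_date, source, column_name, row_count, null_count),
    )


def null_rate_trend(source, column_name):
    """Return (run_date, percent of null values) for each run, oldest first."""
    cursor = conn.execute(
        "SELECT run_date, 100.0 * null_count / row_count"
        " FROM profile_run"
        " WHERE source = ? AND column_name = ? AND row_count > 0"
        " ORDER BY run_date",
        (source, column_name),
    )
    return [(run_date, round(percent, 1)) for run_date, percent in cursor]


# Made-up monthly results for a hypothetical customer source.
record_run("2004-08-01", "customers", "postal_code", 1000, 120)
record_run("2004-09-01", "customers", "postal_code", 1100, 66)
record_run("2004-10-01", "customers", "postal_code", 1200, 24)

trend = null_rate_trend("customers", "postal_code")
for run_date, percent in trend:
    print(run_date, percent)
```

Because every row carries its run date, the same table answers both "what does the data look like now" and "is the cleanup effort working". A reporting tool would chart the trend rather than print it.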

Ensure Data is Fit For Its Intended Purpose
The PowerCenter Data Profiling Option eliminates assumptions by uncovering accurate metadata statistics about the actual data in operational systems. By performing this profiling, organizations can enhance the quality of data delivered for business critical systems by uncovering data quality issues before initiating integration development. Users have a choice of auto or custom profiling for generating rules that drive profiling. The wizard-driven auto approach enables team members to gain valuable insights quickly and with minimal effort. The custom profile feature provides users full control over the profile creation process, allowing users to gather detailed metrics about their source data and capture exception rows for analysis prior to building an integration workflow.

Leverage Reusable Development Assets and Reduce Iterations
Codifying business rules for profiling and integrating data from operational systems can be a tedious and time-consuming process. The Informatica Data Profiling Option eliminates many of the iterative, manual processes necessary to profile and integrate data by automatically generating mapping logic that extracts information about the source data. Further, the autogenerated mapping logic can be re-used in full or in part for subsequent data integration processes. By automatically generating re-usable development assets, the Informatica Data Profiling Option replaces the manual processes involved with profiling data, eliminates error-prone steps in the integration process, and creates consistency across integration routines.

Figure 2: Sample Profiling Data Quality Report of a Customer Data Source

Understand Even the Most Complex or Arcane Systems
An enterprise engaged in a data integration project must be able to connect to and profile any source system, whether it is a relational database, flat file, ERP, CRM, real-time, or mainframe legacy system. While most profiling tools on the market require time-consuming conversion of most sources into a relational or flat file, the Data Profiling option leverages Informatica’s universal connectivity to provide analytical profiling capabilities into even the most complex or arcane system, without converting complex data into relational or flat files. This unique feature allows users to profile source data regardless of the source system, project requirements, or system architecture.

Optimize Performance

GUI-Driven Environment Speeds Time-To-Benefit
Most standalone profiling tools suffer from sluggish performance because of architecture and connectivity limitations. These tools add cycle time to the data integration process and limit the effectiveness of the profiling process. Informatica’s Data Profiling option is built on PowerCenter’s adaptive metadata driven engine, allowing it to benefit from performance features such as optimized partitioning, in-memory caching, GUI-driven Data Smart Parallelism™, and unlimited linear scalability to power the profiling process. Scalable, high-volume throughput allows data integration teams to rapidly profile the entire set of source data, providing more accurate information about the data and reduced overall project risk. The warehouse in which results are stored also has an open architecture that provides access to any industry reporting tool or custom built data delivery system. Users accustomed to a specific report format or interface will receive results in a familiar manner, which enhances acceptance rates, reduces the potential for an incorrect reading, and increases the likelihood of success.

Turning integration into insight.

WORLDWIDE HEADQUARTERS 2100 Seaport Boulevard ● Redwood City, CA 94063, USA
Phone 650.385.5000 ● Fax 650.385.5500 ● Toll-free in the US 1.800.970.1179
www.informatica.com

© 2004 Informatica Corporation. All rights reserved. Printed in the U.S.A. Informatica, the Informatica logo, PowerCenter, PowerExchange, and PowerAnalyzer are trademarks or registered trademarks of Informatica Corporation in the United States and in jurisdictions throughout the world. All other company and product names may be tradenames or trademarks of their respective owners. 6555 (10.19.04)
