H/W Requirements
1. System Requirements
Server: Disk Space: 1.3 GB. Repository: 150 MB of database space*. RAM: 1 GB (recommended). Client: Disk Space: 300 MB, RAM: 512 MB (recommended). Server platform: UNIX (Sun Solaris, HP-UX, AIX), Linux (Red Hat Linux, SUSE Linux) or Windows (DEC or Intel).
Server RAM: 1 GB (recommended). Client: Disk Space: 256 MB, RAM: 512 MB (recommended).
Platform support
Server: Windows NT, IBM AIX, HP-UX, Sun Solaris, COMPAQ Tru64
2. Administration
Job Monitoring
Jobs can be monitored using DataStage. The Monitor window displays summary information about the relevant stages in a job that is being run or validated, along with link information, status, number of rows processed, start time, elapsed time and percentage of CPU used by each stage.
Logging mechanism
DataStage: Logs the details of job execution in the log window through DataStage Director, but is not able to capture log details in a separate file. Also provides a detailed message for each event in the job log file through the Event Detail window. A log file can be written using the in-built routine "DSJobReport".
Informatica: Workflow Manager allows centralized administration of all sessions and batches; Repository Manager allows centralized administration of all users, groups and privileges. The Monitor window displays the state of a mapping/job that has been scheduled to run; the states are scheduled, initializing and running. The user can set the trace level to control the level of log detail. The trace levels are Normal, Terse, Verbose Data and Verbose Initialization, and can be set either at the widget level in Designer or at the session level in Workflow Manager.
DataStage Director can abort and restart the job at any point in time, but does not support running the ETL process in "Recovery Mode".
The user can abort at any point in time. Restart and recovery are built in to the tool and invoked by a checkbox.
Supports IBM DB2 UDB, Informix Dynamic Server, Informix Red Brick, Microsoft SQL Server, Oracle, Sybase Adaptive Server, UniVerse, UniData, ADABAS, VSAM/QSAM/ISAM, IMS, IDMS, Teradata, Flat Files, Complex File Format (CFF) MQ Series, FTP, ODBC and OLE/DB
ODBC, Flat file/XML, MS Excel, COBOL files Also supports databases Oracle, Sybase, SQL Server, Informix, Teradata and DB2. PowerConnect support for AS/400, SAP R/3, Siebel, IBM MQSeries, PeopleSoft & Mainframe systems.
Supports Oracle OCI and native plug-ins for Sybase, Red Brick and DB2. Supports Oracle, Informix, Sybase Adaptive Server, Sybase Adaptive Server IQ, MS SQL Server, Red Brick, and IBM DB2 UDB.
Supports Oracle, Sybase, Informix, DB2, SQL Server and Teradata. Supports Oracle, Teradata, Informix, Sybase, MS SQL Server and DB2.
Supports moving the system from one environment to another by using "import" and "export" utilities.
4. Deployment Facility The system can be migrated from one environment to another by copying the repository folders available in Repository Manager, or through XML import/export of mapping/session/other objects.
Support for multi-user development: Yes. Relational metadata ensures concurrency.
5. Loading strategies supported
Incremental load: Supports incremental load; supported for aggregation.
Multiple target loading: Multiple targets can be loaded in the same job/batch. Multiple targets of the same RDBMS can be loaded in the same mapping.
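The incremental-load strategy mentioned above can be sketched tool-agnostically: keep a high-water mark from the previous run and pull only rows newer than it. This is a minimal Python illustration, not any vendor's implementation; the table and column names (src_orders, updated_at) are invented for the example.

```python
import sqlite3

def incremental_load(conn, last_loaded_at):
    """Pull only source rows newer than the previous load's high-water mark.

    Table/column names are illustrative, not from any specific ETL tool.
    """
    rows = conn.execute(
        "SELECT id, amount, updated_at FROM src_orders WHERE updated_at > ?",
        (last_loaded_at,),
    ).fetchall()
    # Advance the high-water mark only if new rows were found.
    new_mark = max((r[2] for r in rows), default=last_loaded_at)
    return rows, new_mark

# Demo with an in-memory database
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE src_orders (id INTEGER, amount REAL, updated_at TEXT)")
conn.executemany("INSERT INTO src_orders VALUES (?, ?, ?)", [
    (1, 10.0, "2024-01-01"), (2, 20.0, "2024-01-02"), (3, 30.0, "2024-01-03"),
])
rows, mark = incremental_load(conn, "2024-01-01")
print(len(rows), mark)  # 2 2024-01-03
```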
Supports
Yes, Mainframe & AS/400 Data can be accessed Via PowerConnect for Mainframe/AS400
Ease of design (GUI or otherwise). Ability to automatically generate source code (SQL, C, COBOL). Built-in scripting language (4GL, BASIC). Number of available transformation functions.
Simple GUI (DataStage Designer) for designing transformations. Source code is generated in BASIC for each transformer stage in a job. Supports BASIC.
Does not support code generation. Informatica Transformation Language (TX), C++. Around 60 functions (numeric, date, conversion, scientific and string).
Supports conditional structures in the form of if then else in a Transformation. Also support CASE statements through router transformation.
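The if/then/else and CASE-style routing described above amounts to evaluating a condition per record and sending the record down one of several outputs, as a Router transformation does. A minimal, vendor-neutral Python sketch (the field name and group labels are illustrative):

```python
def route(record):
    """Route a record the way an if/then/else or CASE expression in a
    transformation would. Field name and thresholds are illustrative."""
    if record["amount"] >= 1000:
        return "high_value"
    elif record["amount"] >= 100:
        return "standard"
    else:
        return "low_value"

# Partition a small batch of records into output groups.
groups = {}
for rec in [{"amount": 1500}, {"amount": 250}, {"amount": 10}]:
    groups.setdefault(route(rec), []).append(rec)
print(sorted(groups))  # ['high_value', 'low_value', 'standard']
```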
Support for looping the source row (For While Loop) Ability to call Exit routines (SQL, UNIX Apps, NT Apps)
Support for comparing the Previous record. (Row Comparison Routines) Supports
Supports comparing the immediate previous record using variable ports; looping is not supported.
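The previous-record comparison described above (variable ports / row-comparison routines) boils down to carrying the prior row in a variable while iterating. A minimal Python sketch, with an invented key field:

```python
def flag_changes(rows, key):
    """Flag rows whose `key` value differs from the immediately previous
    row's, mimicking variable-port / row-comparison routines."""
    prev = None
    out = []
    for row in rows:
        changed = prev is not None and row[key] != prev[key]
        out.append((row, changed))
        prev = row  # carry the current row into the next iteration
    return out

flags = flag_changes([{"dept": "A"}, {"dept": "A"}, {"dept": "B"}], key="dept")
print([changed for _, changed in flags])  # [False, False, True]
```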
Aggregation
Supported by in-built DataStage functions: Minimum, Maximum, Count, Sum, Average and Standard Deviation. Informatica has built-in functions and supports the Aggregator transformation.
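The aggregate functions listed above (minimum, maximum, count, sum, average, standard deviation) map directly onto Python's standard library, which makes a compact reference implementation. The input values are made up for the example:

```python
import statistics

values = [4.0, 8.0, 6.0, 2.0]  # illustrative input column
agg = {
    "min": min(values),
    "max": max(values),
    "count": len(values),
    "sum": sum(values),
    "avg": statistics.mean(values),
    "stdev": statistics.stdev(values),  # sample standard deviation
}
print(agg["min"], agg["max"], agg["count"], agg["sum"], agg["avg"])
# 2.0 8.0 4 20.0 5.0
```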
Supports only through Custom scripts. Does not have a wizard to do this
Supports Full history, recent values, Current & Previous values programmatically as well as through wizards. Support through Lookup table
Reference (lookup) table support: Supported through in-memory hash files; ODBC lookup is also possible.
Key generation: Supported by the Sequence Generator transformation.
Data conversion functions: In-built conversion functions are available (character, numeric, scientific, string).
Invoking stored procedures (SPs): Supported. In DataStage they can be used in source and target DB stages but not directly in transformations; DataStage in-built functions can be used instead.
Database functions: In Informatica, database functions can be used directly in transformations (e.g. in the Source Qualifier).
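A Sequence Generator transformation, as mentioned above, emits a monotonically increasing surrogate key per row. A minimal Python sketch of the idea (the starting value and record names are illustrative):

```python
import itertools

def sequence_generator(start=1):
    """Yield monotonically increasing surrogate keys, like a Sequence
    Generator transformation does for each incoming row."""
    return itertools.count(start)

seq = sequence_generator(100)
keyed = [{"key": next(seq), "name": n} for n in ["a", "b", "c"]]
print([r["key"] for r in keyed])  # [100, 101, 102]
```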
Rejected Records
Supports through custom scripts in a pre-session task. Supports through the Before & After SQL execution property at the stage level. Rejected records can be captured in a reject file.
Can be captured
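Capturing rejected records, as described above, means splitting the input into rows that pass validation and rows that go to a reject file or reject link. A minimal Python sketch with an invented validation rule:

```python
def split_rejects(rows, is_valid):
    """Separate rows that fail validation into a reject set, the way an
    ETL reject file / reject link captures bad records."""
    good, rejects = [], []
    for row in rows:
        (good if is_valid(row) else rejects).append(row)
    return good, rejects

good, rejects = split_rejects(
    [{"qty": 5}, {"qty": -1}, {"qty": 3}],
    is_valid=lambda r: r["qty"] >= 0,  # illustrative rule
)
print(len(good), len(rejects))  # 2 1
```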
Filters, API function calls, flat file format (Unix/DOS style), debugging facility:
DataStage: Supports filters through constraints and rejects. API functions can be called. Able to specify the flat file format through DataStage Designer, or import it using Manager. Supports basic debugging facilities for testing: breakpoints can be set and edited based on number of rows or an expression, and each column's values can be watched before and after transformation through the "Add Watch" option.
Informatica: Supports filters through the Filter & Router transformations. API calls supported. Able to specify the flat file format through Source Designer. Debugging supported through the debugger wizard in PowerCenter Designer; breakpoints can be set and edited based on number of rows or an expression.
Able to view SQL statements at the stage level. (For example, if the transformation includes loading data to a table from a flat file, the developer can view the equivalent INSERT statements in DataStage Designer.) Supported: DataStage Manager allows creating user-defined functions, macros, routines and transforms. Also supports Shared Containers, which can be reused across jobs. Able to set a limit on the number of rows processed and on warning messages in the Job Run options through DataStage Director.
SQL Statements can be viewed using Generate SQLs and Update overrides available in Source Qualifier and Target respectively.
Reusability functions
Supports. Able to set limit on number of rows processed. Can set warning message at row level.
7. Application Integration Functionality
DataStage: Yes, supports MQ Series. In DataStage 7 and above, XML sources and targets are supported.
Informatica: Supported through PowerCenterRT; has interfaces for MQSeries, TIBCO, JMS and WebMethods. Key features: Bi-directional integration - integrates real-time data (from messaging systems, mainframes, RDBMSs, applications), its metadata and historic data in the EDW. Guaranteed delivery - ensures the message data persists until the target confirms successful consumption. Rollback & recovery - robust mechanisms for rollback and recovery that prevent duplicate data loading during recovery. Supports transaction-based loading ordered on key relationships and constraints, using source-based/user-defined commits.
8. Meta Data
Supports CWM metadata exchange standards.
Support for Metadata standards (OMG/CWM) External acquisition /Design Tools/Business Intelligence tools
Supported. DataStage accomplishes metadata integration with other data tools. Informatica supports Metadata Exchange (MX2) for ERwin and PowerDesigner, and supports SAP.
Ability to view & navigate metadata on the web: Not available with the core product; must be bought as MetaStage. A web-based application is not available directly.
The repository information can be viewed on the web with the help of the Metadata Reporter tool. Key features: data lineage & where-used analysis (graphical representation via web browser); process validation and impact reporting at run time; data access from disparate sources; regulatory compliance via audit trail. Can be customized by writing database views.
Ability to Customize views of metadata for different users (DBA Vs Business user)
Not Available
Reverse engineering of the input schema Metadata repository can be stored in RDBMS
Supports
Yes. In the latest DataStage Hawk release (IBM), there is an option to create the repository in an RDBMS.
9. Performance Parameter Controls
Parallel processing
The DataStage engine supports multiple-CPU environments by automatically distributing independent job flows across multiple processes. Only available in DataStage Enterprise Edition, not in Server Edition.
The Grid option delivers cost-effective scalability utilizing a grid computing environment. By determining the parallel execution plan at run time, the amount and frequency of data mapping modifications are minimized when nodes are added to or removed from the grid. Key features: limited degree of high availability, sessions on grids, adaptive load balancing, dynamic partitioning.
Caching
Advantages: Continuous uptime through self-monitoring of PowerCenter services ensures seamless failover and guaranteed flexible recovery. Can execute optimal parallel sessions by dividing data processing into subsets, which run in parallel and are automatically partitioned reliably and optimally among available CPUs in a multiprocessor system.
Caching: No caching on the DataStage server; Informatica provides in-memory server-side caching.
Ability to monitor session load: Informatica can monitor job execution and performance through Workflow Manager. DataStage can monitor job execution and performance through DataStage Designer via the View Monitor option, and via the View Performance Statistics option.
Optimization of execution path:
1. Use the Job Sequencer to sequence jobs in the optimized path.
2. Optimization at the job level can be done using the buffer properties.
3. Optimization at the transformation level can be done using the Link Partitioners and Collectors.
4. Array size can be defined at the source stage.
Tuning metrics: Able to view number of rows processed, start time, end time and percentage of CPU for each event in the log window through DataStage Director.
Level of automation (does the tool perform auto-optimization of its jobs, and load balancing of the server tasks?): Yes, with DataStage TX.
Session optimization is possible through concurrent batches, reducing error tracing ,partitioning sessions, tuning session parameters.
User can view the statistics using workflow manager and Repository Manager
The pushdown optimization option provides flexibility to process the data transformation either within a source or target or through the PowerCenter server. Advantages: dynamically optimizes mapping performance according to runtime demands and peak processing.
10. Support and maintenance
User/group privileges and permissions. Reusable transformations and mapplets. pmcmd server interface for the command line; pmrep repository interface for the command line. Backup/recovery of metadata supported through the menu interface as well as the command line. Yes.
Maintenance includes: setting up users; deleting, moving and adding DataStage projects; cleaning up project files; purging jobs. Yes, available: ability to execute the jobs and read the metadata from the command line. When you export jobs, the metadata is also included in the export. Export generates a text file, which will be interpreted by Import; export generates an XML or DSX file. Yes.
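The command-line job execution mentioned above is typically done with the vendors' CLI tools (dsjob for DataStage, pmcmd for PowerCenter). As a sketch, the command line can be assembled programmatically before handing it to a scheduler; the exact flags vary by product version, and the project/job names here are invented, so treat this as an illustration only:

```python
def build_run_command(tool, project, job):
    """Assemble a CLI invocation for launching an ETL job. Executable
    names mirror the vendors' tools (dsjob, pmcmd); real invocations
    usually need extra connection/version-specific flags."""
    if tool == "datastage":
        return ["dsjob", "-run", project, job]
    if tool == "powercenter":
        return ["pmcmd", "startworkflow", "-f", project, job]
    raise ValueError("unknown tool: %s" % tool)

cmd = build_run_command("datastage", "DevProject", "LoadOrders")
print(" ".join(cmd))  # dsjob -run DevProject LoadOrders
```

In production the list would be passed to `subprocess.run(cmd)` on a machine where the client tools are installed.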
Time-Based
11. Job Controlling & Scheduling
Informatica: Sessions and batches can be scheduled to run on demand, run once, run every given number of minutes/seconds, or in a customized manner.
DataStage: DataStage Director is used for scheduling jobs and batches. Available scheduling frequency options: Today, Tomorrow, Every Day, Next, and Daily with Specified Time.
Supports. Able to add jobs sequentially in a batch, and also supports a pre-defined order of execution. Jobs can be executed in a predefined sequence either by coding manually in a batch job or by using a sequencer job. Supports event-based triggering (e.g. wait for a file to arrive or disappear).
The physical order of sessions is the order in which they execute; however, you can define the run criterion of a session (on success of the previous session). Supports event-based scheduling with the help of an indicator file, which can be generated by a shell command, script or batch program.
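The indicator-file mechanism above is simply a poll-until-present loop: the dependent job waits until an upstream process drops a flag file. A minimal Python sketch (file name and timeouts are illustrative):

```python
import os
import tempfile
import time

def wait_for_indicator(path, timeout=5.0, poll=0.1):
    """Poll for an indicator file; return True once it appears, the way
    event-based schedulers wait before releasing a dependent job."""
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        if os.path.exists(path):
            return True
        time.sleep(poll)
    return False

with tempfile.TemporaryDirectory() as d:
    flag = os.path.join(d, "load_ready.ind")
    open(flag, "w").close()          # upstream process drops the flag
    print(wait_for_indicator(flag))  # True
```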
Event based
Transaction size: DataStage supports it via the transaction size setting on the target stage; Informatica supports a row-based commit interval on targets & sources.
Existence of rollback functions (when bad data has been loaded): Controlled using the transaction size in the target stage; not supported in the other tool.
Support for error recovery: DataStage does not support error recovery; Informatica supports error recovery.
Ability to trace & debug scheduled events: Supported by both.
Support for pre-session & post-session events: Supported by both; code can be written that executes before and after a session.
Alerts like sending mails: Supported by both.
Security
Able to assign access to user groups at the project level through DataStage Administrator. Does not support security at the job/mapping level; access cannot be granted to individual users, only to user groups.
12. Security Object and Operational level security. Security implemented through User groups, Repository Users/Privileges/Folders and locking.
Roles can be assigned at the project level through DataStage Administrator; Informatica provides user-defined and administrator roles. DataStage roles & responsibilities: DataStage Developer - setting up a project, developing a job, programming, debugging, compiling, releasing a job, importing, exporting and packaging jobs. DataStage Operator - running compiled jobs, creating batches, and monitoring jobs and the log window. Production Manager - all the access of a DataStage Developer, plus the ability to protect or unprotect a project.
13. Other features
Part of an end-to-end suite: DataStage is part of an end-to-end data integration suite; Informatica is part of the Informatica Data Integration Suite.
Support for ERP packages: Supports ERP sources & targets; PowerConnect for SAP R/3, PeopleSoft & Siebel systems.
Predefined cleansing / integration with cleansing tools: Supported through QualityStage. The data cleansing and matching feature helps to standardize, validate, enhance, and correct data records. It also provides algorithmic data-matching capability to identify relationships between data records for deduplication or group-based processing. Key features: Standardization & parsing - identifies, verifies, and standardizes free-form text data elements through configurable business rules. Data matching - corrects records against accurate secondary sources. Leverages PowerCenter's various connectivity options, parallelization and grid-computing capabilities when cleansing and matching data. Data federation provides on-demand access to data from operational source systems without data being moved.
Ab-Initio
1. System Requirements
Server (Co>Operating System): Disk Space: 118 MB. Server Disk Space: 55 MB. Server Disk Space: 850 MB (Windows), 1100 MB (Unix, Linux).
Information not available. Client: Disk Space: 40 MB, RAM: 64 MB (recommended). Server: Compaq Tru64 UNIX, DIGITAL UNIX, HP-UX, IBM AIX, NCR MP-RAS, Red Hat Linux, IBM/Sequent DYNIX/ptx, Siemens Pyramid Reliant UNIX, Silicon Graphics IRIX, Sun Solaris, Windows NT and Windows 2000. Client: Windows NT or Windows 2000, Unix.
128 MB of RAM Client: Disk Space: 55 MB RAM: 128 MB (Recommended) Server: HP-UX, Sun SPARC Solaris, IBM AIX, Compaq Tru64 UNIX, Windows NT, or Windows 2000.
768 MB of RAM , 1GB Page file size. Client: Disk Space: 850 MB RAM: 128 MB (Recommended)
Server: Microsoft Windows,Microsoft Windows (64-bit Itanium),Linux x86,Linux Itanium,Linux x86-64,Linux on Power,Solaris Operating System (x86),Solaris Operating System (SPARC),HP-UX PA-RISC (64bit),HP-UX Itanium,AIX5L Based Systems (64-bit),IBM zSeries Based Linux
2. Administration
We have EME (Enterprise Meta Environment) to monitor the jobs. The host computer to which the GDE (Graphical Development Environment) users connect acts as a central point of control for the Co>Operating System installed on several machines. The Co>Operating System of Ab Initio monitors the jobs and issues reports; job control is done within the script. Logs are available in a format easily readable by other applications (preferably text), and can be redirected to separate files.
Can be monitored using DecisionStream. The Monitor window displays summary information about the relevant stages in a job.
ETL jobs can be scheduled and monitored using Schedule Editor. A manual refresh is required to see the status of the run.
User can set the trace level to control the level of log detail. The trace levels are Progress, Detail, Internal, SQL, Executed SQL, Variable and User. The level can be set either at the build level or the job level in DecisionStream Designer.
Unsuccessful job executions are recovered automatically. No manual intervention is required unless there is an OS crash.
The Oracle Warehouse Builder Repository Browser is a browser-based tool that generates reports from data stored in Oracle Warehouse Builder repositories. Using the Repository Browser, you can view detailed information about the design of a repository, and reports that provide access to both high-level and detailed ETL runtime information, including timings for each mapping. Users can abort at any point in time, but restart and recovery are not built in to the tool. In the other tool, the user can abort at any point in time, and restart and recovery are built in and invoked by a checkbox.
Supported
Ab Initio supports heterogeneous sources and heterogeneous targets. IBM DB2, DB2/PE, DB2 EEE, UDB, IMS, Oracle, Informix XPS, Sybase, Teradata, MS SQL Server 7, OLE-DB and ODBC are supported.
Supports Oracle, Informix, Sybase, MS SQL Server, DB2, Teradata, ODBC, or Flat Files. Other Target Support: Cognos PowerPlay, Impromptu, or Architect, or other OLAP Servers such as Microsoft SQL Server Analysis Services.
Oracle db 8.1, 9.0, 9.2, 10.1, 10.2, SAP R/3 3.x, 4.x, Oracle E-Business Suite, PeopleSoft 8, 9, Delimited and fixed-length flat files, Any database accessible through Oracle Heterogeneous Services,including but not limited to DB2,DRDA, Informix, SQL Server,Sybase, and Teradata. Any data store accessible through the ODBC Data Source Administrator, including but not limited to Excel and MS Access.
Support for Oracle, Sybase, Informix, DB2, SQL Server, Teradata. Supports Oracle, Informix, Red Brick and DB2.
4. Deployment Facility
The system can be migrated from one environment to another by copying the repository folders available in the sandbox, or by manual movement. Multiple environments can be managed through import and export utilities. Supports moving the system from one environment to another by using "import" and "export" utilities.
Yes
Yes.supported through source code control, Supports Multiple Targets can be loaded in the same job.
Yes
Supports
Supports
Supports
6. Transformation
GUI based. GUI based (DecisionStream Designer). GUI based.
Ab Initio: Supported; source code is generated in Unix shell programming language / C / C++. Has its own language called DML (Data Manipulation Language); more than 200 functions are available (string, date, error, miscellaneous, etc.).
DecisionStream: Does not support code generation. Has its own scripting language, the DecisionStream scripting language; around 112 functions (conversion, control, logical, mathematical, member, text, date and user-defined functions). Supports conditional structures in the form of 'if then else' and CASE statements.
Yes.
Supports
Sophisticated built-in aggregate components are available. Supported through built-in aggregation methods: Sum, Max, Min, Count, Avg, Var, Stddev, First, First Non-null, Last and Last Non-null. Needs programming; no wizards.
Point-and-click facility automatically creates and maintains slowly changing dimensions. Supports full history, recent values, and current & previous values through wizards. Supported through a lookup table.
Lookup File Support and Join with DB will act like LOOKUP table
Supports (Assign Key component). In-built conversion functions are available. Stored procedures can be invoked through a wrapped shell program. Database functions are not listed, but can be used in free-hand SQL.
Supports by Sequence generator In-built conversion functions are available Supports Database functions can be used directly in transformations
Supports through custom scripts in start scripts and end scripts. Rejected data can be captured using reject files. Supports through components.
Supports through custom scripts in a pre-session task. Supports through custom scripts in process flows.
Can be captured
Can be captured
Yes Yes.
Supports through Level Filter and Output Filter in fact deliveries; supports through Filter & Router transformations. Can be called. Able to specify the flat file format through SQL Text Designer; able to specify the flat file format. Supported through the debugger wizard in OWB Mapping Designer; able to set watches, define/edit test data, etc.
Supports basic debugging facilities for testing; able to add watch files and watch each field and record. Supported through debug-related variables to save the execution log file.
Yes
Able to view the SQL statements through SQL Term and SQL Helper
Yes. Supports .
Supports. Supports in the form of reusable conformed dimensions, user-defined functions and jobs. Supports; able to set a limit on the number of rows processed and set warning messages at row level.
Yes
7. Application Integration Functionality
Yes
Supports XML.
Not Available
Not Applicable
Not Available
Not Available
Supports
No
Yes
Yes
9. Performance Parameter Controls
Ab-Initio supports multiple CPU environments by automatically distributing the processes across multiple nodes.
Supports partitioning Decision Stream ETL engine supports partitioning tasks or jobstreams to use multiple CPUs which can be performed in sequence or parallel.
In-memory server-side caching. Able to view the number of rows processed by every step, and each process's start/end times, in the GDE.
Supports reference data caching on demand. Able to monitor job execution and performance through the execution log file.
In-memory Server-side caching Able to monitor job execution and performance through Execution Schedule Report
Not Available
Able to view number of rows processed, start time, end time and percentage of CPU for each event in the log window through the Track Detail option / in the audit file. No.
User can view the statistics through execution log which provides basic tuning information Not Supported
Not Supported
Yes
Available. Full command-line access delivers flexible transformation package integration. Backup/recovery of metadata is supported through the menu interface as well as the command line. Yes.
Available
Not Applicable
Yes
Yes
Jobstreams and builds can be scheduled to run on demand, run once, run every given number of minutes/seconds, or in a customized manner through crontab in WRQ Reflection for HP. Supports adding jobs sequentially in a batch, and also supports a pre-defined order of execution.
Process flows can be scheduled to run on demand, run once, run every given minutes / seconds or in a customized manner.
Not Applicable
Not Applicable
Supports row based commit interval on Targets & Source. Not Supported
Supports a row-based commit interval on targets & sources; commit or rollback can be set using manual commit control. No built-in support. Supported. Supports pre- and post-mapping events.
Yes. Can call other applications. Supported through the Alert or Email node in jobstreams. Personally controlled.
Supported.
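The row-based commit interval described in the rows above means the loader commits once every N rows, so a failure rolls back only the uncommitted tail rather than the whole load. A minimal Python sketch using SQLite (table name and interval are illustrative):

```python
import sqlite3

def load_with_commit_interval(conn, rows, interval=2):
    """Insert rows, committing every `interval` rows -- the row-based
    commit-interval behaviour. On error, only the uncommitted tail
    would be rolled back."""
    cur = conn.cursor()
    for i, row in enumerate(rows, start=1):
        cur.execute("INSERT INTO tgt (id) VALUES (?)", (row,))
        if i % interval == 0:
            conn.commit()  # checkpoint every `interval` rows
    conn.commit()          # flush any remaining uncommitted rows

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE tgt (id INTEGER)")
load_with_commit_interval(conn, [1, 2, 3, 4, 5], interval=2)
print(conn.execute("SELECT COUNT(*) FROM tgt").fetchone()[0])  # 5
```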
Object and Operational level security. Security implemented through User Roles & Folders.
NA
NA
13. Other features
Part of the Cognos suite of products; support for SAP R/3 systems using the add-on Cognos DecisionStream connector for SAP R/3. Part of the Oracle Business Intelligence 10g Suite; supports SAP, Oracle E-Business Suite and PeopleSoft ERP.
It supports Trillium
Not Supported
Appendix: BI tool comparison matrix (SAS, OBIEE, Business Objects, Hyperion, MicroStrategy, Cognos, MS-BI), with color codes Excellent / Very Good / Average / Poor across features including metadata, modeling, performance, security, report formatting, dashboards, licensing/pricing, scheduling, OLAP analysis, data quality, predictive analytics and ERP support. The original row/column structure of the matrix is not recoverable from this extract.