
Informatica Metadata Exchange Frequently Asked Questions

Q: What is Informatica Metadata Exchange?
A: Informatica Metadata Exchange is a family of metadata exchange products for the Informatica Data Integration platform. The Informatica Metadata Exchange Software Development Kit (MX SDK) provides an open, bi-directional application programming interface (API) for metadata integration between the Informatica Metadata Repository and other metadata stores. Informatica Metadata Exchange tools leverage the MX SDK to provide metadata integration with data models, relational database catalogs, ERP systems and other metadata sources.

Q: How is Informatica Metadata Exchange different from MX SDK?
A: Metadata Exchange is the umbrella term applied to all of Informatica's metadata exchange products. MX SDK is the Metadata Exchange Software Development Kit, the enabling technology on which the other Metadata Exchange products are built.

Q: Who should use Informatica Metadata Exchange?
A: Metadata Exchange provides tools for Informatica administrators and developers. Administrators can use the Metadata Exchange products to manage their metadata in a more timely, accurate and efficient manner. Developers can build applications that use the MX SDK to deliver metadata to business users.

Q: How comprehensive is the MX SDK?
A: The MX SDK supports most metadata objects in the Informatica repository, including Repositories, Folders, Sources, Targets, Transformations, Mappings, Mapplets, Schedules, Workflows, Workflow Logs, Worklets, Sessions, Database Connections, Metadata Extensions and many others.

Q: Can I use MX SDK to update sources and targets?
A: Yes. The MX SDK source and target objects are fully updatable. You can rename the object; change business names and descriptions; add, modify or remove columns; update indexes (targets only) and foreign key relationships; and update metadata extensions. The only limitation is that MX SDK does not allow you to change the database platform for an existing source or target; in this case you must delete the object and recreate it.

Q: Can I use MX SDK to update transformations?
A: Not all transformations can be updated using the MX SDK. MX SDK support for transformations allows developers to update the properties of their external procedure transformations and advanced external procedure transformations. Standard transformations are read-only in MX SDK and must be updated using Designer.

Q: Can I use MX SDK to update mappings?
A: No. Because of the potential complexity of an Informatica mapping, the MX SDK mapping objects are read-only. Mappings must be created using Informatica Designer.

Q: Can I use MX SDK to find out when a workflow was run and whether it succeeded?
A: Yes. The MX SDK provides a rich object model for querying workflow, session task and transformation logs.

Q: Can I use MX SDK to determine data lineage?
A: MX SDK can determine data lineage at a source or target level, but not at a field/column level. The source and target objects have methods for determining shortcut, source, target and mapping dependencies. The dependency information returned by these methods is the same as the dependency information that is visible in the Informatica Repository Manager. More detailed information about dependencies within a mapping is available when you use MX SDK or Designer to export the mapping as an XML file.
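To make the dependency methods mentioned above more concrete, here is a minimal sketch of how a client application might walk target dependencies through a COM automation interface (the MX SDK is COM-based, as noted later in this FAQ). The ProgID "Informatica.MXSDK" and every method and property name below are hypothetical placeholders invented for illustration, not the documented MX SDK object model; consult the SDK reference for the real names.

```python
# Hypothetical sketch only: the ProgID, methods and properties below are invented
# placeholders, NOT the documented MX SDK object model.
import win32com.client  # requires the pywin32 package on a Windows host

def list_target_dependencies(repository, user, password, folder, target_name):
    sdk = win32com.client.Dispatch("Informatica.MXSDK")          # hypothetical ProgID
    repo = sdk.ConnectToRepository(repository, user, password)   # hypothetical call
    target = repo.GetFolder(folder).GetTargets().Item(target_name)

    # The FAQ says source and target objects expose methods for shortcut, source,
    # target and mapping dependencies; the names used here are assumptions.
    for mapping in target.GetMappingDependencies():
        print("Mapping that loads this target:", mapping.Name)
    for source in target.GetSourceDependencies():
        print("Source feeding this target:", source.Name)

list_target_dependencies("DEV_REPO", "admin", "secret", "DW_FOLDER", "FACT_SALES")
```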

Q: Can I use Informatica Metadata Exchange to manage my own custom metadata?
A: Informatica metadata extensions were introduced as a new feature in Informatica PowerCenter 6.0. You can add your own custom metadata extensions to sources, targets, mappings, transformations, workflows and other objects using the Informatica Repository Manager. These metadata extensions then become part of the repository metamodel; they can be maintained using the Informatica Designer and viewed by end users in the Informatica Metadata Reporter. They can also be read and updated by other applications that use the MX SDK, which means that Metadata Exchange for Data Models can be used to populate source and target metadata extensions from data models, database catalogs and other Informatica repositories.

Q: I am an Informatica developer. Which Informatica Metadata Exchange tool is right for me?
A: You need Metadata Exchange for Data Models. You can design and document your database schema using a data modeling tool, then use Metadata Exchange for Data Models to replicate your data model in the Informatica repository, creating either source or target definitions. As you refine your data models, your database schemas and your mappings, Metadata Exchange for Data Models helps you to detect and resolve metadata discrepancies that may cause your mappings to fail. The tool's validation, change and dependency reports can help to identify the impact of schema changes. Used in conjunction with Informatica metadata extensions, you can document your design with rich metadata, including business and technical definitions, metadata and data lineage, update timestamps, volumetric data and extended metadata from your data models.

Q: I am responsible for administering a data warehouse. Which Informatica Metadata Exchange tool is right for me?
A: Metadata Exchange for Data Models will help to ensure that your source and target definitions are always consistent with the databases that they represent. If the database schema changes, validation, change and dependency reports will identify the impact of the change. Metadata Exchange for Data Models will also help to ensure that business and technical metadata is always current. If your data warehouse contains data that originates in an SAP or PeopleSoft ERP system, PowerConnect Metadata Exchange gives you tools for automated preventative maintenance and will ease the pain of managing an ERP upgrade or customization. This product dramatically reduces the work required to keep the Informatica repository current with changes in an ERP system by providing tools to manage ERP metadata. You can quickly and selectively identify metadata differences between the ERP system and the Informatica repository, determine the impact of the discrepancy and optionally update the Informatica repository to resolve the problem.

Q: I need to deliver business metadata to my business users. Which Informatica Metadata Exchange tool is right for me?
A: Informatica provides the Informatica Metadata Reporter, a simple web-based reporting solution. If this does not fit your needs, many Business Intelligence tool vendors have prebuilt integration tools for delivering Informatica metadata to end users (for more information, see the Informatica Developer Network web site). These metadata integration components use either the MX SDK or MX Views, an earlier and simpler metadata integration technology based on database views in the Informatica repository. If your BI tool vendor does not have an integration offering, you can use the MX SDK to build your own; the MX SDK comes complete with sample code to show you how. Of course, delivering metadata to end users only has value if the metadata is complete and current. Metadata Exchange for Data Models and PowerConnect Metadata Exchange provide you with tools for capturing that metadata from data models, database catalogs and ERP systems.

Q: What is PowerPlug?
A: PowerPlug is an earlier name for Metadata Exchange for Data Models (formerly PowerPlug for Data Models) and PowerConnect Metadata Exchange (formerly PowerPlug for ERP). Some third-party vendors have also used the term PowerPlug for their own metadata integration offerings.

Q: What is PowerConnect Metadata Exchange?
A: PowerConnect Metadata Exchange is a proactive metadata management tool for use with PowerConnect for SAP and PowerConnect for PeopleSoft. Data derived from ERP systems often underpins a data warehouse implementation, but ERP systems are large and complex, containing thousands of tables and data structures, and a change in any one of them can potentially cause the data warehouse load to fail. PowerConnect Metadata Exchange is used for preventative maintenance, to ensure that your Informatica ERP sources are always consistent with the ERP system that they represent. It quickly and selectively identifies metadata differences between the ERP system and the Informatica repository, determines the impact of the discrepancy and can optionally update the Informatica repository to resolve the problem.

Q: Can I use PowerConnect Metadata Exchange with PowerConnect for Siebel?
A: Yes.

Q: What is Metadata Exchange for Data Models?
A: Simply put, Metadata Exchange for Data Models is a tool for capturing the metadata that describes relational databases, then replicating that metadata in the Informatica repository as either sources or targets. Metadata Exchange for Data Models can capture metadata from the market-leading database design tools as well as from database catalogs, Informatica repositories and XML files that conform to the relational package of the OMG Common Warehouse Metamodel standard. An interactive and intuitive GUI provides fine-grained control of metadata replication as well as model validation, reporting and impact analysis. Metadata Exchange for Data Models saves much time and effort in the development stages of an Informatica implementation. You design and document your database schema using a data modeling tool (such as CA ERwin and ModelMart, Embarcadero ER/Studio, Sybase PowerDesigner or Oracle Designer), then use Metadata Exchange for Data Models to replicate your data model in the Informatica repository, creating either source or target definitions. As you refine your data models, database schemas and mappings, Metadata Exchange for Data Models helps you to detect and resolve metadata discrepancies that may cause your mappings to fail. Used in conjunction with Informatica business names, descriptions and metadata extensions, you can document your design with rich metadata, including business and technical definitions, metadata and data lineage, volumetric data and extended metadata from your data models.

For ongoing preventative maintenance and troubleshooting, built-in validation, change and dependency reports can help to detect database schema changes and determine their impact.

Q: I already use a Metadata Exchange tool. Should I upgrade?
A: New versions of Metadata Exchange tools are usually released simultaneously with new versions of PowerCenter. If you plan to upgrade PowerCenter, you should also plan to upgrade your Metadata Exchange tools to ensure connectivity and compatibility. This is particularly important with major releases. If you are not planning to upgrade PowerCenter, there may still be good reasons for upgrading Metadata Exchange for Data Models to a newer version. For instance, Metadata Exchange for Data Models 7.1 supports PowerCenter 7.1, is backwards compatible with PowerCenter 6.x and offers new interfaces for Embarcadero ER/Studio and CA ERwin ModelMart, finer-grained control of comparison reports and a number of bug fixes and minor enhancements. If you use PowerCenter 6.x, the upgrade is recommended.

Q: What data modeling tools are supported by Metadata Exchange for Data Models?
A: Metadata Exchange for Data Models 7.0 supports CA ERwin 4.1, 4.0 and 3.5.2; CA Model Manager (ModelMart) 4.1 SP3; Embarcadero ER/Studio 6.0 and 5.5.x; Sybase PowerDesigner 9.5, 9.0, 8.0, 7.5 and 6.1; Oracle Designer 9i, 6i and 6.0; and Oracle Designer 2000 2.1.x. Support for Embarcadero ER/Studio and CA Model Manager (ModelMart) is new in this release.

Q: I don't use a data modeling tool. Do I need Metadata Exchange for Data Models?
A: Apart from data models, Metadata Exchange for Data Models can also extract metadata from database catalogs, Informatica repositories and CWM files. The database catalog interface provides a powerful tool for detecting discrepancies between Informatica sources and targets and the corresponding database schema. If you detect changes, you can update the Informatica repository from the database schema, guaranteeing consistency. If your database platform supports schema comments (e.g. DB2, Oracle), you can also capture these in the Informatica repository. Using the Informatica repository interface, you can connect to two different Informatica repositories and compare sources and targets, transfer business metadata and migrate sources and targets between repositories. The XMI (Common Warehouse Metamodel) interface provides a standards-based mechanism for metadata exchange, using the relational package of the OMG Common Warehouse Metamodel standard.

Q: Informatica Designer can import database schemas. Why do I need the database catalog interface in Metadata Exchange for Data Models?
A: Informatica Designer can use ODBC to create source and target definitions from a database catalog. However, if you already have the source or target in the Informatica repository, Designer has to delete the existing object before creating a new one, which invalidates your mappings. Metadata Exchange for Data Models can compare the database catalog with the sources and targets, report on differences and selectively create or update the corresponding Informatica repository objects. Metadata Exchange for Data Models also gives you far more control over the business metadata. It can capture database comments (if the database supports them), can generate business names from table and column names, and can selectively update just the physical metadata while leaving existing business metadata and mappings intact. If you define suitable extended metadata properties, you can also capture auditing information, including the source database catalog and an update timestamp.
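The validation and change reports described above essentially diff two column dictionaries. Here is a generic sketch of that idea (not the product's actual logic), assuming you have already fetched {column name: datatype} maps from the database catalog and from the repository source/target definition:

```python
def diff_schema(catalog_cols, repo_cols):
    """Compare column/datatype maps from a database catalog and a repository
    definition, and report the discrepancies."""
    added   = sorted(set(catalog_cols) - set(repo_cols))
    dropped = sorted(set(repo_cols) - set(catalog_cols))
    changed = sorted(c for c in set(catalog_cols) & set(repo_cols)
                     if catalog_cols[c] != repo_cols[c])
    return {"in_database_only": added,
            "in_repository_only": dropped,
            "datatype_mismatch": changed}

# Example: the CUSTOMER table gained a column and one datatype changed.
catalog = {"CUST_ID": "NUMBER(10)", "NAME": "VARCHAR2(100)", "EMAIL": "VARCHAR2(255)"}
repo    = {"CUST_ID": "NUMBER(10)", "NAME": "VARCHAR2(50)"}
print(diff_schema(catalog, repo))
```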

Q: Why would I use Metadata Exchange for Data Models to move metadata between Informatica repository folders when I can do this with Informatica client tools?
A: Informatica Designer allows object definitions to be exported as XML files and allows the XML files to be imported. Repository Manager provides tools for copying folders between repositories. Metadata Exchange for Data Models complements these tools by allowing conversion of sources to targets and targets to sources (useful for staging tables and Operational Data Stores) and also allows migration between versions. Metadata Exchange for Data Models also allows selective migration of business names, descriptions and metadata extensions, and mapping between them; this provides a very useful tool for managing business metadata and for moving business metadata between development, test and production environments.

Q: Can I update my data models from my Informatica repository?
A: No. Data model interfaces are currently read-only, and metadata exchange occurs from the data model into the Informatica repository. The reasons for this are primarily technical: until recently, data modeling tools did not provide an API that allowed the data model to be updated. Another significant reason is the need to define a metadata system of record: generally this is the data model. However, you can select two different data models at one time, and all of the reporting features of Metadata Exchange for Data Models are available. This means that you can compare two different data models, even if the data models were created with different modeling tools. You can also select a data model and a database catalog and compare the data model with the database schema that it represents.

Q: Do any interfaces support bi-directional metadata exchange?
A: Yes. The Informatica repository interface is bi-directional and can be used as both a metadata source and a target. The Common Warehouse Metamodel interface is also bi-directional and can be used for metadata exchange with any tool that supports the standard and the relational package.

Q: What Informatica repository objects do the Metadata Exchange tools support?
A: Metadata Exchange for Data Models can selectively update both Informatica source and target definitions. It can update all of the object's properties, including table, column and index names, physical properties (e.g. column datatypes), business names, descriptions and metadata extensions. PowerConnect Metadata Exchange works specifically with Informatica source definitions for ERP sources (SAP and PeopleSoft). It can update the physical metadata, ERP-specific properties, business names and descriptions.

Q: What is the CWM standard, and how do the Metadata Exchange tools support it?
A: The Common Warehouse Metamodel is a vendor-independent metadata standard defined by the Object Management Group (OMG). The CWM standard defines the key metadata required for data integration and defines a file format and mechanism for exchanging it. This standard supersedes the earlier Metadata Coalition MDIS standard and is supported by many key players in the industry. Informatica and Yaletown Technology Group are both OMG members. Informatica Metadata Exchange for Data Models provides an interface that can read and write XML files that conform to the CWM relational package. This means that metadata can be exchanged with any other product that implements the standard and contains relational database metadata.

Q: Can I run the Metadata Exchange products in batch mode?
A: Because of the large number of objects in an ERP system, PowerConnect Metadata Exchange was designed specifically to support batch operation. A command file controls batch operation. The command file is configured using an interactive tool and specifies the ERP and repository connection parameters, the ERP sources in the Informatica repository that will be included in reports and updates, and the operation that will be performed. The name of the command file is specified as a parameter when the batch operation is started. Metadata Exchange for Data Models cannot be run in batch mode, but customers have asked for this feature, so it may be offered in the future.

Q: How is Metadata Exchange for Data Models licensed?
A: Separate licenses are available for ERwin, ModelMart, ER/Studio, PowerDesigner and Oracle Designer. These licenses cover all versions of the modeling tool, even where different integration technologies are used. Special licensing packages are available for the CWM, database catalog and Informatica repository interfaces. Enterprise licensing for single or multiple sources is also available.

Q: How is Metadata Exchange for Data Models packaged?
A: Metadata Exchange products can be downloaded from the Informatica Developer Network web site. The Metadata Exchange for Data Models download is a trial version that permits a single use per interface. A license key must be purchased to unlock the full product functionality.

Q: How much does Metadata Exchange cost?
A: The Metadata Exchange SDK is available to all Informatica customers from the Informatica Developer Network. For the latest pricing and more information on Metadata Exchange products, please contact your Informatica sales representative.

Q: How are the Metadata Exchange tools supported?
A: Metadata Exchange tools are Informatica products. To obtain support and product updates, you need to purchase support from Informatica. Additional resources are available on the Informatica Developer Network web site.

Q: What operating systems does Metadata Exchange run on?
A: Metadata Exchange products all use the MX SDK. The MX SDK is built using Microsoft's Component Object Model (COM) standard and requires a 32-bit Windows operating system, including Windows 98, Windows NT 4, Windows 2000 and Windows XP.

Q: What repository database platforms does Metadata Exchange support?
A: Metadata Exchange works with all of the repository database platforms that are supported by the Informatica Data Integration platform. Since version 6, the Metadata Exchange SDK uses TCP/IP to interact with the Informatica Repository Server, providing complete database platform independence, enhanced security and better performance.

Q: Does Metadata Exchange for Data Models support all the database platforms that my data modeling tools do?
A: ERwin, ER/Studio and PowerDesigner are designed to support almost all relational database platforms. Metadata Exchange for Data Models supports almost all of the database platforms that these products support, including all of the popular ones. Informatica PowerCenter has native support for Oracle, Sybase ASE, Informix, DB2, Microsoft SQL Server and Teradata. Metadata Exchange for Data Models maps data models defined for these database platforms to the corresponding Informatica native database type, and all others to ODBC (e.g. Microsoft Access 2000, Red Brick, Centura SQLBase, Sybase SQL Anywhere).

Oracle Designer provides an Oracle-centric modeling solution. Oracle Designer Server Model Diagrams do not explicitly specify a database platform, so Metadata Exchange for Data Models allows the user to choose the database platform and maps Oracle Designer datatypes to the corresponding database-platform-specific datatype.

Q: Does Metadata Exchange support XML?
A: Metadata Exchange products use XML for metadata exchange, reporting and control. Metadata Exchange for Data Models can read and write XML files that conform to the relational package of the OMG Common Warehouse Metamodel standard, allowing metadata exchange with other tools that support the standard. The MX SDK can be used to export and import XML files that conform to the Informatica repository DTD. This feature utilizes functionality that is built into the Informatica Repository Manager, Designer and PowerConnect, producing files that can be exchanged between Informatica tools. PowerConnect Metadata Exchange uses the export functionality to generate XML files from both the ERP system and the Informatica repository and uses the import functionality to update the repository. Metadata Exchange for Data Models and PowerConnect Metadata Exchange use proprietary XML file formats to store report data and XSL style sheets to transform the data for presentation. PowerConnect Metadata Exchange command files are also XML files.

ETL Tool vs EAI Tool

An ETL tool takes data from one or more data models (typically represented by an Entity-Relationship diagram) and puts it into another data model (represented by another E-R diagram). As the name implies, these models represent entities -- invoices, people, budgets, even timelines -- and their relationships. The order in which things happened (usually) isn't all that relevant to a data model. You could say that an ETL tool coordinates multiple entities and their relationships as data objects, whereas an EAI tool coordinates multiple entities and their relationships within a given process. With EAI we deal with process flows and the transactions (not so much individual entities) within that process. The order in which things happen is critical, and the relationship of entities to each other is less important than the relationship of transactions to each other. Clearly, these two can be related: you can occasionally do ETL with an EAI tool, and vice versa.
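As a concrete illustration of the "reads data, may change data, writes data" character of an ETL tool (as opposed to the transaction and process orientation of EAI), here is a minimal, generic extract-transform-load step in Python. The file names and column names are invented for the example.

```python
import csv

def extract(path):
    """Read source rows (the source data model) from a CSV file."""
    with open(path, newline="") as f:
        return list(csv.DictReader(f))

def transform(rows):
    """Reshape each row into the target data model; row order does not matter."""
    out = []
    for r in rows:
        out.append({
            "customer_key": r["customer_id"].strip(),
            "full_name": f'{r["first_name"]} {r["last_name"]}'.title(),
            "invoice_total": round(float(r["amount"]), 2),
        })
    return out

def load(rows, path):
    """Write the transformed rows to the target (another CSV here)."""
    fields = ["customer_key", "full_name", "invoice_total"]
    with open(path, "w", newline="") as f:
        writer = csv.DictWriter(f, fieldnames=fields)
        writer.writeheader()
        writer.writerows(rows)

load(transform(extract("invoices_source.csv")), "invoices_target.csv")
```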

Tips for interviews:

1. Interviews are a lot of work and require serious preparation. Review your recent performance and have examples of how you 1) solved a complex issue, 2) displayed leadership, 3) exhibited team spirit. Focus on accomplishments. Review in detail the requirements of this new post. Wear a nice suit and be clean-shaven. Anticipate possible questions and have some well-prepared responses. Be ready to ASK GOOD QUESTIONS.

2. Adapted from a response by Joe on Wednesday, December 15, 2004:

Being nervous is natural, especially for an important experience you are about to go through. Some suggestions:

- Review the company, the division and the people you are going to work for. Learn as much as you can about them, their products, their vision, their mission, etc. Study online resources for this information, but also get on the phone, and even seek (informal) meetings with others in the company, or with those who know the company. This is considered a normal part of your job-seeking homework, by the way. Learn all you can, as it's in your best interest.

- Have one or more people whom you know and trust give you a series of mock interviews. What worked for me in this area (your mileage may vary!) was for my mock interviewer to set up a series of 3-5 interviews, each a new, unique session, and to put me through a variety of typical scenarios. I was fortunate that this was a person who does this for a living. I was blunt and candid in what I wanted, and expected to be shown what a good, effective interview SHOULD be like, but also to be pointedly put on the spot with how things can go wrong. Each session was treated like a full and FORMAL job interview, including suit, demeanor and a complete interview set of questions, answers and discussion. Afterward there was a blunt and candid review of what the objectives for that session were (from HIS perspective), what areas I did well on, where I did poorly, identification of areas of opportunity where I missed out on something good (or bad) to capitalize on, and objective suggestions for improving my body language, demeanor, language and attitude. In my mind, I wanted these mock interviews to take their best shots at ripping me to shreds, so I could see where my strong and weak points were. The reviews afterward were essential to improving my understanding of MYSELF and what I MUST improve in order to get through the interview. For me, this proved to be a winning move.

- Go into the interview eager and ready to experience it. Relish and enjoy every moment of it. You get to do it so infrequently that this is a golden opportunity to experience to the fullest. You may think I'm kidding -- I'm not. By adjusting yourself so that this IS your mindset and approach, you'll find it not only enjoyable, but very rewarding as well.

- I'm hoping someone more knowledgeable than me can address the interview questions 'issue'. I know there are lots of resources on the internet for researching the plethora of interview questions and types. My suggestion, at this point in your life/career, is not to worry so much about the questions as about what you can give and offer this company. On the other hand, you certainly should have your own list of questions (written down is fine) of what you want to know about them -- especially about your work environment, expectations of you and your time, etc. Start off general ("What can you tell me about the company?") and work toward being more specific ("What can you tell me about the division?", "What can you tell me about the position you are hiring for?", etc.).

- Try this approach to being calm: think about, and continuously remind yourself in productive, enriching and positive ways, that you will calmly and rationally be successful in this interview. Mentally focus on what you WANT, rather than what you don't want. It's fine to honestly self-evaluate how you are today. What is really important is HOW you will improve. What can you do better, and what are you doing about it now? Another approach is to treat being nervous as your mind's way of telling you to be careful. You are in control of yourself. You decide what is important or worrisome. So tell your mind what to think and how to act. Such an improvement can occur over time when you are persistent. Think about it. Side note: I've found in life that people who focus on what they don't want, or don't like, as the case may be, don't see how negative that is. They really believe that by telling themselves NOT to do something, somehow the RIGHT thing they are supposed to be doing will magically occur. It doesn't work that way. I've found that when you positively and actively WANT something to occur, and make that accomplishment the focus of your attention, it happens. I believe that occurs because you've DONE something, as opposed to the alternative of attempting to NOT do something. I believe the former is a positive builder in our lives. Be optimistic.

- You gave the impression that you'd have some stiff competition for this job. Competition is a good thing. Go in with your best foot forward. Be honest, and show them that you are more than interested in doing your best every day. Convey throughout the interview, using every question as an opportunity, HOW dedicated you are, HOW energetic your day-to-day performance is now (and will continue to be for them), that you are the best candidate even with your 'limited' experience (because you can work harder, smarter, faster, with stellar results), that you are a quick learner, and that you are able to efficiently and effectively apply new rules to existing paradigms. In accomplishing this, I don't mean for you to embellish who and what you are; simply have this as your mindset, and as a way to explain, justify and demonstrate that YOU are confident you are the best candidate. Note: in this context, I mean no offense with the reference to your being 'limited'; I'm simply saying that your skills are what they are, and that it is normal to recognize that others (your competition?) have more time, and possibly even more talent and skill, than you do. In fact, consider the amount of your skills, and the fact that you are positively approaching this job opportunity, as an 'asset'. You are bringing knowledge and experience to the table. You'll accept direction on what they want and how they want it, but you will also tap into your own talents, skills, abilities and creativity to do the job even better.

- Review your resume, the one THEY have, the night before. Study it closely, as they will ask you questions based on what they see, and don't see. I've seen a good interview go VERY bad simply because the interviewee wasn't cognizant of the content of their own resume. Focus on your talents and skills. Don't BS the interviewer with smoke and mirrors. More than likely they'll catch on rather quickly that you are smart, patient and honest (desired qualities!!!) or that you aren't (bad).

- Be honest regarding what you can do, and only volunteer what you are bad at, or cannot do, when questioned about something specific that you can't do. It's reasonable to know your limitations and to candidly explain the breadth of your abilities (and limits). If you find they focus on 'stuff' you don't know, it's OK. Expect such questions and take them in stride. Follow up with your speed and willingness to learn AND apply it correctly.

- Finally, as quickly as possible after the interview, sit down (in your car, even) and take notes on what occurred. What could you have done better? What things did you mention or explain that your resume SHOULD have addressed? What did you do right that you can capitalize on in the future? What questions were asked, and how did you answer them? It's normal that you won't remember everything, yet documenting the gist of it while it is fresh in your mind will be productive and helpful when you go through, and prepare for, the next interview. You didn't think you were going to stay in this new job forever, did you? Do NOT take it personally if you don't get this job. Instead, reflect on this interview, work on the areas you can improve, and move on to seeking out the next one. More than likely, the company hired the best candidate, and that honestly will not always be you. That is OK and should be acceptable in your mind. Such things happen, and companies make job-hiring decisions in light of what is best for the company. It greatly helps to realize that these are business decisions and not an adverse reflection on you.

3. Adapted from a response by Shareen on Wednesday, December 15, 2004:

As advice on interviews, the best way is to be honest in your answers. If you have never worked on a task that is asked about in the interview, the best answer is to say, "I can learn anything quickly; even though I have never worked with what you ask, I have a proven track record of learning new tasks quickly." It is important to lead with the positive and not the negative. For example: "I have never worked with the software, but I can learn quickly." In that phrasing the interviewer will only hear the negative and not the ending part of the sentence. Another word of advice: be yourself. It is not advisable to rehearse answers for a face-to-face interview. The reason is that the interviewer will pick up on it and try to trip you up, or not believe your answers. You should also be relaxed. Remember, the person who is interviewing you must want to talk to you and like what you have to offer, or they would never have agreed to an interview. You should also remember that the person interviewing you has been in your position before and will understand some anxiety.

Consider what specifically your need is. Typically an ETL tool:
1. reads data
2. may change data
3. writes data

In a majority of cases (50-80%) it just reads and writes to a new location. In the rest of the cases it does a 'simple' transformation:
1. text to numeric
2. text to date
3. format - leading '0', dashes, etc.
4. simple split - one field to two or more fields
5. simple merge - two or more fields to one (name concatenation)

In a few cases it does more:
1. look up to another table and replace
2. complex algorithm
3. merge two or more input fields into one with some algorithm
4. split one field into many with some algorithm
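A few of the 'simple' transformations listed above, written as plain Python functions to show how little logic most of them involve (the input formats assumed here are illustrative):

```python
from datetime import datetime

def text_to_numeric(value):
    """Text to numeric: strip formatting characters, then convert."""
    return float(value.replace(",", "").strip())

def text_to_date(value, fmt="%m/%d/%Y"):
    """Text to date (assuming a known input format)."""
    return datetime.strptime(value.strip(), fmt).date()

def pad_with_zeros(value, width=10):
    """Format fix-up: leading '0' padding."""
    return value.strip().zfill(width)

def simple_split(full_name):
    """Simple split: one field to two fields."""
    first, _, last = full_name.strip().partition(" ")
    return first, last

def simple_merge(first, last):
    """Simple merge: two fields to one (name concatenation)."""
    return f"{first.strip()} {last.strip()}"

def lookup_and_replace(code, lookup_table):
    """'Does more': look up another table and replace the code with its value."""
    return lookup_table.get(code, code)  # fall back to the original if not found
```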

Breaking all this down, you can create a set of procedures that will:
1. read a database for input information
2. write that information to a file
3. read a database for transformation code
4. append that code to the file
5. read a database for write/output information
6. append that information to the file
7. execute the file

You've created your ETL. This process will probably handle 80-95% of all your needs. The other 5-20% will need specific coding, but may (if you pay attention) fall into several sets of code that, once analyzed, you can insert into your 'ETL' database and then call as needed.
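The seven-step "build your own ETL" procedure above is essentially a metadata-driven code generator. Here is a rough sketch of that idea in Python, assuming a metadata table (etl_metadata) that stores input, transformation and output code snippets per job; the table layout and generated-script format are invented for illustration.

```python
import sqlite3
import subprocess

def generate_and_run_etl(metadata_db, job_name, script_path="generated_etl.py"):
    """Read input, transform and output metadata for one job, write a script, run it."""
    con = sqlite3.connect(metadata_db)
    cur = con.cursor()
    lines = ["# auto-generated ETL script\n"]

    # Steps 1-6: read each kind of metadata from the database and append it to the file.
    for step in ("input", "transform", "output"):
        for (snippet,) in cur.execute(
                "SELECT code FROM etl_metadata WHERE job=? AND step=? ORDER BY seq",
                (job_name, step)):
            lines.append(snippet + "\n")
    con.close()

    with open(script_path, "w") as f:
        f.writelines(lines)

    # Step 7: execute the file.
    subprocess.run(["python", script_path], check=True)
```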


More information on Data Warehouse solution selection is available in the DW-SELECT discussion group. Enter a topic or subject and click the search button for detailed results.

General questions on data warehousing. Here are a few:
1. Tell me about cubes.
2. Full process or incremental?
3. Are you good with data cleansing?
4. How do you handle changing dimensions?
5. What is a star schema?

2. Adapted from a response by Mike on Thursday, July 22, 2004:

A few high-level questions might include: Talk about the Kimball vs. Inmon approaches. Talk about the concepts of ODS and the information factory. Talk about the challenges of real-time load processing vs. batch.

For Informatica: Let them know which version you are familiar with as well as what role. Informatica 7.x has divided the developer and administrator roles. You will most likely be asked specific questions about building a mapping and workflow. Know the difference between static and reusable objects for both. Be prepared to demonstrate how to create a connection and a source definition (flat file and relational), how to use Expression transformations, lookups (connected and unconnected), Aggregators, Normalizers and Update Strategies, how to modify source and target SQL overrides, etc.

For ERwin: Know the difference between logical and physical models. Know how to use the Reverse Engineer and Comparison features. The dimensional modeling feature is pretty weak, but you might want to know how ERwin treats dimensional modeling.

Other topics: Anything you know about RDBMS is worth discussing. In Oracle, you can talk about referential integrity as it applies to DW, views and materialized views, partitioning, bitmap indexing (when to use it), and any other specifics related to DW (for 10g there is the new Bitmap Join Index). Always, always offer details of your knowledge, and ask questions to get the customer's perspective (you do not want to push Kimball concepts if the customer is hard-set on Inmon). Anything you can bring to the table regarding the customer's business systems (i.e. SAP, PeopleSoft, etc.) will help separate you from the pack. Anything you know about business processes such as Order Fulfillment, Inventory Analysis, Finance, etc. will also separate you.

3. Adapted from a response by Shaquille on Wednesday, August 11, 2004:

Here are a few questions that might be posed.

Data warehousing questions:
1) What is a source qualifier?


2) What is the difference between DSS and OLTP?
3) Explain a grouped cross tab.
4) What is the hierarchy of a DWH?
5) How many repositories can we create in Informatica?
6) What is a surrogate key?
7) What is the difference between a mapplet and a reusable transformation?
8) What is aggregate awareness?
9) Explain a reference cursor.
10) What are parallel queries and query hints?
11) Describe DWH architecture.
12) What are cursors?
13) What are the advantages of denormalized data?
14) What is an operational data store (ODS)?
15) What are metadata and a system catalog?
16) What is a factless fact schema?
17) What is a conformed dimension?
18) What is the capacity of a PowerCube?
19) What is the difference between PowerPlay Transformer and PowerPlay reports?
20) What is an IQD file?
21) What is the Cognos script editor?
22) What is the difference between macros and prompts?
23) What is the PowerPlay plug-in?
24) Which kind of index is preferred in a DWH?
25) What is a hash partition?
26) What is a DTM session?
27) How can you define a transformation? What are the different types of transformations in Informatica?
28) What is a mapplet?
29) What is the query panel?
30) What is a lookup function? What is the default transformation for the lookup function?
31) What is the difference between a connected lookup and an unconnected lookup?
32) What is a staging area?
33) What are data merging, data cleansing and sampling?
34) What is the update strategy and what are the options for the update strategy?
35) Describe OLAP architecture.
36) What is a subject area?
37) Why do we use a DSS database for OLAP tools?

Business Objects FAQ:
38) What is a universe?
39) What is analysis in Business Objects?
40) Who launches the Supervisor product in BO for the first time?
41) How can you check the universe?
42) What are universe parameters?
43) What are the types of universes in Business Objects?
44) What is the security domain in BO?
45) Where will you find the address of the repository in BO?
46) What is the Broadcast Agent?
47) In BO 4.1, what is the alternative name for the Broadcast Agent?
48) What services does the Broadcast Agent offer on the server side?
49) How can you access your repository with different user profiles?
50) How many built-in objects are created in the BO repository?
51) What are alerters in BO?
52) What are the different saving options in Web Intelligence?
53) What is batch processing in BO?
54) How can you create your first report in BO using the Broadcast Agent?
55) Can we export a report to Excel in BO?


1) What is the use of surrogate keys?
A surrogate key in a data warehouse is more than just a substitute for a natural key. Surrogate keys are artificial keys that are used in slowly changing dimensions (SCD); a small illustrative sketch follows this question list.

2) Where do you use DD_INSERT and insert-as-update?
DD_INSERT will insert data, whereas insert-as-update is used in the SCD concept.

3) What is the use of the Update Strategy transformation?
The Update Strategy transformation handles all the updates, inserts and deletes by flagging each row for the appropriate operation.

4) What is the associated port in a dynamic lookup?
(Hope I told you this already.)

5) What is SQL design?

6) Why is the Sorter transformation active?
The Sorter transformation is generally used to sort data. It has an option called Distinct on the Properties sheet; this option removes duplicates, so if we pass in 10 records containing 5 duplicates, it will return only 5 records. Because the number of output rows can differ from the number of input rows, it is an active transformation.

7) How do you improve query optimization?
Rule-based or cost-based; it depends on the requirement.

8) What is meant by a global enterprise repository?

9) How do you compare ports in an Expression transformation (other than the warehouse key)?

10) What types of documents are you getting from the client?

11) What is a materialized view?
A materialized view is a database object that contains the results of a query. Materialized views are local copies of data located remotely, or are used to create summary tables based on aggregations of a table's data. Materialized views that store data based on remote tables are also known as snapshots.

12) What is a referential cursor?

13) What is denormalization, and what is its advantage?
In a DWH we store historical data; if the tables are fully normalized, queries become difficult and slow because of the many joins required, so denormalizing reduces joins and improves query performance.

14) Is an SQL override available at the session level or not?

15) What is a data modeler?

16) Suppose you have 12 transformations in one mapping but the data is not flowing through the 3rd transformation. How do you find out what the problem is?


17) How many mappings were done in your project?
It depends on the duration of the project.

18) What is a workflow variable?
You can specify workflow variables when you create a workflow; they are generally used while executing the mapping.

19) What are you currently working on in your project?

20) How do you improve performance at the mapping level?
(Hope you know this.)

21) How do you send a post-session e-mail to the client if the session does not complete?

22) What is parallel partitioning?
(Hope you know this.)

23) Suppose you have 6 sessions in your workflow. How do you identify which type of batch is running?
We can see it in the Workflow Monitor.

24) I have implemented one mapping with one source and one target; later my client gave me two more sources. What is the initial step you have to take?
Nothing much -- load those tables in the mapping as well. If there is a requirement that the first target should be populated before the other, we can specify that in the target load plan.
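To tie the surrogate key and DD_INSERT / insert-as-update questions above together, here is a small illustrative sketch of a Type 2 slowly changing dimension load in plain Python: a new natural key gets an insert with a fresh surrogate key (the DD_INSERT case), while a changed row expires the old version and inserts a new version (the insert-as-update idea). The table structures and field names are invented for the example.

```python
from datetime import date
from itertools import count

surrogate_seq = count(1)   # artificial (surrogate) key generator
dim_current = {}           # natural key -> current dimension row
dim_history = []           # expired dimension rows (previous versions)

def scd2_apply(natural_key, attributes):
    """Apply one incoming record to a Type 2 slowly changing dimension."""
    current = dim_current.get(natural_key)
    if current is None:
        # DD_INSERT case: a brand new member, assign a fresh surrogate key.
        dim_current[natural_key] = {"sk": next(surrogate_seq), "nk": natural_key,
                                    "attrs": attributes, "eff_from": date.today(),
                                    "eff_to": None}
    elif current["attrs"] != attributes:
        # "Insert as update": expire the old version and insert a new version
        # with its own surrogate key, so history is preserved.
        current["eff_to"] = date.today()
        dim_history.append(current)
        dim_current[natural_key] = {"sk": next(surrogate_seq), "nk": natural_key,
                                    "attrs": attributes, "eff_from": date.today(),
                                    "eff_to": None}
    # unchanged records require no action

scd2_apply("C001", {"city": "Austin"})
scd2_apply("C001", {"city": "Dallas"})   # second call creates version 2 of C001
```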

