What are Target Types on the Server?
Target types are relational, file, and XML.
What are Target Options on the Servers?
Target options for the File target type are FTP File, Loader, and MQ. There are no target options for the ERP target type. Target options for Relational targets are Insert, Update (as Update), Update (as Insert), Update (else Insert), Delete, and Truncate Table.
How do you identify existing rows of data in the target table using lookup transformation?
You can use a connected Lookup transformation with a dynamic cache on the target table.
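For illustration only (the routing logic and port usage below are assumptions, not part of the original answer): a Lookup configured with a dynamic cache exposes a NewLookupRow output port, and a downstream Update Strategy expression can use it to separate new rows from changed rows, roughly like this:

    -- NewLookupRow = 1: row was inserted into the cache (new in the target)
    -- NewLookupRow = 2: cached row was updated (existing row that changed)
    -- NewLookupRow = 0: no change
    IIF(NewLookupRow = 1, DD_INSERT,
        IIF(NewLookupRow = 2, DD_UPDATE, DD_REJECT))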
What is the Aggregator transformation? The Aggregator transformation allows you to perform aggregate calculations, such as averages and sums. Unlike the Expression transformation, the Aggregator transformation can perform calculations on groups.
What are the various types of aggregation?
• AVG
• COUNT
• FIRST
• LAST
• MAX
• MEDIAN
• MIN
• PERCENTILE
• STDDEV
• SUM
• VARIANCE
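As a hedged sketch only (the port names are hypothetical), output-port expressions in an Aggregator grouped by DEPT_ID might look like the following; note that the aggregate functions also accept an optional filter condition as a second argument:

    SUM(SALARY)                         -- total salary per department
    AVG(SALARY)                         -- average salary per department
    COUNT(EMP_ID, STATUS = 'ACTIVE')    -- count only the rows matching the filter condition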
What are Dimensions and various types of Dimensions?
The DWH contains two types of tables: 1. Dimension tables 2. Fact tables. Dimensions are classified into 3 types:
1. SCD Type 1 (Slowly Changing Dimension): contains current data only.
2. SCD Type 2: contains current data plus complete historical data.
3. SCD Type 3: contains current data plus one level of historical data.
What are the 2 modes of data movement in the Informatica Server?
The data movement mode determines whether the Informatica Server processes single-byte or multibyte character data. This mode selection affects the enforcement of code page relationships and code page validation in the Informatica Client and Server.
a) Unicode - the server allows 2 bytes for each character and uses an additional byte for each non-ASCII character (such as Japanese characters).
b) ASCII - the server holds all data in a single byte.
The data movement mode is set in the Informatica Server configuration parameters and takes effect once you restart the Informatica Server.
What is Code Page Compatibility?
Compatibility between code pages is used for accurate data movement when the Informatica Server runs in Unicode data movement mode. If the code pages are identical, there is no data loss. One code page can be a subset or superset of another; for accurate data movement, the target code page must be a superset of the source code page.
Superset - a code page is a superset of another code page when it contains all characters encoded in the other code page plus additional characters not contained in the other code page.
Subset - a code page is a subset of another code page when all of its characters are encoded in the other code page.
What is a Code Page used for?
A code page is used to identify characters that might be in different languages. If you are importing Japanese data into a mapping, you must select the Japanese code page for the source data.
What is the Router transformation?
A Router transformation is similar to a Filter transformation because both transformations allow you to use a condition to test data. A Filter transformation tests data for one condition and drops the rows of data that do not meet the condition. A Router transformation tests data for one or more conditions and gives you the option to route rows of data that do not meet any of the conditions to a default output group, so you can route the data to different targets or target instances based on the conditions. The added advantage over the Filter transformation is that rejected records can also be routed as required.
What is the Load Manager?
While running a workflow, the PowerCenter Server uses the Load Manager process and the Data Transformation Manager (DTM) process to run the workflow and carry out the workflow tasks. When the PowerCenter Server runs a workflow, the Load Manager performs the following tasks:
1. Locks the workflow and reads workflow properties.
2. Reads the parameter file and expands workflow variables.
3. Creates the workflow log file.
4. Runs workflow tasks.
5. Distributes sessions to worker servers.
6. Starts the DTM to run sessions.
7. Runs sessions from master servers.
8. Sends post-session email if the DTM terminates abnormally.
When the PowerCenter Server runs a session, the DTM performs the following tasks:
1. Fetches session and mapping metadata from the repository.
2. Creates and expands session variables.
3. Creates the session log file.
4. Validates session code pages if data code page validation is enabled; checks query conversions if data code page validation is disabled.
5. Verifies connection object permissions.
6. Runs pre-session shell commands.
7. Runs pre-session stored procedures and SQL.
8. Creates and runs mapping, reader, writer, and transformation threads to extract, transform, and load data.
9. Runs post-session stored procedures and SQL.
10. Runs post-session shell commands.
11. Sends post-session email.
What is the Data Transformation Manager?
After the Load Manager performs validations for the session, it creates the DTM process. The DTM process is the second process associated with the session run; its primary purpose is to create and manage the threads that carry out the session tasks.
- The DTM allocates process memory for the session and divides it into buffers (buffer memory). It creates the main thread, called the master thread, which creates and manages all other threads.
- If we partition a session, the DTM creates a set of threads for each partition to allow concurrent processing. When the Informatica Server writes messages to the session log, it includes the thread type and thread ID. The DTM creates the following types of threads:
- Master thread: the main thread of the DTM process; creates and manages all other threads.
- Mapping thread: one thread for each session; fetches session and mapping information.
- Pre- and post-session threads: one thread each to perform pre- and post-session operations.
- Reader thread: one thread for each partition for each source pipeline; reads data from the source.
- Transformation thread: one or more transformation threads for each partition.
- Writer thread: one thread for each partition if the target exists in the source pipeline; writes to the target.
While importing a relational source definition from a database, what metadata of the source do you import?
Source name, database location, column names, datatypes, and key constraints.
What are Sessions and Batches?
Session - a set of instructions that tells the Informatica Server how and when to move data from sources to targets. After creating a session, we can use either the Server Manager or the command line program pmcmd to start or stop the session.
Batch - provides a way to group sessions for either serial or parallel execution by the Informatica Server. There are two types of batches:
Sequential - run sessions one after the other.
Concurrent - run sessions at the same time.
Why do we use Lookup transformations?
Lookup transformations can access data from relational tables that are not sources in the mapping. Use a Lookup transformation in your mapping to look up data in a relational table, view, or synonym. You can import a lookup definition from any relational database to which both the Informatica Client and Server can connect, and you can use multiple Lookup transformations in a mapping. With a Lookup transformation we can accomplish the following tasks:
- Get a related value - get the Employee Name from the Employee table based on the Employee ID.
- Perform a calculation.
- Update slowly changing dimension tables - we can use an unconnected Lookup transformation to determine whether records already exist in the target.
What is a Source Qualifier?
When you add a relational or a flat file source definition to a mapping, you need to connect it to a Source Qualifier transformation. The Source Qualifier represents the rows that the Informatica Server reads when it executes a session.
In how many ways can you update a relational source definition, and what are they?
In 2 ways: 1) by reimporting the source definition, 2) by editing the source definition.
Where should you place the flat file to import the flat file definition into the Designer?
There is no restriction on where the source file is placed. From a performance point of view it is better to place the file in the server's local src folder; if you need the path, check the server properties available in the Workflow Manager. This does not mean we cannot place it in any other folder, but if we place it in the server src folder, the source is selected by default at session creation time.
To provide support for mainframe source data, which files are used as source definitions?
COBOL copybook files.
Which transformation do you need when using COBOL sources as source definitions?
The Normalizer transformation, which is used to normalize the data, since COBOL sources often consist of denormalized data.
How can you create or import a flat file definition into the Warehouse Designer?
You cannot create or import a flat file definition into the Warehouse Designer directly. Instead, you must analyze the file in the Source Analyzer and then drag it into the Warehouse Designer. When you drag the flat file source definition into the Warehouse Designer workspace, the Warehouse Designer creates a relational target definition, not a file definition. If you want to load to a file, configure the session to write to a flat file; when the Informatica Server runs the session, it creates and loads the flat file.
Another answer: you can import a flat file directly into the Warehouse Designer, which imports the field definitions directly. You can also create a flat file definition in the Warehouse Designer: create a new target, select the type as flat file, save it, enter the columns by editing its properties, and save it again. You can then import it from the Mapping Designer.
What is a mapplet?
A mapplet is a set of transformations that you build in the Mapplet Designer and can use in multiple mappings. For example, suppose we have several fact tables that require a series of dimension keys. We can create a mapplet that contains a series of Lookup transformations to find each dimension key, and use it in each fact table mapping instead of creating the same lookup logic in every mapping. A part (subset) of a mapping is known as a mapplet.
A mapplet should have a mapplet Input transformation, which receives input values, and an Output transformation, which passes the final modified data back to the mapping. When the mapplet is displayed within the mapping, only the input and output ports are displayed, so the internal logic is hidden from the end user's point of view.
What are the Designer tools for creating transformations?
Mapping Designer, Transformation Developer, and Mapplet Designer.
What are active and passive transformations?
Transformations can be active or passive. An active transformation can change the number of rows that pass through it, such as a Filter transformation that removes rows that do not meet the filter condition. A passive transformation does not change the number of rows that pass through it, such as an Expression transformation that performs a calculation on data and passes all rows through.
What are connected and unconnected transformations?
An unconnected transformation cannot be connected to another transformation in the data flow; instead it is called from inside another transformation. A connected transformation is part of the data flow in the pipeline, while an unconnected transformation is not - much like calling a program by name versus by reference. Use an unconnected transformation when you want to call the same transformation many times in a single mapping; if you use a transformation several times, the unconnected form can give better performance.
In how many ways can you create ports?
Two ways: 1. Drag the port from another transformation. 2. Click the Add button on the Ports tab.
What are reusable transformations?
Reusable transformations can be used in multiple mappings. When you need to incorporate such a transformation into a mapping, you add an instance of it to the mapping. Later, if you change the definition of the transformation, all instances of it inherit the changes. Since the instance of a reusable transformation is a pointer to that transformation, you can change the transformation in the Transformation Developer and its instances automatically reflect those changes. This feature can save you a great deal of work.
A transformation that can be reused is known as a reusable transformation. You can create one in two ways: design it in the Transformation Developer, or create a normal transformation in a mapping and promote it to reusable.
What are the unsupported repository objects for a mapplet?
COBOL source definitions, Joiner transformations, Normalizer transformations, non-reusable Sequence Generator transformations, pre- or post-session stored procedures, target definitions, PowerMart 3.5-style LOOKUP functions, XML source definitions, and IBM MQ source definitions.
Common repository objects include:
Source definitions - definitions of database objects (tables, views, synonyms) or files that provide source data.
Target definitions - definitions of database objects or files that contain the target data.
Multi-dimensional metadata - target definitions that are configured as cubes and dimensions.
Mappings - a set of source and target definitions along with transformations containing business logic that you build into the transformation; these are the instructions that the Informatica Server uses to transform and move data.
Reusable transformations - transformations that you can use in multiple mappings.
Mapplets - a set of transformations that you can use in multiple mappings.
Sessions and workflows - these store information about how and when the Informatica Server moves data. A workflow is a set of instructions that describes how and when to run tasks related to extracting, transforming, and loading data. A session is a type of task that you can put in a workflow; each session corresponds to a single mapping.
What is the aggregate cache in the Aggregator transformation?
The Aggregator stores data in the aggregate cache until it completes the aggregate calculations. When you run a session that uses an Aggregator transformation, the Informatica Server creates index and data caches in memory to process the transformation. If the Informatica Server requires more space, it stores overflow values in cache files.
How can you improve session performance in the Aggregator transformation?
Use sorted input: 1. use a Sorter transformation before the Aggregator; 2. check the Sorted Input option on the Aggregator, which tells it that the input is sorted on the same keys as the group by. The key order is also very important.
What are mapping parameters and mapping variables?
A mapping parameter represents a constant value that you can define before running a session. You declare and use the parameter in a mapping or mapplet, then define its value in a parameter file for the session. A mapping parameter retains the same value throughout the entire session. Unlike a mapping parameter, a mapping variable represents a value that can change throughout the session; the Informatica Server saves the value of a mapping variable to the repository at the end of the session run and uses that value the next time you run the session. You can also use the system variable $$$SessStartTime. You can use mapping parameters and variables in the SQL query, user-defined join, and source filter of a Source Qualifier transformation: the Informatica Server first generates the SQL query and scans it to replace each mapping parameter or variable with its start value, then executes the query on the source database (a sketch follows below).
Can you use the mapping parameters or variables created in one mapping in another mapping?
NO. We can use mapping parameters or variables only in transformations of the same mapping or mapplet in which they were created.
Can you use the mapping parameters or variables created in one mapping in any other reusable transformation?
Yes, because a reusable transformation is not contained within any mapplet or mapping.
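As a sketch of the Source Qualifier usage described above (the column name, parameter name, and date format are hypothetical), a source filter that reads only rows changed since the last run could look like this, with $$LastExtractDate supplied by the parameter file:

    UPDATE_TS > TO_DATE('$$LastExtractDate', 'YYYY-MM-DD HH24:MI:SS')

The parameter is expanded as text before the query runs, so the expanded value must produce valid SQL for the source database.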
What are the settings used to configure the Joiner transformation?
Master and detail source, type of join, and the join condition. The Joiner transformation supports the following join types, which you set on the Properties tab: Normal (default), Master Outer, Detail Outer, and Full Outer.
What are the differences between the Joiner transformation and the Source Qualifier transformation?
Source Qualifier - homogeneous sources; Joiner - heterogeneous sources. You can join heterogeneous data sources in a Joiner transformation, which you cannot do in a Source Qualifier. You need matching keys to join two relational sources in a Source Qualifier transformation, whereas you do not need matching keys to join two sources in a Joiner. Two relational sources must come from the same data source to be joined in a Source Qualifier; with a Joiner you can also join relational sources that come from different sources.
In which conditions can we not use the Joiner transformation (limitations of the Joiner transformation)?
- Both pipelines begin with the same original data source.
- Both input pipelines originate from the same Source Qualifier transformation.
- Both input pipelines originate from the same Normalizer transformation.
- Both input pipelines originate from the same Joiner transformation.
- Either input pipeline contains an Update Strategy transformation.
- Either input pipeline contains a connected or unconnected Sequence Generator transformation.
How do you create a Joiner transformation?
In the Mapping Designer, choose Transformation-Create. Select the Joiner transformation and enter a name; the naming convention for Joiner transformations is JNR_TransformationName. Enter a description for the transformation; this description appears in the Repository Manager, making it easier for you or others to understand or remember what the transformation does. Click OK, and the Designer creates the Joiner transformation. Drag all the desired input/output ports from the first source into the Joiner; the Designer creates input/output ports for these source fields as detail fields by default (you can edit this property later). Select and drag all the desired input/output ports from the second source into the Joiner; the Designer configures this second set of source fields as master fields by default. Double-click the title bar of the Joiner transformation to open the Edit Transformations dialog box and select the Ports tab. Click any box in the M column to switch the master/detail relationship for the sources; change the master/detail relationship if necessary by selecting the master source in the M column. Tip: designating the source with fewer unique records as master increases performance during the join. Add default values for specific ports as necessary; certain ports are likely to contain NULL values, since the fields in one of the sources may be empty, and you can specify a default value if the target database does not handle NULLs. Select the Condition tab and set the condition: click the Add button to add a condition (you can add multiple conditions); the master and detail ports must have matching datatypes, and the Joiner transformation only supports equivalent (=) joins. Select the Properties tab and enter any additional settings for the transformation. Click OK, then choose Repository-Save to save the changes to the mapping. Keep in mind that you cannot use a Sequence Generator or Update Strategy transformation as a source to a Joiner transformation.
What are the join types in the Joiner transformation?
Normal (default) - only matching rows from both master and detail.
Master outer - all detail rows and only matching rows from master.
Detail outer - all master rows and only matching rows from detail.
Full outer - all rows from both master and detail (matching or non-matching).
What are the joiner caches?
The Joiner caches the master records and the index to those records; a cache directory property specifies the directory used to cache them. By default, the cache files are created in a directory specified by the server variable $PMCacheDir. If you override the directory, make sure the directory exists and contains enough disk space for the cache files; the directory can be a mapped or mounted drive.
What is the Lookup transformation?
Use a Lookup transformation in your mapping to look up data in a relational table, view, or synonym. Using it, we can access data from a relational table that is not a source in the mapping. The Informatica Server queries the lookup table based on the lookup ports in the transformation and compares the Lookup transformation port values to the lookup table column values based on the lookup condition.
Why use the Lookup transformation?
To perform the following tasks:
- Get a related value. For example, if your source table includes an employee ID but you want to include the employee name in your target table to make your summary data easier to read: suppose the source contains only Empno but we want Empname also in the mapping; then, instead of adding another table containing Empname as a source, we can look up the table and get the Empname in the target.
- Perform a calculation. Many normalized tables include values used in a calculation, such as gross sales per invoice or sales tax, but not the calculated value (such as net sales).
- Update slowly changing dimension tables. You can use a Lookup transformation to determine whether records already exist in the target.
What are the types of lookup?
Connected lookup and unconnected lookup. Lookup caches can be: persistent cache, re-cache from database, static cache, dynamic cache, and shared cache.
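As a rough, hedged illustration (the table and column names are hypothetical): for a lookup on an EMPLOYEE table with lookup ports EMP_ID and EMP_NAME, the query the server issues against the lookup table is roughly of this form, with the lookup condition (for example IN_EMP_ID = EMP_ID) then evaluated against the returned rows:

    SELECT EMP_NAME, EMP_ID
    FROM   EMPLOYEE
    ORDER  BY EMP_ID, EMP_NAME

You can also override the generated lookup SQL with your own query if you need to restrict or join the lookup data.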
What is meant by lookup caches?
The Informatica Server builds a cache in memory when it processes the first row of data in a cached Lookup transformation. It allocates memory for the cache based on the amount you configure in the transformation or session properties. The server stores condition values in the index cache and output values in the data cache.
Differences between connected and unconnected lookup?
Connected lookup:
- Receives input values directly from the pipeline.
- You can use a dynamic or static cache.
- The cache includes all lookup columns used in the mapping.
- Supports user-defined default values; if the lookup condition is not satisfied, the server returns the default values for the connected output ports.
Unconnected lookup:
- Receives input values from the result of a :LKP expression in another transformation.
- You can use a static cache.
- The cache includes all lookup/output ports in the lookup condition and the lookup/return port.
- Does not support user-defined default values; if the lookup condition is not satisfied, it returns NULL.
In addition, a connected Lookup can return/pass multiple columns of data from the matched row, whereas an unconnected Lookup can return only one port.
Difference between static cache and dynamic cache?
Static cache: you cannot insert into or update the cache. The Informatica Server returns a value from the lookup table or cache when the condition is true; when the condition is false, it returns the default value for connected transformations and NULL for unconnected transformations. A static cache does not get updated while you do lookups.
Dynamic cache: you can insert rows into the cache as you pass rows to the target. The Informatica Server inserts rows into the cache when the condition is false, which indicates that the row is not in the cache or target table, and you can pass these rows to the target table. For example, say your lookup table is your target table: when you create the Lookup with a dynamic cache, it looks up values and, if there is no match, inserts the row into both the target and the lookup cache (hence "dynamic" - the cache builds up as you go along); if there is a match, it updates the row in the target.
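Purely for illustration (the lookup name lkp_GetEmpName and the port names are hypothetical, not from the original text), an unconnected Lookup is called through a :LKP expression in another transformation, for example:

    -- variable port v_EMP_NAME in an Expression transformation:
    :LKP.lkp_GetEmpName(EMP_ID)
    -- output port O_EMP_NAME, supplying a default because an unconnected lookup returns NULL on no match:
    IIF(ISNULL(v_EMP_NAME), 'UNKNOWN', v_EMP_NAME)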
Which transformation should we use to normalize COBOL and relational sources?
The Normalizer transformation normalizes records from COBOL and relational sources, allowing you to organize the data according to your own needs. A Normalizer transformation can appear anywhere in a data flow when you normalize a relational source. Use a Normalizer transformation instead of the Source Qualifier transformation when you normalize a COBOL source; when you drag a COBOL source into the Mapping Designer workspace, the Normalizer transformation automatically appears, creating input and output ports for every column in the source.
How does the Informatica Server sort string values in the Rank transformation?
When the Informatica Server runs in ASCII data movement mode, it sorts session data using a binary sort order. If you configure the session to use a binary sort order, the Informatica Server calculates the binary value of each string and returns the specified number of rows with the highest binary values for the string.
What are the rank caches?
During the session, the Informatica Server compares an input row with rows in the data cache. If the input row out-ranks a stored row, the Informatica Server replaces the stored row with the input row. The Informatica Server stores group information in an index cache and row data in a data cache.
What is the RANKINDEX in the Rank transformation?
The Designer automatically creates a RANKINDEX port for each Rank transformation. The Informatica Server uses the Rank Index port to store the ranking position for each record in a group. For example, if you create a Rank transformation that ranks the top 5 salespersons for each quarter, the rank index numbers the salespeople from 1 to 5. The port on which you want to generate the rank is known as the rank port; the generated values are known as the rank index.
What is the Router transformation?
A Router transformation is similar to a Filter transformation because both transformations allow you to use a condition to test data. A Filter transformation tests data for one condition and drops the rows of data that do not meet the condition. However, a Router transformation tests data for one or more conditions and gives you the option to route rows of data that do not meet any of the conditions to a default output group.
What are the types of groups in a Router transformation?
A Router transformation has the following types of groups: input and output.
Input group - the Designer copies property information from the input ports of the input group to create a set of output ports for each output group.
Output groups - there are two types of output groups: user-defined groups and the default group. You cannot modify or delete the default group, or the output ports and their properties.
Why do we use the Stored Procedure transformation?
A Stored Procedure transformation is an important tool for populating and maintaining databases. Database administrators create stored procedures to automate time-consuming tasks that are too complicated for standard SQL statements.
What are the types of data that pass between the Informatica Server and a stored procedure?
3 types of data: input/output parameters, return values, and status codes. The stored procedure issues a status code that notifies whether or not the stored procedure completed successfully.
What is the status code?
The status code provides error handling for the Informatica Server during the session. It is used only by the Informatica Server to determine whether to continue running the session or to stop; this value cannot be seen by the user.
What is the Source Qualifier transformation?
When you add a relational or a flat file source definition to a mapping, you need to connect it to a Source Qualifier transformation. The Source Qualifier represents the rows that the Informatica Server reads when it executes a session. With a Source Qualifier you can:
- Join data originating from the same source database. You can join two or more tables with primary-foreign key relationships by linking the sources to one Source Qualifier. If you include a user-defined join, the Informatica Server replaces the join information specified by the metadata in the SQL query.
- Filter records when the Informatica Server reads source data. If you include a filter condition, the Informatica Server adds a WHERE clause to the default query.
- Specify an outer join rather than the default inner join.
- Specify sorted ports. If you specify a number for sorted ports, the Informatica Server adds an ORDER BY clause to the default SQL query.
- Select only distinct values from the source. If you choose Select Distinct, the Informatica Server adds a SELECT DISTINCT statement to the default SQL query.
- Create a custom query to issue a special SELECT statement for the Informatica Server to read source data. For example, you might use a custom query to perform aggregate calculations or execute a stored procedure (a sketch follows below).
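A hedged sketch of such a custom (override) query in a Source Qualifier - the table and column names are hypothetical, and the SELECT list must match the connected Source Qualifier ports in order:

    SELECT o.ORDER_ID,
           o.CUSTOMER_ID,
           SUM(oi.QUANTITY * oi.UNIT_PRICE) AS ORDER_TOTAL
    FROM   ORDERS o, ORDER_ITEMS oi
    WHERE  o.ORDER_ID = oi.ORDER_ID
    GROUP  BY o.ORDER_ID, o.CUSTOMER_ID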
What are the basic requirements to join two sources in a Source Qualifier?
Both tables should have a common field with the same datatype. It is not necessary that they follow a primary-foreign key relationship, though if such a relationship exists it helps from a performance point of view. Also, if you are using a lookup in your mapping and the lookup table is small, try joining that lookup in the Source Qualifier to improve performance.
What is the default join that the Source Qualifier provides?
An inner (equi) join between the related source tables. (The Normal, Master Outer, Detail Outer, and Full Outer join types set on the Properties tab belong to the Joiner transformation, not the Source Qualifier.)
What is the target load order?
You specify the target load order based on the source qualifiers in a mapping. If you have multiple source qualifiers connected to multiple targets, you can designate the order in which the Informatica Server loads data into the targets. A target load order group is the collection of source qualifiers, transformations, and targets linked together in a mapping.
What is the Update Strategy transformation?
The model you choose constitutes your update strategy - how to handle changes to existing rows. In PowerCenter and PowerMart, you set your update strategy at two different levels: within a session and within a mapping.
Describe the two levels at which the update strategy is set.
Within a session: when you configure a session, you can instruct the Informatica Server to either treat all rows in the same way (for example, treat all rows as inserts) or use instructions coded into the session mapping to flag rows for different database operations.
Within a mapping: within a mapping, you use the Update Strategy transformation to flag rows for insert, update, delete, or reject.
What is the default source option for the Update Strategy transformation?
Data driven.
What is Data Driven?
The Informatica Server follows instructions coded into Update Strategy transformations within the session mapping to determine how to flag rows for insert, delete, update, or reject. If the mapping for the session contains an Update Strategy transformation, this field is marked Data Driven by default. When the Data Driven option is selected in the session properties, the server uses the update strategy constants (DD_UPDATE, DD_INSERT, DD_DELETE, DD_REJECT) coded in the mapping, and not the options selected in the session properties (see the sketch below).
What are the options in the target session properties of the Update Strategy transformation?
Insert, Delete, Update (Update as update, Update as insert, Update else insert), and Truncate table.
Update as Insert: all update records from the source are flagged as inserts in the target; in other words, instead of updating the records in the target, they are inserted as new records.
Update else Insert: enables Informatica to flag records for update if they are old, or for insert if they are new records from the source.
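For illustration only (the lookup port and the insert/update condition are assumptions, not part of the original answer): a typical Data Driven setup puts an expression such as the following in the Update Strategy transformation, flagging rows not found by an upstream lookup as inserts and the rest as updates:

    IIF(ISNULL(lkp_CUSTOMER_ID), DD_INSERT, DD_UPDATE)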
What types of mapping wizards are provided in Informatica?
The Designer provides two mapping wizards to help you create mappings quickly and easily. Both wizards are designed to create mappings for loading and maintaining star schemas - a series of dimensions related to a central fact table.
Getting Started Wizard - creates mappings to load static fact and dimension tables, as well as slowly growing dimension tables.
Slowly Changing Dimensions Wizard - creates mappings to load slowly changing dimension tables, based on the amount of historical dimension data you want to keep and the method you choose to handle historical dimension data.
In summary, the wizard mapping types are: Simple Pass Through, Slowly Growing Target, and Slowly Changing Dimension (Type 1 - most recent values; Type 2 - full history, by version, flag, or date; Type 3 - current and one previous value).
What are the types of mapping in the Getting Started Wizard?
Simple Pass Through mapping - loads a static fact or dimension table by inserting all rows. Use this mapping when you want to drop all existing data from your table before loading new data.
Slowly Growing Target mapping - loads a slowly growing fact or dimension table by inserting new rows. Use this mapping to load new data when existing data does not require updates.
What are the mappings that we use for slowly changing dimension tables?
Type 1: Rows containing changes to existing dimensions are updated in the target by overwriting the existing dimension. In the Type 1 Dimension mapping, all rows contain current dimension data. Use the Type 1 Dimension mapping to update a slowly changing dimension table when you do not need to keep any previous versions of dimensions in the table.
Type 2: The Type 2 Dimension Data mapping inserts both new and changed dimensions into the target. Changes are tracked in the target table by versioning the primary key and creating a version number for each dimension in the table. Use the Type 2 Dimension/Version Data mapping to update a slowly changing dimension table when you want to keep a full history of dimension data in the table; version numbers and versioned primary keys track the order of changes to each dimension.
Type 3: The Type 3 Dimension mapping filters source rows based on user-defined comparisons and inserts only those found to be new dimensions into the target. Rows containing changes to existing dimensions are updated in the target; when updating an existing dimension, the Informatica Server saves the existing data in different columns of the same row and replaces the existing data with the updates.
What are the different types of Type 2 dimension mapping?
Type 2 Dimension/Version Data mapping: the updated dimension in the source is inserted into the target along with a new version number, and a newly added dimension in the source is inserted into the target with a primary key.
Type 2 Dimension/Flag Current mapping: this mapping is also used for slowly changing dimensions. It inserts both new and changed dimensions into the target and, in addition, creates a flag value for changed or new dimensions; the flag indicates whether the dimension is new or newly updated. Recent dimensions are saved with a current flag value of 1, and updated (historical) dimensions are saved with the value 0.
Type 2 Dimension/Effective Date Range mapping: another flavour of Type 2 mapping used for slowly changing dimensions; changes are tracked by the effective date range for each version of each dimension.
How can you recognise whether or not newly added rows in the source get inserted into the target?
In the Type 2 mapping we have three options to recognise the newly added rows: version number, flag value, and effective date range.
What are the two types of processes with which Informatica runs a session?
Load Manager process: starts the session, creates the DTM process, and sends post-session email when the session completes.
DTM process: creates threads to initialize the session, read, write, and transform data, and handle pre- and post-session operations.
Which tool do you use to create and manage sessions and batches, and to monitor and stop the Informatica Server?
The Informatica Workflow Manager and the Informatica Workflow Monitor.
Why do we use partitioning of the session in Informatica?
Performance can be improved by processing data in parallel in a single session by creating multiple partitions of the pipeline. The Informatica Server can achieve high performance by partitioning the pipeline and performing the extract, transformation, and load for each partition in parallel. Parallel data processing is available for PowerCenter only: if the Informatica Server runs on an SMP system, you can use multiple CPUs to process a session concurrently. The server also processes session data using threads; it runs the session in two processes, as explained in the previous question.
To achieve session partitioning, what are the necessary tasks you have to do?
Configure the session to partition the source data, and install the Informatica Server on a machine with multiple CPUs.
What are the new features of the Server Manager in Informatica 5.0?
You can use command line arguments for a session or batch. This allows you to change the values of session parameters, mapping parameters, and mapping variables.
Can you generate reports in Informatica?
Informatica is an ETL tool; you cannot build reports for business analysis from it, but you can generate a metadata report.
What is the Metadata Reporter?
It is a web-based application that enables you to run reports against repository metadata. With the Metadata Reporter, you can access information about your repository without having knowledge of SQL, the transformation language, or the underlying tables in the repository.
Define mapping and session.
Mapping: a set of source and target definitions linked by transformation objects that define the rules for data transformation.
Session: a set of instructions that describe how and when to move data from sources to targets.
How does the Informatica Server increase session performance through partitioning of the source?
For relational sources, the Informatica Server creates multiple connections - one for each partition of a single source - and extracts a separate range of data through each connection. For XML and file sources, the Informatica Server reads multiple files concurrently, and it reads multiple partitions of a single source concurrently. Similarly, for loading, the Informatica Server creates multiple connections to the target and loads the partitions of data concurrently. When loading to a file target, the server creates a separate file for each partition of the source file; you can choose to merge these target files.
Why do we use repository connectivity?
When you edit or schedule a session, the Informatica Server directly communicates with the repository each time to check whether or not the session and users are valid. All the metadata of sessions and mappings is stored in the repository.
What are the tasks that the Load Manager process performs?
Manages session and batch scheduling: when you start the Informatica Server, the Load Manager launches and queries the repository for a list of sessions configured to run on that server. When you configure a session, the Load Manager maintains the list of sessions and session start times.
Locks and reads the session: when the Informatica Server starts a session, the Load Manager locks the session in the repository; locking prevents you from starting the session again while it is running. When you start a session, the Load Manager fetches the session information from the repository to perform validations and verifications prior to starting the DTM process.
Reads the parameter file: if the session uses a parameter file, the Load Manager reads it and verifies that the session-level parameters are declared in the file.
Verifies permissions and privileges: when the session starts, the Load Manager checks whether or not the user has the privileges to run the session.
Creates log files: the Load Manager creates a log file that contains the status of the session.
What is the DTM process?
After the Load Manager performs validations for the session, it creates the DTM process. The purpose of the DTM is to create and manage the threads that carry out the session tasks. It creates the master thread, and the master thread creates and manages all the other threads.
What are the different threads in the DTM process?
Master thread: creates and manages all other threads.
Mapping thread: one mapping thread is created for each session; it fetches session and mapping information.
Pre- and post-session threads: created to perform pre- and post-session operations.
Reader thread: one thread is created for each partition of a source; it reads data from the source.
Transformation thread: created to transform data.
Writer thread: created to load data to the target.
What are the output files that the Informatica Server creates while running a session?
Informatica server log: the Informatica Server (on UNIX) creates a log for all status and error messages (default name: pm.server.log). It also creates an error log for error messages. These files are created in the Informatica home directory.
Session log file: the Informatica Server creates a session log file for each session. It writes information about the session into the log file, such as the initialization process, the creation of SQL commands for the reader and writer threads, errors encountered, and the load summary. The amount of detail in the session log depends on the tracing level you set.
Session detail file: contains load statistics for each target in the mapping, such as the table name and the number of rows written or rejected. You can view this file by double-clicking the session in the monitor window.
Performance detail file: contains session performance details, which help you see where performance can be improved. To generate this file, select the performance detail option in the session property sheet.
Reject file: contains the rows of data that the writer does not write to targets.
Control file: the Informatica Server creates a control file and a target file when you run a session that uses the external loader. The control file contains information about the target flat file, such as the data format and loading instructions for the external loader.
Post-session email: allows you to automatically communicate information about a session run to designated recipients. You can create two different messages - one if the session completes successfully, the other if the session fails.
Indicator file: if you use a flat file as a target, you can configure the Informatica Server to create an indicator file. For each target row, the indicator file contains a number to indicate whether the row was marked for insert, update, delete, or reject.
Output file: if the session writes to a target file, the Informatica Server creates the target file based on the file properties entered in the session property sheet.
Cache files: when the Informatica Server creates a memory cache, it also creates cache files. The Informatica Server creates index and data cache files for the Aggregator, Joiner, Rank, and Lookup transformations.
In which circumstances does the Informatica Server create reject files?
When it encounters DD_REJECT in an Update Strategy transformation, when a row violates a database constraint, or when a field in the row was truncated or overflowed.
What is polling?
It displays updated information about the session in the monitor window; the monitor window displays the status of each session when you poll the Informatica Server.
Can you copy a session to a different folder or repository?
Yes. By using the copy session wizard you can copy a session to a different folder or repository, but that target folder or repository should contain the mapping of that session. If the target folder or repository does not have the mapping of the session being copied, you have to copy that mapping first before you copy the session. In addition, you can copy the workflow from the Repository Manager; this automatically copies the mapping, associated sources, targets, and session to the target folder.
What is a batch, and what are the types of batches?
A grouping of sessions is known as a batch. A batch is a group of anything; different batches are different groups of different things. There are two types of batches:
Sequential: runs sessions one after the other. If you have sessions with source-target dependencies, use a sequential batch to start the sessions one after another.
Concurrent: runs sessions at the same time. If you have several independent sessions, you can use concurrent batches, which run all the sessions at the same time.
Can you copy the batches?
NO.
How can you stop a batch?
By using the Server Manager or pmcmd.
What are the different options used to configure sequential batches?
Two options: run the session only if the previous session completes successfully, or always run the session.
In a sequential batch, can you run a session if the previous session fails?
Yes, by setting the option "always runs the session".
Can you start a session inside a batch individually?
We can start only the required session in the case of a sequential batch; in the case of a concurrent batch we cannot do this.
Can you start a batch within a batch?
You cannot. If you want to start a batch that resides in another batch, create a new independent batch and copy the necessary sessions into the new batch.
When does the Informatica Server mark a batch as failed?
If one of the sessions is configured to "run if previous completes" and that previous session fails.
What is the command used to run a batch?
pmcmd is used to start a batch.
How many sessions can you create in a batch?
Any number of sessions.
What are the session parameters?
Session parameters are like mapping parameters: they represent values you might want to change between sessions, such as database connections or source files. The Server Manager also allows you to create user-defined session parameters. The following are user-defined session parameters:
Database connections.
Source file name: use this parameter when you want to change the name or location of a session source file between session runs.
Target file name: use this parameter when you want to change the name or location of a session target file between session runs.
Reject file name: use this parameter when you want to change the name or location of a session reject file between session runs.
What is a parameter file?
A parameter file defines the values for parameters and variables used in a session. A parameter file is a file created with a text editor such as WordPad or Notepad. You can define the following values in a parameter file: mapping parameters, mapping variables, and session parameters. When you start a workflow, you can optionally enter the directory and name of a parameter file; the Informatica Server runs the workflow using the parameters in the file you specify. The parameter file name cannot have beginning or trailing spaces.
If the parameter file name includes spaces:
For UNIX shell users, enclose the parameter file name in single quotes: -paramfile '$PMRootDir/myfile.txt'
For Windows command prompt users, enclose the file name in double quotes: -paramfile "$PMRootDir\my file.txt"
Note: when you write a pmcmd command that includes a parameter file located on another machine, use the backslash (\) with the dollar sign ($). This ensures that the machine where the variable is defined expands the server variable:
pmcmd startworkflow -uv USERNAME -pv PASSWORD -s SALES:6258 -f east -w wSalesAvg -paramfile '\$PMRootDir/myfile.txt'
How can you access a remote source in your session?
Relational source: to access a relational source located in a remote place, you need to configure a database connection to the data source.
File source: to access a remote source file, you must configure the FTP connection to the host machine before you create the session.
Heterogeneous: when your mapping contains more than one source type, the Server Manager creates a heterogeneous session that displays the source options for all types.
.txt' For Windows command prompt users.txt” Note: When you write a pmcmd command that includes a parameter file located on another machine. use the backslash (\) with the dollar sign ($).
Partitioning can be done on both relational and flat file targets. If you partition a session with a relational target, the Informatica Server creates multiple connections to the target database to write target data concurrently. If you partition a session with a file target, the Informatica Server creates one target file for each partition; you can just specify the name of the target file and create the partitions, the rest is taken care of by the Informatica session, and you can configure the session properties to merge these target files. Informatica supports the following partition types: 1. Database partitioning 2. Round-robin 3. Pass-through 4. Hash-key partitioning 5. Key range partitioning. All of these are applicable to relational targets; for flat files, only database partitioning is not applicable. Informatica supports N-way partitioning.
What are the transformations that restrict the partitioning of sessions?
Advanced External Procedure and External Procedure transformations: these transformations contain a check box on the Properties tab to allow partitioning.
Aggregator transformation: if you use sorted ports, you cannot partition the associated source.
Joiner transformation: you cannot partition the master source for a Joiner transformation.
Normalizer transformation.
XML targets.
Performance tuning in Informatica?
The goal of performance tuning is to optimize session performance so that sessions run within the available load window for the Informatica Server. You can increase session performance as follows:
- Network. The performance of the Informatica Server is related to network connections. Data generally moves across a network at less than 1 MB per second, whereas a local disk moves data five to twenty times faster; network connections therefore often affect session performance, so avoid unnecessary network connections.
- Flat files. If your flat files are stored on a machine other than the Informatica Server, move those files to the machine on which the Informatica Server runs.
- Relational data sources. Minimize the connections to sources, targets, and the Informatica Server to improve session performance. Moving the target database onto the server system may improve session performance. You can also improve performance by configuring the network packet size, which allows more data to cross the network at one time; to do this, go to the Server Manager, choose Server Configure, and edit the database connections.
- Staging areas. If you use staging areas, you force the Informatica Server to perform multiple data passes; removing staging areas may improve session performance.
- Multiple servers. You can run multiple Informatica Servers against the same repository; distributing the session load across multiple Informatica Servers may improve session performance.
- Data movement mode. Running the Informatica Server in ASCII data movement mode improves session performance, because ASCII mode stores a character value in one byte whereas Unicode mode takes 2 bytes to store a character.
- Constraints and indexes. If your target has key constraints and indexes, they slow the loading of data; drop the constraints and indexes before you run the session and rebuild them after the session completes.
- Concurrent batches. Running parallel sessions by using concurrent batches also reduces the data loading time, so concurrent batches may increase session performance.
- Partitioning. Partitioning the session improves performance by creating multiple connections to sources and targets and loading data in parallel pipelines.
- Incremental aggregation. If a session contains an Aggregator transformation, you can use incremental aggregation to improve session performance.
- Lookup cache. If the session contains Lookup transformations, you can improve session performance by enabling the lookup cache.
- Filters. If your session contains a Filter transformation, create that Filter transformation near the sources, or use a filter condition in the Source Qualifier.
- Grouping transformations. Aggregator, Rank, and Joiner transformations often decrease session performance because they must group data before processing it; to improve performance in these cases, use the sorted ports / sorted input option.
- Transformation errors. Avoid transformation errors to improve session performance.
- Source Qualifier queries. If a session joins multiple source tables in one Source Qualifier, optimizing the query may improve performance. Also, single-table SELECT statements with an ORDER BY or GROUP BY clause may benefit from optimizations such as adding indexes.
What is the difference between a mapplet and a reusable transformation?
A mapplet consists of a set of transformations that is reusable, whereas a reusable transformation is a single transformation that can be reused (mapplet: one or more transformations; reusable transformation: only one transformation). We cannot include source definitions in reusable transformations, but we can add sources to a mapplet. The whole transformation logic is hidden in the case of a mapplet, but it is transparent in the case of a reusable transformation. Variables or parameters created in a mapplet cannot be used in another mapping or mapplet, unlike the variables created in a reusable transformation, which can be used in any other mapping or mapplet. We cannot use a COBOL Source Qualifier, Joiner, or Normalizer transformation in a mapplet, whereas we can make them reusable transformations.
Define the Informatica repository.
The Informatica repository is a relational database that stores information, or metadata, used by the Informatica Server and Client tools. Metadata can include information such as mappings describing how to transform source data, sessions indicating when you want the Informatica Server to perform the transformations, and connect strings for sources and targets. The repository also stores administrative information such as usernames and passwords, permissions and privileges, and the product version. Use the Repository Manager to create the repository: the Repository Manager connects to the repository database and runs the code needed to create the repository tables. These tables store metadata in a specific format that the Informatica Server and Client tools use. In other words, you create a set of metadata tables within the repository database that the Informatica applications and tools access; the Informatica client and server access the repository to save and retrieve metadata. The Informatica repository is at the center of the Informatica suite.
What is a PowerCenter repository?
Standalone repository: a repository that functions individually, unrelated and unconnected to other repositories.
Global repository (PowerCenter only): the centralized repository in a domain, a group of connected repositories. Each domain can contain one global repository. The global repository can contain common objects to be shared throughout the domain through global shortcuts.
Local repository (PowerCenter only): a repository within a domain that is not the global repository. Each local repository in the domain can connect to the global repository and use the objects in its shared folders.
What are the types of metadata stored in the repository?
The following types of metadata are stored in the repository: database connections, global objects, mappings, mapplets, multidimensional metadata, reusable transformations, sessions and batches, shortcuts, source definitions, target definitions, and transformations (see the repository object definitions given earlier).
What is incremental aggregation?
When using incremental aggregation, you apply captured changes in the source to aggregate calculations in a session. If the source changes only incrementally and you can capture those changes, you can configure the session to process only the changes. This allows the Informatica Server to update your target incrementally, rather than forcing it to process the entire source and recalculate the same calculations each time you run the session.
What are the scheduling options to run a session?
You can schedule a session to run at a given time or interval, or you can run it manually. The options are:
Run only on demand: the server runs the session only when the user starts the session explicitly.
Run once: the Informatica Server runs the session only once, at a specified date and time.
Run every: the Informatica Server runs the session at regular intervals, as you configure.
Customized repeat: the Informatica Server runs the session at the dates and times specified in the Repeat dialog box.
How can you work with a remote database in Informatica? Did you work directly by using remote connections?
You can work with a remote data source by connecting to it with a remote connection, but you have to configure the FTP connection details, such as the IP address and user authentication. It is not preferable to work with that remote source directly through remote connections, because session performance decreases when only a small amount of data can be passed across the network in a given time. Instead, bring that source onto the local machine where the Informatica Server resides.
What is tracing level, and what are the types of tracing level?
Tracing level represents the amount of information that the Informatica Server writes in a log file. The types of tracing level are Normal, Verbose, Verbose init and Verbose data.
What is the difference between the Stored Procedure transformation and the External Procedure transformation?
In a Stored Procedure transformation, the procedure is compiled and executed in a relational data source; you need a database connection to import the stored procedure into your mapping. In an External Procedure transformation, the procedure or function is executed outside the data source, so no database connection is needed; you need to build it as a DLL to access it in your mapping.
Explain about recovering sessions?
If you stop a session or if an error causes a session to stop, refer to the session and error logs to determine the cause of failure. Correct the errors, and then complete the session. The method you use to complete the session depends on the properties of the mapping, the session, and the Informatica Server configuration. Use one of the following methods to complete the session:
Run the session again if the Informatica Server has not issued a commit.
Consider performing recovery if the Informatica Server has issued at least one commit.
Truncate the target tables and run the session again if the session is not recoverable.
Explain about perform recovery?
When the Informatica Server starts a recovery session, it reads the OPB_SRVR_RECOVERY table and notes the row ID of the last row committed to the target database. The Informatica Server then reads all sources again and starts processing from the next row ID. For example, if the Informatica Server commits 10,000 rows before the session fails, when you run recovery the Informatica Server bypasses the rows up to 10,000 and starts loading with row 10,001. By default, Perform Recovery is disabled in the Informatica Server setup. You must enable Recovery in the Informatica Server setup before you run a session so that the Informatica Server can create and/or write entries in the OPB_SRVR_RECOVERY table.
If a session fails after loading 10,000 records into the target, how can you load the records from the 10,001st record when you run the session the next time?
As explained above, the Informatica Server has three methods of recovering sessions. Use Perform Recovery to load the records from where the session failed.
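Referring back to the Stored Procedure transformation above: as a minimal, hedged sketch of the kind of database object it can call (the function name and logic are purely illustrative, not from this document), compiled and executed inside the relational source or target database:
CREATE OR REPLACE FUNCTION get_emp_bonus (p_sal IN NUMBER)
RETURN NUMBER
IS
BEGIN
  -- a simple calculation that runs inside the database; it is imported into
  -- the mapping through a database connection, unlike an External Procedure,
  -- which is built as a DLL/shared library outside the database
  RETURN p_sal * 0.2;
END;
/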
How to recover a standalone session?
A standalone session is a session that is not nested in a batch. If a standalone session fails, you can run recovery using a menu command or pmcmd. These options are not available for batched sessions.
To recover sessions using the menu: 1. In the Server Manager, highlight the session you want to recover. 2. Select Server Requests-Stop from the menu. 3. With the failed session highlighted, select Server Requests-Start Session in Recovery Mode from the menu, and click OK.
To recover sessions using pmcmd: 1. From the command line, stop the session. 2. From the command line, start recovery.
How can you recover a session in sequential batches?
If you configure a session in a sequential batch to stop on failure, you can run recovery starting with the failed session. The Informatica Server completes the session and then runs the rest of the batch. Use the Perform Recovery session property. To recover sessions in sequential batches configured to stop on failure: 1. In the Server Manager, open the session property sheet. 2. On the Log Files tab, select Perform Recovery, and click OK. 3. Run the session. 4. After the batch completes, open the session property sheet. 5. Clear Perform Recovery, and click OK. If you do not clear Perform Recovery, the next time you run the session the Informatica Server attempts to recover the previous session. If you do not configure a session in a sequential batch to stop on failure, and the remaining sessions in the batch complete, recover the failed session as a standalone session.
How to recover sessions in concurrent batches?
If multiple sessions in a concurrent batch fail, you might want to truncate all targets and run the batch again. However, if a session in a concurrent batch fails and the rest of the sessions complete successfully, you can recover the failed session as a standalone session. To recover a session in a concurrent batch: 1. Copy the failed session using Operations-Copy Session. 2. Drag the copied session outside the batch to make it a standalone session. 3. Follow the steps to recover a standalone session. 4. Delete the standalone copy.
How can you complete unrecoverable sessions?
Under certain circumstances, when a session does not complete, you need to truncate the target tables and run the session from the beginning.
What are the circumstances under which the Informatica Server results in an unrecoverable session?
Run the session from the beginning when the Informatica Server cannot run recovery, or when running recovery might result in inconsistent data, for example when:
The Source Qualifier transformation does not use sorted ports.
You change the partition information after the initial session fails.
Perform Recovery is disabled in the Informatica Server configuration.
The sources or targets change after the initial session fails.
The mapping contains a Sequence Generator or Normalizer transformation.
A concurrent batch contains multiple failed sessions.
After dragging the ports of three heterogeneous sources (SQL Server, Oracle, Informix) to a single Source Qualifier, can you map these three ports directly to the target?
If you drag three heterogeneous sources and populate the target without any join, you are effectively producing a Cartesian product. If you do not want to join at the Source Qualifier level, you can add the joins separately. Without a join, not only heterogeneous sources but even homogeneous sources show the same problem.
If I make modifications to my table in the back end, does it reflect in the Informatica warehouse, Mapping Designer or Source Analyzer?
No. Informatica is not concerned with the back-end database; it displays only the information that is stored in the repository. If you want back-end changes reflected on the Informatica screens, you have to re-import the definitions from the back end using a valid connection, replace the existing definitions with the imported ones, and refresh the mapping.
What is a time dimension? Give an example.
In a relational data model, for normalization purposes, the year lookup, quarter lookup, month lookup and week lookup are not merged into a single table. In dimensional data modeling (star schema), these tables are merged into a single table called the TIME DIMENSION, for performance and for slicing data. This dimension helps to find the sales done on a daily, monthly and yearly basis, and we can do trend analysis by comparing this year's sales with the previous year's, or this week's sales with the previous week's.
What is data cleansing?
This is nothing but polishing the data: cleaning it before it is added to the data warehouse. The various subsystems that maintain, for example, customer data can differ — one subsystem may store Gender as M and F while another stores it as MALE and FEMALE — so the values need to be standardized. Another typical example is addresses; we might need an address cleansing tool to get customer addresses into a clean and consistent form.
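A minimal SQL sketch of the gender standardization described above; the table and column names are hypothetical, and the same logic could equally be put in an Expression transformation:
SELECT cust_id,
       CASE UPPER(gender)
         WHEN 'M' THEN 'MALE'
         WHEN 'F' THEN 'FEMALE'
         ELSE UPPER(gender)   -- pass through values that are already clean
       END AS gender_std
FROM   stg_customer;          -- hypothetical staging table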
What is the difference between the Informatica Repository Server and the Informatica Server?
The Informatica Repository Server manages connections to the repository from the client applications. The Informatica Server extracts the source data, performs the data transformation, and loads the transformed data into the target.
Discuss the advantages and disadvantages of the star and snowflake schemas.
A star schema consists of a single fact table surrounded by dimension tables, and the dimension tables are denormalized. In a snowflake schema the dimension tables are normalized and are connected to further sub-dimension tables. The advantage of the snowflake schema is that the normalized tables are easier to maintain and it saves storage space; the disadvantage is that it reduces the effectiveness of navigation across the tables because of the large number of joins between them. Generally a star schema is used for report generation and a snowflake schema is used for cubes.
At the most, how many transformations can be used in a mapping?
There are 22 transformation types, which you can find on the Informatica transformation toolbar, and there is no hard limit on how many you can use in a mapping. From a performance point of view, however, using too many transformations will reduce session performance.
How do you transfer data from the data warehouse to a flat file?
You can write a mapping with the flat file as a target, using a DUMMY_CONNECTION. A flat file target is built by pulling a source into the target space using the Warehouse Designer tool.
What are the main advantages and purpose of using the Normalizer transformation in Informatica?
The Normalizer transformation is used mainly with COBOL sources, where most of the time the data is stored in denormalized format. The Normalizer transformation can be used to create multiple rows from a single row of data.
How do you read rejected data or bad data from a bad file and reload it to the target?
Correct the rejected data and send it to the target relational tables using the load order utility. Find the rejected data by using the column indicator and row indicator in the bad file.
" Always remember when designing a mapping: less for more design with the least number of transformations that can do the most jobs. The junk dimension is simply a structure that provides a convenient place to store the junk attributes.My idea is "if needed more tranformations to use in a mapping its better to go for some stored procedure. 1) Unless you assign the output of the source qualifier to another transformation or to target no way it will include the feild in the query. Normal Load: Normal load will write information to the database log file so that if any recorvery is needed it is will be helpful. A good example would be a trade fact in a company that brokers equity trades. Bulk load is also called direct loading. 2) source qualifier don't have any variables feilds to utalize as expression. flags and/or text attributes that are unrelated to any particular dimension. can we lookup a table from a source qualifer transformation-unconnected lookup No. I will explain you why. how to get the first 100 rows from the flat file into the target?
. What is the difference between Narmal load and Bulk load? what is the difference between powermart and power center? when we go for unconnected lookup transformation? bulk load is faster than normal load. else the session will be failed. Rule of thumb For small number of rows use Normal load For volume of data use bulk load what is a junk dimension? A "junk" dimension is a collection of random transactional codes. we can't do. compartivly Bulk load is pretty faster than normal load.in such cases we should you normal load only. when the source file is a text file and loading data to a table. In case of bulk load informatica server by passes the data base log file so we can not roll bac the transactions. Bulk Mode: Bulk load will not write information to the database log file so that if any recorvery is needed we can't do any thing in such cases.
There are several ways:
1. Add this Unix command in the session properties under Components --> Pre-Session Command: head -100 <source file path> > <new file name>, and mention the new file name and path in the Session --> Source properties.
2. In the Workflow Manager, double-click on the link between the task and the session and type $$source success rows (a parameter in the session variables) = 100; the session should then stop automatically after 100 rows.
3. Copy the first 100 records to a new file and load that file.
4. Put a counter/sequence generator in the mapping and filter on it.
Can we modify the data in a flat file?
Just open the text file with Notepad and change whatever you want (but the datatype should stay the same). Use the test/download option if you want to use it for testing.
What is the difference between a summary filter and a detail filter?
A summary filter can be applied to a group of records that share common values (i.e. after grouping); a detail filter can be applied to each and every record in a database.
What are the differences between a view and a materialized view?
Materialized views are schema objects that can be used to summarize, precompute, replicate, and distribute data, for example to construct a data warehouse. A materialized view provides indirect access to table data by storing the results of a query in a separate schema object, so storage parameters are required. An ordinary view, in contrast, does not take up any storage space or contain any data: the select query is stored in the database, and whenever you select from the view the stored query is executed, so effectively you are calling the stored query. Views are useful when you want to reuse a query repeatedly or hide complex queries. Note also that DML is possible against simple updatable views (with restrictions), whereas a materialized view is normally refreshed from its base tables rather than updated directly.
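A minimal Oracle-style sketch of the difference; the table and column names are assumptions for illustration only:
-- an ordinary view stores only the query text, no data
CREATE VIEW v_sales_summary AS
SELECT prod_id, SUM(amount) AS total_amount
FROM   sales
GROUP  BY prod_id;

-- a materialized view stores the query result, needs storage,
-- and is refreshed from the base table rather than re-run on every query
CREATE MATERIALIZED VIEW mv_sales_summary
BUILD IMMEDIATE
REFRESH COMPLETE ON DEMAND
AS
SELECT prod_id, SUM(amount) AS total_amount
FROM   sales
GROUP  BY prod_id;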
Compare the Data Warehousing top-down approach with the bottom-up approach.
Top-down: first build the data warehouse, then build the data marts (ODS --> ETL --> Data warehouse --> Data mart --> OLAP). This needs more cross-functional skills, takes more time and is more costly. Bottom-up: first build the data marts, then the data warehouse (ODS --> ETL --> Data mart --> Data warehouse --> OLAP). The data mart that is built first remains as a proof of concept for the others, and it takes less time and cost. One view is that the bottom-up approach is best because in a three-tier architecture the data tier is the bottom one; another is that bottom-up is good at integration time while top-down is good at implementation time. Nothing is wrong with either approach; it all depends on your business requirements and what is already in place at your company, and many organizations use a hybrid approach. For more information, read about Kimball vs. Inmon.
Discuss which is better among incremental load, normal load and bulk load.
It depends on the requirement. If the database supports the bulk load option from Informatica, then using bulk load for initially loading the tables is recommended. Otherwise, an incremental load can be better, as it takes only the data that is not already available in the target. Depending on the requirement, you choose between normal and incremental loading strategies.
What is the difference between connected and unconnected Stored Procedure transformations?
Connected: the flow of data through the mapping also passes through the Stored Procedure transformation. All data entering the transformation through the input ports affects the stored procedure. You should use a connected Stored Procedure transformation when you need data from an input port sent as an input parameter to the stored procedure, or the results of a stored procedure sent as an output parameter to another transformation.
Unconnected: the unconnected Stored Procedure transformation is not connected directly to the flow of the mapping. It either runs before or after the session, or is called by an expression in another transformation in the mapping.
What are the differences between Informatica 6.2 and Informatica 7.1?
New features in 7.1 include: 1. Union and Custom transformations 2. Lookup on flat files 3. Grid servers working on different operating systems can coexist on the same server 4. pmcmdrep 5. Export of independent and dependent repository objects 6. Version controlling 7. Data profiling 8. The ability to write to XML targets and to deploy mappings in web applications. Other 7.x features often mentioned are versioning, LDAP authentication, and support for 64-bit architectures.
What are the differences between Informatica 5.1 and 6.1?
Mainly graphical enhancements and XML file support. In 6.1 a new component called the Repository Server was introduced, and in place of the Server Manager (5.1) the Workflow Manager and Workflow Monitor were introduced. Version 7.0 then introduced the Custom and Union transformations and the flat file lookup condition.
What is the difference between the Informatica PowerCenter Server, the Repository Server and the repository?
The repository is nothing but a set of tables created in a database; it stores all the metadata of the Informatica objects. The Repository Server is the component that communicates with the repository, i.e. the database; all the client tools communicate with the database through the Repository Server. The Informatica (PowerCenter) Server is responsible for running the workflows, tasks, etc., and it also communicates with the database through the Repository Server.
How do you create the staging area in your database?
Creating the staging tables/area is the work of the data modeler/DBA: just issue "create table <tablename> ..." using the same layout as your source tables, or use the Generate SQL option in the Warehouse Designer tab. Staging tables usually carry a name that identifies them as staging, like dwc_tmp_asset_eval (tmp indicates a temporary table, which is nothing but staging). A staging area in a DW is used as a temporary space to hold all the records from the source system, so it should more or less be an exact replica of the source systems, except for the load strategy, where we use truncate-and-reload options.
Repository: the place where all the metadata information is stored. Repository Server: takes care of the connection between the PowerCenter client and the repository. PowerCenter Server: does the extraction from the source and loads the data to the target. Both the Repository Server and the PowerCenter Server access the repository for managing the data.
What do the Expression and Filter transformations do in the Informatica Slowly Growing Target wizard?
The Expression transformation detects and flags the rows coming from the source: it checks whether the primary key already exists in the target and calculates a new flag based on that. The Filter transformation filters out the rows that are not flagged and passes the flagged rows to the Update Strategy transformation. More generally, you can use an Expression transformation to calculate values in a single row before you write to the target; for example, you might need to adjust employee salaries, concatenate first and last names, or convert strings to numbers.
Briefly explain the versioning concept in PowerCenter 7.1.
PowerCenter 7.x provides 22 transformation types compared with 17 in 6.x (five were added, and Lookup was enhanced). When you create a version of a folder referenced by shortcuts, all shortcuts continue to reference their original object in the original version; they do not automatically update to the current folder version.
For example, if you have a shortcut to a source definition in the Marketing folder, version 1.0.0, and then you create a new folder version, 1.5.0, the shortcut continues to point to the source definition in version 1.0.0. Maintaining versions of shared folders can result in shortcuts pointing to different versions of the folder. Though shortcuts to different versions do not affect the server, they might prove more difficult to maintain. To avoid this, you can recreate shortcuts pointing to earlier versions, but this solution is not practical for much-used objects. Therefore, when possible, do not version folders referenced by shortcuts.
How do you join two tables without using the Joiner transformation?
It is possible to join two or more tables by using the Source Qualifier, provided the tables have a relationship. When you drag and drop the tables, you get a Source Qualifier for each table; delete all of them, add a common Source Qualifier for all the sources, right-click the Source Qualifier and choose Edit, go to the Properties tab, and write your own SQL in the SQL Query property. This can be done with the Source Qualifier, but there are some limitations. How do you identify bottlenecks in the various components of Informatica and resolve them? (See the troubleshooting discussion later in this document.)
Can Informatica be used as a cleansing tool? If yes, give examples of transformations that can implement a data cleansing routine.
Yes, we can use Informatica for cleansing data; usually the Expression transformation is used, together with the data cleansing functions available in Informatica. For example, if field X has some rows with NULL values and is mapped to a NOT NULL target column, inside an Expression we can assign a space or some constant default value to avoid session failure. If the input data is in one format and the target expects another, we can change the format in an Expression. We can also assign default values to the target to represent a complete set of data, and remove unwanted spaces in flat file sources by using LTRIM and RTRIM.
If a date has some stray character in between, you can remove that character by using REPLACECHR, and you can use SUBSTR to strip extra characters; there are many more functions to explore.
How do you decide whether you need to do aggregations at the database level or at the Informatica level?
It depends on the requirement. If you have a database with good processing power, you can create an aggregation table or view at the database level; otherwise it is better to use Informatica. One reason to use Informatica: being a third-party tool, it may take more time to process aggregations compared with the database, but Informatica has an option called incremental aggregation, which updates the current values with current values plus new values, so there is no need to process all the values again and again (as long as nobody deletes the cache files; if that happens, the total aggregation has to be executed in Informatica again). Databases do not have an incremental aggregation facility. It also depends on the source you have and on your requirement: if you have an EMS queue, a flat file, or any source other than an RDBMS, you need Informatica to do any kind of aggregate functions. If your source is an RDBMS, you are usually not only doing the aggregation in Informatica; there is business logic behind it, and you need to do other things like looking up against some table or joining the aggregated result with the actual source. If the question is whether to aggregate at the mapping level or at the DB level, it is better to do the aggregation at the DB level by using a SQL override in the Source Qualifier when aggregation is the main purpose of your mapping; it definitely improves performance.
How do we estimate the depth of the session scheduling queue? Where do we set the number of maximum concurrent sessions that Informatica can run at a given time? How do we estimate the number of partitions that a mapping really requires? Is it dependent on the machine configuration?
It depends on the Informatica version being used: Informatica 6 supports only 32 partitions, whereas Informatica 7 supports 64 partitions.
Suppose a session is configured with a commit interval of 10,000 rows and the source has 50,000 rows. Explain the commit points for source-based commit and target-based commit. Assume appropriate values wherever required.
Source based commit will commit the data into target based on commit interval.so,for every 10,000 rows it will commit into target. Target based commit will commit the data into target based on buffer size of the target.i.e., it commits the data into target when ever the buffer fills.Let us assume that the buffer size is 6,000.So,for every 6,000 rows it commits the data. We are using Update Strategy Transformation in mapping how can we know whether insert or update or reject or delete option has been selected during running of sessions in Informatica.
Operation   Constant     Numeric value
Insert      DD_INSERT    0
Update      DD_UPDATE    1
Delete      DD_DELETE    2
Reject      DD_REJECT    3
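As one hedged illustration (not from the original answer), an Update Strategy expression typically combines these constants with a lookup result; the lookup output port name below is hypothetical:
IIF( ISNULL(lkp_CUST_ID), DD_INSERT, DD_UPDATE )
That is, insert the row when the key was not found by the target lookup, otherwise update it.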
If you are using an Update Strategy in your mapping, there is no option to simply check or uncheck these operations: when you have to perform any of the DML operations, you have to code it in the Update Strategy expression manually. If you have used DD_UPDATE or DD_REJECT, you can only find out what actually happened by querying the target table, or, for rejected rows, through the session log.
Which objects are required by the Debugger to create a valid debug session?
Initially the session itself should be valid: source, target, lookups and expressions should be available, and normally at least one breakpoint should be set for the Debugger to debug your session. In fact, we can create a valid debug session even without a single breakpoint, but we have to give valid database connection details for the sources, targets, and lookups used in the mapping, and it should contain valid mapplets (if any are in the mapping).
What is the limit to the number of sources and targets you can have in a mapping?
There is one formula: number of blocks = 0.9 * (DTM buffer size / block size) * number of partitions, where the number of blocks should be at least (number of sources + number of targets) * 2. Apart from that, there is no hard restriction on the number of sources or targets inside a mapping; but from an organizational point of view it is never encouraged to use a very large number of tables at a time, because if you make N tables participate in processing at the same time, it reduces database and Informatica server performance. It depends on experience and on the requirement.
What is the procedure to write a query to list the highest salaries of three employees?
SELECT sal FROM (SELECT sal FROM my_table ORDER BY sal DESC) WHERE ROWNUM < 4;
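As a hedged alternative to the ROWNUM query above (not part of the original answer), an analytic-function version handles duplicate salary values explicitly; it reuses the same my_table example:
SELECT sal
FROM  ( SELECT sal,
               DENSE_RANK() OVER (ORDER BY sal DESC) AS rnk
        FROM   my_table )
WHERE rnk <= 3;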
In dimensional modeling, is the fact table normalized or denormalized, in the case of a star schema and in the case of a snowflake schema?
Star schema: denormalized dimensions — a single fact table is surrounded by a group of dimension tables comprising de-normalized data. Snowflake schema: normalized dimensions — a single fact table is surrounded by a group of dimension tables comprising normalized data.
Which is better among connected lookup and unconnected lookup transformations in Informatica (or any other ETL tool)?
It is not easy to say which is better; each has its own advantages and disadvantages, and it depends on the requirement. A connected lookup is in the same pipeline as the source and it will accept dynamic caching; basically a connected lookup can return multiple values, while an unconnected lookup returns one value. If you want to use the same lookup table many times in the same mapping, it is better to go for an unconnected lookup rather than creating the same connected lookup many times. If the output of one lookup goes as input to another lookup, unconnected lookups are also favourable. Use a dynamic cache only when you want to track the changes to the target table records in that particular run; a dynamic cache is usually not preferred because of the performance cost. If the source is well defined, you can use a connected lookup; if the source is not well defined or comes from a different database, you can go for an unconnected lookup.
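For reference, a minimal sketch of how an unconnected lookup is invoked from an expression; the lookup transformation name and input port below are hypothetical:
:LKP.lkp_customer(CUST_ID)
The call passes CUST_ID as the lookup input and returns the single port marked as the return port, which is what makes an unconnected lookup reusable from many expressions in the same mapping.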
The snowflake schema is a more complex data warehouse model than a star schema, and is a type of star schema. It is called a snowflake schema because the diagram of the schema resembles a snowflake. Snowflake schemas normalize dimensions to eliminate redundancy; that is, the dimension data is grouped into multiple tables instead of one large table. For example, a product dimension table in a star schema might be normalized into a products table, a product-category table, and a product-manufacturer table in a snowflake schema. While this saves space, it increases the number of dimension tables and requires more foreign key joins; the result is more complex queries and reduced query performance.
The star schema (sometimes referred to as a star join schema) is the simplest data warehouse schema, consisting of a single fact table with a compound primary key, with one segment for each dimension and with additional columns of additive, numeric facts. The star schema makes multi-dimensional database (MDDB) functionality possible using a traditional relational database. Because relational databases are the most common data management systems in organizations today, implementing multi-dimensional views of data on a relational database is very appealing, and even if you are using a specific MDDB solution, its sources are likely relational databases. Another reason for using a star schema is its ease of understanding. Fact tables in a star schema are mostly in third normal form (3NF), while dimension tables are in de-normalized second normal form (2NF). If you normalize the dimension tables, they look like snowflakes (see snowflake schema) and the same problems of relational databases arise: you need complex queries, and business users cannot easily understand the meaning of the data. Although query performance may be improved by advanced DBMS technology and hardware, highly normalized tables make reporting difficult and applications complex. In general, in a star schema the fact table is normalized and the dimension tables are denormalized; in a snowflake schema the fact table is normalized and the dimension tables are also normalized.
What is the difference between the IIF and DECODE functions?
You can use nested IIF statements to test multiple conditions. The following example tests for various conditions and returns 0 if sales is zero or negative:
IIF( SALES > 0, IIF( SALES < 50, SALARY1, IIF( SALES < 100, SALARY2, IIF( SALES < 200, SALARY3, BONUS))), 0 )
You can use DECODE instead of IIF in many cases; DECODE may improve readability. The following shows how you can use DECODE instead of IIF:
DECODE( TRUE,
  SALES > 0 and SALES < 50, SALARY1,
  SALES > 49 AND SALES < 100, SALARY2,
  SALES > 99 AND SALES < 200, SALARY3,
  SALES > 199, BONUS )
What are variable ports, and list two situations when they can be used?
We have mainly three kinds of ports: input, output and variable. An input port represents data flowing into the transformation; an output port is used when data is mapped to the next transformation; a variable port is used when mathematical calculations are required, to break a complex expression into simpler pieces, and to store intermediate values. For example, if you are trying to calculate a bonus from the EMP table:
Bonus = sal * 0.2
Totalsal = sal + comm + Bonus
Here Bonus can be a variable port and Totalsal the output port that reuses it.
How does the server recognise the source and target databases?
By using an ODBC connection if the source or target is relational, and an FTP connection if it is a flat file. We can verify this in the connection settings of the session properties for both sources and targets.
How do you retrieve the records from a rejected file? Explain with syntax or example.
During the execution of a workflow, all the rejected rows are stored in bad files under the directory where your Informatica server is installed (for example C:\Program Files\Informatica PowerCenter 7.1\Server). These bad files can be imported as a flat file source, and then through a direct mapping we can load the corrected records in the desired format.
How do you look up data on multiple tables?
When you create a Lookup transformation, Informatica asks for a table name, and you can choose source, target, import, or skip. If the lookup is relational, Informatica allows a lookup on joined tables: click Skip and then use the SQL override property in the Properties tab to join the tables for the lookup. Alternatively, if you want to look up data on multiple tables at a time, join the tables you want first and then look up the joined table.
To be more specific about looking up on multiple tables: if the two tables are relational, you can use the SQL override option to join the two tables in the lookup properties; you cannot join a flat file and a relational table. For example, the default lookup query will be SELECT <lookup table column names> FROM <lookup_table>; you can continue this query by adding the column names of the second table with a table qualifier and a WHERE clause. If you want to use your own ORDER BY, put -- at the end so that the ORDER BY generated by Informatica is commented out.
What is the use of incremental aggregation? Explain briefly with an example.
It helps in adding incremental data and avoids reprocessing everything. When the Informatica Server performs incremental aggregation, it passes new source data through the mapping and uses the historical cache data to perform the new aggregation calculations incrementally. It is a session option, and we use it for performance.
What is the procedure to load the fact table? Give it in detail.
Based on the requirements for your fact table, choose the sources and transform the data according to your business needs. Usually source records are looked up against the records in the dimension tables (dimension tables act as lookup or reference tables): all the possible values, for example all existing prod_id values for a product, are stored in the DIM table, and when data from the source is looked up against the dimension table, the corresponding keys are sent to the fact table. Sometimes only an existence check is done and the prod_id itself is sent to the fact. For the fact table you also need a primary key, so use a Sequence Generator transformation to generate a unique key and pipe it to the target (fact) table together with the foreign keys from the source tables. We can also use the two wizards (the Getting Started wizard and the Slowly Changing Dimension wizard) to load the fact and dimension tables; with these wizards we can create different types of mappings according to the business requirements and load the star schema (fact and dimension tables). This is not a fixed rule; it may vary as per your requirements and methods.
How do you delete duplicate rows in flat file sources? Is there any option in Informatica?
Use a Sorter transformation; it has a "distinct" option, so make use of it. A dynamic lookup or an Aggregator can also be used for this.
How do you use mapping parameters, and what is their use?
Mapping parameters and variables make the use of mappings more flexible and avoid creating multiple mappings. They are created in the Mapping Designer through the menu option Mapping --> Parameters and Variables: enter the name (it has to be preceded by $$), choose the type (parameter or variable) and the datatype. Once defined, the parameter or variable can be used in any expression, for example in the source filter property of the Source Qualifier transformation — just enter the filter condition there. Finally, create a parameter file to assign the value for the parameter or variable and configure the session to use that parameter file.
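A rough sketch of what such a parameter file might contain; the folder, session and parameter names are hypothetical, and the exact section-header format depends on the PowerCenter version, so treat this only as an illustration:
[MyFolder.s_m_load_sales]
$$LastRunDate=2004-12-31
$$Region=WEST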
how to do this. it should start with the value of 70.. there u will find the last value stored in the repository regarding to mapping variable.clob allow only 9i not 8i and more over list partinition is there in 9i only Can we use aggregator/active transformation after update strategy transformation
. recently i was asked by the interviewer like. the variable value will be saved to the repository after the completion of the session and the next time when u run the session.if there is a parameter file for the mapping variable it uses the value in the parameter file not the value+1 in the repositoryfor example assign the value of the mapping variable as 70. in the concept of mapping parameters and variables. if ther parameter is npt present it uses the initial value which is assigned at the time of creating the variable hi all. can any one comment on significance of oracle 9i in informatica when compared to oracle 8 or 8i. mapping variable represents avalue that can change throughout the session. in workflow manager start-------->session.in othere words higher preference is given to the value in the parameter file mapping parameter represents a constant value that u can define before running session and it returns same value.. the server takes the saved variable value in the repository and starts assigning the next value of the saved value. u can do onething after running the mapping.session properties.next time when i run the session. i mean how is oracle 9i advantageous when compared to oracle 8 or 8i when used in informatica Actually oracle 8i not allowed userdefined data types but 9i allows and then blob. in that go for persistant values.. for example i ran a session and in the end it stored a value of 50 to the repository. right clickon the session u will get a menu. i hope ur task will be done it takes value of 51 but u can override the saved variable in the repository by defining the value in the parameter file. then remove it and put ur desired one. not with the value of 51. run the session. however the final step is optional.
so all the dimensions are marinating historical data. The problem will be. please give answers only if u r confident about it. Apart from this. what your manager told is correct. " if we normalized the dimension table we will create such intermediate tables and that will not be efficient Yes. If we give wrong answers lot of people who did't know the answer thought it as the correct answer and may fail in the interview. they are de normalized. Thats why we maintain hierarchy in dimension tables
. it would be a waste of database space and also. then the deleted rows will be subtracted from this aggregator transformation. if u maintain primary key it won't allow the duplicate records with same employee id. First of all i want to tell one thing to all users who r using this site. we maintain Hierarchy in these tables. so to maintain historical data we are all going for concept data warehousing by using surrogate keys we can achieve the historical data(using oracle sequence for critical column). regarding why dimenstion tables r in denormalised in nature. Similary. this may be a case. why dimenstion tables are denormalized in nature ? Because in Data warehousing historical data should be maintained. all details should be maintain in one table.county > city > state > territory > division > region > nation If we have different tables for all. and now where he is working. i had discussed with my project manager about this. the site must be helpfull to other .. if we have a hierarchy something like this. we need to query all these tables everytime.. once you perform the update strategy. please keep that in the mind.You can use aggregator after update strategy.. to maintain historical data means suppose one employee details like where previously he worked. for efficient query performance it is best if the query picks up an attribute from the dimension table and goes directly to the fact table and do not thru the intermediate tables. if there is a child table and then a parent table. one has to every time join or query both these tables to get the parent child relation. so if we have both child and parent in the same table. if both child and parent are kept in different tables. dear reham thanks for ur responce.. For example. refer it once again in the manual its not wrong. Maintaining Hierarchy is pretty important in the dwh environment.. say you had flagged some rows to be deleted and you had performed aggregator transformation for all rows. what he told is :-> The attributes in a dimension tables are used over again and again in queries. we can always refer immediately. say you are using SUM function. because of duplicate entry means not exactly duplicate record with same employee number another record is maintaining in the table..
Based on the business, we decide whether to maintain the hierarchy in the same table or in different tables.
How do you handle decimal places while importing a flat file into Informatica?
While importing the flat file, the flat file wizard helps in configuring the properties of the file: select the numeric column and just enter the precision and scale (for example, precision 8, scale 3, and width 10 for a fixed-width flat file). Precision includes the scale; for example, the number 98888.654 has precision 8 and scale 3. The flat file source itself supports only the number datatype (no separate decimal and integer), so in the mapping the port flows as source (number datatype) -> Source Qualifier (decimal datatype); integer is not supported, and decimal takes care of it. The Source Qualifier associated with that source will have the decimal datatype for that number port.
In a sequential batch, how can we stop a single session?
We can stop it using the pmcmd command, or in the Monitor right-click on that particular session and select Stop. This stops the current session and the sessions after it in the batch.
If your workflow is running slowly in Informatica, where do you start troubleshooting and what are the steps you follow?
When the workflow is running slowly, you have to find the bottlenecks, checking in this order: target, source, mapping, session, system. A workflow may be slow for different reasons; one is alpha characters in decimal data, and another is insufficient length of strings — check these with the SQL override.
If you have four lookup tables in the workflow, how do you troubleshoot to improve performance?
When a workflow has multiple lookup tables, use a shared cache. There are also many other ways to improve a mapping which has multiple lookups.
. You can create some generalized transformations to handle the errors and use them in your mapping. 3)we can increase the chache size of the lookup. these r new rows . first of all after doing anything in your designer mode or workflow manager.target..But I'll give that some other time... can anyone explain error handling in informatica with examples so that it will be easy to explain the same in the interview? go to the session log file there we will find the information regarding to the session initiation process. Then click ok button.. Any gotcha\'s or precautions.. 1> First save the changes or new implementations.1) we can create an index for the lookup table if we have permissions(staging area).. Leave the information you have done like "modified this mapping" etc. 2>Then from navigator window.. There is a way also to reload bad/rejected records using information tools.. you will find versioning->Check In.X. these r existing rows..Still there is also another shortcut method for this. only the new rows will come to mapping and the process will be fast . In that window at the lower end side. There will be a pop up window.. 2) divide the lookup mapping into two (a) dedicate one for insert means: source .. Also you can setup bad files and rejected files which you can use to see error records. How do I import VSAM files from source to target. OK. What is the procedure or steps implementing versioning if you are already in version7. load summary. For version control in ETL layer using informatica. (b) dedicate the second one to update : source=target. so by seeing the errors encountered during the session running. For example for data types create one generalized transformation and include in your mapping then you will know the errors where they are occuring. we can resolve the errors. right click on the specific object you are currently in. A window will be opened..
. only the rows which exists allready will come into the mapping. do the following steps.. Do I need a special plugin As far my knowledge by using power exchange tool convert vsam file to oracle tables then do mapping as usual to the target table. errors encountered.
What are the differences between the Normalizer transformation and normalization?
Normalizer: a transformation mainly used with COBOL sources; it changes rows into columns and columns into rows. Normalization: the process of removing redundancy and inconsistency.
What is an IQD file?
IQD stands for Impromptu Query Definition. This file is mainly used with the Cognos Impromptu tool: after creating an IMR (report), we save the IMR as an IQD file, which is then used while creating a cube in PowerPlay Transformer (in the data source type we select Impromptu Query Definition).
What is data merging, data cleansing and sampling?
Cleansing: to identify and remove redundancy and inconsistency. Sampling: to send only a sample subset of the data from source to target.
Could anyone please tell me what are the steps required for a type 2 dimension/version data mapping, and how can we implement it?
Using the wizard: go to the Mapping Designer, choose Mappings --> Wizards --> Slowly Changing Dimensions; in the window that appears, give the mapping name, the source table, the target table and the type of SCD; when you select Finish, the SCD type 2 mapping is created. Go to the Warehouse Designer and generate the target table, validate the mapping in the Mapping Designer, save it to the repository, and run the session in the Workflow Manager. Later, update the source table and re-run; you will find the difference (new versions) in the target table.
Manually, the logic is: 1. Determine if the incoming row is (1) a new record, (2) an updated record, or (3) a record that already exists in the table, using two Lookup transformations. 2. Split the mapping into three separate flows using a Router transformation. 3. For case (1), create a pipe that inserts the row into the table; for case (2), create two pipes from the same source, one inserting the new version and one updating the old record.
How do you import an Oracle sequence into Informatica?
Create a procedure (or function) in the database and reference the sequence inside the procedure; then call the procedure in Informatica with the help of a Stored Procedure transformation.
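A minimal Oracle sketch of that approach; the sequence and function names are hypothetical:
CREATE SEQUENCE seq_cust_key START WITH 1 INCREMENT BY 1;

CREATE OR REPLACE FUNCTION get_next_cust_key RETURN NUMBER
IS
  v_key NUMBER;
BEGIN
  -- return the next surrogate key value from the sequence
  SELECT seq_cust_key.NEXTVAL INTO v_key FROM dual;
  RETURN v_key;
END;
/
The function can then be imported into the mapping through a Stored Procedure transformation and called for each row that needs a key.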
Two relational tables are connected to a Source Qualifier transformation; what are the possible errors that can be thrown?
The only two possibilities, as far as I know, are: both tables should have a primary key / foreign key relationship, and both tables should be available in the same schema or the same database.
Without using the Update Strategy and session options, how can we update our target table?
In the session properties there is a "Treat source rows as" option with values such as Insert, Update (as Update), Update (as Insert), Update (else Insert) and Delete; by using this, we can solve it easily. By default all the rows in the session are set with the insert flag; if you set "Treat source rows as: Update", all the incoming rows are set with the update flag, and you can change this in the session general properties. Note that with an Update Strategy, or the session-level update options, the target necessarily needs a primary key. The update override in the target properties is used basically for updating the target table based on a non-key column (e.g. update by ENAME, which is not a key column in the EMP table); with it you can still update the rows in the target table. If your database is Teradata, we can also do it with a TPump or MultiLoad external loader.
What is the best way to show metadata (number of rows at source, target and each transformation level, and error-related data) in a report format?
When your workflow has completed, go to the Workflow Monitor, right-click the session and go to the transformation statistics; there we can see the number of rows at source and target. If we go to the session properties, we can see the errors related to the data. You can also select these details from the repository tables, for example by using the repository view REP_SESS_LOG.
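A sketch of such a repository query; the exact column names of REP_SESS_LOG differ slightly between versions, so treat these as assumptions to verify against your repository:
SELECT session_name,
       successful_rows,
       failed_rows,
       first_error_msg
FROM   rep_sess_log
ORDER  BY actual_start DESC;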
What are partition points?
Partition points mark the thread boundaries in a source pipeline and divide the pipeline into stages.
What are cost-based and rule-based approaches, and what is the difference?
Cost-based and rule-based approaches are optimization techniques used in databases where we need to optimize a SQL query. Whenever you process a SQL query in Oracle, what the Oracle engine internally does is read the query and decide the best possible way of executing it; in this process, Oracle follows these optimization techniques.
1. Cost-based optimizer (CBO): if a SQL query can be executed in two different ways (say path 1 and path 2 for the same query), it calculates the cost of each path, analyses which path has the lower cost of execution, and then executes that path, so that it can optimize the query execution.
2. Rule-based optimizer (RBO): this basically follows a fixed set of rules for executing a query; depending on the rules that apply, the optimizer runs the query.
Use: if the table you are trying to query has already been analyzed, then Oracle will go with the CBO; if the table is not analyzed, Oracle follows the RBO and will tend to go with a full table scan. Basically Oracle provides two types of optimizers (indeed three, but we use only these two techniques, because the third has some disadvantages).
If you had to split the source-level key going into two separate tables, one as a surrogate key and the other as a primary key, and since Informatica does not guarantee that keys are loaded properly (in order) into those tables, what are the different ways you could handle this type of situation?
Use a foreign key relationship between the two tables.
How do you append records to a flat file in Informatica? In DataStage we have the options (i) overwrite the existing file and (ii) append to the existing file; this is not available in Informatica v7, but it is reported to be included in version 8.0, which is about to ship, where you can append to a flat file.
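Referring back to the cost-based versus rule-based discussion above, a minimal illustration of making statistics available so that the CBO is used; the table name is hypothetical, and newer Oracle releases prefer DBMS_STATS over ANALYZE:
ANALYZE TABLE emp COMPUTE STATISTICS;
-- or, on newer releases:
-- EXEC DBMS_STATS.GATHER_TABLE_STATS(USER, 'EMP');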
These transactions are complex because there are a lot of context parameters and many special parameters describing modern financial investments. Now you may be thinking. (See Figure 1. We excuse these five fields from the design and keep the remaining 15 fields. what is mystery dimention? using Mystery Dimension ur maitaining the mystery data in ur Project
Find the Obvious Dimension-Related Fields For the first step of triage. When a design challenge such as this confronts me. which we model as facts. But five of the fields turned out to be cumulative measures that are not appropriate to the grain of an individual transaction. A typical raw source data record is likely to be a kind of flat record containing both keys for these entities as well as descriptive text such as account type and customer name. I try to stand back from the details and perform a kind of triage. I discussed the details of this approach in my DBMS column “Data Warehouse Role Models” (August 1997. What on earth was all that stuff? Investment transactions are good examples of complex. we data warehouse architects drive a fact table’s design from a specific data source. account types. We need to place a lot of redundant textual information in conventional dimensions. messy data. because none of the source data transaction
A typical complex example might be a set of records describing investment transactions. A recent example I studied had more than 50 fields in the raw data, and the end users assured me that all the data was relevant and valuable. But the 50 fields intimidated me. The complexity isn't the database designer's fault: we could vindicate the raw data's design because it had to be flexible enough to handle many different investment transaction situations, including future types of investments that the records had not yet described. The value of the triage approach is to quickly identify the easy choices (in this case, the obvious dimensions and facts) and to isolate a hopefully small subset of difficult data elements that require individual attention. (See Figure 1.)

FIGURE 1: The logical progression of transforming a complex single data source into its corresponding dimensional model.

In the first step of the triage, I find fields in the source data that are obviously parts of dimensions. Timestamps are straightforward: each of these can be a time dimension. Maybe four separate timestamps describe our investment transaction, where we ask a single underlying calendar dimension to play four roles; we can accomplish this task by creating four views on the single underlying calendar table. Other straightforward dimension-related fields in our investment transaction include account numbers, portfolio numbers, transaction types and codes, customer names and numbers, broker names and numbers, and location-specific information. In this way we can quickly identify no fewer than 20 of the fields as dimension related; after the dust clears, there are still 12 independent dimensions, of which four were roles the time dimension played.

Find the Fact-Related Fields
The second step of the triage is to look for the numeric measurements. Anything that is a floating-point number or a scaled integer (such as a currency value) is likely to be a measurement, and if the value varies in a seemingly random way between records and takes on a very large number of different values, then it is almost surely a measurement. In the case of the 50 investment transaction fields I've described, 20 of the fields clearly fit the characteristics of measurements. Because every record in the data represented an investment transaction, and all the investments were somewhat similar, I hoped that the data source would generate only one fact table, where the grain was the individual transaction. But what kind of weird transaction could possibly have 15 simultaneous facts? That's a good question: certain kinds of transactions gave rise to one set of facts, and other transactions gave rise to an overlapping but different set of facts, and few records actually had all 15 facts. There were many transaction types and many investment account types, but there was no disjoint partitioning of all the transactions that would separate the clumps of facts into nice groups; the pattern of measurements across these types and accounts was too complex to describe or neatly segment.

Decide What to Do With the Rest
So far we have accounted for 40 of the 50 fields in our original data, but there are still 10 mystery fields left over. These fields aren't obvious textual dimension attributes or obvious foreign key values, and they do not appear to be numeric measurements: when the fields are present, they seem to take on a small range of discrete values. Some of them are designated as codes, but no one is entirely sure of their significance. I ask an obligatory, but pointless, question: If we don't know what the field means, why don't we leave it out of the design? The answer, of course, is that someone may need it, so we will leave it in.

Transform Mystery Fields Into Mystery Dimensions
Returning to our problem of 10 rogue fields that seem to be neither dimensions nor facts: because they take on only a small range of discrete values, they may not feel like dimensions, and we may be tempted to just leave them in the fact table. This is almost certainly a bad idea. Many of the codes or alphanumeric fields would take up too much room in the fact table, and we could drastically compress them if we could make them into dimensions. Our goal should be to make these fields into dimensions, even if we are not completely sure what they mean. Another easy approach is to just make 10 more dimensions, one for each mystery field. While this does place these low-cardinality codes and textual values in dimension tables, where we can easily index and constrain upon them, we now have 22 dimensions in our design, and that should raise a warning flag. Should we instead make one huge mystery dimension for all these remaining fields? That would seem to solve a number of problems: all the fields would go away, to be replaced by a single key. But this approach is likely to produce a dimension with as many records as the fact table itself. If the dimension contains several uncorrelated fields, then there will be very few repeated values for the whole dimension record, and every transaction would produce a new mystery dimension record. But the situation is rarely so extreme: many of these fields may be mildly or strongly correlated with each other. The secret to this last step of the design is to group the mystery fields together into correlated groups; each of these correlated groups becomes a new dimension. We need to make a significant effort to find the correlated mystery fields and group them together into a smaller number of new dimensions.

When Should Two Fields Be in a Single Dimension?
We are almost within reach of our final goal. We have separated off the obvious original dimensions and facts with our triage decisions, and we have decided that all that remains is to package the rest of the mystery fields into a few more dimensions. To discover which fields are correlated, you may have to comb the data, counting combinations of values in order to figure out what to do. Suppose FieldX has 100 discrete values and FieldY has 1,000 discrete values. The key question is: How many unique FieldX + FieldY combinations exist in the data? If there are exactly 1,000 such combinations, then FieldX is a hierarchical parent of FieldY, and they should absolutely be in the same dimension table. If the number of FieldX + FieldY combinations approaches 100,000, then the two fields are virtually independent, and we would gain very little by placing them in the same dimension. But the number of combinations might be 5,000 or 10,000; even this correlation is pretty interesting, and the two fields should be part of the same dimension, where it will be interesting to browse the correlated fields against each other.

It is wise to be flexible when searching for these correlations. Try to keep perspective: if you have five uncorrelated fields, but they each have only three values, then it would be reasonable to package them all in a single mystery dimension. We end up with the Cartesian product of the fields, but there are only 3^5 = 243 possible combinations, a small and convenient mystery dimension. Grouping correlated fields together has a couple of attractive benefits. Maybe some of them will turn out to have hierarchical relationships; these relationships can be revealed when the fields are compressed into dimension tables where only the unique combinations are presented. More important, the number of new dimensions required will be reduced. Ultimately, you should not be striving for mathematical elegance; rather, you should be making pragmatic packaging decisions that best fit your data and your tools.

Well, perhaps at this point you are thinking that if we had a proper enterprise data model, then all these problems would have been sorted out and we wouldn't have to pursue such an ad hoc approach. I couldn't agree more. If an enterprise data model is a model of real data, then I am its biggest fan, and this article probably describes a specific episode in building that very useful enterprise data model. But if the enterprise data model describes a kind of abstract, ideal data world, describing how data should be if only it were designed correctly, then I have very little patience. Idealized enterprise data models aren't populated with data, and they are only of marginal use when we try to take real data and deliver it to end users on a tight budget and time frame. In this sense, in spite of this frustrating third step of the triage, we have proceeded correctly.
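To make the "comb the data" step above concrete, here is one way the FieldX/FieldY combination counts could be profiled in SQL. This is a minimal sketch only: the staging table and column names are hypothetical.

    -- Compare each field's cardinality with the cardinality of the pair.
    SELECT COUNT(DISTINCT field_x)                   AS x_values,
           COUNT(DISTINCT field_y)                   AS y_values,
           COUNT(DISTINCT field_x || '~' || field_y) AS xy_combinations
    FROM   stg_investment_txn;

If xy_combinations comes back close to y_values, FieldX is a hierarchical parent of FieldY; if it approaches x_values * y_values, the two fields are effectively independent; anything in between suggests a correlation worth packaging into a single dimension.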
what is the difference b/w Informatica 7.1 and Ab Initio?
There is a lot of difference between Informatica and Ab Initio. In Ab Initio there is the concept of the co-operating system, which runs the mapping in a parallel fashion; this is not in Informatica. In Ab Initio we use three kinds of parallelism, but Informatica uses one. Ab Initio has no built-in scheduling option (we schedule manually or with a PL/SQL script), whereas Informatica contains four scheduling options. Ramp-up time is much quicker in Ab Initio compared to Informatica, and Ab Initio is more user-friendly than Informatica.

Can I start and stop a single session in a concurrent batch?
Yes, sure: just right-click on the particular session and go to the recovery option, or use event wait and event raise.

I want to prepare a questionnaire. The details about it are as follows: 1. Identify a large company/organization that is a prime candidate for a DWH project (for example, a telecommunication company, a bank, or an insurance company may be a prime candidate for this). 2. Give at least four reasons for selecting the organization. 3. Prepare a questionnaire consisting of at least 15 non-trivial questions to collect requirements/information about the organization; this information is required to build the data warehouse. Can you please tell me what those 15 questions should be, to ask a company, say a telecom company?
First of all, meet your sponsors and make a BRD (business requirement document) about their expectations from this data warehouse; the main aim comes from them. For example, they need a customer billing process. Then go to the business management team: they will ask for metrics out of the billing process for their use, such as monthly usage, billing metrics, sales organization, and rate plan, to perform sales rep and channel performance analysis and rate plan analysis. Depending upon the granularity of your data, your dimension tables can be: Customer (customer id, name, city, state, etc.), Sales rep (sales rep number, name), Sales org (sales org id), Bill dimension (bill #, bill date, customer id), and Rate plan (rate plan code), and the fact table can be Billing details (bill #, minutes used, call details, etc.). You can follow a star or snowflake schema in this case; a rough SQL sketch of the star version is given below.

what is the difference between stop and abort?
Stop: if the session you want to stop is part of a batch, you must stop the batch; if the batch is part of a nested batch, stop the outermost batch. When you issue a stop, the data query against the source database is stopped immediately, but whatever data has already been loaded into the buffers continues to be transformed and loaded.
Abort: you can issue the abort command; it is the same as stop, except that it has a 60-second timeout. The PowerCenter Server handles the abort command for the Session task like the stop command, except for the timeout period: if the PowerCenter Server cannot finish processing and committing data within the 60-second timeout, it kills the DTM process and terminates the session.

what is the difference between a cached lookup and an uncached lookup? Can I run the mapping without starting the Informatica server?
When you configure the Lookup transformation as a cached lookup, it stores all the lookup table data in the cache when the first input record enters the transformation; the SELECT statement executes only once, and the values of each input record are compared with the values in the cache. In an uncached lookup, the SELECT statement executes for every input record entering the Lookup transformation, and it has to connect to the database each time a new record enters. (And no, a mapping cannot run without the Informatica server.)
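Returning to the telecom billing star schema sketched in the questionnaire answer above, here is a minimal SQL sketch of what it could look like. All table and column names are illustrative, not from any real telecom model, and the time dimension is only referenced, not shown.

    CREATE TABLE customer_dim (
        customer_key  INTEGER PRIMARY KEY,   -- surrogate key
        customer_id   VARCHAR2(20),
        name          VARCHAR2(100),
        city          VARCHAR2(50),
        state         VARCHAR2(50)
    );

    CREATE TABLE rate_plan_dim (
        rate_plan_key  INTEGER PRIMARY KEY,
        rate_plan_code VARCHAR2(20)
    );

    CREATE TABLE billing_fact (
        bill_no        VARCHAR2(20),
        bill_date_key  INTEGER,                            -- FK to a time dimension
        customer_key   INTEGER REFERENCES customer_dim,
        rate_plan_key  INTEGER REFERENCES rate_plan_dim,
        minutes_used   NUMBER,
        bill_amount    NUMBER
    );

The surrogate keys in each dimension appear as foreign keys in the fact table, which is the pattern the later answers on star schemas and surrogate keys describe.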
The target file has duplicate records even though the source tables have single records; I am selecting data from 5 different SQL tables using a user-defined join. How do I debug this?
Generate the SQL in the Source Qualifier transformation and try to run the same query in TOAD: if the join itself produces duplicates, correcting the join condition solves the problem.

In a workflow, can we send multiple emails?
Yes, we can send multiple emails in a workflow.

What is the difference between a materialized view and a data mart? Are they the same?

what is incremental loading in Informatica (that is, loading only updated information from the source)? how and where do you use it in Informatica?

What about rapidly changing dimensions? Can you analyze them with an example?
Rapidly changing dimensions are those in which values change continuously, making them very difficult to maintain. I am giving one of the best real-world examples, which I found on a website while browsing; please follow it carefully by imagining the structures of the tables, and I am sure you will like it. The description of a rapidly changing dimension by that person: "I'm trying to model a retailing case. I have a SKU dimension of around 150,000 unique products, which is already an SCD Type 2 for some attributes. In addition, I want to track changes of the sales and purchase prices. However, these prices change almost daily for quite a lot of these products, leading to a huge dimension table and requiring continuous updates." A better option would be to shift those rapidly changing attributes (the prices) into a fact table as facts, which solves the problem.

can anyone tell why we populate the time dimension only with scripts, not with a mapping?

How do we load from a PL/SQL script into an Informatica mapping?
You can use the Stored Procedure transformation. There you can specify the PL/SQL procedure name; when we run the session containing this transformation, the PL/SQL procedure will get executed.

how do you write a filter condition to get all the records of employees hired between any two given dates?
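The hire-date question just above is left unanswered in the text. A minimal sketch of a condition that would work in an Informatica Filter transformation follows; EMP/HIREDATE are the usual sample-schema names, and the two dates are placeholders to substitute.

    HIREDATE >= TO_DATE('01/01/2000', 'MM/DD/YYYY') AND
    HIREDATE <= TO_DATE('12/31/2004', 'MM/DD/YYYY')

The equivalent could also be pushed into the Source Qualifier as a SQL override with a WHERE ... BETWEEN clause.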
1) How do you load the data (like refresh load, hourly load, quarterly load, etc.)? 2) what is a lookup? (Lookup is a join: is that right or wrong?)
A Lookup transformation is not a join operation. In simple terms, you check in a table whether any column satisfies the condition you are specifying; if it does, one or more columns of the satisfying row are used in the mapping. If you want detailed information on the Update Strategy, refer to the answer given to the question on the Update Strategy transformation.

Incremental loading is updating old rows and inserting newly arrived rows, and in each and every real-time data warehouse project this incremental loading is important. It is done in three ways, by using transformations: 1. Update Strategy transformation, 2. Dynamic Lookup transformation, 3. Aggregator transformation. At the session level, the Properties tab has a performance option called "Incremental Aggregation"; if you enable this property, the session captures only newly arrived records from the source. In the mapping, we can use either an Aggregator transformation or a Dynamic Lookup transformation, with an Update Strategy or Filter transformation, to update or insert the newly captured records, as sketched below.
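A minimal sketch of the usual incremental-load expression in an Update Strategy transformation, assuming a lookup on the target that returns a null key for rows not yet loaded (the port name is illustrative):

    IIF(ISNULL(lkp_target_empid), DD_INSERT, DD_UPDATE)

New rows are flagged for insert; rows already present in the target are flagged for update.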
what is the architecture of any data warehousing project? what is the flow?
1) The basic step of data warehousing starts with data modelling, i.e., the creation of dimensions and facts. 2) The data warehouse starts with the collection of data from source systems such as OLTP, CRM, ERPs, etc. 3) The cleansing and transformation process is done with an ETL (Extraction, Transformation, Loading) tool. 4) By the end of the ETL process, the target databases (dimensions, facts) are ready with data which satisfies the business rules. 5) Finally, with the use of reporting (OLAP) tools, we can get the information which is used for decision support.

A Lookup transformation is used to look up data in a relational table, synonym, or view. The Informatica server queries the lookup table based on the lookup ports in the transformation; if it finds a match, the lookup succeeds, and you can pass the required columns of the matched row on for further processing. Lookup is mainly used when the Informatica server does not have enough information on its own. Its main uses are: 1. Get related values. 2. Perform calculations. 3. Update slowly changing dimensions (here we use the target table as the lookup table and again connect to the target table). It is often used to find present values like currency exchange rates or stocks. Example: take a source table REWARDS, which contains the empids of employees who got rewards; import the existing EMP table into a Lookup transformation, look up (match) the empid fields of both tables, then take the sal attribute and, using an Expression transformation, increment the salary by Rs 1000.

How do you load the time dimension?
We can use SCD Type 1/2/3 to load any dimension, based on the requirement. You can also load the time dimension manually, by writing scripts in PL/SQL to load the time dimension table with values for a period. For example, if my business data covers 5 years, from 2000 to 2004, then load all the dates starting from 1-1-2000 to 31-12-2004; that is around 1,825 records, which you can produce quickly by writing a script such as the one sketched below.
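A minimal PL/SQL sketch of the kind of time-dimension script described above; the table and column names are illustrative, and a real calendar table would carry many more attributes.

    DECLARE
      v_date DATE := TO_DATE('01-01-2000', 'DD-MM-YYYY');
    BEGIN
      WHILE v_date <= TO_DATE('31-12-2004', 'DD-MM-YYYY') LOOP
        INSERT INTO time_dim (cal_date, cal_year, cal_month, cal_day)
        VALUES (v_date,
                EXTRACT(YEAR FROM v_date),
                EXTRACT(MONTH FROM v_date),
                EXTRACT(DAY FROM v_date));
        v_date := v_date + 1;   -- advance one day
      END LOOP;
      COMMIT;
    END;
    /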
why did you use the Update Strategy in your application?
The Update Strategy is the most important of all the Informatica transformations. It is used to drive the data to be inserted, updated, deleted, or rejected depending upon some condition: when records come to this transformation, we can decide, as our requirement demands, whether to insert, update, or reject the rows flowing through the mapping. The basic thing one should understand about it is that it is the essential transformation for performing DML operations on already-populated targets (i.e., targets which contain some records before this mapping loads data). DD_INSERT, DD_UPDATE, DD_DELETE, and DD_REJECT are the decode options which perform the respective DML operations. We can also specify conditions based on which we derive which update strategy to use, e.g. IIF(condition, DD_INSERT, DD_UPDATE): if the condition is satisfied, do DD_INSERT, otherwise do DD_UPDATE. For example, if you want to do update and insert in one mapping, you create two flows and make one insert and one update, depending upon some condition; or take an input row and, if it is already there in the target (we find this with a Lookup transformation), update it, otherwise insert it. You can set the strategy at the session level too, but there you cannot define any condition. Refer to "Update Strategy" in the Transformation Guide for more information.

In an Update Strategy, which gives more performance, a target table or a flat file? Why?
Pros: loading, sorting, and merging operations will be faster on a flat file, as there is no index concept and the data will be in ASCII mode. Cons: there is no concept of updating existing records in a flat file. (We use stored procedures for populating and maintaining databases in our mappings.)

How do you create a single Lookup transformation using multiple tables?
Write an override SQL query, and adjust the ports as per the SQL query; a sketch follows.
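A minimal sketch of a lookup SQL override that joins two tables into one lookup source; the table and column names are illustrative, and the selected columns must line up with the lookup ports.

    SELECT c.customer_id   AS CUSTOMER_ID,
           c.customer_name AS CUSTOMER_NAME,
           a.account_no    AS ACCOUNT_NO
    FROM   customers c,
           accounts  a
    WHERE  c.customer_id = a.customer_id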
1. can you explain one critical mapping? 2. (see the performance question below)

How can you improve the performance of the Aggregator transformation?
By using a Sorter transformation before the Aggregator transformation. We can improve Aggregator performance in the following ways: 1. Send sorted input. 2. Increase the aggregator cache size, i.e., the index cache and data cache. 3. Give the transformation only the input/output ports you need, i.e., reduce the number of input and output ports.

Why did you use a stored procedure in your ETL application?
Usage of a stored procedure has the following advantages: 1. it checks the status of the target database; 2. it drops and recreates indexes; 3. it determines if enough space exists in the database; 4. it performs a specialized calculation. A stored procedure in Informatica is useful for imposing complex business rules. There is also a function called DECODE, to which we can pass the arguments 0, 1, 2, 3: DECODE(0), DECODE(1), DECODE(2), DECODE(3) for insertion, update, deletion, and rejection respectively.

As to the performance issue, which one is better: a connected Lookup transformation or an unconnected one?
It depends on your data and the type of operation you are doing. If you need to calculate a value for all the rows, or for most of the rows coming out of the source, then go for a connected lookup; if that is not so, then go for an unconnected lookup. This applies especially in conditional cases: say we have to get the value for a field 'customer' either from the ORDER table or from the CUSTOMER_DATA table, on the basis of the following rule: if customer_name is null then customer = customer_data.customer_id, otherwise customer = order.customer_name. In this case we will go for an unconnected lookup, since the lookup is needed for only some of the rows.
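A minimal sketch of how that conditional rule could be written in an Expression transformation, calling an unconnected lookup only when it is needed; the :LKP call syntax is standard, while the lookup name and ports are illustrative.

    IIF(ISNULL(CUSTOMER_NAME),
        :LKP.LKP_CUSTOMER_DATA(CUSTOMER_ID),
        CUSTOMER_NAME)

The lookup query runs only for rows where CUSTOMER_NAME is null, which is exactly why the unconnected form wins here.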
how do I enter the same record twice in a target table? give me the syntax. Suppose I have 10 records in the source; I want the same 10 records loaded twice, so that I get all records twice, i.e., 20 records, in a single target table.
Declare the target table twice in the mapping and move the output to both target tables; it is just a virtual copy of the same target, so you will find the duplicate rows in your target. Or use a pre-SQL statement with this syntax: insert into table1 select * from table1 (table1 is the name of the table). But this is a hardcoding method: if you change the column names or put extra columns in the table, you will have to change the insert statement.

when we create a target as a flat file and the source as Oracle, how can I specify the first row as column names in the flat file?
For getting the column names in the flat files, you need to make a change in the Informatica server setup: under Informatica Server setup -> configuration, there is an option you need to check (output metadata for target flat files).

how do you get the records starting with a particular letter, like A, in Informatica?
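The "starting with a particular letter" question above is not answered in the text. A minimal sketch of a Filter transformation condition that would do it, assuming a string port called NAME:

    SUBSTR(NAME, 1, 1) = 'A'

The same effect could be pushed to the database with a SQL override using WHERE name LIKE 'A%'.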
how many types of dimensions are available in Informatica?
One answer: there are 3 types of dimension tables: 1. stand-alone, 2. local, 3. global. Another answer: there are 3 types of dimensions: 1. star schema, 2. snowflake schema, 3. galaxy schema. There are also 3 types of dimensions available according to my knowledge, that are: 1. General dimensions, 2. Conformed dimensions, 3. Junk dimensions. I want each and every one of you who is answering: please don't make fun of this. Someone gave the answer "star schema, snowflake schema, etc.", but how can a schema come under a type of dimension? ANSWER: One major classification we use in our real-time modelling is Slowly Changing Dimensions. Type 1 SCD: if you load an updated row over a previously existing row, the previous data will be replaced, so we lose historical data. Type 2 SCD: here we add a new row for the updated data, so we have both current and past records; this agrees with the data warehousing concept of maintaining historical data, and the mostly used one is the Type 2 SCD. Type 3 SCD: here we add new columns. We have one more type of dimension, the CONFORMED DIMENSION: a dimension which gives the same meaning across different star schemas is called a conformed dimension; for example, the Time dimension gives the same meaning wherever it is used.

How can you say that the Union transformation is an active transformation?
Someone gave the answer "no", but we can merge the records of multiple Source Qualifier queries in a Union transformation at the same time; it is not like an Expression transformation (which works row by row), so we can say it is active. It normally has multiple input groups, compared to other transformations.
(Before the Union transformation was implemented, "changes the number of rows" was the exact rule for an active transformation, i.e., before 7.0; now it is not the exact benchmark to determine an active transformation.)

what is the difference between a dynamic and a static lookup? why and when do we use these types of lookups (i.e., dynamic and static), and connected versus unconnected?
With a static lookup cache, you cache all the lookup data at the start of the session; a static lookup cache cannot be inserted into or updated, so the rows in the cache remain constant. It adds to the session run time, but it saves time overall, as Informatica does not need to connect to your database every time it needs to look up a value. Also remember that a static cache eats up space, so select only those columns which are needed. With a dynamic lookup cache, you can insert rows into the cache as you pass them to the target. Otherwise, without a cache, you go and query the database to get the lookup value for each record which needs the lookup. Use a connected lookup when the lookup is needed only once for most rows, and an unconnected lookup to avoid multiple lookups of the same table; depending on how many rows in your mapping need a lookup, you can decide.

How many types of facts are there, and what are they?
I know some: there are Additive facts (fact data that can be added/aggregated), Non-additive facts (facts that are not the result of addition), Semi-additive facts (only a few columns of data can be added), and Factless facts (facts without any measures). Fact tables include the Transaction fact table, the Periodic fact table (which stores only one row per set of transactions that happened over a period of time), and the Accumulating fact table (which stores a row for the entire lifetime of an event).

how do we load data by using a period dimension?

how do we do unit testing in Informatica? how do we load data in Informatica?
Unit testing is of two types: 1. Quantitative testing, 2. Qualitative testing. The steps are given below.
Quantitative testing steps: 1. First validate the mapping. 2. Create a session on the mapping and then run the workflow. Once the session has succeeded, right-click on the session and go to the statistics tab; there you can see how many source rows were applied, how many rows were loaded into the targets, and how many rows were rejected. This is called quantitative testing. If the rows are loaded successfully, then we go for qualitative testing. Qualitative testing steps: 1. Take the DATM (the DATM is where all business rules are mentioned against the corresponding source columns) and check whether the data is loaded into the target table according to the DATM. 2. If any data is not loaded according to the DATM, go and check the code and rectify it. This is called qualitative testing, and this is what a developer will do in unit testing.

what transformation can you use in place of a lookup?
You can use the Joiner transformation, setting it as an outer join of either master or detail. With lookups we can use either the first or the last matching value; suppose a lookup has more than one matching record and we need all matching records: in that situation we can use a master or detail outer join instead of a lookup (according to the logic).

Why and where are we using a factless fact table?
Hi, I am not sure, but you can confirm with other people: factless fact tables are fact tables with no facts or measures (numerical data); they contain only the foreign keys of the corresponding dimensions. A factless fact is nothing but non-additive measures; for example, temperature in a fact table will be noted as moderate, low, or high, and this kind of thing is a non-additive measure. Such fact tables are required to avoid flaking of levels within a dimension and to define them as a separate cube connected to the main cube.
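A minimal DDL sketch of a factless fact table in the sense described above; the event (say, attendance) and all names are illustrative.

    CREATE TABLE attendance_fact (
        date_key     INTEGER NOT NULL,   -- FK to the time dimension
        student_key  INTEGER NOT NULL,   -- FK to the student dimension
        course_key   INTEGER NOT NULL    -- FK to the course dimension
    );
    -- No measure columns: the existence of a row is itself the "fact".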
How can I get distinct values while mapping in Informatica during insertion?
There are two methods to get distinct values. If the sources are databases, then we can go for a SQL override in the Source Qualifier, changing the default SQL query (or selecting the check box called "select distinct"). If the sources are heterogeneous, i.e., from different file systems, then we can use a Sorter transformation and, in the transformation properties, select the check box called "select distinct", same as in the Source Qualifier; that way we get distinct values. You can also add an Aggregator before the insert and group by the fields that need to be distinct.

How do you use an unconnected lookup, i.e., from where is the input taken and where is the output linked? What condition is to be given?
The unconnected lookup is used just like a function call. Call the unconnected lookup in an expression output/variable port, or any place where an expression is accepted (like the condition in an Update Strategy), with something like :LKP.lkp_abc(input_port) (please check the exact syntax; lkp_abc is the name of the unconnected lookup). You give the input value just as we pass parameters to functions, and it returns the output after looking up.

What is change data capture?
Changed Data Capture (CDC) helps identify the data in the source system that has changed since the last extraction. With CDC, data extraction takes place at the same time as the insert, update, or delete operations occur in the source tables, and the change data is stored inside the database in change tables. The change data, thus captured, is then made available to the target systems in a controlled manner.

how can we store previous session logs?
We can do it this way: use $PMSessionLogCount (specify the number of runs of the session log to save).

How do you call a stored procedure from the Workflow Monitor in Informatica 7.1?

How do you define the Informatica server?
The Informatica server is the main server component in the Informatica product family. It is responsible for reading the data from the various source systems, transforming the data according to the business rules, and loading the data into the target table.

How do we do complex mapping by using flat files / relational databases?

How do you move a mapping from one database to another? Do you mean migration between repositories? There are two ways of doing this. 1. Open the mapping you want to migrate; go to the File menu, select 'Export Objects', and give a name; an XML file will be generated. Then connect to the repository where you want to migrate, select 'Import Objects' from the File menu, and select the XML file name.
2. Connect to both the repositories and open the respective folders, keeping the destination repository as active. Go to the source folder and select the mapping name from the object navigator, then select 'Copy' from the 'Edit' menu; go to the target folder (be sure you open the target folder) and select 'Paste' from the 'Edit' menu. Or, from the navigator panel, just drag and drop the mapping to the work area; it will ask whether to copy the mapping, say YES, and it's done. Note that there is no need for a new mapping for a new database: you just need to change the connections in the Workflow Manager to run the mapping on another database.

My source has 1000 rows, and I have brought 300 records into my ODS. Whenever I start the workflow again, it loads from the beginning; next time I want to load only the remaining records, i.e., from the 301st record. Is there any solution where the server can read from the exact point where it stopped the previous time?
The exact solution to the problem is in the session properties: there is a recovery check box option (I don't remember exactly where the option is). When it is enabled, the session loads only the remaining rows, so the first 300 records will not be reloaded into the target. You can also solve it these ways: 1. Do a lookup on the target table and check for the rows already present there. 2. Use a mapping variable to store the row number of the final row you loaded in the target, where you mention the start value and the end value; the next time you start the session, use this variable to load the rest of the data. But note that even though you are storing the row number in a variable, the server will still read the source from the beginning, which is a waste of server resources. 3. Use a Filter transformation with a condition such as rownum > 300. 4. If your source is a flat file, try the source definition in the Source Analyzer, or the session properties: there is an option to specify from which row we want to read the data (actually used to eliminate reading headers), so it will work. 5. You can also do it with a Sequence Generator transformation, by changing the RESET option in its properties tab.
When you end your session, the variable will take the next value as the start value next time; as you said, if the first value is 1 and you ended at 300, then next time it will take 301 as the start value. Got it!

What is the difference between PowerCenter 6 and PowerCenter 7?
1) You can look up flat files in Informatica 7.x, but we cannot look up flat files in Informatica 6.x. 2) The External Stored Procedure transformation is not available in Informatica 7.x, but this transformation was included in Informatica 6.x; on the other hand, the Custom transformation is not available in 6.x. Session-level error handling is available in 7.x, along with XML enhancements for data integration in 7.x. The main difference is the version control available in 7.x.

What is the difference between PowerCenter 7 and PowerCenter 8? (In PowerCenter 7, ...)

In PowerCenter 7.1.2, how can we take a backup of a versioned repository such that I get a non-versioned repository when I restore it? Creating a new (non-versioned) repository and then copying all the folders is an option, but is there any other, better way? I need a way to get a non-versioned repository from a versioned one.

What is the process flow of Informatica?
I was working in SQL Server, and now I have got a golden opportunity to work in INFORMATICA, so please guide me properly; I have lots of (silly) questions to ask while building my career. Informatica is an ETL tool, used for the Extraction, Transformation, and Loading of data. This tool is used to extract the data from different databases, then perform the required transformations, like data type conversions, ordering, filtering, some aggregations, and so on; after that we can load the transformed data into our database, which will be used for business decisions.

Is a fact table normalized or de-normalized?
One answer: a flat table is always normalized, since it has no redundancies. Another: a fact table is always a DENORMALIZED table; it consists of the dimension tables' primary keys as foreign keys, plus measures. The main funda of a DW is de-normalizing the data for faster access by the reporting tool, so if you are building a DW, 90% of it has to be de-normalized, and of course the fact table has to be denormalized. I was confused, so we should ask Kimball. Here is the comment (August 3, 2005). Fable: dimensional models are fully denormalized.
Fact: dimensional models combine normalized and denormalized table structures. The dimension tables of descriptive information are highly denormalized, with detailed and hierarchical roll-up attributes in the same table, while the fact tables with performance metrics are typically normalized. While we advise against a fully normalized model with snowflaked dimension attributes in separate tables (creating blizzard-like conditions for the business user), a single denormalized "big wide table" containing both metrics and descriptions in the same table is also ill-advised.

How can we join 3 databases, like a flat file, Oracle, and DB2, in Informatica?
You have to use two Joiner transformations: the first one will join two tables, and the next one will join the third with the resultant of the first joiner.

can we eliminate duplicate rows by using the Filter and Router transformations? if so, explain in detail.
We can eliminate the duplicate rows by checking the distinct option in the properties of the transformation (the option comes in the properties tab). You can use a SQL query for uniqueness if the source is relational, but if the source is a flat file, then you should use a Sorter or Aggregator transformation. Meanwhile, you can also do it by using a Filter, or a Router transformation by giving a condition there; note that the input rows and output rows might not match then.

what is the tracing level?
It is the level of information storage in the session log. It can be Verbose Initialisation, Verbose Data, Normal, or Terse; by default it remains "Normal".

how is the Union transformation an active transformation?
An active transformation is one which can change the number of rows; the number of rows coming out of a Union transformation might not match the incoming rows.
Assume there are 3 depts in the dept table, one with 100 people, the second with 5, and the third with some 30, and I want to display those deptnos where more than 10 people exist:

    select count(*), deptno from dept group by deptno having count(*) >= 10;

If you want to perform it through Informatica, fire the same query in the SQL override of the Source Qualifier transformation and make a simple pass-through mapping; otherwise, use a Router transformation, giving the condition there (on the count being >= 10). Note that if a table is not analysed, Oracle will go with a full table scan.

why is the Sorter transformation an active transformation?
It allows us to sort data either in ascending or descending order according to a specified field; the port on which the sorting takes place is called the sort key port. In the properties you can select distinct (eliminate duplicates), case sensitive (valid for strings), and "null treated low" (null values are given the least priority). If any transformation has the distinct option, then it will be an active one, because an active transformation is nothing but a transformation which changes the number of output records: if you select the distinct option, it discards duplicate rows compared during the sort operation, which in turn decreases the number of output records compared to input records; otherwise it returns all the rows. (For an active transformation, the number of records passing through and their row IDs will be different; it depends on the row ID also.)

Active transformation: a transformation that changes the number of rows in the target, i.e., Source (100 rows) ---> Active Transformation ---> Target (< or > 100 rows). Passive transformation: a transformation that does not change the number of rows in the target, i.e., Source (100 rows) ---> Passive Transformation ---> Target (100 rows). In a Union transformation, we may combine the data from two (or more) sources; assume Table-1 contains 10 rows and Table-2 contains 20 rows: if we combine the rows of Table-1 and Table-2, we will get a total of 30 rows in the target, so it is definitely an active type. (Some call it a passive transformation responsible for merging data coming from different sources, and for specifying whether the output rows should be distinct, but since the row count changes, it is active.) The Union transformation functions very similarly to the UNION ALL statement in Oracle. A Lookup, by contrast, in no way changes the number of rows; it just looks at the referring table, and the number of records increases or decreases only through the transformations that follow the Lookup transformation.
One more thing: an active transformation can also behave like a passive one.

what is the difference between constraint-based load ordering and a target load plan?
Constraint-based load ordering example: Table1 is the master and Table2 is the detail; if the data in Table1 is dependent on the data in Table2, then Table2 should be loaded first. In such cases, to control the load order of the tables we need some conditional loading, which is nothing but constraint-based loading. In Informatica this feature is implemented by just one check box at the session level; constraint-based loading is a session property. If you have only one source and it has to be loaded into multiple targets, you use constraint-based loading; here the multiple targets must be generated from one source qualifier, and the target tables must possess primary/foreign key relationships, since the server loads according to the key relations, irrespective of the target load order plan. If you have multiple source qualifiers, you have to use the target load order. The target load order comes in the Designer property: click the Mappings tab in the Designer and then Target Load Plan; it will show all the target load groups in the particular mapping (a target load group is a set of source, source qualifier, transformations, and target). You specify the order there, and the server will load the targets accordingly.

How do we analyse the data at the database level?
Data can be viewed using Informatica's Designer tool: if you want to view the data of a source/target, we can preview the data, but with some limitations. We can use data profiling too. We use this to check the results and also to insert, update, or delete records for slowly changing dimensions using the Update Strategy.

How do you export mappings to the production environment?
In the Designer, go to the main menu, and one can see the export/import options. Import the exported mapping into the production repository with the replace option.

when do we use a dynamic cache and when do we use a static cache in a connected and an unconnected Lookup transformation?
We use a dynamic cache only for a connected lookup; we use the dynamic cache to check whether the record already exists in the target table or not. A static cache is the default cache in both connected and unconnected lookups; if you select a static cache on a lookup table in Informatica, it won't update the cache, and the rows in the cache remain constant.

Hi, can anyone tell me how to run SCD1? It creates two target tables in the mapping window, and there is only one table in the Warehouse Designer (the target), so if we create one new table in the target it gives an error. Have you implemented it using the wizards?
If so, create the target with the name you gave in the wizard for the target table, i.e., in the Warehouse Designer create and execute the target definition, and run the session containing the mapping again. Don't create the target again for the second instance; it is just a virtual copy of the same target. Define the source and target locations in the general properties of the session, and set "treat rows as" to Data Driven. Check this once and let me know. Bye, Mayee.

how will you create a header and footer in the target using Informatica?
If your focus is on flat files, then you can set it in the file properties while creating a mapping, or at the session level in the session properties. You can always create a header and a trailer in the target file using an Aggregator transformation: create three separate files in a single pipeline. One will be your header, and another will be your trailer coming from the Aggregator (take the number of records as a count in the Aggregator transformation); the third will be your main file. Concatenate the header and the main file in the post-session command using a shell script, as sketched below.
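A minimal sketch of that post-session concatenation step as a shell one-liner; the file names are illustrative.

    cat header.txt main.txt trailer.txt > final_target.dat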
What is a view? How is it related to data independence? What are the different types of views, and what is a materialized view?
As per the definition, a view is a combination of one or more tables. A view does not store the data; it just stores the query. If we execute the query, the query fetches the data from the tables and presents it to us as a view. A view is just a query which is parsed and stored in the SGA, so whenever this view is referred to in a query, it can be executed with no loss of time for parsing. Through views we can hide the complex and big names of the tables; one bigger advantage is that, just by creating a view once, we can use it in many places. A materialized view, which was introduced in Oracle 8, actually stores data, like a table.

Can you tell me how to go about SCDs and their types? Where do we use them mostly?
The "Slowly Changing Dimension" problem is a common one particular to data warehousing. In a nutshell, this applies to cases where the attribute for a record varies over time. We give an example below: Christina is a customer with ABC Inc. She first lived in Chicago, Illinois, so the original entry in the customer lookup table has the following record:

    Customer Key  Name       State
    1001          Christina  Illinois

At a later date, she moved to Los Angeles, California, on January 15, 2003. How should ABC Inc. now modify its customer table to reflect this change? This is the "Slowly Changing Dimension" problem. There are in general three ways to solve this type of problem, and they are categorized as follows.

In Type 1 Slowly Changing Dimension, the new information simply overwrites the original information; in other words, no history is kept. After Christina moved from Illinois to California, the new information replaces the old record, and we have the following table:

    Customer Key  Name       State
    1001          Christina  California

Advantages: this is the easiest way to handle the Slowly Changing Dimension problem, since there is no need to keep track of the old information. Disadvantages: all history is lost; by applying this methodology, it is not possible to trace back in history. For example, in this case, the company would not be able to know that Christina lived in Illinois before. Usage: about 50% of the time. When to use Type 1: Type 1 should be used when it is not necessary for the data warehouse to keep track of historical changes.

In Type 2 Slowly Changing Dimension, a new record is added to the table to represent the new information; therefore, both the original and the new record will be present, and the new record gets its own primary key. In our example, after Christina moved, we add the new information as a new row into the table:

    Customer Key  Name       State
    1001          Christina  Illinois
    1005          Christina  California

Advantages: this allows us to accurately keep all historical information. Disadvantages: this will cause the size of the table to grow fast, and in cases where the number of rows for the table is very high to start with, storage and performance can become a concern; this also necessarily complicates the ETL process. Usage: about 50% of the time. When to use Type 2: Type 2 should be used when it is necessary for the data warehouse to track historical changes.

In Type 3 Slowly Changing Dimension, there will be two columns to indicate the particular attribute of interest, one indicating the original value and one indicating the current value; there will also be a column that indicates when the current value becomes active. To accommodate Type 3 in our example, we will now have the following columns: Customer Key, Name, Original State, Current State, Effective Date. After Christina moved (assuming the effective date of the change is January 15, 2003):

    Customer Key  Name       Original State  Current State  Effective Date
    1001          Christina  Illinois        California     15-JAN-2003

Advantages: this does not increase the size of the table, since the new information is updated, and it allows us to keep some part of history. Disadvantages: Type 3 will not be able to keep all history where an attribute is changed more than once; for example, if Christina later moves to Texas on December 15, 2003, the California information will be lost. Usage: Type 3 is rarely used in actual practice. When to use Type 3: Type 3 should only be used when it is necessary for the data warehouse to track historical changes, and when such changes will only occur a finite number of times.
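A minimal SQL sketch of the Type 2 pattern from the example above. The current_flag column is an illustrative addition that real Type 2 designs commonly carry; it is not part of the original example.

    -- Expire the old version of the row
    UPDATE customer_dim
    SET    current_flag = 'N'
    WHERE  customer_key = 1001;

    -- Insert the new version with its own surrogate key
    INSERT INTO customer_dim (customer_key, name, state, current_flag)
    VALUES (1005, 'Christina', 'California', 'Y');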
I have a situation here loading a table in Informatica. I have 5 temporary tables as sources; they look like the following (K - key, X/Y/Z - values, N - null):

    Table1:
    K1 X N N N N
    K2 X N N N N

    Table2:
    K1 N X N N N
    K2 N X N N N

The other 3 tables are in the same way. But there can be a situation where any of the tables contains duplicates for a key, like:

    K1 X N N N N
    K1 Y N N N N

This kind of record should be errored out. Because of this, we can't use an aggregator/group-by, as we are not sure which row should be removed. How can we obtain this functionality in Informatica?
Do the following. Map1: Source instance s1 --- Source Qualifier --- Aggregator transformation --- t1 (target); in the Aggregator transformation, create a count port (output port) per key port and group by the key port. Map2: Source instance (t1) --- Source Qualifier --- take your original ports, with the duplicate-key records eliminated, and now do your requirement design. Thanks -- afzal

what are the different types of transformations available in Informatica, and what are the mostly used ones among them?
Mainly there are two types of transformations. 1] Active transformation: an active transformation can change the number of rows that pass through it from source to target, i.e., it eliminates rows that do not meet the condition of the transformation. 2] Passive transformation: a passive transformation does not change the number of rows that pass through it, i.e., it passes all rows through the transformation. Transformations can also be connected or unconnected. Connected transformation: a connected transformation is connected to other transformations, or directly to the target table, in the mapping. Unconnected transformation: an unconnected transformation is not connected to other transformations in the mapping; it is called within another transformation and returns a value to that transformation. List of transformations available in Informatica: 1. Source Qualifier transformation, 2. Expression transformation, 3. Filter transformation, 4. Joiner transformation, 5. Lookup transformation, 6. Normalizer transformation, 7. Rank transformation, 8. Router transformation, 9. Sequence Generator transformation, 10. Stored Procedure transformation, 11. Sorter transformation, 12. Update Strategy transformation, 13. XML Source Qualifier transformation, 14. Aggregator transformation, 15. Advanced External Procedure transformation, 16. External Procedure transformation, plus the Custom transformation. Mostly, the use of a particular transformation depends upon the requirement; in our project we are mostly using the Source Qualifier, Aggregator, Joiner, Lookup, and Custom transformations.
1] In a certain mapping there are four targets: tg1, tg2, tg3, and tg4. tg1 has a primary key; tg2 has a foreign key referencing tg1's primary key; tg2 also has a foreign key referencing the primary key of tg4; and tg3 has a primary key that tg2 and tg4 refer to as a foreign key. In which order will Informatica load the targets? 2] How can I detect an Aggregator transformation causing low performance?
With constraint-based loading, tg1 and tg3, being the master tables that don't have any foreign key reference to other tables, will be loaded first. Then tg4 will be loaded, as its master table tg3 has already been loaded, and at the end tg2 will be loaded, as all the master tables to which it refers have already been loaded. To optimize the Aggregator transformation, you can use the following options: use incremental aggregation; sort the ports before you perform aggregation; avoid using an Aggregator transformation after an Update Strategy, since it might be confusing.

how can we eliminate duplicate rows from a flat file?
Keep an Aggregator between the Source Qualifier and the target and choose "group by" on the key field; it will eliminate the duplicate records before loading to the target. Or use a Sorter transformation: we have a distinct property in the Sorter transformation, and if we enable that property, it will automatically remove the duplicate rows in flat files. When you configure the Sorter transformation to treat output rows as distinct, it configures all ports as part of the sort key and therefore discards duplicate rows compared during the sort operation. If you want to delete the duplicate rows in flat files, you can also go for a Rank transformation or an Oracle external procedure transformation: select all group-by ports and select one field for the rank. -- Nanda

How do we view and generate metadata reports (for a particular session, I would like to generate a report which shows the source table and source column against the related target table and target column)?

what is partitioning? where can we use partitions? what are the advantages? is it necessary?
In Informatica we can tune performance at 5 different levels: at the source level, target level, mapping level, session level, and network level. To tune performance at the session level we go for partitioning, and we have 4 types of partitioning: pass-through, round-robin, hash, and key-range; pass-through is the default one.
Within hash partitioning, again, we have 2 types: user-defined and automatic. Round-robin cannot be applied at the source level (it can be used at some transformation levels), while key-range can be applied at both the source and target levels.

In my source table there are 1000 records, and I want to load records 501 to 1000 into my target table. How can you do this?
You can override the SQL query in the Workflow Manager, like:

    select * from tab_name where rownum <= 1000
    minus
    select * from tab_name where rownum <= 500;

This will work fine; try it and get back to me if you have any issues with it. (A variant for the last n rows is: select * from (select col1, col2, rownum rn from table_name) where rn > (select max(rownum) - n from table_name).) In DB2 we write the statement as "fetch first 500 rows only". In Informatica we can also do it by using a Sequence Generator and filtering out the rows once the sequence exceeds 500.

what is a surrogate key? In your project, in which situation have you used it? Explain with an example.
A surrogate key is a system-generated/artificial key/sequence number, a substitution for the natural primary key; it is just a unique identifier or number for each row that can be used as the primary key for the table. The only requirement for a surrogate primary key is that it is unique for each row in the table. It is useful because the natural primary key (e.g., Customer Number in the Customer table) can change, and this makes updates more difficult; the DWH does not depend on primary keys generated by OLTP systems for internally identifying the records. When a new record is inserted into the DWH, primary keys are automatically generated; such keys are called SURROGATE KEYS, and they can be used as primary keys for the DWH. Advantages: 1. A flexible mechanism for handling SCDs (in my project, I felt that the primary reason for the surrogate keys was to record the changing context of the dimension attributes, particularly for SCDs). 2. We can save substantial storage space with integer-valued surrogate keys, and the reason for them being integers is that integer joins are faster.
What is the difference between a stored procedure (at the DB level) and the Stored Procedure transformation (at the Informatica level)? Again, why should we use the SP transformation?
First of all, stored procedures (at the DB level) are series of SQL statements, and they are stored and compiled at the server side; stored procedures are used to automate time-consuming tasks that are too complicated for standard SQL statements. In Informatica, the Stored Procedure transformation uses the same stored procedures which are stored in the database. If you don't want to use a stored procedure, then you have to create an Expression transformation and do all the coding in it.

what is the difference between STOP and ABORT at the Informatica session level?
Stop: we can restart the session. Abort: we cannot restart the session; we should truncate the whole pipeline and then start the session again.

how can we join tables if the tables have no primary and foreign key relation and no matching port to join?
What was said above is correct: it is not possible to join two tables without at least one common field or port, but without a common column or common datatype we can join two sources using dummy ports. 1. Add one dummy port in each of the two sources. 2. In an Expression transformation, assign '1' to each port (i.e., assign both ports a default value of 1; I guess you know how to assign a default value). 3. Use a Joiner transformation to join the sources using the dummy ports: add the 2 dummy ports with the same datatype (integer), one dummy port as detail and the other as master, and use these 2 ports in the join condition. 4. This ensures that all the records are matched; basically it will be a Cartesian product, the same as a select query without a where clause in Oracle.

if the workflow has 5 sessions running sequentially and the 3rd session has failed, how can we run again from only the 3rd to the 5th session?
If multiple sessions in a concurrent batch fail, you might want to truncate all targets and run the batch again. However, if one session in a concurrent batch fails and the rest of the sessions complete successfully, you can recover the failed session as a standalone session. To recover a session in a concurrent batch: 1. Copy the failed session using Operations > Copy Session. 2. Drag the copied session outside the batch, so it becomes a standalone session. 3. Follow the steps to recover a standalone session. 4. Delete the standalone copy. In a sequential workflow, you can simply start from the 3rd session by right-clicking on that session and selecting "Start workflow from this task"; this ensures that the workflow starts running from the 3rd session and continues till the last one.

how do you load the data from PeopleSoft HRM to PeopleSoft ERM using Informatica?
The following are necessary: 1. A PowerConnect license. 2. Import the source and target from PeopleSoft using ODBC connections. 3. Define the connection under the "Application Connection Browser" for the PeopleSoft source/target in the Workflow Manager; select the proper connection (PeopleSoft with Oracle, Sybase, DB2, or Informix) and execute it like a normal session.

what are the measure objects?
Aggregate calculations like sum, avg, max, and min: these are the measure objects.

What is meant by EDW?
EDW is an Enterprise Data Warehouse, which means a centralized DW for the whole organization: a single enterprise data warehouse with no associated data marts or operational data store (ODS) systems, the old style of warehouse. This approach is the Inmon approach, which relies on having a single, centralized warehouse, whereas the Kimball approach says to have separate data marts for each vertical/department. If the warehouse is built across a particular vertical of the company, it is called the EDW for that vertical; for example, if the warehouse is built across the sales vertical, then it is termed the EDW for the sales hierarchy. Advantages of having an EDW: 1. A global view of the data. 2. The same point of source of data for all the users across the organization. 3. The ability to perform consistent analysis on a single data warehouse. What has to be overcome is the time it takes to develop, and also the management required to build such a centralized database.

what is a hash table in Informatica?
Hash partitioning is the type of partitioning supported by Informatica where the hash user keys are specified. In hash partitioning, the Informatica Server uses a hash function to group rows of data among partitions; the server groups the data based on a partition key. Use hash partitioning when you want the Informatica Server to distribute rows to the partitions by group: for example, when you need to sort items by item ID but you do not know how many items have a particular ID number. Hash partitions are somewhat similar to database partitions, which is handy while handling partitioned tables: if the source is huge (in GB), NWAY partitions can be specified at the source side and the target side; if it can be executed without performance issues, then a normal load will work. (One comment claims the hash table is used to extract the data through the Java Virtual Machine.)

Can Informatica load heterogeneous targets from heterogeneous sources?
Yes, Informatica can load the data from heterogeneous sources to heterogeneous targets.

how do you load the time dimension?
The time dimension will generally be loaded manually, by using PL/SQL, shell scripts, Pro*C, etc.

What properties should be noted when we connect a flat file source definition to a relational database target definition?
1. Whether the file is fixed-width or delimited. 2. The size of the file. 3. The file reader, source file name, etc. We can also check the line sequential buffer length: instead of the default 1024 bytes, we can allow some more bytes of data.
How do you check the source for the latest records that are to be loaded into the target? For example, I loaded some records yesterday; today the file has been populated with some more records; how do I find the records populated today?
Use the incremental loading concept and check for existence in the target:
a) Create a lookup on the target table from the Source Qualifier based on the primary key; treat your target table as the lookup table.
b) Use an Expression to evaluate the primary key returned from the target lookup: for a new source record, the lookup's primary-key port returns NULL. Trap this with DECODE and proceed.
In the session, select "Treat source rows as" Update and, on the Mapping tab target view, enable Insert and Update as Insert; this ensures that only the new records get inserted and the old records are unchanged.
Alternatively, store the last_run_date (SYSDATE at each run) in a variable and use it the next time you load your target, or keep a timestamp column in the target table so that you can get the daily loaded record status. You can also check the session statistics from the Workflow Monitor.
How do testing people test mappings in Informatica? Are there any specific tools for unit testing, what other testing is there, and how do we do it as ETL developers?
There is no dedicated unit-testing method in Informatica, but there are two practical ways to test a mapping:
1. Use the Debugger and test the mapping with sample records.
2. Use data sampling: set the data sampling properties for the session in the Workflow Manager for a specified number of rows and test the mapping on that subset.
While running multiple sessions in parallel that load data into the same table, the throughput of each session becomes very low and almost the same for every session. How can we improve the performance (throughput) in such cases?
This is largely handled by the database being used: while a load on the table is in progress the table is locked, so parallel sessions contend with one another. If you try to load the same table through different partitions on an Oracle 9i database you can run into ROWID errors; a patch can be applied to resolve this issue.
In real time, which is better, star schema or snowflake? And the surrogate key will be linked to which columns in the dimension table?
In real-time projects the star schema is usually implemented because it takes less time (fewer joins). Every dimension table in a star schema has a surrogate key, and that surrogate key is assigned as a foreign key in the fact table.
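The lookup-on-target idea above can also be expressed directly in SQL. The following is only a sketch; stg_orders, dim_orders and order_id are assumed illustrative names. A source row whose key finds no match in the target is a new record.

  -- Rows in the staging/source table that do not yet exist in the target (new records).
  SELECT s.*
  FROM   stg_orders s
         LEFT OUTER JOIN dim_orders t ON t.order_id = s.order_id
  WHERE  t.order_id IS NULL;

Rows where the target key comes back non-NULL already exist and can be routed to the update path instead.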
What is the Rank transformation? Where can we use it?
Rank is an active transformation used to identify the top and bottom values based on a numeric port, i.e. to arrange records and select the TOP or BOTTOM records. For example, if we have a sales table in which several employees sell the same product and we need the first 5 or 10 employees who sell the most, we can go for a Rank transformation. By default it creates a RANKINDEX port to calculate the rank. (One answer compared it to arranging records in hierarchical order with START WITH and CONNECT BY PRIOR, but Rank is really about top/bottom-N selection, not hierarchies.)
If you have a rank index for the top 10 but pass only 5 rows to the Rank transformation, what will the output be?
It will rank only the 5 records it receives, based on the rank port.
What will happen if you are using an Update Strategy transformation and your session is configured for "insert"?
If you are using an Update Strategy in any of your mappings, then in the session properties you have to set "Treat source rows as" Data Driven. If you select Insert, Update or Delete instead, the Informatica Server will not consider the Update Strategy when deciding the DB operation. Alternatively, skip the Update Strategy in the mapping and use the session-level options: select Update in "Treat source rows as" together with the "Update else Insert" option, which does the same job.
What update strategy options can you set?
Insert: select this option to insert a row into a target table. Delete: select this option to delete a row from a table. Update, with three different options: Update as Update (update each row flagged for update if it exists in the target table), Update as Insert (insert each row flagged for update), Update else Insert (update the row if it exists, otherwise insert it).
What are the types of external loader available with Informatica?
For Oracle: SQL*Loader. For Teradata: TPump and MultiLoad (mload).
Where is the cache stored in Informatica?
The cache is stored in Informatica Server memory; overflow data is stored on disk in file format and is automatically deleted after the successful completion of the session run. If you want to keep that data, you have to use a persistent cache.
Can batches be copied/stopped from the Server Manager?
Yes, batches can be stopped using the Server Manager or the pmcmd command.
What are the real-time problems that generally come up while doing or running a mapping or transformation?
You may encounter connection failures; the server will handle (flag) syntax errors and invalid mappings. Other than that, little else typically goes wrong at run time.
What is the exact use of the 'Online' and 'Offline' server connect options while defining a workflow in the Workflow Monitor?
When the repository and the PM Server are both up, the Workflow Monitor always connects on-line. When the PM Server is down and the repository is still up, you are prompted for an off-line connection, with which you can only monitor the workflows that ran previously. The system hangs if you use the 'Online' connect option while the server is down. (In this example Informatica was installed on a personal laptop.)
How do you separate distinct values using a filter?
A Filter transformation can be used for loading the target with no duplicates, by writing a condition that rejects the rows flagged as duplicates.
How can you delete duplicate rows without using a dynamic lookup? Is there any other way, using a lookup, to delete the duplicate rows?
For example, you have a table Emp_Name with two columns, Fname and Lname, and the source table has duplicate rows. In the mapping, create an Aggregator transformation. Edit it, open the Ports tab, select Fname, tick the GroupBy check box and untick the (O) output port; select Lname, untick the (O) output port and tick GroupBy. Then create two new ports, untick their (I) input flag and put an expression on each: in the first new port the expression Fname, in the second the expression Lname. Close the Aggregator and link it to the target table, but be sure to have a PK in the target table. (The other transformations in the mapping can be fed directly from the source.)
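The Aggregator approach above (group by Fname and Lname so only one row per combination survives) corresponds to the following SQL, shown only as a sketch using the Emp_Name table from the example:

  -- One row per Fname/Lname combination, i.e. duplicates collapsed.
  SELECT fname, lname
  FROM   emp_name
  GROUP  BY fname, lname;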
What are the various test procedures used to check whether the data is loaded in the backend, the performance of the mapping, and the quality of the data loaded in Informatica?
1) Check the status in the Workflow Monitor and verify that the number of records in the source and the number of records actually loaded are equal. 2) Check the duration the workflow takes to succeed. 3) Check the session logs for the data loaded. If you want to know the performance of a mapping at transformation level, select the "Collect performance data" option in the session properties; at run time you can see it in the Monitor's performance tab, or get it from a file: the PowerCenter Server names the file session_name.perf and stores it in the same directory as the session log (if there is no session-specific directory for the session log, the PowerCenter Server saves the file in the default log files directory). The quality of the data loaded depends on the quality of the data in the source; if cleansing is required, perform the data cleansing operations in Informatica. Final data will always be clean if this is followed.
What are the common problems developers face while doing ETL development?
In a Joiner transformation, you should specify the source with fewer rows as the master source. Why?
In a Joiner transformation the Informatica Server reads all the records from the master source and builds index and data caches based on the master table rows. After building the caches, the Joiner reads records from the detail source and performs the joins, comparing each row of the master source against the detail source. The fewer unique rows in the master, the fewer iterations of the join comparison occur, which speeds the join process.
What happens if you try to create a shortcut to a non-shared folder?
It only creates a copy of the object.
What is a transaction?
A transaction can be defined as a DML operation: an insertion, modification or deletion of data performed by users, analysts or applications.
If you want to create indexes after the load process, which transformation do you choose? a) Filter transformation b) Aggregator transformation c) Stored Procedure transformation d) Expression transformation
Index creation is usually handled at session level, not at the mapping (transformation) level: you can create it with a post-session command, or create a Command task that executes a shell script (if Unix) or any other script containing the CREATE INDEX commands, and use this Command task in the workflow after the session.
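The script called by that post-session Command task simply needs to run the index DDL. A hedged sketch of the SQL it might execute follows; the index, table and column names are assumptions, not from the original answer.

  -- Recreate indexes after the bulk load completes.
  CREATE INDEX idx_fact_sales_cust ON fact_sales (customer_key);
  CREATE INDEX idx_fact_sales_date ON fact_sales (date_key);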
Can anybody write a session parameter file which will change the source and targets for every session, i.e. different sources and targets for each session run?
You define a parameter file. For flat files you can define two parameters, one for the source and one for the target file name, and use them in the session-level properties. Then define the parameter file, for example:
[folder_name.WF:workflow_name.ST:s_session_name]
$Src_file=c:\program files\informatica\server\bin\abc_source.txt
$tgt_file=c:\targets\abc_targets.txt
If the source/target is a relational database, you can even give an overridden SQL at the session level as a parameter; make sure the SQL is on a single line.
How do we remove the staging area?
This question is not logically meaningful: a staging area is just a set of intermediate tables used to store data from the source, which is then cleaned, transformed and put through the business logic. You can create or maintain these tables in the same database as your DWH or in a different DB. Once the source data has gone through this process, the data from staging is populated to the final fact table through a simple one-to-one mapping.
What is polling?
Polling displays the updated information about the session in the monitor window; the monitor window displays the status of each session when you poll the Informatica Server.
Informatica live interview questions — here are some of the interview questions I could not answer; can anybody help with answers for the others as well? Thanks in advance. Explain grouped cross tab. Explain reference cursor. What are parallel queries and query hints? What is metadata and the system catalog? What is a factless fact schema?
What is a conformed (confirmed) dimension? Which kind of index is preferred in a DWH? Why do we use a DSS database for OLAP tools?
A conformed dimension is one dimension that is shared by two (or more) fact tables.
Factless means a fact table without measures: it contains only foreign keys. There are two types of factless fact tables: event-tracking tables and coverage tables.
Bitmap indexes are preferred in data warehousing.
Metadata is data about data; in Informatica we can see the metadata in the repository, for example mappings, sessions, privileges, predefined filters and so on.
The system catalog is what we use in Cognos; it also contains data such as tables, privileges and other objects, and using this catalog we generate reports.
A reference cursor is a cursor that is not declared in the declaration section but in the executable section, where the table name can be supplied dynamically, so that the cursor can fetch data from that table.
A grouped cross tab is a single report that contains a number of crosstab reports based on the grouped items. One answer said a grouped cross tab in Cognos is a report type where 3 measures have to be assigned to get the result, but another respondent doubts that and feels it is wrong: a grouped cross tab can have only one measure, with the side and row headers grouped, for example:
India      2000   2001
  Mah      20K    30K
  Goa      39K    60K
China      2000   2001
  XYZ      45K    34K
  PQR      55K    66K
Here the cross tab is grouped on Country and then on State; similarly we can go further and drill a Year down to Quarters. This is what is known as a grouped cross tab. A fuller example, where the countries are the grouped items:
INDIA         Banglore   Hyderabad   Chennai
  M1          542        255         45
  M2          542        458         254
USA           LA         Chicago     Washington DC
  M1          457        458         7894
  M2          875        687         64
PAKISTAN      Lahore     Karachi     Islamabad
  M1          578        4785        548
  M2          5876       546         556
Rest of the answers are given by friends earlier. DSS stands for Decision Support System; a DSS database is nothing but a DWH. OLAP tools obviously use data from a DWH, which is transformed to generate reports. The purpose of a DWH is to provide users with data from which they can make their critical business decisions; these reports are used by the users and analysts to extract strategic information that helps in decision making.
What is meant by a junk attribute in Informatica?
A dimension is called a junk dimension if it contains attributes that are rarely changed or modified. In the requirement collection phase, all the attributes that are likely to be used in any dimension are gathered; while creating a dimension we use all the related attributes of that dimension from the gathered list. At the end, a dimension is created with all the left-over attributes, which is usually called the junk dimension, and its attributes are called junk attributes. Example from the banking domain: from the Overall_Transaction_master table we can fetch four attributes into a junk dimension, such as tput flag, tcmp flag, del flag and advance flag.
Where do we use the MQ Series source qualifier? Can you give a real-time example of where exactly you have used it, why you used the application multi-group source qualifier, and how you create a custom transformation?
We use an MQSeries SQ when we have an MQ messaging system as the source (a queue). When there is a need to extract data from a queue, which will basically carry messages in XML format, we use a JMS or an MQ source qualifier depending on the messaging system: if you have a TIBCO EMS queue, use a JMS source and a JMS SQ together with an XML Parser; if you have an MQ Series queue, use an MQ SQ, which will be associated with a flat file or a COBOL file definition. MQ Series is a product from IBM used for data integration from different types of databases, providing them as a single source of data.
Further discussion topics: talk about the Kimball vs. Inmon approaches; talk about the concepts of ODS and the information factory; talk about the challenges of real-time load processing vs. batch; partitioning.
Bitmap indexing — when do we use it, and how will bitmap indexing affect performance?
Bitmap indexing is an indexing technique used to tune the performance of SQL queries. The default index type is B-tree, which suits high-cardinality (normalized) data; bitmap indexes suit de-normalized data or low-cardinality columns. The rule of thumb quoted is that the number of DISTINCT values should be less than about 4% of the total rows; if that condition is satisfied, bitmap indexes will optimize the performance for this kind of table.
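As a small illustration of the bitmap-index recommendation above, the DDL in Oracle looks like this; fact_sales and region_code are assumed example names for a low-cardinality column:

  -- Bitmap index on a low-cardinality (few distinct values) column.
  CREATE BITMAP INDEX idx_fact_sales_region ON fact_sales (region_code);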
Can anyone explain incremental aggregation with an example?
When you use an Aggregator transformation, it creates index and data caches to store 1. the group-by columns and 2. the aggregate columns. Incremental aggregation is used when historical data is already in place and is reused in the aggregation: the cache holds the historical aggregates, and for each group-by value already present in the cache the incoming data value is added to its corresponding data-cache value and the row is output; when an incoming value has no match in the index cache, new values for the group-by and output ports are inserted into the cache. Incremental aggregation is used specifically to tune Aggregator performance: it captures the change each time (incrementally) you run the session and applies the aggregate functions only to the changed rows, not to the entire source. This improves performance because you are not reading the entire source on every run.
Somebody please explain these points to me: 1) the differences between using native and ODBC server-side database connections; 2) why registering a server to the repository is necessary; 3) the rules associated with transferring and sharing objects between folders; 4) the rules associated with transferring and sharing objects between repositories.
1) A native connection is provided by the same vendor as the tool/database, so the connection is faster and performance is better; for example, Oracle Warehouse Builder has its own driver to connect to an Oracle DB and does not use an ODBC driver. ODBC is basically a third-party driver layer, such as the Microsoft driver for Oracle, which can be used by any tool to connect to Oracle.
What is the difference between Rank and Dense Rank?
Rank: 1, 2, 2, 4, 5 — the same rank is assigned to equal totals/numbers, and because rank follows the position, the next rank skips ahead (golf results are usually ranked this way).
Dense Rank: 1, 2, 2, 3, 4 — the same rank is assigned to equal totals/numbers/names (only for duplicate values), and the next rank follows in sequence with no gap (this is usually how a gold-medal style ranking works), especially in the sales domain.
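The Rank vs. Dense Rank difference described above can also be reproduced with standard SQL analytic functions, outside Informatica. A sketch against an assumed sales_summary table:

  -- RANK leaves gaps after ties (1,2,2,4); DENSE_RANK does not (1,2,2,3).
  SELECT employee_id,
         total_sales,
         RANK()       OVER (ORDER BY total_sales DESC) AS rank_pos,
         DENSE_RANK() OVER (ORDER BY total_sales DESC) AS dense_rank_pos
  FROM   sales_summary;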
2) Registering a server to a repository is necessary because the sessions will be using this server to run; if we have multiple servers, we can assign different sessions to different servers. Objects can also be shared through a central global repository (GDR) along with the local repositories (LDR) registered to it; this is possible only in PowerCenter and not in PowerMart.
1) I want to know which mapping properties can be overridden at the Session Task level.
You can override any properties other than the source and targets: session log, DTM buffer size, cache sizes, etc. You can override the SQL if it is a relational DB; make sure the source and targets exist in your DB if it is relational.
2) Know what types of permissions are needed to run and schedule workflows.
You need execute permission on the folder to run/schedule a workflow; you may have read and write, but you need execute permission as well.
What is the hierarchy in a DWH?
Data sources ---> Data acquisition ---> Warehouse ---> Front-end tools ---> Metadata management ---> Data warehouse operation management.
What is the exact meaning of domain?
A particular environment, or a name that identifies one or more IP addresses; examples: gov = government agencies, edu = educational institutions, org = organizations (nonprofit), mil = military, com = commercial business, net = network organizations, ca = Canada, th = Thailand, in = India. In Informatica, a domain means something different: a group of connected repositories (a global repository together with the local repositories registered to it).
How do you create a mapping using multiple lookup transformations?
Use multiple Lookups in the mapping; use an unconnected Lookup if the same lookup repeats multiple times.
Can anyone explain real-time complex mappings or complex transformations in Informatica?
Most of the complexity we use is in Expression transformations involving a lot of nested IIFs and DECODE statements. The most complex logic we use is denormalization: we don't have a Denormalizer transformation in Informatica, so we have to use an Aggregator followed by an Expression.
What are the enhancements made in Informatica 7.1 compared to the 6.2 version?
1. Union and Custom transformations 2. Lookup on flat files 3. We can use the pmcmd command 4. We can export independent and dependent repository objects 5. Version controlling 6. Data profiling 7. Support for 64-bit architecture 8. LDAP authentication.
What is the difference between COM and DCOM?
In the source we also have duplicate records, and we have 2 targets: T1 for unique values and T2 for duplicates. How do we pass the unique values to T1 and the duplicate values to T2 from the source to these two different targets in a single mapping?
One suggested sequence: source ---> SQ ---> EXP ---> Sorter (with the Select Distinct check box enabled) ---> T1, and ---> Aggregator (with group-by enabled and a COUNT function) ---> T2. If you want only duplicates in T2 you can follow this sequence: ---> AGG (group by enabled, with the expression DECODE(COUNT(col),1,1,0)) ---> Filter (condition: flag = 0) ---> T2.
<< If you use the Aggregator as suggested, you will get duplicate as well as distinct records in the second target >> — so that is not the right approach. Instead, take two source instances: in the first one, enable Select Distinct in the Source Qualifier and connect it to target T1; in the second source instance, simply write a query that fetches the duplicate records and connect it to target T2. Then you filter and make sure the rows reach the T1 and T2 targets appropriately; this would be the easiest way.
There is also a good practice for identifying duplicates. Normally when you ask someone how to identify a duplicate record in Informatica, they say "use an Aggregator"; you can indeed get a count from it, but that does not really identify which record is the duplicate. If the source is an RDBMS, you can simply write a query that groups by the key fields and keeps the groups having COUNT(*) > 1. But what if the source is a flat file? Use a Sorter transformation, sorting on the key fields by which you want to find the duplicates, and then an Expression transformation. Informatica evaluates row by row, so once the rows come in sorted order the expression can compare the previous and the current row. Example:
field1 --> field2 --> SORTER (field1 ascending/descending, field2 ascending/descending) --> EXPRESSION with ports in this order:
  field1 (input/output)
  field2 (input/output)
  v_field1_curr (variable) = field1
  v_field2_curr (variable) = field2
  v_dup_flag (variable)    = IIF(v_field1_curr = v_field1_prev AND v_field2_curr = v_field2_prev, TRUE, FALSE)
  o_dup_flag (output)      = IIF(v_dup_flag = TRUE, 'Duplicate', 'Not Duplicate')
  v_field1_prev (variable) = v_field1_curr
  v_field2_prev (variable) = v_field2_curr
Then use a Router transformation and route the rows with o_dup_flag = 'Duplicate' to T2 and 'Not Duplicate' to T1. Hope it's clear.
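For a relational source, the duplicate-identification query mentioned above could look like the following sketch; emp_name and its two columns are borrowed from the earlier example, and everything else is an assumption:

  -- Key combinations that occur more than once, i.e. the duplicated values.
  SELECT fname, lname, COUNT(*) AS occurrences
  FROM   emp_name
  GROUP  BY fname, lname
  HAVING COUNT(*) > 1;

This tells you which values are duplicated; the Sorter/Expression approach (or the earlier GROUP BY query) then decides which physical row to keep.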
What is the difference between Power Center and Power Mart? What is the procedure for creating independent data marts from Informatica 7.1?
Power Center has multiple repositories, whereas Power Mart has a single (desktop) repository; Power Center can also be linked to a global repository to share objects between users.
                        Power Center              Power Mart
No. of repositories     n (multiple)              1
Applicability           high-end warehouses       low- and mid-range warehouses
Global repository       supported                 not supported
Local repository        supported                 supported
ERP support             available                 not available
What is a Lookup transformation and an Update Strategy transformation? Explain with an example.
A Lookup transformation is used to look up data in a relational table, view, synonym or flat file. The Informatica Server queries the lookup table based on the lookup ports used in the transformation and compares the lookup transformation port values to the lookup table column values according to the lookup condition. By using a lookup we can get a related value, perform a calculation, or update a slowly changing dimension. There are two types of lookups: connected and unconnected.
An Update Strategy transformation is used to control how rows are flagged for insert, update, delete or reject. To define the flagging of rows at session level, "Treat source rows as" can be Insert, Update, Delete or Data Driven. For Update we have three options: Update as Update, Update as Insert, and Update else Insert.
Can I use the session bulk loading option and still make a recovery of the session?
If the session is configured to use bulk mode, it will not write recovery information to the recovery tables, so bulk loading will not perform the recovery as required. No — because in a bulk load no redo log entries are created, whereas a normal load creates redo log entries; on the other hand, session performance increases with bulk load.
How do you configure a mapping in Informatica?
You should configure the mapping with the least number of transformations and expressions to do the most amount of work possible, and you should minimize the amount of data moved by deleting unnecessary links between transformations. You can also perform the following tasks to optimize the mapping: configure single-pass reading; optimize datatype conversions; eliminate transformation errors; optimize transformations; optimize expressions. For transformations that use a data cache (such as Aggregator, Rank, Joiner and Lookup transformations), limit the connected input/output or output ports: limiting the number of connected input/output or output ports reduces the amount of data the transformations store in the data cache.
What logic will you implement to load the data into one fact table from 'n' number of dimension tables? How will 'n' dimensions be loaded into a single fact table — do we use one dimension as a source and the other dimensions as lookup tables, or is there some other logic?
Loading the data from the dimension tables to the fact table is simple: assume the dimension tables are the source tables and the fact table is the target, the reason being that the dimension tables contain the data related to the fact table. First create the fact table and the dimension tables, then load the data into the individual dimensions using sources and transformations (Aggregator, Sequence Generator, Lookup) in the Mapping Designer; then, for the fact table, connect the surrogate keys to the foreign keys and the columns from the dimensions to the fact. After loading the data into the dimension tables we load the data into the fact tables. That's all.
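The "connect the surrogate keys from the dimensions to the fact" logic above is essentially a lookup of each dimension's surrogate key by its natural key. Expressed as a SQL sketch, with all table and column names being assumptions:

  -- Populate the fact table by resolving each natural key to its dimension surrogate key.
  INSERT INTO fact_sales (customer_key, product_key, date_key, sales_amount)
  SELECT c.customer_key,
         p.product_key,
         d.date_key,
         s.sales_amount
  FROM   stg_sales s
         JOIN dim_customer c ON c.customer_id    = s.customer_id
         JOIN dim_product  p ON p.product_id     = s.product_id
         JOIN dim_time     d ON d.calendar_date  = s.sale_date;

In an Informatica mapping this corresponds to one Lookup transformation per dimension, each returning the surrogate key for the incoming natural key.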
What is the difference between a dimension table and a fact table, and what are the different types of dimension and fact tables?
A fact table contains measurable data, with fewer columns and many rows, and it contains a primary key. The different types of fact tables are additive, non-additive and semi-additive. A dimension table contains textual descriptions of the data, with many columns and fewer rows, and it contains a primary key.
What is a worklet, what is it used for, and in which situations do we use it?
A set of workflow tasks is called a worklet; workflow tasks means 1) Timer 2) Decision 3) Command 4) Event Wait 5) Event Raise 6) Email, etc. A worklet is a reusable workflow: it might contain more than one task, and we can use these worklets in other workflows. If a certain set of tasks has to be reused in many workflows, we use a worklet; the use of a worklet in a workflow is similar to the use of a mapplet in a mapping. To execute a worklet, it has to be placed inside a workflow. The same tasks can also be used directly in different situations.
What are mapping parameters and variables, and in which situations do we use them?
Mapping parameters have a constant value throughout the session, whereas the value of a mapping variable can change: the Informatica Server saves the variable's value in the repository and uses it the next time you run the session. If we need to change certain attributes of a mapping every time the session runs, it would be very difficult to edit the mapping and change the attribute each time, so we use mapping parameters and variables and define the values in a parameter file; then we only edit the parameter file to change the attribute values. For a mapping parameter we need to manually edit the attribute value in the parameter file after every session run; if we need to increment the attribute value by 1 after every session run, we use a mapping variable instead, because the value of a mapping variable can be changed with the variable functions.
What is meant by complex mapping?
Complex mapping means more logic and more business rules are involved. An example from my bank project: I was involved in constructing a data warehouse, and the bank has many customers; after taking loans, some of them relocate to another place, and at that point it is difficult to maintain both the previous and the current addresses, so I used SCD Type 2 to maintain the history data together with the most recent changes. This is a simple example of a complex mapping.
Explain the use of the Update Strategy transformation.
It flags source records as INSERT, DELETE, UPDATE or REJECT for the target database; the default flag is Insert. This is a must for incremental data loading.
I have a requirement where the column names in a table (Table A) should appear in rows of the target table (Table B), i.e. converting columns to rows. Is it possible through Informatica? If so, how? The data in the tables is as follows:
Table A: key_1 char(3); values: 1, 2, 3.
Table B: bkey_a char(3), bcode char(1); values: (1,T) (1,A) (1,G) (2,A) (2,T) (2,L) (3,A).
The required output is one row per key with the codes pivoted into columns:
1  A  T
2  A  T  L
3  A
The SQL query in the Source Qualifier should be:
select key_1,
       max(decode(bcode, 'A', bcode, null)) a_code,
       max(decode(bcode, 'T', bcode, null)) t_code,
       max(decode(bcode, 'L', bcode, null)) l_code
from   a, b
where  a.key_1 = b.bkey_a
group  by key_1;
If a session fails after loading 10,000 records into the target, how can you load the records from the 10,001st record when you run the session the next time in Informatica 6.1?
Run the session in recovery mode (the session recovery option); this works, but the target load type should be normal — if it is bulk, recovery will not work as expected. Alternatively, set the "Number of initial rows to skip" option on the source in the session properties to 10,000 so that loading resumes from the 10,001st record.
How do you perform a "loop scope / loop condition" in an Informatica program? Give a few examples.
Can we run a group of sessions without using the Workflow Manager?
Yes, it is possible: using the pmcmd command you can run the group of sessions without using the Workflow Manager.
What is the difference between a cached lookup and an uncached lookup? Can I run the mapping without starting the Informatica Server?
When you configure the Lookup transformation as a cached lookup, it stores all the lookup table data in the cache when the first input record enters the Lookup transformation: the SELECT statement executes only once, and the values of each input record are compared with the values in the cache. In an uncached lookup, the SELECT statement executes for every input record entering the Lookup transformation, and it has to connect to the database each time a new record arrives.
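Conceptually, an uncached lookup fires a query of roughly this shape against the database for every input row, whereas a cached lookup runs one SELECT of the whole lookup table up front. The table and column names below are illustrative assumptions:

  -- Issued once per incoming row when the lookup is uncached.
  SELECT customer_key
  FROM   dim_customer
  WHERE  customer_id = :input_customer_id;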