
Informatica Question and Answers

What is Rank transformation? Where can we use it?
The Rank transformation is used to select the top or bottom ranked rows. For example, if we have a sales table in which several employees sell the same product and we need to find the top 5 or 10 employees who sell the most, we can use a Rank transformation.

Where is the cache stored in Informatica?
The cache is stored on the Informatica server machine.

If you want to create indexes after the load process, which transformation do you choose?
The Stored Procedure transformation.

In a Joiner transformation, you should specify the source with fewer rows as the master source. Why?
In a Joiner transformation the Informatica server reads all the records from the master source and builds index and data caches based on the master table rows. After building the caches, the Joiner transformation reads records from the detail source and performs the joins, so a smaller master source means smaller caches.

What happens if you try to create a shortcut to a non-shared folder?
It only creates a copy of the object.

What is a transaction?

A transaction can be defined as a DML operation: an insertion, modification or deletion of data performed by users, analysts or applications.

Can anybody write a session parameter file which will change the source and targets for every session, i.e. different sources and targets for each session run?
You are supposed to define a parameter file. In the parameter file you define two parameters, one for the source and one for the target, for example:
$Src_file = c:\program files\informatica\server\bin\abc_source.txt
$tgt_file = c:\targets\abc_targets.txt
Then define them under the session heading in the parameter file:
[folder_name.WF:workflow_name.ST:s_session_name]
$Src_file = c:\program files\informatica\server\bin\abc_source.txt
$tgt_file = c:\targets\abc_targets.txt
If the source is a relational database, you can even give an overridden SQL at the session level as a parameter. Make sure the SQL is on a single line.
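As a minimal sketch (the folder, workflow, session and file names below are only placeholders), a parameter file that points two sessions at different sources and targets could look like this:

[MyFolder.WF:wf_daily_load.ST:s_load_customers]
$Src_file=c:\data\customers_source.txt
$tgt_file=c:\targets\customers_target.txt

[MyFolder.WF:wf_daily_load.ST:s_load_orders]
$Src_file=c:\data\orders_source.txt
$tgt_file=c:\targets\orders_target.txt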

Informatica Live Interview Questions
Here are some interview questions I could not answer; anybody can help by giving answers, for the others also. Thanks in advance. Explain grouped cross tab. Explain reference cursor. What are parallel queries and query hints?

What are metadata and the system catalog? What is a factless fact schema? What is a conformed dimension? Which kind of index is preferred in a DWH? Why do we use a DSS database for OLAP tools?
A conformed dimension is one dimension that is shared by two (or more) fact tables. Factless means a fact table without measures that contains only foreign keys; there are two types of factless fact tables, one for event tracking and the other a coverage table. Bitmap indexes are preferred in data warehousing. Metadata is data about data; everything is stored there, for example mappings, sessions and privileges, and in Informatica we can see the metadata in the repository. The system catalog is what we use in Cognos; it also contains data, tables, privileges, predefined filters etc., and using this catalog we generate reports. A grouped cross tab is a type of report in Cognos where we have to assign three measures to get the result.

What is meant by a junk attribute in Informatica?
A dimension is called a junk dimension if it contains attributes which are rarely changed or modified. For example, in the banking domain we can fetch four attributes from the Overall_Transaction_master table into a junk dimension: tput flag, tcmp flag, del flag and advance flag.

Can anyone explain incremental aggregation with an example?
When you use an Aggregator transformation, it creates index and data caches to store the data: 1. the group-by columns, 2. the aggregate columns. Incremental aggregation is used when we have historical data in place that will be used in the aggregation. It uses the cache that contains the historical data; for each group-by value already present in the cache it adds the incoming data value to the corresponding data cache value and outputs the row, and when an incoming value has no match in the index cache, the new values for the group-by and output ports are inserted into the cache.

Difference between Rank and Dense Rank?

Rank: 1, 2 (2nd position), 2 (3rd position), 4, 5. The same rank is assigned to the same totals/numbers, and the next rank follows the position; golf usually ranks this way.
Dense Rank: 1, 2 (2nd position), 2 (3rd position), 3, 4. The same ranks are assigned to the same totals/numbers/names, and the next rank follows the serial number.
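A minimal SQL sketch of the same difference, assuming a hypothetical sales_summary table with emp_name and total_sales columns:

SELECT emp_name,
       total_sales,
       RANK()       OVER (ORDER BY total_sales DESC) AS rnk,        -- 1, 2, 2, 4, ...
       DENSE_RANK() OVER (ORDER BY total_sales DESC) AS dense_rnk   -- 1, 2, 2, 3, ...
FROM   sales_summary;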

About Informatica PowerCenter 7: 1) I want to know which mapping properties can be overridden at the Session task level. 2) What types of permissions are needed to run and schedule workflows?
1) You can override any properties other than the sources and targets themselves. Make sure the sources and targets exist in your database if they are relational; if it is a flat file, you can override its properties. You can override the SQL (for a relational source), the session log, the DTM buffer size, cache sizes etc.
2) You need execute permission on the folder to run or schedule a workflow. You may have read and write, but you need execute permission as well.

Can anyone explain real-time complex mappings or complex transformations in Informatica, especially in the sales domain?
The most complex logic we use is denormalization. We don't have a Denormalizer transformation in Informatica, so we have to use an Aggregator followed by an Expression. Apart from this, most of the complexity sits in Expression transformations involving a lot of nested IIF and DECODE statements; other cases involve the Union transformation and the Joiner.

How do you create a mapping using multiple Lookup transformations?
Use an unconnected lookup if the same lookup repeats multiple times.

If the source has duplicate records and we have two targets, T1 for unique values and T2 only for duplicate values, how do we pass the unique values to T1 and the duplicate values to T2 from the source in a single mapping?

Soln1: source ---> SQ ---> expression ---> sorter (with the Select Distinct check box enabled) ---> T1
---> aggregator (with group by enabled and a count function) ---> T2. If you want only the duplicates in T2 you can follow this sequence: ---> aggregator (with group by enabled, writing the code decode(count(col),1,1,0)) ---> filter (condition is 0) ---> T2.

Soln2: Take two source instances. In the first one embed DISTINCT in the source qualifier and connect it to target T1, and in the second source instance just write a query to fetch the duplicate records and connect it to target T2 (a sample query is sketched below). (If you use the aggregator as suggested above, you will get duplicate as well as distinct records in the second target.)
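The duplicate-fetching query for the second source qualifier is not spelled out above; a minimal sketch, assuming a hypothetical src_table with a key_col column, could be:

SELECT *
FROM   src_table
WHERE  key_col IN (SELECT key_col
                   FROM   src_table
                   GROUP  BY key_col
                   HAVING COUNT(*) > 1);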

Soln3: Use a Sorter transformation. Sort on the key fields by which you want to find the duplicates, then use an Expression transformation. Example:
field1 -->
field2 -->
SORTER: field1 ascending/descending, field2 ascending/descending

EXPRESSION:
--> field1
--> field2
<--> v_field1_curr = field1
<--> v_field2_curr = field2
v_dup_flag = IIF(v_field1_curr = v_field1_prev, true, false)
o_dup_flag = IIF(v_dup_flag = true, 'Duplicate', 'Not Duplicate')
<--> v_field1_prev = v_field1_curr
<--> v_field2_prev = v_field2_curr
Use a Router transformation and route o_dup_flag = 'Duplicate' to T2 and 'Not Duplicate' to T1. Informatica evaluates row by row, so as we sort, all the rows come in order and it evaluates based on the previous and current rows.

What is the difference between PowerCenter and PowerMart? What is the procedure for creating independent data marts from Informatica 7.1?
PowerCenter: supports a global repository and multiple (local) repositories, ERP support is available, and it is applicable to high-end warehouses; the global repository can be linked to local repositories to share objects between users. PowerMart: has a single (desktop/local) repository, no ERP support, and is applicable to low- and mid-range warehouses.

What are the enhancements made to Informatica 7.1 compared to 6.2?
In 7.1 we can use up to 64 partitions, we can look up a flat file, we can write to an XML target, and the Union and Custom transformations were added. There is also a Propagate option, i.e. if we change the data type of a field, all the linked columns will reflect that change.

What are the Lookup transformation and the Update Strategy transformation? Explain with an example.
A Lookup transformation is used to look up data in a relational table, view, synonym or flat file. The Informatica server queries the lookup table based on the lookup ports used in the transformation and compares the lookup port values to the lookup table column values based on the lookup condition. By using a lookup we can get a related value, perform a calculation, and update slowly changing dimensions. There are two types of lookups: connected and unconnected.
The Update Strategy transformation is used to control how rows are flagged for insert, update, delete or reject.

How do you load data into one fact table from more than one dimension table? What logic will you implement to load the data into one fact table from 'n' dimension tables?
Firstly you need to create the fact table and the dimension tables, and load the dimension tables; the dimension tables contain the data related to the fact table. To load the data from the dimension tables into the fact table is simple: take the dimension tables as the source tables and the fact table as the target, load the individual dimensions using sources and transformations (aggregator, sequence generator, lookup) in the Mapping Designer, and then in the fact table connect the surrogate keys to the foreign keys and the columns from the dimensions to the fact. The specifications play the important role in loading the fact; that is all.

Can I use the session bulk loading option and at the same time make a recovery of the session?
If the session is configured to use bulk mode it will not write recovery information to the recovery tables, so bulk loading will not perform the recovery as required. The reason is that with a normal load a redo log file is created, whereas with a bulk load no redo log file is created; that is also why session performance increases with bulk load.

What is the Update Strategy transformation used for?
It defines the flagging of rows in a session: insert, update, delete or data driven. For updates we have three options: Update as Update, Update as Insert, and Update else Insert.

How do you configure a mapping in Informatica?
You should configure the mapping with the least number of transformations and expressions to do the most amount of work possible, and you should minimize the amount of data moved by deleting unnecessary links between transformations. For transformations that use a data cache (such as the Aggregator, Joiner, Rank and Lookup transformations), limit the connected input/output or output ports; limiting the number of connected ports reduces the amount of data the transformations store in the data cache. You can also perform the following tasks to optimize the mapping:
• Configure single-pass reading.
• Optimize datatype conversions.
• Eliminate transformation errors.
• Optimize transformations.
• Optimize expressions.

What is the difference between a dimension table and a fact table, and what are the different dimension tables and fact tables?
A fact table contains measurable data, fewer columns and many rows, and it contains the primary key. The different types of fact tables are additive, non-additive and semi-additive. A dimension table contains textual descriptions of data, many columns and fewer rows, and it contains a primary key.

What are worklets, what is the use of a worklet, and in which situation can we use it?
A worklet is a set of tasks. If a certain set of tasks has to be reused in many workflows, we use worklets. To execute a worklet, it has to be placed inside a workflow. The use of a worklet in a workflow is similar to the use of a mapplet in a mapping.

What are mapping parameters and variables, and in which situation can we use them?
If we need to change certain attributes of a mapping after every time the session is run, it will be very difficult to edit the mapping and then change the attributes. So we use mapping parameters and variables and define the values in a parameter file; then we only need to edit the parameter file to change the attribute values. This makes the process simple. Mapping parameter values remain constant; if we need to change them we must manually edit the parameter file after every session run. The value of a mapping variable, however, can be changed by using variable functions; if we need to increment an attribute value by 1 after every session run, we can use a mapping variable.

What is meant by complex mapping? Explain the use of the Update Strategy transformation.
Complex mapping means a mapping involving more logic and more business rules; the Update Strategy transformation is used to maintain the history data and the most recent changes. An example of a complex mapping from my bank project: I was involved in constructing a data warehouse, and many customers, after taking loans, relocated to other places. It was difficult to maintain both the previous and the current addresses, so I used SCD type 2. This is a simple example of a complex mapping.

I have a requirement where the column names in a table (Table A) should appear as rows of the target table (Table B), i.e. converting columns to rows. Is it possible through Informatica? If so, how? The data in the tables is as follows:

Table A (key_1 char(3)) values:
1
2
3
Table B (bkey_a char(3), bcode char(1)) values:
1 T
1 A
1 G
2 A
2 T
2 L
3 A
The required output is one row per key with the codes pivoted into columns. A simple solution: the SQL query in the source qualifier should be

select key_1,
       max(decode(bcode, 'T', bcode, null)) t_code,
       max(decode(bcode, 'A', bcode, null)) a_code,
       max(decode(bcode, 'L', bcode, null)) l_code
from   a, b
where  a.key_1 = b.bkey_a
group  by key_1
/

What is the difference between stop and abort?
Stop: if the session you want to stop is part of a batch you must stop the batch, and if the batch is part of a nested batch, stop the outermost batch. Abort: you can issue the abort command; it is similar to the stop command except that it has a 60-second timeout. The PowerCenter Server handles the abort command for the Session task like the stop command, except that if the server cannot finish processing and committing data within the 60-second timeout period, it kills the DTM process and terminates the session.

If a session fails after loading 10,000 records into the target, how can you load the records from the 10,001st record when you run the session the next time in Informatica 6.1?
By using the performance recovery option.

Can we run a group of sessions without using the Workflow Manager?
Yes, it is possible to run the group of sessions using the pmcmd command, without using the Workflow Manager.

What is the difference between a cached lookup and an uncached lookup? Can I run a mapping without starting the Informatica server?
When you configure the Lookup transformation as a cached lookup, it stores all the lookup table data in the cache when the first input record enters the Lookup transformation; the SELECT statement executes only once and the values of each input record are compared with the values in the cache. In an uncached lookup the SELECT statement executes for each input record entering the Lookup transformation, and it has to connect to the database each time a new record enters. (No: a mapping runs only as part of a session on the Informatica server.)

I want to prepare a questionnaire. The details are as follows: 1. Identify a large company/organization that is a prime candidate for a DWH project (for example telecommunication or insurance companies and banks may be prime candidates). 2. Give at least four reasons for selecting the organization. 3. Prepare a questionnaire consisting of at least 15 non-trivial questions to collect requirements/information about the organization. Can you please tell me what those 15 questions should be, to ask of a company, say a telecom company?
First of all meet your sponsors and make a BRD (business requirements document) about their expectations from this data warehouse (the main aim comes from them). For example, they need the customer billing process; now go to the business management team, who can ask for metrics out of the billing process for their use, such as monthly usage, billing metrics and rate plans, to perform sales-rep and channel performance analysis and rate-plan analysis. This information is required to build the data warehouse. Depending upon the granularity of your data, your dimension tables can be: Customer (customer id, name, city, state etc.), Sales rep (sales rep number, name), Sales org (sales org id), Bill (bill #, bill date), Rate plan (rate plan code); and the fact table can be Billing details (bill #, customer id, sales organization, minutes used, call details etc.). You can follow a star or snowflake schema in this case.

Can I start and stop a single session in a concurrent batch?
Just right-click on the particular session and go to the recovery option, or use event wait and event raise.

What is MicroStrategy? Why is it used? Can anyone explain it in detail?
MicroStrategy is again a BI tool which is a HOLAP, basically a reporting tool. It has a full range of reporting on the web and also in Windows; you can create two-dimensional reports and also cubes in it.

What is the difference between Informatica 7.1 and Ab Initio?
There is a lot of difference between Informatica and Ab Initio: in Ab Initio we use three kinds of parallelism, whereas Informatica uses one; Ab Initio has no scheduling option (we schedule manually or with a PL/SQL script) whereas Informatica contains four scheduling options; Ab Initio contains a co-operating system, which Informatica does not; ramp-up time is much quicker in Ab Initio compared to Informatica; and Ab Initio is more user-friendly than Informatica.

If you had to split the source level key going into two separate tables, one as a surrogate key and the other as a primary key, what are the different ways you could handle this type of situation, since Informatica does not guarantee that keys are loaded properly (in order) into those tables?
Use a foreign key.

What is the best way to show metadata (the number of rows at source, target and each transformation level, and error-related data) in a report format?
When your workflow has completed, go to the Workflow Monitor, right-click the session and go to the transformation statistics; there we can see the number of rows at source and target, and in the session properties we can see the errors related to data. You can also select these details from the repository tables; the view REP_SESS_LOG can be used to get this data.

What are the cost-based and rule-based approaches and what is the difference?
Cost-based and rule-based approaches are optimization techniques used in databases where we need to optimize a SQL query. Whenever you process a SQL query in Oracle, what the Oracle engine internally does is read the query and decide the best possible way of executing it. Oracle basically provides two types of optimizers (indeed three, but we use only these two techniques):
1. Cost-based optimizer (CBO): if a SQL query can be executed in two different ways (say it has path 1 and path 2 for the same query), the CBO calculates the cost of each path, analyses which path has the lower cost of execution and then executes that path, so that it can optimize the query execution.
2. Rule-based optimizer (RBO): this basically follows the rules which are needed for executing a query; depending on the number of rules which are to be applied, the optimizer runs the query.
Use: if the table you are trying to query has already been analysed, Oracle will go with the CBO; if the table has not been analysed, Oracle follows the RBO and will go with a full table scan.

What is a mystery dimension?
Also known as a junk dimension: making sense of the rogue fields in your fact table.

What are partition points?
Partition points mark the thread boundaries in a source pipeline and divide the pipeline into stages.

How do you append records to a flat file in Informatica? In DataStage we have the options i) overwrite the existing file, ii) append to the existing file.
This is not there in Informatica v7.x, but it is heard to be included in the latest version 8.0, which is about to ship in the market, where you can append to a flat file.

Two relational tables are connected to a SQ transformation; what possible errors will be thrown?
We can connect two relational tables in one Source Qualifier transformation; no errors will be thrown.

Without using the Update Strategy transformation and session options, how can we update our target table?
Soln1: You can do this by using "update override" in the target properties.
Soln2: In the session properties, set Treat source rows as: update, so all the incoming rows will be set with the update flag; there are options such as insert, update, insert as update and update as update, and by using these we can easily solve it.
Soln3: By default all the rows in the session are set with the insert flag; you can change it in the session general properties, and then you can update the rows in the target table.

Could anyone please tell me the steps required for a type 2 dimension/version data mapping, and how we can implement it?
Go to the Mapping Designer, select the mapping wizard, and in it go for Slowly Changing Dimension. A new window appears where you need to give the mapping name, source table, target table and the type of slowly changing dimension; if you select finish, the slowly changing dimension 2 mapping is created. Then go to the Warehouse Designer and generate the table, validate the mapping in the Mapping Designer, save it to the repository and run the session in the Workflow Manager. Later, update the source table and rerun; you will find the difference in the target table.

What are data merging, data cleansing and sampling?
Cleansing: to identify and remove redundancy and inconsistency. Sampling: just sample the data by sending part of the data from source to target.

What is an IQD file?
An IQD file is nothing but an Impromptu Query Definition. This file is mainly used in the Cognos Impromptu tool: after creating an IMR (report) we save the IMR as an IQD file, which is used while creating a cube in PowerPlay Transformer; as the data source type we select Impromptu Query Definition.

What is the difference between the Normalizer transformation and normalization?
Normalizer: it is a transformation mainly used for COBOL sources; it changes rows into columns and columns into rows. Normalization: to remove redundancy and inconsistency.

How do you import an Oracle sequence into Informatica?
Create a procedure and declare the sequence inside the procedure, and finally call the procedure in Informatica with the help of a Stored Procedure transformation (a sketch follows below).
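A minimal Oracle PL/SQL sketch of such a procedure (the procedure and sequence names are only placeholders, assuming an Oracle source):

CREATE OR REPLACE PROCEDURE get_next_key (p_next_val OUT NUMBER) AS
BEGIN
  -- dim_key_seq is a hypothetical Oracle sequence created beforehand
  SELECT dim_key_seq.NEXTVAL INTO p_next_val FROM dual;
END;
/

The procedure can then be called from a Stored Procedure transformation and its output port connected to the surrogate key column.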

How do I import VSAM files from source to target? Do I need a special plug-in?
In the Mapping Designer we have a direct option to import files from VSAM. Navigation: Sources => Import from file => file from COBOL.

If you have four lookup tables in the workflow, how do you troubleshoot to improve performance?
There are many ways to improve a mapping which has multiple lookups: 1) We can create an index for the lookup table if we have permissions (staging area). 2) Divide the lookup mapping into two: (a) dedicate one to inserts, meaning source - target: these are new rows, only the new rows will come to the mapping and the process will be fast; (b) dedicate the second one to updates, meaning source = target: these are existing rows, only the rows which already exist will come into the mapping. 3) We can increase the cache size of the lookup.

What is the procedure or steps for implementing versioning if you are already in version 7.x? Any gotchas or precautions?
For version control in the ETL layer using Informatica, do the following steps: 1) First save the changes or new implementations after doing anything in your Designer or Workflow Manager. 2) Then from the navigator window, right-click on the specific object you are currently in; in that window at the lower end you will find Versioning -> Check In. A window will be opened; leave the information about what you have done, like "modified this mapping" etc., then click the OK button.

If your workflow is running slow in Informatica, where do you start troubleshooting and what are the steps you follow?
SOLN1: When the workflow is running slowly you have to find out the bottlenecks, in this order: target, source, mapping, session, system.
SOLN2: The workflow may be slow due to different reasons: one is alpha characters in decimal data (check this out), and another is insufficient length of strings; check with the SQL override.

How do you handle decimal places while importing a flat file into Informatica?
While importing the flat file, the flat file wizard helps in configuring the properties of the file: select the numeric column and just enter the precision value and the scale. Precision includes the scale; for example, if the number is 98888.654, enter precision as 8, scale as 3 and width as 10 for a fixed-width flat file.

Can anyone explain error handling in Informatica with examples, so that it will be easy to explain the same in an interview?
Go to the session log file; there we will find information regarding the session initiation process, errors encountered and the load summary. So by seeing the errors encountered during the session run, we can resolve the errors.

In a sequential batch, how can we stop a single session?

We have a task called event-wait; using that we can stop, and we start again using event-raise.

Why are dimension tables denormalized in nature, and why do all the dimensions maintain historical data?
Because in data warehousing historical data should be maintained; to maintain historical data we go for the concept of data warehousing, and by using surrogate keys we can achieve the historical data (using an Oracle sequence for the critical column). Maintaining historical data means, for example, keeping one employee's details for where he previously worked and where he is working now; all the details should be maintained in one table. If we maintained a primary key it would not allow duplicate records with the same employee id, but because such entries (not exactly duplicate records, another record with the same employee number) must be kept in the table, the dimension tables are denormalized.

Can we use an aggregator/active transformation after an Update Strategy transformation?
We can use it, but the update flag will not remain; a passive transformation can be used instead.

Can anyone comment on the significance of Oracle 9i in Informatica when compared to Oracle 8 or 8i, i.e. how is Oracle 9i advantageous when used with Informatica?
It's very easy: Oracle 8i did not allow user-defined data types, but 9i allows them; BLOB and LOB are allowed only in 9i, not 8i; and moreover list partitioning is there in 9i only.

In the concept of mapping parameters and variables, how is the value of a mapping variable carried over, and how can we override it?
The variable value is saved to the repository after the completion of the session, and the next time you run the session the server takes the saved variable value from the repository and starts assigning the next value of the saved value. For example, I ran a session and at the end it stored a value of 50 in the repository; the next time I run the session it takes the value of 51. You can do one thing after running the mapping: in the Workflow Manager, right-click on the session and in the menu go to the persistent values; there you will find the last value stored in the repository for the mapping variable. If you want, remove it and put your desired one, then run the session. You can also override the saved variable in the repository by defining the value in a parameter file: if there is a parameter file for the mapping variable, it uses the value in the parameter file, not the value + 1 in the repository. For example, assign the value of the mapping variable as 70; the session then starts with the value of 70, not with the value of 51. In other words, higher preference is given to the value in the parameter file.

How do you use mapping parameters, and what is their use?

Mapping parameters and variables make the use of mappings more flexible and also avoid creating multiple mappings; this helps in adding incremental data. Mapping parameters and variables have to be created in the Mapping Designer by choosing the menu option Mapping ----> Parameters and Variables, entering the name for the variable or parameter (it has to be preceded by $$), and choosing the type as parameter or variable along with its data type. Once defined, the variable/parameter can be used in any expression, for example in the SQ transformation in the Source Filter property: just enter the filter condition, and finally create a parameter file to assign the value for the variable/parameter and configure the session properties (however, the final step is optional). If the parameter is not present in the file, it uses the initial value which was assigned at the time of creating the variable.

What are variable ports, and list two situations when they can be used?
We have mainly three ports: Inport, Outport and Variable port. Inport represents data flowing into the transformation; Outport is used when data is mapped to the next transformation; a Variable port is used when mathematical calculations are required, and it is also used to break a complex expression into simpler ones and to store intermediate values. For example, considering price, quantity and a total, we can make total_amt a variable and compute a sum on it with sum(total_amt).

What is the use of incremental aggregation? Explain briefly with an example.
It is a session option. When the Informatica server performs incremental aggregation, it passes new source data through the mapping and uses historical cache data to perform the new aggregation calculations incrementally; we use it for performance.

How does the server recognise the source and target databases?
By using an ODBC connection if it is relational, and an FTP connection if it is a flat file; we can make sure of the connections in the properties of the session for both sources and targets.

How do you delete duplicate rows in flat files? Is there any option in Informatica?
Use a Sorter transformation; it has a "distinct" option, make use of it.

How do you look up data on multiple tables?
If you want to look up data on multiple tables at a time, you can do one thing: join the tables which you want, then look up that joined table; Informatica provides lookup on joined tables.

How do you retrieve the records from a rejected file? Explain with syntax or an example.
SOLN1: There is one utility called the "reject loader" where we can find the rejected records and are able to refine and reload them.
SOLN2: During the execution of the workflow all the rejected rows are stored in bad files (where your Informatica server is installed, e.g. C:\Program Files\Informatica Power Center 7.1\Server). These bad files can be imported as a flat file source, and then through a direct mapping we can load these files in the desired format.

What is the procedure to load the fact table? Give it in detail.
SOLN1: We use the two wizards (i.e. the Getting Started wizard and the Slowly Changing Dimension wizard) to load the fact and dimension tables; by using these two wizards we can create different types of mappings according to the business requirements and load into the star schemas (fact and dimension tables).
SOLN2: First the dimension tables need to be loaded; then, according to the specifications, the fact tables should be loaded. Don't think that fact tables are different in the case of loading; it is a general mapping, as we do for other tables, but the specifications play an important role in loading the fact.

Which objects are required by the debugger to create a valid debug session?
Initially the session should be a valid session; source, target, lookups and expressions should be available, and a minimum of one breakpoint should be available for the debugger to debug your session. The Informatica server object is a must. (A related question: how many concurrent threads are you allowed to run on the db server?)

What is the limit to the number of sources and targets you can have in a mapping?
As per my knowledge there is no such restriction on the number of sources or targets inside a mapping. The question really is: if you make N tables participate at a time in processing, what is the position of your database? It reduces database and Informatica server performance; the restriction is only on the database side. From an organization point of view it is never encouraged to use N tables at a time.

Which is better among the connected lookup and unconnected lookup transformations in Informatica or any other ETL tool?
When you compare both, basically a connected lookup will return more values and an unconnected lookup returns one value. A connected lookup is in the same pipeline as the source, it accepts dynamic caching, and it can send multiple columns in a single row; an unconnected lookup has a single return port and does not have the dynamic cache facility, but in some special cases we can use it, for example when the output of one lookup goes as the input of another lookup, unconnected lookups are favourable. I think the better one is the connected lookup, because we can use a dynamic cache with it.

What is the procedure to write the query to list the highest salary of three employees?
SELECT sal FROM (SELECT sal FROM my_table ORDER BY sal DESC) WHERE ROWNUM < 4;
Since this is Informatica, you might as well use the Rank transformation; check out the help file on how to use it.

What is the difference between the IIF and DECODE functions?
You can use nested IIF statements to test multiple conditions. The following example tests for various conditions and returns 0 if sales is zero or negative:
IIF(SALES > 0, IIF(SALES < 50, SALARY1, IIF(SALES < 100, SALARY2, IIF(SALES < 200, SALARY3, BONUS))), 0)
You can use DECODE instead of IIF in many cases; DECODE may improve readability:
DECODE(TRUE,
       SALES > 0 AND SALES < 50, SALARY1,
       SALES > 49 AND SALES < 100, SALARY2,
       SALES > 99 AND SALES < 200, SALARY3,
       SALES > 199, BONUS)

In dimensional modeling, is the fact table normalized or denormalized, in the case of a star schema and in the case of a snowflake schema?
There is no concept of normalization in the case of the star schema, but in the case of the snowflake schema the dimension tables must be normalized. Star schema: denormalized dimensions; snowflake schema: normalized dimensions.

We are using an Update Strategy transformation in a mapping; how can we know whether the insert, update, reject or delete option has been selected during the running of the session in Informatica?
In the Designer, while creating the Update Strategy transformation, uncheck the "Forward Rejected Rows" option; if there are any rejected rows they will automatically be written to the session log file. Updates or inserts are known by checking the target file or table only.

Explain the commit points for source-based commit and target-based commit. Assume appropriate values wherever required.
Source-based commit: commits the data into the target based on the commit interval, so with a commit interval of 10,000, for every 10,000 rows it commits the data. Target-based commit: commits the data into the target based on the buffer size of the target, i.e. it commits whenever the buffer fills. Let us assume that the buffer size is 6,000 rows, the session is configured with a commit interval of 10,000 and the source has 50,000 records; then for every 6,000 rows it will commit into the target.

How do we estimate the number of partitions that a mapping really requires? Is it dependent on the machine configuration?
It depends upon the Informatica version we are using: Informatica 6 supports only 32 partitions, whereas Informatica 7 supports 64 partitions.

Can Informatica be used as a cleansing tool? If yes, give examples of transformations that can implement a data cleansing routine.
Yes, we can use Informatica for cleansing data, and sometimes we use staging areas for cleansing the data; it depends upon performance. For example, a field X has some values and others with nulls and is assigned to a target field where the target field is a NOT NULL column: we can assign default values to the target to represent a complete set of data, i.e. inside an expression we can assign a space or some constant value to avoid session failure. Another example: the input data is in one format and the target is in another; we can change the format in an expression.

How do you decide whether you need to do aggregations at database level or at Informatica level?
It depends upon our requirement: if you have a good processing database you can create an aggregation table or view at database level, else it is better to use Informatica. Here is why we may need Informatica: in the database we don't have an incremental aggregation facility, but in Informatica there is an option called "incremental aggregation" which helps you to update the current values with current values + new values, so there is no need to process the entire set of values again and again unless somebody deletes the cache files. If that happened, the total aggregation would need to be executed in Informatica as well, and then it would take more time to process the aggregation compared to the database.

How do you identify bottlenecks in the various components of Informatica and resolve them?
The best way to find out bottlenecks is writing to a flat file and seeing where the bottleneck is.

How do you join two tables without using the Joiner transformation?
SOLN1: It is possible to join two or more tables by using the source qualifier, but the tables should have a relationship. When you drag and drop the tables you get a source qualifier for each table; delete all the source qualifiers and add a common source qualifier for all. Right-click on the source qualifier, you will find EDIT, click on it, click on the Properties tab, and you will find SQL Query, in which you can write your SQL (a sample join is sketched below).
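A minimal sketch of the kind of SQL override that could go in that common source qualifier, assuming two hypothetical tables customers and orders related on customer_id:

SELECT c.customer_id,
       c.customer_name,
       o.order_id,
       o.order_amount
FROM   customers c,
       orders    o
WHERE  c.customer_id = o.customer_id;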

SOLN2: A Joiner transformation is used to join n (n>1) tables from the same or different databases, but a source qualifier transformation is used to join n tables only from the same database.
SOLN3: Use the Source Qualifier transformation to join tables on the SAME database; under its Properties tab you can specify the user-defined join. Any SELECT statement you can run on a database, you can also do in a Source Qualifier. Note: you can only join 2 tables with a Joiner transformation, but you can join two tables from different databases.

How do you create the staging area in your database?
A staging area in a DW is used as a temporary space to hold all the records from the source system, so more or less it should be an exact replica of the source systems except for the load strategy, where we use truncate and reload options. So create it using the same layout as in your source tables, or using the Generate SQL option in the Warehouse Designer tab.

In a filter expression we want to compare one date field with a DB2 system field CURRENT DATE. Our syntax: datefield = CURRENT DATE (we didn't define it by ports, it is a system field), but this is not valid (PMParser: Missing Operator). Can someone help us?
Use SYSDATE or use TO_DATE for the current date, otherwise you will get that type of error. The DB2 date format is "yyyymmdd" whereas SYSDATE in Oracle will give "dd-mm-yy", so conversion of the DB2 date format to the local database date format is compulsory.

What do the Expression and Filter transformations do in the Informatica Slowly Growing Target wizard?
The Expression transformation detects and flags the rows from the source; the Filter transformation filters out the rows that are not flagged and passes the flagged rows to the Update Strategy transformation.

What are the differences between Informatica PowerCenter versions 6.2 and 5.1, and also between versions 6.2 and 7.1?
The main difference between Informatica 5.1 and 6.1 is that in 6.1 they introduced a new thing called the repository server, and in place of the Server Manager (5.1) they introduced the Workflow Manager and the Workflow Monitor. In version 7.x you have the option of looking up on a flat file, you can write to an XML target, and the Union and Custom transformations, versioning, LDAP authentication and support of 64-bit architectures were added.

What is the difference between the Informatica PowerCenter server, the repository server and the repository?
The PowerCenter server contains the scheduled runs at which time data should load from source to target; the repository contains all the definitions of the mappings done in the Designer.

What is the difference between connected and unconnected stored procedures?
Connected or Unconnected: run a stored procedure every time a row passes through the Stored Procedure transformation; pass parameters to the stored procedure and receive a single output parameter; pass parameters to the stored procedure and receive multiple output parameters. (Note: to get multiple output parameters from an unconnected Stored Procedure transformation, you must create variables for each output parameter; for details, see Calling a Stored Procedure From an Expression.)
Unconnected only: run a stored procedure before or after your session, such as pre- or post-session; run a stored procedure once during your mapping; run a stored procedure based on data that passes through the mapping, such as when a specific port does not contain a null value; run nested stored procedures; call multiple times within a mapping.

Discuss which is better among incremental load, normal load and bulk load.
If the database supports the bulk load option from Informatica, then using BULK LOAD for initially loading the tables is recommended; if supported by the database, bulk load can do the loading faster than normal load. Depending upon the requirement we should choose between normal and incremental loading strategies (the incremental load concept is different, don't merge it with bulk load and normal load).

Compare the data warehousing top-down approach with the bottom-up approach.
In the top-down approach, first we build the data warehouse and then we build the data marts; this needs more cross-functional skills, is a time-taking process and is also costly. In the bottom-up approach, first we build the data marts and then the data warehouse; the data mart that is built first remains as a proof of concept for the others, and it takes less time and less cost compared to the above.

What is the difference between a summary filter and a detail filter?
A summary filter can be applied on a group of rows that contain a common value, whereas detail filters can be applied on each and every record of the database.

Features in 7.1:
1. Union and Custom transformations
2. Lookup on flat file
3. Grid servers working on different operating systems can coexist on the same server
4. We can use pmcmdrep
5. We can export independent and dependent repository objects
6. We can move a mapping into any web application
7. Version controlling
8. Data profiling

What are the differences between a view and a materialized view?

Materialized views are schema objects that can be used to summarize, precompute, replicate, and distribute data, e.g. to construct a data warehouse. A materialized view provides indirect access to table data by storing the results of a query in a separate schema object, unlike an ordinary view, which does not take up any storage space or contain any data.
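A minimal Oracle sketch (table and column names are only placeholders):

CREATE MATERIALIZED VIEW mv_monthly_sales
BUILD IMMEDIATE
REFRESH COMPLETE ON DEMAND
AS
SELECT product_id,
       TRUNC(sale_date, 'MM') AS sale_month,
       SUM(amount)            AS total_amount
FROM   sales
GROUP  BY product_id, TRUNC(sale_date, 'MM');

Unlike CREATE VIEW, this statement physically stores the aggregated rows, which is why it can be refreshed and queried like a summary table.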

Can we modify the data in a flat file?

Just open the text file with Notepad and change whatever you want (but the data type should stay the same).

How do you get the first 100 rows from a flat file into the target?
SOLN1: task ----->(link) session (in the Workflow Manager): double-click on the link and type $$source success rows (a parameter in the session variables) = 100; it should automatically stop the session.
SOLN2: 1. Use the test download option if you want to use it for testing. 2. Put a counter/sequence generator in the mapping and perform it.

Can we look up a table from a source qualifier transformation (unconnected lookup)?
No, we can't. I will explain why: 1) Unless you assign the output of the source qualifier to another transformation or to a target, there is no way it will include the field in the query. 2) The source qualifier doesn't have any variable fields to utilize as expressions.

What is a junk dimension?
A "junk" dimension is a collection of random transactional codes, flags and/or text attributes that are unrelated to any particular dimension. The junk dimension is simply a structure that provides a convenient place to store the junk attributes. A good example would be a trade fact in a company that brokers equity trades.

What is the difference between normal load and bulk load?
Normal load: normal load writes information to the database log file, so that if any recovery is needed it will be helpful; when the source file is a text file and you are loading data to a table, in such cases you should use normal load only, else the session will fail. Bulk load: bulk load does not write information to the database log file, so if any recovery is needed we can't do anything; comparatively, bulk load is considerably faster than normal load.

At most, how many transformations can be used in a mapping?
There is no such limitation on the number of transformations, but from a performance point of view using too many transformations will reduce the session performance. My idea is: if more transformations are needed in a mapping, it is better to go for a stored procedure.

What are the main advantages and purpose of using the Normalizer transformation in Informatica?
The Normalizer transformation is used mainly with COBOL sources, where most of the time the data is stored in denormalized format. Also, the Normalizer transformation can be used to create multiple rows from a single row of data.

How do you convert rows to columns in the Normalizer? Could you explain?
Normally it is used to convert columns to rows, but for converting rows to columns we need an Aggregator and an Expression, and a little effort is needed for coding. Denormalization is not possible with a Normalizer transformation.

Discuss the advantages and disadvantages of the star and snowflake schemas.
In a star schema every dimension will have a primary key, and a dimension table will not have any parent table; whereas in a snowflake schema, a dimension table will have one or more parent tables. Hierarchies for the dimensions are stored in the dimension table itself in a star schema, whereas hierarchies are broken into separate tables in a snowflake schema; these hierarchies help to drill down the data from the topmost to the lowermost levels. A star schema consists of a single fact table surrounded by dimension tables; in a snowflake schema the dimension tables are connected to sub-dimension tables. In a star schema the dimension tables are denormalized, in a snowflake schema the dimension tables are normalized. A star schema is used for report generation, a snowflake schema is used for cubes. The advantage of the snowflake schema is that the normalized tables are easier to maintain and it also saves storage space; the disadvantage is that it reduces the effectiveness of navigation across the tables due to the large number of joins between them (a small DDL sketch follows below).
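A minimal DDL sketch of the same idea (all table and column names are only placeholders): in the star form the category attribute lives inside the product dimension, while in the snowflake form it is normalized into its own table.

-- Star schema: denormalized product dimension
CREATE TABLE dim_product (
  product_key   NUMBER PRIMARY KEY,
  product_name  VARCHAR2(100),
  category_name VARCHAR2(100)          -- category kept in the same table
);

-- Snowflake schema: the category is split into a sub-dimension
CREATE TABLE dim_category (
  category_key  NUMBER PRIMARY KEY,
  category_name VARCHAR2(100)
);

CREATE TABLE dim_product_sf (
  product_key   NUMBER PRIMARY KEY,
  product_name  VARCHAR2(100),
  category_key  NUMBER REFERENCES dim_category (category_key)
);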

What is a time dimension? Give an example.
The time dimension is one of the important dimensions in a data warehouse. Whenever you generate a report, you access the data through the time dimension. Example fields of an employee time dimension: date key, full date, day of week, day, month, quarter, fiscal year.
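A minimal DDL sketch of such a date dimension, using the fields listed above (table and column names are only placeholders):

CREATE TABLE dim_date (
  date_key     NUMBER PRIMARY KEY,   -- e.g. 20120107
  full_date    DATE,
  day_of_week  VARCHAR2(10),
  day_of_month NUMBER,
  month_name   VARCHAR2(10),
  quarter_name VARCHAR2(2),
  fiscal_year  NUMBER
);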

What are connected and unconnected transformations?
A connected transformation is a part of your data flow in the pipeline, while an unconnected transformation is not; it is much like calling a program by name versus by reference. Use unconnected transformations when you want to call the same transformation many times in a single mapping. An unconnected transformation can't be connected to another transformation, but it can be called inside another transformation. Connected transformations are directly connected and can be used in as many other transformations as needed. If you are using a transformation several times, use the unconnected form; you get better performance.

How can you create or import a flat file definition into the Warehouse Designer?
You can create a flat file definition in the Warehouse Designer: in the Warehouse Designer you can create a new target and select the type as flat file; save it and you can enter the various columns for that created target by editing its properties. Once the target is created and saved, you can import it from the Mapping Designer.
Alternatively: you cannot create or import a flat file definition into the Warehouse Designer directly; instead you must analyze the file in the Source Analyzer and then drag it into the Warehouse Designer. When you drag the flat file source definition into the Warehouse Designer workspace, the Warehouse Designer creates a relational target definition, not a file definition. If you want to load to a file, configure the session to write to a flat file; when the Informatica server runs the session, it creates and loads the flat file.

What are the tasks that the Load Manager process will do?

Manages the session and batch scheduling: when you start the Informatica server, the Load Manager launches and queries the repository for a list of sessions configured to run on the Informatica server. When you configure the session, the Load Manager maintains a list of sessions and session start times. When you start a session, the Load Manager fetches the session information from the repository to perform the validations and verifications prior to starting the DTM process.
Locking and reading the session: when the Informatica server starts a session, the Load Manager locks the session from the repository; locking prevents you from starting the session again and again.
Reading the parameter file: if the session uses a parameter file, the Load Manager reads the parameter file and verifies that the session-level parameters are declared in the file.
Verifying permissions and privileges: when the session starts, the Load Manager checks whether or not the user has the privileges to run the session.
Creating log files: the Load Manager creates the log file containing the status of the session.

How do you transfer the data from the data warehouse to a flat file?
You can write a mapping with a flat file as the target using a DUMMY_CONNECTION. A flat file target is built by pulling a source into the target space using the Warehouse Designer tool.

Difference between the Informatica repository server and the Informatica server
Informatica Repository Server: it manages connections to the repository from client applications. Informatica Server: it extracts the source data, performs the data transformation, and loads the transformed data into the target.

Router transformation

A Router transformation is similar to a Filter transformation because both transformations allow you to use a condition to test data. A Filter transformation tests data for one condition and drops the rows of data that do not meet the condition. However, a Router transformation tests data for one or more conditions and gives you the option to route rows of data that do not meet any of the conditions to a default output group.

What are the 2 modes of data movement in the Informatica Server?
The data movement mode depends on whether the Informatica Server should process single-byte or multi-byte character data. This mode selection can affect the enforcement of code page relationships and code page validation in the Informatica Client and Server. a) Unicode - IS allows 2 bytes for each character and uses an additional byte for each non-ASCII character (such as Japanese characters). b) ASCII - IS holds all data in a single byte. The IS data movement mode can be changed in the Informatica Server configuration parameters; this comes into effect once you restart the Informatica Server.

How do you read rejected or bad data from the bad file and reload it to the target?
Correct the rejected data and send it to the target relational tables using the load-order utility. Find out the rejected data by using the column indicator and row indicator.

Explain the Informatica architecture in detail.
The Informatica server connects to the source data and target data using native ODBC drivers; it also connects to the repository for running sessions and retrieving metadata information.
source ------> informatica server ---------> target

What is the Load Manager?
While running a workflow, the PowerCenter Server uses the Load Manager process and the Data Transformation Manager (DTM) process to run the workflow and carry out workflow tasks. When the PowerCenter Server runs a workflow, the Load Manager performs the following tasks: 1. Locks the workflow and reads workflow properties. 2. Reads the parameter file and expands workflow variables. 3. Creates the workflow log file. 4. Runs workflow tasks. 5. Distributes sessions to worker servers. 6. Starts the DTM to run sessions. 7. Runs sessions from master servers. 8. Sends post-session email if the DTM terminates abnormally.
When the PowerCenter Server runs a session, the DTM performs the following tasks: 1. Fetches session and mapping metadata from the repository. 2. Creates and expands session variables. 3. Creates the session log file. 4. Validates session code pages if data code page validation is enabled; checks query conversions if data code page validation is disabled. 5. Verifies connection object permissions. 6. Runs pre-session shell commands. 7. Runs pre-session stored procedures and SQL. 8. Creates and runs mapping, reader, writer, and transformation threads to extract, transform, and load data. 9. Runs post-session stored procedures and SQL. 10. Runs post-session shell commands. 11. Sends post-session email.

How can we partition a session in Informatica?
The Informatica PowerCenter Partitioning option optimizes parallel processing on multiprocessor hardware by providing a thread-based architecture and built-in data partitioning. GUI-based tools reduce the development effort necessary to create data partitions and streamline ongoing troubleshooting and performance tuning tasks, while ensuring data integrity throughout the execution process. As the amount of data within an organization expands and real-time demand for information grows, the PowerCenter Partitioning option enables hardware and applications to provide outstanding performance and jointly scale to handle large volumes of data and users.

What is data cleansing?

Data cleansing is the process of finding and removing or correcting data that is incorrect, incomplete, out-of-date, redundant, or formatted incorrectly. This is nothing but polishing of the data. For example, one of the subsystems stores the gender as M and F, while another may store it as MALE and FEMALE; all the details should be maintained in one table, so we need to polish this data and clean it before it is added to the data warehouse. Another typical example is addresses: all the subsystems may maintain the customer address differently, so we might need an address cleansing tool to have the customer addresses in a clean and neat form.

What is a transformation?
It is a repository object that generates, modifies or passes data; a transformation is a repository object that passes data to the next stage (i.e. to the next transformation or target) with or without modifying the data.

What are the active and passive transformations?
Transformations can be active or passive. An active transformation can change the number of rows that pass through it, such as a Filter transformation that removes rows that do not meet the filter condition. A passive transformation does not change the number of rows that pass through it, such as an Expression transformation that performs a calculation on data and passes all rows through the transformation.

What are reusable transformations?
Reusable transformations can be used in multiple mappings. When you need to incorporate this transformation into a mapping, you add an instance of it to the mapping. Since the instance of a reusable transformation is a pointer to that transformation, if you later change the definition of the transformation in the Transformation Developer, all instances of it inherit the changes; its instances automatically reflect these changes. This feature can save you a great deal of work. If you change the properties of a reusable transformation in a mapping, you can revert to the original reusable transformation properties by clicking the Revert button.

What are the methods for creating reusable transformations?
Two methods: 1. Design it in the Transformation Developer. 2. Promote a standard transformation from the Mapping Designer: after you add a transformation to a mapping, you can promote it to the status of a reusable transformation. Once you promote a standard transformation to reusable status, you cannot demote it back to a standard transformation.

Which transformation do you need while using COBOL sources as source definitions?
The Normalizer transformation, which is used to normalize the data, since COBOL sources often consist of denormalized data; it provides support for mainframe source data.

Which files are used as COBOL source definitions?
COBOL copybook files.

How many ways can you update a relational source definition and what are they?
Two ways: 1. Edit the definition. 2. Reimport the definition.

Where should you place the flat file to import the flat file definition into the Designer?
There is no such restriction on where to place the source file; it doesn't mean we should not place it in any other folder. From a performance point of view it is better to place the file in the server's local src folder; if you need the path, please check the server properties available in the Workflow Manager. If we place it in the server src folder, by default src will be selected at session creation time.

What is a mapplet?
For example, suppose we have several fact tables that require a series of dimension keys; then we can create a mapplet which contains a series of Lookup transformations to find each dimension key, and use it in each fact table mapping instead of creating the same lookup logic in each mapping.

What are the unsupported repository objects for a mapplet?

If the Informatica Server requires more space. • Mappings. Target definitions that are configured as cubes and dimensions.5 style Look Up functions XML source definitions IBM MQ source definitions• Source definitions. • Multi-dimensional metadata.objects for a mapplet?COBOL source definition Joiner transformations Normalizer transformations Non reusable sequence generator transformations. A workflow is a set of instructions that describes how and when to run tasks related to extracting. • Mapplets. and loading data. the Informatica Server creates index and data caches in memory to process the transformation. it stores overflow values in cache files. We can use mapping parameters or variables in any transformation of the same maping or mapplet in which U have created maping parameters or variables. Pre or post session stored procedures Target defintions Power mart 3.A mapping parameter retains the same value throughout the entire session. use a sorter before the aggregator 2.Because reusable tranformation is not contained with any maplet or maping. views.Then define the value of parameter in a parameter file for the session. Each session corresponds to a single mapping.If the informatica server requires more space. What r the diffrence between joiner transformation and source qualifier transformation?U can join hetrogenious data sources in joiner transformation which we can not achieve in source qualifier transformation.it stores overflow values in cache files. When you run a workflow that uses an Aggregator transformation. Two relational sources should come from same datasource in sourcequalifier. These are the instructions that the Informatica Server uses to transform and move data.Can U use the maping parameters or variables created in one maping into another maping?NO.U declare and use the parameter in a maping or maplet. • Sessions and workflows. U need matching keys to join two relational sources in source qualifier transformation. When u use the maping parameter . • Reusable transformations.Can u use the maping parameters or variables created in one maping into any other reusable transformation?Yes. Transformations that you can use in multiple mappings. A set of transformations that you can use in multiple mappings. How can U improve session performance in aggregator transformation? use sorted input: 1. A session is a type of task that you can put in a workflow. the key order is also very important What is aggregate cache in aggregator transforamtion?The aggregator stores data in the aggregate cache until it completes aggregate calculations. Definitions of database objects or files that contain the target data. Unlike a mapping parameter. Sessions and workflows store information about how and when the Informatica Server moves data.U can join relatinal sources .The informatica server saves the value of maping variable to the repository at the end of session run and uses that value next time U run the session. • Target definitions.the informatica server creates index and data caches in memory to process the transformation.Where as u doesn’t need matching keys to join two sources.a maping variable represents a value that can change throughout the session.When u run a session that uses an aggregator transformation. synonyms) or files that provide source data. Definitions of database objects (tables. transforming.What r the mapping paramaters and maping variables?Maping parameter represents a constant value that U can define before running a session. 
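A small, hedged illustration of the mapping-parameter usage described above (the parameter, table and column names are invented for this sketch): after declaring a mapping parameter such as $$START_DATE in the mapping, you could reference it in a Source Qualifier source filter or a Filter transformation condition, for example:

ORDERS.ORDER_DATE >= TO_DATE('$$START_DATE', 'MM/DD/YYYY')

The value of $$START_DATE is then supplied from the parameter file when the session runs, so the same mapping can be executed for different date ranges without editing it.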
A set of source and target definitions along with transformations containing business logic that you build into the transformation. donot forget to check the option on the aggregator that tell the aggregator that the input is sorted on the same keys as group by.
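A minimal sketch of the sorted-input idea above, assuming a relational source and purely illustrative table and column names: the Source Qualifier can pre-sort the rows on exactly the columns the Aggregator groups by, for example with a SQL override such as:

SELECT   store_id, product_id, sale_amount
FROM     sales
ORDER BY store_id, product_id   -- same columns, same order as the Aggregator group-by ports

With the rows arriving already grouped, and the Sorted Input option checked on the Aggregator, the server can perform the aggregate calculations as it reads each group instead of caching the entire input first.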

In which conditions can we not use a Joiner transformation (limitations of the Joiner transformation)?
You cannot use a Joiner transformation when any of the following is true: both input pipelines begin with the same original data source; both input pipelines originate from the same Source Qualifier transformation; both input pipelines originate from the same Normalizer transformation; both input pipelines originate from the same Joiner transformation; either input pipeline contains an Update Strategy transformation; either input pipeline contains a connected or unconnected Sequence Generator transformation. Keep in mind that you cannot use a Sequence Generator or Update Strategy transformation as a source to a Joiner transformation.

What are the settings that you use to configure the Joiner transformation?
• Master and detail source • Type of join • Condition of the join

What are the join types in the Joiner transformation?
The Joiner transformation supports the following join types, which you set in the Properties tab: • Normal (Default) • Master Outer • Detail Outer • Full Outer. Normal (default) -- only matching rows from both master and detail. Master outer -- all detail rows and only matching rows from master. Detail outer -- all master rows and only matching rows from detail. Full outer -- all rows from both master and detail (matching or non-matching). (A SQL analogy of these join types follows the creation steps below.)

To create a Joiner transformation:
1. In the Mapping Designer, choose Transformation-Create. Select the Joiner transformation, enter a name, and click OK. The naming convention for Joiner transformations is JNR_TransformationName. Enter a description for the transformation; this description appears in the Repository Manager, making it easier for you or others to understand or remember what the transformation does. The Designer creates the Joiner transformation.
2. Drag all the desired input/output ports from the first source into the Joiner transformation. The Designer creates input/output ports for these source fields in the Joiner as detail fields by default. You can edit this property later.
3. Select and drag all the desired input/output ports from the second source into the Joiner transformation. The Designer configures the second set of source fields as master fields by default.
4. Double-click the title bar of the Joiner transformation to open the Edit Transformations dialog box.
5. Select the Ports tab.
6. Click any box in the M column to switch the master/detail relationship for the sources. Change the master/detail relationship if necessary by selecting the master source in the M column. Tip: designating the source with fewer unique records as master increases performance during a join.
7. Add default values for specific ports as necessary. Certain ports are likely to contain NULL values, since the fields in one of the sources may be empty. You can specify a default value if the target database does not handle NULLs.
8. Select the Condition tab and set the condition.
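A rough SQL analogy of the four join types listed above (only an analogy, since a Joiner usually combines heterogeneous sources inside the mapping rather than in the database; the table names are invented):

-- join condition: master.dept_id = detail.dept_id
SELECT d.*, m.*
FROM   detail_src d
JOIN   master_src m ON m.dept_id = d.dept_id;              -- Normal: matching rows only

SELECT d.*, m.*
FROM   detail_src d
LEFT OUTER JOIN master_src m ON m.dept_id = d.dept_id;     -- Master Outer: keeps all detail rows

SELECT d.*, m.*
FROM   detail_src d
RIGHT OUTER JOIN master_src m ON m.dept_id = d.dept_id;    -- Detail Outer: keeps all master rows

SELECT d.*, m.*
FROM   detail_src d
FULL OUTER JOIN master_src m ON m.dept_id = d.dept_id;     -- Full Outer: keeps all rows from both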

What r the types of lookup? 1. 5.It compares the lookup transformation port values to lookup table column values based on the look up condition. Choose Repository-Save to save changes to the mapping.The informatica server stores condition values in the index cache and output values in the data cache. Click OK. After building the caches. 2. U can use a dynamic or static cache Cache includes all lookup columns used in the maping Support user defined default values U can use a static cache. but not the calculated value (such as net sales). What r the joiner caches?When a Joiner transformation occurs in a session. 1. 11. 12. Does not support user defiend default values What is meant by lookup caches?The informatica server builds a cache in memory when it processes the first row af a data in a cached look up transformation. Get a related value. Click the Add button to add a condition. 3.Why use the lookup transformation ?To perform the following tasks. Select the Properties tab and enter any additional settings for the transformations. Many normalized tables include values used in a calculation.view.It allocates memory for the cache based on the amount u configure in the transformation or session properties. the Informatica Server reads all the records from the master source and builds index and data caches based on the master rows. if your source table includes employee ID. You can add multiple conditions. Connected lookup Unconnected lookup Persistent cache Re-cache from database Static cache Dynamic cache Shared cache Differences between connected and unconnected lookup? Connected lookup Unconnected lookup Receives input values diectly from the Receives input values from the result of a lkp expression in pipe line. Perform a calculation.What r the types of lookup caches?Persistent cache: U can save the lookup cache files and reuse them the next time the informatica server processes a lookup transformation configured to use the cache. .synonym. 2.9. Cache includes all lookup out put ports in the lookup condition and the lookup/return port. Update slowly changing dimension tables. You can use a Lookup transformation to determine whether records already exist in the target. a another transformation. 4. such as gross sales per invoice or sales tax. Informatica server queries the look up table based on the lookup ports in the transformation. For example. The Joiner transformation only supports equivalent (=) joins: 10. but you want to include the employee name in your target table to make your summary data easier to read. The master and detail ports must have matching datatypes. the Joiner transformation reads records from the detail source and perform joinswhat is the look up transformation?Use lookup transformation in u’r mapping to lookup data in a relational table.
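A small sketch of how an unconnected lookup of the kind described above is called (the lookup and port names are invented for illustration): in an Expression transformation, the expression for an output port such as EMP_NAME could be

:LKP.LKP_GET_EMP_NAME(EMP_ID)

The unconnected Lookup receives EMP_ID through this expression, applies the lookup condition, and passes back a single value through its return port, whereas a connected Lookup would instead receive its input ports directly from the pipeline and can return multiple columns.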

U can share unnamed cache between transformations in the same maping. Two types of output groups User defined groups .If the input row out-ranks a stored row. Static cache: U can configure a static or readonly cache for only lookup table.u can create a look up transformation to use dynamic cache. U can configure the lookup transformation to rebuild the lookup cache.the informatica server does not update the cache while it prosesses the lookup transformation.How the informatica server sorts the string values in Ranktransformation?When the informatica server runs in the ASCII data movement mode it sorts session data using Binary sortorder. use a Router Transformation in a mapping instead of creating multiple Filter transformations to perform the same task. Dynamic cache U can insert rows into the cache as u pass to the target The informatica server inserts rows into cache when the condition is false.If U configure the seeion to use a binary sort order. a Filter transformation tests data for one condition and drops the rows of data that do not meet the condition. When U drag the COBOL source in to the mapping Designer workspace.creating input and output ports for every column in the source.The informatica server stores group information in an index cache and row data in a data cache. if you create a Rank transformation that ranks the top 5 salespersons for each quarter. U can pass these rows to the target table Which transformation should we use to normalize the COBOL and relational sources?Normalizer Transformation.when the lookup condition is true.the informatica server replaces the stored row with the input row. If you need to test the same input data based on multiple conditions. However.the normalizer transformation automatically appears. Dynamic cache: If u want to cache the target table and insert new rows into cache and the target.What r the rank caches?During the session .It caches the lookup table and lookup values in the cache for each row that comes into the transformation. informatica server returns the default value for connected transformations and null for unconnected transformations. When the condition is not true.What r the types of groups in Router transformation?Input group Output group The designer copies property information from the input ports of the input group to create a set of output ports for each output group.Recache from database: If the persistent cache is not synchronized with he lookup table.the informatica server compares an inout row with rows in the datacache.What is the Rankindex in Ranktransformation?The Designer automatically creates a RANKINDEX port for each Rank transformation. the rank index numbers the salespeople from 1 to 5:What is the Router transformation?A Router transformation is similar to a Filter transformation because both transformations allow you to use a condition to test data.Difference between static cache and dynamic cache Static cache U can not insert or update the cache The informatica server returns a value from the lookup table or cache when the condition is true. For example.the informatica server caluculates the binary value of each string and returns the specified number of rows with the higest binary values for the string. A Router transformation tests data for one or more conditions and gives you the option to route rows of data that do not meet any of the conditions to a default output group.By default informatica server creates a static cache. This indicates that the row is not in the cache or target table. 
The Informatica Server uses the Rank Index port to store the ranking position for each record in a group. Shared cache: U can share the lookup cache between multiple transactions.The informatica server dynamically inerts data to the target table.
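As a hedged analogy for the Rank example above (top 5 salespersons for each quarter), a similar result could be expressed with a SQL window function; the table and column names are illustrative, and the Rank transformation itself does this work inside the mapping, not in the database:

SELECT *
FROM (
    SELECT quarter,
           salesperson_id,
           sales_amount,
           RANK() OVER (PARTITION BY quarter ORDER BY sales_amount DESC) AS rankindex
    FROM   quarterly_sales
) ranked
WHERE rankindex <= 5;   -- comparable to the RANKINDEX port with Top, Number of Ranks = 5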

Database administrators create stored procedures to automate time-consuming tasks that are too complicated for standard SQL statements What r the types of data that passes between informatica server and stored procedure?3 types of data Input/Out put parameters Return Values Status code.Default group U can not modify or delete default groups. In PowerCenter and PowerMart. the Informatica Server adds a WHERE clause to the default query. When you configure a session.The stored procedure issues a status code that notifies whether or not stored procedure completed sucessfully.This value can not seen by the user. The Joiner transformation supports the following join types. If you include a user-defined join.If u have the multiple source qualifiers connected to the multiple targets. the Informatica Server replaces the join information specified by the metadata in the SQL query. What is the default join that source qualifier provides?Inner equi join. • Filter records when the Informatica Server reads source data. • Create a custom query to issue a special SELECT statement for the Informatica Server to read source data.What is the status code?Status code provides error handling for the informatica server during the session. • Specify sorted ports. What is source qualifier transformation? What r the tasks that source qualifier performs? When you add a relational or a flat file source definition to a mapping. you need to connect it to a Source Qualifier transformation. If you choose Select Distinct. the Informatica Server adds a SELECT DISTINCT statement to the default SQL query.U can designatethe order in which informatica server loads data into the targets. A target load order group is the collection of source qualifiers. The Source Qualifier represents the rows that the Informatica Server reads when it executes a session. What is the target load order?U specify the target loadorder based on source qualifiers in a maping. what is update strategy transformation ? The model you choose constitutes your update strategy. Two sources should have matching data types. which you set in the Properties tab: • • • • Normal (Default) Master Outer Detail Outer Full Outer What r the basic needs to join two sources in a source qualifier?Two sources should have primary and Foreign key relation ships. You can join two or more tables with primary-foreign key relationships by linking the sources to one Source Qualifier. • Select only distinct values from the source.It only used by the informatica server to determine whether to continue running the session or stop. transformations. • Specify an outer join rather than the default inner join. how to handle changes to existing rows. and targets linked together in a mapping. you can instruct the Informatica Server to . For example.Why we use stored procedure transformation? A Stored Procedure transformation is an important tool for populating and maintaining databases. If you include a filter condition. you set your update strategy at two different levels: • Within a session. If you specify a number for sorted ports. the Informatica Server adds an ORDER BY clause to the default SQL query. • Join data originating from the same source database. you might use a custom query to perform aggregate calculations or execute a stored procedure.

Update else Insert: This option enables informatica to flag the records either for update if they are old or insert. you can instruct the Informatica Server to either treat all records in the same way (for example. In other words. Within a mapping.either treat all rows in the same way (for example. treat all rows as inserts). Within a mapping. In the Type 1 Dimension mapping.What is the default source option for update stratgey transformation?Data driven. update. Use the Type 1 Dimension mapping to update a slowly changing dimension table when you do not need to keep any previous versions of dimensions in the table. delete. delete. delete or reject. Use this mapping when you want to drop all existing data from your table before loading new data. or reject. if they are new records from source. or use instructions coded into the session mapping to flag records for different database operations. treat all records as inserts). Type 2: The Type 2 Dimension Data mapping inserts both new and changed dimensions into the target. Slowly Growing target : Loads a slowly growing fact or dimension table by inserting new rows. If u do not choose data driven option setting. update. • Within a mapping. update. Describe two levels in which update strategy transformation sets?Within a session. or reject. you use the Update Strategy transformation to flag records for insert.What r the options in the target session of update strategy transsformatioin?Insert Delete Update Update as update Update as insert Update esle insert Truncate table Update as Insert: This option specified all the update records from source to be flagged as inserts in the target.What is Datadriven?The informatica server follows instructions coded into update strategy transformations with in the session maping determine how to flag records for insert. or use instructions coded into the session mapping to flag rows for different database operations. When you configure a session. you use the Update Strategy transformation to flag rows for insert. Changes are tracked in the target table by versioning the primary key and creating a version number for .What r the mapings that we use for slowly changing dimension table? Type1: Rows containing changes to existing dimensions are updated in the target by overwriting the existing dimension. Use this mapping to load new data when existing data does not require updates. What r the types of maping wizards that r to be provided in Informatica?Simple Pass through Slowly Growing Target Slowly Changing the Dimension Type1 Most recent values Type2Full History Version Flag Date Type3 Current and one previous What r the types of maping in Getting Started Wizard?Simple Pass through maping : Loads a static fact or dimension table by inserting all rows. Within a mapping.the informatica server ignores all update strategy transformations in the mapping. instead of updating the records in the target they are inserted as new records. all rows contain current dimension data.

Define maping and sessions? Maping: It is a set of source and target definitions linked by transformation objects that define the rules for transformation. When updating an existing dimension.what are the meta data of source U import?Source name Database location Column names .What r the new features of the server manager in the informatica 5.This maping also inserts both new and changed dimensions in to the target. you could not make reports from here. and handle pre. and transform data.Explained in previous question. but you can generate metadata report.and mapping parameters and maping variables.And changes r tracked by the effective date range for each version of each dimension.While importing the relational source defintion from database. Creates threads to initialize the session. Flag indiactes the dimension is new or newlyupdated. With a meta data reporter.Which tool U use to create and manage sessions and batches and to monitor and stop the informatica server?Informatica server manager. Rows containing changes to existing dimensions are updated in the target. Type2 Dimension/Flag current Maping: This maping is also used for slowly changing dimensions.0?U can use command line arguments for a session or batch.and post-session operations.u can access information about U’r repository with out having knowledge of sql. that is not going to be used for business analysis What is metadata reporter?It is a web based application that enables you to run reports againist repository metadata. Process session data using threads: Informatica server runs the session in two processes.If we use the informatica server on a SMP system. Version numbers and versioned primary keys track the order of changes to each dimension. And updated dimensions r saved with the value 0.How can u recognise whether or not the newly added rows in the source r gets insert in the target ?In the Type2 maping we have three options to recognise the newly added rows Version number Flagvalue Effective date RangeWhat r two types of processes that informatica runs the session? Load manager Process: Starts the session. and sends post-session email when the session completes. Type 3: The Type 3 Dimension mapping filters source rows based on user-defined comparisons and inserts only those found to be new dimensions to the target.transformation language or underlying tables in the repository.U can use multiple CPU’s to process a session concurently.Can u generate reports in Informatcia? It is a ETL tool. read.each dimension in the table.what is polling?It displays the updated information about the session in the monitor window. The monitor window displays the status of each session when U poll the informatica server. Session : It is a set of instructions that describe how and when to move data from source to targets. Type2 Dimension/Effective Date Range Maping: This is also one flavour of Type2 maping used for slowly changing dimensions.In addition it creates a flag value for changed or new dimension. Use the Type 2 Dimension/Version Data mapping to update a slowly changing dimension table when you want to keep a full history of dimension data in the table.Recent dimensions will gets saved with cuurent flag value 1. The DTM process. 
the Informatica Server saves existing data in different columns of the same row and replaces the existing data with the updatesWhat r the different types of Type2 dimension maping?Type2 Dimension/Version Data Maping: In this maping the updated dimension in the source will gets inserted in target along with a new version number. creates the DTM process. Parallel data processing: This feature is available for powercenter only.This allows U to change the values of session parameters. write.And newly added dimension in source will inserted into target with a primary key.
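Referring back to the Type 2 mappings described above, here is a minimal sketch of what the flag-current flavour ends up doing in the target. The dimension table, key values and flag column are invented for illustration; in the mapping itself the rows are flagged by Update Strategy expressions (DD_UPDATE for the retired version, DD_INSERT for the new one):

-- retire the existing version of the changed dimension row
UPDATE dim_customer
SET    current_flag = 0
WHERE  customer_id = 1001
AND    current_flag = 1;

-- insert the changed data as a new version, flagged as current
INSERT INTO dim_customer (customer_key, customer_id, address, current_flag)
VALUES (50234, 1001, '12 New Street', 1);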

U can choose to merge the targets. and load for each partition in parallel.log).informatica server directly communicates the repository to check whether or not the session and users r valid. Transformation thread: It will be created to tranform data.Why we use partitioning the session in informatica? Partitioning achieves the session performance by reducing the time period of reading the source and loading the data into target. Reader thread: One thread will be created for each partition of a source.informatica server reads multiple files concurently.All the metadata of sessions and mappings will be stored in repository.it creates the DTM process.What r the different threads in DTM process?Master thread: Creates and manages all other threads Maping thread: One maping thread will be creates for each session.Datatypes Key constraints What r the designer tools for creating tranformations?Mapping designer Tansformation developer Mapplet designerHow many ways u create ports?Two ways 1. transformation. What is DTM process?After the loadmanger performs validations for session.These files will be created in informatica home directory.It writes information about session into log files such as initialization process. Install the informatica server on a machine with multiple CPU’s.I creates the master thread.Click the add buttion on the ports tab.It also creates an error log for error messages. To achieve the session partition what r the necessary tasks u have to do?Configure the session to partition source data.What r the data movement modes in informatcia?Datamovement modes determines how informatcia server handles the charector data.How the informatica server increases the session performance through partitioning the source?For a relational sources informatica server creates multiple connections for each parttion of a single source and extracts seperate range of data for each connection. ASCII mode Uni code mode.Fectchs session and maping information.Master thread creates and manges all the other threads.It reads data from source.Why u use repository connectivity?When u edit.server.Two types of datamovement modes avialable in informatica.schedule the sesion each time. Writer thread: It will be created to load data to the target.Informatica server reads multiple partitions of a single source concurently.Drag the port from another transforamtion 2. Informatica server can achieve high performance by partitioning the pipleline and performing the extract .Similarly for loading also informatica server creates multiple connections to the target and loads partitions of data concurently.creation of sql commands for reader and writer . Performance can be improved by processing data in parallel in a single session by creating multiple partitions of the pipeline. Session log file: Informatica server creates session log file for each session.For loading the data informatica server creates a seperate file for each partition(of a source file). Pre and post session threads: This will be created to perform pre and post session operations.What r the out put files that the informatica server creates during the session running?Informatica server log: Informatica server(on unix) creates a log for all status and error messages(default name: pm. For XML and file sources.DTM is to create and manage the threads that carry out the session tasks.U choose the datamovement in the informatica server configuration settings.

Batches r two types Sequential: Runs sessions one after the other Concurrent: Runs session at same time. Reject file: This file contains the rows of data that the writer does notwrite to targets. How many number of sessions that u can create in a batch?Any number of sessions.What r the different options used to configure the sequential batches?Two options .update.the informatica server creates the target file based on file prpoerties entered in the session property sheet.The amount of detail in session log file depends on the tracing level that u set.Session detail include information such as table name. Indicator file: If u use the flat file as a target.To genarate this file select the performance detail option in the session property sheet. Post session email: Post session email allows U to automatically communicate information about a session run to designated recipents.But that target folder or repository should consists of mapping of that session.If u have several independent sessions u can use concurrent batches.delete or reject. Control file: Informatica server creates control file and a target file when U run a session that uses the external loader. output file: If session writes to a target file.the indicator file contains a number to indicate whether the row was marked for insert.When the informatica server marks that a batch is failed?If one of session is configured to "run if previous completes" and that previous session failsWhat is a command that used to run a batch?pmcmd is used to start a batch. If target folder or repository is not having the maping of copying session .U can configure the informatica server to create indicator file.For each target row.targets and session to the target folder.threads. u should have to copy that maping first before u copy the sessionIn addition. Session detail file: This file contains load statistics for each targets in mapping. If u have sessions with source-target dependencies u have to go for sequential batch to start the sessions one after another. Aggreagtor transformation Joiner transformation Rank transformation Lookup transformationIn which circumstances that informatica server creates Reject files?When it encounters the DD_Reject in update strategy transformation. Cache files: When the informatica server creates memory cache it also creates cache files. you can copy the workflow from the Repository manager. associated source. By using copy session wizard u can copy a session in a different folder or repository.The control file contains the information about the target flat file such as data format and loading instructios for the external loader.Can u copy the session to a different folder or repository?Yes. This will automatically copy the mapping.U can view this file by double clicking on the session in monitor window Performance detail file: This file contains information known as session performance details which helps U where performance can be improved.One if the session completed sucessfully the other if the session fails.For the following circumstances informatica server creates index and datacache files.U can create two different messages. Whch runs all the sessions at the same time.What is batch and describe about types of batches?Grouping of session is known as batch.number of rows written or rejected.errors encountered and load summary. Violates database constraint Filed in the rows was truncated or overflowed.

enclose the file name in double quotes: -paramfile ”$PMRootDir\my file. Database connections Source file names: use this parameter when u want to change the name or location of session source file between session runs Target file name : Use this parameter when u want to change the name or location of session target file between session runs. the parameter file name cannot have beginning or trailing spaces.in case of concurrent batch we cant do like this.Run the session only if previous session completes sucessfully.Can u start a batches with in a batch?U can not. If u want to start batch that resides in a batch. U can define the following values in parameter file Maping parameters Maping variables session parameters For Windows command prompt users.txt' How can u access the remote source into U’r session?Relational source: To acess relational source which is situated in a remote place .represent values U might want to change between sessions such as database connections or source files. Hetrogenous : When U’r maping contains more than one source type. FileSource : To access the remote source file U must configure the FTP connection to the host machine before u create the session. Reject file name : Use this parameter when u want to change the name or location of session reject files between session runs. Server manager also allows U to create userdefined session parameters.If u partition a session with a file target the informatica server creates one target file for each partition.U can configure session properties to merge these target fileswhat r the transformations that restricts the partitioning of sessions?Advanced External procedure tranformation and External procedure transformation: This transformation contains a check box on the properties tab to allow partitioning.How can u stop a batch?By using server manager or pmcmd. use the backslash (\) with the dollar sign ($). Aggregator Transformation: If u use sorted ports u can not parttion the assosiated source .By setting the option always runs the session. Always runs the session.What r the session parameters?Session parameters r like maping parameters. If the name includes spaces.create a new independent batch and copy the necessary sessions into the new batch.What is parameter file?Parameter file is to define the values for parameters and variables used in a session.Following r user defined session parameters.In a sequential batch can u run the session if previous session fails?Yes.txt” Note: When you write a pmcmd command that includes a parameter file located on another machine. This ensures that the machine where the variable is defined expands the server variable.u need to configure database connection to the datasource.What is difference between partioning of relatonal target and partitioning of file targets?If u parttion a session with a relational target informatica server creates multiple connections to the target database to write target data concurently.the server manager creates a hetrogenous session that displays source options for all types.Can u start a session inside a batch idividually?We can start our required session only in case of sequential batch. pmcmd startworkflow -uv USERNAME -pv PASSWORD -s SALES:6258 -f east -w wSalesAvg -paramfile '$PMRootDir/myfile.A parameter file is a file created by text editor such as word pad or notepad.

So aviod netwrok connections.Moving target database into server system may improve session performance. single table select statements with an ORDER BY or GROUP BY clause may benefit from optimization such as adding indexes.To improve the session performance in this case drop constraints and indexes before u run the session and rebuild them after completion of session. Data generally moves across a network at less than 1 MB per second.Joiner Transformation : U can not partition the master source for a joiner transformation Normalizer Transformation XML targets. . move those files to the machine that consists of informatica server.targets and informatica server to improve session performance.Distibuting the session load to multiple informatica servers may improve session performance. Aviod transformation errors to improve the session performance. Partittionig the session improves the session performance by creating multiple connections to sources and targets and loads data in paralel pipe lines.Increase the session performance by following.So concurent batches may also increase the session performance. Also.To do this go to server manger .which allows data to cross the network at one time. If the sessioin containd lookup transformation u can improve the session performance by enabling the look up cache. Thus network connections ofteny affect on session performance. If u r target consists key constraints and indexes u slow the loading of data.u can use incremental aggregation to improve session performance. Staging areas: If u use staging areas u force informatica server to perform multiple datapasses.Unicode mode takes 2 bytes to store a character. If a session joins multiple source tables in one Source Qualifier. Flat files: If u’r flat files stored on a machine other than the informatca server. Relational datasources: Minimize the connections to sources . In some cases if a session contains a aggregator transformation . whereas a local disk moves data five to twenty times faster.create that filter transformation nearer to the sources or u can use filter condition in source qualifier.Because ASCII datamovement mode stores a character value in one byte. optimizing the query may improve performance. The performance of the Informatica Server is related to network connections. Running a parallel sessions by using concurrent batches will also reduce the time of loading the data.Performance tuning in Informatica?The goal of performance tuning is optimize session performance so sessions run during the available load window for the Informatica Server.choose server configure database connections. Removing of staging areas may improve session performance. We can improve the session performance by configuring the network packet size. U can run the multiple informatica servers againist the same repository. Run the informatica server in ASCII datamovement mode improves the session performance. If U’r session contains filter transformation .
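A small illustration of the last point (purely a sketch; the table, column and index names are made up): if the generated source query groups or orders on a column, adding an index on that column in the source database may help, e.g.

CREATE INDEX idx_sales_store_id ON sales (store_id);

SELECT   store_id, SUM(sale_amount)
FROM     sales
GROUP BY store_id;

Whether the index actually helps depends on the database optimizer and the data volumes, so it is worth checking the query plan before and after adding it.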

Metadata can include information such as mappings describing how to transform source data. We cant use COBOL source qualifier.) The centralized repository in a domain.Unlike the variables that r created in a reusable transformation can be usefull in any other maping or maplet. We can not include source definitions in reusable transformations. • Reusable transformations. The global repository can contain common objects to be shared throughout the domain through global shortcuts. unrelated and unconnected to other repositories. and loading data. Each session corresponds to a single mappingWhat is power center repository?The PowerCenter repository allows you to share metadata across repositories to create a data mart domain.But it is transparent in case of reusable transformation.Thsea tables stores metadata in specific format the informatica server. used by the Informatica Server and Client tools.• Standalone repository. transforming. • Global repository.Because they must group data before processing it. a group of connected repositories. A session is a type of task that you can put in a workflow. • Sessions and workflows. A set of transformations that you can use in multiple mappings.To improve session performance in this case use sorted ports option.Define informatica repository?The Informatica repository is a relational database that stores information.What is difference between maplet and reusable transformation?Maplet consists of set of transformations that is reusable.But we can add sources to a maplet. Whole transformation logic will be hided in case of maplet. Transformations that you can use in multiple mappings. A repository that functions individually.The Repository Manager connects to the repository database and runs the code needed to create the repository tables. you can create a single global repository to store metadata used across an enterprise. • Local repository. In a data mart domain. • Target definitions. If u create a variables or parameters in maplet that can not be used in another maping or maplet.Where as we can make them as a reusable transformations. and product version.Rank and joiner transformation may oftenly decrease the session performance . • Mappings. Definitions of database objects or files that contain the target data. sessions indicating when you want the Informatica Server to perform the transformations.client tools use.What r the types of metadata that stores in repository?Following r the types of metadata that stores in the repository Database connections Global objects Mappings Mapplets Multidimensional metadata Reusable transformations Sessions and batches Short cuts Source definitions Target defintions Transformations• Source definitions.A reusable transformation is a single transformation that can be reusable. or metadata. Target definitions that are configured as cubes and dimensions.joiner. synonyms) or files that provide source data. • Mapplets. Use repository manager to create the repository.Aggreagator. The repository also stores administrative information such as usernames and passwords. • Multi-dimensional metadata. A workflow is a set of instructions that describes how and when to run tasks related to extracting. These are the instructions that the Informatica Server uses to transform and move data. views. (PowerCenter only. Definitions of database objects (tables. Sessions and workflows store information about how and when the Informatica Server moves data.normalizer transformations in maplet. 
A set of source and target definitions along with transformations containing business logic that you build into the transformation. and connect strings for sources and targets. and a number of local repositories to share the global metadata as needed. permissions and privileges. Each domain can contain one global repository. (PowerCenter .

you apply captured changes in the source to aggregate calculations in a session.Explain about perform recovery?When the Informatica Server starts a recovery session.Where as in external procedure transformation procedure or function will be executed out side of data source.only. You can work with remote.How can u load the records from 10001 th record when u run the session next time?As explained above informatcia server has 3 methods to recovering the sessions.Explain about Recovering sessions?If you stop a session or if an error causes a session to stop. session. · Truncate the target tables and run the session again if the session is not recoverable.Use performing recovery to load the records from where the session fails. Each local repository in the domain can connect to the global repository and use objects in its shared folders. rather than forcing it to process the entire source and recalculate the same calculations each time you run the session.) A repository within a domain that is not the global repository. Types of tracing level Normal Verbose Verbose init Verbose dataWhat is difference between stored procedure transformation and external procedure transformation?In case of storedprocedure transformation procedure will be compiled and executed in a relational data source.What is tracing level and what r the types of tracing level?Tracing level represents the amount of information that informatcia server writes in a log file. Customized repeat: Informatica server runs the session at the dats and times secified in the repeat dialog box. refer to the session and error logs to determine the cause of failure.U need data base connection to import the stored procedure in to u’r maping.No need to have data base connection in case of external procedure transformation.How can u work with remote database in informatica?did u work directly by using remote connections?To work with remote datasource u need to connect it with remote connections.Instead u bring that source into U r local machine where informatica server resides.000 records in to the target. you can configure the session to process only those changes.or u can manually run the session. · Consider performing recovery if the Informatica Server has issued at least one commit. This allows the Informatica Server to update your target incrementally. If the source changes only incrementally and you can capture changes. The method you use to complete the session depends on the properties of the mapping. Use one of the following methods to complete the session: · Run the session again if the Informatica Server has not issued a commit. it reads the OPB_SRVR_RECOVERY table and notes the row ID of the last row committed to . But you have to Configure FTP Connection details IP address User authentication what is incremantal aggregation?When using incremental aggregation.If u work directly with remote source the session performance will decreases by passing less amount of data across the network in a particular time.But it is not preferable to work with that remote source directly by using remote connections . and then complete the session.What r the scheduling options to run a sesion?U can shedule a session to run at a given time or intervel.Ie u need to make it as a DLL to access in u r maping. and Informatica Server configuration. If a session fails after loading of 10. Run every: Informatica server runs the session at regular intervels as u configured. Correct the errors. 
Different options of scheduling Run only on demand: server runs the session only when user starts session explicitly Run once: Informatica server runs the session only once at a specified date and time.

Drag the copied session outside the batch to be a standalone session. 2. 3. if the Informatica Server commits 10.Follow the steps to recover a standalone session. If the sources or targets changes after initial session fails. If a standalone session fails. 2. and click OK. if a session in a concurrent batch fails and the rest of the sessions complete successfully. If you do not configure a session in a sequential batch to stop on failure. select Perform Recovery.Delete the standalone copy. The Informatica Server then reads all sources again and starts processing from the next row ID.If i done any modifications for my table in back end does it reflect in informatca warehouse or maping desginer or source analyzer?NO.It displays u all the information that is to be stored in repository.After the batch completes.Run the session. To recover a session in a concurrent batch: 1.000 and starts loading with row 10. Informatica is not at all concern with back end data base. and the remaining sessions in the batch complete. 2.the target database. To recover sessions using the menu: 1. you can recover the session as a standalone session. you can run recovery starting with the failed session. when you run recovery. start recovery. You must enable Recovery in the Informatica Server setup before you run a session so the Informatica Server can create and/or write entries in the OPB_SRVR_RECOVERY table.001. when a session does not complete.From the command line.What r the circumstances that infromatica server results an unreciverable session?The source qualifier transformation does not use sorted ports. If u change the partition information after the initial session fails.000 rows before the session fails. stop the session. Run the session from the beginning when the Informatica Server cannot run recovery or when running recovery might result in inconsistent data. the next time you run the session. open the session property sheet. If the maping consists of sequence generator or normalizer transformation. With the failed session highlighted. you need to truncate the target tables and run the session from the beginning. the Informatica Server attempts to recover the previous session. 4. To recover sessions using pmcmd: 1. highlight the session you want to recover. select Server Requests-Start Session in Recovery Mode from the menu. 4. . you might want to truncate all targets and run the batch again. 2. Perform Recovery is disabled in the Informatica Server setup. and click OK.On the Log Files tab. you can run recovery using a menu command or pmcmd. However. Perform recovery is disabled in the informatica server configuration. The Informatica Server completes the session and then runs the rest of the batch.How can u complete unrcoverable sessions?Under certain circumstances. For example. If a concuurent batche contains multiple failed sessions. 3.How to recover the standalone session?A standalone session is a session that is not nested in a batch.How can u recover the session in sequential batches?If you configure a session in a sequential batch to stop on failure. From the command line. 5.If want to reflect back end changes to informatica screens.Copy the failed session using Operations-Copy Session. By default.Clear Perform Recovery. open the session property sheet. recover the failed session as a standalone session. In the Server Manager. These options are not available for batched sessions.In the Server Manager. 
Use the Perform Recovery session property To recover sessions in sequential batches configured to stop on failure: 1. Select Server Requests-Stop from the menu. the Informatica Server bypasses the rows up to 10. 3. How to recover sessions in concurrent batches?If multiple sessions in a concurrent batch fail. If you do not clear Perform Recovery.

can u map these three ports directly to target?NO. then there will not be any data loss. Delete.tc) for each of those groups. MEDIAN. time. It represents all data queried from the source.again u have to import from back end to informatica by valid connection. There are no target options for ERP target type Target Options for Relational are Insert. One code page can be a subset or superset of another. For accurate data movement. If you don't use join means not only diffrent sources but homegeous sources are show same error. it also contains additional characters not contained in the other code page. What are Target Types on the Server?Target Types are File. XML and ERP What are Target Options on the Servers?Target Options for File Target type are FTP File.informix) to a single source qualifier. AVG. Update (else Insert). stddev. The Source Qualifier represents the rows that the Informatica Server reads when it executes a session. this particular transform is a connected/active transform which can take the incoming data form the mapping pipeline and group them based on the group by ports specified and can calculated aggregate funtions like ( avg. Update (as Update). COUNT. what is a source qualifier? It is a transformation which represents the data Informatica server reads from source.And u have to replace the existing files with imported files. Geography. count. which use that dimension. Relational.After draging the ports of three sources(sql server. MAX. Loader and MQ. customer and . Egs. PERCENTILE. What is Code Page Compatibility?Compatibility between code pages is used for accurate data movement when the Informatica Sever runs in the Unicode data movement mode.Unless and until u join those three ports in source qualifier u cannot map them directly if u drag three hetrogenous sources and populated to target without any join means you are entertaining Carteisn product. MIN. LAST. and VARIANCE. Superset . FIRST. You can use a Connected Lookup with dynamic cache on the target What are Aggregate transformation? Aggregator transform is much like the Group by clause in traditional SQL.. sum. What are Dimensions and various types of Dimensions? set of level properties that describe a specific aspect of a business. If the code pages are identical. If you are not interested to use joins at source qualifier level u can add some joins sepratly. If you are importing Japanese data into mapping. Subset .A code page is a subset of another code page when all characters in the code page are encoded in the other code page... What are various types of Aggregation? Various types of aggregation are SUM. Update (as Insert). the target code page must be a superset of the source code page. STDDEV. used for analyzing the factual measures of one or more cubes. From a performanace perspective if your mapping has an AGGREGATOR transform use filters and sorters very early in the pipeline if there is any need for them.A code page is a superset of another code page when it contains the character encoded in the other code page. u must select the Japanese code page of source data. What is Code Page used for? Code Page is used to identify characters that might be in different languages. and Truncate Table How do you identify existing rows of data in the target table using lookup transformation? Can identify existing rows of data using unconnected lookup transformation.e.oracle.

Main thread of the DTM process. This is also known as buffer memory. or redefine them. edit.We can use unconnected lookup transformation to determine whether the records already exist in the target or not. Why we use lookup transformations?Lookup Transformations can access data from relational tables that are not sources in mapping.Mapping thread . With Lookup transformation. the DTM creates a set of threads for each partition to allow concurrent processing. The primary purpose of the DTM process is to create and manage threads that carry out the session tasks. You can create. The master thread creates and manages all other threads. It creates the main thread. What is Session and Batches?Session .One Thread to Each Session. You can also change the values of user-defined extensions.reader thread-One Thread for Each Partition for Each Source Pipeline. you can store your contact information with the mapping.Operational Data Store. it creates the DTM process..WRITER THREAD-One Thread for Each Partition if target exist in the source pipeline write to the target. You associate information with repository metadata using metadata extensions. Update slowly changing dimension tables . but you cannot create. User-defined.tRANSFORMATION THREAD .Batches .Run Session At The Same Time.product. After creating the session. Fetches Session and Mapping Information. There Are Two Types Of Batches : Sequential . Following are the types of threads that DTM creates: Master thread . which is called the master thread. and view user-defined metadata extensions. · If we partition a session.One or More Transformation Thread For Each Partition. For example. what is ODS (operation data source) ANS1: ODS . What is Data Transformation Manager?After the load manager performs validations for the session. we can use either the server manager or the command line program pmcmd to start or stop the session.Run Session One after the Other.A Session Is A set of instructions that tells the Informatica Server How And When To Move Data From Sources To Targets.It Provides A Way to Group Sessions For Either Serial Or Parallel Execution By The Informatica Server. You can view and change the values of vendor-defined metadata extensions. · The DTM allocates process memory for the session and divide it into buffers. when you create a mapping. Third-party application vendors create vendor-defined metadata extensions. Informatica Client applications can contain the following types of metadata extensions: • • Vendor-defined. delete.concurrent . When Informatica server writes messages to the session log it includes thread type and thread ID. . The DTM process is the second process associated with the session run. You create user-defined metadata extensions using PowerCenter/PowerMart.Pre and Post Session Thread-One Thread each to Perform Pre and Post Session Operations. Creates and manages all other threads. delete. ETL Questions and Answers what is the metadata extension? Informatica allows end users and partners to extend the metadata stored in the repository by associating information with individual objects in the repository. we can accomplish the following tasks: Get a related value-Get the Employee Name from Employee table based on the Employee IDPerform Calculation.
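A rough SQL analogy of the "get a related value" lookup task listed above (names are illustrative; in the mapping this is done by a Lookup transformation, not by a database join):

SELECT s.emp_id,
       s.sale_amount,
       e.emp_name                                  -- related value fetched by the lookup
FROM   src_sales s
LEFT OUTER JOIN emp e ON e.emp_id = s.emp_id;      -- unmatched rows get NULL, like the lookup default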

worklet. emails and sessions. unconnected lookup You cannot lookup from a source qualifier directly.txt as [foldername_session_name] ABC='hello world" In the session properties u can give in the parameter file name field abc.and has minimal history retained can we lookup a table from source qualifier transformation. What are the different Lookup methods used in Informatica? In the lookup transormation mainly 2 types 1)connected 2)unconnected lookup Connected lookup: 1)It recive the value directly from pipeline 2)it iwill use both dynamic and static 3)it return multiple value 4)it support userdefined value Unconnected lookup:it recives the value :lkp expression 2)it will be use only dynamic 3)it return only single value 4)it does not support user defined values What are parameter files ? Where do we use them? Parameter file is any text file where u can define a value for the parameter defined in the informatica session.a group of transformations that can be called within a mapping. ie. not snapshots. Once data was poopulated in ODS aggregated data will be loaded into into EDW through ODS. Workflow .a workflow that can be called within a workflow. Session . Workflow .ODS Comes between staging area & Data Warehouse. you can override the SQL in the source qualifier to join with the lookup table to perform the lookup.controls the execution of tasks such as commands. Session . Mapping . The data is ODS will be at the low level of granularity. this parameter file can be referenced in the session properties.a task associated with a mapping to define the connections and other configurations for that mapping.Contains live data. emails and sessions.represents the flow and transformation of data from source to taraget.a workflow that can be called within a workflow. ANS2: An updatable set of integrated operational data used for enterprise.When the informatica sessions runs the values for the parameter is fetched from the specified file.a group of transformations that can be called within a mapping.txt What is a mapping. Worklet .wide tactical decision making. Mapplet .controls the execution of tasks such as commands. Worklet . What is the difference between Power Center & Power Mart? Power Mart is designed for: Low range of warehouses . workflow. mapplet? Mapping .a task associated with a mapping to define the connections and other configurations for that mapping. However. For eg : $$ABC is defined in the infomatica mapping and the value for this variable is defined in the file called abc. session.represents the flow and transformation of data from source to taraget. Mapplet .

Horizontal partitioning. Incremental . Server Manager 4. Power Mart Designer 2. Repository . 2. there are further 2 techniques:Refresh load . Vertical partitioning (reduces efficiency in the context of a data warehouse). Cognos Business Objects What are snapshots? What are materialized views & where do we use them? What is a materialized view log? Materialized view is a view in wich data is also stored in some temp table.Where delta or difference between target and source data is dumped at regular intervals. reduces work involved with addition of new data. 2. What is Full load & Incremental or Refresh load? Full Load is the entire data dump load taking place the very first time. What are the various tools? .But In materialized View data is stored in some temp tables.Where the existing data is truncated and reloaded completely. OLAp tools are as follows. Informatica Datastage Business Objects Data Integrator Abinitio. Can Informatica load heterogeneous targets from heterogeneous sources? yes! it loads from heterogeneous sources.Name a few The various ETL tools are as follows. independently-manageable components because it: 1.i. Server 3. Timestamp for previous delta load has to be maintained. Gradually to synchronize the target data with source data.e if we will go with the View concept in DB in that we only store query and once we call View it extract data from DB. What is partitioning? What are the types of partitioning? Partitioning is a part of physical data warehouse design that is carried out to improve performance and simplify stored-data management. What are the modules in Power Mart? 1. reduces work involved with purging of old data. Two types of partitioning are: 1. Power mart is designed for: High-end warehouses Global as well as local repositories ERP support.only for local repositories mainly desktop environment. Partitioning is done to break up a large table into smaller..
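As a rough illustration of the horizontal partitioning mentioned above, here is a minimal Oracle range-partitioning sketch; the sales_fact table, its columns and the partition boundaries are hypothetical and not taken from the original answer:

CREATE TABLE sales_fact (
    sale_id   NUMBER,
    sale_date DATE,
    amount    NUMBER(10,2)
)
PARTITION BY RANGE (sale_date) (
    PARTITION sales_2011 VALUES LESS THAN (TO_DATE('01-JAN-2012','DD-MON-YYYY')),
    PARTITION sales_2012 VALUES LESS THAN (TO_DATE('01-JAN-2013','DD-MON-YYYY')),
    PARTITION sales_max  VALUES LESS THAN (MAXVALUE)
);
-- Purging old data then reduces to dropping a partition instead of deleting rows:
ALTER TABLE sales_fact DROP PARTITION sales_2011;

This is one way partitioning reduces the work involved with purging old data and adding new data, as described above.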

bottom tier .middle tier . Repository Manager What is a staging area? Do we need it? What is the purpose of a staging area? Staging area is place where you hold temporary tables on data warehouse server. 2. Staging tables are connected to work area or fact tables. We basically need staging area to hold the data . and perform data cleansing and merging . The 3 tiers are: 1. it might mess up the OLTP because of format changes between a warehouse and OLTP. If we attempt to load data directly from OLTP. Keeping the OLTP data intact is very important for both the OLTP and the warehouse. ROLAP server 2. Order Invoiced Stat).g. before loading the data into warehouse A staging area is like a large table with data separated from their sources to be loaded into a data warehouse in the required format. the tables that are to be extracted from various sources. 1.consists of the database 2. When addressing a table some dimension key must reflect the need for a record to get extracted. Application tier .consists of the analytical server 3.5. Do Flat mapping i. Middle tier contains two types of servers. How to determine what records to extract? Data modeler will provide the ETL developer. Foolproof would be adding an archive flag to record which gets reset when record changes What are the various transformation available? Aggregator Transformation Expression Transformation Filter Transformation Joiner Transformation Lookup Transformation Normalizer Transformation Rank Transformation Router Transformation Sequence Generator Transformation Stored Procedure Transformation Sorter Transformation Update Strategy Transformation XML Source Qualifier Transformation Advanced External Procedure Transformation External Transformation What is a three tier data warehouse? Three tier data warehouse contains three tier such as bottom tier. Mostly it will be from time dimension (e.tier that interacts with the end-user . Staging area is a temp schema used to 1.e dumping all the OLTP data in to it without applying any business rules pushing data into staging will take less time because there is no business rules or transformation applied on it. Used for data cleansing and validation using First Logic.g. date >= 1st of current mth) or a transaction flag (e. MOLAP server Top tier deals with presentation or visualization of the results . Presentation tier . middle tier and top tier. Bottom tier deals with retrieving related data’s or information from various information repositories by using SQL. Data tier .

data profiling. What are the various methods of getting incremental records or delta records from the source systems getting incremental records from source systems to target can be done by using incremental aggregation transformation Techniques of Error Handling . This helps you to extract the data from different ODS/Database. Normally ETL Tool stands for Extraction Transformation Loader 2. transform and load the data into Data Warehouse for decision making. These row indicators or of four types D-valid data. transformation. On top of it. This task was tedious and cumbersome in many cases since it involved many resources. Before the evolution of ETL Tools. maintaining the code placed a great challenge among the programmers. we can check why a record has been rejected and this bad file contains first column a row indicator and second column a column indicator. These difficulties are eliminated by ETL Tools since they are very powerful and they offer many advantages in all stages of ETL process starting from extraction.Truncated data. the above mentioned ETL process was done manually by using SQL code created by programmers.Ignore . Als they can be used in source qualifier filter.Do we need an ETL tool? When do we go for the tools in the market? ETL Tools are meant to extract. 3. N-null data. Rejecting bad records to a flat file . we can use it in any expression in a mapping or a mapplet. Both COM and Inforrmatica Procedures are supported using External procedure Transformation . If you have a requirement like this you need to get the ETL tools. 1. complex coding and more work hours. data cleansing. user defined joins or extract overrides and in expression editor of reusable transformations. debugging and loading into data warehouse when compared to the old method. loading the records and reviewing them (default values) Rejection of records either at the database due to constraint key violation or the informatica server when writing data into target table These rejected records we can find in the bad file folder where a reject file will be created for a session. T. else you no need any ETL How can we use mapping variables in Informatica? Where do we use them? After creating a variable. Their values can change automatically between sessions. O-overflowed data. And depending on these indicators we can changes to load data successfully to target. Can we use procedural logic inside Inforrmatica If yes how if now how can we use external procedural logic in Inforrmatica? We can use External Procedure Transformation to use external procedures.

so that it will be useful and easy to perform transformations during the ETL process. such as an Expression transformation that performs a calculation on data and passes all rows through the transformation Active transformations Advanced External Procedure Aggregator Application Source Qualifier Filter Joiner Normalizer Rank Router Update Strategy Passive transformation Expression External Procedure Maplet.or post-session shell command for a Session task. Transformations can be active or passive. You can use a Command task anywhere in the workflow or worklet to run shell commands. in the following ways: 1.Input Lookup Sequence generator XML Source Qualifier Maplet . 2. The Informatica Metadata is stored in Informatica repository What are active transformation / Passive transformations? An active transformation can change the number of rows as output after a transformation. Standalone Command task. An active transformation can change the number of rows that pass through it. the transformations. For more information about specifying pre-session and post-session shell commands What is Informatica Metadata and where is it stored? Informatica Metadata contains all the information about the source tables. A passive transformation does not change the number of rows that pass through it. while a passive transformation does not change the number of rows and passes through the same number of rows that was given to it as input. target tables. such as a Filter transformation that removes rows that do not meet the filter condition.Output . You can call a Command task as the pre.1 How do we call shell scripts from Inforrmatica? You can use a Command task to call the shell scripts.Can we override a native sql query within Informatica? Where do we do it? How do we do it? we can override a sql query in the sql override property of a source qualifier What is latest version of Power Center / Power Mart? Power Center 7.and post-session shell command. Pre.

When do we analyze the tables? How do we do it?
When the data in the data warehouse changes frequently we need to analyze the tables. Analyzing the tables computes/updates the table statistics, which helps boost the performance of your SQL.
Compare ETL tools & manual development?
There are pros and cons of both tool-based ETL and hand-coded ETL. Tool-based ETL provides maintainability, ease of development and a graphical view of the flow; it also reduces the learning curve for the team. Hand-coded ETL is good when there is minimal transformation logic involved and when the sources and targets are in the same environment. However, depending on the skill level of the team, hand coding can extend the overall development time.
REFRESH CLAUSE
[refresh [fast | complete | force] [on demand | commit] [start with date] [next date] [with {primary key | rowid}]]
The refresh option specifies:
a. the refresh method used by Oracle to refresh data in the materialized view;
b. whether the view is primary key based or rowid based;
c. the time and interval at which the view is to be refreshed.
Refresh Method - FAST clause
Fast refreshes use materialized view logs to send the rows that have changed in the master tables to the materialized view. You should create a materialized view log on the master tables if you specify the REFRESH FAST clause:
SQL> CREATE MATERIALIZED VIEW LOG ON emp;
Materialized view log created.
Materialized views are not eligible for fast refresh if the defining subquery contains an analytic function.
Refresh Method - COMPLETE clause
The complete refresh re-creates the entire materialized view. If you request a complete refresh, Oracle performs a complete refresh even if a fast refresh is possible.
Refresh Method - FORCE clause
When you specify a FORCE clause, Oracle performs a fast refresh if one is possible or a complete refresh otherwise. If you do not specify a refresh method (FAST, COMPLETE, or FORCE), FORCE is the default.
PRIMARY KEY and ROWID clause
WITH PRIMARY KEY is used to create a primary key materialized view, i.e. the materialized view is based on the primary key of the master table instead of ROWID (for the ROWID clause). PRIMARY KEY is the default option. To use the PRIMARY KEY clause you should have defined a PRIMARY KEY on the master table; otherwise you should use ROWID based materialized views.
Primary key materialized views
The following statement creates a primary-key materialized view on the table emp located on a remote database:
SQL> CREATE MATERIALIZED VIEW mv_emp_pk REFRESH FAST START WITH SYSDATE NEXT SYSDATE + 1/48 WITH PRIMARY KEY AS SELECT * FROM emp@remote_db;
Materialized view created.
Note: when you create a materialized view using the FAST option you will need to create a view log on the master table(s) as shown above (CREATE MATERIALIZED VIEW LOG ON emp).
Rowid materialized views
The following statement creates a rowid materialized view on table emp located on a remote database:
SQL> CREATE MATERIALIZED VIEW mv_emp_rowid REFRESH WITH ROWID AS SELECT * FROM emp@remote_db;
Materialized view created.
Subquery materialized views
The following statement creates a subquery materialized view based on the emp and dept tables located on the remote database:
SQL> CREATE MATERIALIZED VIEW mv_empdept AS SELECT * FROM emp@remote_db e WHERE EXISTS (SELECT * FROM dept@remote_db d WHERE e.dept_no = d.dept_no);
Materialized view created.
Primary key materialized views allow materialized view master tables to be reorganized without affecting the eligibility of the materialized view for fast refresh. A sketch of invoking these refreshes manually follows below.
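To tie the refresh methods above together, here is a small sketch of triggering a refresh manually with Oracle's DBMS_MVIEW package; it assumes the mv_emp_pk view created above and an on-demand style of refresh:

SQL> EXEC DBMS_MVIEW.REFRESH('mv_emp_pk', 'F');   -- 'F' = fast refresh
SQL> EXEC DBMS_MVIEW.REFRESH('mv_emp_pk', 'C');   -- 'C' = complete refresh
SQL> EXEC DBMS_MVIEW.REFRESH('mv_emp_pk', '?');   -- '?' = force: fast if possible, else complete

These method codes mirror the FAST, COMPLETE and FORCE clauses described above.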

Rowid materialized views should have a single master table and cannot contain any of the following:
- DISTINCT or aggregate functions
- GROUP BY
- Subqueries
- Joins and set operations
Timing the refresh
The START WITH clause tells the database when to perform the first replication from the master table to the local base table. It should evaluate to a future point in time. The NEXT clause specifies the interval between refreshes:
SQL> CREATE MATERIALIZED VIEW mv_emp_pk REFRESH FAST START WITH SYSDATE NEXT SYSDATE + 2 WITH PRIMARY KEY AS SELECT * FROM emp@remote_db;
Materialized view created.
In the above example, the first copy of the materialized view is made at SYSDATE and the refresh is then performed every two days.

Informatica Interview Questions - Part 15

What are active and passive transformations?
Transformations can be active or passive. An active transformation can change the number of rows that pass through it, such as a Filter transformation that removes rows that do not meet the filter condition. A passive transformation does not change the number of rows that pass through it, such as an Expression transformation that performs a calculation on data and passes all rows through the transformation.
What is tracing level and what are the types of tracing levels?
Tracing level represents the amount of information that the Informatica server writes to a log file. Types of tracing level: Normal, Verbose, Verbose init, Verbose data.

Informatica Interview Questions - Part 14

What are the different threads in the DTM process?
Master thread: creates and manages all other threads.
Mapping thread: one mapping thread is created for each session; it fetches session and mapping information.
Reader thread: one thread is created for each partition of a source; it reads data from the source.
Transformation thread: created to transform the data.
Writer thread: created to load the data into the target.
Pre and post session threads: created to perform pre- and post-session operations.
How can you say that the Union transformation is an active transformation?
By definition, an active transformation is a transformation that changes the number of rows that pass through it. In a Union transformation the number of rows resulting from the union can be different from the actual number of input rows, so it is active.
If we are using an Update Strategy transformation in a mapping, how can we know whether the insert, update, reject or delete option has been applied while the session runs?
In the Designer, while creating the Update Strategy transformation, uncheck "forward to next transformation". Any rejected rows are automatically written to the session log file. Updates and inserts are known only by checking the target file or table.
Is a fact table normalized or de-normalized?
A fact table is always a denormalized table. It consists of the primary keys of the dimension tables: the fact table holds foreign keys and measures (a sample layout follows below).
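As a rough sketch of the de-normalized fact table described above (foreign keys to dimension tables plus measures); the table and the dim_date/dim_product dimensions are hypothetical:

CREATE TABLE sales_fact (
    date_key     NUMBER REFERENCES dim_date(date_key),       -- foreign key to a dimension
    product_key  NUMBER REFERENCES dim_product(product_key), -- foreign key to a dimension
    sales_qty    NUMBER,                                      -- measure
    sales_amount NUMBER(12,2)                                 -- measure
);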

Which is better among incremental load, normal load and bulk load?
It depends on the requirement. Otherwise, incremental load can be better, as it takes only the data which is not already available in the target.
What is the difference between a summary filter and a detail filter?
A summary filter can be applied to a group of rows that contain a common value, whereas a detail filter can be applied to each and every record of the database.
How do you join two tables without using the Joiner transformation?
It is possible to join two or more tables by using the source qualifier, provided the tables have a relationship. When you drag and drop the tables you get a source qualifier for each table. Delete all the source qualifiers and add a single common source qualifier for all of them. Right-click the source qualifier, choose Edit, click the Properties tab, and in the SQL Query attribute write your own SQL (a sample override follows below).
What are the tasks that the Load Manager process will do?
Manages session and batch scheduling: when you start the Informatica server, the load manager launches and queries the repository for a list of sessions configured to run on that server. When you configure a session, the load manager maintains the list of sessions and session start times.
Locking and reading the session: when the Informatica server starts a session, the load manager locks the session in the repository; locking prevents the session from being started again and again. When you start a session, the load manager fetches the session information from the repository to perform validations and verifications prior to starting the DTM process.
Reading the parameter file: if the session uses a parameter file, the load manager reads the parameter file and verifies that the session-level parameters are declared in the file.
Verifies permissions and privileges: when the session starts, the load manager checks whether or not the user has the privileges to run the session.
Creating log files: the load manager creates a log file containing the status of the session.
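Referring to the source qualifier override described above, a minimal sample of the SQL you might type into the SQL Query attribute; the table and column names are hypothetical, and the select list should match the order of the ports connected out of the source qualifier:

SELECT e.emp_id, e.emp_name, d.dept_name
FROM   emp e, dept d
WHERE  e.dept_id = d.dept_id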

Informatica Interview Questions - Part 13

What is a Router transformation?
A Router transformation allows you to use a condition to test data. It is similar to a Filter transformation, but it allows the testing to be done on one or more conditions.
What type of metadata is stored in the repository?
Source definitions: definitions of database objects (tables, views, synonyms) or files that provide source data.
Target definitions: definitions of database objects or files that contain the target data.
Multi-dimensional metadata: target definitions that are configured as cubes and dimensions.
Mappings: a set of source and target definitions along with the transformations containing the business logic that you build into the transformation. These are the instructions that the Informatica Server uses to transform and move data.
Reusable transformations: transformations that you can use in multiple mappings.
Mapplets: a set of transformations that you can use in multiple mappings.
Sessions and workflows: sessions and workflows store information about how and when the Informatica Server moves data. A workflow is a set of instructions that describes how and when to run tasks related to extracting, transforming and loading data. A session is a type of task that you can put in a workflow; each session corresponds to a single mapping.
How do you delete duplicate rows in a flat file source?
Use a Sorter transformation; it has a "distinct" option, so make use of it.
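The Sorter's distinct option above covers flat file sources; for a relational source, one common alternative (not stated in the original answer) is a DISTINCT in the source qualifier's SQL override, for example with a hypothetical customers table:

SELECT DISTINCT cust_id, cust_name, city
FROM   customers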

What are reusable transformations? You can design using two methods: 1. Semi additive Dimensions table contain textual description of data. once you perform the update strategy. It allocates memory for the cache based on the amount you configure in the transformation or session properties. say you had flagged some rows to be deleted and you had performed aggregator transformation for all rows. Can you use the mapping parameters or variables created in one mapping into any other reusable transformation? Yes. Non additive 3.Part 12 What is meant by lookup cache? The informatica server builds a cache in memory when it processes the first row at a data in a cached look up transformation. Because reusable transformation is not contained with any maplet or mapping. 0 comments Email This BlogThis! Share to Twitter Share to Facebook Share to Google Buzz Informatica Interview Questions .Can you use aggregator/active transformation after update strategy transformation? You can use aggregator after update strategy. say you are using SUM function. The informatica server stores condition values in the index cache and output values in the data cache. What is the difference between dimension table and fact table and what are different dimension tables and fact tables? Fact table contain measurable data. If you are . using transformation developer 2. The problem will be. Additive 2. then the deleted rows will be subtracted from this aggregator transformation. It contains primary key. contains primary key Different types of fact tables: 1. Create normal one and promote it to reusable What is Code Page used for? Code Page is used to identify characters that might be in different languages.

or you can manually run the session. Can you use a session Bulk loading options and during this time can you make a recovery to the session? If the session is configured to use in bulk mode it will not write recovery information to recovery tables. 3) Cache includes all lookup output ports in the lookup condition and the lookup/return port. Unconnected lookup: 1) Receives input values from the result of a lkp expression in a another transformation.importing Japanese data into mapping. 4) Does not support user defined default values. Run once: Informatica server runs the session only once at a specified date and time. A parameter file is a file created by text editor such as word pad or notepad. You can define the following values in parameter file: . 2) you can use a dynamic or static cache. you must select the Japanese code page of source data. 2) You can use a static cache.Part 11 What are the scheduling options to run a session? A session can be scheduled to run at a given time or intervel. What is parameter file? Parameter file is to define the values for parameters and variables used in a session. So Bulk loading will not perform the recovery as required. 0 comments Email This BlogThis! Share to Twitter Share to Facebook Share to Google Buzz Informatica Interview Questions . 4) Support user defined default values. 3) Cache includes all lookup columns used in the mapping. Different options of scheduling: Run only on demand: server runs the session only when user starts session explicitly. What are the differences between connected and unconnected lookup? Connected lookup: 1) Receives input values directly from the pipe line. Customized repeat: Informatica server runs the session at the dates and times specified in the repeat dialog box. Run every: Informatica server runs the session at regular intervals as u configured.

and sends post-session email when the session completes. that represent values you might want to change between sessions such as database connections or source files.Mapping parameters mapping variables session parameters. write. . and handle pre. How can you transform row to a column? 1. creates the DTM process. and transform data. By setting the option always runs the session.and post-session operations. The DTM process: Creates threads to initialize the session. Target file name: Use this parameter when you want to change the name or location of session target file between session runs. In a sequential batch can you run the session if previous session fails? Yes.Use pivot function in oracle What are the basic needs to join two sources in a source qualifier? Basic need to join two sources using source qualifier: 1) Both sources should be in same database 2) The should have at least one column in common with same data types 0 comments Email This BlogThis! Share to Twitter Share to Facebook Share to Google Buzz Informatica Interview Questions .Part 10 What are two types of processes that informatica runs the session? Load manager Process: Starts the session. Following are user defined session parameters: Database connections Source file names: Use this parameter when you want to change the name or location of session source file between session runs. Reject file name: Use this parameter when you want to change the name or location of session reject files between session runs. What are the session parameters? Session parameters are like mapping parameters. We can use normalizer transformation or 2. Server manager also allows you to create user defined session parameters. read.
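For the row-to-column question above, a minimal sketch of the Oracle PIVOT option (available from Oracle 11g onwards); the quarterly_sales table and its columns are hypothetical:

SELECT *
FROM  (SELECT emp_id, quarter, sales FROM quarterly_sales)
PIVOT (SUM(sales) FOR quarter IN ('Q1' AS q1, 'Q2' AS q2, 'Q3' AS q3, 'Q4' AS q4));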

In the Informatica it is a transformation that uses same stored procedures which are stored in the database. if you don't want to use the stored procedure then you have to create expression transformation and do all the coding in it. What is the method of loading 5 flat files of having same structure to a single target and which transformations I can use? Two Methods. What is the default join that source qualifier provides? Inner equi join. So we use mapping parameters and variables and define the values in a parameter file. This makes the process simple. Stored procedures are used to automate timeconsuming tasks that are too complicated for standard SQL statements. If we need to change the parameter value then we needs to edit the parameter file. it will be very difficult to edit the mapping and then change the attribute. But value of mapping variables can be changed by using variable function. Write all files in one directory then use file repository concept (don’t forget to type source file type as indirect in the session). . Violates database constraint Field in the rows was truncated or overflown. Use union transformation to combine multiple input files into a single target. In which circumstances that informatica server creates Reject files? When it encounters the DD_Reject in update strategy transformation. 2. Then we could edit the parameter file to change the attribute values. And those are stored and compiled at the server side. What is the difference between Stored Procedure (DB level) and Stored proc trans (INFORMATICA level) ? Why should we use SP trans ? First of all stored procedures (at DB level) are series of SQL statement. In a mapping parameter we need to manually edit the attribute value in the parameter file after every session run.What are mapping parameters and variables in which situation we can use it ? If we need to change certain attributes of a mapping after every time the session is run. If we need to increment the attribute value by 1 after every session run then we can use mapping variables. Mapping parameter values remain constant. 1.

Informatica Interview Questions - Part 9

What are variable ports? List two situations when they can be used.
We mainly have three kinds of ports: input, output and variable. An input port represents data flowing into the transformation. An output port is used when data is mapped to the next transformation. A variable port is used when mathematical calculations are required; in the scenario below it is also used to hold the previous row's value.
This is a scenario in which the source has two columns:
10 A
10 A
20 C
30 D
40 E
20 C
There should be two targets: one (T1) to show the duplicate rows and another (T2) for the distinct rows. Which transformations can be used to load the data into the targets?
Step 1: sort the source data based on the unique key.
Step 2: in an Expression transformation, compare each row with the previous one using a variable port:
Flag = iif(col1 = prev_col1, 'Y', 'N')
prev_col1 = col1
Step 3: in a Router transformation, create two groups:
1. for duplicate records: condition Flag = 'Y'
2. for distinct records: condition Flag = 'N'
What are the types of lookup caches?
1) Static cache 2) Dynamic cache 3) Persistent cache 4) Reusable cache

We get this error while using too large tables. 3) If we have mappings loading multiple target tables we have to provide the Target Load Plan in the sequence we want them to get loaded. As the amount of data within an organization expands and real-time demand for information grows. Informatica Interview Questions .The database passwords (production) is changed in a periodic manner and the same is not updated at the Informatica side. 5) We might get some poor performance issues while reading from large tables. In update strategy target table or flat file which gives more performance? Why? Pros: Loading. in the session properties you have to select Treat Source Rows: Data Driven. 4) Error: Snapshot too old is a very common error when using Oracle tables. while ensuring data integrity throughout the execution process. All the source tables should be indexed and updated regularly.Part 8 Is sorter an active or passive transformation? What happens if we uncheck the distinct option in sorter? Will it be under active or passive transformation? Sorter is an active transformation. Because this distinct option eliminates the duplicate records from the table. Your mappings will fail in this case and you will get database connectivity error. the Power Center Partitioning option enables hardware and applications to provide outstanding performance and jointly scale to handle large volumes of data and users.5) Shared Cache What are the real times problems that generally come up while doing/running mapping/any transformation? Explain with an example? Here are few real time examples of problems while running informatica mappings: 1) Informatica uses OBDC connections to connect to the databases. If we do not select this Informatica server will ignore updates and it only inserts rows. Ideally we should schedule these loads when server is not very busy (meaning when no other loads are running). How can we partition a session in Informatica? Partitioning option optimizes parallel processing on multiprocessor hardware by providing a thread-based architecture and built-in data partitioning. Merging operations will be faster as there is no index concept and Data . if you don't check the distinct option it is considered as a passive transformation. GUI-based tools reduce the development effort necessary to create data partitions and streamline ongoing troubleshooting and performance tuning tasks. 2) If you are using Update strategy transformation in the mapping. Sorting.

while lookups speed will be lesser. enclose the file name in double quotes: -paramfile ?$PMRootDirmy file. For UNIX shell users.txt' Informatica interview questions . use the backslash () with the dollar sign ($). Cons: There is no concept of updating existing records in flat file. the parameter file name cannot have beginning or trailing spaces.txt? Note: When you write a pmcmd command that includes a parameter file located on another machine. You create a set of metadata tables within the repository database that the informatica application and tools access. What is parameter file? When you start a workflow. The informatica client and server access the repository to save and retrieve . As there is no indexes.Part 7 Define informatica repository? Infromatica Repository: The informatica repository is at the center of the informatica suite. enclose the parameter file name in single quotes: -paramfile '$PMRootDir/myfile. Pmcmd startworkflow -UV USERNAME -PV PASSWORD -s SALES: 6258 -f east -w wSalesAvg -paramfile '$PMRootDir/myfile.txt' For Windows command prompt users. What is the difference between constraint base load ordering and target load plan ? Constraint based load ordering Example: Table 1---Master Take 2---Detail If the data in Table-1 is dependent on the data in Table-2 then Table-2 should be loaded first. In Informatica this feature is implemented by just one check box at the session level. This ensures that the machine where the variable is defined expands the server variable.will be in ASCII mode. If the name includes spaces. you can optionally enter the directory and name of a parameter file. In such cases to control the load order of the tables we need some conditional loading which is nothing but constraint based load. The Informatica Server runs the workflow using the parameters in the file you specify.

Each domain can contain one global repository. In situations where sorted input cannot be supplied. Connected: The flow of data through a mapping in connected mode also passes through the Stored Procedure transformation. then the table will contain only valid data. or is called by an expression in another transformation in the mapping. (Such as violation of not null constraint. updating etc. How can you improve session performance in aggregator transformation? One way is supplying the sorted input to aggregator transformation. You should use a connected Stored Procedure transformation when you need data from an input port sent as an input parameter to the stored procedure. All data entering the transformation through the input ports affects the stored procedure. a group of connected repositories. value error. Global repository: (Power Center only. insertion. The global repository can contain common objects to be shared throughout the domain through global shortcuts. It either runs before or after the session.) The centralized repository in a domain. .e.metadata. Joiner can be used to join tables from difference source systems where as Source qualifier can be used to join tables in the same database. but we still need a common key from both tables. The row indicators signify what operation is going to take place (i. What is power center repository? Standalone repository: A repository that functions individually.) If one rectifies the error in the data present in the bad file and then reloads the data in the target.bad and it contains the records rejected by informatica server.Part 6 Explain error handling in informatica with examples? There is one file called the bad file which generally has the format as *. What are the difference between joiner transformation and source qualifier transformation? Joiner Transformation can be used to join tables from heterogeneous (different sources). Informatica interview questions . We definitely need a common key to join two tables no mater they are in same database or difference databases. deletion. What is the difference between connected and unconnected stored procedures? Unconnected: The unconnected Stored Procedure transformation is not connected directly to the flow of the mapping. There are two parameters one for the types of row and other for the types of columns. overflow etc. If we join two tables without a common key we will end up in a Cartesian Join. we need to configure data cache and index cache at session/transformation level to allocate more space to support aggregation. The column indicators contain information regarding why the column has been rejected. unrelated and unconnected to other repositories.). or the results of a stored procedure sent as an output parameter to another transformation.

(Power Center only. Within a mapping. the informatica server ignores all update strategy transformations in the mapping.) A repository within a domain that is not the global repository. In Power Center and Power Mart. it will not change. Example unconnected lookup uses static cache. update. treat all rows as inserts). Unicode mode: In this mode informatica server sorts the data as per the sorted order in session.reader. connected lookup). . When you configure a session. Explain Informatica server Architecture? Informatica server.Local repository.data transfer manager. Dynamic Cache: The cache is updated as to reflect the update in the table (or source) for which it is referring to. you use the Update Strategy transformation to flag rows for insert. you can instruct the Informatica Server to either treat all rows in the same way (for example. (Ex.temp server and writer are the components of informatica server. Within a mapping. ASCII Mode: In this mode informatica server sorts the date as per the binary order. Each local repository in the domain can connect to the global repository and use objects in its shared folders. If you do not choose data driven option setting. update. delete. delete or reject. Explain difference between static and dynamic cache with one example? Static Cache: Once the data is cached. First load manager sends a request to the reader if the reader is ready to read the data from source and dump into the temp server and data transfer manager manages the load and it send the request to writer as per first in first out process and writer takes the data from temp server and loads it into the target. What is update strategy transformation? The model you choose constitutes your update strategy. What is Data driven? The informatica server follows instructions coded into update strategy transformations with in the session mapping determine how to flag records for insert. How the informatica server sorts the string values in Rank transformation? We can run informatica server either in UNICODE data moment mode or ASCII data moment mode. or reject. you set your update strategy at two different levels: Within a session. how to handle changes to existing rows. or use instructions coded into the session mapping to flag rows for different database operations. load manager/rs.

Static lookup cache adds to the session run time.There you can see how many numbers of source rows are applied and how many number of rows loaded in to targets and how many number of rows rejected. Also remember that static lookup eats up space. Qualitative testing Steps: 1. First validate the mapping 2. What are the output files that the informatica server creates during the session run What are the output files that the informatica server creates during the session run? Informatica server log: Informatica server(on Unix) creates a log for all status and . How do we do unit testing in informatica? How do we load data in informatica? Unit testing in informatica are of two types 1. If once rows are successfully loaded then we will go for qualitative testing. so remember to select only those columns which are needed. but it saves time as informatica does not need to connect to your database every time it needs to lookup. Quantitative testing 2. you go and query the database to get the lookup value for each record which needs the lookup. In dynamic lookup cache. you cache all the lookup data at the starting of the session. Depending on how many rows in your mapping needs a lookup. Steps: 1.This is what a developer will do in Unit Testing. This is called Qualitative testing.When do you use an unconnected lookup and connected lookup? Or what is the difference between dynamic and static lookup? Or Why and when do we use dynamic and static lookup? In static lookup cache. This is called Quantitative testing. If any data is not loaded according to the DATM then go and check in the code and rectify it. you can decide on this. Once the session is succeeded then right click on session and go for statistics tab.Create session on the mapping and then run workflow.Take the DATM (DATM means where all business rules are mentioned to the corresponding source columns) and check whether the data is loaded according to the DATM in to target table.

These files will be created in informatica home directory. delete or reject. It also creates an error log for error messages. the informatica server creates the target file based on file properties entered in the session property sheet. creation of sql commands for reader and writer threads. Reject file: This file contains the rows of data that the writer does not write to targets.error messages(default name: pm. One if the session completed successfully the other if the session fails. update. Session detail include information such as table name. the indicator file contains a number to indicate whether the row was marked for insert. errors encountered and load summary. To generate this file select the performance detail option in the session property sheet. The amount of detail in session log file depends on the tracing level that you set. number of rows written or rejected you can view this file by double clicking on the session in monitor window. It writes information about session into log files such as initialization process.You can create two different messages. you can configure the informatica server to create indicator file. For the following circumstances informatica server creates index and data cache . The control file contains the information about the target flat file such as data format and loading instructions for the external loader. Indicator file: If you use the flat file as a target.server. Post session email: Post session email allows you to automatically communicate information about a session run to designated recipents.log). Output file: If session writes to a target file. Cache files: When the informatica server creates memory cache it also creates cache files. Session log file: Informatica server creates session log file for each session. For each target row. Session detail file: This file contains load statistics for each target in mapping. Performance detail file: This file contains information known as session performance details which helps you where performance can be improved. Control file: Informatica server creates control file and a target file when you run a session that uses the external loader.

In the SQ associated with that source will have a data type as decimal for that number port of the source. What is the use of incremental aggregation? Explain in brief with an example? It’s a session option.files: Aggregator transformation Joiner transformation Rank transformation Lookup transformation How do you handle decimal places while importing a flat file into informatica? While importing flat file definition just specify the scale for a numeric data type. Source . whereas a local disk moves data five to twenty times faster.If you have the multiple source qualifiers connected to the multiple targets.So aviod . Thus network connections often affect on session performance.Increase the session performance by following: The performance of the Informatica Server is related to network connections. What can you do to increase performance or explain Performance tuning in Informatica? What can you do to increase performance or explain Performance tuning in Informatica? The goal of performance tuning is to optimize session performance so sessions run during the available load window for the Informatica Server. Hence decimal is taken care. it passes new source data through the mapping and uses historical chache data to perform new aggregation caluculations incrementaly.it changes the rows into columns and columns into rows Normalization: To remove the redundancy and inconsistency What is the target load order? You specify the target load order based on source qualifiers in a maping. Data generally moves across a network at less than 1 MB per second. Integer is not supported. you can designate the order in which informatica server loads data into the targets. In the mapping.Number data type port . Differences between Normalizer and Normalizer transformation? Normalizer: It is a transormation mainly used for Cobol sources. When the informatica server performs incremental aggregation.SQ . For performance we will use it.decimal datatype. the flat file source supports only number data type(no decimal and integer).

Distibuting the session load to multiple informatica servers may improve session performance. Flat files: If your flat files stored on a machine other than the informatca server. choose server configure database connections. To improve the session performance in this case drop constraints and indexes before you run the session and rebuild them after completion of session. Partitioning the session improves the session performance by creating multiple connections to . We can improve the session performance by configuring the network packet size. You can run the multiple informatica servers’ againist the same repository. single table select statements with an ORDER BY or GROUP BY clause may benefit from optimization such as adding indexes. Because ASCII datamovement mode stores a character value in one byte. Also. Run the informatica server in ASCII datamovement mode improves the session performance. Running parallel sessions by using concurrent batches will also reduce the time of loading the data. targets and informatica server to improve session performance. If a session joins multiple source tables in one Source Qualifier.Unicode mode takes 2 bytes to store a character. Relational datasources: Minimize the connections to sources. Staging areas: If you use staging areas you force informatica server to perform multiple datapasses. To do this go to server manger.Moving target database into server system may improve session performance. If your target consists key constraints and indexes you slow the loading of data.Removing of staging areas may improve session performance.netwrok connections. So concurent batches may also increase the session performance. move those files to the machine that consists of informatica server. which allows data to cross the network at one time. optimizing the query may improve performance.

Aviod transformation errors to improve the session performance. where the schema is inclined slightly towards normalization. What is the difference between view and materialized view? .Because they must group data before processing it. In some cases if a session contains an aggregator transformation. What is snow flake scheme design in database? Snow flake schema is one of the designs that are present in database design.sources and targets and loads data in paralel pipe lines. create that filter transformation nearer to the sources or you can use filter condition in source qualifier. then the snow flake design is utilized. where as hierarchies are split into different tables in snow flake schema. If the dimensional table is split into many tables. If your session contains filter transformation. whereas snow flake schemas have one or more parent tables. A star schema has one fact table and is associated with numerous dimensions table and depicts a star. you can use incremental aggregation to improve session performance. The drilling down data from top most hierarchies to the lowermost hierarchies can be done. Explain the difference between star and snowflake schemas? Star schema: A highly de-normalized technique. Snow flake schema: The normalized principles applied star schema is known as Snow flake schema. Snow flake schema serves the purpose of dimensional modeling in data warehousing. Aggreagator. Rank and joiner transformation may often decrease the session performance . The dimensional table itself consists of hierarchies of dimensions in star schema. The reason is that. Differences: • • A dimension table will not have parent table in star schema. If the session contained lookup transformation you can improve the session performance by enabling the look up cache. It contains joins in depth. Every dimension table is associated with sub dimension table. To improve session performance in this case use sorted ports option. the tables split further.
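To make the star-versus-snowflake contrast above concrete, a small DDL sketch with hypothetical product tables: the star version keeps the hierarchy in one de-normalized dimension, while the snowflake version splits it into a parent table.

-- Star schema: one de-normalized dimension table
CREATE TABLE dim_product (
    product_key   NUMBER PRIMARY KEY,
    product_name  VARCHAR2(100),
    category_name VARCHAR2(50)
);

-- Snowflake schema: the category hierarchy split into its own parent table
CREATE TABLE dim_category (
    category_key  NUMBER PRIMARY KEY,
    category_name VARCHAR2(50)
);
CREATE TABLE dim_product_sf (
    product_key   NUMBER PRIMARY KEY,
    product_name  VARCHAR2(100),
    category_key  NUMBER REFERENCES dim_category(category_key)
);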

) . These conformed dimensions have a static structure. It is derived from a fact table. Any dimension table that is used by multiple fact tables can be conformed dimensions. 2. Management is centralized. that means services can be started and stopped on nodes via a central web interface. 3. E. This dimension is called a junk dimension. The data is created when a query is fired on the view.0? The architecture of Power Center 8 has changed a lot: 1. When a view is created. The column (dimension) which is a part of fact table but does not map to any dimension. data of a materialized view is stored.g. Informatica has added "push down optimization" which moves data transformation processing to the native relational database I/O engine whenever it is most appropriate. It has a support for unstructured data which includes spreadsheets. Client Tools access the repository via that centralized machine. They can be compared and combined mathematically. It has added performance improvements (To bump up systems performance. the data is not stored in the database. The Repository Service and Integration Service (as replacement for Rep Server and Informatica Server) can be run on different computers in a network (so called nodes). This data helps in decision making. presentations and . performing calculations etc. Whereas. What is junk dimension? A single dimension is formed by lumping a number of small dimensions. The process of grouping random flags and text attributes in dimension by transmitting them to a distinguished sub dimension is related to junk dimension. What is degenerate dimension table? A degenerate table does not have its own dimension table. 5. It provides high availability. What is the difference between Informatica 7. scalability and flexibility. Microsoft Word files.0 and 8. resources are distributed dynamically. 7. On the other hand. 4. email.A view is created by combining data from different tables. employee_id What is conformed fact and conformed dimensions use for? Conformed fact in a warehouse allows itself to have same name in separate tables.PDF documents. PC8 is service-oriented for modularity. Running all services on one machine is still possible. seamless fail over. even redundantly. Hence. The data stored by calculating it before hand using queries. eliminating single points of failure. of course. 6. Conformed dimensions can be used across multiple data marts. Junk dimension has unrelated attributes. Materialized view usually used in data warehousing has data. a view does not have data of itself.

Data from various resources extracted and organized in the data warehouse selectively for analysis and accessibility. A dimension table can provide additional and descriptive information (dimension) of the field of a fact table.1. User defined functions 15. Hence. and matching capabilities. Data warehousing is the central repository for the data of several business systems in an enterprise. Midstream SQL transformation has been added in 8. If I want to know the number of resources used for a task. What are fact tables and dimension tables? As mentioned. What actually is required to create a data warehouse can be considered as Data Warehousing. Fact table in a data warehouse consists of facts and/or measures. 9. 12. Dynamic configuration of caches and partitioning 13. That means extracting data from different sources such as flat files.g. not in 8. On the other hand. e. . Informatica has added a new web based administrative console. Informatica has now added more tightly integrated data profiling. databases or XML data. What is Data warehousing? A data warehouse can be considered as a storage area where interest specific or relevant data is stored irrespective of the source. 14. the relation between a fact and dimension table is one to many. Ability to write a Custom Transformation in C++ or Java. 10. my fact table will store the actual measure (of resources) while my Dimension table will store the task and resource details. Data warehousing merges data from multiple sources into an easy and complete form. patterns by shifting through large data repositories using pattern recognition techniques. What is ETL process in data warehousing? ETL stands for Extraction. The nature of data in a fact table is usually numerical. Data mining is the process of correlations.1. PowerCenter 8 release has "Append to Target file" feature. Data mining is normally used for models and forecasting. 11. transformation and loading. transforming this data depending on the application’s need and loads this data into data warehouse. data in a warehouse comes from the transactions. Java transformation is introduced.1.8. Explain the difference between data mining and data warehousing? Data mining is a method for comparing large amounts of data for the purpose of finding patterns. cleansing. dimension table in a data warehouse contains fields used to describe the data in fact tables.
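As a rough, hand-coded illustration of the extract-transform-load step described above; the staging table, warehouse fact table and dimension lookup here are hypothetical and not from the original text:

-- Transform staged rows and load them into the warehouse fact table
INSERT INTO dw_sales_fact (date_key, product_key, sales_amount)
SELECT TO_NUMBER(TO_CHAR(s.sale_date, 'YYYYMMDD')),   -- derive the date key
       p.product_key,                                  -- resolve the surrogate key
       s.qty * s.unit_price                            -- compute the measure
FROM   stg_sales s
JOIN   dim_product p ON p.product_code = s.product_code;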

OLAP stands for OnLine Analytical Processing. What are cubes? Multi dimensional data is logically represented by Cubes in data warehousing. whereas snow flake schemas have one or more parent tables. The reason is that. then the snow flake design is utilized. The dimension and the data are represented by the edge and the body of the cube respectively. Applications that supports and manges transactions which involve high volumes of data are supported by OLTP system. What is snow flake scheme design in database? Snow flake schema is one of the designs that are present in database design. OLTP is based on client-server architecture and supports transactions across networks. Snow flake schema: The normalized principles applied star schema is known as Snow flake schema. It contains joins in depth. A star schema has one fact table and is associated with numerous dimensions table and depicts a star. Snow flake schema serves the purpose of dimensional modeling in data warehousing. The dimensional table itself consists of hierarchies of dimensions in star schema. Materialized view usually used in data warehousing has data. performing calculations etc. A cube typically includes the aggregations that are needed for business intelligence queries. a view does not have data of itself. An insight of data coming from various resources can be gained by a user with the support of OLAP. On the other hand. The drilling down data from top most hierarchies to the lowermost hierarchies can be done. Business data analysis and complex calculations on low volumes of data are performed by OLAP. The data stored by . OLAP environments view the data in the form of hierarchical cube. where the schema is inclined slightly towards normalization. What is the difference between view and materialized view? A view is created by combining data from different tables. If the dimensional table is split into many tables. Differences: • • A dimension table will not have parent table in star schema.What is an OLTP system and OLAP system? OLTP stands for OnLine Transaction Processing. This data helps in decision making. Every dimension table is associated with sub dimension table. Explain the difference between star and snowflake schemas? Star schema: A highly de-normalized technique. where as hierarchies are split into different tables in snow flake schema. the tables split further. Hence.

They provide a single integrated view of a customer across multiple business lines. What is degenerate dimension table? A degenerate table does not have its own dimension table. Any dimension table that is used by multiple fact tables can be conformed dimensions. It contains Meta data. When a view is created. Conformed dimensions can be used across multiple data marts. What is junk dimension? A single dimension is formed by lumping a number of small dimensions. What is active data warehousing? An Active data warehouse aims to capture data continuously and deliver real time data. Whereas. What is Virtual Data Warehousing? A virtual data warehouse provides a compact view of the data inventory. The column (dimension) which is a part of fact table but does not map to any dimension. On the other hand . data of a materialized view is stored. They can be compared and combined mathematically. It is derived from a fact table. The data is created when a query is fired on the view. These conformed dimensions have a static structure. Junk dimension has unrelated attributes. They can be fast as they allow users to filter the most important pieces of data from different legacy applications. E. The process of grouping random flags and text attributes in dimension by transmitting them to a distinguished sub dimension is related to junk dimension. employee_id What is conformed fact and conformed dimensions use for? Conformed fact in a warehouse allows itself to have same name in separate tables. It is associated with Business Intelligence Systems What is the difference between dependent and independent data warehouse? A dependent data warehouse stored the data in a central data warehouse. This dimension is called a junk dimension. the data is not stored in the database.g. It uses middleware to build connections to different data sources.calculating it before hand using queries.

Logical models are used to explore domain concepts. What are various methods of loading Dimension tables? Conventional load: Here the data is checked for any table constraints before loading. Difference between data modeling and data mining? Data modeling aims to identify all entities that have data. The Dimensional model will only have physical model. An example of this can be city of an employee. Direct or Faster load: The data is directly loaded without checking for any constraints. Conceptual models are typically used to explore high level business concepts in case of stakeholders. logical or Physical data models. an ER model will have both logical and physical model. The Primary keys of fact dimensional table are the foreign keys of fact tables. that models an ER diagram represents the entire businesses or applications processes. What is the difference between OLAP and data warehouse? A data warehouse serves as a repository to store historical data that can be used for analysis. What is the difference between ER Modeling and Dimensional Modeling? ER modeling. It then defines a relationship between these entities. OLAP tool helps to organize data in the warehouse using multidimensional models. planning strategies. OLAP is Online Analytical processing that can be used to analyze and evaluate data in a warehouse. finding meaningful patterns etc. What is Data Mart? Data mart stores particular data that is gathered from different sources. Data mining helps in reporting.independent data warehouse does not make use of a central data warehouse. These queries can be fired on the data warehouse. Describe the foreign key columns in fact table and dimension table? The primary keys of entity tables are the foreign keys of dimension tables. Data marts can be used to focus on specific business needs. The warehouse has data coming from varied sources. Particular data may belong to some specific community (group of people) or genre. The row of this data in the dimension can be . This is to say. Data models can be conceptual. it can be used to convert a large amount of data into a sensible form. Define the term slowly changing dimensions (SCD)? SCD are dimensions whose data changes very slowly. This dimension will change very slowly. While Physical models are used to explore database design. Data mining is used to examine or explore the data using queries. This diagram can be segregated into multiple Dimensional models.

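Where the database is Oracle-like, the conventional-versus-direct distinction can be illustrated with an ordinary insert versus a direct-path insert; the table names are hypothetical, and the exact constraint-checking behavior of a direct-path load depends on the database and load utility.

-- Conventional load: rows pass through the normal SQL engine and constraints
-- are checked as each row is inserted.
INSERT INTO dim_customer (customer_key, customer_id, city)
SELECT customer_key, customer_id, city FROM stg_customer;

-- Direct-path (faster) load: the APPEND hint asks Oracle to write formatted
-- blocks directly above the high-water mark, which is typically much faster
-- for bulk loads.
INSERT /*+ APPEND */ INTO dim_customer (customer_key, customer_id, city)
SELECT customer_key, customer_id, city FROM stg_customer;
COMMIT;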
What is a star schema?
A star schema comprises a fact table and dimension tables. The fact table contains the facts, i.e. the actual data, usually numerical data stored in multiple columns and many rows, while the dimension tables contain attributes or smaller, more granular data. The fact table in a star schema holds foreign key references to the dimension tables, and the layout resembles a star. (A DDL sketch of this layout appears after this group of questions.)

What is the difference between a star and a snowflake schema?
Star schema: a de-normalized technique in which one fact table is associated with several dimension tables. Snowflake schema: a star schema to which normalization principles have been applied, so that every dimension table is associated with sub-dimension tables.

Define BUS schema?
A BUS schema is used to identify the common dimensions across business processes, like identifying conformed dimensions. A BUS schema has conformed dimensions and standardized definitions of facts.

Define non-additive facts?
The facts that cannot be summed up across the dimensions present in the fact table are called non-additive facts. Such facts can still be useful when there are changes in dimensions. For example, profit margin is a non-additive fact, for it has no meaning to add it up to the account level or the day level.

What is real-time data warehousing?
In real-time data warehousing, the warehouse is updated every time the system performs a transaction, so it reflects the business's real-time information. This means that when a query is fired against the warehouse, the state of the business at that time is returned.

Explain the use of lookup tables and aggregate tables?
An aggregate table contains a summarized view of the data. Lookup tables, using the primary key of the target, allow updating of records based on the lookup condition.
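A minimal sketch of building an aggregate table from a fact table, using hypothetical fact_sales and dim_date names; summarizing once and letting reports reuse the result is the whole point of an aggregate table.

-- Aggregate table: pre-summarized copy of the fact data at a coarser grain.
CREATE TABLE agg_sales_monthly AS
SELECT d.year_month,
       f.product_key,
       SUM(f.sales_amount) AS total_sales,
       COUNT(*)            AS sale_count
FROM   fact_sales f
JOIN   dim_date  d ON d.date_key = f.date_key
GROUP  BY d.year_month, f.product_key;

-- Reports can now read the small aggregate instead of scanning the fact table.
SELECT year_month, SUM(total_sales) AS total_sales
FROM   agg_sales_monthly
GROUP  BY year_month;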

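As mentioned under the star schema question above, here is a hedged DDL sketch of a star layout and of the extra normalization step that turns it into a snowflake; every table and column name is hypothetical.

-- Star schema: the fact table references de-normalized dimension tables directly.
-- dim_date is assumed to already exist with primary key date_key.
CREATE TABLE dim_product (
  product_key   NUMBER PRIMARY KEY,
  product_name  VARCHAR2(100),
  category_name VARCHAR2(100)          -- category kept in the same table (de-normalized)
);

CREATE TABLE fact_sales (
  date_key     NUMBER REFERENCES dim_date,
  product_key  NUMBER REFERENCES dim_product,
  sales_amount NUMBER
);

-- Snowflake schema: the dimension is normalized into a sub-dimension table.
CREATE TABLE dim_category (
  category_key  NUMBER PRIMARY KEY,
  category_name VARCHAR2(100)
);

CREATE TABLE dim_product_sf (
  product_key  NUMBER PRIMARY KEY,
  product_name VARCHAR2(100),
  category_key NUMBER REFERENCES dim_category   -- pushed out to a sub-dimension
);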
What is data cleaning? How can we do that?
Data cleaning is the process of identifying erroneous data. The data is checked for accuracy, consistency, typos etc. Common data cleaning methods are:
Parsing - used to detect syntax errors.
Data transformation - confirms that the input data matches the expected format.
Duplicate elimination - this process gets rid of duplicate entries.
Statistical methods - values of mean, range, standard deviation, or clustering algorithms etc. are used to find erroneous data.

What is a level of granularity of a fact table?
A fact table is usually designed at a low level of granularity. This means that we need to find the lowest level of information that can be stored in a fact table. For example, employee performance is a very high level of granularity, whereas employee_performance_daily or employee_performance_weekly can be considered lower levels of granularity.

What is the purpose of a factless fact table?
Factless fact tables are so called because they simply contain keys which refer to the dimension tables. Hence, they don't really have facts or any other information, but they are commonly used for tracking some information about an event, e.g. to find the number of leaves taken by an employee in a month. (A small query sketch after this group of questions shows how such a table is used.)

What is a bitmapped index?
Bitmap indexes make use of bit arrays (bitmaps) to answer queries by performing bitwise logical operations. They work well with data that has lower cardinality, which means data that takes fewer distinct values, and they are therefore useful in data warehousing applications. Bitmap indexes have a significant space and performance advantage over other index structures for such data. The advantages of bitmap indexes are that they have a highly compressed structure, making them fast to read, and that their structure makes it possible for the system to combine multiple indexes together so that the underlying table can be accessed faster. The disadvantage of bitmap indexes is that the overhead of maintaining them is enormous, so tables that have a small number of insert or update operations are good candidates.
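A minimal Oracle-style sketch of a bitmap index on a low-cardinality column; the dim_customer table and its gender and marital_status columns are hypothetical choices used only to illustrate the idea.

-- Bitmap index: well suited to columns with few distinct values in a
-- read-mostly warehouse table.
CREATE BITMAP INDEX bix_customer_gender
  ON dim_customer (gender);

-- Queries filtering on several low-cardinality columns can combine the
-- bitmaps with fast bitwise AND/OR operations (assuming a similar bitmap
-- index also exists on marital_status).
SELECT COUNT(*)
FROM   dim_customer
WHERE  gender = 'F' AND marital_status = 'M';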

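As referenced in the factless fact table answer above, a sketch of an event-tracking factless table and the kind of count query it supports; all table, column and key names are hypothetical, and the referenced dimension tables are assumed to exist.

-- Factless fact table: only dimension keys, no measure columns.
CREATE TABLE fact_employee_leave (
  employee_key   NUMBER REFERENCES dim_employee,
  date_key       NUMBER REFERENCES dim_date,
  leave_type_key NUMBER REFERENCES dim_leave_type
);

-- Number of leave days taken by each employee in a given month.
SELECT f.employee_key, COUNT(*) AS leave_days
FROM   fact_employee_leave f
JOIN   dim_date d ON d.date_key = f.date_key
WHERE  d.year_month = '2011-12'
GROUP  BY f.employee_key;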
What is data cardinality?
Cardinality is the term used in database relations to denote the occurrences of data on either side of the relation. There are 3 basic types of data cardinality:
High data cardinality: values of a data column are very uncommon, e.g. email ids and user names.
Normal data cardinality: values of a data column are somewhat uncommon but never unique, e.g. a data column containing LAST_NAME (there may be several entries of the same last name).
Low data cardinality: values of a data column are very usual, e.g. flag statuses: 0/1.
Determining data cardinality is a substantial aspect of data modeling, as it is used to determine the relationships. Types of cardinalities:
The Link Cardinality - 0:0 relationships
The Sub-type Cardinality - 1:0 relationships
The Physical Segment Cardinality - 1:1 relationship
The Possession Cardinality - 0:M relation
The Child Cardinality - 1:M mandatory relationship
The Characteristic Cardinality - 0:M relationship
The Paradox Cardinality - 1:M relationship
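One practical, if simplistic, way to gauge a column's data cardinality is to count its distinct values; a sketch against a hypothetical employees table with hypothetical column names:

-- Compare the number of distinct values to the number of rows:
-- close to the row count suggests high cardinality, a handful of values suggests low cardinality.
SELECT COUNT(*)                    AS total_rows,
       COUNT(DISTINCT email)       AS distinct_emails,      -- expected high cardinality
       COUNT(DISTINCT last_name)   AS distinct_last_names,  -- normal cardinality
       COUNT(DISTINCT active_flag) AS distinct_flags        -- low cardinality
FROM   employees;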
