
Informatica Question and Answers

What is rank transformation? Where can we use it?
Rank transformation is used to find the standing (rank) of rows within a set of data. For example, if a sales table has many employees selling the same product and we need the first 5 or 10 employees who sell the most, we can use a rank transformation.

Where is the cache stored in Informatica?
The cache is stored on the Informatica server.

If you want to create indexes after the load process, which transformation do you choose?
Stored procedure transformation.

In a joiner transformation, you should specify the source with fewer rows as the master source. Why?
In a joiner transformation, the Informatica server reads all the records from the master source and builds the index and data caches based on the master rows. After building the caches, the joiner transformation reads records from the detail source and performs the joins. A smaller master source therefore means smaller caches.

What happens if you try to create a shortcut to a non-shared folder?
It only creates a copy of the object.

What is a transaction?

A transaction can be defined as a DML operation, meaning an insertion, modification, or deletion of data performed by users, analysts, or applications.

Can anybody write a session parameter file that changes the source and target for every session, i.e. different sources and targets for each session run?
You need to define a parameter file. In the parameter file you can define two parameters, one for the source and one for the target, for example:
    $Src_file = c:\program files\informatica\server\bin\abc_source.txt
    $tgt_file = c:\targets\abc_targets.txt
Then define the parameter file with a section heading for the session:
    [folder_name.WF:workflow_name.ST:s_session_name]
    $Src_file = c:\program files\informatica\server\bin\abc_source.txt
    $tgt_file = c:\targets\abc_targets.txt
If it is a relational database, you can even supply an overridden SQL statement at the session level as a parameter. Make sure the SQL is on a single line.
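Since the file has to change for every run, it can be generated rather than edited by hand. The following is only a minimal sketch in Python, not part of Informatica: the function name build_param_file is hypothetical, and the section heading and the $Src_file / $tgt_file names are simply the ones used in the answer above.

```python
def build_param_file(folder, workflow, session, src_file, tgt_file):
    """Return the text of a session parameter file for one run.

    The layout follows the answer above: a section heading naming the
    folder, workflow and session, then one line per parameter.
    """
    heading = f"[{folder}.WF:{workflow}.ST:{session}]"
    lines = [
        heading,
        f"$Src_file={src_file}",   # source file for this run
        f"$tgt_file={tgt_file}",   # target file for this run
    ]
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    text = build_param_file(
        "folder_name",
        "workflow_name",
        "s_session_name",
        r"c:\program files\informatica\server\bin\abc_source.txt",
        r"c:\targets\abc_targets.txt",
    )
    print(text)
```

A scheduler script could call this before each run with different file names and write the result to the path configured as the session's parameter file.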

Informatica Live Interview Questions
Here are some of the interview questions I could not answer. Can anybody help by giving answers? Thanks in advance.
    Explain grouped cross tab.
    Explain reference cursor.
    What are parallel queries and query hints?
    What is metadata and system catalog?
    What is a factless fact schema?
    What is a conformed dimension?
    Which kind of index is preferred in a DWH?
    Why do we use a DSS database for OLAP tools?

Conformed dimension: one dimension that is shared by two fact tables.
Factless: a fact table without measures, containing only foreign keys. There are two types of factless table: event tracking and coverage.
Bitmap indexes are preferred in data warehousing.
Metadata is data about data. Everything is stored there, for example mappings, sessions, privileges, and other data. In Informatica we can see the metadata in the repository.
System catalog is what we use in Cognos. It also contains data, tables, privileges, predefined filters, etc. Using this catalog we generate reports.
Grouped cross tab is a type of report in Cognos where we have to assign 3 measures to get the result.

What is meant by Junk Attribute in Informatica?
Junk dimension: a dimension is called a junk dimension if it contains attributes that are rarely changed or modified. For example, in the banking domain we can fetch four attributes belonging to a junk dimension from the Overall_Transaction_master table: tput flag, tcmp flag, del flag, and advance flag. All these attributes can be part of a junk dimension.

Can anyone explain incremental aggregation with an example?
When you use an aggregator transformation, it creates index and data caches to store the data (1) of the group-by columns and (2) of the aggregate columns. Incremental aggregation is used when historical data is already in place and will be used in the aggregation. It uses the cache that contains the historical data: for each group-by column value already present in the cache, it adds the incoming data value to the corresponding data cache value and outputs the row. If an incoming value has no match in the index cache, the new values for the group-by and output ports are inserted into the cache.

Difference between Rank and Dense Rank?

Rank:
    1
    2  (2nd position)
    2  (3rd position)
    4
    5
The same rank is assigned to equal totals/numbers, and the next rank follows the position. Golf usually ranks this way, so this is usually called golf ranking.

Dense Rank:
    1
    2  (2nd position)
    2  (3rd position)
    3
    4
The same rank is assigned to equal totals/numbers/names, and the next rank follows the serial number.
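The two numbering schemes can be reproduced in a few lines. This is only an illustrative sketch in Python, not Informatica or SQL code; the function names rank and dense_rank and the sample scores are made up for the example.

```python
def rank(scores):
    """Rank: ties share a rank, the next rank follows the row position."""
    ordered = sorted(scores, reverse=True)
    # index() returns the first position of a value, so ties share it
    return [ordered.index(s) + 1 for s in ordered]


def dense_rank(scores):
    """Dense rank: ties share a rank, the next rank is the next integer."""
    ordered = sorted(scores, reverse=True)
    distinct = sorted(set(scores), reverse=True)
    return [distinct.index(s) + 1 for s in ordered]


if __name__ == "__main__":
    scores = [90, 80, 80, 70, 60]
    print(rank(scores))        # [1, 2, 2, 4, 5]
    print(dense_rank(scores))  # [1, 2, 2, 3, 4]
```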

About Informatica PowerCenter 7: 1) Which mapping properties can be overridden at the Session task level? 2) What types of permissions are needed to run and schedule workflows?
1) Which mapping properties can be overridden at the Session task level?
You can override any properties other than the source and targets. Make sure the source and targets exist in your database if it is a relational database. If it is a flat file, you can override its properties. You can override the SQL if it is a relational database, the session log, DTM buffer size, cache sizes, etc.
2) What types of permissions are needed to run and schedule workflows?
You need execute permission on the folder to run or schedule a workflow. You may have read and write permissions, but you need execute permission as well.

Can anyone explain real-time complex mappings or complex transformations in Informatica, especially in the sales domain?
The most complex logic we use is denormalization. There is no Denormalizer transformation in Informatica, so we have to use an aggregator followed by an expression. Apart from this, most of the complexity sits in expression transformations involving a lot of nested IIF and DECODE statements. Other examples are the union transformation and the joiner.

How do you create a mapping using multiple lookup transformation?
Use an unconnected lookup if the same lookup repeats multiple times.

The source has duplicate records and we have 2 targets: T1 for unique values and T2 only for duplicate values. How do we pass the unique values to T1 and the duplicate values to T2 from the source in a single mapping?

Soln1:
    source ---> sq ---> exp ---> sorter (with the select distinct check box enabled) ---> t1
                            ---> aggregator (with group by enabled and a count function) ---> t2
If you want only duplicates in t2, you can follow this sequence:
    ---> agg (with group by enabled and this code: decode(count(col),1,1,0)) ---> Filter (condition is 0) ---> t2

Soln2: Take two source instances. In the first one, enable distinct in the source qualifier and connect it to the target t1. In the second source instance, write a query that fetches only the duplicate records and connect it to the target t2. (If you use an aggregator without a filter, as suggested above, you will get duplicate as well as distinct records in the second target.)
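The count-based idea behind the first two solutions can be checked outside Informatica. The following is a minimal sketch in Python under the assumption that T1 should receive one copy of every value and T2 only the values that occur more than once; the function name split_unique_and_duplicates is hypothetical.

```python
from collections import Counter


def split_unique_and_duplicates(rows):
    """Return (t1, t2): distinct values, and values seen more than once."""
    counts = Counter(rows)              # plays the role of group by + count
    t1 = list(counts)                   # one copy of each value (distinct)
    t2 = [value for value, n in counts.items() if n > 1]  # filter on count
    return t1, t2


if __name__ == "__main__":
    t1, t2 = split_unique_and_duplicates(["a", "b", "a", "c", "b", "a"])
    print(t1)  # ['a', 'b', 'c']
    print(t2)  # ['a', 'b']
```

The filter on the count is what keeps distinct-only values out of T2, which is the point raised in the remark on the aggregator.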

Soln3: Use a sorter transformation. Sort on the key fields by which you want to find the duplicates, then use an expression transformation. Example:

SORTER:
    field1 -- ascending/descending
    field2 -- ascending/descending

Expression:
    --> field1
    --> field2
    <--> v_field1_curr = field1
    <--> v_field2_curr = field2
    v_dup_flag = IIF(v_field1_curr = v_field1_prev AND v_field2_curr = v_field2_prev, true, false)
    o_dup_flag = IIF(v_dup_flag = true, 'Duplicate', 'Not Duplicate')
    <--> v_field1_prev = v_field1_curr
    <--> v_field2_prev = v_field2_curr

Informatica evaluates row by row. Because the rows are sorted, they arrive in order and each row is evaluated against the previous one. Use a Router transformation and send the rows with o_dup_flag = 'Duplicate' to T2 and 'Not Duplicate' to T1.

What is the difference between Power Centre and Power Mart?
Power Centre can have multiple repositories, whereas Power Mart has a single repository (desktop repository). Power Centre repositories can also be linked to a global repository to share objects between users.

                           Power Centre     Power Mart
    No. of repositories    n                n
    Applicability          high-end WH      low and mid range WH
    Global repository      supported        not supported
    Local repository       supported        supported
    ERP support            available        not available

What is the procedure for creating independent data marts from Informatica 7.1?

What are the enhancements made in Informatica version 7.1.1 compared to version 6.2.2?
In 7+ versions:
    1. We can look up a flat file.
    2. Union and custom transformations are available.
    3. We can write to an XML target.
    4. There is a propagate option, i.e. if we change the data type of a field, all the linked columns reflect that change.
    5. We can use up to 64 partitions.

What is lookup transformation and update strategy transformation? Explain with an example.
Lookup transformation is used to look up data in a relational table, view, synonym, or flat file. The Informatica server queries the lookup table based on the lookup ports used in the transformation. It compares the lookup transformation port values to the lookup table column values based on the lookup condition. By using a lookup we can get a related value, perform a calculation, and update an SCD. There are two types of lookups: connected and unconnected.

Update strategy transformation is used to control how rows are flagged for insert, update, delete, or reject. To define the flagging of rows in a session, it can be Insert, Delete, Update, or Data driven. In Update we have three options:
    Update as update
    Update as insert
    Update else insert

What logic will you implement to load data into one fact table from 'n' number of dimension tables?
Soln1: To load data into one fact table from more than one dimension table, first create the fact table and the dimension tables. Then load data into the individual dimensions using sources and transformations (aggregator, sequence generator, lookup) in the Mapping Designer. For the fact table, connect the surrogate keys to the foreign keys and the columns from the dimensions to the fact.
Soln2: Loading data from dimension tables to a fact table is simple: treat the dimension tables as source tables and the fact table as the target. That is all.
Soln3: After loading the data into the dimension tables, we load the data into the fact tables. The reason is that the dimension tables contain the data related to the fact table.

Can I use the session bulk loading option and still recover the session?
If the session is configured to use bulk mode, it will not write recovery information to the recovery tables, so bulk loading will not perform the recovery as required.
No, because a bulk load does not create a redo log file, whereas a normal load does. In return, bulk load increases session performance.

How do you configure a mapping in Informatica?
You should configure the mapping with the least number of transformations and expressions to do the most amount of work possible. You should minimize the amount of data moved by deleting unnecessary links between transformations. For transformations that use a data cache (such as Aggregator, Joiner, Rank, and Lookup transformations), limit connected input/output or output ports. Limiting the number of connected input/output or output ports reduces the amount of data the transformations store in the data cache. You can also perform the following tasks to optimize the mapping:
    • Configure single-pass reading.
    • Optimize datatype conversions.
    • Eliminate transformation errors.
    • Optimize transformations.
    • Optimize expressions.

What are mapping parameters and variables, and in which situations can we use them?
If we need to change certain attributes of a mapping every time the session is run, it is very difficult to edit the mapping and then change the attribute. So we use mapping parameters and variables and define the values in a parameter file. Then we can edit the parameter file to change the attribute values. This makes the process simple.
Mapping parameter values remain constant. If we need to change a parameter value, we need to edit the parameter file manually after every session run. The value of a mapping variable, however, can be changed by using a variable function. If we need to increment the attribute value by 1 after every session run, we can use a mapping variable.

What is the difference between a dimension table and a fact table, and what are the different dimension tables and fact tables?
The fact table contains measurable data, with fewer columns and many rows. It contains a primary key. Different types of facts: additive, non-additive, semi-additive.
The dimension table contains textual descriptions of the data, with many columns and fewer rows. It contains a primary key.

What are worklets, what is their use, and in which situations can we use them?
A worklet is a set of tasks. If a certain set of tasks has to be reused in many workflows, we use worklets. To execute a worklet, it has to be placed inside a workflow. The use of a worklet in a workflow is similar to the use of a mapplet in a mapping.

What is meant by complex mapping?
Complex mapping means a mapping that involves more logic and more business rules. In my bank project I was involved in constructing a data warehouse. The bank has many customers. After taking loans, some of them relocated to another place, and I found it difficult to maintain both the previous and the current addresses, so I used SCD type 2. This is a simple example of a complex mapping.

Explain the use of update strategy transformation.
It is used to maintain the history data and to maintain the most recent changes to the data.

If a session fails after loading 10,000 records into the target, how can you load the records from the 10,001st record when you run the session next time in Informatica 6.1?
Simple solution: use the perform recovery option.

Can we run a group of sessions without using the Workflow Manager?
Yes, it is possible. Using the pmcmd command we can run the group of sessions without the Workflow Manager.

What is the difference between stop and abort?
The PowerCenter Server handles the abort command for the Session task like the stop command, except that it has a timeout period of 60 seconds. If the PowerCenter Server cannot finish processing and committing data within the timeout period, it kills the DTM process and terminates the session.
Stop: if the session you want to stop is part of a batch, you must stop the batch. If the batch is part of a nested batch, stop the outermost batch.
Abort: you can issue the abort command. It is similar to the stop command except that it has a 60 second timeout.

I have a requirement where the column names in a table (Table A) should appear in rows of the target table (Table B), i.e. converting columns to rows. Is it possible through Informatica? If so, how? The data in the tables is as follows:
    Table A: key_1 char(3)
    Table A values: 1, 2, 3
    Table B: bkey_a char(3), bcode char(1)
    Table B values: 1 T, 1 A, 1 G, 2 A, 2 T, 2 L, 3 A
and the required output is:
    1, T, A
    2, A, T, L
    3, A
The SQL query in the source qualifier should be:
    select key_1,
           max(decode(bcode, 'T', bcode, null)) t_code,
           max(decode(bcode, 'A', bcode, null)) a_code,
           max(decode(bcode, 'L', bcode, null)) l_code
    from a, b
    where a.key_1 = b.bkey_a
    group by key_1
    /
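What the max(decode(...)) query does to the sample rows can be traced with a small stand-in. This is only a sketch in Python, not the source qualifier SQL itself; the function name pivot_codes is hypothetical, and the rows are the Table B values from the question. As in the query, code G has no output column and is dropped.

```python
from collections import defaultdict


def pivot_codes(rows, codes=("T", "A", "L")):
    """Group (key, code) rows by key and emit one column per wanted code.

    Mirrors max(decode(bcode, 'T', bcode, null)) etc.: a column holds the
    code if the key has a row with that code, otherwise None.
    """
    seen = defaultdict(set)
    for key, code in rows:
        seen[key].add(code)
    return {
        key: tuple(c if c in seen[key] else None for c in codes)
        for key in sorted(seen)
    }


if __name__ == "__main__":
    table_b = [("1", "T"), ("1", "A"), ("1", "G"),
               ("2", "A"), ("2", "T"), ("2", "L"), ("3", "A")]
    for key, cols in pivot_codes(table_b).items():
        print(key, cols)
```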

What is the difference between a cached lookup and an uncached lookup? Can I run the mapping without starting the Informatica server?
When you configure the lookup transformation as a cached lookup, it stores all the lookup table data in the cache when the first input record enters the lookup transformation. In a cached lookup the select statement executes only once, and the values of each input record are compared with the values in the cache. In an uncached lookup the select statement executes for each input record entering the lookup transformation, and it has to connect to the database each time a new record enters.

I want to prepare a questionnaire. The details are as follows:
    1. Identify a large company/organization that is a prime candidate for a DWH project. (For example a telecommunication company, an insurance company, or a bank may be a prime candidate for this.)
    2. Give at least four reasons for selecting the organization.
    3. Prepare a questionnaire consisting of at least 15 non-trivial questions to collect requirements/information about the organization. This information is required to build the data warehouse.
Can you please tell me what those 15 questions should be for a company, say a telecom company?
First of all, meet your sponsors and make a BRD (business requirement document) about their expectations from this data warehouse (the main aim comes from them). For example, they need the customer billing process. Then go to the business management team; they can ask for metrics out of the billing process for their use. Management people may need monthly usage, billing metrics, sales organization, and rate plan data to perform sales rep and channel performance analysis and rate plan analysis. So your dimension tables can be:
    Customer (customer id, name, city, state etc.)
    Sales rep (sales rep number, name, id)
    Sales org (sales org id)
    Bill dimension (bill #, bill date, number)
    Rate plan (rate plan code)
And the fact table can be:
    Billing details (bill #, customer id, minutes used, call details etc.)
You can follow a star or snowflake schema in this case, depending on the granularity of your data.

Can I start and stop a single session in a concurrent batch?
Just right click on the particular session and go to the recovery option, or use Event Wait and Event Raise.

What is MicroStrategy? What is it used for? Can anyone explain it in detail?
MicroStrategy is again a BI tool, which is HOLAP. You can create two-dimensional reports and also cubes in it. It is basically a reporting tool, with a full range of reporting on the web and also on Windows.

What is the difference between Informatica 7.1 and Ab Initio?
There is a lot of difference between Informatica and Ab Initio:
    In Ab Initio we use 3 kinds of parallelism, but Informatica uses 1.
    Ab Initio has no scheduling option; we schedule manually or with a PL/SQL script. Informatica contains 4 scheduling options.
    Ab Initio contains the Co-Operating System, but Informatica does not.

    Ramp-up time is much quicker in Ab Initio than in Informatica.
    Ab Initio is more user-friendly than Informatica.

What is a mystery dimension?
It is also known as a junk dimension: making sense of the rogue fields in your fact table.

What are partition points?
Partition points mark the thread boundaries in a source pipeline and divide the pipeline into stages.

How do you append records to a flat file in Informatica? In DataStage we have the options (i) overwrite the existing file and (ii) append to the existing file.
This is not there in Informatica v7, but I heard that it is included in the latest version, 8.0, where you can append to a flat file. It is about to ship.

Suppose you had to split the source level key going into two separate tables, one as surrogate and the other as primary. Since Informatica does not guarantee that keys are loaded properly (in order) into those tables, what are the different ways you could handle this type of situation?
Foreign key.

What are the cost based and rule based approaches, and what is the difference?
Cost based and rule based approaches are optimization techniques used in databases where we need to optimize a SQL query. Basically Oracle provides two types of optimizers (indeed three, but we use only these two techniques because the third has some disadvantages). Whenever you process a SQL query in Oracle, the Oracle engine internally reads the query and decides which is the best possible way of executing it. In this process Oracle follows these optimization techniques:
    1. Cost based optimizer (CBO): if a SQL query can be executed in 2 different ways (say path 1 and path 2 for the same query), the CBO calculates the cost of each path, determines which path has the lower cost of execution, and then executes that path so that the query execution is optimized.
    2. Rule based optimizer (RBO): this follows the rules which are needed for executing a query. Depending on the number of rules to be applied, the optimizer runs the query.
Use: if the table you are trying to query is already analysed, Oracle will go with the CBO. If the table is not analysed, Oracle follows the RBO. The first time, if the table is not analysed, Oracle will go with a full table scan.

What is the best way to show metadata (number of rows at source, target, and each transformation level, and error related data) in a report format?
When your workflow completes, go to the Workflow Monitor, right click the session, then go to transformation statistics. There we can see the number of rows in source and target. In the session properties we can see errors related to the data.

You can also select these details from the repository tables; for example, you can use the view REP_SESS_LOG to get this data.

Two relational tables are connected to an SQ transformation. What are the possible errors that will be thrown?
We can connect two relational tables to one SQ transformation. No errors will be thrown.

Without using an update strategy and session options, how can we update our target table?
Soln1: You can do this by using "update override" in the target properties.
Soln2: In the session properties there are options such as insert, update, insert as update, and update as update. By using these we can easily solve it.
Soln3: By default all the rows in the session are flagged as insert. You can change this in the session general properties (Treat source rows as: Update), so all the incoming rows will be flagged as update. Now you can update the rows in the target table.

Could anyone please tell me the steps required for a type 2 dimension/version data mapping? How can we implement it?
Go to the Mapping Designer, open the mapping wizard, and choose slowly changing dimension. A new window opens where you need to give the mapping name, source table, target table, and the type of slowly changing dimension. When you select Finish, the slowly changing dimension type 2 mapping is created. Go to the Warehouse Designer and generate the table, then validate the mapping in the Mapping Designer, save it to the repository, and run the session in the Workflow Manager. Later, update the source table and run again; you will find the difference in the target table.

How do you import an Oracle sequence into Informatica?
Create a procedure and declare the sequence inside the procedure. Then call the procedure in Informatica with the help of a stored procedure transformation.

What is data merging, data cleansing, and sampling?
Cleansing: to identify and remove redundancy and inconsistency.
Sampling: just sample the data by sending the data from source to target.

What is an IQD file?
IQD stands for Impromptu Query Definition. This file is mainly used in the Cognos Impromptu tool. After creating an imr (report) we save the imr as an IQD file, which is used while creating a cube in PowerPlay Transformer. In the data source type we select Impromptu Query Definition.

Differences between Normalization and Normalizer transformation.
Normalizer: a transformation mainly used for COBOL sources. It changes rows into columns and columns into rows.
Normalization: to remove redundancy and inconsistency.

How do I import VSAM files from source to target? Do I need a special plugin?

In the Mapping Designer there is a direct option to import files from VSAM. Navigation: Sources => Import from File => file from COBOL.

What is the procedure or steps for implementing versioning if you are already on version 7.X? Any gotchas or precautions?
For version control in the ETL layer using Informatica, after doing anything in the Designer or the Workflow Manager, do the following steps:
    1> First save the changes or new implementations.
    2> Then, in the navigator window, right click on the specific object you are currently in. There will be a pop-up window. At the lower end of that window you will find Versioning -> Check In. A window will be opened. Leave a note on what you have done, like "modified this mapping" etc. Then click the OK button.

Can anyone explain error handling in Informatica with examples, so that it is easy to explain in an interview?
Go to the session log file. There we find information about the session initiation process, errors encountered, and the load summary. By looking at the errors encountered while the session was running, we can resolve them.

If you have four lookup tables in the workflow, how do you troubleshoot to improve performance?
There are many ways to improve a mapping which has multiple lookups:
    1) We can create an index on the lookup table if we have permissions (staging area).
    2) Divide the lookup mapping into two:
       (a) Dedicate one to insert, meaning source minus target. These are new rows; only the new rows come into the mapping and the process is fast.
       (b) Dedicate the second one to update, meaning source = target. These are existing rows; only the rows which already exist come into the mapping.
    3) We can increase the cache size of the lookup.

If your workflow is running slow in Informatica, where do you start troubleshooting and what steps do you follow?
SOLN1: When the workflow is running slowly, you have to find the bottlenecks in this order: target, source, mapping, session, system.
SOLN2: The workflow may be slow for different reasons. One is alpha characters in decimal data, so check for this. Another is insufficient length of strings, so check with the SQL override.

How do you handle decimal places while importing a flat file into Informatica?
While importing the flat file, the flat file wizard helps in configuring the properties of the file. Select the numeric column and enter the precision value and the scale. Precision includes the scale. For example, if the number is 98888.654, enter precision as 8 and scale as 3, and width as 10 for a fixed-width flat file.

In a sequential batch, how can we stop a single session?

We have a task called Event Wait; using it we can stop the session, and we start it again using Event Raise.

Why are dimension tables denormalized in nature?
Because in data warehousing historical data should be maintained. Maintaining historical data means, for example, that one employee's details, such as where he worked previously and where he is working now, should all be kept in one table. If you maintain a primary key, it will not allow duplicate records with the same employee id. So to maintain historical data we use surrogate keys (using an Oracle sequence for the critical column). All the dimensions maintain historical data, so they are denormalized: not an exact duplicate record, but another record with the same employee number is maintained in the table.

Can we use an aggregator/active transformation after an update strategy transformation?
We can, but the update flag will not remain. We can use a passive transformation, though.

Can anyone comment on the significance of Oracle 9i in Informatica compared to Oracle 8 or 8i? I mean, how is Oracle 9i advantageous compared to Oracle 8 or 8i when used with Informatica?
It is very easy. Oracle 8i does not allow user defined data types, but 9i does. BLOB and LOB are allowed only in 9i, not in 8i, and moreover list partitioning is available only in 9i.

With mapping parameters and variables, the variable value is saved to the repository after the completion of the session. The next time you run the session, the server takes the saved variable value from the repository and starts assigning the next value after the saved value. For example, I ran a session and at the end it stored a value of 50 in the repository. The next time I run the session, it should start with the value 70, not with the value 51. How do I do this?
SOLN1: After running the mapping, go to the session in the Workflow Manager (start --------> session). Right click on the session and you will get a menu; in it go to persistent values. There you will find the last value stored in the repository for the mapping variable. Remove it, put in your desired value, and run the session. I hope your task will be done.
SOLN2: It takes the value 51, but you can override the variable saved in the repository by defining the value in the parameter file. If there is a parameter file for the mapping variable, it uses the value in the parameter file, not the value + 1 from the repository. For example, assign the value of the mapping variable as 70. In other words, higher preference is given to the value in the parameter file.

How do you use mapping parameters and what is their use?

Mapping parameters and variables make the use of mappings more flexible and avoid creating multiple mappings. They help in adding incremental data. Mapping parameters and variables have to be created in the Mapping Designer by choosing the menu option Mapping ----> Parameters and Variables. Enter the name for the variable or parameter, which has to be preceded by $$, and choose the type (parameter/variable) and the data type. Once defined, the variable/parameter can be used in any expression, for example in the source filter property of the SQ transformation: just enter the filter condition. Finally, create a parameter file to assign the value for the variable/parameter and configure the session properties. This final step is optional, however. If the parameter is not present, it uses the initial value which was assigned at the time of creating the variable.

What is the procedure to load the fact table? Give in detail.
SOLN1: We use the 2 wizards (i.e. the getting started wizard and the slowly changing dimension wizard) to load the fact and dimension tables. By using these 2 wizards we can create different types of mappings according to the business requirements and load into the star schemas (fact and dimension tables).
SOLN2: First the dimension tables need to be loaded, then the fact tables should be loaded according to the specifications. Do not think that fact tables are different when it comes to loading; it is a general mapping as we do for other tables. The specifications play an important role in loading the fact.

How do you delete duplicate rows in a flat file source? Is there any option in Informatica?
Use a sorter transformation. In it you will have a "distinct" option; make use of it.

How does the server recognise the source and target databases?
By using an ODBC connection if it is relational, and an FTP connection if it is a flat file. We can check the connections for both sources and targets in the properties of the session.

What are variable ports? List two situations when they can be used.
We have mainly three ports: Inport, Outport, and Variable port. Inport represents data flowing into the transformation. Outport is used when data is mapped to the next transformation. Variable port is used when mathematical calculations are required. For example, consider price and quantity, with total as a variable; we can make a sum on total_amt by giving sum(total_amt). A variable port is used to break a complex expression into simpler ones, and it is also used to store intermediate values.

How do you look up data on multiple tables?
If you want to look up data on multiple tables at a time, you can join the tables you want and then look up that joined table. Informatica provides lookup on joined tables.

How do you retrieve the records from a rejected file? Explain with syntax or an example.
SOLN1: There is a utility called "reject loader" where we can find the rejected records, refine them, and reload them.
SOLN2: During the execution of the workflow all the rejected rows are stored in bad files (where your Informatica server is installed, C:\Program Files\Informatica Power Center 7.1\Server). These bad files can be imported as a flat file source, and then through a direct mapping we can load them in the desired format.

What is the use of incremental aggregation? Explain in brief with an example.
It is a session option. When the Informatica server performs incremental aggregation, it passes new source data through the mapping and uses historical cache data to perform new aggregation calculations incrementally. We use it for performance.

What is the difference between the IIF and DECODE functions?

You can use nested IIF statements to test multiple conditions. The following example tests for various conditions and returns 0 if sales is zero or negative:
IIF( SALES > 0, IIF( SALES < 50, SALARY1, IIF( SALES < 100, SALARY2, IIF( SALES < 200, SALARY3, BONUS))), 0 )
You can use DECODE instead of IIF in many cases. DECODE may improve readability. The following shows how you can use DECODE instead of IIF:
DECODE( TRUE, SALES > 0 AND SALES < 50, SALARY1, SALES > 49 AND SALES < 100, SALARY2, SALES > 99 AND SALES < 200, SALARY3, SALES > 199, BONUS)

In dimensional modeling, is the fact table normalized or denormalized in the case of a star schema and in the case of a snowflake schema?
There is no concept of normalization in the case of a star schema, but in the case of a snowflake schema the dimension tables must be normalized.
Star schema -- denormalized dimensions
Snowflake schema -- normalized dimensions

Which is better among connected lookup and unconnected lookup transformations in Informatica or any other ETL tool?
When you compare the two, a connected lookup can return more values while an unconnected lookup returns one value. A connected lookup sits in the same pipeline as the source and accepts dynamic caching. An unconnected lookup does not have that facility, but in some special cases we can use it: if the output of one lookup goes as input to another lookup, unconnected lookups are favorable. I think the better one is the connected lookup (as far as Informatica is concerned), because we can use a dynamic cache with it and because it can send multiple columns in a single row, whereas an unconnected lookup has a single return port.

What is the limit to the number of sources and targets you can have in a mapping?
As far as I know there is no restriction on the number of sources or targets inside a mapping. The real question is: if you make N tables participate in processing at the same time, what is the position of your database? From an organization's point of view it is never encouraged to use N tables at a time, because it reduces database and Informatica server performance. The restriction is only on the database side: how many concurrent threads are you allowed to run on the database server?

Which objects are required by the debugger to create a valid debug session?
The session must first be a valid session. Sources, targets, lookups and expressions should be available, and at least one breakpoint should be available for the debugger to debug your session. An Informatica server object is a must.

What is the procedure to write the query to list the highest salary of three employees?
SELECT sal FROM (SELECT sal FROM my_table ORDER BY sal DESC) WHERE ROWNUM < 4;
Since this is Informatica, you might as well use the Rank transformation. Check the help file on how to use it.
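As a rough, runnable illustration of the top-three-salaries query above, here is a minimal Python sketch using the standard sqlite3 module. The table contents and the function name top_salaries are made up for the example. SQLite has no ROWNUM, so the sketch assumes LIMIT as the equivalent of the Oracle-style "ROWNUM < 4" filter; it is not meant to show how the Rank transformation works internally.

```python
import sqlite3


def top_salaries(rows, n=3):
    """Return the n highest salaries using ORDER BY ... DESC with a row limit."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE my_table (ename TEXT, sal INTEGER)")
    conn.executemany("INSERT INTO my_table VALUES (?, ?)", rows)
    # Assumption: LIMIT stands in for the Oracle "ROWNUM < 4" filter.
    cur = conn.execute(
        "SELECT sal FROM my_table ORDER BY sal DESC LIMIT ?", (n,)
    )
    result = [sal for (sal,) in cur]
    conn.close()
    return result


# Hypothetical sample data.
employees = [("A", 1000), ("B", 5000), ("C", 3000), ("D", 4000), ("E", 2000)]
print(top_salaries(employees))
```

As in the original query, the sorting happens first and the row limit is applied to the sorted result, which is why the ordering sits in the inner SELECT in the Oracle version.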

We are using an Update Strategy transformation in a mapping. How can we know whether the insert, update, reject or delete option was selected while the session ran?
In the Designer, while creating the Update Strategy transformation, uncheck "forward to next transformation". If there are any rejected rows, they are automatically written to the session log file. Updated or inserted rows can be identified only by checking the target file or table.

Explain the commit points for source-based commit and target-based commit. Assume appropriate values wherever required. Suppose the session is configured with a commit interval of 10,000 rows and the source has 50,000 rows.
Source-based commit commits data into the target based on the commit interval, so it commits for every 10,000 rows. Target-based commit commits data into the target based on the buffer size of the target, i.e. it commits whenever the buffer fills. Let us assume that the buffer size is 6,000: then it commits the data for every 6,000 rows.

How do we estimate the number of partitions that a mapping really requires? Is it dependent on the machine configuration?
It depends on the Informatica version we are using. For example, Informatica 6 supports only 32 partitions, whereas Informatica 7 supports 64 partitions.

Can Informatica be used as a cleansing tool? If yes, give examples of transformations that can implement a data cleansing routine.
Yes, we can use Informatica for cleansing data. Sometimes we use staging to cleanse the data, depending on performance; otherwise we can use an Expression transformation to cleanse data. For example, a field X has some values and some nulls and is assigned to a target field that is a NOT NULL column: inside an expression we can assign a space or some constant value to avoid session failure. If the input data is in one format and the target is in another, we can change the format in an expression. We can also assign default values to the target to represent a complete set of data in the target.

Identifying bottlenecks in various components of Informatica and resolving them.
The best way to find bottlenecks is to write to a flat file and see where the bottleneck is.

How do you decide whether to do aggregations at the database level or at the Informatica level?
It depends on the requirement. If you have a database with good processing power, you can create an aggregation table or view at the database level; otherwise it is better to use Informatica. Here is why we might need Informatica: whatever the case, Informatica is a third-party tool, so it takes more time to process an aggregation than the database does. But Informatica has an option called "Incremental aggregation", which updates the current values with current values + new values, so there is no need to process all the values again and again, provided nobody has deleted the cache files. If they were deleted, the total aggregation has to be executed in Informatica again. The database has no incremental aggregation facility.

How to join two tables without using the Joiner transformation?
SOLN1: It is possible to join two or more tables by using a Source Qualifier, provided the tables have a relationship. When you drag and drop the tables you get a Source Qualifier for each table. Delete all the Source Qualifiers and add a common Source Qualifier for all of them. Right-click the Source Qualifier and click Edit. On the Properties tab you will find the SQL query, where you can write your SQL.

SOLN2: The Joiner transformation is used to join n (n>1) tables from the same or different databases, but the Source Qualifier transformation is used to join n tables from the same database only.
SOLN3: Use the Source Qualifier transformation to join tables on the SAME database. Under its Properties tab, you can specify the user-defined join. Any SELECT statement you can run on a database, you can also do in the Source Qualifier. Note: you can only join 2 tables with a Joiner transformation, but you can join two tables from different databases.

In a filter expression we want to compare one date field with the DB2 system field CURRENT DATE. Our syntax: datefield = CURRENT DATE (we didn't define it by ports, it is a system field), but this is not valid (PMParser: Missing Operator). Can someone help us?
The DB2 date format is "yyyymmdd", whereas sysdate in Oracle gives "dd-mm-yy", so converting the DB2 date format to the local database date format is compulsory; otherwise you will get that type of error. Use SYSDATE, or use TO_DATE for the current date.

What do the Expression and Filter transformations do in the Informatica Slowly Growing Target wizard?
The Expression transformation detects and flags the rows from the source. The Filter transformation filters out the rows that are not flagged and passes the flagged rows to the Update Strategy transformation.

How to create the staging area in your database?
A staging area in a DW is used as a temporary space to hold all the records from the source system. So more or less it should be an exact replica of the source systems, except for the load strategy, where we use truncate and reload options. So create it using the same layout as your source tables, or using the Generate SQL option in the Warehouse Designer tab.

What is the difference between the Informatica PowerCenter server, the repository server and the repository?
The PowerCenter server holds the scheduled runs that determine when data should be loaded from source to target. The repository contains all the definitions of the mappings done in the Designer.

What are the differences between Informatica PowerCenter versions 6.2 and 7.1, and also between versions 6.2 and 5.1?
The main difference between Informatica 5.1 and 6.1 is that 6.1 introduced a new thing called the repository server, and in place of the Server Manager (5.1) it introduced the Workflow Manager and Workflow Monitor. In version 7.x you have the option of looking up (Lookup) on a flat file, and you can write to an XML target. Other additions: versioning, LDAP authentication, and support for 64-bit architectures.

Differences between Informatica 6.2 and Informatica 7.0
Features in 7.1 are:
1. Union and Custom transformation

2. Lookup on flat file
3. Grid servers working on different operating systems can coexist on the same server
4. We can use pmcmdrep
5. We can export independent and dependent repository objects
6. We can move mappings into any web application
7. Version controlling
8. Data profiling

What is the difference between connected and unconnected stored procedures?
Run a stored procedure before or after your session: Unconnected
Run a stored procedure once during your mapping, such as pre- or post-session: Unconnected
Run a stored procedure every time a row passes through the Stored Procedure transformation: Connected or Unconnected
Run a stored procedure based on data that passes through the mapping, such as when a specific port does not contain a null value: Unconnected
Pass parameters to the stored procedure and receive a single output parameter: Connected or Unconnected
Pass parameters to the stored procedure and receive multiple output parameters: Connected or Unconnected
Run nested stored procedures: Unconnected
Call multiple times within a mapping: Unconnected
Note: To get multiple output parameters from an unconnected Stored Procedure transformation, you must create variables for each output parameter. For details, see Calling a Stored Procedure From an Expression.

Discuss which is better among incremental load, normal load and bulk load.
If the database supports the bulk load option from Informatica, using BULK LOAD for the initial loading of the tables is recommended; where supported, bulk load is faster than normal load. Depending on the requirement, we should choose between normal and incremental loading strategies. (Incremental load is a different concept; do not confuse it with bulk load or normal load.)

Compare the data warehousing top-down approach with the bottom-up approach.
In the top-down approach we first build the data warehouse and then build the data marts. This needs more cross-functional skills, is a time-consuming process and is also costly.
In the bottom-up approach we first build the data marts and then the data warehouse. The data mart that is built first remains as a proof of concept for the others. It takes less time than the top-down approach and costs less.

What is the difference between a summary filter and a detail filter?
A summary filter can be applied to a group of rows that contain a common value, whereas a detail filter can be applied to each and every record of the database.

What is the difference between a view and a materialized view?

Materialized views are schema objects that can be used to summarize, precompute, replicate, and distribute data, e.g. to construct a data warehouse. A materialized view provides indirect access to table data by storing the results of a query in a separate schema object, unlike an ordinary view, which does not take up any storage space or contain any data.
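The difference described above can be seen in a small Python sketch using the standard sqlite3 module. SQLite has no materialized views, so this example assumes a plain table filled by CREATE TABLE ... AS SELECT as a stand-in for one; the table and column names (sales, v_totals, mv_totals) are made up for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sales (region TEXT, amount INTEGER)")
conn.executemany(
    "INSERT INTO sales VALUES (?, ?)",
    [("east", 10), ("east", 20), ("west", 5)],
)

# Ordinary view: only the query text is stored; it is re-evaluated on each read.
conn.execute(
    "CREATE VIEW v_totals AS "
    "SELECT region, SUM(amount) AS total FROM sales GROUP BY region"
)
# Stand-in for a materialized view (assumption): the query result is stored as rows.
conn.execute(
    "CREATE TABLE mv_totals AS "
    "SELECT region, SUM(amount) AS total FROM sales GROUP BY region"
)

# New data arrives in the base table after both objects were created.
conn.execute("INSERT INTO sales VALUES ('west', 100)")


def totals(name):
    """Read region totals from the given view or table as a dict."""
    return dict(conn.execute(f"SELECT region, total FROM {name}"))


view_after = totals("v_totals")       # sees the new row immediately
snapshot_after = totals("mv_totals")  # still holds the stored result

# A "refresh" recomputes the query and stores the result again.
conn.execute("DELETE FROM mv_totals")
conn.execute(
    "INSERT INTO mv_totals SELECT region, SUM(amount) FROM sales GROUP BY region"
)
refreshed = totals("mv_totals")
```

The stored copy answers queries without re-running the aggregation, at the cost of storage space and of being stale until it is refreshed.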

Can we modify the data in a flat file?

Yes. Just open the text file with Notepad and change whatever you want (but the datatypes should stay the same).

How to get the first 100 rows from the flat file into the target?
SOLN1: In the Workflow Manager, link a task to the session (task -----> session), double-click the link and enter a condition on the source success rows session variable = 100. The session should then stop automatically.
SOLN2: 1. Use the test load option if you only need it for testing. 2. Put a counter or Sequence Generator in the mapping and filter on it.
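The counter idea in SOLN2 can be sketched in a few lines of Python. This is only an analogy: the function name first_n_rows is hypothetical, the running counter plays the role of a Sequence Generator port, and the comparison plays the role of a Filter condition such as counter <= 100.

```python
import io


def first_n_rows(lines, n=100):
    """Pass through only the first n rows of a flat file."""
    passed = []
    # The counter stands in for a Sequence Generator port added to each row.
    for counter, row in enumerate(lines, start=1):
        if counter > n:  # filter condition: keep rows where counter <= n
            break
        passed.append(row.rstrip("\n"))
    return passed


# Hypothetical flat file with 250 rows.
flat_file = io.StringIO("".join(f"row{i}\n" for i in range(1, 251)))
target = first_n_rows(flat_file, 100)
print(len(target), target[0], target[-1])
```

Stopping at the first row past the limit avoids reading the rest of the file, which a plain filter in a mapping would not do on its own.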

Can we look up a table from a Source Qualifier transformation (unconnected lookup)?
No, we can't. Here is why: 1) Unless you connect the output of the Source Qualifier to another transformation or to a target, the field will not be included in the query. 2) The Source Qualifier does not have any variable fields to use in an expression.

What is a junk dimension?
A "junk" dimension is a collection of random transactional codes, flags and/or text attributes that are unrelated to any particular dimension. The junk dimension is simply a structure that provides a convenient place to store the junk attributes. A good example would be a trade fact in a company that brokers equity trades.

What is the difference between Normal load and Bul...
Normal load: Normal load writes information to the database log file, so that it is helpful if any recovery is needed. When the source file is a text file and we are loading data to a table, we should use normal load only, otherwise the session will fail.
Bulk mode: Bulk load does not write information to the database log file, so if any recovery is needed we can't do anything. Comparatively, bulk load is considerably faster than normal load.

At most, how many transformations can be used in a mapping?
There is no limit on the number of transformations. But from a performance point of view, using too many transformations reduces session performance. My idea is: "if a mapping needs many transformations, it is better to go for a stored procedure."

What are the main advantages and purpose of using the Normalizer transformation in Informatica?
The Normalizer transformation is used mainly with COBOL sources, where most of the time data is stored in denormalized format. The Normalizer transformation can also be used to create multiple rows from a single row of data.

How do you convert rows to columns in a Normalizer? Could you explain?
Normally it is used to convert columns to rows. For converting rows to columns we need an Aggregator and an Expression, and a little coding effort. Denormalization is not possible with a Normalizer transformation.
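The two directions mentioned above can be sketched in plain Python. This is only an analogy, not Informatica code: normalize mimics what a Normalizer does (one wide row becomes several narrow rows), and pivot mimics the group-by-key work that the Aggregator and Expression combination would do. The column names (store, q1..q4, period, amount) are invented for the example.

```python
from collections import defaultdict


def normalize(row, key, repeated_cols):
    """Columns to rows: emit one output row per repeated column."""
    return [
        {key: row[key], "period": i, "amount": row[col]}
        for i, col in enumerate(repeated_cols, start=1)
    ]


def pivot(rows, key):
    """Rows to columns: group narrow rows by key and spread them into columns."""
    wide = defaultdict(dict)
    for r in rows:
        wide[r[key]][f"amount_{r['period']}"] = r["amount"]
    return [{key: k, **cols} for k, cols in wide.items()]


# Hypothetical denormalized record with one column per quarter.
wide_row = {"store": "S1", "q1": 10, "q2": 20, "q3": 30, "q4": 40}
narrow = normalize(wide_row, "store", ["q1", "q2", "q3", "q4"])
rebuilt = pivot(narrow, "store")
print(narrow)
print(rebuilt)
```

The pivot step needs to see every row of a group before it can emit the wide row, which is why the reverse direction calls for an aggregating step rather than a Normalizer.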

Discuss the advantages & Disadvantages of star & snowflake schema?
In a star schema every dimension has a primary key, and a dimension table does not have any parent table. In a snowflake schema, a dimension table has one or more parent tables. In a star schema the hierarchies for the dimensions are stored in the dimension table itself, whereas in a snowflake schema the hierarchies are broken into separate tables. These hierarchies help to drill down the data from the topmost to the lowermost level. A star schema consists of a single fact table surrounded by some dimension tables. In a snowflake schema the dimension tables are connected to sub-dimension tables. In a star schema the dimension tables are denormalized; in a snowflake schema they are normalized. A star schema is used for report generation; a snowflake schema is used for cubes. The advantage of the snowflake schema is that the normalized tables are easier to maintain, and it also saves storage space. The disadvantage of the snowflake schema is that it reduces the effectiveness of navigation across the tables because of the large number of joins between them.

What is a time dimension? Give an example.

The time dimension is one of the most important dimensions in a data warehouse. Whenever you generate a report, you access the data through the time dimension. E.g. employee time dimension. Fields: date key, full date, day of week, day, month, quarter, fiscal year.
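To make the field list above concrete, here is a minimal Python sketch that generates time dimension rows with those fields. The function name time_dimension and the YYYYMMDD date key format are choices made for this example, and the fiscal year is assumed to equal the calendar year; a real warehouse would use its own fiscal calendar.

```python
from datetime import date, timedelta

DAY_NAMES = ["Monday", "Tuesday", "Wednesday", "Thursday",
             "Friday", "Saturday", "Sunday"]


def time_dimension(start, days):
    """Build one time dimension row per calendar day, starting at `start`."""
    rows = []
    for offset in range(days):
        d = start + timedelta(days=offset)
        rows.append({
            "date_key": d.year * 10000 + d.month * 100 + d.day,  # e.g. 20240101
            "full_date": d.isoformat(),
            "day_of_week": DAY_NAMES[d.weekday()],
            "day": d.day,
            "month": d.month,
            "quarter": (d.month - 1) // 3 + 1,
            "fiscal_year": d.year,  # assumption: fiscal year = calendar year
        })
    return rows


dim = time_dimension(date(2024, 1, 1), 3)
for row in dim:
    print(row)
```

Fact rows would then carry only date_key and join to this table for any reporting attribute such as quarter or day of week.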

What are connected and unconnected transformations?
A connected transformation is part of the data flow in the pipeline, while an unconnected transformation is not. It is much like calling a program by name versus by reference. Use unconnected transformations when you want to call the same transformation many times in a single mapping. An unconnected transformation can't be connected to another transformation, but it can be called from inside another transformation, and it can be used by as many other transformations as needed. If you are using a transformation several times, use an unconnected one; you get better performance.

How can you create or import a flat file definition into the Warehouse Designer?
SOLN1: You can create a flat file definition in the Warehouse Designer. In the Warehouse Designer, create a new target and select the type as flat file. Save it, and you can enter the columns for the created target by editing its properties. Once the target is created, save it. You can then import it from the Mapping Designer.
SOLN2: You cannot create or import a flat file definition into the Warehouse Designer directly. Instead you must analyze the file in the Source Analyzer, then drag it into the Warehouse Designer. When you drag the flat file source definition into the Warehouse Designer workspace, the Warehouse Designer creates a relational target definition, not a file definition. If you want to load to a file, configure the session to write to a flat file. When the Informatica server runs the session, it creates and loads the flat file.

What are the tasks that the Load Manager process performs?

Manages session and batch scheduling: When you start the Informatica server, the Load Manager launches and queries the repository for a list of sessions configured to run on the Informatica server. When you configure a session, the Load Manager maintains the list of sessions and session start times. When you start a session, the Load Manager fetches the session information from the repository to perform the validations and verifications prior to starting the DTM process.
Locking and reading the session: When the Informatica server starts a session, the Load Manager locks the session in the repository. Locking prevents you from starting the same session again while it is running.
Reading the parameter file: If the session uses a parameter file, the Load Manager reads the parameter file and verifies that the session-level parameters are declared in the file.
Verifying permissions and privileges: When the session starts, the Load Manager checks whether or not the user has the privileges to run the session.
Creating log files: The Load Manager creates a log file that contains the status of the session.

How do you transfer the data from a data warehouse to a flat file?
You can write a mapping with the flat file as a target using a DUMMY_CONNECTION. A flat file target is built by pulling a source into target space using Warehouse Designer tool.

Difference between the Informatica repository server and the Informatica server
Informatica Repository Server: It manages connections to the repository from client applications.
Informatica Server: It extracts the source data, performs the data transformation, and loads the transformed data into the target.

Router transformation
A Router transformation is similar to a Filter transformation because both transformations allow you to use a condition to test data. A Filter transformation tests data for one condition and drops the rows of data that do not meet the condition. However, a Router transformation tests data for one or more conditions and gives you the option to route rows of data that do not meet any of the conditions to a default output group.

What are the 2 modes of data movement in the Informatica Server?
The data movement mode depends on whether the Informatica Server should process single-byte or multi-byte character data. This mode selection can affect the enforcement of code page relationships and code page validation in the Informatica Client and Server.
a) Unicode - the IS allows 2 bytes for each character and uses an additional byte for each non-ASCII character (such as Japanese characters)
b) ASCII - the IS holds all data in a single byte
The IS data movement mode can be changed in the Informatica Server configuration parameters. The change takes effect once you restart the Informatica Server.
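Returning to the Filter versus Router comparison above, the behaviour can be sketched in Python as follows. This is an analogy only: the function names, the group names and the sample rows are invented, and the sketch assumes that a row meeting several group conditions is sent to each matching group.

```python
def filter_rows(rows, condition):
    """Filter: test one condition and drop the rows that do not meet it."""
    return [r for r in rows if condition(r)]


def route_rows(rows, groups):
    """Router: test several conditions; unmatched rows go to the DEFAULT group.

    `groups` maps an output group name to its condition.
    Assumption: a row is sent to every group whose condition it meets.
    """
    out = {name: [] for name in groups}
    out["DEFAULT"] = []
    for r in rows:
        matched = False
        for name, condition in groups.items():
            if condition(r):
                out[name].append(r)
                matched = True
        if not matched:
            out["DEFAULT"].append(r)
    return out


# Hypothetical rows and group conditions.
rows = [{"id": 1, "sales": 30}, {"id": 2, "sales": 150}, {"id": 3, "sales": -5}]
kept = filter_rows(rows, lambda r: r["sales"] > 100)
routed = route_rows(rows, {
    "LOW": lambda r: 0 < r["sales"] < 50,
    "HIGH": lambda r: r["sales"] > 100,
})
print(kept)
print(routed)
```

The Filter output has lost rows 1 and 3 for good, while the Router output still holds every row in some group, which is the practical difference between the two.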

How to read rejected data or bad data from bad file and reload it to target?
Find the rejected data by using the column indicator and row indicator, correct it, and send it to the target relational tables using the loadorder utility.

Explain the informatica Architecture in detail
The Informatica server connects to the source data and the target data using native or ODBC drivers. It also connects to the repository to run sessions and retrieve metadata information.
source ------> informatica server ------> target

[Architecture diagram: source ← informatica server → target; client tools: designer, w.f.manager, w.f.monitor; below them: repository ← Repository → Repository ser.adm.]

How can we partition a session in Informatica?
The Informatica PowerCenter Partitioning option optimizes parallel processing on multiprocessor hardware by providing a thread-based architecture and built-in data partitioning. GUI-based tools reduce the development effort necessary to create data partitions and streamline ongoing troubleshooting and performance tuning tasks, while ensuring data integrity throughout the execution process. As the amount of data within an organization expands and real-time demand for information grows, the PowerCenter Partitioning option enables hardware and applications to provide outstanding performance and jointly scale to handle large volumes of data and users.

What is the Load Manager?
While running a workflow, the PowerCenter Server uses the Load Manager process and the Data Transformation Manager (DTM) process to run the workflow and carry out workflow tasks.
When the PowerCenter Server runs a workflow, the Load Manager performs the following tasks:
1. Locks the workflow and reads workflow properties.
2. Reads the parameter file and expands workflow variables.
3. Creates the workflow log file.
4. Runs workflow tasks.
5. Distributes sessions to worker servers.
6. Starts the DTM to run sessions.
7. Runs sessions from master servers.
8. Sends post-session email if the DTM terminates abnormally.
When the PowerCenter Server runs a session, the DTM performs the following tasks:
1. Fetches session and mapping metadata from the repository.
2. Creates and expands session variables.
3. Creates the session log file.
4. Validates session code pages if data code page validation is enabled. Checks query conversions if data code page validation is disabled.
5. Verifies connection object permissions.
6. Runs pre-session shell commands.
7. Runs pre-session stored procedures and SQL.
8. Creates and runs mapping, reader, writer, and transformation threads to extract, transform, and load data.
9. Runs post-session stored procedures and SQL.
10. Runs post-session shell commands.
11. Sends post-session email.

What is data cleansing?

The process of finding and removing or correcting data that is incorrect, out-of-date, redundant, incomplete, or formatted incorrectly. This is nothing but polishing of data. For example, one of the subsystems may store gender as M and F, while another stores it as MALE and FEMALE, so we need to polish this data and clean it before it is added to the data warehouse. Another typical example is addresses: each subsystem may maintain the customer address differently, so we might need an address cleansing tool to get the customers' addresses into a clean and neat form.

To provide support for mainframe source data, which files are used as source definitions?
COBOL copybook files.

Where should you place the flat file to import the flat file definition into the Designer?
There is no restriction on where to place the source file. From a performance point of view it is better to place the file in the server's local src folder. If you need the path, check the server properties available in the Workflow Manager. This doesn't mean we should not place it in any other folder; if we place it in the server src folder, src is selected by default at session creation time.

In how many ways can you update a relational source definition, and what are they?
Two ways: 1. Edit the definition 2. Reimport the definition

Which transformation do you need when using COBOL sources as source definitions?
The Normalizer transformation, which is used to normalize the data, since COBOL sources often consist of denormalized data.

What is a transformation?
It is a repository object that generates, modifies or passes data. A transformation passes data to the next stage (i.e. to the next transformation or target) with or without modifying the data.

What are active and passive transformations?
Transformations can be active or passive. An active transformation can change the number of rows that pass through it, such as a Filter transformation that removes rows that do not meet the filter condition. A passive transformation does not change the number of rows that pass through it, such as an Expression transformation that performs a calculation on data and passes all rows through the transformation.

What are reusable transformations?
Reusable transformations can be used in multiple mappings. When you need to incorporate such a transformation into a mapping, you add an instance of it to the mapping. If you later change the definition of the transformation, all instances of it inherit the changes. Since an instance of a reusable transformation is a pointer to that transformation, you can change the transformation in the Transformation Developer and its instances automatically reflect the changes. This feature can save you a great deal of work.

What are the methods for creating reusable transformations?
Two methods: 1. Design it in the Transformation Developer. 2. Promote a standard transformation from the Mapping Designer. After you add a transformation to the mapping, you can promote it to the status of reusable transformation. Once you promote a standard transformation to reusable status, you can demote it to a standard transformation at any time. If you change the properties of a reusable transformation in a mapping, you can revert to the original reusable transformation properties by clicking the Revert button.

What is a mapplet?
For example, suppose we have several fact tables that require a series of dimension keys. Then we can create a mapplet that contains a series of Lkp transformations to find each dimension key, and use it in each fact table mapping instead of creating the same Lkp logic in each mapping.

What are the unsupported repository objects for a mapplet?
COBOL source definitions
Joiner transformations
Normalizer transformations
Non-reusable Sequence Generator transformations
Pre- or post-session stored procedures
Target definitions
PowerMart 3.5 style LOOKUP functions
XML source definitions
IBM MQ source definitions

The repository stores the following kinds of metadata:
• Source definitions. Definitions of database objects (tables, views, synonyms) or files that provide source data.
• Target definitions. Definitions of database objects or files that contain the target data.
• Multi-dimensional metadata. Target definitions that are configured as cubes and dimensions.
• Mappings. A set of source and target definitions along with transformations containing business logic that you build into the transformation. These are the instructions that the Informatica Server uses to transform and move data.
• Reusable transformations. Transformations that you can use in multiple mappings.
• Mapplets. A set of transformations that you can use in multiple mappings.
• Sessions and workflows. Sessions and workflows store information about how and when the Informatica Server moves data. A workflow is a set of instructions that describes how and when to run tasks related to extracting, transforming, and loading data. A session is a type of task that you can put in a workflow. Each session corresponds to a single mapping.

What are mapping parameters and mapping variables?
A mapping parameter represents a constant value that you can define before running a session. A mapping parameter retains the same value throughout the entire session. When you use a mapping parameter, you declare and use the parameter in a mapping or mapplet, then define the value of the parameter in a parameter file for the session. Unlike a mapping parameter, a mapping variable represents a value that can change throughout the session. The Informatica server saves the value of a mapping variable to the repository at the end of the session run and uses that value the next time you run the session.

Can you use the mapping parameters or variables created in one mapping in another mapping?
No. We can use mapping parameters or variables in any transformation of the same mapping or mapplet in which they were created.

Can you use the mapping parameters or variables created in one mapping in any other reusable transformation?
Yes, because a reusable transformation is not contained within any mapplet or mapping.

How can you improve session performance in an Aggregator transformation?
Use sorted input: 1. Use a Sorter before the Aggregator. 2. Do not forget to check the option on the Aggregator that tells it that the input is sorted on the same keys as the group by. The key order is also very important.

What is the aggregate cache in an Aggregator transformation?
The Aggregator stores data in the aggregate cache until it completes the aggregate calculations. When you run a session that uses an Aggregator transformation, the Informatica server creates index and data caches in memory to process the transformation. If the Informatica server requires more space, it stores overflow values in cache files.

What are the differences between the Joiner transformation and the Source Qualifier transformation?
You can join heterogeneous data sources in a Joiner transformation, which we cannot achieve in a Source Qualifier transformation. You need matching keys to join two relational sources in a Source Qualifier transformation, whereas you don't need matching keys to join two sources in a Joiner. Two relational sources must come from the same data source in a Source Qualifier.

which are coming from different sources also.

In which conditions can we not use the Joiner transformation (limitations of the Joiner transformation)?
Both pipelines begin with the same original data source.
Both input pipelines originate from the same Source Qualifier transformation.
Both input pipelines originate from the same Normalizer transformation.
Both input pipelines originate from the same Joiner transformation.
Either input pipeline contains an Update Strategy transformation.
Either input pipeline contains a connected or unconnected Sequence Generator transformation.

What are the settings that you use to configure the Joiner transformation?
• Master and detail source
• Type of join
• Condition of the join

The Joiner transformation supports the following join types, which you set in the Properties tab:
• Normal (Default)
• Master Outer
• Detail Outer
• Full Outer

What are the join types in the Joiner transformation?
Normal (Default) -- only matching rows from both master and detail
Master outer -- all detail rows and only matching rows from master
Detail outer -- all master rows and only matching rows from detail
Full outer -- all rows from both master and detail (matching or non-matching)

Follow these steps:
1. In the Mapping Designer, choose Transformation-Create. Select the Joiner transformation. Enter a name, click OK. The naming convention for Joiner transformations is JNR_TransformationName. Enter a description for the transformation. This description appears in the Repository Manager, making it easier for you or others to understand or remember what the transformation does. The Designer creates the Joiner transformation. Keep in mind that you cannot use a Sequence Generator or Update Strategy transformation as a source to a Joiner transformation.
2. Drag all the desired input/output ports from the first source into the Joiner transformation. The Designer creates input/output ports for the source fields in the Joiner as detail fields by default. You can edit this property later.
3. Select and drag all the desired input/output ports from the second source into the Joiner transformation. The Designer configures the second set of source fields and master fields by default.
4. Double-click the title bar of the Joiner transformation to open the Edit Transformations dialog box.
5. Select the Ports tab.
6. Click any box in the M column to switch the master/detail relationship for the sources. Change the master/detail relationship if necessary by selecting the master source in the M column. Tip: Designating the source with fewer unique records as master increases performance during a join.
7. Add default values for specific ports as necessary. Certain ports are likely to contain NULL values, since the fields in one of the sources may be empty. You can specify a default value if the target database does not handle NULLs.
8. Select the Condition tab and set the condition.

9. Click the Add button to add a condition. You can add multiple conditions. The master and detail ports must have matching datatypes. The Joiner transformation only supports equivalent (=) joins.
10. Select the Properties tab and enter any additional settings for the transformations.
11. Click OK.
12. Choose Repository-Save to save changes to the mapping.

What are the joiner caches?
When a Joiner transformation occurs in a session, the Informatica Server reads all the records from the master source and builds index and data caches based on the master rows. After building the caches, the Joiner transformation reads records from the detail source and performs joins.

What is the Lookup transformation?
Use a Lookup transformation in your mapping to look up data in a relational table, view, or synonym. The Informatica Server queries the lookup table based on the lookup ports in the transformation. It compares the Lookup transformation port values to lookup table column values based on the lookup condition.

Why use the Lookup transformation?
To perform the following tasks:
Get a related value. For example, if your source table includes employee ID, but you want to include the employee name in your target table to make your summary data easier to read.
Perform a calculation. Many normalized tables include values used in a calculation, such as gross sales per invoice or sales tax, but not the calculated value (such as net sales).
Update slowly changing dimension tables. You can use a Lookup transformation to determine whether records already exist in the target.

What are the types of lookup?
Connected lookup
Unconnected lookup
Persistent cache
Re-cache from database
Static cache
Dynamic cache
Shared cache

Differences between connected and unconnected lookup?
Connected lookup:
• Receives input values directly from the pipeline.
• You can use a dynamic or static cache.
• Cache includes all lookup columns used in the mapping.
• Supports user-defined default values.
Unconnected lookup:
• Receives input values from the result of a :LKP expression in another transformation.
• You can use a static cache.
• Cache includes all lookup output ports in the lookup condition and the lookup/return port.
• Does not support user-defined default values.

What is meant by lookup caches?
The Informatica Server builds a cache in memory when it processes the first row of data in a cached Lookup transformation. It allocates memory for the cache based on the amount you configure in the transformation or session properties. The Informatica Server stores condition values in the index cache and output values in the data cache.

What are the types of lookup caches?
Persistent cache: You can save the lookup cache files and reuse them the next time the Informatica Server processes a Lookup transformation configured to use the cache.

Recache from database: If the persistent cache is not synchronized with the lookup table, you can configure the Lookup transformation to rebuild the lookup cache.
Static cache: You can configure a static, or read-only, cache for a lookup table. By default the Informatica Server creates a static cache. It caches the lookup table and lookup values in the cache for each row that comes into the transformation. When the lookup condition is true, the Informatica Server does not update the cache while it processes the Lookup transformation.
Dynamic cache: If you want to cache the target table and insert new rows into the cache and the target, you can create a Lookup transformation that uses a dynamic cache. The Informatica Server dynamically inserts data into the target table.
Shared cache: You can share the lookup cache between multiple transformations. You can share an unnamed cache between transformations in the same mapping.

Difference between static cache and dynamic cache
Static cache:
• You cannot insert or update the cache.
• The Informatica Server returns a value from the lookup table or cache when the condition is true. When the condition is not true, the Informatica Server returns the default value for connected transformations and null for unconnected transformations.
Dynamic cache:
• You can insert rows into the cache as you pass them to the target.
• The Informatica Server inserts rows into the cache when the condition is false. This indicates that the row is not in the cache or target table. You can pass these rows to the target table.

Which transformation should we use to normalize COBOL and relational sources?
The Normalizer transformation. When you drag a COBOL source into the Mapping Designer workspace, the Normalizer transformation automatically appears, creating input and output ports for every column in the source.

How does the Informatica Server sort string values in a Rank transformation?
When the Informatica Server runs in ASCII data movement mode, it sorts session data using a binary sort order. If you configure the session to use a binary sort order, the Informatica Server calculates the binary value of each string and returns the specified number of rows with the highest binary values for the string.

What are the rank caches?
During the session, the Informatica Server compares an input row with rows in the data cache. If the input row out-ranks a stored row, the Informatica Server replaces the stored row with the input row. The Informatica Server stores group information in an index cache and row data in a data cache.

What is the RANKINDEX in a Rank transformation?
The Designer automatically creates a RANKINDEX port for each Rank transformation. The Informatica Server uses the Rank Index port to store the ranking position for each record in a group. For example, if you create a Rank transformation that ranks the top 5 salespersons for each quarter, the rank index numbers the salespeople from 1 to 5.

What is the Router transformation?
A Router transformation is similar to a Filter transformation because both transformations allow you to use a condition to test data. However, a Filter transformation tests data for one condition and drops the rows of data that do not meet the condition. A Router transformation tests data for one or more conditions and gives you the option to route rows of data that do not meet any of the conditions to a default output group. If you need to test the same input data based on multiple conditions, use a Router transformation in a mapping instead of creating multiple Filter transformations to perform the same task.

What are the types of groups in a Router transformation?
Input group
Output group
The Designer copies property information from the input ports of the input group to create a set of output ports for each output group. There are two types of output groups: user-defined groups and the default group.
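The difference between a Filter and a Router described in this part of the document can be sketched in a few lines of Python. This is an illustration only, not Informatica code: `filter_rows`, `route_rows`, the group names and the sample orders are all hypothetical. The sketch also assumes that a row meeting several group conditions is sent to every matching group.

```python
def filter_rows(rows, condition):
    """Filter-style: one condition; rows that fail it are dropped."""
    return [row for row in rows if condition(row)]


def route_rows(rows, group_conditions):
    """Router-style: test each row against every user-defined group.

    Rows that meet no condition go to the default group instead of being dropped.
    Assumption: a row meeting several conditions is sent to each matching group.
    """
    groups = {name: [] for name in group_conditions}
    groups["DEFAULT"] = []
    for row in rows:
        matched = False
        for name, condition in group_conditions.items():
            if condition(row):
                groups[name].append(row)
                matched = True
        if not matched:
            groups["DEFAULT"].append(row)
    return groups


orders = [
    {"id": 1, "region": "EAST", "amount": 50},
    {"id": 2, "region": "WEST", "amount": 500},
    {"id": 3, "region": "NORTH", "amount": 20},
]
routed = route_rows(orders, {
    "EAST": lambda r: r["region"] == "EAST",
    "WEST": lambda r: r["region"] == "WEST",
})
east_only = filter_rows(orders, lambda r: r["region"] == "EAST")
```

One pass through `route_rows` does the work of two separate filters and still keeps the NORTH order, which a Filter would simply discard.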

Default group: You cannot modify or delete default groups.

Why do we use the Stored Procedure transformation?
A Stored Procedure transformation is an important tool for populating and maintaining databases. Database administrators create stored procedures to automate time-consuming tasks that are too complicated for standard SQL statements.

What are the types of data that pass between the Informatica Server and a stored procedure?
Three types of data:
Input/output parameters
Return values
Status code

What is the status code?
The status code provides error handling for the Informatica Server during the session. The stored procedure issues a status code that notifies whether or not the stored procedure completed successfully. This value cannot be seen by the user. It is only used by the Informatica Server to determine whether to continue running the session or stop.

What is the Source Qualifier transformation?
When you add a relational or a flat file source definition to a mapping, you need to connect it to a Source Qualifier transformation. The Source Qualifier represents the rows that the Informatica Server reads when it executes a session.

What are the tasks that the Source Qualifier performs?
• Join data originating from the same source database. You can join two or more tables with primary-foreign key relationships by linking the sources to one Source Qualifier.
• Filter records when the Informatica Server reads source data. If you include a filter condition, the Informatica Server adds a WHERE clause to the default query.
• Specify an outer join rather than the default inner join. If you include a user-defined join, the Informatica Server replaces the join information specified by the metadata in the SQL query.
• Specify sorted ports. If you specify a number for sorted ports, the Informatica Server adds an ORDER BY clause to the default SQL query.
• Select only distinct values from the source. If you choose Select Distinct, the Informatica Server adds a SELECT DISTINCT statement to the default SQL query.
• Create a custom query to issue a special SELECT statement for the Informatica Server to read source data. For example, you might use a custom query to perform aggregate calculations or execute a stored procedure.

What is the target load order?
You specify the target load order based on the Source Qualifiers in a mapping. If you have multiple Source Qualifiers connected to multiple targets, you can designate the order in which the Informatica Server loads data into the targets. A target load order group is the collection of Source Qualifiers, transformations, and targets linked together in a mapping.

What is the default join that the Source Qualifier provides?
Inner equi join.

What are the basic needs to join two sources in a Source Qualifier?
The two sources should have a primary and foreign key relationship.
The two sources should have matching data types.

What is the Update Strategy transformation?
The model you choose constitutes your update strategy: how to handle changes to existing rows. In PowerCenter and PowerMart, you set your update strategy at two different levels:
• Within a session. When you configure a session, you can instruct the Informatica Server to

either treat all rows in the same way (for example, treat all rows as inserts), or use instructions coded into the session mapping to flag rows for different database operations.
• Within a mapping. Within a mapping, you use the Update Strategy transformation to flag rows for insert, delete, update, or reject.

Describe the two levels at which the update strategy is set.
Within a session. When you configure a session, you can instruct the Informatica Server to either treat all records in the same way (for example, treat all records as inserts), or use instructions coded into the session mapping to flag records for different database operations.
Within a mapping. Within a mapping, you use the Update Strategy transformation to flag records for insert, delete, update, or reject.

What is the default source option for the Update Strategy transformation?
Data driven.

What is Data driven?
The Informatica Server follows instructions coded into Update Strategy transformations within the session mapping to determine how to flag records for insert, update, delete, or reject. If you do not choose the Data driven option setting, the Informatica Server ignores all Update Strategy transformations in the mapping.

What are the options in the target session for the Update Strategy transformation?
Insert
Delete
Update
Update as update
Update as insert
Update else insert
Truncate table
Update as insert: This option specifies that all the update records from the source are flagged as inserts in the target. In other words, instead of updating the records in the target, they are inserted as new records.
Update else insert: This option enables Informatica to flag the records either for update if they are old, or for insert if they are new records from the source.

What are the types of mapping wizards provided in Informatica?
Simple Pass Through
Slowly Growing Target
Slowly Changing Dimension
Type 1: Most recent values
Type 2: Full history (Version, Flag, Date)
Type 3: Current and one previous

What are the types of mapping in the Getting Started Wizard?
Simple Pass Through mapping: Loads a static fact or dimension table by inserting all rows. Use this mapping when you want to drop all existing data from your table before loading new data.
Slowly Growing Target: Loads a slowly growing fact or dimension table by inserting new rows. Use this mapping to load new data when existing data does not require updates.

What are the mappings that we use for a slowly changing dimension table?
Type 1: Rows containing changes to existing dimensions are updated in the target by overwriting the existing dimension. In the Type 1 Dimension mapping, all rows contain current dimension data. Use the Type 1 Dimension mapping to update a slowly changing dimension table when you do not need to keep any previous versions of dimensions in the table.
Type 2: The Type 2 Dimension Data mapping inserts both new and changed dimensions into the target. Changes are tracked in the target table by versioning the primary key and creating a version number for

each dimension in the table. Use the Type 2 Dimension/Version Data mapping to update a slowly changing dimension table when you want to keep a full history of dimension data in the table. Version numbers and versioned primary keys track the order of changes to each dimension.
Type 3: The Type 3 Dimension mapping filters source rows based on user-defined comparisons and inserts only those found to be new dimensions to the target. Rows containing changes to existing dimensions are updated in the target. When updating an existing dimension, the Informatica Server saves existing data in different columns of the same row and replaces the existing data with the updates.

What are the different types of Type 2 dimension mapping?
Type 2 Dimension/Version Data mapping: In this mapping the updated dimension in the source is inserted into the target along with a new version number, and a newly added dimension in the source is inserted into the target with a primary key.
Type 2 Dimension/Flag Current mapping: This mapping is also used for slowly changing dimensions. In addition, it creates a flag value for changed or new dimensions. The flag indicates whether the dimension is new or newly updated. The most recent dimensions are saved with the current flag value 1, and superseded versions of updated dimensions are saved with the value 0.
Type 2 Dimension/Effective Date Range mapping: This is another flavour of Type 2 mapping used for slowly changing dimensions. This mapping also inserts both new and changed dimensions into the target, and changes are tracked by the effective date range for each version of each dimension.

How can you recognise whether or not the newly added rows in the source get inserted in the target?
In the Type 2 mapping we have three options to recognise the newly added rows:
Version number
Flag value
Effective date range

What are the two types of processes with which Informatica runs the session?
Load Manager process: Starts the session, creates the DTM process, and sends post-session email when the session completes.
DTM process: Creates threads to initialize the session, read, write, and transform data, and handle pre- and post-session operations.

Can you generate reports in Informatica?
It is an ETL tool, so you cannot make reports from it, but you can generate a metadata report, which is not going to be used for business analysis.

What is the Metadata Reporter?
It is a web-based application that enables you to run reports against repository metadata. With the Metadata Reporter, you can access information about your repository without having knowledge of SQL, the transformation language, or the underlying tables in the repository.

Define mapping and session.
Mapping: It is a set of source and target definitions linked by transformation objects that define the rules for transformation.
Session: It is a set of instructions that describe how and when to move data from sources to targets.

Which tool do you use to create and manage sessions and batches and to monitor and stop the Informatica Server?
Informatica Server Manager.

What is polling?
It displays the updated information about the session in the monitor window. The monitor window displays the status of each session when you poll the Informatica Server.

What are the new features of the Server Manager in Informatica 5.0?
You can use command line arguments for a session or batch. This allows you to change the values of session parameters, mapping parameters, and mapping variables.
Parallel data processing: This feature is available for PowerCenter only. If we use the Informatica Server on an SMP system, you can use multiple CPUs to process a session concurrently.
Process session data using threads: The Informatica Server runs the session in two processes. Explained in the previous question.

While importing a relational source definition from a database, what is the metadata of the source that you import?
Source name
Database location
Column names

Datatypes
Key constraints

What are the Designer tools for creating transformations?
Mapping Designer
Transformation Developer
Mapplet Designer

How many ways can you create ports?
Two ways:
1. Drag the port from another transformation.
2. Click the Add button on the Ports tab.

Why do we use session partitioning in Informatica?
Partitioning improves session performance by reducing the time needed to read the source and load the data into the target. Performance can be improved by processing data in parallel in a single session by creating multiple partitions of the pipeline. The Informatica Server can achieve high performance by partitioning the pipeline and performing the extract, transformation, and load for each partition in parallel.

To achieve session partitioning, what are the necessary tasks you have to do?
Configure the session to partition source data.
Install the Informatica Server on a machine with multiple CPUs.

How does the Informatica Server increase session performance through partitioning the source?
For relational sources, the Informatica Server creates multiple connections for each partition of a single source and extracts a separate range of data for each connection. The Informatica Server reads multiple partitions of a single source concurrently. Similarly, for loading, the Informatica Server creates multiple connections to the target and loads partitions of data concurrently.
For XML and file sources, the Informatica Server reads multiple files concurrently. For loading the data, the Informatica Server creates a separate file for each partition (of a source file). You can choose to merge the targets.

Why do you use repository connectivity?
Each time you edit or schedule the session, the Informatica Server communicates directly with the repository to check whether or not the session and users are valid. All the metadata of sessions and mappings is stored in the repository.

What is the DTM process?
After the Load Manager performs validations for the session, it creates the DTM process. The DTM process creates and manages the threads that carry out the session tasks. It creates the master thread. The master thread creates and manages all the other threads.

What are the different threads in the DTM process?
Master thread: Creates and manages all other threads.
Mapping thread: One mapping thread is created for each session. It fetches session and mapping information.
Pre- and post-session threads: These are created to perform pre- and post-session operations.
Reader thread: One thread is created for each partition of a source. It reads data from the source.
Writer thread: It is created to load data to the target.
Transformation thread: It is created to transform data.

What are the data movement modes in Informatica?
Data movement modes determine how the Informatica Server handles character data. You choose the data movement mode in the Informatica Server configuration settings. Two types of data movement modes are available in Informatica:
ASCII mode
Unicode mode

What are the output files that the Informatica Server creates while the session is running?
Informatica Server log: The Informatica Server (on UNIX) creates a log for all status and error messages (default name: pm.server.log). It also creates an error log for error messages. These files are created in the Informatica home directory.
Session log file: The Informatica Server creates a session log file for each session. It writes information about the session into log files, such as the initialization process, creation of SQL commands for reader and writer

threads, errors encountered, and load summary. The amount of detail in the session log file depends on the tracing level that you set.
Session detail file: This file contains load statistics for each target in the mapping. Session details include information such as table name and number of rows written or rejected. You can view this file by double-clicking on the session in the monitor window.
Performance detail file: This file contains information known as session performance details, which helps you see where performance can be improved. To generate this file, select the performance detail option in the session property sheet.
Reject file: This file contains the rows of data that the writer does not write to targets.
Control file: The Informatica Server creates a control file and a target file when you run a session that uses the external loader. The control file contains information about the target flat file, such as data format and loading instructions for the external loader.
Post-session email: Post-session email allows you to automatically communicate information about a session run to designated recipients. You can create two different messages: one if the session completed successfully, the other if the session fails.
Indicator file: If you use a flat file as a target, you can configure the Informatica Server to create an indicator file. For each target row, the indicator file contains a number to indicate whether the row was marked for insert, update, delete, or reject.
Output file: If the session writes to a target file, the Informatica Server creates the target file based on file properties entered in the session property sheet.
Cache files: When the Informatica Server creates a memory cache, it also creates cache files. The Informatica Server creates index and data cache files for the following transformations:
Aggregator transformation
Joiner transformation
Rank transformation
Lookup transformation

In which circumstances does the Informatica Server create reject files?
When it encounters DD_REJECT in an Update Strategy transformation.
When a row violates a database constraint.
When a field in the row was truncated or overflowed.

Can you copy a session to a different folder or repository?
Yes. By using the Copy Session wizard you can copy a session to a different folder or repository, but the target folder or repository must contain the mapping of that session. If the target folder or repository does not have the mapping of the session being copied, you have to copy that mapping first before you copy the session. In addition, you can copy the workflow from the Repository Manager. This automatically copies the mapping, associated sources, targets, and session to the target folder.

What is a batch? Describe the types of batches.
A grouping of sessions is known as a batch. Batches are of two types:
Sequential: Runs sessions one after the other. If you have sessions with source-target dependencies, you have to use a sequential batch to start the sessions one after another.
Concurrent: Runs sessions at the same time. If you have several independent sessions, you can use a concurrent batch, which runs all the sessions at the same time.

How many sessions can you create in a batch?
Any number of sessions.

When does the Informatica Server mark a batch as failed?
If one of the sessions is configured to "run if previous completes" and that previous session fails.

What is the command used to run a batch?
pmcmd is used to start a batch.

What are the different options used to configure sequential batches?
Two options:

Run the session only if the previous session completes successfully.
Always run the session.

In a sequential batch, can you run the session if the previous session fails?
Yes, by setting the option to always run the session.

Can you start a batch within a batch?
You cannot. If you want to start a batch that resides in a batch, create a new independent batch and copy the necessary sessions into the new batch.

Can you start a session inside a batch individually?
We can start the required session individually only in the case of a sequential batch. In the case of a concurrent batch we cannot do this.

How can you stop a batch?
By using the Server Manager or pmcmd.

What are session parameters?
Session parameters are like mapping parameters: they represent values you might want to change between sessions, such as database connections or source files. The Server Manager also allows you to create user-defined session parameters. The following are user-defined session parameters:
Database connections
Source file name: Use this parameter when you want to change the name or location of the session source file between session runs.
Target file name: Use this parameter when you want to change the name or location of the session target file between session runs.
Reject file name: Use this parameter when you want to change the name or location of the session reject files between session runs.

What is a parameter file?
A parameter file defines the values for parameters and variables used in a session. A parameter file is a file created with a text editor such as WordPad or Notepad. You can define the following values in a parameter file:
Mapping parameters
Mapping variables
Session parameters

For Windows command prompt users, the parameter file name cannot have beginning or trailing spaces. If the name includes spaces, enclose the file name in double quotes:
    -paramfile "$PMRootDir\my file.txt"
Note: When you write a pmcmd command that includes a parameter file located on another machine, use the backslash (\) with the dollar sign ($). This ensures that the machine where the variable is defined expands the server variable.
    pmcmd startworkflow -uv USERNAME -pv PASSWORD -s SALES:6258 -f east -w wSalesAvg -paramfile '\$PMRootDir/myfile.txt'

How can you access a remote source in your session?
Relational source: To access a relational source that is situated in a remote place, you need to configure a database connection to the data source.
File source: To access a remote source file, you must configure the FTP connection to the host machine before you create the session.
Heterogeneous: When your mapping contains more than one source type, the Server Manager creates a heterogeneous session that displays source options for all types.

What is the difference between partitioning of relational targets and partitioning of file targets?
If you partition a session with a relational target, the Informatica Server creates multiple connections to the target database to write target data concurrently. If you partition a session with a file target, the Informatica Server creates one target file for each partition. You can configure session properties to merge these target files.

What are the transformations that restrict the partitioning of sessions?
Advanced External Procedure transformation and External Procedure transformation: This transformation contains a check box on the Properties tab to allow partitioning.
Aggregator transformation: If you use sorted ports, you cannot partition the associated source.

Joiner transformation: You cannot partition the master source for a Joiner transformation.
Normalizer transformation
XML targets

Performance tuning in Informatica?
The goal of performance tuning is to optimize session performance so that sessions run during the available load window for the Informatica Server. Increase session performance with the following measures.

The performance of the Informatica Server is related to network connections. Data generally moves across a network at less than 1 MB per second, whereas a local disk moves data five to twenty times faster. Network connections therefore often affect session performance, so avoid network connections where you can.
Flat files: If your flat files are stored on a machine other than the Informatica Server, move those files to the machine that hosts the Informatica Server.
Relational data sources: Minimize the connections to sources, targets, and the Informatica Server to improve session performance. Moving the target database onto the server system may improve session performance.
Staging areas: If you use staging areas, you force the Informatica Server to perform multiple data passes. Removing staging areas may improve session performance.

You can run multiple Informatica Servers against the same repository. Distributing the session load to multiple Informatica Servers may improve session performance.

Running the Informatica Server in ASCII data movement mode improves session performance, because ASCII data movement mode stores a character value in one byte. Unicode mode takes 2 bytes to store a character.

If a session joins multiple source tables in one Source Qualifier, optimizing the query may improve performance. Also, single-table SELECT statements with an ORDER BY or GROUP BY clause may benefit from optimization such as adding indexes.

We can improve session performance by configuring the network packet size, which determines how much data crosses the network at one time. To do this, go to the Server Manager and choose Server Configure Database Connections.

If your target has key constraints and indexes, they slow the loading of data. To improve session performance in this case, drop constraints and indexes before you run the session and rebuild them after the session completes.

Running parallel sessions by using concurrent batches also reduces the time needed to load the data, so concurrent batches may also increase session performance.

Partitioning the session improves session performance by creating multiple connections to sources and targets and loading data in parallel pipelines.

In some cases, if a session contains an Aggregator transformation, you can use incremental aggregation to improve session performance.

Avoid transformation errors to improve session performance.

If the session contains a Lookup transformation, you can improve session performance by enabling the lookup cache.

If your session contains a Filter transformation, create that Filter transformation nearer to the sources, or use a filter condition in the Source Qualifier.
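The advice to place a Filter transformation close to the sources can be illustrated with a small Python sketch. It is not Informatica code; `convert`, `filter_late`, `filter_early` and the sample rows are hypothetical stand-ins that only count how many rows reach an expensive transformation when the filter runs after it versus before it.

```python
def convert(row, stats):
    """Stand-in for an expensive transformation; counts how often it runs."""
    stats["transformed"] += 1
    return {**row, "amount_cents": row["amount"] * 100}


def filter_late(rows, keep):
    """Transform every row first, then discard the unwanted ones."""
    stats = {"transformed": 0}
    out = [r for r in (convert(row, stats) for row in rows) if keep(r)]
    return out, stats["transformed"]


def filter_early(rows, keep):
    """Discard unwanted rows first, then transform only the survivors."""
    stats = {"transformed": 0}
    out = [convert(row, stats) for row in rows if keep(row)]
    return out, stats["transformed"]


rows = [{"id": i, "amount": i, "active": i % 10 == 0} for i in range(1, 101)]


def is_active(row):
    return row["active"]


late_rows, late_work = filter_late(rows, is_active)
early_rows, early_work = filter_early(rows, is_active)
```

Both functions return the same rows, but the early filter runs the transformation on only the rows that survive, which is the saving the tip is after.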

Aggregator, Rank, and Joiner transformations may often decrease session performance, because they must group data before processing it. To improve session performance in this case, use the sorted ports option.

What is the difference between a mapplet and a reusable transformation?
A mapplet consists of a set of transformations that is reusable. A reusable transformation is a single transformation that can be reused.
Variables or parameters created in a mapplet cannot be used in another mapping or mapplet. In contrast, variables created in a reusable transformation can be used in any other mapping or mapplet.
We cannot include source definitions in reusable transformations, but we can add sources to a mapplet.
The whole transformation logic is hidden in the case of a mapplet, but it is transparent in the case of a reusable transformation.
We cannot use COBOL Source Qualifier, Joiner, or Normalizer transformations in a mapplet, whereas we can make them reusable transformations.

Define the Informatica repository.
The Informatica repository is a relational database that stores information, or metadata, used by the Informatica Server and Client tools. Metadata can include information such as mappings describing how to transform source data, sessions indicating when you want the Informatica Server to perform the transformations, and connect strings for sources and targets. The repository also stores administrative information such as usernames and passwords, permissions and privileges, and product version.
Use the Repository Manager to create the repository. The Repository Manager connects to the repository database and runs the code needed to create the repository tables. These tables store metadata in the specific format that the Informatica Server and client tools use.

What are the types of metadata stored in the repository?
The following are the types of metadata stored in the repository:
Database connections
Global objects
Mappings
Mapplets
Multidimensional metadata
Reusable transformations
Sessions and batches
Shortcuts
Source definitions
Target definitions
Transformations

• Source definitions. Definitions of database objects (tables, views, synonyms) or files that provide source data.
• Target definitions. Definitions of database objects or files that contain the target data.
• Multi-dimensional metadata. Target definitions that are configured as cubes and dimensions.
• Mappings. A set of source and target definitions along with transformations containing business logic that you build into the transformation. These are the instructions that the Informatica Server uses to transform and move data.
• Reusable transformations. Transformations that you can use in multiple mappings.
• Mapplets. A set of transformations that you can use in multiple mappings.
• Sessions and workflows. Sessions and workflows store information about how and when the Informatica Server moves data. A workflow is a set of instructions that describes how and when to run tasks related to extracting, transforming, and loading data. A session is a type of task that you can put in a workflow. Each session corresponds to a single mapping.

What is the PowerCenter repository?
The PowerCenter repository allows you to share metadata across repositories to create a data mart domain. In a data mart domain, you can create a single global repository to store metadata used across an enterprise, and a number of local repositories to share the global metadata as needed.
• Standalone repository. A repository that functions individually, unrelated and unconnected to other repositories.
• Global repository. (PowerCenter only.) The centralized repository in a domain, a group of connected repositories. Each domain can contain one global repository. The global repository can contain common objects to be shared throughout the domain through global shortcuts.
• Local repository. (PowerCenter only.)

only.) A repository within a domain that is not the global repository. Each local repository in the domain can connect to the global repository and use objects in its shared folders.

How can you work with a remote database in Informatica? Did you work directly by using remote connections?
To work with a remote data source you need to connect to it with remote connections. However, working with the remote source directly through remote connections is not preferable. Instead, bring that source onto the local machine where the Informatica Server resides. If you work directly with the remote source, session performance decreases because less data can pass across the network in a given time.
You can work with a remote source, but you have to configure the FTP connection details: IP address and user authentication.

What is incremental aggregation?
When using incremental aggregation, you apply captured changes in the source to aggregate calculations in a session. If the source changes only incrementally and you can capture changes, you can configure the session to process only those changes. This allows the Informatica Server to update your target incrementally, rather than forcing it to process the entire source and recalculate the same calculations each time you run the session.

What are the scheduling options to run a session?
You can schedule a session to run at a given time or interval, or you can run the session manually. The scheduling options are:
Run only on demand: the server runs the session only when the user starts the session explicitly.
Run once: the Informatica Server runs the session only once at a specified date and time.
Run every: the Informatica Server runs the session at regular intervals as you configured.
Customized repeat: the Informatica Server runs the session at the dates and times specified in the Repeat dialog box.

What is tracing level and what are the types of tracing level?
Tracing level represents the amount of information that the Informatica Server writes in a log file. The types of tracing level are Terse, Normal, Verbose Initialization and Verbose Data.

What is the difference between a Stored Procedure transformation and an External Procedure transformation?
In a Stored Procedure transformation, the procedure is compiled and executed in a relational data source. You need a database connection to import the stored procedure into your mapping. In an External Procedure transformation, the procedure or function is executed outside of the data source, i.e. you need to build it as a DLL to access it in your mapping. No database connection is needed in the case of an External Procedure transformation.

Explain recovering sessions.
If you stop a session or if an error causes a session to stop, refer to the session and error logs to determine the cause of failure. Correct the errors, and then complete the session. The method you use to complete the session depends on the properties of the mapping, session, and Informatica Server configuration. Use one of the following methods to complete the session:
· Run the session again if the Informatica Server has not issued a commit.
· Truncate the target tables and run the session again if the session is not recoverable.
· Consider performing recovery if the Informatica Server has issued at least one commit.

If a session fails after loading 10,000 records into the target, how can you load the records from the 10,001st record when you run the session next time?
As explained above, the Informatica Server has three methods for recovering sessions. Use perform recovery to load the records from the point where the session failed.

Explain perform recovery.
When the Informatica Server starts a recovery session, it reads the OPB_SRVR_RECOVERY table and notes the row ID of the last row committed to

the target database. The Informatica Server then reads all sources again and starts processing from the next row ID. For example, if the Informatica Server commits 10,000 rows before the session fails, when you run recovery, the Informatica Server bypasses the rows up to 10,000 and starts loading with row 10,001.
By default, Perform Recovery is disabled in the Informatica Server setup. You must enable Recovery in the Informatica Server setup before you run a session so the Informatica Server can create and/or write entries in the OPB_SRVR_RECOVERY table.

How to recover a standalone session?
A standalone session is a session that is not nested in a batch. If a standalone session fails, you can run recovery using a menu command or pmcmd. These options are not available for batched sessions.
To recover sessions using the menu:
1. In the Server Manager, highlight the session you want to recover.
2. Select Server Requests-Stop from the menu.
3. With the failed session highlighted, select Server Requests-Start Session in Recovery Mode from the menu.
To recover sessions using pmcmd:
1. From the command line, stop the session.
2. From the command line, start recovery.

How can you recover a session in sequential batches?
If you configure a session in a sequential batch to stop on failure, you can run recovery starting with the failed session. The Informatica Server completes the session and then runs the rest of the batch. Use the Perform Recovery session property.
To recover sessions in sequential batches configured to stop on failure:
1. In the Server Manager, open the session property sheet.
2. On the Log Files tab, select Perform Recovery, and click OK.
3. Run the session.
4. After the batch completes, open the session property sheet.
5. Clear Perform Recovery, and click OK.
If you do not clear Perform Recovery, the next time you run the session, the Informatica Server attempts to recover the previous session.
If you do not configure a session in a sequential batch to stop on failure, and the remaining sessions in the batch complete, recover the failed session as a standalone session.

How to recover sessions in concurrent batches?
If multiple sessions in a concurrent batch fail, you might want to truncate all targets and run the batch again. However, if a session in a concurrent batch fails and the rest of the sessions complete successfully, you can recover the session as a standalone session.
To recover a session in a concurrent batch:
1. Copy the failed session using Operations-Copy Session.
2. Drag the copied session outside the batch to be a standalone session.
3. Follow the steps to recover a standalone session.
4. Delete the standalone copy.

How can you complete unrecoverable sessions?
Under certain circumstances, when a session does not complete, you need to truncate the target tables and run the session from the beginning. Run the session from the beginning when the Informatica Server cannot run recovery or when running recovery might result in inconsistent data.

Under what circumstances does the Informatica Server produce an unrecoverable session?
· The Source Qualifier transformation does not use sorted ports.
· You change the partition information after the initial session fails.
· Perform Recovery is disabled in the Informatica Server configuration.
· The sources or targets change after the initial session fails.
· The mapping contains a Sequence Generator or Normalizer transformation.
· A concurrent batch contains multiple failed sessions.

If I make any modifications to my table in the back end, are they reflected in the Informatica Warehouse Designer, Mapping Designer or Source Analyzer?
No. Informatica is not concerned with the back-end database. It displays only the information that is stored in the repository. To reflect back-end changes in the Informatica screens,

you have to import the definitions again from the back end into Informatica through a valid connection, and replace the existing definitions with the imported ones.

After dragging the ports of three sources (SQL Server, Oracle, Informix) to a single Source Qualifier, can you map these three ports directly to the target?
No. Unless you join those three sources in the Source Qualifier, you cannot map them directly. If you drag three heterogeneous sources and populate the target without any join, you are producing a Cartesian product. Without a join, not only different sources but also homogeneous sources show the same error. If you do not want to use joins at the Source Qualifier level, you can add the joins separately.

What is Code Page Compatibility?
Compatibility between code pages is used for accurate data movement when the Informatica Server runs in the Unicode data movement mode. If the code pages are identical, then there will not be any data loss. One code page can be a subset or superset of another. For accurate data movement, the target code page must be a superset of the source code page.
Superset - A code page is a superset of another code page when it contains all the characters encoded in the other code page and also contains additional characters not contained in the other code page.
Subset - A code page is a subset of another code page when all characters in the code page are encoded in the other code page.

What is a Code Page used for?
A code page is used to identify characters that might be in different languages. If you are importing Japanese data into a mapping, you must select the Japanese code page for the source data.

What is a Source Qualifier?
It is a transformation which represents the data the Informatica Server reads from a source. The Source Qualifier represents the rows that the Informatica Server reads when it executes a session. It represents all data queried from the source.

What are Target Types on the Server?
Target types are File, Relational, XML and ERP.

What are Target Options on the Server?
Target options for the File target type are FTP File, Loader and MQ. There are no target options for the ERP target type. Target options for Relational are Insert, Update (as Update), Update (as Insert), Update (else Insert), Delete, and Truncate Table.

How do you identify existing rows of data in the target table using a Lookup transformation?
You can identify existing rows of data using an unconnected Lookup transformation. You can also use a connected Lookup with a dynamic cache on the target.

What is an Aggregator transformation?
The Aggregator transformation is much like the GROUP BY clause in traditional SQL. It is a connected, active transformation which takes the incoming data from the mapping pipeline, groups it based on the specified group-by ports, and calculates aggregate functions (avg, sum, count, stddev, etc.) for each of those groups. From a performance perspective, if your mapping has an Aggregator transformation, use filters and sorters very early in the pipeline if there is any need for them.

What are the various types of aggregation?
The various types of aggregation are SUM, AVG, COUNT, MAX, MIN, FIRST, LAST, MEDIAN, PERCENTILE, STDDEV, and VARIANCE.

What are Dimensions and the various types of Dimensions?
A dimension is a set of level properties that describe a specific aspect of a business, used for analyzing the factual measures of one or more cubes which use that dimension. Examples: geography, time, customer and

product.

Why do we use Lookup transformations?
Lookup transformations can access data from relational tables that are not sources in the mapping. With a Lookup transformation, we can accomplish the following tasks:
Get a related value - get the Employee Name from the Employee table based on the Employee ID.
Perform a calculation.
Update slowly changing dimension tables - we can use an unconnected Lookup transformation to determine whether the records already exist in the target or not.

What are Sessions and Batches?
Session - A session is a set of instructions that tells the Informatica Server how and when to move data from sources to targets. After creating the session, we can use either the Server Manager or the command line program pmcmd to start or stop the session.
Batches - A batch provides a way to group sessions for either serial or parallel execution by the Informatica Server. There are two types of batches:
Sequential - runs sessions one after the other.
Concurrent - runs sessions at the same time.

What is the Data Transformation Manager?
After the Load Manager performs validations for the session, it creates the DTM process. The DTM process is the second process associated with the session run. The primary purpose of the DTM process is to create and manage threads that carry out the session tasks.
· The DTM allocates process memory for the session and divides it into buffers. This is also known as buffer memory. It creates the main thread, which is called the master thread. The master thread creates and manages all other threads.
· If we partition a session, the DTM creates a set of threads for each partition to allow concurrent processing. When the Informatica Server writes messages to the session log, it includes the thread type and thread ID.
Following are the types of threads that the DTM creates:
Master thread - Main thread of the DTM process. Creates and manages all other threads.
Mapping thread - One thread for each session. Fetches session and mapping information.
Pre- and post-session thread - One thread each to perform pre- and post-session operations.
Reader thread - One thread for each partition for each source pipeline.
Writer thread - One thread for each partition, if a target exists in the source pipeline, to write to the target.
Transformation thread - One or more transformation threads for each partition.

ETL Questions and Answers

What is a metadata extension?
Informatica allows end users and partners to extend the metadata stored in the repository by associating information with individual objects in the repository. For example, when you create a mapping, you can store your contact information with the mapping. You associate information with repository metadata using metadata extensions.
Informatica Client applications can contain the following types of metadata extensions:
• Vendor-defined. Third-party application vendors create vendor-defined metadata extensions. You can view and change the values of vendor-defined metadata extensions, but you cannot create, delete, or redefine them.
• User-defined. You create user-defined metadata extensions using PowerCenter/PowerMart. You can create, edit, delete, and view user-defined metadata extensions. You can also change the values of user-defined extensions.

What is ODS (operation data source)?
ANS1: ODS stands for Operational Data Store.

The ODS comes between the staging area and the data warehouse. The data in the ODS is at a low level of granularity. Once data has been populated in the ODS, aggregated data is loaded into the EDW through the ODS.
ANS2: An updatable set of integrated operational data used for enterprise-wide tactical decision making. It contains live data, not snapshots, and has minimal history retained.

Can we look up a table from a Source Qualifier transformation, i.e. an unconnected lookup?
You cannot look up from a Source Qualifier directly. However, you can override the SQL in the Source Qualifier to join with the lookup table to perform the lookup.

What are the different Lookup methods used in Informatica?
The Lookup transformation has mainly two types: 1) connected and 2) unconnected.
Connected lookup: 1) receives values directly from the pipeline 2) can use both dynamic and static caches 3) can return multiple values 4) supports user-defined default values.
Unconnected lookup: 1) receives values from a :LKP expression 2) uses only a static cache 3) returns only a single value 4) does not support user-defined default values.

What are parameter files? Where do we use them?
A parameter file is any text file in which you define values for the parameters defined in an Informatica session. The parameter file is referenced in the session properties. When the Informatica session runs, the values for the parameters are fetched from the specified file. For example, $$ABC is defined in the Informatica mapping and the value for this variable is defined in a file called abc.txt as
[foldername_session_name]
$$ABC='hello world'
In the session properties, you give abc.txt in the parameter file name field.

What is a mapping, session, worklet, workflow, mapplet?
Mapping - represents the flow and transformation of data from source to target.
Mapplet - a group of transformations that can be called within a mapping.
Session - a task associated with a mapping to define the connections and other configurations for that mapping.
Workflow - controls the execution of tasks such as commands, emails and sessions.
Worklet - a workflow that can be called within a workflow.

What is the difference between Power Center & Power Mart?
Power Mart is designed for:
Low range of warehouses

Only local repositories
Mainly desktop environments
Power Center is designed for:
High-end warehouses
Global as well as local repositories
ERP support

What are the various tools? Name a few.
The various ETL tools are: Informatica, DataStage, Business Objects Data Integrator, Ab Initio.
OLAP tools are: Cognos, Business Objects.

Can Informatica load heterogeneous targets from heterogeneous sources?
Yes, it loads from heterogeneous sources.

What is partitioning? What are the types of partitioning?
Partitioning is a part of physical data warehouse design that is carried out to improve performance and simplify stored-data management. Partitioning is done to break up a large table into smaller, independently manageable components because it:
1. reduces the work involved with the addition of new data.
2. reduces the work involved with purging of old data.
Two types of partitioning are:
1. Horizontal partitioning.
2. Vertical partitioning (reduces efficiency in the context of a data warehouse).

What is Full load & Incremental or Refresh load?
Full load is the entire data dump taking place the very first time. After that, to synchronize the target data with the source data, there are two further techniques:
Refresh load - the existing data is truncated and reloaded completely.
Incremental load - the delta, or difference between target and source data, is loaded at regular intervals. The timestamp of the previous delta load has to be maintained.

What are snapshots? What are materialized views & where do we use them? What is a materialized view log?
A materialized view is a view whose data is also stored physically in tables. With an ordinary view, the database stores only the query, and the data is extracted from the base tables each time the view is called. With a materialized view, the data itself is stored.

What are the modules in Power Mart?
1. Power Mart Designer
2. Server
3. Server Manager
4. Repository

5. Repository Manager

What is a staging area? Do we need it? What is the purpose of a staging area?
A staging area is a place where you hold temporary tables on the data warehouse server. Staging tables are connected to the work area or fact tables. We basically need a staging area to hold the data, and to perform data cleansing and merging, before loading the data into the warehouse.
A staging area is like a large table with data separated from its sources, to be loaded into a data warehouse in the required format. If we attempt to load data directly from OLTP, it might mess up the OLTP because of format changes between a warehouse and OLTP. Keeping the OLTP data intact is very important for both the OLTP and the warehouse.
The staging area is a temporary schema used to:
1. Do flat mapping, i.e. dumping all the OLTP data into it without applying any business rules. Pushing data into staging takes less time because no business rules or transformations are applied to it.
2. Perform data cleansing and validation using First Logic.

How to determine what records to extract?
The data modeler will provide the ETL developer with the tables that are to be extracted from the various sources. When addressing a table, some dimension key must reflect the need for a record to get extracted. Mostly it will come from the time dimension (e.g. date >= 1st of current month) or a transaction flag (e.g. Order Invoiced Stat). A foolproof approach would be adding an archive flag to the record which gets reset when the record changes.

What are the various transformations available?
Aggregator Transformation
Expression Transformation
Filter Transformation
Joiner Transformation
Lookup Transformation
Normalizer Transformation
Rank Transformation
Router Transformation
Sequence Generator Transformation
Stored Procedure Transformation
Sorter Transformation
Update Strategy Transformation
XML Source Qualifier Transformation
Advanced External Procedure Transformation
External Procedure Transformation

What is a three tier data warehouse?
A three tier data warehouse contains three tiers: bottom tier, middle tier and top tier.
The bottom tier deals with retrieving related data or information from various information repositories by using SQL.
The middle tier contains two types of servers:
1. ROLAP server
2. MOLAP server
The top tier deals with presentation or visualization of the results.
The 3 tiers are:
1. Data tier - bottom tier - consists of the database
2. Application tier - middle tier - consists of the analytical server
3. Presentation tier - the tier that interacts with the end user
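
The extraction rule described under "How to determine what records to extract?" can be sketched in a few lines. This is only an illustration, not Informatica code: the record layout (`order_date`, `status`) and the flag value `"INVOICED"` are assumptions made for the example.

```python
from datetime import date


def first_of_month(today):
    """Return the first day of the month that contains `today`."""
    return today.replace(day=1)


def records_to_extract(records, today, invoiced_flag="INVOICED"):
    """Select records by the time rule or the transaction-flag rule.

    A record qualifies when its order_date is on or after the first of the
    current month, or when its status equals the (hypothetical) invoiced flag.
    """
    cutoff = first_of_month(today)
    return [
        record
        for record in records
        if record["order_date"] >= cutoff or record.get("status") == invoiced_flag
    ]


sample_records = [
    {"id": 1, "order_date": date(2024, 5, 3), "status": "OPEN"},
    {"id": 2, "order_date": date(2024, 4, 28), "status": "INVOICED"},
    {"id": 3, "order_date": date(2024, 4, 10), "status": "OPEN"},
]
selected = records_to_extract(sample_records, today=date(2024, 5, 15))
selected_ids = [record["id"] for record in selected]
```

In a real mapping the same condition would normally live in the source filter or SQL override rather than in application code.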

Do we need an ETL tool? When do we go for the tools in the market?
ETL tools are meant to extract, transform and load data into a data warehouse for decision making. Before the evolution of ETL tools, the ETL process was done manually using SQL code created by programmers. This task was tedious and cumbersome in many cases since it involved many resources, complex coding and more work hours. On top of that, maintaining the code posed a great challenge for the programmers.
These difficulties are eliminated by ETL tools, since they are very powerful and offer many advantages over the old method in all stages of the ETL process: extraction, data cleansing, data profiling, transformation, debugging and loading into the data warehouse.
1. ETL normally stands for Extraction, Transformation and Loading.
2. An ETL tool helps you to extract data from different ODS/databases.
3. If you have a requirement like this, you need an ETL tool; otherwise you do not need one.

How can we use mapping variables in Informatica? Where do we use them?
After creating a variable, we can use it in any expression in a mapping or a mapplet. Variables can also be used in Source Qualifier filters, user-defined joins or extract overrides, and in the expression editor of reusable transformations. Their values can change automatically between sessions.

Can we use procedural logic inside Informatica? If yes, how? If not, how can we use external procedural logic in Informatica?
We can use the External Procedure transformation to call external procedures. Both COM and Informatica procedures are supported through the External Procedure transformation.

What are the various methods of getting incremental records or delta records from the source systems?
Getting incremental records from source systems into the target can be done by using incremental aggregation.

Techniques of Error Handling - Ignore, Rejecting bad records to a flat file, loading the records and reviewing them (default values)
Records can be rejected either by the database, due to a constraint or key violation, or by the Informatica Server when writing data into the target table. We can find these rejected records in the bad file folder, where a reject file is created for each session. From it we can check why a record was rejected. The bad file contains a row indicator as its first column and a column indicator as its second column.
The column indicators are of four types: D - valid data, O - overflowed data, N - null data, T - truncated data.
Depending on these indicators, we can make changes so that the data loads successfully into the target.
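
As a rough illustration of reading such a reject file, the sketch below decodes one line into its row indicator and per-column statuses. The exact layout is an assumption made for this example: a comma-separated line with the row indicator first, followed by repeating (column indicator, value) pairs. Only the four indicator letters come from the text above.

```python
import csv
import io

# Column indicator meanings as listed in the answer above.
COLUMN_INDICATORS = {
    "D": "valid data",
    "O": "overflowed data",
    "N": "null data",
    "T": "truncated data",
}


def parse_reject_line(line):
    """Split a reject-file line into (row_indicator, [(status, value), ...]).

    Assumed layout: row indicator, then (column indicator, value) pairs.
    """
    fields = next(csv.reader(io.StringIO(line)))
    row_indicator, pairs = fields[0], fields[1:]
    columns = []
    for index in range(0, len(pairs) - 1, 2):
        indicator, value = pairs[index], pairs[index + 1]
        columns.append((COLUMN_INDICATORS.get(indicator, "unknown"), value))
    return row_indicator, columns


def problem_columns(line):
    """Return (column position, status) for every column that is not valid data."""
    _, columns = parse_reject_line(line)
    return [
        (position, status)
        for position, (status, _value) in enumerate(columns)
        if status != "valid data"
    ]


example_line = "0,D,1001,N,,T,Widget"  # hypothetical reject-file line
row_indicator, decoded = parse_reject_line(example_line)
issues = problem_columns(example_line)
```

Listing only the non-valid columns makes it quicker to see which fields need fixing before the rows are reloaded.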

Can we override a native SQL query within Informatica? Where do we do it? How do we do it?
We can override a SQL query in the SQL override property of a Source Qualifier.

What is the latest version of Power Center / Power Mart?
Power Center 7.1

What is Informatica Metadata and where is it stored?
Informatica metadata contains all the information about the source tables, target tables and transformations, so that it is useful and easy to perform transformations during the ETL process. The Informatica metadata is stored in the Informatica repository.

What are active transformations / passive transformations?
Transformations can be active or passive. An active transformation can change the number of rows that pass through it, such as a Filter transformation that removes rows that do not meet the filter condition. A passive transformation does not change the number of rows that pass through it, such as an Expression transformation that performs a calculation on data and passes all rows through the transformation. In other words, a passive transformation outputs the same number of rows that it was given as input.
Active transformations: Advanced External Procedure, Aggregator, Application Source Qualifier, Filter, Joiner, Normalizer, Rank, Router, Update Strategy.
Passive transformations: Expression, External Procedure, Mapplet - Input, Lookup, Sequence Generator, XML Source Qualifier, Mapplet - Output.

How do we call shell scripts from Informatica?
You can use a Command task to call shell scripts, in the following ways:
1. Standalone Command task. You can use a Command task anywhere in the workflow or worklet to run shell commands.
2. Pre- and post-session shell command. You can call a Command task as the pre- or post-session shell command for a Session task.
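
The pre- and post-session idea can be pictured with a small sketch outside Informatica. This is not how a Command task is implemented; it only shows the ordering (pre-session command, session, post-session command). The function name `run_session_with_commands` and the placeholder commands are hypothetical.

```python
import subprocess
import sys


def run_session_with_commands(pre_command, session, post_command):
    """Run a pre-session command, the session callable, then a post-session command.

    Each command is an argument list passed to subprocess.run; check=True makes
    a failing command raise, so the following steps are skipped.
    """
    steps = []
    subprocess.run(pre_command, check=True)
    steps.append("pre")
    result = session()
    steps.append("session")
    subprocess.run(post_command, check=True)
    steps.append("post")
    return steps, result


# Placeholder commands: a real setup would call a shell script here.
noop_command = [sys.executable, "-c", "pass"]
steps_run, session_result = run_session_with_commands(
    noop_command, lambda: "loaded", noop_command
)
```

Raising on a failed pre-session command mirrors the usual expectation that a session should not start when its preparation step fails, though the actual behavior in a workflow depends on how the task is configured.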

Compare ETL & Manual development?
There are pros and cons to both tool-based ETL and hand-coded ETL. Tool-based ETL provides maintainability, ease of development and a graphical view of the flow. It also reduces the learning curve on the team.
Hand-coded ETL is good when there is minimal transformational logic involved. It is also good when the sources and targets are in the same environment. However, depending on the skill level of the team, this can extend the overall development time.

When do we Analyze the tables? How do we do it?
When the data in the data warehouse changes frequently, we need to analyze the tables. Analyzing tables computes or updates the table statistics, which helps to boost the performance of your SQL.

Primary Key Materialized Views
The following statement creates the primary-key materialized view on the table emp located on a remote database.
SQL> CREATE MATERIALIZED VIEW mv_emp_pk REFRESH FAST START WITH SYSDATE NEXT SYSDATE + 1/48 WITH PRIMARY KEY AS SELECT * FROM emp@remote_db;
Materialized view created.
Note: When you create a materialized view using the FAST option you will need to create a view log on the master table(s) as shown below:
SQL> CREATE MATERIALIZED VIEW LOG ON emp;
Materialized view log created.

Rowid Materialized Views
The following statement creates the rowid materialized view on table emp located on a remote database:
SQL> CREATE MATERIALIZED VIEW mv_emp_rowid REFRESH WITH ROWID AS SELECT * FROM emp@remote_db;
Materialized view created.

Subquery Materialized Views
The following statement creates a subquery materialized view based on the emp and dept tables located on the remote database:
SQL> CREATE MATERIALIZED VIEW mv_empdept AS SELECT * FROM emp@remote_db e WHERE EXISTS (SELECT * FROM dept@remote_db d WHERE e.dept_no = d.dept_no);

REFRESH CLAUSE
[refresh [fast|complete|force] [on demand | commit] [start with date] [next date] [with {primary key|rowid}]]
The refresh option specifies:
a. The refresh method used by Oracle to refresh data in the materialized view
b. Whether the view is primary key based or rowid based
c. The time and interval at which the view is to be refreshed

Refresh Method - FAST Clause
FAST refreshes use the materialized view logs (as seen above) to send the rows that have changed from the master tables to the materialized view. You should create a materialized view log for the master tables if you specify the REFRESH FAST clause.
SQL> CREATE MATERIALIZED VIEW LOG ON emp;
Materialized view log created.
Materialized views are not eligible for fast refresh if the defined subquery contains an analytic function.

Refresh Method - COMPLETE Clause
The complete refresh re-creates the entire materialized view. If you request a complete refresh, Oracle performs a complete refresh even if a fast refresh is possible.

Refresh Method - FORCE Clause
When you specify a FORCE clause, Oracle will perform a fast refresh if one is possible or a complete refresh otherwise. If you do not specify a refresh method (FAST, COMPLETE, or FORCE), FORCE is the default.

PRIMARY KEY and ROWID Clause
WITH PRIMARY KEY is used to create a primary key materialized view, i.e. the materialized view is based on the primary key of the master table instead of ROWID (for the ROWID clause). PRIMARY KEY is the default option. To use the PRIMARY KEY clause you should have defined a PRIMARY KEY on the master table, or else you should use ROWID based materialized views.
Primary key materialized views allow materialized view master tables to be reorganized without affecting the eligibility of the

materialized view for fast refresh.
Rowid materialized views should have a single master table and cannot contain any of the following:
• Distinct or aggregate functions
• GROUP BY
• Subqueries, joins and set operations

Timing the refresh
The START WITH clause tells the database when to perform the first replication from the master table to the local base table. It should evaluate to a future point in time. The NEXT clause specifies the interval between refreshes.
SQL> CREATE MATERIALIZED VIEW mv_emp_pk REFRESH FAST START WITH SYSDATE NEXT SYSDATE + 2 WITH PRIMARY KEY AS SELECT * FROM emp@remote_db;
Materialized view created.
In the above example, the first copy of the materialized view is made at SYSDATE and the refresh is performed every two days.

Informatica Interview Questions - Part 15

What is tracing level and what are the types of tracing levels?
Tracing level represents the amount of information that the Informatica Server writes in a log file. The types of tracing level are Terse, Normal, Verbose Initialization and Verbose Data.

What are active and passive transformations?
Transformations can be active or passive. An active transformation can change the number of rows that pass through it, such as a Filter transformation that removes rows that do not meet the filter condition. A passive transformation does not change the number of rows that pass through it, such as an Expression transformation that performs a calculation on data and passes all rows through the transformation.
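
The row-count distinction between active and passive transformations can be shown with plain functions. This is a minimal sketch in ordinary Python, not Informatica's transformation API; the function names and the sample sales rows are made up for the illustration.

```python
def filter_transformation(rows, condition):
    """Active: rows that fail the condition are dropped, so the count can change."""
    return [row for row in rows if condition(row)]


def expression_transformation(rows, calculate):
    """Passive: every input row is passed through with extra calculated fields."""
    return [{**row, **calculate(row)} for row in rows]


sales_rows = [
    {"employee": "A", "units": 12, "price": 5.0},
    {"employee": "B", "units": 0, "price": 7.5},
    {"employee": "C", "units": 3, "price": 2.0},
]

# Filter keeps only rows with at least one unit sold (row count may shrink).
filtered = filter_transformation(sales_rows, lambda row: row["units"] > 0)

# Expression adds a revenue field to each row (row count stays the same).
with_revenue = expression_transformation(
    sales_rows, lambda row: {"revenue": row["units"] * row["price"]}
)
```

Comparing `len(filtered)` and `len(with_revenue)` against `len(sales_rows)` is the whole point: only the filter changes the number of rows.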

How can you say that Union transformation is an active transformation?
By definition, an active transformation is a transformation that changes the number of rows that pass through it. In a Union transformation the number of rows resulting from the union can be different from the actual number of rows.

Informatica Interview Questions - Part 14

What are the different threads in DTM process?
Master thread: Creates and manages all other threads.
Mapping thread: One mapping thread is created for each session. It fetches session and mapping information.
Pre and post session threads: These are created to perform pre and post session operations.
Reader thread: One thread is created for each partition of a source. It reads data from the source.
Transformation thread: It is created to transform data.
Writer thread: It is created to load data to the target.

Is a fact table normalized or de-normalized?
A fact table is always a DENORMALISED table. It consists of data from the dimension tables (primary keys): the fact table has foreign keys and measures.

If we are using Update Strategy Transformation in a mapping how can we know whether insert or update or reject or delete option has been selected during running of sessions in Informatica?
In Designer, while creating the Update Strategy Transformation, uncheck "forward to next transformation". If there are any rejected rows, they are automatically written to the session log file.

Update or insert rows are known by checking the target file or table only.

Which is better among incremental load, Normal Load and Bulk load?
It depends on the requirement. Otherwise incremental load can be better, as it takes only the data that is not already available on the target.

What is the difference between summary filter and detail filter?
A summary filter can be applied on a group of rows that contain a common value, whereas detail filters can be applied on each and every record of the database.

How to join two tables without using the Joiner Transformation?
It is possible to join two or more tables by using a source qualifier, provided the tables have a relationship. When you drag and drop the tables you get a source qualifier for each table. Delete all the source qualifiers. Add a common source qualifier for all. Right click on the source qualifier and you will find EDIT; click on it. Click on the properties tab and you will find the SQL query, in which you can write your SQL.

What are the tasks that the Load manager process will do?
Manages the session and batch scheduling: When you start the Informatica server, the load manager launches and queries the repository for a list of sessions configured to run on the Informatica server. When you configure the session, the load manager maintains a list of sessions and session start times. When you start a session, the load manager fetches the session information from the repository to perform the validations and verifications prior to starting the DTM process.
Locking and reading the session: When the Informatica server starts a session, the load manager locks the session from the repository. Locking prevents starting the session again and again.
Reading the parameter file: If the session uses a parameter file, the load manager reads the parameter file and verifies that the session level parameters are declared in the file.
Verifies permission and privileges: When the session starts, the load manager checks whether or not the user has privileges to run the session.

Creating log files: The load manager creates a log file that contains the status of the session.

How to delete duplicate rows in flat files source?
Use a sorter transformation; in this you will have a "distinct" option, so make use of it.

What type of metadata is stored in repository?
Source definitions: Definitions of database objects (tables, views, synonyms) or files that provide source data.
Target definitions: Definitions of database objects or files that contain the target data.
Multi-dimensional metadata: Target definitions that are configured as cubes and dimensions.
Mappings: A set of source and target definitions along with transformations containing business logic that you build into the transformation. These are the instructions that the Informatica Server uses to transform and move data.
Reusable transformations: Transformations that you can use in multiple mappings.
Mapplets: A set of transformations that you can use in multiple mappings.
Sessions and workflows: Sessions and workflows store information about how and when the Informatica Server moves data. A workflow is a set of instructions that describes how and when to run tasks related to extracting, transforming, and loading data. A session is a type of task that you can put in a workflow. Each session corresponds to a single mapping.

Informatica Interview Questions - Part 13

What is Router transformation?
Router transformation allows you to use a condition to test data. It is similar to filter transformation. It allows the testing to be done on one or more conditions.
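As a rough illustration of how routing on several conditions differs from a single filter, here is a minimal Python sketch. It is not Informatica code. The group names and sample rows are hypothetical, and the DEFAULT group for rows that match no condition is an assumption that goes beyond what the answer above states.

```python
def route(rows, groups):
    """Test each row against every group condition.

    groups maps a group name to a predicate. A row is copied to every group
    whose condition is true; rows matching no condition go to DEFAULT
    (an assumed catch-all group).
    """
    routed = {name: [] for name in groups}
    routed["DEFAULT"] = []
    for row in rows:
        matched = False
        for name, condition in groups.items():
            if condition(row):
                routed[name].append(row)
                matched = True
        if not matched:
            routed["DEFAULT"].append(row)
    return routed


# Hypothetical sample data and conditions.
rows = [
    {"region": "east", "amount": 50},
    {"region": "west", "amount": 150},
    {"region": "east", "amount": 200},
    {"region": "north", "amount": 10},
]
routed = route(rows, {
    "EAST": lambda r: r["region"] == "east",
    "BIG": lambda r: r["amount"] > 100,
})
print({name: len(group) for name, group in routed.items()})
```

A filter would need one pass per condition and would discard everything else; here a single pass feeds all groups and nothing is lost.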

Can you use aggregator/active transformation after update strategy transformation?
You can use an aggregator after update strategy. The problem is that, once you perform the update strategy, say you had flagged some rows to be deleted and you had performed the aggregator transformation for all rows, say using the SUM function, then the deleted rows will be subtracted from this aggregator transformation.

What is the difference between dimension table and fact table and what are different dimension tables and fact tables?
A fact table contains measurable data and contains a primary key. Different types of fact tables: 1. Additive 2. Non additive 3. Semi additive.
A dimension table contains textual description of data. It contains a primary key.

Informatica Interview Questions - Part 12

What is meant by lookup cache?
The Informatica server builds a cache in memory when it processes the first row of data in a cached lookup transformation. It allocates memory for the cache based on the amount you configure in the transformation or session properties. The Informatica server stores condition values in the index cache and output values in the data cache.

Can you use the mapping parameters or variables created in one mapping into any other reusable transformation?
Yes, because a reusable transformation is not contained within any mapplet or mapping.

What are reusable transformations?
You can design them using two methods: 1. Using the transformation developer 2. Create a normal one and promote it to reusable.

What is Code Page used for?
Code Page is used to identify characters that might be in different languages. If you are

importing Japanese data into a mapping, you must select the Japanese code page of the source data.

What are the differences between connected and unconnected lookup?
Connected lookup: 1) Receives input values directly from the pipeline. 2) You can use a dynamic or static cache. 3) Cache includes all lookup columns used in the mapping. 4) Supports user defined default values.
Unconnected lookup: 1) Receives input values from the result of a lkp expression in another transformation. 2) You can use a static cache. 3) Cache includes all lookup output ports in the lookup condition and the lookup/return port. 4) Does not support user defined default values.

Can you use a session Bulk loading options and during this time can you make a recovery to the session?
If the session is configured to use bulk mode it will not write recovery information to the recovery tables. So bulk loading will not perform the recovery as required.

Informatica Interview Questions - Part 11

What are the scheduling options to run a session?
A session can be scheduled to run at a given time or interval, or you can manually run the session. Different options of scheduling:
Run only on demand: The server runs the session only when the user starts the session explicitly.
Run once: The Informatica server runs the session only once at a specified date and time.
Run every: The Informatica server runs the session at regular intervals as you configured.
Customized repeat: The Informatica server runs the session at the dates and times specified in the repeat dialog box.

What is parameter file?
A parameter file defines the values for parameters and variables used in a session. A parameter file is a file created by a text editor such as WordPad or Notepad. You can define the following values in parameter file:

Mapping parameters, mapping variables, session parameters.

What are the session parameters?
Session parameters are like mapping parameters: they represent values you might want to change between sessions, such as database connections or source files. Server manager also allows you to create user defined session parameters. Following are user defined session parameters:
Database connections
Source file names: Use this parameter when you want to change the name or location of the session source file between session runs.
Target file name: Use this parameter when you want to change the name or location of the session target file between session runs.
Reject file name: Use this parameter when you want to change the name or location of the session reject files between session runs.

How can you transform row to a column?
1. We can use normalizer transformation or 2. Use the pivot function in Oracle.

What are the basic needs to join two sources in a source qualifier?
1) Both sources should be in the same database. 2) They should have at least one column in common with the same data types.

Informatica Interview Questions - Part 10

What are two types of processes that informatica runs the session?
Load manager process: Starts the session, creates the DTM process, and sends post-session email when the session completes.
The DTM process: Creates threads to initialize the session, read, write, and transform data, and handle pre- and post-session operations.

In a sequential batch can you run the session if previous session fails?
Yes, by setting the option to always run the session.

What is the default join that source qualifier provides?
Inner equi join.

In which circumstances that informatica server creates Reject files?
When it encounters DD_Reject in an update strategy transformation, when a row violates a database constraint, or when a field in the row was truncated or overflowed.

What is the difference between Stored Procedure (DB level) and Stored proc trans (INFORMATICA level)? Why should we use SP trans?
First of all, stored procedures (at DB level) are a series of SQL statements, and they are stored and compiled at the server side. In Informatica it is a transformation that uses the same stored procedures which are stored in the database. Stored procedures are used to automate time-consuming tasks that are too complicated for standard SQL statements. If you don't want to use the stored procedure then you have to create an expression transformation and do all the coding in it.

What is the method of loading 5 flat files of having same structure to a single target and which transformations I can use?
Two methods. 1. Write all files in one directory, then use the file repository concept (don't forget to set the source file type to indirect in the session). 2. Use a union transformation to combine multiple input files into a single target.

What are mapping parameters and variables in which situation we can use it?
If we need to change certain attributes of a mapping every time the session is run, it would be very difficult to edit the mapping and then change the attribute. So we use mapping parameters and variables and define the values in a parameter file. Then we can edit the parameter file to change the attribute values. This makes the process simple.
Mapping parameter values remain constant. If we need to change the parameter value then we need to edit the parameter file: with a mapping parameter we need to manually edit the attribute value in the parameter file after every session run. But the value of mapping variables can be changed by using variable functions. If we need to increment the attribute value by 1 after every session run then we can use mapping variables.
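Since the values live in a plain text parameter file, generating that file from a script is one way to change them between runs. The sketch below is only an illustration: `render_parameter_file` is a hypothetical helper, and the section header layout and the `$Src_file` / `$tgt_file` names are copied from the parameter file example given earlier in this document rather than from any official reference.

```python
def render_parameter_file(folder, workflow, session, values):
    """Build parameter file text: one section header, then one name=value line each.

    The header layout [folder.WF:workflow.ST:session] follows the example
    shown earlier in this document; treat it as an assumption.
    """
    lines = [f"[{folder}.WF:{workflow}.ST:{session}]"]
    for name, value in values.items():
        lines.append(f"{name}={value}")
    return "\n".join(lines) + "\n"


# Placeholder names and paths taken from the earlier example.
text = render_parameter_file(
    "folder_name",
    "workflow_name",
    "s_session_name",
    {
        "$Src_file": r"c:\program files\informatica\server\bin\abc_source.txt",
        "$tgt_file": r"c:\targets\abc_targets.txt",
    },
)
print(text)
```

Writing `text` to a file before each run gives every run its own source and target without touching the mapping.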

Informatica Interview Questions - Part 9

What are variable ports and list two situations when they can be used?
We have mainly three ports: Input, Output and Variable port. An input port represents data flowing into the transformation. An output port is used when data is mapped to the next transformation. A variable port is used when mathematical calculations are required.

This is a scenario in which the source has 2 columns with the rows 10 A, 10 A, 20 C, 30 D, 40 E, 20 C, and there should be 2 targets, one to show the duplicate values and another target for distinct rows:
T1: 10 A, 20 C
T2: 10 A, 20 C, 30 D, 40 E
Which transformation can be used to load data into target?
Step 1: Sort the source data based on the unique key.
Expression: flag = iif(col1 = prev_col1, 'Y', 'N'); prev_col1 = col1
Router: 1. For duplicate records, condition: flag = 'Y' 2. For distinct records, condition: flag = 'N'

What are the types of lookup caches?
1) Static Cache 2) Dynamic Cache 3) Persistent Cache 4) Reusable Cache

5) Shared Cache

What are the real time problems that generally come up while doing/running mapping/any transformation? Explain with an example?
Here are a few real time examples of problems while running Informatica mappings:
1) Informatica uses ODBC connections to connect to the databases. The database passwords (production) are changed periodically and the same is not updated at the Informatica side. Your mappings will fail in this case and you will get a database connectivity error.
2) If you are using an Update strategy transformation in the mapping, in the session properties you have to select Treat Source Rows: Data Driven. If we do not select this, the Informatica server will ignore updates and it only inserts rows.
3) If we have mappings loading multiple target tables we have to provide the Target Load Plan in the sequence we want them to get loaded.
4) Error: Snapshot too old is a very common error when using Oracle tables. We get this error while using very large tables. Ideally we should schedule these loads when the server is not very busy (meaning when no other loads are running).
5) We might get some poor performance issues while reading from large tables. All the source tables should be indexed and updated regularly.

Informatica Interview Questions - Part 8

Is sorter an active or passive transformation? What happens if we uncheck the distinct option in sorter? Will it be under active or passive transformation?
Sorter is an active transformation, because the distinct option eliminates the duplicate records from the table. If you don't check the distinct option it is considered a passive transformation.

How can we partition a session in Informatica?
The partitioning option optimizes parallel processing on multiprocessor hardware by providing a thread-based architecture and built-in data partitioning. GUI-based tools reduce the development effort necessary to create data partitions and streamline ongoing troubleshooting and performance tuning tasks, while ensuring data integrity throughout the execution process. As the amount of data within an organization expands and real-time demand for information grows, the Power Center Partitioning option enables hardware and applications to provide outstanding performance and jointly scale to handle large volumes of data and users.

In update strategy target table or flat file which gives more performance? Why?
Pros: Loading, sorting and merging operations will be faster as there is no index concept and Data

will be in ASCII mode. Cons: There is no concept of updating existing records in a flat file. As there are no indexes, lookup speed will be lower.

What is the difference between constraint base load ordering and target load plan?
Constraint based load ordering example: Table 1---Master, Table 2---Detail. If the data in Table-1 is dependent on the data in Table-2 then Table-2 should be loaded first. In such cases, to control the load order of the tables we need some conditional loading, which is nothing but constraint based load. In Informatica this feature is implemented by just one check box at the session level.

What is parameter file?
When you start a workflow, you can optionally enter the directory and name of a parameter file. The Informatica Server runs the workflow using the parameters in the file you specify.
For UNIX shell users, enclose the parameter file name in single quotes: -paramfile '$PMRootDir/myfile.txt'
For Windows command prompt users, the parameter file name cannot have beginning or trailing spaces. If the name includes spaces, enclose the file name in double quotes: -paramfile "$PMRootDir\my file.txt"
Note: When you write a pmcmd command that includes a parameter file located on another machine, use the backslash (\) with the dollar sign ($). This ensures that the machine where the variable is defined expands the server variable.
pmcmd startworkflow -UV USERNAME -PV PASSWORD -s SALES:6258 -f east -w wSalesAvg -paramfile '\$PMRootDir/myfile.txt'

Informatica Interview Questions - Part 7

Define informatica repository?
The Informatica repository is at the center of the Informatica suite. You create a set of metadata tables within the repository database that the Informatica application and tools access. The Informatica client and server access the repository to save and retrieve

metadata.

What are the difference between joiner transformation and source qualifier transformation?
Joiner Transformation can be used to join tables from heterogeneous (different) sources, but we still need a common key from both tables. If we join two tables without a common key we will end up with a Cartesian join. Joiner can be used to join tables from different source systems, whereas Source qualifier can be used to join tables in the same database. We definitely need a common key to join two tables, no matter whether they are in the same database or in different databases.

How can you improve session performance in aggregator transformation?
One way is supplying sorted input to the aggregator transformation. In situations where sorted input cannot be supplied, we need to configure the data cache and index cache at session/transformation level to allocate more space to support aggregation.

What is the difference between connected and unconnected stored procedures?
Unconnected: The unconnected Stored Procedure transformation is not connected directly to the flow of the mapping. It either runs before or after the session, or is called by an expression in another transformation in the mapping.
Connected: The flow of data through a mapping in connected mode also passes through the Stored Procedure transformation. All data entering the transformation through the input ports affects the stored procedure. You should use a connected Stored Procedure transformation when you need data from an input port sent as an input parameter to the stored procedure, or the results of a stored procedure sent as an output parameter to another transformation.

Informatica Interview Questions - Part 6

Explain error handling in informatica with examples?
There is one file called the bad file, which generally has the format *.bad and contains the records rejected by the Informatica server. There are two parameters, one for the types of row and the other for the types of columns. The row indicators signify what operation is going to take place (i.e. insertion, deletion, updating etc.). The column indicators contain information regarding why the column has been rejected (such as violation of a not null constraint, value error, overflow etc.). If one rectifies the error in the data present in the bad file and then reloads the data in the target, then the table will contain only valid data.

What is power center repository?
Standalone repository: A repository that functions individually, unrelated and unconnected to other repositories.
Global repository: (Power Center only.) The centralized repository in a domain, a group of connected repositories. Each domain can contain one global repository. The global repository can contain common objects to be shared throughout the domain through global shortcuts.

Local repository: (Power Center only.) A repository within a domain that is not the global repository. Each local repository in the domain can connect to the global repository and use objects in its shared folders.

How the informatica server sorts the string values in Rank transformation?
We can run the Informatica server either in UNICODE data movement mode or ASCII data movement mode.
Unicode mode: In this mode the Informatica server sorts the data as per the sort order in the session.
ASCII mode: In this mode the Informatica server sorts the data as per the binary order.

Explain Informatica server Architecture?
Informatica server, load manager/rs, data transfer manager, reader, temp server and writer are the components of the Informatica server. First the load manager sends a request to the reader. If the reader is ready, it reads the data from the source and dumps it into the temp server. The data transfer manager manages the load and sends the request to the writer as per a first in first out process, and the writer takes the data from the temp server and loads it into the target.

What is update strategy transformation?
The model you choose constitutes your update strategy: how to handle changes to existing rows. In Power Center and Power Mart, you set your update strategy at two different levels:
Within a session: When you configure a session, you can instruct the Informatica Server to either treat all rows in the same way (for example, treat all rows as inserts), or use instructions coded into the session mapping to flag rows for different database operations.
Within a mapping: Within a mapping, you use the Update Strategy transformation to flag rows for insert, delete, update, or reject.

What is Data driven?
The Informatica server follows instructions coded into update strategy transformations within the session mapping to determine how to flag records for insert, update, delete or reject. If you do not choose the data driven option setting, the Informatica server ignores all update strategy transformations in the mapping.

Explain difference between static and dynamic cache with one example?
Static Cache: Once the data is cached, it will not change. Example: an unconnected lookup uses a static cache.
Dynamic Cache: The cache is updated to reflect the updates in the table (or source) to which it refers. (Ex. connected lookup.)
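The static versus dynamic cache behaviour described above can be sketched in a few lines of Python. This is a simplified model under stated assumptions, not the server's implementation: the class names, the `lookup_or_insert` method and the sample keys are hypothetical, and a dictionary stands in for the cache files.

```python
class StaticLookupCache:
    """Built once from the lookup table and never modified during the run."""

    def __init__(self, table_rows):
        self._cache = dict(table_rows)

    def lookup(self, key):
        return self._cache.get(key)


class DynamicLookupCache:
    """Starts from the same data but is updated as new rows pass through."""

    def __init__(self, table_rows):
        self._cache = dict(table_rows)

    def lookup_or_insert(self, key, value):
        # Return what happened so the caller can flag the row accordingly.
        if key in self._cache:
            return "existing", self._cache[key]
        self._cache[key] = value
        return "inserted", value


# Hypothetical lookup table and source rows; key 3 arrives twice.
table = [(1, "alice"), (2, "bob")]
source = [(3, "carol"), (1, "alice"), (3, "carol")]

static_cache = StaticLookupCache(table)
dynamic_cache = DynamicLookupCache(table)

static_results = [static_cache.lookup(key) for key, _ in source]
dynamic_results = [dynamic_cache.lookup_or_insert(key, value) for key, value in source]

print(static_results)
print(dynamic_results)
```

With the static cache the second occurrence of key 3 still looks new, which is how duplicates reach the target. With the dynamic cache the second occurrence is recognised as existing.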

When do you use an unconnected lookup and connected lookup? Or what is the difference between dynamic and static lookup? Or why and when do we use dynamic and static lookup?
With a static lookup cache, you cache all the lookup data at the start of the session. In what this answer calls a dynamic lookup (strictly speaking, an uncached lookup), you go and query the database to get the lookup value for each record which needs the lookup. A static lookup cache adds to the session start-up time, but it saves time as Informatica does not need to connect to your database every time it needs to look up. Depending on how many rows in your mapping need a lookup, you can decide on this. Also remember that a static lookup cache eats up space, so remember to select only those columns which are needed.

How do we do unit testing in informatica? How do we load data in informatica?
Unit testing in Informatica is of two types: 1. Quantitative testing 2. Qualitative testing.
Quantitative testing steps: 1. First validate the mapping. 2. Create a session on the mapping and then run the workflow. Once the session has succeeded, right click on the session and go to the statistics tab. There you can see how many source rows are applied, how many rows are loaded into the targets and how many rows are rejected. This is called Quantitative testing.
Once the rows are successfully loaded, we go for qualitative testing.
Qualitative testing steps: 1. Take the DATM (the DATM is where all business rules are mentioned against the corresponding source columns) and check whether the data is loaded into the target table according to the DATM. If any data is not loaded according to the DATM, then go and check the code and rectify it. This is called Qualitative testing.
This is what a developer will do in Unit Testing.

What are the output files that the informatica server creates during the session run?
Informatica server log: Informatica server (on Unix) creates a log for all status and

error messages (default name: pm.server.log). It also creates an error log for error messages. These files will be created in the Informatica home directory.
Session log file: The Informatica server creates a session log file for each session. It writes information about the session into log files, such as the initialization process, creation of SQL commands for reader and writer threads, errors encountered and load summary. The amount of detail in the session log file depends on the tracing level that you set.
Session detail file: This file contains load statistics for each target in the mapping. Session details include information such as table name and number of rows written or rejected. You can view this file by double clicking on the session in the monitor window.
Performance detail file: This file contains information known as session performance details, which helps you see where performance can be improved. To generate this file, select the performance detail option in the session property sheet.
Reject file: This file contains the rows of data that the writer does not write to targets.
Control file: The Informatica server creates a control file and a target file when you run a session that uses the external loader. The control file contains the information about the target flat file, such as data format and loading instructions for the external loader.
Post session email: Post session email allows you to automatically communicate information about a session run to designated recipients. You can create two different messages: one if the session completed successfully, the other if the session fails.
Indicator file: If you use a flat file as a target, you can configure the Informatica server to create an indicator file. For each target row, the indicator file contains a number to indicate whether the row was marked for insert, update, delete or reject.
Output file: If the session writes to a target file, the Informatica server creates the target file based on the file properties entered in the session property sheet.
Cache files: When the Informatica server creates a memory cache it also creates cache files. For the following circumstances informatica server creates index and data cache

files: Aggregator transformation, Joiner transformation, Rank transformation, Lookup transformation.

How do you handle decimal places while importing a flat file into informatica?
While importing the flat file definition, just specify the scale for a numeric data type. In the mapping, the flat file source supports only the number data type (no decimal and integer). The SQ associated with that source will have the decimal data type for that number port of the source:
Source: Number data type port. SQ: decimal data type. Integer is not supported. Hence decimal is taken care of.

Differences between Normalization and Normalizer transformation?
Normalizer: It is a transformation mainly used for Cobol sources. It changes the rows into columns and columns into rows.
Normalization: To remove redundancy and inconsistency.

What is the target load order?
You specify the target load order based on source qualifiers in a mapping. If you have multiple source qualifiers connected to multiple targets, you can designate the order in which the Informatica server loads data into the targets.

What is the use of incremental aggregation? Explain in brief with an example?
It's a session option. When the Informatica server performs incremental aggregation, it passes new source data through the mapping and uses historical cache data to perform new aggregation calculations incrementally. We use it for performance.

What can you do to increase performance or explain Performance tuning in Informatica?
The goal of performance tuning is to optimize session performance so sessions run during the available load window for the Informatica Server. Increase the session performance by the following:
The performance of the Informatica Server is related to network connections. Data generally moves across a network at less than 1 MB per second, whereas a local disk moves data five to twenty times faster. Thus network connections often affect session performance. So avoid

network connections.
Flat files: If your flat files are stored on a machine other than the Informatica server, move those files to the machine that hosts the Informatica server.
Relational datasources: Minimize the connections to sources, targets and Informatica server to improve session performance. Moving the target database onto the server system may improve session performance.
Staging areas: If you use staging areas you force the Informatica server to perform multiple data passes. Removing staging areas may improve session performance.
You can run multiple Informatica servers against the same repository. Distributing the session load to multiple Informatica servers may improve session performance.
Running the Informatica server in ASCII data movement mode improves the session performance, because ASCII data movement mode stores a character value in one byte. Unicode mode takes 2 bytes to store a character.
If a session joins multiple source tables in one Source Qualifier, optimizing the query may improve performance. Also, single table select statements with an ORDER BY or GROUP BY clause may benefit from optimization such as adding indexes.
We can improve the session performance by configuring the network packet size, which determines how much data crosses the network at one time. To do this go to server manager and choose server configure database connections.
If your target has key constraints and indexes, you slow the loading of data. To improve the session performance in this case, drop constraints and indexes before you run the session and rebuild them after completion of the session.
Running parallel sessions by using concurrent batches will also reduce the time of loading the data. So concurrent batches may also increase the session performance.
Partitioning the session improves the session performance by creating multiple connections to

sources and targets and loads data in parallel pipelines.
In some cases, if a session contains an aggregator transformation, you can use incremental aggregation to improve session performance.
Avoid transformation errors to improve the session performance.
If the session contains a lookup transformation, you can improve the session performance by enabling the lookup cache.
If your session contains a filter transformation, create that filter transformation nearer to the sources, or use a filter condition in the source qualifier.
Aggregator, Rank and Joiner transformations may often decrease the session performance, because they must group data before processing it. To improve session performance in this case use the sorted ports option.

What is snow flake scheme design in database?
Snowflake schema is one of the designs that are present in database design. Snowflake schema serves the purpose of dimensional modeling in data warehousing. If the dimensional table is split into many tables, where the schema is inclined slightly towards normalization, then the snowflake design is utilized. It contains joins in depth. The reason is that the tables are split further.

Explain the difference between star and snowflake schemas?
Star schema: A highly de-normalized technique. A star schema has one fact table associated with numerous dimension tables and depicts a star.
Snowflake schema: A star schema with normalization principles applied is known as a snowflake schema. Every dimension table is associated with a sub dimension table.
Differences:
• A dimension table will not have a parent table in a star schema, whereas snowflake schemas have one or more parent tables.
• The dimensional table itself consists of hierarchies of dimensions in a star schema, whereas hierarchies are split into different tables in a snowflake schema. Drilling down data from the topmost hierarchies to the lowermost hierarchies can be done.

What is the difference between view and materialized view?

A view is created by combining data from different tables. Hence, a view does not have data of its own. When a view is created, the data is not stored in the database. The data is created when a query is fired on the view. On the other hand, a materialized view, usually used in data warehousing, has data. The data is stored by calculating it beforehand using queries, that is, the data of a materialized view is stored. This data helps in decision making, performing calculations etc.

What is junk dimension?
A single dimension is formed by lumping together a number of small dimensions. This dimension is called a junk dimension. A junk dimension has unrelated attributes. The process of grouping random flags and text attributes in a dimension by moving them to a distinct sub-dimension is related to the junk dimension.

What is degenerate dimension table?
A degenerate dimension is a column (dimension) which is a part of the fact table but does not map to any dimension, e.g. employee_id. A degenerate dimension does not have its own dimension table. It is derived from a fact table.

What is conformed fact and conformed dimensions use for?
A conformed fact in a warehouse is allowed to have the same name in separate tables. Such facts can be compared and combined mathematically. Conformed dimensions can be used across multiple data marts. These conformed dimensions have a static structure. Any dimension table that is used by multiple fact tables can be a conformed dimension.

What is the difference between Informatica 7.0 and 8.0?
The architecture of PowerCenter 8 has changed a lot:
1. PC8 is service-oriented for modularity, scalability and flexibility.
2. The Repository Service and Integration Service (as replacement for Rep Server and Informatica Server) can be run on different computers in a network (so called nodes), even redundantly.
3. Management is centralized, that means services can be started and stopped on nodes via a central web interface.
4. Client Tools access the repository via that centralized machine, and resources are distributed dynamically.
5. Running all services on one machine is still possible, of course.
6. It has support for unstructured data, which includes spreadsheets, email, Microsoft Word files, presentations and .PDF documents. It provides high availability and seamless failover, eliminating single points of failure.
7. It has added performance improvements. (To bump up system performance, Informatica has added "push down optimization", which moves data transformation processing to the native relational database I/O engine whenever it is most appropriate.)

8. Informatica has now added more tightly integrated data profiling, cleansing, and matching capabilities.
9. Informatica has added a new web based administrative console.
10. Ability to write a Custom Transformation in C++ or Java.
11. Midstream SQL transformation has been added in 8.1.1, not in 8.1.
12. Dynamic configuration of caches and partitioning.
13. Java transformation is introduced.
14. User defined functions.
15. PowerCenter 8 release has the "Append to Target file" feature.

What is Data warehousing?
A data warehouse can be considered as a storage area where interest-specific or relevant data is stored irrespective of the source. Whatever is actually required to create a data warehouse can be considered as data warehousing. Data warehousing merges data from multiple sources into an easy and complete form.

What is ETL process in data warehousing?
ETL stands for Extraction, Transformation and Loading. That means extracting data from different sources such as flat files, databases or XML data, transforming this data depending on the application's need, and loading this data into the data warehouse.

Explain the difference between data mining and data warehousing?
Data warehousing is the central repository for the data of several business systems in an enterprise. Data from various resources is extracted and organized in the data warehouse selectively for analysis and accessibility. Data mining is a method for comparing large amounts of data for the purpose of finding patterns. It is the process of finding correlations and patterns by sifting through large data repositories using pattern recognition techniques. Data mining is normally used for models and forecasting.

What are fact tables and dimension tables?
As mentioned, data in a warehouse comes from the transactions. A fact table in a data warehouse consists of facts and/or measures. The nature of data in a fact table is usually numerical. On the other hand, a dimension table in a data warehouse contains fields used to describe the data in fact tables. A dimension table can provide additional and descriptive information (dimension) about a field of a fact table. E.g. if I want to know the number of resources used for a task, my fact table will store the actual measure (of resources) while my dimension table will store the task and resource details. Hence, the relation between a dimension table and a fact table is one to many.
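
The fact and dimension table example above (counting the resources used for a task) can be sketched roughly as follows. This is only an illustration using Python's built-in sqlite3 module, not how a real warehouse would be built; the table names dim_task and fact_task_usage, their columns and the sample rows are all hypothetical.

```python
import sqlite3

# Hypothetical table and column names, chosen only for illustration.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE dim_task (            -- dimension: descriptive attributes
        task_id   INTEGER PRIMARY KEY,
        task_name TEXT NOT NULL
    );
    CREATE TABLE fact_task_usage (     -- fact: numeric measure + foreign key
        task_id        INTEGER NOT NULL REFERENCES dim_task(task_id),
        resources_used INTEGER NOT NULL
    );
""")
conn.executemany(
    "INSERT INTO dim_task VALUES (?, ?)",
    [(1, "Load customers"), (2, "Load orders")],
)
# One dimension row relates to many fact rows (one to many).
conn.executemany(
    "INSERT INTO fact_task_usage VALUES (?, ?)",
    [(1, 3), (1, 2), (2, 4)],
)


def resources_per_task(connection):
    """Sum the measure in the fact table, grouped by a dimension attribute."""
    rows = connection.execute("""
        SELECT d.task_name, SUM(f.resources_used)
        FROM fact_task_usage AS f
        JOIN dim_task AS d ON d.task_id = f.task_id
        GROUP BY d.task_name
        ORDER BY d.task_name
    """).fetchall()
    return dict(rows)


totals = resources_per_task(conn)
print(totals)  # {'Load customers': 5, 'Load orders': 4}
```

The numeric measure lives only in the fact table, while the descriptive task name lives only in the dimension table; the join on task_id is what ties each measure back to its description.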

What is an OLTP system and OLAP system?
OLTP stands for OnLine Transaction Processing. Applications that support and manage transactions involving high volumes of data are supported by an OLTP system. OLTP is based on client-server architecture and supports transactions across networks. OLAP stands for OnLine Analytical Processing. Business data analysis and complex calculations on low volumes of data are performed by OLAP. With the support of OLAP, a user can gain insight into data coming from various sources.

What are cubes?
Multi-dimensional data is logically represented by cubes in data warehousing. The dimensions and the data are represented by the edges and the body of the cube respectively. OLAP environments view the data in the form of a hierarchical cube. A cube typically includes the aggregations that are needed for business intelligence queries.

What is Virtual Data Warehousing?
A virtual data warehouse provides a compact view of the data inventory. It contains metadata. It uses middleware to build connections to different data sources. Virtual data warehouses can be fast, as they allow users to filter the most important pieces of data from different legacy applications.

What is active data warehousing?
An active data warehouse aims to capture data continuously and deliver real time data. It provides a single integrated view of a customer across multiple business lines. It is associated with Business Intelligence systems.

What is the difference between dependent and independent data warehouse?
A dependent data warehouse stores the data in a central data warehouse.

On the other hand, an independent data warehouse does not make use of a central data warehouse.

What is Data Mart?
A data mart stores particular data that is gathered from different sources. Particular data may belong to some specific community (group of people) or genre. Data marts can be used to focus on specific business needs.

What is the difference between ER Modeling and Dimensional Modeling?
ER modeling, which models an ER diagram, represents the entire business or application processes. This diagram can be segregated into multiple dimensional models. This is to say, an ER model will have both a logical and a physical model. The dimensional model will only have a physical model.

Difference between data modeling and data mining?
Data modeling aims to identify all entities that have data. It then defines the relationships between these entities. Data models can be conceptual, logical or physical data models. Conceptual models are typically used to explore high level business concepts with stakeholders. Logical models are used to explore domain concepts, while physical models are used to explore database design.
Data mining is used to examine or explore the data using queries. These queries can be fired on the data warehouse. Data mining helps in reporting, planning strategies, finding meaningful patterns etc. It can be used to convert a large amount of data into a sensible form.

What is the difference between OLAP and data warehouse?
A data warehouse serves as a repository to store historical data that can be used for analysis. The warehouse has data coming from varied sources. OLAP is Online Analytical Processing, which can be used to analyze and evaluate data in a warehouse. An OLAP tool helps to organize data in the warehouse using multidimensional models.

Describe the foreign key columns in fact table and dimension table?
The primary keys of entity tables are the foreign keys of dimension tables. The primary keys of dimension tables are the foreign keys of fact tables.

What are various methods of loading Dimension tables?
Conventional load: Here the data is checked for any table constraints before loading.
Direct or faster load: The data is directly loaded without checking for any constraints.

Define the term slowly changing dimensions (SCD)?
SCD are dimensions whose data changes very slowly. An example of this can be the city of an employee. This dimension will change very slowly. The row of this data in the dimension can be either replaced completely without any track of the old record, OR a new row can be inserted, OR the change can be tracked.

What is a Star Schema?
A star schema comprises fact and dimension tables. The fact table contains the facts or the actual data. Usually numerical data is stored, with multiple columns and many rows. Dimension tables contain attributes or smaller granular data. The fact table in a star schema will have foreign key references to the dimension tables.

What is the difference between star and snowflake schema?
Star schema: A de-normalized technique in which one fact table is associated with several dimension tables. It resembles a star.
Snowflake schema: A star schema to which normalization principles have been applied is known as a snowflake schema. Every dimension table is associated with a sub-dimension table.

Explain the use lookup tables and Aggregate tables?
An aggregate table contains a summarized view of data. Lookup tables, using the primary key of the target, allow updating of records based on the lookup condition.

What is real time data-warehousing?
In real time data warehousing, the warehouse is updated every time the system performs a transaction. It reflects the real time information of the business. This means that when a query is fired on the warehouse, the state of the business at that time will be returned.

Define non-additive facts?
The facts that cannot be summed up for the dimensions present in the fact table are called non-additive facts. The facts can be useful if there are changes in dimensions. For example, profit margin is a non-additive fact, for it has no meaning to add it up at the account level or the day level.

Define BUS Schema?
A BUS schema is used to identify the common dimensions across business processes, like identifying conforming dimensions. A BUS schema has conformed dimensions and standardized definitions of facts.

What is data cleaning? How can we do that?
Data cleaning is the process of identifying erroneous data. The data is checked for accuracy, consistency, typos etc.
Data cleaning methods:

• Parsing - Used to detect syntax errors.
• Data Transformation - Confirms that the input data matches the expected data in format.
• Duplicate elimination - This process gets rid of duplicate entries.
• Statistical Methods - Values of mean, standard deviation, range, or clustering algorithms etc. are used to find erroneous data.

What is Bit Mapped Index?
Bitmap indexes make use of bit arrays (bitmaps) to answer queries by performing bitwise logical operations. They work well with data that has a lower cardinality, which means data that takes fewer distinct values. Bitmap indexes have a significant space and performance advantage over other structures for such data. Bitmap indexes are useful in data warehousing applications. Tables that have fewer insert or update operations can be good candidates.
The advantages of bitmap indexes are: They have a highly compressed structure, making them fast to read. Their structure makes it possible for the system to combine multiple indexes together so that it can access the underlying table faster.
The disadvantage of bitmap indexes is: The overhead of maintaining them is enormous.

What is the purpose of Fact less Fact Table?
Factless fact tables are so called because they simply contain keys which refer to the dimension tables. Hence, they don't really have facts or any information, but are more commonly used for tracking some information about an event. E.g. to find the number of leaves taken by an employee in a month.

What is a level of Granularity of a fact table?
A fact table is usually designed at a low level of granularity. This means that we need to find the lowest level of information that can be stored in a fact table. E.g. employee performance is a very high level of granularity. Employee_performance_daily and employee_performance_weekly can be considered lower levels of granularity.

What is Data Cardinality?
Cardinality is the term used in database relations to denote the occurrences of data on either side of the relation. There are 3 basic types of cardinality:

High data cardinality: Values of a data column are very uncommon. e.g.: email ids and user names
Normal data cardinality: Values of a data column are somewhat uncommon but never unique. e.g.: A data column containing LAST_NAME (there may be several entries of the same last name)
Low data cardinality: Values of a data column are very usual. e.g.: flag statuses: 0/1
Determining data cardinality is a substantial aspect of data modeling. It is used to determine the relationships.
Types of cardinalities:
The Link Cardinality - 0:0 relationships
The Sub-type Cardinality - 1:0 relationships
The Physical Segment Cardinality - 1:1 relationship
The Possession Cardinality - 0: M relation
The Child Cardinality - 1: M mandatory relationship
The Characteristic Cardinality - 0: M relationship
The Paradox Cardinality - 1: M relationship
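
To make the high / normal / low cardinality levels above more concrete, and to show why low-cardinality columns suit the bitmap indexes described earlier, here is a small sketch in plain Python. It is a toy model, not how any database engine implements bitmap indexes; the 0.9 and 0.2 cut-offs, the function names and the sample data are assumptions made up for the illustration.

```python
def cardinality_ratio(values):
    """Fraction of distinct values in a column (1.0 means all values are unique)."""
    return len(set(values)) / len(values)


def classify_cardinality(values, high=0.9, low=0.2):
    """Label a column; the 0.9 / 0.2 cut-offs are illustrative assumptions."""
    ratio = cardinality_ratio(values)
    if ratio >= high:
        return "high"
    if ratio <= low:
        return "low"
    return "normal"


def build_bitmap_index(values):
    """Map each distinct value to an int whose bit i is set when row i holds it."""
    index = {}
    for row, value in enumerate(values):
        index[value] = index.get(value, 0) | (1 << row)
    return index


def matching_rows(bitmap):
    """Decode a bitmap back into the list of row numbers it marks."""
    return [row for row in range(bitmap.bit_length()) if (bitmap >> row) & 1]


# Hypothetical sample columns.
emails = [f"user{i}@example.com" for i in range(20)]
last_names = ["Smith", "Jones", "Smith", "Lee", "Patel",
              "Jones", "Kim", "Lee", "Smith", "Patel"]
flags = [0, 1] * 10

labels = (classify_cardinality(emails),
          classify_cardinality(last_names),
          classify_cardinality(flags))
print(labels)  # ('high', 'normal', 'low')

# Two low-cardinality columns over the same 8 rows.
status = ["A", "I", "A", "A", "I", "A", "I", "A"]
region = ["E", "E", "W", "E", "W", "W", "E", "E"]
status_index = build_bitmap_index(status)
region_index = build_bitmap_index(region)

# "status = 'A' AND region = 'E'" becomes a single bitwise AND of two bitmaps.
hits = matching_rows(status_index["A"] & region_index["E"])
print(hits)  # [0, 3, 7]
```

With only a few distinct values per column, the index needs only a few bitmaps, and combining conditions is a cheap bitwise operation. The cost shows up on writes: every insert or update has to touch the bitmaps, which is the maintenance overhead mentioned above.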