
1) How do you execute a DataStage job from the command-line prompt? Ans) Using the dsjob command, as follows:

dsjob -run -jobstatus <projectname> <jobname>
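A slightly fuller shell sketch (the project and job names here are placeholders, not names from this document):

# Run the job, wait for completion, and return the job's finishing status
dsjob -run -mode NORMAL -wait -jobstatus dstage_proj daily_load_job

# With -jobstatus, dsjob's exit code reflects the job status,
# typically: 1 = finished OK, 2 = finished with warnings, 3 = aborted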

2) Functionality of Link Partitioner and Link Collector? Ans) Link Partitioner: it splits data into various partitions (data flows) using various partitioning methods. Link Collector: it collects the data coming from the partitions, merges it into a single data flow, and loads it to the target.

3) Differentiate primary key and partition key? Ans) A primary key is a combination of unique and not null. It can be a collection of key values, called a composite primary key. A partition key is just a part of the primary key. There are several partitioning methods, such as hash, DB2, and random; when using hash partitioning we specify the partition key.

4) Containers usage and types? Ans) A container is a collection of stages used for the purpose of reusability. There are 2 types of containers. a) Local Container: job specific. b) Shared Container: can be used in any job within a project. Within a container, as within a job, we can create, compile, and run; we can declare stage variables in a Transformer; we can call routines, transforms, macros, and functions; and we can write constraints.

5) What is a surrogate key? Ans) It is a 4-byte integer which replaces the transaction/business/OLTP key in the dimension table. We can store up to 2 billion records.

6) Why do we need a surrogate key? Ans) It is used for integrating the data and serves better than the natural primary key for index maintenance, joins, table size, key updates, disconnected inserts, and partitioning.

7) Explain the types of dimension tables? Ans) Conformed Dimension: if a dimension table is connected to more than one fact table, the granularity defined in the dimension table is common across those fact tables. Junk Dimension: a dimension table which contains only flags. Monster Dimension: a dimension that changes rapidly is known as a monster dimension. Degenerate Dimension: a line-item-oriented fact table design.

8) What are stage variables? Ans) Stage variables are declaratives in the Transformer stage used to store values. Stage variables are active at run time (because memory is allocated at run time).

9) What is a Sequencer? Ans) It sets the order of execution of server jobs.

10) What are active and passive stages? Ans) Active Stage: active stages model the flow of data and provide mechanisms for combining data streams, aggregating data, and converting data from one data type to another. Eg: Transformer, Aggregator, Sort, Row Merger, etc. Passive Stage: a passive stage handles access to databases for the extraction or writing of data. Eg: IPC stage, file types, UniVerse, UniData, DRS stage, etc.

11) What is ODS? Ans) An Operational Data Store is a staging area where data can be rolled back.

12) What are Macros? Ans) They are built from DataStage functions and do not require arguments. A number of macros are provided in the JOBCONTROL.H file to facilitate getting information about the current job, and the links and stages belonging to the current job. These can be used in expressions (for example, in a Transformer stage), job control routines, file names and table names, and before/after subroutines. For example, macros such as DSJobName and DSJobStartDate can be used in a derivation to tag each output row with the job name and run date.

13) What index is created on a data warehouse? Ans) A bitmap index is created in a DWH.

14) What is a container? Ans) A container is a group of stages and links. Containers enable you to simplify and modularize your server job designs by replacing complex areas of the diagram with a single container stage. You can also use shared containers as a way of incorporating server job functionality into parallel jobs. DataStage provides two types of container. Local Containers: these are created within a job and are only accessible by that job. A local container is edited in a tabbed page of the job's diagram window. Shared Containers: these are created separately and are stored in the repository in the same way that jobs are. There are two types of shared container (server shared containers and parallel shared containers).

15) What are stage variables, derivations, and constraints? Ans) Stage variable: an intermediate processing variable that retains its value during a read and does not pass the value into a target column.

Constraint: a condition that is either true or false that specifies the flow of data within a link. Derivation: an expression that specifies the value to be passed on to the target column.

16) What is the Hash file stage and what is it used for? Ans) It is used for lookups. It is like a reference table. It is also used in place of ODBC/OCI tables for better performance.

17) What are modulus and splitting in a dynamic hashed file? Ans) In a hashed file, the size of the file keeps changing randomly. If the size of the file increases it is called modulus; if the size of the file decreases it is called splitting.

18) What are the types of views in the DataStage Director? Ans) There are 3 types of views in the DataStage Director. Job view: dates of jobs compiled. Log view: status of the job's last run. Status view: warning messages, event messages, program-generated messages.

19) Did you parameterize the jobs or hard-code the values in them? Ans) Always parameterize the job: the values come either from job properties or from a parameter manager. There is no way you should hard-code parameters in your jobs. The most often parameterized variables in a job are: DB DSN name, username, password, and the dates with respect to which the data is to be looked at.
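A minimal shell sketch of supplying such parameters at run time via dsjob (the parameter names SrcDSN, DBUser, DBPwd, and RunDate are placeholders, not names from this document):

# Pass job parameters on the command line instead of hard-coding them
dsjob -run -param SrcDSN=DWH_DSN -param DBUser=etl_user \
      -param DBPwd=secret -param RunDate=2024-01-31 \
      -wait -jobstatus dstage_proj daily_load_job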

20) Tell me one situation from your last project where you faced a problem, and how did you solve it? Ans) 1) Jobs in which data is read directly from OCI stages were running extremely slowly. I had to stage the data before sending it to the Transformer to make the jobs run faster. 2) A job aborted in the middle of loading some 500,000 rows. You have the option of either cleaning/deleting the loaded data and then running the fixed job, or running the job again from the row at which it had aborted. To make sure the load was proper, we opted for the former.

21) What are routines, where/how are they written, and have you written any routines before? Ans) Routines are stored in the Routines branch of the DataStage repository, where you can create, view, or edit them. The following are the different types of routines: 1) transform functions 2) before/after job subroutines 3) job control routines.

22) How did you handle an aborted Sequencer? Ans) In almost all cases we have to delete the data it inserted from the DB manually, fix the job, and then run the job again (a reset sketch follows below).

23) What are Sequencers? Ans) Sequencers are job control programs that execute other jobs with preset job parameters.

24) Why do we have to load the dimension tables first, then the fact tables? Ans) As we load the dimension tables, the (primary) keys are generated, and these keys become foreign keys in the fact tables.
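Before an aborted job or sequence can be run again it usually has to be reset; a hedged shell sketch (project and sequence names are placeholders):

# Reset the aborted job so its status allows a rerun, then rerun it
dsjob -run -mode RESET dstage_proj load_sequence
dsjob -run -wait -jobstatus dstage_proj load_sequence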

25) In how many places can you call routines? Ans) Four places you can call them: 1) Transform of a routine a) date transformation b) upstring transformation 2) Transform of the before & after subroutines 3) XML transformation 4) Web-based transformation.

26) What is the difference between the Filter stage and the Switch stage? Ans) There are two main differences, and probably some minor ones as well. The two main differences are as follows. 1) The Filter stage can send an input row to more than one output link; the Switch stage cannot (the C switch construct has an implicit break in every case). 2) The Switch stage is limited to 128 output links; the Filter stage can have a theoretically unlimited number of output links. (Note: this is not a challenge!)

27) How do you eliminate duplicate rows? Ans) DataStage provides us with a Remove Duplicates stage in Enterprise Edition. Using that stage, we can eliminate duplicates based on a key column.

28) Is there a mechanism available to export/import individual DataStage ETL jobs from the UNIX command line? Ans) Try dscmdexport and dscmdimport; they won't handle the individual-job requirement. You can only export full projects from the command line.
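A minimal sketch of the whole-project round trip, assuming it is run from a DataStage Windows client machine (which is part of why the UNIX requirement is not met); host, credentials, project, and file names are placeholders:

# Export an entire project to a .dsx file
dscmdexport /H=dshost /U=dsadm /P=secret dstage_proj C:\exports\dstage_proj.dsx

# Import that .dsx into another project
dscmdimport /H=dshost /U=dsadm /P=secret target_proj C:\exports\dstage_proj.dsx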
