1) DataStage architecture - client components: DataStage Designer, DataStage Administrator, DataStage Manager, DataStage Director.
2) How do you create a project?

ans>>> Through the DataStage Administrator; you can also set some properties at project level there.
3) How many projects can you create at maximum? ans>> Depends on licensing.
4) How do you create users and give them permissions?
5) What are the permissions available in the Administrator?
6) Is it possible for an operator to view the full log information?
7) Tell me the types of jobs (active or passive, also ODBC and plug-ins). ans>> Server jobs, parallel jobs, mainframe jobs and job sequences.
8) How do you look up through a sequential file? ans>> Not possible.
9) What is a stage variable? ans>> An intermediate processing variable that retains its value during the read and does not pass its value into a target column.
10) What does a constraint do? ans>> A constraint is a condition that evaluates to either true or false. It determines the flow of data on an output link.
11) What does the derivation do? ans>> A derivation is an expression that specifies the value to be passed on to the target column.
12) Tell me the sequence of execution (stage variable, constraint, derivation). ans>> Stage variable, constraint, derivation (see the expression sketch after question 25).
13) Why do you use a hash file? ans>> Hash files are used in server jobs only; they are used for lookups and to remove duplicates.
14) Difference between a hash file and a sequential file? ans>>
15) Name some types of sequential files?
16) What is the size of your hash file? ?????? 32-bit and 64-bit; for 32-bit it is 2 GB.
17) How do we calculate the size of our hash files?
18) What is the hash algorithm?
19) How many types of hash files are available? ans>> Static hash file, dynamic hash file.
20) Which type of hash file do you use, and why? ans>> Dynamic hash file.
21) How do you create a hash file? ans>> Through the Designer; we need to use the Hashed File stage for that.
22) How do you specify the hash file?
23) Is it possible to view the records in a hash file through any editor? If yes, which editor?
24) What is the extension of a hash file?
25) Is it possible to create a hash file containing all the columns of a normal sequential file (without key columns)? ans>> No.
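To make questions 9-12 concrete, here is a minimal sketch of how a stage variable, a constraint and a derivation might look inside a Transformer. The link, column and variable names (DSLink3, svFullName, AMOUNT, CUST_ID, FULL_NAME) are invented for illustration; the expressions use ordinary DataStage BASIC functions.

Code:
* Stage variable svFullName - evaluated first; keeps its value during processing but is not written to the target:
svFullName = Trim(DSLink3.FIRST_NAME) : " " : Trim(DSLink3.LAST_NAME)
* Constraint on the output link - evaluated next; true/false decides whether the row flows down that link:
DSLink3.AMOUNT > 0 And Not(IsNull(DSLink3.CUST_ID))
* Derivation of target column FULL_NAME - evaluated last; supplies the value passed to the target column:
UpCase(svFullName)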

26) Difference between a static and a dynamic hash file? ans>> A dynamic hash file allocates memory dynamically; its data file grows dynamically. In the case of a static hash file the data file is of a fixed size and does not grow beyond the specified size. In both scenarios a data file and an overflow file are used.
27) Tell me the different types of stages. ans>> Basically there are two types of stages in DataStage: active stages and passive stages. In an active stage some kind of processing is done, like sorting, filtering or aggregation, while passive stages are those in which no processing is done, like the Sequential File and ODBC stages. (Explain with some examples.)
28) Is it possible to control another job through a stage? If yes, how, and which stages support it? Clarify.
29) Is it possible to check a constraint at active stages? If yes, how? ans>> Through the Transformer stage; there you get the stage variables, constraints and derivations, and we can check the constraint.
30) Where do you define the constraint? ans>> On the Transformer output link. Double-click on the constraint and you get a tabular format with link name, constraint, otherwise and abort-after-rows as column headings; specify your condition, which will evaluate to true or false.
31) What is a job parameter? Where do you define it? ans>> Job parameters are parameters through which we can pass details/values at run time. They can be defined through the Designer at job level. (Explain with some examples.)
32) What is an environment variable? Where do you define it? ans>> Environment variables are like global variables which can be used across the project. Environment variables can be defined in the Administrator. (Explain with some examples.)
33) Difference between a job parameter, an environment variable and a stage variable? ans>> An environment variable is one through which one can define project-wide defaults. A job parameter is one through which one can override project-wide defaults and is applicable to the particular job. A stage variable is one which is local to the active stage in which it is used. (Explain with some examples; see the sketch after question 39.)
34) While running a job ...
35) Have you written job control? What is its use? ans>> No, not job control coding.
36) How do you attach a job in job control? check.
37) How do you set a job parameter in job control? check.
38) What is a routine? check.
39) Different types of routines?
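As a concrete sketch for questions 31-33 (the parameter and column names SRC_DIR, RUN_DATE and REGION are invented for illustration): a job parameter defined at job level is referenced in stage properties with the #...# notation and can be used directly in Transformer expressions, while environment variables set in the Administrator provide the project-wide defaults such parameters can override.

Code:
* Sequential File stage, file name property - job parameters substituted at run time:
#SRC_DIR#/customers_#RUN_DATE#.txt
* Transformer constraint comparing an input column against a job parameter:
DSLink4.REGION = REGION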

40) What is the use of a routine? check.
41) Where are the routines stored? ????? check.
42) How many windows are shown in DS Designer, and what are they? ans>> Designer window, repository, palette.
44) What is the use of the Merge stage? ans>> The Merge stage is used to merge two input sets and produce one or more output sets. Server Merge stage: a passive stage that has no input links and one or more output links. Parallel Merge stage: more than one input link (one master link and more than one reference link); can have reject links also.
45) Is it possible to join more than two sequential files using the Merge stage? If no, is there any stage to solve this?
46) Name all the join types. ans>> Inner join, left outer, right outer, full outer join.
47) How do you extract data from a database? ans>> Through ODBC and OCI.
48) Name all the update actions. ans>> There are 8 update actions (found on the input link of the ODBC stage).
49) In job control, which language is used? ans>> BASIC (see the job-control sketch after question 57).
51) Is it possible to run a job in DS Designer? If yes, how? ans>> Yes.
52) Is it possible to do a lookup on a hash file? check.
53) What does the Director do? ans>> Job locks, viewing logs, scheduling, running the jobs, job report in XML, job resources.
54) Have you scheduled a job? If yes, how? ans>> Yes, through the DataStage Director.
55) A job is running, but I would like to stop it. What are the ways to stop the job? ans>> Through the Director, through Cleanup Resources in the Director, or with dsjob.
56) What is the use of the log file? ans>> You can check how the job executed. If the job failed you can also check for errors.
57) Describe Cleanup Resources and Clear Status File. ans>> The Cleanup Resources feature applies only to server jobs. The Cleanup Resources command lets you view and end job processes, and view and release the associated locks: choose Job > Cleanup Resources from the menu bar and the Job Resources dialog box appears, from which you can view and clean up the resources of the selected job. Cleanup Resources allows one to remove locks or kill the jobs. When you clear a job's status file you reset the status records associated with all stages in that job. To clear the job status file, choose Job > Clear Status File from the menu bar. The job status changes to Compiled and no evidence will remain that the job has ever run.
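Since questions 36, 37, 49 and 55 touch on job control, here is a minimal, hypothetical sketch of the kind of BASIC job-control code meant. The job name LoadCustomers and the parameter SRC_DIR are invented; the DSAttachJob/DSSetParam/DSRunJob/DSWaitForJob calls are the usual server job-control interface, but check them against your version's documentation. From the command line, the dsjob client offers the equivalent (for example dsjob -run and dsjob -stop).

Code:
* Attach the job to be controlled; abort the controlling job if the attach fails
hJob = DSAttachJob("LoadCustomers", DSJ.ERRFATAL)
* Set a job parameter, then run the job and wait for it to finish
ErrCode = DSSetParam(hJob, "SRC_DIR", "/data/incoming")
ErrCode = DSRunJob(hJob, DSJ.RUNNORMAL)
ErrCode = DSWaitForJob(hJob)
* Check the finishing status and log a fatal message if the run failed
Status = DSGetJobInfo(hJob, DSJ.JOBSTATUS)
If Status = DSJS.RUNFAILED Then
   Call DSLogFatal("LoadCustomers failed", "JobControl")
End
ErrCode = DSDetachJob(hJob)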

58) Situations wherein there is a need to clear the status file?
59) Is it enabled in DS Director? If not, how do you enable it? ans>> By default Cleanup Resources and Clear Status File are not enabled in the Director; one can enable them through the Administrator by checking "Enable job administration in Director" on the General tab.
61) How do you find the number of rows per second in DS Director? ans>> In the Designer we can do it by choosing to view performance statistics, but in the Director it is through Tools > New Monitor.
62) How do you know the job status? ans>> Through the Director (status).
63) What is the difference between a warning and a fatal message? ans>> Warnings do not abort the job, whereas fatal messages abort the job.
65) What is a phantom error? How do you resolve it? ?????? ans>>
67) What is the difference between running and validating a job? ans>> Run is to execute the job, whereas validate is to check for errors such as file existence, ODBC connections and the intermediate existence of hash files.
68) What is the use of DS Manager? ans>> DS Manager is used to edit and manage the contents of the repository, e.g. to create or edit routines and table definitions.
69) How do you import/export a project? ans>> Through DataStage Manager: export and import of jobs or the entire project.
70) What is metadata? Where is it stored?
71) How do you write a routine? ans>> We can write a routine by going to the Routines category in DS Manager and selecting the Create Routine option; routines are written in the BASIC language (see the sketch after question 81).
72) What is the use of releasing a job? ans>> Releasing a job is significant to clean up the resources of a job which is locked or idle.
73) What is the use of a table definition in Manager?
74) What is the difference between a local container and a shared container? ans>> A local container can be used within the job itself and does not appear in the repository window, whereas shared containers are available throughout the project and appear in the repository window.
75) What are containers? ans>> Containers are a collection of grouped stages and links which can be reused (shared container).
76) Difference between Annotation and Description Annotation? ans>> Annotations are short or long descriptions; we can have multiple annotations in a job and they can be copied to other jobs, whereas we can have only one Description Annotation per job and it cannot be copied into other jobs.
77) What are the advantages of the Description Annotation? ans>> The Description Annotation gets automatically reflected in the Manager and the Director.
78) What are the various types of compilation and run-time errors you have faced? ans>>
79) Explain the "Allow stage write cache" option for hash files and its implications. ans>> It caches the hash file in memory; you should not use this option when you are reading from and writing to the same hash file.
80) What are the caching properties while creating hash files?
81) Where do you specify the size of your hash file?
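As a concrete illustration for question 71: a server transform routine created under the Routines category is a short BASIC function whose result is returned through the Ans variable. The routine name CleanName and its single argument are invented for this sketch.

Code:
* Transform routine CleanName(Arg1): trims the input and forces it to upper case.
* Whatever is assigned to Ans is the value returned to the calling expression.
Ans = UpCase(Trim(Arg1))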

Parallel Extender (PX) questions:

1) Lookup stage: is it persistent or non-persistent? (What is happening behind the scenes?) Ans: The Lookup stage is non-persistent.

2) Is pipeline parallelism in PX the same as what the inter-process (IPC) stage does in Server? Ans: Hello Pradeep. Yes and no. The IPC stage buffers data so that the next process (or next stage in the same process) can pick it up. Pipeline parallelism in parallel jobs is much more complete. Do you understand the relationship between stages and Orchestrate operators? Essentially each stage generates an operator. These (assuming that they don't combine into single processes) can form a pipeline so that, if you examine the generated OSH, it might have the form

Code:
Op1 < DataSet1 | op2 | op3 | op4 | op5 | op6 > DataSet2

Very slick, very fast. Best Regards, Brian.

3) How can we maintain the partitioning in the Sort stage?
4) Where do we need partitioning (in processing or somewhere else)?
5) If we use SAME partitioning in the first stage, which partitioning method will it take? ans>> It will maintain the partitions done in the previous stage as they are.
6) What is the symbol we get when we use the round robin partitioning method?
7) If we check Preserve Partitioning in one stage and we don't give any partitioning method, which partitioning method will it use?
8) What is Orchestrate? Ans: Orchestrate was a product from Torrent before being bought over by Ascential. Orchestrate provides the OSH framework, which has the UNIX command line interface. Behind the Orchestrate framework, each and every stage is converted to a corresponding operator within OSH, and then the OSH is executed by the Orchestrate framework to process your ETL processes, such as import, export, transform, copy, etc. So once you assign a certain operator in the Generic stage, it appears to utilize the operator your stage specification points to. Due to the many input links and output links supported in the Generic stage, you could achieve multiple operations in just one stage, but the option names and option values of the operator you assigned in the Generic stage always bring design overhead. You can check the option "Generated OSH visible for Parallel jobs in All projects" in Administrator->Projects->Properties->Parallel tab in order to observe the OSH code generated in Designer->Job Properties->Generated OSH tab as you compile your job (a further conceptual sketch follows after question 13).
9) Can we give node allocations, i.e. 4 nodes for one stage and 3 nodes for the next stage?
10) What is combinability / non-combinability?
11) What are schema files?
12) Why do we need datasets rather than sequential files? Ans: A sequential file as a source or target needs to be repartitioned as it is (as the name suggests) a single sequential stream of data. A dataset can be saved across nodes using the partitioning method selected, so it is always faster when used as a source or target.
13) Does the Lookup stage return multiple rows or single rows?
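Expanding on the answers to questions 2 and 8, a rough, purely illustrative way to read the generated OSH is as a UNIX-style pipeline of operators, one per stage, with persistent data sets redirected in and out. The operator options are omitted here because the real generated score carries many engine-specific settings; only operator names already mentioned in the answer above are used.

Code:
# conceptual form only - one operator per stage, options omitted
import ... | transform ... | copy ... | export ...
# with persistent data sets the pipeline is bracketed as shown earlier:
op1 < input.ds | op2 | op3 > output.ds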

14) Why do we need the Sort stage, other than the sort-merge collecting method and the perform-sort option in a stage's advanced properties? ans>> Sort can be performed on each partition separately, so it would be faster, while sort-merge is a collecting method.
15) For the Surrogate Key Generator stage, where will the next value be stored?
16) When will re-partitioning actually occur?
17) In the Transformer stage, can we give constraints? Ans: Yes, we can.
18) What is a constraint in the Advanced tab?
19) What is the difference between Range and Range Map partitioning? ans>> Range is one of the partitioning methods, while range map partitioning is ...

What is the difference between job control and a job sequence?

What is the max size of the Data Set stage (PX)? No limit?

*** How do you develop SCD using the Lookup stage? We can implement SCD by using the Lookup stage, but only for SCD1, not for SCD2 (see the sketch after these questions).

Why do you need the Modify stage? The Modify stage is used for the purpose of datatype changes.

What are the main differences between server jobs and parallel jobs in DataStage? Server jobs: few stages, logic-intensive, do not use MPP systems.

What are the errors you have experienced with DataStage? ans>> In DataStage there are some warnings and some fatal errors that will come up in the log file. If there is a fatal error the job gets aborted; if there are only warnings the job does not abort, but we have to handle those warnings as well. The log file must be cleared with no warnings too.
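A minimal sketch of the SCD1-via-Lookup idea mentioned above. The link and column names (lkpDim, src, ADDRESS) are invented: the reference link looks up the dimension on the business key, and Transformer constraints route each row to an insert link or an update link. The NOTFOUND link variable is the server-job idiom; in a parallel job the same routing is usually done with the Lookup stage's lookup-failure setting or a reject link.

Code:
* Constraint on the insert link - the key was not found in the dimension lookup:
lkpDim.NOTFOUND
* Constraint on the update link - key found but a tracked attribute changed; SCD1 simply overwrites it:
Not(lkpDim.NOTFOUND) And src.ADDRESS <> lkpDim.ADDRESS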

Data warehousing questions:

1) What is a data warehouse?
2) What is ODS?
3) What is a dimension table?
4) What is a lookup table?
5) Why should you put your data warehouse on a different system than your OLTP system?
6) What are the various reporting tools in the market?
7) What is normalization? First normal form, second normal form, third normal form?
8) What is a fact table?
9) What are conformed dimensions?
10) What are the different methods of loading dimension tables?
11) What is a conformed fact?
12) What does the level of granularity of a fact table signify?
13) What is the difference between OLTP and OLAP?
14) What are SCD1, SCD2 and SCD3?
15) Why are OLTP database designs not generally a good idea for a data warehouse?
16) What is a BUS schema?
17) What is real-time data warehousing?
18) What are semi-additive and factless facts, and in which scenarios would you use such kinds of fact tables?
19) Differences between star and snowflake schemas?
20) What is a star schema?
21) What is a general-purpose scheduling tool?
22) What are data marts?
23) How are the dimension tables designed?
24) What are non-additive facts?
25) What type of indexing mechanism do we need to use for a typical data warehouse?
26) What is a snowflake schema?
27) What are aggregate tables?
28) What is dimensional modelling? Why is it important?
29) Why is data modeling important?
30) What is data mining?
31) What is ETL?
32) What is an ER diagram?
33) Which columns go to the fact table and which columns go to the dimension table?
34) What modeling tools are available in the market?
35) How do you load the time dimension?
36) What is a CUBE in the data warehousing concept?
37) What are data validation strategies for data mart validation after the loading process?
38) What is the datatype of the surrogate key?
39) What is a degenerate dimension table?
40) What are the methodologies of data warehousing?

What is a hybrid slowly changing dimension?
What is a junk dimension? What is the difference between a junk dimension and a degenerated dimension?
Can a dimension table contain numeric values?
What is the difference between a view and a materialized view?
What is a surrogate key? Where do we use it? Explain with examples.
What is a linked cube?
What is the main difference between the Inmon and Kimball philosophies of data warehousing?
What is a data warehousing hierarchy?
What is the main difference between schemas in an RDBMS and schemas in a data warehouse?

On combinability: as I mentioned earlier, I have limited knowledge on this. It is a behaviour of DataStage to combine possible operators into one and compile the combined code, so that at run time it acts as just a single operator which performs all the combined operations. Usually all active operators are combinable. One point I can think of is that the resource consumption of the stages can be minimized by clubbing them all into a single stage with the combinable behaviour enabled. You can optionally avoid it with a couple of options; one simple option is to set the "Combinable Operator" option to False in the Transformer stage.

Dear Deepak,

Orchestrate itself is an ETL tool with extensive parallel processing capabilities, running on the UNIX platform. DataStage used Orchestrate with DataStage XE (the beta version of 6.0) to incorporate the parallel processing capabilities. Ascential then purchased Orchestrate, integrated it with DataStage XE and released a new version, DataStage 6.0, i.e. Parallel Extender. So, as you mentioned, one stage we see in the Palette may be compiled into multiple operators behind the scenes at runtime - correct or not? Viewing the generated OSH is useful for debugging purposes.
