1) How do you rename all of the jobs in a project to support a new naming convention? Ans: Create an Excel spreadsheet with the new and old names. Export the whole project as a .dsx file. Write a Perl program that does a simple rename of the strings, looking them up in the Excel file. Then import the new .dsx file, preferably into a new project for testing, and recompile all jobs. Be cautious: the names of the jobs will also have changed inside your job-control jobs and Sequencer jobs, so you have to make the corresponding changes to those Sequencers. A sketch of such a rename script follows.
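Here is a minimal sketch of the Perl rename described above. It assumes the Excel sheet has first been saved as a two-column CSV (the file names names.csv, project.dsx, and project_new.dsx are hypothetical), and it does a purely textual replace, so review the result before importing.

```perl
#!/usr/bin/perl
# rename_jobs.pl -- bulk-rename job names inside an exported .dsx file.
# Assumes the Excel mapping sheet has been saved as names.csv with
# lines of the form old_name,new_name (all names here are hypothetical).
use strict;
use warnings;

my %map;
open my $csv, '<', 'names.csv' or die "names.csv: $!";
while (<$csv>) {
    chomp;
    my ($old, $new) = split /,/;
    $map{$old} = $new if defined $new;
}
close $csv;

# Longest names first, so e.g. JobAB is not clobbered by a JobA rule.
my $pattern = join '|', map { quotemeta } sort { length $b <=> length $a } keys %map;

open my $in,  '<', 'project.dsx'     or die "project.dsx: $!";
open my $out, '>', 'project_new.dsx' or die "project_new.dsx: $!";
while (<$in>) {
    s/($pattern)/$map{$1}/g;   # simple textual rename, as described above
    print $out $_;
}
close $in;
close $out;
```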
2) Does the selection of 'Clear the table and Insert rows' in the ODBC stage send a TRUNCATE statement to the DB, or does it do some kind of DELETE logic? Ans: There is no TRUNCATE on ODBC stages. 'Clear the table...' issues a DELETE FROM statement. On an OCI stage such as Oracle, you do have both Clear and Truncate options. They are radically different in permissions (TRUNCATE requires you to have ALTER TABLE permission, whereas DELETE does not). The two statements are contrasted in the sketch below.
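As a reminder of what the two options translate to, here is a small sketch using Perl's DBI; the DSN and table name are hypothetical, and the exact privileges required are database-specific.

```perl
#!/usr/bin/perl
# clear_vs_truncate.pl -- the two ways a stage can empty a target table.
use strict;
use warnings;
use DBI;

# Hypothetical connection details.
my $dbh = DBI->connect('dbi:Oracle:DWH', 'user', 'pass', { RaiseError => 1 });

# What 'Clear the table...' amounts to: a logged DELETE; needs only
# DELETE privilege and can be rolled back.
$dbh->do('DELETE FROM stage_customers');

# What the OCI Truncate option amounts to: DDL; typically needs
# ALTER TABLE (or ownership) and cannot be rolled back.
$dbh->do('TRUNCATE TABLE stage_customers');

$dbh->disconnect;
```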
3) Tell me one situation from your last project where you faced a problem, and how you solved it. Ans: A. The jobs in which data was read directly from OCI stages were running extremely slow. I had to stage the data before sending it to the transformer to make the jobs run faster. B. A job aborted in the middle of loading some 500,000 rows. We had the option of either cleaning/deleting the loaded data and then running the fixed job, or running the job again from the row at which it had aborted. To make sure the load was proper, we opted for the former; a sketch of that cleanup-and-rerun approach follows.
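A minimal sketch of the cleanup-and-rerun option, assuming (hypothetically) that every loaded row is tagged with a batch id, so the partial load can be deleted before the fixed job is rerun. The table, column, project, and job names are illustrative.

```perl
#!/usr/bin/perl
# recover_load.pl -- delete a partial load, then rerun the fixed job.
use strict;
use warnings;
use DBI;

my $batch_id = $ARGV[0] or die "usage: recover_load.pl <batch_id>\n";

# Hypothetical connection and schema: fact_sales carries a load_batch_id.
my $dbh = DBI->connect('dbi:Oracle:DWH', 'user', 'pass', { RaiseError => 1 });
my $rows = $dbh->do('DELETE FROM fact_sales WHERE load_batch_id = ?',
                    undef, $batch_id);
$dbh->commit unless $dbh->{AutoCommit};
$dbh->disconnect;
print "removed $rows partially loaded rows for batch $batch_id\n";

# Rerun the fixed job via the dsjob CLI (project/job names are examples).
system('dsjob', '-run', '-jobstatus', 'DWH_PROJ', 'LoadFactSales') == 0
    or die "rerun failed\n";
```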
4) The above might raise another question: why do we have to load the dimension tables first, and then the fact tables? Ans: As we load the dimension tables, their (primary) keys are generated, and these keys are the foreign keys in the fact tables. 5) How will you determine the sequence of jobs to load into the data warehouse?
Ans: First we execute the jobs that load the data into the dimension tables, then the fact tables, and then the jobs that load the aggregate tables (if any). A driver sketch that enforces this order is given below.
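A minimal sketch of such a load sequence, driven through the dsjob command-line interface. The project and job names are hypothetical, and the -jobstatus flag (which makes dsjob's exit code reflect the job's finishing status) should be checked against your version's documentation.

```perl
#!/usr/bin/perl
# load_sequence.pl -- run dimension jobs, then fact jobs, then aggregates.
use strict;
use warnings;

my $project = 'DWH_PROJ';                       # hypothetical project
my @phases = (
    [qw(LoadCustomerDim LoadProductDim)],       # 1. dimension tables
    [qw(LoadSalesFact)],                        # 2. fact tables
    [qw(BuildMonthlyAgg)],                      # 3. aggregate tables
);

for my $phase (@phases) {
    for my $job (@$phase) {
        print "running $job...\n";
        my $rc = system('dsjob', '-run', '-jobstatus', $project, $job);
        die "$job failed (rc=$rc); stopping the sequence\n" if $rc != 0;
    }
}
print "warehouse load completed\n";
```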
6) What are the command-line functions that import and export DS jobs? Ans: A. dsimport.exe imports the DataStage components. B. dsexport.exe exports the DataStage components. 7) What is the utility you use to schedule the jobs on a UNIX server, other than using Ascential Director? Ans: Use the crontab utility along with a script that starts the job with the proper parameters (for example via the dsjob command, or the DSExecute() function from job control code). Second answer, AUTOSYS: through AutoSys you can automate the job by invoking the shell script written to schedule the DataStage jobs. Third answer, Control-M: through Control-M you can likewise automate the job by invoking the shell script written to schedule the DataStage jobs. A cron-driven wrapper is sketched below.
8) What will you do in a situation where somebody wants to send you a file, and you must use that file as an input or reference and then run the job? Ans: A. Under Windows: use the 'Wait For File' activity in a Sequencer and then run the job. You can also schedule the sequencer around the time the file is expected to arrive. B. Under UNIX: poll for the file; once the file has arrived, start the job or sequencer that depends on the file. A polling sketch is given below.
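A minimal UNIX polling sketch in Perl; the file path, timeout, and job names are hypothetical.

```perl
#!/usr/bin/perl
# wait_for_file.pl -- poll for an incoming file, then kick off the job.
use strict;
use warnings;

my $file    = '/data/incoming/customers.dat';   # expected file (hypothetical)
my $timeout = 4 * 60 * 60;                      # give up after four hours
my $waited  = 0;

until (-e $file) {
    die "timed out waiting for $file\n" if $waited >= $timeout;
    sleep 60;
    $waited += 60;
}

# Optionally wait until the sender has finished writing: the size must
# hold steady across two polls before we trust the file is complete.
my $size = -s $file;
while (1) {
    sleep 30;
    my $now = -s $file;
    last if defined $now && $now == $size;
    $size = $now;
}

system('dsjob', '-run', '-jobstatus', 'DWH_PROJ', 'LoadCustomers') == 0
    or die "LoadCustomers failed\n";
```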
9) Read up on the String functions in DS. Ans: Functions like [ ] (the substring operator) and ':' (the concatenation operator). Syntax: string [ [ start, ] length ] and string [ delimiter, instance, repeats ].
10) Do you know about MetaStage? Ans: In simple terms, metadata is data about data, and MetaStage manages the metadata of DataStage objects (datasets, sequential files, etc.).
11) What other performance tunings have you done in your last project to increase the performance of slowly running jobs? Ans:
1. Staged the data coming from ODBC/OCI/DB2UDB stages, or any database on the server, using Hash/Sequential files, for optimum performance and also for data recovery in case a job aborts.
2. Tuned the OCI stage's 'Array Size' and 'Rows per Transaction' numerical values for faster inserts, updates, and selects.
3. Tuned the 'Project Tunables' in Administrator for better performance.
4. Used sorted data for the Aggregator.
5. Sorted the data as much as possible in the DB and reduced the use of DS-Sort, for better performance of jobs.
6. Removed the data not used from the source as early as possible in the job.
7. Worked with the DB admin to create appropriate indexes on tables, for better performance of DS queries.
8. Converted some of the complex joins/business logic in DS to stored procedures, for faster execution of the jobs.
9. If an input file has an excessive number of rows and can be split up, then use standard logic to run jobs in parallel (see the splitter sketch after this list).
10. Before writing a routine or a transform, make sure that the functionality required is not already in one of the standard routines supplied in the sdk or ds utilities categories.
11. Constraints are generally CPU intensive and take a significant amount of time to process. This may be the case if the constraint calls routines or external macros, but if it is inline code then the overhead will be minimal. Using a constraint to filter a record set is much slower than performing a SELECT … WHERE….
12. Try to have the constraints in the 'Selection' criteria of the jobs themselves. This will eliminate the unnecessary records before joins are even made.
13. Tuning should occur on a job-by-job basis.
14. Use the power of the DBMS.
15. Try not to use a Sort stage when you can use an ORDER BY clause in the database.
16. Make every attempt to use the bulk loader for your particular database; bulk loaders are generally faster than using ODBC or OLE.
Second answer: 1. Minimise the usage of Transformers (instead use Copy, Modify, Filter, Row Generator where appropriate). 2. Use SQL code while extracting the data.
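The splitter mentioned in item 9 above, as a minimal sketch: it breaks a large input file into N roughly equal chunks so that N instances of a multi-instance job can work in parallel. The file name and chunk count are hypothetical.

```perl
#!/usr/bin/perl
# split_input.pl -- split a big flat file into N chunks for parallel runs.
use strict;
use warnings;

my ($infile, $parts) = ('/data/incoming/big_input.dat', 4);  # hypothetical

my @out;
for my $i (1 .. $parts) {
    open $out[$i], '>', "$infile.part$i" or die "$infile.part$i: $!";
}

# Deal rows round-robin so every chunk gets a similar share of the data.
open my $in, '<', $infile or die "$infile: $!";
my $n = 0;
while (my $row = <$in>) {
    print { $out[$n % $parts + 1] } $row;
    $n++;
}
close $in;
close $_ for grep { defined } @out;
```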
Third answer (additional points, not repeating those above; mostly around hash files and job design):
1. Use an IPC stage between two passive stages; it reduces processing time. (Note that the IPC stage is provided in Server jobs, not in Parallel jobs.)
2. Check the write cache of the hash file: use 'Write to cache' on hash file input and 'Preload to memory' on hash file output, but if the same hash file is used both for lookup and as a target, disable this option. If the hash file is used only for lookup, then enable 'Preload to memory'; this will improve the performance.
3. Cache the hash files you are reading from and writing into, and make sure your cache is big enough to hold the hash files.
4. Use ANALYZE.FILE or HASH.HELP to determine the optimal settings for your hash files; this will also minimize overflow on the hash file.
5. Reduce the number of lookups in a job design. Generally we cannot avoid lookups if our requirements make them compulsory, but do not use more than 7 lookups in the same transformer; introduce new transformers if it exceeds 7 lookups.
6. Use not more than 20 stages in a job. There is no hard limit of 20 or 30 stages, but we can break a large job into smaller jobs, using e.g. Dataset stages to store the intermediate data.
7. Handle the nulls.
8. Minimise the warnings.
9. Drop indexes before loading data, and recreate them after loading data into the tables.
10. Write into the error tables only after all the transformer stages.
11. Reduce the width of the input record: remove the columns that you will not use.
12. Check the order of execution of the routines.
13. If possible, break the input into multiple threads and run multiple instances of the job (see the splitter sketch above).
12) How did you handle an 'Aborted' sequencer? Ans: In almost all cases we have to delete the data inserted by this sequencer from the DB manually, fix the job, and then run the job again (the cleanup sketch under question 3 applies here as well).
13) How did you handle reject data?
Ans: Typically a Reject link is defined, and the rejected data is loaded back into the data warehouse. Rejected data is typically bad data, such as duplicates of primary keys or null rows where data is expected. So a Reject link has to be defined on every output link from which you wish to collect rejected data.
14) If you worked with DS 6.0 and later versions, what are Link-Partitioner and Link-Collector used for? Ans: Link Partitioner: used for partitioning the data. Link Collector: used for collecting the partitioned data.
15) What are Routines, where/how are they written, and have you written any routines before? Ans: Routines are stored in the Routines branch of the DataStage Repository, where you can create, view, or edit them using the Routine dialog box. The following are the different types of routines: 1) Transform functions, 2) Before/after job subroutines, 3) Job control routines. In more detail, the following program components are classified as routines:
• Transform functions. These are functions that you can use when defining custom transforms. DataStage has a number of built-in transform functions, which are located in the Routines ➤ Examples ➤ Functions branch of the Repository. You can also define your own transform functions in the Routine dialog box.
• Before/After subroutines. When designing a job, you can specify a subroutine to run before or after the job, or before or after an active stage. DataStage has a number of built-in before/after subroutines, which are located in the Routines ➤ Built-in ➤ Before/After branch in the Repository. You can also define your own before/after subroutines using the Routine dialog box.
• Custom UniVerse functions. These are specialized BASIC functions that have been defined outside DataStage. Using the Routine dialog box, you can get DataStage to create a wrapper that enables you to call these functions from within DataStage. These functions are stored under the Routines branch in the Repository. You specify the category when you create the routine.
If NLS is enabled, you should be aware of any mapping requirements when using custom UniVerse functions: if a function uses data in a particular character set, it is your responsibility to map the data to and from Unicode.
• ActiveX (OLE) functions. You can use ActiveX (OLE) functions as programming components within DataStage. Such functions are made accessible to DataStage by importing them; this creates a wrapper that enables you to call the functions. After import, you can view and edit the BASIC wrapper using the Routine dialog box. By default, such functions are located in the Routines ➤ Class name branch in the Repository, but you can specify your own category when importing the functions.
A special case of routine is the job control routine. Such a routine is used to set up a DataStage job that controls other DataStage jobs. Job control routines are specified on the Job control page of the Job Properties dialog box; they are not stored under the Routines branch in the Repository.
16) What are Transforms, and how do they differ from routines? Ans: Transforms are stored in the Transforms branch of the DataStage Repository, where you can create, view, or edit them using the Transform dialog box. Transforms specify the type of data transformed, the type it is transformed into, and the expression that performs the transformation. DataStage is supplied with a number of built-in transforms (which you cannot edit). You can also define your own custom transforms, which are stored in the Repository and can be used by other DataStage jobs. When using the Expression Editor, the transforms appear under the DS Transform… command on the Suggest Operand menu.
Related to this, functions take arguments and return a value. The word “function” is applied to many components in DataStage:
• BASIC functions. These are one of the fundamental building blocks of the BASIC language. When using the Expression Editor, you can access the BASIC functions via the Function… command on the Suggest Operand menu.
• DataStage BASIC functions. These are special BASIC functions that are specific to DataStage; they are mostly used in job control routines. DataStage functions begin with DS to distinguish them from general BASIC functions. When using the Expression Editor, you can access the DataStage BASIC functions via the DS Functions… command on the Suggest Operand menu.
The following items, although called “functions,” are classified as routines and are described above: transform functions, custom UniVerse functions, and ActiveX (OLE) functions. When using the Expression Editor, all of these components appear under the DS Routines… command on the Suggest Operand menu.
An expression is an element of code that defines a value. The word “expression” is used both as a specific part of BASIC syntax and to describe portions of code that you can enter when defining a job. Areas of DataStage where you can use such expressions are: defining breakpoints in the debugger; defining column derivations, key expressions, and constraints in Transformer stages; and defining a custom transform. In each of these cases the DataStage Expression Editor guides you as to what programming elements you can insert into the expression.
17) What are OConv() and IConv() functions and where are they used? Ans: IConv() converts a string to an internal storage format; OConv() converts an expression to an output format. For example, suppose you want to change a date from mm/dd/yyyy to dd/mm/yyyy: IConv() converts the incoming mm/dd/yyyy string into DataStage's internal date format (an internal day number such as 740), and OConv() then renders that internal value in the format you want, e.g. Oconv(Iconv(InputDate, "D/MDY[2,2,4]"), "D/DMY[2,2,4]") (check the exact conversion codes in the DataStage help).
18) Explain the differences between Oracle 8i and 9i. Ans: Multiprocessing, more database features, and more support for dimensional modeling.
19) What are Static Hash files and Dynamic Hash files? Ans: As the names themselves suggest: a static hash file has its size (modulus) fixed when it is created, while a dynamic hash file resizes itself as data is added. In general we use Type-30 dynamic hash files. The data file has a default size of 2 GB, and the overflow file is used if the data exceeds the 2 GB size.
20) What is the Hash file stage and what is it used for? Ans: Used for look-ups; it is like a reference table. It is also used in place of ODBC/OCI tables for better performance. We can also use the Hash File stage to avoid/remove duplicate rows by specifying the hash key on a particular field.
21) Have you ever been involved in updating DS versions (such as DS 5.X)? If so, tell us some of the steps you have taken in doing so. Ans: Yes. The following are some of the steps I have taken:
1) Definitely take a backup of the whole project(s) by exporting each project as a .dsx file.
2) See that you use the same parent folder in the new version, so that your old jobs that use hard-coded file paths still work.
3) After installing the new version, import the old project(s); you will have to compile them all again. You can use the 'Compile All' tool for this.
4) Make sure that all your DB DSNs are created with the same names as the old ones. This step matters when moving DS from one machine to another.
5) In case you are also upgrading your DB from Oracle 8i to Oracle 9i, there is a tool on the DS CD that can do this for you.
6) Do not stop the 6.0 server before the upgrade; the version 7.0 install process collects project information during the upgrade. There is NO rework (recompilation of existing jobs/routines) needed after the upgrade.
22) Did you parameterize the jobs, or hard-code the values in them?
Ans: Always parameterize the job; there is no way you should hard-code parameters into your jobs. The values come either from Job Properties or from a 'Parameter Manager', a third-party tool. The most often parameterized variables in a job are: DB DSN name, username, password, and the dates with respect to which the data is to be looked up. A sketch of supplying such parameters at run time follows.
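For jobs run from the command line, parameter values can be supplied at run time via dsjob's -param option; a minimal sketch follows (the parameter, project, and job names are hypothetical, matching the variables listed above).

```perl
#!/usr/bin/perl
# run_with_params.pl -- pass parameterized values to a job at run time.
use strict;
use warnings;

# Hypothetical values; in practice these would come from a secured
# config file or the environment, never hard-coded like this.
my %params = (
    DSN         => 'DWH',
    UserName    => 'etl_user',
    Password    => 'secret',
    ProcessDate => '2004-06-30',
);

my @args = ('dsjob', '-run');
push @args, '-param', "$_=$params{$_}" for sort keys %params;
push @args, '-jobstatus', 'DWH_PROJ', 'LoadCustomerDim';

my $rc = system(@args);
exit($rc == 0 ? 0 : 1);
```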