1]. What are the types of data modelling and their significance?
Data modelling is broadly classified into 2 types:
a) E-R Diagrams (Entity-Relationship).
b) Dimensional Modelling.
2]. What are the types of dimensional modelling?
Dimensional modelling is further subdivided into 2 types:
a) Star Schema - simple and much faster; denormalized form.
b) Snowflake Schema - complex, with more granularity; more normalized form.
3]. What is the importance of a surrogate key in data warehousing?
A surrogate key is the primary key of a dimension table. Its main advantage is that it is independent of the underlying database, i.e. the surrogate key is not affected by changes going on in that database.
4]. Differentiate database data and data warehouse data?
Data in a database is:
a) Detailed or transactional.
b) Both readable and writable.
c) Current.
Data in a data warehouse, by contrast, is typically summarized as well as detailed, mostly read-only, and historical.
5]. What is the flow of loading data into fact and dimension tables?
Fact table - a table holding a collection of foreign keys that correspond to the primary keys in the dimension tables, together with fields holding numeric values.
Dimension table - a table with a unique primary key.
Load - data should be loaded into the dimension tables first. The fact table is then loaded based on the primary key values in the dimension tables.
6]. Orchestrate vs. DataStage Parallel Extender?
Orchestrate is itself an ETL tool with extensive parallel processing capabilities, running on the UNIX platform. DataStage used Orchestrate with DataStage XE (the beta version of 6.0) to incorporate parallel processing capabilities. DataStage's vendor has since acquired Orchestrate, integrated it with DataStage XE, and released the result as DataStage 6.0, i.e. Parallel Extender.
7]. Differentiate primary key and partition key?
A primary key is a combination of unique and not null. It can be made up of several columns, in which case it is called a composite primary key. A partition key is typically just a part of the primary key. There are several partitioning methods, such as Hash, DB2, Random, etc. When using Hash partitioning, we specify the partition key.
8]. How do you execute a DataStage job from the command line?
Use the "dsjob" command as follows:
dsjob -run -jobstatus projectname jobname
9]. What are stage variables, derivations and constraints?
Stage variable - an intermediate processing variable that retains its value while rows are read and does not pass its value to a target column.
Derivation - an expression that specifies the value to be passed to the target column.
Constraint - a condition that is either true or false and that controls the flow of data down a link.
10]. What is the default cache size? How do you change the cache size if needed?
The default cache size is 256 MB. It can be increased in DataStage Administrator by selecting the Tunables tab and specifying the cache size there.
11]. What are the types of hashed file?
Hashed files are broadly classified into 2 types:
a) Static - subdivided into 17 types based on the primary key pattern.
b) Dynamic - subdivided into 2 types: i) Generic ii) Specific.
The default hashed file is "Dynamic - Type Random 30 D".
12]. Containers: usage and types?
A container is a collection of stages grouped for the purpose of reusability. There are 2 types of containers:
a) Local container: job specific.
b) Shared container: can be used in any job within a project.
13]. Compare and contrast ODBC and Plug-In stages?
ODBC:
a) Poor performance.
b) Can be used for a variety of databases.
c) Can handle stored procedures.
Plug-In:
a) Good performance.
b) Database specific (only one database).
c) Cannot handle stored procedures.
14]. How do you run a shell script within the scope of a DataStage job?
By using the "ExecSH" routine in the Before/After job properties.
15]. How do you handle date conversions in DataStage? Convert a mm/dd/yyyy format to yyyy-dd-mm?
We use:
a) "Iconv" function - converts a value to the internal format.
b) "Oconv" function - converts an internal value to an external (output) format.
The function to convert mm/dd/yyyy format to yyyy-dd-mm is Oconv(Iconv(Fieldname,"D/MDY[2,2,4]"),"D-YDM[4,2,2]")
16]. Types of parallel processing?
Parallel processing is broadly classified into 2 types:
a) SMP - Symmetric Multiprocessing.
b) MPP - Massively Parallel Processing.
17]. What does a config file in Parallel Extender consist of?
The config file consists of the following:
a) Number of processes or nodes.
b) Actual disk storage location.
18]. Functionality of Link Partitioner and Link Collector?
Link Partitioner: splits data into several partitions or data flows using one of the available partitioning methods.
Link Collector: collects the data coming from the partitions, merges it into a single data flow and loads it to the target.
19]. What are modulus and splitting in a dynamic hashed file?
In a dynamic hashed file, the size of the file changes automatically as data is added or removed. The modulus is the current number of groups in the file. When the file grows, groups are split and the modulus increases; this is called "splitting". When the file shrinks, groups are merged and the modulus decreases.
20]. Types of views in DataStage Director?
There are 3 types of views in DataStage Director:
a) Job View - dates of jobs compiled.
b) Log View - warning messages, event messages, program-generated messages.
c) Status View - status of the job's last run.
21]. What are the difficulties faced in using DataStage, or what are the constraints in using DataStage?
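
Returning to question 18, the split-then-merge behaviour of the Link Partitioner and Link Collector can be illustrated outside DataStage. The following is only a minimal Python sketch of hash partitioning on a partition key (question 7), not how DataStage implements it; the function names, the cust_id column and the use of crc32 as the hash are all assumptions made for illustration.

```python
from collections import defaultdict
from zlib import crc32


def hash_partition(rows, key, num_partitions):
    """Split rows into partitions by hashing the partition key column."""
    partitions = defaultdict(list)
    for row in rows:
        # crc32 is an illustrative choice: it is stable across runs,
        # unlike Python's built-in hash() for strings.
        bucket = crc32(str(row[key]).encode("utf-8")) % num_partitions
        partitions[bucket].append(row)
    return partitions


def collect(partitions):
    """Merge all partitions back into a single data flow."""
    merged = []
    for bucket in sorted(partitions):
        merged.extend(partitions[bucket])
    return merged


# Hypothetical input: 20 rows spread over 5 customer ids.
rows = [{"cust_id": i % 5, "amount": 10 * i} for i in range(20)]
parts = hash_partition(rows, key="cust_id", num_partitions=3)
collected = collect(parts)
```

Because the bucket depends only on the key value, all rows with the same cust_id land in the same partition, and the collector returns every input row exactly once (though not necessarily in the original order).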
22]. How do you eliminate duplicate rows?
delete from table_name where rowid not in (select min(rowid) from table_name group by column_name)
Use max(rowid) instead of min(rowid) to keep the last row of each group rather than the first.
23]. How do we automate dsjobs?
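
The rowid-based delete from question 22 can be tried end to end without an Oracle instance. The sketch below uses Python's built-in sqlite3 module as a stand-in, on the assumption that SQLite's implicit rowid behaves closely enough to Oracle's for this pattern; the emp table, its columns and the sample rows are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE emp (name TEXT, dept TEXT)")
conn.executemany(
    "INSERT INTO emp (name, dept) VALUES (?, ?)",
    [("ann", "hr"), ("bob", "it"), ("ann", "hr"), ("bob", "it"), ("cy", "it")],
)

# Keep the lowest rowid in each group of identical (name, dept) values
# and delete every other row in that group.
conn.execute(
    "DELETE FROM emp WHERE rowid NOT IN "
    "(SELECT MIN(rowid) FROM emp GROUP BY name, dept)"
)
conn.commit()

remaining = conn.execute("SELECT name, dept FROM emp ORDER BY name").fetchall()
```

After the delete, remaining holds one row per distinct (name, dept) pair. Grouping by every column that defines a duplicate is the important design choice: grouping by too few columns would delete rows that only partly match.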
24]. What are XML files, how do you read data from XML files, and which stage should be used?
25]. How do you catch bad rows from OCI stage?
26]. Why do you use SQL*Loader or the OCI stage?
27]. Suppose there are a million records: would you use OCI? If not, which stage would you prefer?
28]. How do you populate source files?
29]. How do you pass filename as the parameter for a job?
30]. How do you pass the parameter to the job sequence if the job is running at night?
31]. What happens if the job fails at night?
32]. What is SQL tuning? How do you do it?
33]. What is project life cycle and how do you implement it?
34]. How will you call an external function or subroutine from DataStage?
35]. How do you track performance statistics and enhance it?
36]. How do you do an Oracle 4-way inner join if there are 4 Oracle input files?
37]. What is the order of execution done internally in the transformer, with the stage editor having input links on the left-hand side and output links on the right?