Capgemini Interview Questions Answers
11.
If we use SAME partitioning in the first stage, which
partitioning method will it take?
Ans:
DataStage uses round robin when it partitions the data initially. Since SAME keeps whatever partitioning is already in place, SAME in the first stage effectively behaves as round robin.
12.
What is the use of modulus partitioning?
Ans:
If the key column is an integer, then we will use modulus. Of course, we can
use hash partitioning as well, but performance-wise modulus is better:
hash partitioning must compute a hash code for each key to decide which
node receives the data, so it requires more time to process the data.
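The cost difference can be sketched as follows (a minimal illustration, not DataStage internals; the node count and the hash function are assumptions):

```python
import hashlib

# Illustrative sketch, not DataStage internals: how modulus and hash
# partitioning each pick a node for a record's integer key.
NUM_NODES = 4  # assumed 4-node configuration

def modulus_partition(key: int) -> int:
    # Modulus: the integer key maps to a node with one arithmetic operation.
    return key % NUM_NODES

def hash_partition(key: int) -> int:
    # Hash: a hash code must be computed from the key first, then reduced
    # to a node number - extra work for every record.
    digest = hashlib.md5(str(key).encode()).digest()
    return digest[0] % NUM_NODES

# Equal keys always land on the same node under both schemes.
print([modulus_partition(k) for k in (0, 1, 2, 3, 4, 5)])  # [0, 1, 2, 3, 0, 1]
```
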
13.
What are the types of transformers used in DataStage PX?
Ans:
Transformers are of two types: a. Basic Transformer b. Parallel (Normal) Transformer
Difference:
A Basic transformer compiles in "Basic Language" whereas a Normal
Transformer compiles in "C++".
Basic transformer does not run on multiple nodes whereas a Normal
Transformer can run on multiple nodes giving better performance.
Basic transformer takes less time to compile than the Normal Transformer.
Usage:
A basic transformer should be used in Server Jobs.
Normal Transformer should be used in Parallel jobs as it will run on multiple
nodes here giving better performance.
14. What are Performance tunings you have done in your last project to increase the
performance of slowly running jobs?
Ans:
15.
What is a Peek stage? When do you use it?
Ans:
The Peek stage is a Development/Debug stage. It can have a single input link
and any number of output links. The Peek stage lets you print record column
values either to the job log or to a separate output link as the stage copies
records from its input data set to one or more output data sets, like the Head
stage and the Tail stage. The Peek stage can be helpful for monitoring the
progress of your application or to diagnose a bug in your application.
16.
What is row generator? When do you use it?
Ans:
The Row Generator stage produces a set of mock data fitting the specified
metadata.
This is useful where we want to test our job but have no real data available to
process.
17.
What is RCP? How it is implemented?
Ans:
DataStage is flexible about metadata. It can cope with the situation where
the metadata isn't fully defined. You can define part of your schema and
specify that, if your job encounters extra columns that are not defined in the
metadata when it actually runs, it will adopt these extra columns and
propagate them through the rest of the job. This is known as runtime column
propagation (RCP).
This can be enabled for a project via the DataStage Administrator, and set for
individual links via the Outputs page Columns tab for most stages, or in the
Outputs page General tab for Transformer stages. You should always ensure
that runtime column propagation is turned on.
RCP is implemented through a schema file, which is a plain text file
containing a record (or row) definition.
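A schema file might look like the following (a minimal sketch; the field names are illustrative):

```
record (
  empid: int32;
  ename: nullable string[max=50];
  hiredate: date;
)
```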
18.
What are Stage Variables, Derivations and Constraints?
Ans:
Stage variables are temporary variables defined within a Transformer stage.
They are evaluated for every input row and can be referenced in constraints
and in output column derivations.
Derivations are expressions that compute the values of the output columns.
Constraints are conditions defined on output links; a row is written to a link
only when the link's constraint evaluates to true.
For each input row, the Transformer evaluates the stage variables first, then
the constraints, then the column derivations.
25.
What is meant by Junk dimension?
Ans:
Junk dimensions are dimensions that contain miscellaneous data (like flags
and indicators) that do not fit in the base dimension table.
26.
What is meant by Degenerated dimension?
Ans:
A degenerate dimension is data that is dimensional in nature but stored in a
fact table.
Degenerate Dimension:
This is nothing but dimension data stored within the fact table itself.
Example: Suppose Order Number and Invoice Number fields have a
one-to-one relationship with the fact table. Rather than maintaining a
separate dimension table with as many rows as the fact table (two tables
with a billion records each instead of one), you would store these fields
within the fact table itself.
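A small SQLite sketch of the idea (table and column names are illustrative): order_number lives in the fact table as a degenerate dimension, so it can be queried without joining to any dimension table.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE fact_sales (
        order_number TEXT,     -- degenerate dimension: no dimension table for it
        product_key  INTEGER,  -- regular foreign key to a product dimension
        amount       REAL
    )
""")
conn.executemany("INSERT INTO fact_sales VALUES (?, ?, ?)",
                 [("ORD-1", 10, 50.0), ("ORD-1", 11, 25.0), ("ORD-2", 10, 40.0)])

# Group directly on the degenerate dimension - no join required.
rows = sorted(conn.execute(
    "SELECT order_number, SUM(amount) FROM fact_sales GROUP BY order_number"))
print(rows)  # [('ORD-1', 75.0), ('ORD-2', 40.0)]
```
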
27.
What type of data you are getting?
Ans:
Customer data only.
28.
I have created a dataset on a 4 node configuration file. Tell me
how many files will be created in total. What are those?
Ans:
Total 5 files will be created. Those are 1 descriptor file and 4 dataset files.
29.
I have created a dataset on 2 node configuration file. Can I use
the same dataset on 4 node configuration file?
Ans:
Yes, we can do that. But the reverse is not possible: if you create a dataset
on a 4 node configuration file and try to reuse the same dataset on a 2 node
configuration file, the job will execute without any error, but you will not get
the expected data in the output.
30.
What value would be listed on the datasets when the column
value is "NULL"?
Ans: Dataset will show NULL when there is a null in the data.
Oracle Interview Questions & Answers
31.
What is meant by Referential Integrity?
Ans: Referential integrity maintains the relationships between tables (through
primary and foreign keys) and keeps the data in those tables consistent,
preventing orphaned or inconsistent rows.
32.
How do you connect to the oracle server?
Ans: By creating a DSN (data source name), we can connect to the Oracle server.
33.
What is the difference between Union and Union All?
Ans: Union sorts the combined result set and removes duplicates, whereas
Union All does not remove duplicates.
Union All is faster than Union because Union's duplicate elimination requires
a sorting operation, which takes time.
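The difference can be checked with a quick SQLite sketch (tables and values are illustrative):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE a(x INTEGER);
    CREATE TABLE b(x INTEGER);
    INSERT INTO a VALUES (1), (2), (3);
    INSERT INTO b VALUES (2), (3), (4);
""")
# UNION removes the duplicated values 2 and 3; UNION ALL keeps every row.
union = sorted(r[0] for r in conn.execute("SELECT x FROM a UNION SELECT x FROM b"))
union_all = [r[0] for r in conn.execute("SELECT x FROM a UNION ALL SELECT x FROM b")]
print(union)           # [1, 2, 3, 4]
print(len(union_all))  # 6
```
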
34.
What is the difference between Delete and Truncate?
Ans:
a. Delete is a DML command whereas Truncate is a DDL command.
b. We can write a WHERE clause with Delete, whereas we cannot write a
WHERE clause with Truncate.
c. We can roll back the data after a Delete, whereas we cannot roll back
the data after a Truncate.
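Points (b) and (c) can be seen with a small SQLite sketch (SQLite has no TRUNCATE, so only the DELETE side is shown; the table and data are illustrative):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE emp(empid INTEGER, sal INTEGER)")
conn.executemany("INSERT INTO emp VALUES (?, ?)", [(1, 100), (2, 200), (3, 300)])
conn.commit()

# DELETE accepts a WHERE clause, so only matching rows are removed.
conn.execute("DELETE FROM emp WHERE sal < 250")
print(conn.execute("SELECT COUNT(*) FROM emp").fetchone()[0])  # 1

# DELETE is DML, so it can be rolled back.
conn.rollback()
print(conn.execute("SELECT COUNT(*) FROM emp").fetchone()[0])  # 3
```
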
44.
How to find the 5th highest salary from a table?
Ans: Select min(sal) as high5 from (select distinct sal from EMP order by sal
desc) where rownum <= 5;
45.
How to find the nth highest salary from a table?
Ans: Select distinct (e1.sal) from EMP e1 where &N = (Select count (distinct
(e2.sal)) from EMP e2 where e2.sal >= e1.sal);
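The same correlated-count pattern can be checked in SQLite (the data is illustrative; a bound parameter stands in for the SQL*Plus &N substitution variable):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE emp(sal INTEGER)")
conn.executemany("INSERT INTO emp VALUES (?)",
                 [(100,), (200,), (200,), (300,), (400,)])

# A salary is the Nth highest when exactly N distinct salaries are >= it.
query = """
    SELECT DISTINCT e1.sal FROM emp e1
    WHERE ? = (SELECT COUNT(DISTINCT e2.sal) FROM emp e2 WHERE e2.sal >= e1.sal)
"""
result = conn.execute(query, (2,)).fetchone()[0]
print(result)  # 2nd highest salary: 300
```
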
46.
Tell me the syntax of decode statement?
Ans: The decode function has the same functionality of If-then-else
statement.
Decode (expression, search, result [, search, result]... [, default])
Expression is the value to compare. Search is the value that is
compared against expression. Result is the value returned if the
expression equals the search.
Example:
Select ename, decode(empid, 1000, 'IBM', 2000, 'Microsoft', 3000,
'Capgemini', 'TCS') as result from emp;
The above decode statement is equivalent to the following IF-THEN-ELSE
statement:
IF empid = 1000 THEN
result := 'IBM';
ELSIF empid = 2000 THEN
result := 'Microsoft';
ELSIF empid = 3000 THEN
result := 'Capgemini';
ELSE
result := 'TCS';
END IF;
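The search/result pairing with a trailing default can also be modelled in Python (a minimal sketch of decode's semantics, not Oracle's implementation):

```python
def decode(expression, *args):
    # args holds search1, result1, search2, result2, ..., plus an optional
    # trailing default (present when the argument count is odd).
    pairs, default = args, None
    if len(args) % 2 == 1:
        pairs, default = args[:-1], args[-1]
    for i in range(0, len(pairs), 2):
        if expression == pairs[i]:
            return pairs[i + 1]
    return default

print(decode(2000, 1000, "IBM", 2000, "Microsoft", 3000, "Capgemini", "TCS"))
# Microsoft
```
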
Note:
And also prepare on SQL queries which are in SQL Question &
Answers document.
Prepare Unix commands as well (the difference between find and grep, and
how to delete a dataset using the orchadmin command).
Prepare on the below question as well:
1. Merge statement (insert, update and delete in one SQL statement)
2. Reference table: 5 lakh records; primary table: 50,000 records - go for a
sparse lookup. If it is vice versa, use a normal lookup.
3. Unix command to run a datastage job