Interview Questions ETL

1. Tell me about yourself.
2. Explain the project architecture.
3. How do you validate the data? Give some business scenarios.
4. Explain negative and positive testing in the context of ETL.
5. Difference between DELETE, TRUNCATE and DROP.
6. In what scenarios do we use DELETE and TRUNCATE?
7. Find the duplicate records.
8. If you have DBA permission, how will you create a backup table?
9. Delete duplicate records.
10. If you don't have proper documents such as the STM (source-to-target mapping) document, how will you validate the data? What is your approach?
11. Severity and priority examples from your project; explain all possible scenarios.
12. Defect life cycle.
13. If a defect is not going to be completed in the current sprint, what will its status change to?
14. Self join query (she gave one scenario).
15. Some Unix commands.
2nd round
1. About yourself.
2. What validations will you do?
3. Which schema are you using in the DWH? Difference between star schema and snowflake schema.
4. Why is star schema mostly used, and why not snowflake?
5. Some SQL queries.
6. Some general questions.
-------------------------------------------*********************--------------------------------------
1. Project architecture.
2. Severity and priority in manual and ETL testing, for all combinations of severity and priority.
3. Once data is loaded into your environment, what tests do you do?
4. Levels of testing.
5. Defect life cycle.
6. Why do we set the 'Deferred' status, and why can't the defect be fixed in the current sprint?
7. Difference between database, data warehouse (DWH) and data mart (DM).
8. Some queries.
9. Some SQL functions, with an explanation and example.
10. Team size (in a project).
11. CAST function.
12. Snowflake and star schema: which is used most, and why is it used instead of the other?
2nd round
1. Defect life cycle.
2. Given a lengthy business scenario, then asked which type I would use, SCD Type 1 or Type 2, and why; and, whichever type it is, to give the test cases.
-----------------------------------------***************************-------------------------------------------------
1. select count(*) from emp
If the count is not being returned, what will you do in Teradata?
2. Find the 16th highest salary in the Physics department.
3. How will you remove duplicate records?
4. How will you display duplicate records?
5. You have a table that contains both original and duplicate records; route the original and duplicate records into two separate tables.
6. Find the employee name and manager name.
7. Given the two columns of values below, write the left, right and full join results.
1 0
0 1
1 0
0 1
8. What is the difference between a volatile table and a non-volatile table?
9. How do you find the tables that a created view depends on?
10. Joins (all types of joins).
11. Lookup, referential integrity.
12. Subqueries.
13. About the project and yourself.
------------------------------------------*********************-------------------------------------------
Cognizant, first round
1) Tell me about yourself.
2) Explain your project architecture at a high level.
3) Write a query to find the people who were hired before 2021.
4) Write a query to find the employees who were hired in 2018 and 2019.
5) Why do you use TO_DATE?
6) Difference between TO_DATE and TO_CHAR?
7) Write a query to join 3 tables; the third table should be joined with a left outer join.
8) What is the significance of a left join? Explain with examples.
9) What are self join and inner join?
10) What is the purpose of a self join? Give examples.
11) Query optimization.
12) Which do you prefer, a join or a subquery? Why?
13) Difference between PARTITION BY and GROUP BY?
14) Roles and responsibilities as a tester.
15) What validations do you perform in staging?
16) Challenges in DWH?
17) What validations do you perform in transformation testing?
18) Importance of metadata testing.
19) Where do you check the column count?
20) What are triggers in SQL?
21) Validations performed for SCD?
22) Types of facts, with examples.
23) Types of fact tables?
24) What is a cumulative fact?
25) Dimensional modeling types.
26) Different tabs in ALM?
27) How do you log a defect?
28) As a tester, what concerns you most in the defect life cycle?

--------------------*************************------------------------------------------------
Interview questions sent by Swapna
[0:49 pm, 28/01/2022] Swapna Nsr: Question: I want the details of customers who have the maximum billing for each customer type
[0:49 pm, 28/01/2022] Swapna Nsr: Question 2: Details of customers who have duplicate bill numbers
[0:49 pm, 28/01/2022] Swapna Nsr: 1) Tell me about yourself
[0:49 pm, 28/01/2022] Swapna Nsr: Questions asked at Accion Labs
[0:49 pm, 28/01/2022] Swapna Nsr: 6) Explanation of SCD Type 1 and SCD Type 2. The interviewer gave me one requirement, asked me to explain the validations in detail, and asked about different scenarios
[0:49 pm, 28/01/2022] Swapna Nsr: And the interviewer wanted sum(amount) displayed month-wise
[0:49 pm, 28/01/2022] Swapna Nsr: 5) In a table the columns are product ID, amount, date (in the form dd/mm/yyyy)
[0:49 pm, 28/01/2022] Swapna Nsr: 2) How to verify session logs, and where to see session logs
[0:49 pm, 28/01/2022] Swapna Nsr: 4) Write the syntax of the LEAD and LAG analytical functions and explain their functionality in detail
[0:49 pm, 28/01/2022] Swapna Nsr: 3) What validations are covered
[0:49 pm, 28/01/2022] Swapna Nsr: 8) Do you have to deploy code to test?
[0:49 pm, 28/01/2022] Swapna Nsr: 7) What is a release document
[0:49 pm, 28/01/2022] Swapna Nsr: 11) There is a table called transaction and the data is like below
ID amount year
1 10 01-01-2018
2 100 01-02-2018
...
He wanted me to display sum(amount) year-wise
[0:49 pm, 28/01/2022] Swapna Nsr: 10) Table 1 record count = 100 and table 2 record count = 300. There is no common column between these 2 tables. The interviewer asked me to display count of table 1 + count of table 2 (400) in one column, e.g.:
count
400
[0:49 pm, 28/01/2022] Swapna Nsr: 9) Explain the agile process
[0:49 pm, 28/01/2022] Swapna Nsr: 16) He asked me to write a query to display yesterday's load data, and mentioned not to use a static value (like date 03-21-2018)
[0:49 pm, 28/01/2022] Swapna Nsr: 13) There are 2 date columns in the format DDMMYYYY, and he asked me to get the difference between the 2 date columns in seconds
[0:49 pm, 28/01/2022] Swapna Nsr: 15) In a column there are positive and negative values. He asked me to display sum(positive) values in one column and sum(negative) values in another column
[0:49 pm, 28/01/2022] Swapna Nsr: 14) Star schema and snowflake schema explanation
[0:49 pm, 28/01/2022] Swapna Nsr: Select sum(cnt) as total_count from
(Select count(*) as cnt from tab1
Union all
Select count(*) as cnt from tab2) result
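The combined-count query above can be tried out locally. This is only a minimal sketch: it uses Python's built-in sqlite3 module rather than the interviewer's database, and the column names `a` and `b` and the row contents are made up for illustration.

```python
import sqlite3

# In-memory database with two unrelated tables (tab1/tab2 as in the query above;
# the columns a and b are hypothetical).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE tab1 (a INTEGER)")
conn.execute("CREATE TABLE tab2 (b TEXT)")
conn.executemany("INSERT INTO tab1 VALUES (?)", [(i,) for i in range(100)])
conn.executemany("INSERT INTO tab2 VALUES (?)", [(str(i),) for i in range(300)])

# UNION ALL keeps both count rows even when the two counts are equal;
# a plain UNION would merge equal counts into one row and undercount.
query = """
SELECT SUM(cnt) AS total_count
FROM (
    SELECT COUNT(*) AS cnt FROM tab1
    UNION ALL
    SELECT COUNT(*) AS cnt FROM tab2
) result
"""
total_count = conn.execute(query).fetchone()[0]
print(total_count)
```

No join is needed because the two tables share no column; each inner SELECT produces one row and the outer SUM adds them.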
[0:49 pm, 28/01/2022] Swapna Nsr: Query: select * from Target where load_date in (select max(load_date) from Target)
[0:49 pm, 28/01/2022] Swapna Nsr: 12) In a table the data is like DDMMYYYY and he wanted me to display the data in the format DD-YYYY-MON
[0:49 pm, 28/01/2022] Swapna Nsr: 17) Difference between the NVL and COALESCE functions
[0:49 pm, 28/01/2022] Swapna Nsr: 1) The data in source
1 swapna
2 harshita
The data in target looks like
1 swapna
1 swapna
2 harshita
2 harshita
Write a query on the target side
[0:49 pm, 28/01/2022] Swapna Nsr: 4) The data in source
Table1 Table2
Col1 Col2
100 1
200 2
In target
101
202
Write a query
[0:49 pm, 28/01/2022] Swapna Nsr: 7) I have data in a table as below
ID sal
A 100
B 200
C 50
Update the salary: for employees who have sal>100, update it to 1,
and where sal<100, update it to 0. Write a query
[0:49 pm, 28/01/2022] Swapna Nsr: 3) The data in source
Swapna
The data in target
Swapna**
[0:49 pm, 28/01/2022] Swapna Nsr: 6) The data in source is like
Table1 Table2
Col1 Col2
200 1
300 2
The data in target is like
2001
3002
[0:49 pm, 28/01/2022] Swapna Nsr: 2) The data in source is like
Xyz123
12abc
Target
Xyz
Abc
Write a query
[0:49 pm, 28/01/2022] Swapna Nsr: 5) The data in source
ID year name
A 2012 Sony
A 2014 Sony
A 2016 Sony
B 2000 Jyo
B 2001 Jyo
B 2010 Jyo
The data in target is like
A 2016 Sony
B 2010 Jyo
Write a query
Difference between CASE and DECODE
Difference between a function and a procedure
What is a function
What is a procedure
Some positive values and some negative values were given, and those values should be displayed in different columns
Jyoshna: from that string, the 5th and 6th characters should be displayed
Jyoshna kancham@gmail.com: each word should go into a different column
Top 10 records using the TOP function
What are RANK and DENSE_RANK? Where will we use them
What is the difference between DROP and TRUNCATE
What is TCL? What commands are in it
What are SAVEPOINT, COMMIT and ROLLBACK
What is a cube
What is BI testing? Have you worked on it
How many types of joins are there? What is a full outer join
What are LEAD and LAG? Where will we use them
How will you get the duplicate records
What are Hive, Sqoop, map, reduce
What are the commands in Unix? Tell me
How will you search a file using the grep command
What is agile methodology
What is the V-model
What is STLC
What is the defect life cycle? Tell me about it
What is integration testing
Tell me the architecture of your project
Suppose 1000 records are in the source and only 500 records are loaded into the target. What is your approach
Same thing: when the job is running, only some of the records are loaded. What is your approach
What validations will you do in the landing area
How will you check the header and footer, and the file name
How will you differentiate the staging 1 and staging 2 tables? Do they contain the same tables or not? If they contain the same tables, how will you differentiate them
What is a surrogate key, and what validations do you do on the surrogate key
What is a mapping
Where will you check the logs
What is a traceability matrix
Where will you do requirements mapping
What transformations do you know
What is a lookup transformation
What is a router transformation? Where will we use it
Difference between DROP and TRUNCATE
What is the V-model
What is a subquery
What is a correlated subquery
Where will we use it

What is a data mart
data:
Eid ename sal deptno date
1 sony 10 01 01012018
1 sony 15 01 02022018
2 jyo 20 01 02022018
2 jyo 20 01 03032018
Get only the latest month's data from emp
select * from (select emp.*, dense_rank() over (partition by eid order by date desc) as rk from
emp) a where a.rk = 1
select * from emp e inner join (select eid, max(date) as latest_date from emp group by eid) latest
on latest.eid = e.eid and latest.latest_date = e.date
Select * from Emp e where e.Rowid = (select max(e1.Rowid) from Emp e1 where e.eid = e1.eid)
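A runnable sketch of the DENSE_RANK approach above, using Python's sqlite3 module (window functions need SQLite 3.25 or newer). Two assumptions are mine, not from the notes: the date column is named `dt`, and because it is stored as DDMMYYYY text it is rearranged to YYYYMMDD inside ORDER BY so that it sorts chronologically.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE emp (eid INTEGER, ename TEXT, sal INTEGER, deptno TEXT, dt TEXT)"
)
conn.executemany(
    "INSERT INTO emp VALUES (?, ?, ?, ?, ?)",
    [
        (1, "sony", 10, "01", "01012018"),
        (1, "sony", 15, "01", "02022018"),
        (2, "jyo", 20, "01", "02022018"),
        (2, "jyo", 20, "01", "03032018"),
    ],
)

# Rank each employee's rows from newest to oldest and keep rank 1.
# substr() rebuilds DDMMYYYY as YYYYMMDD so text ordering matches date ordering.
query = """
SELECT eid, ename, sal, dt
FROM (
    SELECT emp.*,
           DENSE_RANK() OVER (
               PARTITION BY eid
               ORDER BY substr(dt, 5, 4) || substr(dt, 3, 2) || substr(dt, 1, 2) DESC
           ) AS rk
    FROM emp
) a
WHERE a.rk = 1
ORDER BY eid
"""
latest = conn.execute(query).fetchall()
print(latest)
```

With a real DATE column the substr() rearrangement would not be needed; ordering by the column directly would do.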
data:
eid name city
1 jyo bng
1 jyo chn
2 swapna bng
select eid, count(distinct city) from emp where city in ('bng','chn')
group by eid
having count(distinct city) = 2
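The query above picks employees that appear in both cities. A small sketch with Python's sqlite3 module shows the idea on the sample rows; the table name `emp_city` is a hypothetical name chosen for this example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE emp_city (eid INTEGER, name TEXT, city TEXT)")
conn.executemany(
    "INSERT INTO emp_city VALUES (?, ?, ?)",
    [(1, "jyo", "bng"), (1, "jyo", "chn"), (2, "swapna", "bng")],
)

# Keep only employees whose rows cover both cities. COUNT(DISTINCT city)
# stays correct even if the same (eid, city) pair is loaded twice.
query = """
SELECT eid
FROM emp_city
WHERE city IN ('bng', 'chn')
GROUP BY eid
HAVING COUNT(DISTINCT city) = 2
"""
in_both_cities = [row[0] for row in conn.execute(query)]
print(in_both_cities)
```

Using COUNT(*) in the HAVING clause gives the same answer on this data, but it would wrongly match an employee with two duplicate rows for a single city.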
select case when sal > 0 then sal end as positive_value,
case when sal < 0 then sal end as negative_value
from emp
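As a sketch of how the CASE expressions above behave, the following uses Python's sqlite3 module with made-up salary values. It shows both the per-row split and the summed version asked for in the sum(positive)/sum(negative) question.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE emp (sal INTEGER)")
# Sample values are invented for illustration.
conn.executemany("INSERT INTO emp VALUES (?)", [(100,), (-50,), (200,), (-25,)])

# Per-row split: each value lands in one column, the other column is NULL.
per_row = conn.execute("""
    SELECT CASE WHEN sal > 0 THEN sal END AS positive_value,
           CASE WHEN sal < 0 THEN sal END AS negative_value
    FROM emp
    ORDER BY rowid
""").fetchall()

# Aggregated split: one row with the positive total and the negative total.
totals = conn.execute("""
    SELECT SUM(CASE WHEN sal > 0 THEN sal ELSE 0 END) AS positive_sum,
           SUM(CASE WHEN sal < 0 THEN sal ELSE 0 END) AS negative_sum
    FROM emp
""").fetchone()

print(per_row)
print(totals)
```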
How do you use the joiner transformation in your project
There are 1000 records in the source and 800 records were loaded into the target. What is your approach?
How will you find the missing records in the target table without using MINUS
What is the other way to compare the columns without using MINUS
Are the records loaded as per the business logic or not
Write the syntax to extract data from 2 tables in different schemas
Ans: if both schemas are on the same server, we can join the 2 tables; otherwise we can extract both tables into
Excel and then compare the data
What is a sequence generator? Write its syntax
We can use the INTERSECT operator
What performance issues did you face while testing
If the sources are tables and flat files, we can use the joiner transformation
I don't know the syntax. The sequence generator will generate a sequence of numbers
Code issues: with more joins and lookups, the mapping takes more time, and sometimes if the server
is down the performance will go down
CREATE SEQUENCE sequence-name
START WITH initial-value
INCREMENT BY increment-value
MAXVALUE maximum-value
CYCLE | NOCYCLE;
Ft_claims:
dim_provider_id dim_member_id dim_claim_type_id dim_claim_id dim_date total_claims total_claims_approved
If the measure (fact: total_claims) can be summed across all dimension keys, it is an additive fact.
That is, the measure can be aggregated over every dimension key present in the fact table.
________________
If the measure can be summed across only some of the dimension keys, it is a semi-additive fact.
Ex: a measure that can be summed by dim_claim_type_id, dim_provider_id and dim_date, but not across the remaining keys.
______________
If the measure cannot be summed across any of the dimension keys in the fact table, it is a non-additive fact.
Ex: percentage of claims
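To make the additive versus non-additive distinction concrete, here is a small sketch with Python's sqlite3 module. It uses a cut-down Ft_claims table with one dimension key and invented numbers: the claim counts can simply be summed, while the approval percentage has to be recomputed from the summed counts.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Cut-down version of Ft_claims; the row values are invented for illustration.
conn.execute("""
    CREATE TABLE ft_claims (
        dim_provider_id INTEGER,
        total_claims INTEGER,
        total_claims_approved INTEGER
    )
""")
conn.executemany(
    "INSERT INTO ft_claims VALUES (?, ?, ?)",
    [(1, 10, 5), (2, 90, 81)],
)

row = conn.execute("""
    SELECT SUM(total_claims),                                          -- additive
           SUM(total_claims_approved),                                 -- additive
           100.0 * SUM(total_claims_approved) / SUM(total_claims),     -- correct overall %
           AVG(100.0 * total_claims_approved / total_claims)           -- naive average of row %
    FROM ft_claims
""").fetchone()

claims, approved, overall_pct, avg_of_row_pct = row
print(claims, approved, overall_pct, avg_of_row_pct)
```

The overall percentage (86) differs from the average of the per-provider percentages (70), which is why a percentage is treated as non-additive and is derived from additive measures at query time.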
A database link is a schema object in one database that enables you to access objects on another
database. The other database need not be an Oracle Database system. However, to access non-Oracle
systems you must use Oracle Heterogeneous Services.
Once data is moved to staging, the files should be moved to the archive folder. In case of any invalid
files, the files will be moved into the reject folder and the respective error should be present in the error tables
In my project we receive data in the form of flat files or XMLs from the source. Through ETL jobs we
load data from the files to the staging layer
In facts we will have aggregated data
And coming to validation across all stages:
In staging 2 we apply business logic before loading to the data warehouse
And then the reporting team will generate reports from the DWH
And from staging 2 we load data into dimensions and facts through ETL
And then stg2 to DWH
File to staging validations:
Duplicate check
Structure validation
Null check
Invalid data check
Business logic
Negative scenarios
And then I will check data between stg1 and stg2. The below are the validations:
Aggregated data validation based on design
SCD Type 2 validations
Dimension to fact validations
Surrogate key validations
Using a MINUS query, I will check whether the data is matching or not
And then I will verify the count and data
Since the source is a flat file, I will create a temporary table in staging and import the flat file data into
that table
Here in the fact we will have aggregated data. To make sure the measures are calculated properly, I will
apply the transformation logic in the source query and then do a MINUS with the target
Error validations
If the records match, the logic is fine. If not, I will check the data; if it is a defect, I will raise
the defect in JIRA
Count check
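Several of the checks listed above (count, duplicate, null, and the MINUS comparison) can be sketched in a few lines. This example uses Python's sqlite3 module with hypothetical tables `stg` and `tgt` and invented rows; SQLite spells Oracle's MINUS as EXCEPT.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE stg (id INTEGER, name TEXT);   -- hypothetical staging table
CREATE TABLE tgt (id INTEGER, name TEXT);   -- hypothetical target table
INSERT INTO stg VALUES (1, 'swapna'), (2, 'harshita'), (3, NULL);
INSERT INTO tgt VALUES (1, 'swapna'), (1, 'swapna'), (2, 'harshita');
""")


def rows(sql):
    """Run a query and return all rows as a list of tuples."""
    return conn.execute(sql).fetchall()


# Count check: compare row counts on both sides.
counts = rows("SELECT (SELECT COUNT(*) FROM stg), (SELECT COUNT(*) FROM tgt)")[0]

# Duplicate check: any row loaded more than once in the target.
duplicates = rows("""
    SELECT id, name, COUNT(*) FROM tgt
    GROUP BY id, name HAVING COUNT(*) > 1
""")

# Null check on a column assumed to be mandatory.
null_names = rows("SELECT COUNT(*) FROM stg WHERE name IS NULL")[0][0]

# Data check: rows present in staging but missing from the target.
missing_in_target = rows("SELECT id, name FROM stg EXCEPT SELECT id, name FROM tgt")

print(counts, duplicates, null_names, missing_in_target)
```

In this sample the counts match (3 and 3) even though the target has a duplicate and is missing a row, which is why the count check is followed by the data comparison.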
* How to find the even-numbered and odd-numbered records
Select * from (select Emp.*, row_number() over (order by eid) as rnm from Emp) a where a.rnm % 2 = 0
--- even (alternatively: mod(a.rnm, 2) = 0)
Select * from (select Emp.*, row_number() over (order by eid) as rnm from Emp) a where a.rnm % 2 = 1
--- odd (alternatively: mod(a.rnm, 2) = 1)
* I want the details of customers who have the maximum billing for each customer type
* Details of customers who have duplicate bill numbers
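The even/odd row queries above can be checked with a short sketch in Python's sqlite3 module (window functions need SQLite 3.25 or newer). The eid values are invented, and the helper name `rows_by_parity` is hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE emp (eid INTEGER)")
conn.executemany("INSERT INTO emp VALUES (?)", [(10,), (20,), (30,), (40,), (50,)])


def rows_by_parity(remainder):
    """Return eids whose row number (ordered by eid) has the given remainder mod 2."""
    sql = """
        SELECT eid FROM (
            SELECT emp.*, ROW_NUMBER() OVER (ORDER BY eid) AS rnm FROM emp
        ) a
        WHERE a.rnm % 2 = ?
        ORDER BY eid
    """
    return [r[0] for r in conn.execute(sql, (remainder,))]


even_rows = rows_by_parity(0)  # 2nd, 4th, ... rows
odd_rows = rows_by_parity(1)   # 1st, 3rd, ... rows
print(even_rows, odd_rows)
```

The parity refers to the row position produced by ROW_NUMBER, not to the eid value itself, so the ORDER BY inside OVER decides which rows count as even or odd.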
1. Difference between the WHERE and HAVING clauses
2. Tell me about your project/architecture/tools you are using.
3. What are fact and dimension tables
4. How will you validate incremental and full loads
5. DML, DCL queries / different queries.
6. Rate yourself in writing SQL queries?
7. SQL queries: find duplicates, find top 10 duplicates, find the manager name for an employee using
self join. -- To be done.
8. What validations will you check to confirm that a payment is correctly done? In general,
what validations/test cases will you do?
9. What are the different DML queries?
10. Use SUBSTR in a query
11. Query to delete duplicate records?
12. How good are you in UNIX? How will you open a file / search a file using the grep command / get the
count of records in a file?
13. SCD to keep historical data?
14. Difference between DROP/TRUNCATE/DELETE.
15. Real-life examples of OLAP and OLTP?
16. Explain why you would need ETL?
17. In a bank, after a whole day of transactions where money is credited and debited multiple
times, the current value is calculated as a fact. What kind of fact is this? -- Semi-additive.
18. Types of fact and dimension?
19. What are the different transformations done in the ETL process?
20. What are the different constraints used in a table?
21. What is data mining?
22. What is the difference between connected and unconnected lookup?
23. What is the difference between a data mart and a data warehouse?
24. Difference between PowerCenter and IICS?
25. Difference between DECODE and CASE?
26. What are the different analytical functions?
27. What is the agile process?
28. What are normalization and denormalization?
29. Why is a stage table used?
30. What are constraints?
31. 100:aditi:midnapore
2000:amit:kolkata Query to return only the name?
32. Update multiple rows in a single query?
33. Difference between a dimension table and a lookup table?
34. What are the ceremonies of the agile process?
35. Why would you migrate on-premises data to the cloud?

36. What are smoke testing and sanity testing?
37. Print your name in vertical order.
38. Print the running sum of a number up to each row.
