

Interview Questions ETL

1. Tell me about yourself.

2. Explain the project architecture.

3. How do you validate the data? Give some business scenarios.

4. Explain negative and positive testing in the context of ETL.

5. What is the difference between DELETE, TRUNCATE and DROP?

6. In which scenarios do we use DELETE and TRUNCATE?

7. How do you find duplicate records?

8. If you have DBA permission, how will you create a backup table?

9. How do you delete duplicate records?

10. If you don't have proper documents such as the STM (source-to-target mapping) document, how will you validate the data? What is your approach?

11. Give severity and priority examples from your project; explain all possible scenarios.

12. Explain the defect life cycle.

13. If a defect cannot be completed in the current sprint, what status will it change to?

14. Write a self-join query (the interviewer gave one scenario).

15. Name some Unix commands.

2nd round

1. Tell me about yourself.

2. What validations will you do?

3. Which schema are you using in the DWH?

Difference between star schema and snowflake schema.

4. Why is the star schema used most often? Why not snowflake?

5. Some SQL queries.

6. Some general questions.

-------------------------------------------*********************--------------------------------------

1. Project architecture.

2. Severity and priority in manual and ETL testing, for all combinations of severity and priority.

3. Once data is loaded into your environment, what tests do you do?

4. Levels of testing.

5. Defect life cycle.

6. Why do we set the 'Deferred' status, and why can't the defect be fixed in the current sprint?

7. Difference between a database, a data warehouse (DWH) and a data mart (DM).

8. Some queries.

9. Some SQL functions, with an explanation and example.

10. Team size (in a project).

11. CAST function.

12. Snowflake and star schema: which is used most, and why is it used instead of the other?

2nd round

1.Defect life cycle

2. Given a lengthy business scenario, which type would you use, Type 1 or Type 2, and why? Whichever type you choose, give the test cases.

-----------------------------------------***************************-------------------------------------------------

1. SELECT COUNT(*) FROM emp

If the count is not returned, what will you do in Teradata?

2. Find the 16th highest salary in the Physics department.

3. How will you remove duplicate records?

4. How will you display duplicate records?

5. You have a table that contains both original and duplicate records. Route the original and duplicate records into two separate tables.

6. Find the employee name and manager name.

7. Given the rows below, write the left, right and full joins.

1 0
0 1
1 0
0 1

8. What is the difference between a volatile table and a non-volatile table?

9. How do you find the tables that a created view depends on?

10. Joins: all types of joins.

11. Lookup, referential integrity.

12. Subqueries.

13. About the project and yourself.

------------------------------------------*********************-------------------------------------------

Cognizant first round

1) Tell me about yourself.

2) Explain your project architecture at a high level.

3) Write a query to find the people who were hired before 2021.

4) Write a query to find the employees who were hired in 2018 and 2019.

5) Why do you use TO_DATE?

6) Difference between TO_DATE and TO_CHAR?

7) Write a query to join 3 tables; the third table should be joined with a left outer join.

8) What is the significance of a left join? Explain with examples.

9) What are self join and inner join?

10) What is the purpose of a self join? Give examples.

11) Query optimization.

12) Which do you prefer, a join or a subquery? Why?

13) Difference between PARTITION BY and GROUP BY?

14) Roles and responsibilities as a tester.

15) What validations do you perform in staging?

16) Challenges in DWH?

17) What validations do you perform in transformation testing?

18) Importance of metadata testing.

19) Where do you check the column count?

20) What are triggers in SQL?

21) What validations do you perform for SCD?

22) Types of facts, with examples.

23) Types of fact tables.

24) What is a cumulative fact?

25) Dimensional modeling types.

26) Different tabs in ALM?

27) How do you log a defect?

28) As a tester, what concerns you most in the defect life cycle?



--------------------*************************------------------------------------------------

Interview questions sent by Swapna

[0:49 pm, 28/01/2022] Swapna Nsr: Question: I want the details of the customers who have the maximum billing for each customer type

[0:49 pm, 28/01/2022] Swapna Nsr: Question 2: Details of the customers who have a duplicate bill number

[0:49 pm, 28/01/2022] Swapna Nsr: 1) Tell me about yourself

[0:49 pm, 28/01/2022] Swapna Nsr: Questions asked in Accion Labs

[0:49 pm, 28/01/2022] Swapna Nsr: 6) Explanation of SCD Type 1 and SCD Type 2. The interviewer gave me one requirement, asked me to explain the validations in detail, and asked about different scenarios

[0:49 pm, 28/01/2022] Swapna Nsr: And the interviewer wanted to display the sum(amount) month-wise

[0:49 pm, 28/01/2022] Swapna Nsr: 5) In a table the columns are product ID, amount, date (in the form dd/mm/yyyy)

[0:49 pm, 28/01/2022] Swapna Nsr: 2) How to verify session logs and where to see session logs

[0:49 pm, 28/01/2022] Swapna Nsr: 4) Write the syntax of the LEAD and LAG analytical functions and explain their functionality in detail

[0:49 pm, 28/01/2022] Swapna Nsr: 3) What validations are covered

[0:49 pm, 28/01/2022] Swapna Nsr: 8) Have you deployed code to test?

[0:49 pm, 28/01/2022] Swapna Nsr: 7) What is a release doc

[0:49 pm, 28/01/2022] Swapna Nsr: 11) There is a table called transaction and the data is like below

Id  amount  year
1   10      01-01-2018
2   100     01-02-2018
...

He wanted me to display sum(amount) year-wise
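One possible answer to the year-wise sum, sketched with Python's built-in sqlite3 module so it can be run as-is. The table name `txn`, the column name `txn_date` and the third row are made up for illustration, and the date is assumed to be stored as DD-MM-YYYY text. On Oracle, with a real DATE column, the year would more likely come from EXTRACT(YEAR FROM ...) or TO_CHAR(..., 'YYYY').

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE txn (id INTEGER, amount INTEGER, txn_date TEXT)")
conn.executemany(
    "INSERT INTO txn VALUES (?, ?, ?)",
    [
        (1, 10, "01-01-2018"),
        (2, 100, "01-02-2018"),
        (3, 40, "15-06-2019"),  # hypothetical extra row so two years appear
    ],
)

# The date is DD-MM-YYYY text, so the year is the 4 characters from position 7.
yearly_totals = conn.execute(
    "SELECT substr(txn_date, 7, 4) AS yr, SUM(amount) AS total "
    "FROM txn GROUP BY yr ORDER BY yr"
).fetchall()
print(yearly_totals)
```

The month-wise variant asked in question 5 follows the same pattern, grouping on the month part of the date as well.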

[0:49 pm, 28/01/2022] Swapna Nsr: 10) Table 1 has record count = 100 and table 2 has record count = 300. There is no common column between these 2 tables. The interviewer asked me to display count of table 1 + count of table 2 (400) in one column, ex:

count

400

[0:49 pm, 28/01/2022] Swapna Nsr: 9) Explain the agile process

[0:49 pm, 28/01/2022] Swapna Nsr: 16) He asked me to write a query to display yesterday's loaded data. He also mentioned not to use a static value (like the date 03-21-2018)

[0:49 pm, 28/01/2022] Swapna Nsr: 13) There are 2 date columns in the format DDMMYYYY, and he asked me to get the difference between the 2 date columns in seconds

[0:49 pm, 28/01/2022] Swapna Nsr: 15) A column has positive and negative values. He asked me to display sum(positive) in one column and sum(negative) in another column

[0:49 pm, 28/01/2022] Swapna Nsr: 14) Star schema and snowflake schema explanation

[0:49 pm, 28/01/2022] Swapna Nsr: SELECT SUM(cnt) FROM
(SELECT COUNT(*) AS cnt FROM tab1
UNION ALL
SELECT COUNT(*) AS cnt FROM tab2) result
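The combined-count query can be checked quickly with Python's built-in sqlite3 module. This is only a sketch: the column names `a` and `b` and the generated rows are placeholders, chosen so the two tables share no common column and hold 100 and 300 rows.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE tab1 (a INTEGER)")
conn.execute("CREATE TABLE tab2 (b TEXT)")
conn.executemany("INSERT INTO tab1 VALUES (?)", [(i,) for i in range(100)])
conn.executemany("INSERT INTO tab2 VALUES (?)", [(str(i),) for i in range(300)])

# Stack the two counts with UNION ALL, then add them up in the outer query.
total = conn.execute(
    "SELECT SUM(cnt) FROM ("
    " SELECT COUNT(*) AS cnt FROM tab1"
    " UNION ALL"
    " SELECT COUNT(*) AS cnt FROM tab2"
    ") AS result"
).fetchone()[0]
print(total)
```

UNION ALL matters here: plain UNION removes duplicate rows, so if both tables happened to have the same count only one of the two counts would be summed.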

[0:49 pm, 28/01/2022] Swapna Nsr: Query: select * from Target where load_date in (select
max(load_date) from Target)

[0:49 pm, 28/01/2022] Swapna Nsr: 12) In a table the data is like DDMMYYYY and he wanted me to display the data in the format DD-YYYY-MON

[0:49 pm, 28/01/2022] Swapna Nsr: 17) Difference between the NVL and COALESCE functions

[0:49 pm, 28/01/2022] Swapna Nsr: 1) The data in source

1 swapna
2 harshita

In the target the data looks like

1 swapna
1 swapna
2 harshita
2 harshita

Write a query on the target side
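A minimal sketch of a target-side duplicate check, run through Python's sqlite3 module. The table name `target` and the column names `id` and `name` are assumptions; the rows are the ones from the scenario.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE target (id INTEGER, name TEXT)")
conn.executemany(
    "INSERT INTO target VALUES (?, ?)",
    [(1, "swapna"), (1, "swapna"), (2, "harshita"), (2, "harshita")],
)

# Group on every column that defines a record; any group with more than
# one row is a duplicate.
duplicates = conn.execute(
    "SELECT id, name, COUNT(*) AS cnt "
    "FROM target GROUP BY id, name HAVING COUNT(*) > 1 ORDER BY id"
).fetchall()
print(duplicates)
```

An empty result from this query means the target has no duplicates for the grouped columns.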

[0:49 pm, 28/01/2022] Swapna Nsr: 4) The data in source

Table1  Table2
Col1    Col2
100     1
200     2

In target

101
202

Write a query

[0:49 pm, 28/01/2022] Swapna Nsr: 7) I have data in a table as below

Id  sal
A   100
B   200
C   50

Update the salary: for employees with sal > 100, update it to 1

And for sal < 100, update it to 0 in the table. Write a query
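One way to do this in a single statement is UPDATE with a CASE expression. The sketch below uses Python's sqlite3 module; the table name `emp` is an assumption. The question does not say what happens when sal is exactly 100, so this version leaves that row unchanged.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE emp (id TEXT, sal INTEGER)")
conn.executemany("INSERT INTO emp VALUES (?, ?)", [("A", 100), ("B", 200), ("C", 50)])

# CASE picks the new value row by row; ELSE sal keeps sal = 100 as it is.
conn.execute(
    "UPDATE emp SET sal = CASE "
    " WHEN sal > 100 THEN 1 "
    " WHEN sal < 100 THEN 0 "
    " ELSE sal END"
)
updated = conn.execute("SELECT id, sal FROM emp ORDER BY id").fetchall()
print(updated)
```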

[0:49 pm, 28/01/2022] Swapna Nsr: 3) the data is in source

Swapna

The data in Target

Swapna**

[0:49 pm, 28/01/2022] Swapna Nsr: 6) The data in source is like

Table1  Table2
Col1    Col2
200     1
300     2

The data in target is like

2001
3002

[0:49 pm, 28/01/2022] Swapna Nsr: 2) The data in source is like

Xyz123
12abc

Target

Xyz
Abc

Write a query

[0:49 pm, 28/01/2022] Swapna Nsr: 5) The data in source

Id  year  name
A   2012  Sony
A   2014  Sony
A   2016  Sony
B   2000  Jyo
B   2001  Jyo
B   2010  Jyo

The data in target is like

A   2016  Sony
B   2010  Jyo

Write a query
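For the latest-year-per-Id scenario, a plain GROUP BY with MAX is probably enough. A small sketch with Python's sqlite3 module follows; the table name `src` is an assumption, the rows are the ones given above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE src (id TEXT, year INTEGER, name TEXT)")
conn.executemany(
    "INSERT INTO src VALUES (?, ?, ?)",
    [
        ("A", 2012, "Sony"), ("A", 2014, "Sony"), ("A", 2016, "Sony"),
        ("B", 2000, "Jyo"), ("B", 2001, "Jyo"), ("B", 2010, "Jyo"),
    ],
)

# One output row per (id, name), carrying the largest year in that group.
latest_per_id = conn.execute(
    "SELECT id, MAX(year) AS latest_year, name "
    "FROM src GROUP BY id, name ORDER BY id"
).fetchall()
print(latest_per_id)
```

This works because name is constant within each Id. If other columns varied per row, a ROW_NUMBER() or join-on-MAX approach would be needed to pick the whole latest row.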

Difference between CASE and DECODE

Difference between a function and a procedure. What is a function?

What is a procedure?

Some positive values and some negative values were given, and those values should be displayed in different columns

Jyoshna: display the 5th and 6th characters of that string

Jyoshna kancham@gmail.com: each word should go into a different column

Top 10 records using the TOP function

What is RANK and what is DENSE_RANK? Where will we use them?

What is the difference between DROP and TRUNCATE?

What is TCL? What commands are in it?

What are SAVEPOINT, COMMIT and ROLLBACK?

What is a cube?

What is BI testing? Have you worked on it?

How many types of joins are there? What is a full outer join?

What are LEAD and LAG? Where will we use them?

How will you get the duplicate records?

What are Hive, Sqoop, map, reduce?

What are the commands in Unix? Tell me.

How will you search a file using the grep command?

What is agile methodology?

What is the V-model?

What is STLC?

What is the defect life cycle? Tell me about it.

What is integration testing?

Tell me the architecture of your project.

Let us suppose 1000 records are in the source and only 500 records are loaded into the target. What is your approach?

Similarly, when the job is running and only some of the records are loaded, what is your approach?

What validations will you do in the landing area?

How will you check the header and footer, and the file name?

How will you differentiate the staging 1 and staging 2 tables? Do they contain the same tables or not? If they contain the same tables, how will you differentiate them?

What is a surrogate key, and what validations do you do on the surrogate key?

What is a mapping?

Where will you check the logs?

What is a traceability matrix?

Where will you do requirements mapping?

What transformations do you know?

What is a lookup transformation?

What is a router transformation? Where will we use it?

Difference between DROP and TRUNCATE

What is the V-model?

What is a subquery?

What is a correlated subquery?

Where will we use it?



What is a data mart?

data:

eid  ename  sal  deptno  date
1    sony   10   01      01012018
1    sony   15   01      02022018
2    jyo    20   01      02022018
2    jyo    20   01      03032018

Get only the latest month's data from emp

select * from (select emp.*, dense_rank() over (partition by eid order by date desc) as rk from emp) a where a.rk = 1

select * from emp e inner join (select eid, max(date) as latest_date from emp group by eid) m on m.eid = e.eid and m.latest_date = e.date

select * from emp e where e.rowid = (select max(e1.rowid) from emp e1 where e.eid = e1.eid)
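The window-function approach can be tried out with Python's sqlite3 module (window functions need SQLite 3.25 or newer, which is an assumption about the local install). In this sketch the date column is renamed `load_date` and stored as DDMMYYYY text, so it is rearranged into YYYYMMDD before sorting; with a real DATE column that step is not needed.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE emp (eid INTEGER, ename TEXT, sal INTEGER, deptno TEXT, load_date TEXT)"
)
conn.executemany(
    "INSERT INTO emp VALUES (?, ?, ?, ?, ?)",
    [
        (1, "sony", 10, "01", "01012018"),
        (1, "sony", 15, "01", "02022018"),
        (2, "jyo", 20, "01", "02022018"),
        (2, "jyo", 20, "01", "03032018"),
    ],
)

latest = conn.execute(
    """
    SELECT eid, ename, sal, load_date
    FROM (
        SELECT emp.*,
               ROW_NUMBER() OVER (
                   PARTITION BY eid
                   -- DDMMYYYY text -> YYYYMMDD so that text order is date order
                   ORDER BY substr(load_date, 5, 4)
                         || substr(load_date, 3, 2)
                         || substr(load_date, 1, 2) DESC
               ) AS rk
        FROM emp
    ) AS a
    WHERE a.rk = 1
    ORDER BY eid
    """
).fetchall()
print(latest)
```

ROW_NUMBER() returns exactly one row per eid; DENSE_RANK() would return every row that ties on the latest date.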

data:

eid  name    city
1    jyo     bng
1    jyo     chn
2    swapna  bng

select eid, count(distinct city) from emp where city in ('bng','chn')
group by eid
having count(distinct city) = 2
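A quick check of the "present in both cities" query using Python's sqlite3 module, with the three rows from the data above. Counting distinct cities in the HAVING clause is a small hardening assumed here: it keeps the result correct even if the same (eid, city) row is loaded twice.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE emp (eid INTEGER, name TEXT, city TEXT)")
conn.executemany(
    "INSERT INTO emp VALUES (?, ?, ?)",
    [(1, "jyo", "bng"), (1, "jyo", "chn"), (2, "swapna", "bng")],
)

# Keep only the two cities of interest, then require both per employee.
in_both_cities = conn.execute(
    "SELECT eid FROM emp WHERE city IN ('bng', 'chn') "
    "GROUP BY eid HAVING COUNT(DISTINCT city) = 2"
).fetchall()
print(in_both_cities)
```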

select case when sal > 0 then sal end as positive_value,
case when sal < 0 then sal end as negative_value
from emp
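Both variants of the positive/negative question (values side by side, and the two sums) can be sketched with Python's sqlite3 module. The four sample values are made up for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE emp (sal INTEGER)")
conn.executemany("INSERT INTO emp VALUES (?)", [(10,), (-5,), (20,), (-15,)])

# Variant 1: split each value into a positive or a negative column.
split_rows = conn.execute(
    "SELECT CASE WHEN sal > 0 THEN sal END AS positive_value, "
    "       CASE WHEN sal < 0 THEN sal END AS negative_value "
    "FROM emp ORDER BY rowid"
).fetchall()

# Variant 2: one row holding the sum of positives and the sum of negatives.
sums = conn.execute(
    "SELECT SUM(CASE WHEN sal > 0 THEN sal ELSE 0 END) AS positive_total, "
    "       SUM(CASE WHEN sal < 0 THEN sal ELSE 0 END) AS negative_total "
    "FROM emp"
).fetchone()
print(split_rows, sums)
```

A CASE with no ELSE yields NULL, which is why the unused column in each row of the first variant comes back empty.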

How do you use the joiner transformation in your project?

There are 1000 records in the source and 800 records loaded in the target. What is your approach?

How will you find the missing records in the target table without using MINUS?

What is another way of mapping the columns without using MINUS?

Are the records loaded as per the business logic or not?

Write the syntax to extract data from 2 tables in different schemas.

Ans: If both schemas are on the same server, we can join the 2 tables. Otherwise we can extract both tables into Excel and then compare the data.

What is a sequence generator? Write its syntax.

We can use the INTERSECT operator

What performance issues do you face while testing?

If the sources are tables and flat files, we can use the joiner transformation

I don't know the syntax. The sequence generator generates a sequence of numbers

Code issues: with more joins and lookups the mapping takes more time, and sometimes if the server is down the performance will go down

CREATE SEQUENCE sequence-name

START WITH initial-value

INCREMENT BY increment-value

MAXVALUE maximum-value

CYCLE | NOCYCLE;

Ft_claims:

dim_provider_id dim_member_id dim_claim_type_id dim_claim_id dim_date total_claims total_claims_approved

Additive fact: a measure (for example total_claims) that can be summed across all the dimension keys present in the fact.

________________

Semi-additive fact: a measure that can be summed across some dimension keys but not across others.

ex: the measure can be summed over dim_claim_type_id, dim_provider_id and dim_date, but not over the remaining keys

______________

Non-additive fact: a measure that cannot be summed across any dimension key.

ex: percentage of claims
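To see why a percentage is non-additive, here is a small sketch with Python's sqlite3 module. The two provider rows and their numbers are invented, and only three of the Ft_claims columns are kept. The counts can be summed directly, but the approval percentage has to be recomputed from the summed counts; combining the per-row percentages gives a different, misleading figure.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE ft_claims ("
    " dim_provider_id TEXT, total_claims INTEGER, total_claims_approved INTEGER)"
)
conn.executemany(
    "INSERT INTO ft_claims VALUES (?, ?, ?)",
    [("P1", 100, 80), ("P2", 50, 10)],  # hypothetical sample rows
)

# Additive measures: summing the counts across providers is valid.
totals = conn.execute(
    "SELECT SUM(total_claims), SUM(total_claims_approved), "
    "       100.0 * SUM(total_claims_approved) / SUM(total_claims) "
    "FROM ft_claims"
).fetchone()

# Non-additive measure: averaging the per-provider percentages is not the
# overall percentage.
naive_pct = conn.execute(
    "SELECT AVG(100.0 * total_claims_approved / total_claims) FROM ft_claims"
).fetchone()[0]
print(totals, naive_pct)
```

Here the true overall approval rate is 60%, while the average of the two row-level percentages is 50%.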

A database link is a schema object in one database that enables you to access objects on another database. The other database need not be an Oracle Database system. However, to access non-Oracle systems you must use Oracle Heterogeneous Services.

Once data is moved to staging, the files should be moved to the archive folder. Any invalid files will be moved to the reject folder, and the respective error should be present in the error tables.

In my project we receive data from the source as flat files or XML files. ETL jobs load the data from the files into the staging layer.

In facts we will have aggregated data.

Coming to validation across all stages:

In staging2 we apply business logic before loading to the data warehouse.

The reporting team then generates reports from the DWH.

From staging2 we load data into dimensions and facts through ETL.

And then stg2 to DWH.

File to staging validations:

Structure validation

Duplicate check

Null check

Invalid data check

Business logic

Negative scenarios

Then I will check data between stg1 and stg2. The validations are below:

Aggregated data validation based on design

SCD Type 2 validations

Dimension to fact validations

Surrogate key validations

Using a MINUS query, I will check whether the data matches

And then I will verify count and data

Since the source is a flat file, I will create a temporary table in staging and import the flat-file data into that table

In the fact we have aggregated data. To make sure the measures are calculated properly, I apply the transformation logic in the source query and then do a MINUS with the target

Error validations

If the records match, the logic is fine. If not, I will check the data, and if it is a defect I will raise it in JIRA

Count check

* How to find the even-numbered and odd-numbered records

select * from (select emp.*, row_number() over (order by eid) as rnm from emp) a where mod(a.rnm, 2) = 0 ---even (or a.rnm % 2 = 0 where the % operator is supported)

select * from (select emp.*, row_number() over (order by eid) as rnm from emp) a where mod(a.rnm, 2) = 1 ---odd (or a.rnm % 2 = 1)
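The even/odd row split can be tried with Python's sqlite3 module (window functions need SQLite 3.25 or newer). The eid values 10 to 50 are invented and deliberately not 1 to 5, to show that the split is on the row number and not on the key value. SQLite supports the % operator, which is why it is used here instead of MOD().

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE emp (eid INTEGER, ename TEXT)")
conn.executemany(
    "INSERT INTO emp VALUES (?, ?)",
    [(10, "a"), (20, "b"), (30, "c"), (40, "d"), (50, "e")],
)

# Number the rows by eid once, then filter on the remainder of rnm / 2.
numbered = "SELECT emp.*, ROW_NUMBER() OVER (ORDER BY eid) AS rnm FROM emp"
even_rows = conn.execute(
    f"SELECT eid FROM ({numbered}) AS a WHERE a.rnm % 2 = 0 ORDER BY a.rnm"
).fetchall()
odd_rows = conn.execute(
    f"SELECT eid FROM ({numbered}) AS a WHERE a.rnm % 2 = 1 ORDER BY a.rnm"
).fetchall()
print(even_rows, odd_rows)
```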

* I want the details of the customers who have the maximum billing for each customer type

* Details of the customers who have a duplicate bill number

1. Difference between the WHERE and HAVING clauses

2. Tell me about your project/architecture/tools you are using.

3. What are fact and dimension tables?

4. How will you validate incremental and full loads?

5. DML, DCL Queries/ different queries.

6. Rate yourself in writing SQL queries?

7. SQL Queries: Find Duplicates, Find top 10 duplicates, Find manager name to an employee using self join. -- To be done.

8. What are the validations you will check to validate payment is correctly done or not? In General
what validations/Test cases will you do?

9. what are different DML Queries?

10. Use Substr in a query

11. Query to delete duplicate records?

12. How good are you in UNIX? how will you open file/search a file using Grep function/get the
count of records in a file?

13. SCD to keep historical data?

14. Difference between DROP/TRUNCATE/DELETE.

15. Real life example of OLAP and OLTP?

16. Explain why would you need ETL?

17. In a bank, after a whole day of transactions where money is credited and debited multiple times, the current value is calculated as a fact. What kind of fact is this? -- Semi-additive.

18. Types of fact and dimension?

19. What are different transformations done in ETL process?

20. What are different constraints used in a table?

21. What is data mining?

22. What is the difference between connected and unconnected lookup?

23. What is the difference between data mart and data warehouse?

24. Difference between Powercenter and IICS?

25. Difference between DECODE and CASE Condition?

26. what are different analytical functions?

27. what is the Agile process?

28. what is normalization and de normalization?

29. why Stage table is used?

30. What are constraints?

31. Given the values 100:aditi:midnapore and 2000:amit:kolkata, write a query to return only the name.

32. update multiple rows in single query?

33. difference between dimension table and lookup table?

34. What are the Ceremonies of agile process?

35. Why would you migrate on-premises data to the cloud?



36. What is smoke testing and sanity testing?

37. Print your name in vertical order.

38. print summation of number till that row.
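For question 38, a running total is usually written as SUM() used as a window function. A minimal sketch with Python's sqlite3 module follows (window functions need SQLite 3.25 or newer); the table `nums` and its four rows are invented for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE nums (id INTEGER, n INTEGER)")
conn.executemany("INSERT INTO nums VALUES (?, ?)", [(1, 10), (2, 20), (3, 30), (4, 40)])

# SUM() OVER (ORDER BY id) adds up n from the first row through the current row.
running = conn.execute(
    "SELECT id, n, SUM(n) OVER (ORDER BY id) AS running_total "
    "FROM nums ORDER BY id"
).fetchall()
print(running)
```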
