0% found this document useful (0 votes)

561 views16 pages

Flipkart Analyst Interview Insights

The document outlines the interview process for a Business Analyst position at Flipkart, detailing key SQL and Python questions, data handling strategies, and estimation techniques. It includes examples of SQL queries for window functions, indexing, and data retrieval, as well as Python functions for handling missing data and finding sequences. Additionally, it discusses case studies and managerial questions related to project management and team disagreements.

Uploaded by

sathish

We take content rights seriously. If you suspect this is your content, claim it here.

Available Formats

Download as PDF, TXT or read online on Scribd

0% found this document useful (0 votes)

561 views16 pages

Flipkart Analyst Interview Insights

Uploaded by

sathish

We take content rights seriously. If you suspect this is your content, claim it here.

Available Formats

Download as PDF, TXT or read online on Scribd

FLIPKART Business Analyst Interview

Experience (1-3 years)

CTC - 14 LPA
--------------------------------------------------------------

SQL
1. What are window functions, and how do they differ from aggregate
functions? Can you give a use case?
Explanation:

• Window Functions perform calculations across a set of table rows that are related
to the current row, without collapsing the rows into a single summary result.

• Aggregate Functions, on the other hand, return a single result for a group of rows,
reducing the number of rows in the result set.

Use Case:

If you want to calculate a running total or rank without losing the row-level granularity,
window functions are useful.

Example Query:

Use Case: Calculate the running total of sales for each salesperson.

Schema:

CREATE TABLE Sales (

SalesID INT,

SalesPerson VARCHAR(50),

SaleAmount INT,

SaleDate DATE
);

INSERT INTO Sales VALUES

(1, 'Alice', 200, '2025-01-01'),

(2, 'Alice', 150, '2025-01-02'),

(3, 'Bob', 300, '2025-01-01'),

(4, 'Bob', 100, '2025-01-03'),

(5, 'Alice', 250, '2025-01-03');

Query:

SELECT

SalesPerson,

SaleAmount,

SaleDate,

SUM(SaleAmount) OVER (PARTITION BY SalesPerson ORDER BY SaleDate) AS

RunningTotal

FROM Sales;

Result:

SalesPerson SaleAmount SaleDate RunningTotal

Alice 200 2025-01-01 200

Alice 150 2025-01-02 350

Alice 250 2025-01-03 600

Bob 300 2025-01-01 300

Bob 100 2025-01-03 400

2. Explain indexing. When would an index potentially reduce
performance, and how would you approach indexing strategy for a large
dataset?
Explanation:

• An index improves the speed of data retrieval operations by creating a structure (like
a B-tree) for faster lookups.

• Downside of Indexing:

o Indexes increase storage requirements.

o They slow down write operations (INSERT, UPDATE, DELETE) because the
index needs to be updated.

When Indexing May Reduce Performance:

• Small Tables: Full table scans are often faster than index lookups.

• Frequent Updates: When the table has frequent write operations, maintaining
indexes adds overhead.

• Unused Indexes: An index on columns rarely queried wastes resources.

Indexing Strategy for Large Datasets:

1. Index Frequently Queried Columns: Use indexes on columns often used in

WHERE, JOIN, GROUP BY, and ORDER BY clauses.

2. Avoid Over-Indexing: Index only what is necessary.

3. Composite Indexes: Create multi-column indexes for queries involving multiple

columns.

4. Regular Monitoring: Use tools like EXPLAIN or Query Analyzer to ensure indexes are
effective.

Example:

CREATE INDEX idx_customer_id ON Orders (CustomerID);

3. Write a query to retrieve customers who have made purchases in the

last 30 days but did not purchase anything in the previous 30 days.
Example Query:

Schema:

CREATE TABLE Purchases (

PurchaseID INT,

CustomerID INT,

PurchaseDate DATE

);

INSERT INTO Purchases VALUES

(1, 101, '2024-12-15'),

(2, 102, '2024-12-25'),

(3, 101, '2024-11-10'),

(4, 103, '2024-12-01'),

(5, 102, '2024-11-01');

Query:

WITH RecentPurchases AS (

SELECT CustomerID

FROM Purchases

WHERE PurchaseDate >= CURDATE() - INTERVAL 30 DAY

PreviousPurchases AS (

SELECT CustomerID

FROM Purchases

WHERE PurchaseDate BETWEEN CURDATE() - INTERVAL 60 DAY AND CURDATE() -

INTERVAL 30 DAY

)
SELECT DISTINCT CustomerID

FROM RecentPurchases

WHERE CustomerID NOT IN (SELECT CustomerID FROM PreviousPurchases);

4. Given a table of transactions, find the top 3 most purchased products

for each category.
Example Query:

Schema:

CREATE TABLE Transactions (

TransactionID INT,

ProductName VARCHAR(50),

Category VARCHAR(50),

Quantity INT

);

INSERT INTO Transactions VALUES

(1, 'ProductA', 'Category1', 10),

(2, 'ProductB', 'Category1', 5),

(3, 'ProductC', 'Category1', 20),

(4, 'ProductD', 'Category2', 15),

(5, 'ProductE', 'Category2', 10),

(6, 'ProductF', 'Category2', 25);

Query:

WITH RankedProducts AS (

SELECT

Category,
ProductName,

Quantity,

RANK() OVER (PARTITION BY Category ORDER BY Quantity DESC) AS Rank

FROM Transactions

SELECT

Category,

ProductName,

Quantity

FROM RankedProducts

WHERE Rank <= 3;

Result:

Category ProductName Quantity

Category1 ProductC 20

Category1 ProductA 10

Category1 ProductB 5

Category2 ProductF 25

Category2 ProductD 15

Category2 ProductE 10

5. How would you identify duplicate records in a large dataset, and how
would you remove only the duplicates, retaining the first occurrence?
Example Query:

Schema:

CREATE TABLE Employees (

EmployeeID INT,

Name VARCHAR(50),

Department VARCHAR(50)

);

INSERT INTO Employees VALUES

(1, 'Alice', 'HR'),

(2, 'Bob', 'Finance'),

(3, 'Alice', 'HR'),

(4, 'Charlie', 'IT'),

(5, 'Bob', 'Finance');

Query to Identify Duplicates:

SELECT Name, Department, COUNT(*)

FROM Employees

GROUP BY Name, Department

HAVING COUNT(*) > 1;

Query to Remove Duplicates (Retain First Occurrence):

WITH CTE AS (

SELECT

EmployeeID,

Name,

Department,

ROW_NUMBER() OVER (PARTITION BY Name, Department ORDER BY EmployeeID) AS

RowNum

FROM Employees

)
DELETE FROM Employees

WHERE EmployeeID IN (

SELECT EmployeeID

FROM CTE

WHERE RowNum > 1

);

PYTHON
1. Write a Python function to find the longest consecutive sequence of
unique numbers in a list.
Explanation:

The problem is to find the longest subarray where all the elements are unique and
consecutive. This can be solved using a sliding window technique:

1. Use a set to track unique elements in the current window.

2. Use two pointers (start and end) to expand and contract the window as needed.

3. Update the maximum length of the subarray when the condition is met.

Code:

def longest_consecutive_sequence(nums):

if not nums:

return 0, []

unique_set = set()

start = 0

max_length = 0

longest_seq = []
for end in range(len(nums)):

while nums[end] in unique_set:

unique_set.remove(nums[start])

start += 1

unique_set.add(nums[end])

current_length = end - start + 1

if current_length > max_length:

max_length = current_length

longest_seq = nums[start:end + 1]

return max_length, longest_seq

# Example usage

nums = [1, 2, 3, 1, 4, 5, 6, 2, 7, 8]

length, sequence = longest_consecutive_sequence(nums)

print("Longest Length:", length)

print("Longest Sequence:", sequence)

Example Output:

For the input [1, 2, 3, 1, 4, 5, 6, 2, 7, 8]:

mathematica

CopyEdit

Longest Length: 6

Longest Sequence: [1, 4, 5, 6, 2, 7]

2. If you’re working with a large dataset with missing values, what Python
libraries would you use to handle missing data, and why?
Libraries to Use:

1. pandas:

o Provides powerful tools like fillna(), dropna(), and isnull() to handle missing
data effectively.

o Suitable for structured datasets like dataframes.

2. numpy:

o Provides [Link] for identifying missing values in arrays and operations

like [Link]() to detect and manipulate them.

o Useful for numerical computations with arrays.

3. scikit-learn:

o Offers the SimpleImputer and IterativeImputer classes for statistical

imputation.

o Best for preprocessing data before machine learning tasks.

4. pyjanitor (optional):

o Built on top of pandas, it simplifies cleaning operations, including handling

missing data.

Examples:

Example Dataset:

import pandas as pd

import numpy as np

data = {

"Name": ["Alice", "Bob", "Charlie", None],

"Age": [25, [Link], 30, 22],

"Salary": [50000, 60000, None, 45000]

}

df = [Link](data)

print("Original Dataset:\n", df)

Example 1: Using pandas

# Drop rows with missing values

df_dropped = [Link]()

print("\nAfter Dropping Missing Values:\n", df_dropped)

# Fill missing values with a default value

df_filled = [Link]({

"Name": "Unknown",

"Age": df["Age"].mean(),

"Salary": df["Salary"].median()

})

print("\nAfter Filling Missing Values:\n", df_filled)

Example 2: Using scikit-learn

from [Link] import SimpleImputer

# Impute missing values in the "Age" and "Salary" columns

imputer = SimpleImputer(strategy="mean")

df[["Age", "Salary"]] = imputer.fit_transform(df[["Age", "Salary"]])

print("\nAfter Imputation Using Scikit-learn:\n", df)

Example Output:

Original Dataset:

Name Age Salary

0 Alice 25.0 50000.0

1 Bob NaN 60000.0

2 Charlie 30.0 NaN

3 None 22.0 45000.0

After Dropping Missing Values:

Name Age Salary

0 Alice 25.0 50000.0

After Filling Missing Values:

Name Age Salary

0 Alice 25.000000 50000.0

1 Bob 25.666667 60000.0

2 Charlie 30.000000 47500.0

3 Unknown 22.000000 45000.0

After Imputation Using Scikit-learn:

Name Age Salary

0 Alice 25.0 50000.0

1 Bob 25.7 60000.0

2 Charlie 30.0 47500.0

3 None 22.0 45000.0

Guesstimates
1. Estimate the number of online food delivery orders in a large
metropolitan city over a month:
• Assume population:
o Large metropolitan city population ≈ 10 million.

o Online food delivery users ≈ 40% (4 million).

• Assume order frequency per user:

o Regular users (20%): 15 orders/month.

o Occasional users (80%): 4 orders/month.

• Estimate orders:

o Regular users: 0.2 × 4M × 15 = 12M orders.

o Occasional users: 0.8 × 4M × 4 = 12.8M orders.

• Total monthly orders:

o 12M + 12.8M = 24.8M orders/month.

2. How many customer service calls would a telecom company receive

daily for a customer base of 1 million?
• Assume complaint rate:

o Active users (95% of 1M): 950,000.

o Daily issue rate: 2%.

• Breakdown of issues:

o Billing (30%): 5,700 calls/day.

o Network (40%): 7,600 calls/day.

o Other (30%): 5,700 calls/day.

• Total daily calls:

o 950,000 × 0.02 = 19,000 calls/day.

Case Studies
1. A sudden decrease in conversion rate is observed in a popular product
category. How would you investigate the cause and propose solutions?
• Data Analysis:

o Analyze traffic sources for sudden drops.

o Compare conversion rates across platforms and devices.

o Study user demographics and behavior trends.

• Operational Factors:

o Check inventory and pricing issues.

o Identify recent policy changes (return policies, delivery fees).

o Monitor seasonal trends or competitor campaigns.

• Technical Investigation:

o Audit website or app performance (loading time, errors).

o Identify bugs in the checkout process.

• Solutions:

o Optimize product pages and pricing.

o Run A/B testing for checkout improvements.

o Launch targeted marketing campaigns.

[Link] the company is considering adding a new subscription model.

How would you evaluate its potential impact on customer lifetime value
and revenue?
• Market Research:

o Survey customer willingness to pay for subscriptions.

o Study competitor subscription offerings.

• Revenue Impact:

o Calculate incremental revenue from expected subscribers.

o Factor in cannibalization of existing one-time purchases.

• CLV Analysis:

o Assess changes in average customer tenure.

o Include upsell opportunities and churn rate reduction.

• Implementation Feasibility:

o Evaluate operational costs for managing subscriptions.

o Plan loyalty benefits and exclusivity perks.

Managerial Questions
1. Describe a time when you faced conflicting priorities on a project. How
did you manage your workload to meet deadlines?
Managing conflicting priorities on a project:

• Identify and prioritize:

o Break tasks into urgent vs. important.

o Align priorities with organizational goals.

• Delegate and negotiate:

o Reassign tasks to team members based on expertise.

o Negotiate extended deadlines or resource allocation.

• Time management:

o Use tools like Gantt charts or Kanban boards.

o Allocate focused time slots for high-priority tasks.

• Communicate effectively:

o Update stakeholders on progress and challenges.

o Seek guidance from managers to resolve bottlenecks.

2. How would you handle a disagreement within the team on an analytical

approach?
Handling a disagreement within the team on an analytical approach:

• Encourage open dialogue:

o Facilitate a brainstorming session to hear all perspectives.

o Promote a culture of constructive feedback.

• Use data as the arbitrator:

o Test multiple approaches with sample data.

o Choose the method that delivers the best outcome.

• Promote collaboration:

o Merge ideas into a hybrid approach if feasible.

o Assign team members to analyze pros and cons objectively.

• Escalate if needed:

o Seek guidance from a senior manager if the disagreement persists.

o Ensure the final decision aligns with organizational goals.

Wipro Data Analyst Interview Questions
No ratings yet
Wipro Data Analyst Interview Questions
29 pages
Python DSA Interview Solutions
No ratings yet
Python DSA Interview Solutions
14 pages
Numpy
No ratings yet
Numpy
27 pages
358 33 Powerpoint Slides DSC Chapter 15
No ratings yet
358 33 Powerpoint Slides DSC Chapter 15
55 pages
SQL Server Assingment PDF
No ratings yet
SQL Server Assingment PDF
6 pages
Printable Analog Electronics Full Notes PDF
No ratings yet
Printable Analog Electronics Full Notes PDF
476 pages
Minterm Maxterm K Map
No ratings yet
Minterm Maxterm K Map
15 pages
Soft Computing Viva Questions Guide
No ratings yet
Soft Computing Viva Questions Guide
2 pages
LAB-Skill Advanced Course Machine Learning With Python Experiments
No ratings yet
LAB-Skill Advanced Course Machine Learning With Python Experiments
23 pages
Advance Python Question Paper 2023
No ratings yet
Advance Python Question Paper 2023
2 pages
100 DSA Python
No ratings yet
100 DSA Python
45 pages
New OOPS Assignment 1
No ratings yet
New OOPS Assignment 1
4 pages
Indexing Structures Explained
No ratings yet
Indexing Structures Explained
44 pages
Embedded Systems Course Overview
0% (1)
Embedded Systems Course Overview
2 pages
C++ ACM ICPC Code Notebook
No ratings yet
C++ ACM ICPC Code Notebook
18 pages
1z0-148 Exam Questions & Answers
No ratings yet
1z0-148 Exam Questions & Answers
54 pages
Python Week 6 GRPA
No ratings yet
Python Week 6 GRPA
2 pages
Probability PDF Set 1
No ratings yet
Probability PDF Set 1
33 pages
Python Interview Questions 1653100147
No ratings yet
Python Interview Questions 1653100147
24 pages
LP VI Bi Lab Manual
No ratings yet
LP VI Bi Lab Manual
28 pages
F U-4 PDF
No ratings yet
F U-4 PDF
48 pages
ML Interview Notes
No ratings yet
ML Interview Notes
3 pages
Top SQL and PseudoCode Questions MNCs
No ratings yet
Top SQL and PseudoCode Questions MNCs
3 pages
Cs2303 Toc 2marks
No ratings yet
Cs2303 Toc 2marks
31 pages
Mathematics for Data Science: Week 7 Topics
No ratings yet
Mathematics for Data Science: Week 7 Topics
18 pages
Database Sample Exam Questions and Answers
No ratings yet
Database Sample Exam Questions and Answers
8 pages
Indirect Sorting
No ratings yet
Indirect Sorting
24 pages
BDA 6TH SEM Question Bank
No ratings yet
BDA 6TH SEM Question Bank
6 pages
Digital Logic - Logic Gates - Chandan Jha Sir
No ratings yet
Digital Logic - Logic Gates - Chandan Jha Sir
14 pages
Introduction to Bit Manipulation in Python
No ratings yet
Introduction to Bit Manipulation in Python
16 pages
DVR Lab Manual
No ratings yet
DVR Lab Manual
87 pages
Data Preprocessing and Exploration Techniques
No ratings yet
Data Preprocessing and Exploration Techniques
54 pages
Data Analyst Interview Questions Guide
No ratings yet
Data Analyst Interview Questions Guide
20 pages
Cybersecurity: Protecting Data Online
No ratings yet
Cybersecurity: Protecting Data Online
9 pages
Python for Machine Learning Guide
No ratings yet
Python for Machine Learning Guide
8 pages
Market Basket Analysis Software Development
No ratings yet
Market Basket Analysis Software Development
5 pages
Dbms Model Question Paper
No ratings yet
Dbms Model Question Paper
6 pages
Scenario Based SQL Interview Question 1758730412
No ratings yet
Scenario Based SQL Interview Question 1758730412
11 pages
Complete Oracle Paper
No ratings yet
Complete Oracle Paper
59 pages
Pandas DataFrame Operations Guide
No ratings yet
Pandas DataFrame Operations Guide
6 pages
Wiley R Johnston (1982) NumericalMethods A Software Approach
No ratings yet
Wiley R Johnston (1982) NumericalMethods A Software Approach
295 pages
Numpy ML - AI
No ratings yet
Numpy ML - AI
135 pages
DSA Sheet by Shradha Didi & Aman Bhaiya - Google Drive
No ratings yet
DSA Sheet by Shradha Didi & Aman Bhaiya - Google Drive
2 pages
Midterm 02 Solutions
No ratings yet
Midterm 02 Solutions
10 pages
Mod2Inheritance Assignment
No ratings yet
Mod2Inheritance Assignment
2 pages
SQL Interview Questions For Cognizant GenC (Next - Pro) Roles (Scenario-Based)
No ratings yet
SQL Interview Questions For Cognizant GenC (Next - Pro) Roles (Scenario-Based)
4 pages
A Beginner's Introduction To Typesetting With LaTeX
100% (1)
A Beginner's Introduction To Typesetting With LaTeX
117 pages
Multi-Tier Architecture
No ratings yet
Multi-Tier Architecture
3 pages
Infinity HHD-data
No ratings yet
Infinity HHD-data
2 pages
A Major Project Report On: Civil Engineering
No ratings yet
A Major Project Report On: Civil Engineering
5 pages
Week 6
100% (1)
Week 6
27 pages
Bny Mellon
No ratings yet
Bny Mellon
92 pages
Python and SQL Coding Exercises
No ratings yet
Python and SQL Coding Exercises
30 pages
Informatics Practices Pre-Board Exam 2019-20
No ratings yet
Informatics Practices Pre-Board Exam 2019-20
6 pages
IT Exam Answer Key
No ratings yet
IT Exam Answer Key
6 pages
Sample Paper-2
No ratings yet
Sample Paper-2
8 pages
HEALTHCARE
No ratings yet
HEALTHCARE
3 pages
EXL Data Analyst Interview Questions
100% (1)
EXL Data Analyst Interview Questions
43 pages
Data Science and SQL Exam Questions
No ratings yet
Data Science and SQL Exam Questions
8 pages
Walmart Data Analyst Interview Experience
No ratings yet
Walmart Data Analyst Interview Experience
10 pages
Berislav Topic, Chapter IV
No ratings yet
Berislav Topic, Chapter IV
23 pages
Design and Build A Belt Grinder Evan Shorten
No ratings yet
Design and Build A Belt Grinder Evan Shorten
1 page
HTM 01 06 PartB
No ratings yet
HTM 01 06 PartB
37 pages
AWR Report Interpretation Checklist For Diagnosing Database Performance Issues (Doc ID 1628089.1)
No ratings yet
AWR Report Interpretation Checklist For Diagnosing Database Performance Issues (Doc ID 1628089.1)
6 pages
Olivetti Printer Fault Instruction Manual - 17032015JB
No ratings yet
Olivetti Printer Fault Instruction Manual - 17032015JB
5 pages
Tello+User+Manual+v1.0 EN 2.12
No ratings yet
Tello+User+Manual+v1.0 EN 2.12
22 pages
50 Popular English Idioms To Sound Like A Native Speaker
No ratings yet
50 Popular English Idioms To Sound Like A Native Speaker
12 pages
Liebherr P 994 B Litronic Operation and Maintenance Manual
No ratings yet
Liebherr P 994 B Litronic Operation and Maintenance Manual
124 pages
Jamnalal Bajaj Institute of Management Studies: Case Study
No ratings yet
Jamnalal Bajaj Institute of Management Studies: Case Study
20 pages
Human and Social Capital in Agriculture
No ratings yet
Human and Social Capital in Agriculture
9 pages
Jurisdiction Dispute in Homeowners' Case
No ratings yet
Jurisdiction Dispute in Homeowners' Case
15 pages
IGI Report
No ratings yet
IGI Report
1 page
Weekly Student Diary: Cost of Cultivation
No ratings yet
Weekly Student Diary: Cost of Cultivation
16 pages
Ship Design Methods and Processes
100% (8)
Ship Design Methods and Processes
34 pages
Communication Skills (NEP) Sem1 Notes
No ratings yet
Communication Skills (NEP) Sem1 Notes
34 pages
REET 2024 Admit Card for Jaipur Exam
No ratings yet
REET 2024 Admit Card for Jaipur Exam
1 page
Digital Control System Overview
No ratings yet
Digital Control System Overview
17 pages
Pricelist Line Maintenance Services Valid From 20230301-1
No ratings yet
Pricelist Line Maintenance Services Valid From 20230301-1
16 pages
AGCO Power - Core75
100% (1)
AGCO Power - Core75
4 pages
CV Kamrul Hasan-5
100% (1)
CV Kamrul Hasan-5
2 pages
Bladder Surge Control System Specifications
No ratings yet
Bladder Surge Control System Specifications
5 pages
Thermodynamics in Daily Life
0% (1)
Thermodynamics in Daily Life
5 pages
Crime Prediction Using Machine Learning
No ratings yet
Crime Prediction Using Machine Learning
7 pages
Icpo - en 590 Rotterdam
100% (1)
Icpo - en 590 Rotterdam
2 pages
ME8711 Simulation and Analysis Laboratory
No ratings yet
ME8711 Simulation and Analysis Laboratory
62 pages
AU Academic Calendar 2025-26
No ratings yet
AU Academic Calendar 2025-26
1 page
IIG Service Interruption Reports
No ratings yet
IIG Service Interruption Reports
54 pages
Effective Work Breakdown Structures
100% (1)
Effective Work Breakdown Structures
59 pages
Key Ip Pre Board 2024-25
No ratings yet
Key Ip Pre Board 2024-25
10 pages
Ravioli Den Stock Price Predictions
No ratings yet
Ravioli Den Stock Price Predictions
3 pages