Min Cost in Tree (2019)

Min Cost in Tree
Interview Question
David Wahler (dwahler@)
● ~10 years at Indeed

○ Job Search (intern)
○ Search Quality
○ Data Infrastructure
● ~8 years interviewing
● 6+ years asking min cost in tree

Question: Min Cost in Tree
● https://wiki.indeed.com/display/eng/Minimum+cost+in+a+tree
The Pitch and Setup
Setup
Problem statement:
You’re given a tree in which each edge has a non-negative integer cost.
Locate and return the leaf node whose cost from the root is minimal.
Example
Correct result is node J (cost: 5+0+2 = 7)
Example includes:
● nodes with 0, 1, 2 and 3 children
● paths of different lengths
● edges with cost zero
● optimal leaf is neither first nor last (when using DFS or BFS)
Setup
Essential points to explain:
● Costs are on edges, not nodes
● Costs are non-negative
● Edges point from parent to child
● Input is the root Node
● Expected result is a Node object
Skeleton Code - Java
class Node {
    List<Edge> children; // possibly empty
}
class Edge {
    int cost;    // ≥ 0
    Node target; // non-null
}
public static Node minCostLeaf(Node root) {
    // …
}
Skeleton Code - C++
struct Node; // forward declaration, so Edge can point to a Node
struct Edge {
    int cost;     // ≥ 0
    Node* target; // non-null
};
struct Node {
    vector<Edge> children;
};
Node* minCostLeaf(Node* root) {
    // …
}
Skeleton Code - Python
class Node:
    children = … # type is [Edge]
class Edge:
    cost = … # type is int, ≥ 0
    target = … # type is Node, not None
def minCostLeaf(root):
    # …
    return leaf_node
Common questions
Questions (1/3)
- Q: How big is the tree?
A: The tree fits in memory, and the depth is reasonably small (a few hundred or so).
- Q: What if there’s a tie?
A: It’s OK to break ties arbitrarily. (If time allows, see if the candidate can suggest other ways of resolving this, e.g. always returning the left-most leaf, or returning all tied leaves.)
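One of the tie-breaking alternatives mentioned above, returning all tied leaves, can be sketched in Python as follows. This is a minimal illustration rather than a reference answer: the `name` field, the constructors, and the function name `min_cost_leaves` are assumptions added for readability and are not part of the skeleton.

```python
class Node:
    def __init__(self, name, children=None):
        self.name = name                # hypothetical label, for readable output only
        self.children = children or []  # list of Edge, possibly empty


class Edge:
    def __init__(self, cost, target):
        self.cost = cost      # non-negative int
        self.target = target  # Node, never None


def min_cost_leaves(root):
    """Return every leaf whose cost from the root equals the minimum."""
    best_cost = None
    best = []
    stack = [(root, 0)]  # (node, cost from root to node)
    while stack:
        node, cost = stack.pop()
        if not node.children:
            if best_cost is None or cost < best_cost:
                best_cost, best = cost, [node]  # strictly better: start a new list
            elif cost == best_cost:
                best.append(node)               # tie: keep both
        else:
            for e in node.children:
                stack.append((e.target, cost + e.cost))
    return best
```

The result order depends on traversal order, so a caller who wants "left-most first" would need to push children in reverse or sort afterwards.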
Questions (2/3)
- Q: Can I modify the tree?
A: No, the input should be considered immutable. (If the candidate gets stuck, consider relaxing this.)
- Q: Can I assume the input is a well-formed tree?
A: Yes. (e.g. Node pointers are non-null, no cycles)
Questions (3/3)
- Q: What’s the real-world significance of this problem?
A: This specific problem is a bit artificial, but it’s similar to various other search and tree-traversal problems that do come up in practice, without requiring too much “scaffolding”.
- (Bonus: the linear-time solution to the DAG extension is essentially equivalent to the Viterbi algorithm, which is used to efficiently decode the error-correcting codes used in WiFi, LTE, etc.)
Brainstorming
Solutions
Key insights
● A greedy algorithm that only looks at the immediate descendants of a node before choosing an edge is wrong.
● Need to traverse the tree and add up costs as we go.
○ Top-down: cost from root to current node
○ Bottom-up: cost from current node to minimum descendant leaf
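To make the first insight concrete, here is a small Python sketch of a tree on which the greedy choice fails. The tree, the `name` field, and the function names are made up for illustration; the top-down function is just one possible way to add up costs during the traversal.

```python
class Node:
    def __init__(self, name, children=None):
        self.name = name                # hypothetical label, for readable output only
        self.children = children or []  # list of Edge


class Edge:
    def __init__(self, cost, target):
        self.cost = cost
        self.target = target


def greedy_leaf(root):
    """WRONG: always follows the cheapest outgoing edge."""
    node = root
    while node.children:
        node = min(node.children, key=lambda e: e.cost).target
    return node


def min_cost_leaf(node, cost_so_far=0):
    """Top-down traversal: returns (cost from root, leaf) for the cheapest leaf."""
    if not node.children:
        return cost_so_far, node
    results = (min_cost_leaf(e.target, cost_so_far + e.cost) for e in node.children)
    return min(results, key=lambda r: r[0])


# root --1--> A --10--> X   (total 11)
# root --2--> B --1---> Y   (total 3)
x, y = Node("X"), Node("Y")
root = Node("root", [Edge(1, Node("A", [Edge(10, x)])),
                     Edge(2, Node("B", [Edge(1, y)]))])
```

Greedy commits to the edge of cost 1 and ends at X, while the full traversal finds Y.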
Common approaches
● Recursive DFS
● Iterative BFS
● Dijkstra’s algorithm ☺
● A* ☹ (requires extra information that we don’t have)
Naive solution (Recursive DFS, top-down)
public static Node minCostLeaf(Node root) {
    Result result = new Result();
    dfs(root, 0, result);
    return result.leaf;
}

// Mutable holder for the best leaf seen so far
private static class Result {
    int cost = Integer.MAX_VALUE;
    Node leaf;
}

private static void dfs(Node current, int costSoFar, Result result) {
    if (current.children.isEmpty()) {
        if (costSoFar < result.cost) {
            result.cost = costSoFar;
            result.leaf = current;
        }
    } else {
        for (Edge e : current.children) {
            dfs(e.target, costSoFar + e.cost, result);
        }
    }
}
Naive solution (Recursive DFS, bottom-up)
public static Node minCostLeaf(Node root) {
    return dfs(root).leaf;
}

private static class Result {
    int cost = Integer.MAX_VALUE;
    Node leaf;
    Result(int cost, Node leaf) {
        this.cost = cost;
        this.leaf = leaf;
    }
}

// Returns the cheapest leaf below current, with its cost measured from current
public static Result dfs(Node current) {
    if (current.children.isEmpty()) {
        return new Result(0, current);
    } else {
        Result best = null;
        for (Edge e : current.children) {
            Result candidate = dfs(e.target);
            candidate.cost += e.cost;
            if (best == null || candidate.cost < best.cost) {
                best = candidate;
            }
        }
        return best;
    }
}
Naive solution (Iterative BFS)
// requires: import java.util.LinkedList; import java.util.Queue;
public static Node minCostLeaf(Node root) {
    Queue<State> queue = new LinkedList<>();
    State best = null;
    queue.add(new State(0, root));
    while (!queue.isEmpty()) {
        State state = queue.poll();
        if (state.node.children.isEmpty()) {
            if (best == null || state.cost < best.cost) {
                best = state;
            }
        } else {
            for (Edge e : state.node.children) {
                queue.add(new State(state.cost + e.cost, e.target));
            }
        }
    }
    return best.node;
}

private static class State {
    int cost;
    Node node;
    State(int cost, Node node) {
        this.cost = cost;
        this.node = node;
    }
}
Optimizations
● Naive approach always examines the entire tree

● Since costs are non-negative, once we’ve seen at least one leaf, we can
prune subtrees that can’t beat the current best leaf
● Some candidates will realize this on their own (+initiative), but others won’t
see it until you give an example that hints at it
● Pruning is much easier with top-down traversal
○ We have to decide whether to visit a node before traversing its children
○ Also possible with bottom-up traversal
■ instead of passing pair of (current cost, current best)...
■ pass difference between current cost and current best
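The bottom-up variant described above can be sketched in Python as follows. Instead of passing the pair (current cost, current best) down the recursion, each call receives a single `budget`: the difference between the best cost found so far and the cost already spent on the path to this node. The names `best_within`, `budget`, and `trace`, as well as the `name` field, are hypothetical and exist only for this illustration.

```python
import math


class Node:
    def __init__(self, name, children=None):
        self.name = name                # hypothetical label, for readable output only
        self.children = children or []  # list of Edge


class Edge:
    def __init__(self, cost, target):
        self.cost = cost      # non-negative int
        self.target = target


def best_within(node, budget, trace=None):
    """Cheapest leaf below node with cost (measured from node) strictly less
    than budget, returned as (cost, leaf); None if no such leaf exists."""
    if budget <= 0:
        return None                     # even a zero-cost path cannot beat the best
    if trace is not None:
        trace.append(node.name)         # records which nodes were actually visited
    if not node.children:
        return (0, node)
    best = None
    for e in node.children:
        remaining = budget - e.cost
        if remaining <= 0:
            continue                    # prune: this subtree cannot beat the best
        found = best_within(e.target, remaining, trace)
        if found is not None:
            best = (found[0] + e.cost, found[1])
            budget = best[0]            # tighten the budget for later siblings
    return best


def min_cost_leaf(root, trace=None):
    return best_within(root, math.inf, trace)[1]
```

Pruning is only valid here because costs are non-negative: once the remaining budget is used up, nothing deeper in the subtree can bring the total back under it.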
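Top-down DFS with pruning
public static Node minCostLeaf(Node root) {
    Result result = new Result();
    dfs(root, 0, result);
    return result.leaf;
}

private static class Result {
    int cost = Integer.MAX_VALUE;
    Node leaf;
}

private static void dfs(Node current, int costSoFar, Result result) {
    if (current.children.isEmpty()) {
        if (costSoFar < result.cost) {
            result.cost = costSoFar;
            result.leaf = current;
        }
    } else if (costSoFar < result.cost) {
        // Only descend if this subtree can still beat the best leaf so far
        for (Edge e : current.children) {
            dfs(e.target, costSoFar + e.cost, result);
        }
    }
}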
Optimizations
● Dijkstra’s algorithm: visit nodes in order of cost, stopping at the first leaf
● Does a better job of pruning, but has overhead of maintaining a priority
queue
Dijkstra’s algorithm
// requires: import java.util.PriorityQueue;
public static Node minCostLeaf(Node root) {
    PriorityQueue<State> queue = new PriorityQueue<>();
    queue.add(new State(0, root));
    while (!queue.isEmpty()) {
        State state = queue.poll();
        if (state.node.children.isEmpty()) {
            // First leaf polled has the minimum cost
            return state.node;
        } else {
            for (Edge e : state.node.children) {
                queue.add(new State(state.cost + e.cost, e.target));
            }
        }
    }
    throw new IllegalStateException("unreachable");
}

private static class State implements Comparable<State> {
    int cost;
    Node node;
    State(int cost, Node node) {
        this.cost = cost;
        this.node = node;
    }
    public int compareTo(State other) {
        return Integer.compare(cost, other.cost);
    }
}
DAG extension
● At this point, I point out that the same Node/Edge data structures can be
used to represent other kinds of graphs (if the candidate hasn’t already said so)
● Extension: solve the same problem on a directed acyclic graph with a
single root.
○ i.e. nodes can have multiple parents, which means multiple paths from the root
○ “Find the leaf node which has the shortest path that connects it to the root”
● Depending on time available, candidates typically don’t write code for a
complete solution to the extension
DAG extension
● The candidate’s tree solution will probably give the correct answer for a
DAG as well, but will be very expensive due to re-visiting nodes. Do they
realize this?
○ Worst-case is typically O(2^V); many candidates hand-wave and suggest O(V^2) or similar
● If not, draw example e.g. repeated diamonds
● Pruning helps a bit, but not in the worst case
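The "repeated diamonds" example can also be shown numerically. The Python sketch below builds a chain of k diamonds and counts how many node visits a naive tree traversal performs on it. The helper names `build_diamonds`, `count_visits`, and `count_nodes`, and the edge costs, are invented for this illustration.

```python
class Node:
    def __init__(self, children=None):
        self.children = children or []  # list of Edge


class Edge:
    def __init__(self, cost, target):
        self.cost = cost
        self.target = target


def build_diamonds(k):
    """Chain of k diamonds: top -> (left, right) -> bottom, where each bottom
    is the top of the next diamond. Returns the root."""
    bottom = Node()  # the single leaf
    for _ in range(k):
        left = Node([Edge(1, bottom)])
        right = Node([Edge(2, bottom)])
        bottom = Node([Edge(1, left), Edge(1, right)])
    return bottom


def count_visits(node):
    """Node visits made by a naive traversal that treats the DAG as a tree."""
    return 1 + sum(count_visits(e.target) for e in node.children)


def count_nodes(root):
    """Number of distinct nodes reachable from root."""
    seen, stack = set(), [root]
    while stack:
        node = stack.pop()
        if node not in seen:
            seen.add(node)
            stack.extend(e.target for e in node.children)
    return len(seen)
```

With k diamonds there are 3k + 1 distinct nodes, but the naive traversal makes 4 * 2^k - 3 visits, because every diamond doubles the number of paths to everything below it. For k = 10 that is 31 nodes versus 4093 visits.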
DAG extension
● Incorrect solutions:
○ DFS, skipping already-visited nodes
○ BFS, skipping already-visited nodes (shorter paths may have more edges)
● Correct solutions:
○ Dijkstra’s, skipping already-expanded nodes
○ Dynamic programming:
■ Bottom-up DFS with memoization
■ Topological sort
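The topological-sort approach has no code elsewhere in this deck, so here is a minimal Python sketch of it, assuming the input is a well-formed single-root DAG. The function names and the `name` field are hypothetical. The sample DAG is chosen so that a traversal which skips already-visited nodes could give the wrong answer: node M is first reachable by a direct edge of cost 5, but a longer path reaches it for cost 2.

```python
class Node:
    def __init__(self, name, children=None):
        self.name = name                # hypothetical label, for readable output only
        self.children = children or []  # list of Edge


class Edge:
    def __init__(self, cost, target):
        self.cost = cost      # non-negative int
        self.target = target


def topological_order(root):
    """Nodes reachable from root, ordered so every edge goes left to right."""
    order, seen = [], set()

    def visit(node):
        seen.add(node)
        for e in node.children:
            if e.target not in seen:
                visit(e.target)
        order.append(node)  # post-order: appended after all descendants

    visit(root)
    order.reverse()
    return order


def min_cost_leaf_dag(root):
    """Return (leaf, cost) for the leaf with the cheapest path from root."""
    dist = {root: 0}  # cheapest known cost from root to each node
    best = None
    for node in topological_order(root):
        d = dist[node]  # final: all predecessors were processed earlier
        if not node.children:
            if best is None or d < dist[best]:
                best = node
        for e in node.children:
            if e.target not in dist or d + e.cost < dist[e.target]:
                dist[e.target] = d + e.cost
    return best, dist[best]


# root --5--> M --1--> L1        root --1--> A --1--> M        root --4--> L2
l1, l2 = Node("L1"), Node("L2")
m = Node("M", [Edge(1, l1)])
root = Node("root", [Edge(5, m), Edge(1, Node("A", [Edge(1, m)])), Edge(4, l2)])
```

Each node and edge is processed once, so the work is O(V + E). A first-visit-wins traversal that reaches M through the cost-5 edge would price L1 at 6 and return L2 (cost 4), whereas the true minimum is L1 at cost 3.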
Bottom-up DFS with memoization (1/2)
// requires: import java.util.HashMap; import java.util.Map;
public static Node minCostLeaf(Node root) {
    Map<Node, Result> memo = new HashMap<>();
    return dfs(root, memo).leaf;
}

private static Result dfs(Node current, Map<Node, Result> memo) {
    if (memo.containsKey(current)) {
        return memo.get(current);
    }
    Result result;
    if (current.children.isEmpty()) {
        result = new Result(0, current);
    } else {
        // …
Bottom-up DFS with memoization (2/2)
    } else {
        Result best = null;
        for (Edge e : current.children) {
            Result candidate = dfs(e.target, memo);
            // Memoized results are shared, so build a new Result instead of mutating
            candidate = new Result(candidate.cost + e.cost,
                                   candidate.leaf);
            if (best == null || candidate.cost < best.cost) {
                best = candidate;
            }
        }
        result = best;
    }
    memo.put(current, result);
    return result;
}
Discussion points
Discussion points
● Test cases
○ Trivial tree (root has no children)
○ Large costs (integer overflow)
○ Tree with large height (stack overflow)
○ Input constraint validation?
■ null pointers
■ negative costs
■ cycles
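For the "tree with large height" test case, a recursive solution can exhaust the call stack. One way to sidestep that is an explicit stack, sketched below in Python (where the default recursion limit is around 1000 frames). This is an illustrative sketch only: the function name and the `name` field are assumptions, and under the stated setup (depth of a few hundred or so) plain recursion is acceptable.

```python
class Node:
    def __init__(self, name, children=None):
        self.name = name                # hypothetical label, for readable output only
        self.children = children or []  # list of Edge


class Edge:
    def __init__(self, cost, target):
        self.cost = cost      # non-negative int
        self.target = target


def min_cost_leaf_iterative(root):
    """Top-down DFS with pruning, using an explicit stack instead of recursion."""
    best_cost, best_leaf = None, None
    stack = [(root, 0)]  # (node, cost from root to node)
    while stack:
        node, cost = stack.pop()
        if best_cost is not None and cost >= best_cost:
            continue  # prune: costs are non-negative, so this cannot improve
        if not node.children:
            best_cost, best_leaf = cost, node
        else:
            # Push in reverse so children are visited left to right
            for e in reversed(node.children):
                stack.append((e.target, cost + e.cost))
    return best_leaf
```

The trade-off is space: the explicit stack holds the unvisited siblings along the current path, so it can grow to roughly depth times branching factor rather than depth alone.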
Discussion points
● Time complexity
○ Naive DFS/BFS is O(N) for a tree, O(V + E) for a DAG
○ With pruning, still O(N), but probably performs much better in practice
○ Dijkstra’s: O(N log N) for a tree, O(E log E) for a DAG
■ DAG complexity can be improved to O(E log V) with a custom priority queue
● Space complexity
○ DFS: O(depth)
■ Some candidates store costs in a map and don’t clean them up, which means space
is O(N)
○ BFS: O(width)
○ Dijkstra’s/memoization: O(N)
Common pitfalls (1/2)
● Not understanding variable scopes in recursive calls

○ Conflating references to same variable in different stack frames
○ Assuming pass-by-reference semantics in pass-by-value languages
● Mistakes in computing costs
○ e.g. comparing an edge cost to a path cost
● Code duplication between root and non-root nodes
● Using global or class variables to store state (bad practice)
○ Breaks when called multiple times and/or in multiple threads
Common pitfalls (2/2)
● Mutating input (deliberately or accidentally)

● Inefficiencies:
○ Collecting leaves in one pass, then searching for minimum
○ Finding minimum by sorting
○ Pruning: only tracking local minimum, not global
● (C++) Memory leaks or unsafe pointer dereferences
The bar
The Bar
● No
○ Unable to solve tree
● Weak No
○ Needs significant assistance to solve tree; no workable ideas for DAG
● Yes
○ Correct, efficient code for tree; at least one independent, non-trivial idea for DAG
● Strong Yes
○ Solves tree easily; discusses pros-and-cons of multiple approaches for DAG; no significant
implementation bugs (or finds/fixes them without help)
Conclusion
Pros
● Room to discuss multiple “optimal” solutions
○ including trading worst-case for average-case
● Good at testing ability to mentally model how code behaves
● Tests both analytical and implementation skills
Cons
● Solutions depend on a relatively small number of distinct insights
● Not good for phone screens
● Has been asked a lot over the years
Open Q & A
