Diff Flashcards

(20 cards)

1
Q

What is the goal of the diff algorithm?

A

To compute the minimum sequence of insertions and deletions to transform sequence A into sequence B.

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
2
Q

What edit operations are used in the diff problem?

A

Insert after a position and delete at a position.

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
3
Q

What is a subsequence?

A

A sequence formed by deleting zero or more elements without changing order.

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
4
Q

What is a common subsequence?

A

A subsequence that appears in both sequences A and B.

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
5
Q

Why is finding the LCS useful in diff?

A

Because maximizing the LCS minimizes the number of edits required.

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
6
Q

What is the Bellman equation for LCS?

A

l(i,j) = 1 + l(i-1,j-1) if a_i = b_j, else max(l(i-1,j), l(i,j-1)); l(i,0) = l(0,j) = 0.

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
7
Q

What is the time complexity of LCS via dynamic programming?

A

Θ(m · n), where m and n are the lengths of the sequences.

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
8
Q

How do you recover the actual LCS from the DP table?

A

By tracing backwards from the bottom-right cell, following the matches.

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
9
Q

How is diff modeled as a graph problem?

A

As a grid where right = delete, down = insert, and diagonal = match (zero-cost).

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
10
Q

What graph algorithm can be used to solve diff efficiently?

A

Dijkstra’s algorithm with cost 1 for insert/delete and 0 for match.

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
11
Q

What is the size of the graph used in the diff problem?

A

Θ(m · n) vertices and edges.

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
12
Q

What optimization helps Dijkstra in this scenario?

A

Using a bucket queue for constant-time priority queue operations.

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
13
Q

How many diagonals are relevant when the edit distance is D?

A

Only diagonals from -D to D.

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
14
Q

What is the key idea in Myers’ algorithm?

A

Track the furthest x-position on each diagonal for each number of edits d.

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
15
Q

What is the time complexity of Myers’ algorithm?

A

Θ(D · min(m, n)), where D is the edit distance.

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
16
Q

Why is Myers’ algorithm efficient for small D?

A

Because it limits the search space to a narrow band of relevant diagonals.

17
Q

What does Git use for its diff implementation?

A

Myers’ algorithm due to its efficiency and practicality.

18
Q

What is patience diff in Git?

A

A strategy that preserves large unchanged blocks for better readability.

19
Q

Why is patience diff useful for code diffs?

A

Because it highlights logically related blocks and minimizes noise.

20
Q

What does the DP table for LCS represent?

A

Lengths of LCS for each prefix pair of A and B.