Placement Prep

Two-Player Coin Game Using Dynamic Programming

Greedy fails the two-player coin game. Learn the parity-based optimal strategy, the DP recurrence, and working Python code: placement-interview ready.

By FACE Prep Team 6 min read
dynamic-programming interview-questions placement-prep algorithms python

In the two-player coin game, two opponents alternate picking coins from either end of a row, both play optimally, and the player with the higher total wins.

This is one of those problems where the greedy answer is wrong and the correct insight is elegant. The first player can always guarantee exactly one of two outcomes: collect all coins at odd-indexed positions, or collect all coins at even-indexed positions (whichever sum is larger). Understanding why requires stepping through both strategies.

Problem Setup

The game has a fixed set of rules:

  • An array of n coins with different values is placed in a row. n must be even for the clean parity argument to apply.
  • Player 1 goes first.
  • On each turn, a player picks one coin from either the left end or the right end of the remaining row.
  • Both players play optimally — each always chooses the move that maximises their own total.
  • After all coins are taken, the player with the higher sum wins. Equal sums result in a draw.

“Optimal play” is the critical phrase. It does not mean picking the highest-valued coin visible right now. It means picking the coin that leads to the highest total over the full game, accounting for every future response the opponent can make. This distinction separates a greedy solution from a dynamic programming one.

The problem appears regularly in on-campus placement rounds at Tier-2 and Tier-3 colleges, particularly in the coding sections of companies that test data-structures-and-algorithms knowledge. It is also a standard entry in competitive programming problem sets.

Why the Greedy Approach Fails

Greedy reasoning says: always pick whichever endpoint coin is larger. It seems sensible but it is wrong.

Walk through the array {18, 20, 15, 30, 10, 14} with both players using the greedy rule:

  • Player 1 compares 18 (left) and 14 (right), picks 18. Remaining: {20, 15, 30, 10, 14}.
  • Player 2 compares 20 and 14, picks 20. Remaining: {15, 30, 10, 14}.
  • Player 1 compares 15 and 14, picks 15. Remaining: {30, 10, 14}.
  • Player 2 compares 30 and 14, picks 30. Remaining: {10, 14}.
  • Player 1 compares 10 and 14, picks 14. Remaining: {10}.
  • Player 2 picks 10.

Final scores:

  • Player 1: 18 + 15 + 14 = 47
  • Player 2: 20 + 30 + 10 = 60

Player 2 wins. Player 1 went first and still lost. The problem is that picking 18 at the start forced the array into a shape where Player 2 could grab 20 and 30 consecutively.

The greedy choice of 18 over 14 gained Player 1 exactly 4 units on the immediate pick, but cost far more in terms of what it enabled the opponent to do next. This is the classic failure mode that dynamic programming is designed to correct: the locally optimal action is not the globally optimal one.

The Optimal Strategy: Position Parity

The key insight for an even-length array is structural. Label the coins by their original index positions (1-indexed):

  • Odd-indexed positions: 1, 3, 5, … — these coins form one group.
  • Even-indexed positions: 2, 4, 6, … — these coins form another group.

The two groups are disjoint. No coin appears in both groups.

Player 1 can always guarantee collecting the entire larger group. Here is why.

If Player 1 picks from the left end (position 1, odd-indexed), the remaining array runs from position 2 to position n. Both ends are now even-indexed. Whatever Player 2 picks (left or right), one odd-indexed position immediately becomes an endpoint again. Player 1 picks the odd-indexed endpoint each time. By induction, Player 1 collects all odd-indexed coins.

Symmetrically, if Player 1 picks from the right end (position n, even-indexed for an even-length array), both exposed ends after each of Player 2’s turns are always odd-indexed. Player 1 keeps picking the even-indexed endpoint and collects all even-indexed coins.

The decision rule:

  • Compute S_odd = sum of coins at positions 1, 3, 5, …
  • Compute S_even = sum of coins at positions 2, 4, 6, …
  • If S_odd > S_even: pick the left end first (position 1), then always pick odd-indexed endpoints.
  • If S_even > S_odd: pick the right end first (position n), then always pick even-indexed endpoints.
  • If S_odd == S_even: either choice guarantees a draw.

For {18, 20, 15, 30, 10, 14}:

  • S_odd (positions 1, 3, 5): 18 + 15 + 10 = 43
  • S_even (positions 2, 4, 6): 20 + 30 + 14 = 64

Since S_even > S_odd, Player 1 picks 14 (right end, position 6) first. No matter what Player 2 picks, Player 1 continues collecting even-indexed coins: 30, then 20.

Player 1 final: 14 + 30 + 20 = 64. Player 2 final: 43. Player 1 wins.

This O(n) strategy works because the coin positions are fixed throughout the game. No matter how the opponent plays, the parity constraint on the endpoints is preserved.

DP Recurrence and Python Code

The parity strategy is O(n) and elegant, but placement interviews and competitive rounds also ask for the general DP solution. It works on any input, including odd-length arrays and cases where a formal proof of the optimal value is needed.

The standard interval DP formulation, as documented in the Wikipedia entry on dynamic programming under “optimal substructure”, builds a table dp[i][j]:

  • dp[i][j] is the maximum sum the current player (whoever is moving) can collect from subarray a[i..j], assuming both players play optimally from that state onward.
  • Base case: dp[i][i] = a[i] (only one coin, take it).
  • Recurrence (0-indexed array):
dp[i][j] = max(
    a[i] + sum(i+1, j) - dp[i+1][j],   # pick left; opponent plays on a[i+1..j]
    a[j] + sum(i, j-1) - dp[i][j-1]    # pick right; opponent plays on a[i..j-1]
)

The term sum(i+1,j) - dp[i+1][j] is the amount the current player collects from the remaining subarray after the opponent takes their optimal share. Fill the table by increasing subarray length (length 1, then 2, then 3, and so on).

This is the canonical DP-31 interval game formulation. The table has O(n^2) cells, each computed in O(1) with a prefix-sum array, giving O(n^2) total time and space.

def two_player_coin_game(coins):
    n = len(coins)
    # prefix sum for O(1) range queries
    prefix = [0] * (n + 1)
    for k in range(n):
        prefix[k + 1] = prefix[k] + coins[k]

    def range_sum(i, j):
        return prefix[j + 1] - prefix[i]

    # dp[i][j] = max a first-mover can collect from coins[i..j]
    dp = [[0] * n for _ in range(n)]
    for i in range(n):
        dp[i][i] = coins[i]

    # fill by increasing subarray length
    for length in range(2, n + 1):
        for i in range(n - length + 1):
            j = i + length - 1
            pick_left  = coins[i] + range_sum(i + 1, j) - dp[i + 1][j]
            pick_right = coins[j] + range_sum(i, j - 1) - dp[i][j - 1]
            dp[i][j] = max(pick_left, pick_right)

    player1 = dp[0][n - 1]
    player2 = sum(coins) - player1

    if player1 > player2:
        return "PLAYER 1 WINS"
    elif player2 > player1:
        return "PLAYER 2 WINS"
    else:
        return "DRAW"


# Verify with the classic example
coins = [18, 20, 15, 30, 10, 14]
print(two_player_coin_game(coins))  # PLAYER 1 WINS

Running this on {18, 20, 15, 30, 10, 14} gives dp[0][5] = 64, confirming Player 1’s guaranteed maximum matches the parity strategy’s result exactly.

Edge cases worth testing

When preparing for a placement test, verify the code handles these:

  • Single-element array: Player 1 takes the only coin and wins trivially.
  • Two-element array: Player 1 takes the larger coin. No DP table needed.
  • All equal values: result is DRAW. The parity sums are equal, and both strategies confirm it.
  • Odd-length array: the parity strategy does not apply cleanly; use the full DP table.

These also come up inside pattern-based coding problems as sub-components of larger array manipulation questions. Placement tests sometimes embed the coin-game logic inside a longer scenario description with identical recurrence but different surface framing.

For related optimisation-style placement problems that use interval-based reasoning, the same discipline of writing the state definition before the recurrence is the right starting point.


The DP method in this article (define a state, write the recurrence, fill the table, read the answer) is the same discipline that shows up in building systems that make sequences of decisions under constraints. LLM applications work through exactly this kind of step-by-step state modelling. If you want to go from solving game-theory DP problems to shipping LLM-powered tools, TinkerLLM is the practical entry point at ₹299.

Primary sources

Frequently asked questions

Does the greedy strategy ever give the correct result?

Sometimes, by coincidence, but not reliably. For the array {18,20,15,30,10,14}, greedy play gives Player 1 only 47 versus the optimal 64. Greedy works when the two endpoint values happen to steer Player 1 toward the larger index group, but there is no guarantee of this.

What is the time and space complexity of the DP solution?

Time complexity is O(n^2) because there are O(n^2) subproblems and each is solved in O(1). Space complexity is also O(n^2) for the DP table. A prefix-sum array reduces the range-sum query inside each cell to O(1).

Does the parity strategy work if the array has an odd number of coins?

No. When n is odd, the two players cannot each take n/2 coins, so the even and odd group sizes differ by one. The parity argument breaks down for odd-length arrays. The full DP recurrence still works correctly for any n.

How do I spot this pattern quickly in a placement interview?

Look for these signals: two players, an array or sequence, alternating picks from the ends or specific positions, and both players playing optimally. If the problem asks for the maximum a player can score or whether Player 1 can guarantee a win, it is almost certainly an interval DP problem.

What if both players use the same greedy rule but compare by a different metric?

Any deterministic greedy rule — largest endpoint, smallest endpoint, or any fixed heuristic — can be exploited by the opponent if it is predictable. Only the DP-optimal strategy guarantees the best possible outcome regardless of what the opponent does.

Build AI projects

A self-paced playground for building with LLMs.

TinkerLLM is FACE Prep's sister property. A guided environment for shipping real LLM applications, the kind of project that earns a paragraph on your resume, not a line.

Try TinkerLLM (₹299 launch)
Free AI Roadmap PDF