Placement Prep

Remove Duplicates from a Linked List: Sorted and Unsorted

Remove duplicates from sorted and unsorted linked lists. Covers single-pass O(n)/O(1) for sorted, HashSet and nested-loop for unsorted, with C++ and Python code.

By FACE Prep Team 5 min read
linked-list data-structures algorithms duplicate-removal dsa placement-prep

A sorted linked list drops its duplicates in a single O(n) traversal with no extra memory; for an unsorted one, pick between a hash set at O(n) space or nested loops at O(n²) time.

Both problems appear in placement coding rounds. The sorted variant tests basic pointer manipulation. The unsorted variant tests whether you can reason about the space-time tradeoff and choose the right approach for a given constraint.

Sorted List: One Pass, Zero Extra Memory

When the list is sorted, all copies of the same value are adjacent. There’s no need to look backwards or store anything extra. One pointer and one comparison per step is sufficient, because a duplicate can only ever appear immediately after the current node.

Algorithm steps:

  • Start curr at the head node.
  • While curr and curr->next both exist:
    • If curr->data == curr->next->data: set curr->next = curr->next->next (skip the duplicate).
    • Otherwise: advance curr = curr->next.
  • Stop when curr->next is null.

Trace with 1 -> 2 -> 2 -> 3 -> 4 -> 5 -> 5:

  • curr = 1, next = 2: not equal, advance.
  • curr = 2, next = 2: equal, skip. List becomes 1 -> 2 -> 3 -> 4 -> 5 -> 5.
  • curr = 2, next = 3: not equal, advance.
  • curr = 3, next = 4: not equal, advance.
  • curr = 4, next = 5: not equal, advance.
  • curr = 5, next = 5: equal, skip. List becomes 1 -> 2 -> 3 -> 4 -> 5.
  • curr = 5, next = null: stop.

Result: 1 -> 2 -> 3 -> 4 -> 5. This is the kind of linear-time, constant-space solution interviewers want to see before they ask “now what if the list isn’t sorted?” See Space Complexity of Algorithms: Worked Examples for the full O(1) vs O(n) derivation framework.

Unsorted List: Two Approaches Compared

Without the sorted guarantee, duplicates can appear anywhere in the list. A single forward pass won’t find them all, since the second copy of a value might appear anywhere after the first. Two options remain.

Hash Set Method

Track every value seen so far in a set. If the current node’s value is already in the set, remove the node. Otherwise, add the value and move on.

Steps:

  • Initialise an empty set seen, pointer curr = head, pointer prev = null.
  • While curr exists:
    • If curr->data is in seen: set prev->next = curr->next (remove curr).
    • Otherwise: add curr->data to seen, update prev = curr.
    • Advance curr = curr->next.

Time: O(n) average (each lookup and insert takes O(1) average time with C++ unordered_set or Python’s built-in set). Space: O(n) worst case (all values distinct, every value lands in the set).

Nested-Loop Method

Use two loops. The outer loop selects each node as a reference. The inner loop scans every subsequent node and deletes any that match the reference value.

Steps:

  • curr = head. While curr exists:
    • runner = curr. While runner->next exists:
      • If runner->next->data == curr->data: set runner->next = runner->next->next (skip duplicate).
      • Otherwise: advance runner = runner->next.
    • Advance curr = curr->next.

Time: O(n²). Space: O(1).

The choice comes down to the constraint. Memory tight? Use nested loops. Speed matters? Use the hash set. This is the same tradeoff that appears in problems like finding all symmetric pairs in an array, which also offers a hash-map solution versus a brute-force O(n²) comparison.

Implementations

Sorted Linked List

C++:

#include <iostream>
using namespace std;

struct Node {
    int data;
    Node* next;
    Node(int val) : data(val), next(nullptr) {}
};

void removeDuplicatesSorted(Node* head) {
    Node* curr = head;
    while (curr != nullptr && curr->next != nullptr) {
        if (curr->data == curr->next->data) {
            Node* dup = curr->next;
            curr->next = curr->next->next;
            delete dup;
        } else {
            curr = curr->next;
        }
    }
}

void printList(Node* head) {
    while (head) {
        cout << head->data;
        if (head->next) cout << " -> ";
        head = head->next;
    }
    cout << "\n";
}

int main() {
    // Build: 1 -> 2 -> 2 -> 3 -> 5 -> 5
    Node* head = new Node(1);
    head->next = new Node(2);
    head->next->next = new Node(2);
    head->next->next->next = new Node(3);
    head->next->next->next->next = new Node(5);
    head->next->next->next->next->next = new Node(5);

    removeDuplicatesSorted(head);
    printList(head);  // Output: 1 -> 2 -> 3 -> 5
    return 0;
}

Python:

class Node:
    def __init__(self, val):
        self.val = val
        self.next = None

def remove_duplicates_sorted(head):
    curr = head
    while curr and curr.next:
        if curr.val == curr.next.val:
            curr.next = curr.next.next
        else:
            curr = curr.next
    return head

def print_list(head):
    nodes = []
    while head:
        nodes.append(str(head.val))
        head = head.next
    print(" -> ".join(nodes))

# Test: 1 -> 2 -> 2 -> 3 -> 5 -> 5
head = Node(1)
head.next = Node(2)
head.next.next = Node(2)
head.next.next.next = Node(3)
head.next.next.next.next = Node(5)
head.next.next.next.next.next = Node(5)

remove_duplicates_sorted(head)
print_list(head)  # Output: 1 -> 2 -> 3 -> 5

Unsorted — Hash Set

C++:

#include <unordered_set>
using namespace std;

void removeDuplicatesHashSet(Node* head) {
    unordered_set<int> seen;
    Node* prev = nullptr;
    Node* curr = head;
    while (curr != nullptr) {
        if (seen.count(curr->data)) {
            prev->next = curr->next;
        } else {
            seen.insert(curr->data);
            prev = curr;
        }
        curr = curr->next;
    }
}

Python:

def remove_duplicates_hash_set(head):
    seen = set()
    prev, curr = None, head
    while curr:
        if curr.val in seen:
            prev.next = curr.next
        else:
            seen.add(curr.val)
            prev = curr
        curr = curr.next
    return head

Unsorted — Nested Loops

C++:

void removeDuplicatesNestedLoop(Node* head) {
    Node* curr = head;
    while (curr != nullptr) {
        Node* runner = curr;
        while (runner->next != nullptr) {
            if (runner->next->data == curr->data) {
                Node* dup = runner->next;
                runner->next = runner->next->next;
                delete dup;
            } else {
                runner = runner->next;
            }
        }
        curr = curr->next;
    }
}

Python:

def remove_duplicates_nested_loop(head):
    curr = head
    while curr:
        runner = curr
        while runner.next:
            if runner.next.val == curr.val:
                runner.next = runner.next.next
            else:
                runner = runner.next
        curr = curr.next
    return head

Edge Cases to Test

Three inputs every interviewer expects your code to handle cleanly:

  • Empty list (head = null): the outer while condition (curr != null) fails on entry. Neither function touches any pointer or dereferences null. Both return without error.
  • Single node (e.g., a list containing just 5): curr->next is null on the very first check. The loop exits immediately, leaving the single node untouched.
  • All duplicates (e.g., 3 -> 3 -> 3, sorted): the sorted algorithm skips both duplicates in sequence and leaves the head node (3) as the sole survivor. The hash set algorithm adds 3 on the first node, then removes the second and third. Both paths yield a single-node list containing 3.

Off-by-one bugs are most likely to surface at these boundaries. Testing the equilibrium index of an array teaches the same discipline: always trace through an empty array and a single-element array before claiming correctness.

Complexity at a Glance

VariantTimeSpaceWhen to use
Sorted list: single passO(n)O(1)List is guaranteed sorted
Unsorted: hash setO(n) avgO(n)Speed matters, memory available
Unsorted: nested loopsO(n²)O(1)Memory constrained, n is small

The hash-set vs nested-loop decision is the clearest space-time tradeoff in DSA: pay with memory to buy speed, or pay with time to save memory. That same decision reappears in systems design interviews when the “list” is millions of records in a data pipeline. TinkerLLM (₹299) is where you build a working LLM project and see those memory-versus-throughput choices made in actual inference code, ending up with a GitHub repository that demonstrates the work rather than just discussing it.

Primary sources

Frequently asked questions

What is the time complexity of removing duplicates from a sorted linked list?

O(n) time and O(1) space. A single traversal is enough: compare each node to its successor and skip the next node if their values match. No extra data structure is needed because sortedness guarantees all duplicates are adjacent.

How do you remove duplicates from an unsorted linked list without extra memory?

Use two nested loops. The outer loop selects each node in turn; the inner loop scans all subsequent nodes and removes any that match the outer node's value. This costs O(n²) time but uses only O(1) auxiliary space.

What happens if all nodes in the linked list have the same value?

Both algorithms handle this correctly. For the sorted approach, the pointer skips every duplicate until only the first node remains. For the hash set approach, the first occurrence is added to the set and every subsequent node is removed from the list. The result is a single-node list.

Can you use a hash set on a sorted linked list too?

Yes, but it adds unnecessary O(n) space overhead. The sorted property means duplicates are always adjacent, so a single-pass comparison is both simpler and more space-efficient. Use the hash set only when the list is not guaranteed to be sorted.

Which approach is asked most often in placement coding rounds?

Both appear. Service-company coding tests typically ask the sorted variant to test basic traversal skills. Product-company technical rounds often ask the unsorted variant to see whether you recognise the space-time tradeoff and can articulate why you chose hash set over nested loops.

What is the difference between the sorted and unsorted duplicate-removal algorithms?

The sorted algorithm relies on the guarantee that duplicates are adjacent, so it only needs a single pointer and O(1) space. The unsorted algorithm has no such guarantee; it must either track seen values in a hash set (O(n) space) or recheck every previous node for each new node (O(n²) time).

How does pointer manipulation work when deleting a node from a linked list?

To delete a node, set the previous node's next pointer to the node after the one being deleted. In the sorted approach, curr->next = curr->next->next skips the duplicate. In the hash set approach, prev->next = curr->next removes curr from the chain. The deleted node is then garbage-collected in Python or freed with delete in C++.

Build AI projects

A self-paced playground for building with LLMs.

TinkerLLM is FACE Prep's sister property. A guided environment for shipping real LLM applications, the kind of project that earns a paragraph on your resume, not a line.

Try TinkerLLM (₹299 launch)
Free AI Roadmap PDF