Remove Duplicates from a Linked List: Sorted and Unsorted
Remove duplicates from sorted and unsorted linked lists. Covers single-pass O(n)/O(1) for sorted, HashSet and nested-loop for unsorted, with C++ and Python code.
A sorted linked list drops its duplicates in a single O(n) traversal with no extra memory; for an unsorted one, pick between a hash set at O(n) space or nested loops at O(n²) time.
Both problems appear in placement coding rounds. The sorted variant tests basic pointer manipulation. The unsorted variant tests whether you can reason about the space-time tradeoff and choose the right approach for a given constraint.
Sorted List: One Pass, Zero Extra Memory
When the list is sorted, all copies of the same value are adjacent. There’s no need to look backwards or store anything extra. One pointer and one comparison per step is sufficient, because a duplicate can only ever appear immediately after the current node.
Algorithm steps:
- Start
currat the head node. - While
currandcurr->nextboth exist:- If
curr->data == curr->next->data: setcurr->next = curr->next->next(skip the duplicate). - Otherwise: advance
curr = curr->next.
- If
- Stop when
curr->nextis null.
Trace with 1 -> 2 -> 2 -> 3 -> 4 -> 5 -> 5:
curr= 1, next = 2: not equal, advance.curr= 2, next = 2: equal, skip. List becomes1 -> 2 -> 3 -> 4 -> 5 -> 5.curr= 2, next = 3: not equal, advance.curr= 3, next = 4: not equal, advance.curr= 4, next = 5: not equal, advance.curr= 5, next = 5: equal, skip. List becomes1 -> 2 -> 3 -> 4 -> 5.curr= 5, next = null: stop.
Result: 1 -> 2 -> 3 -> 4 -> 5. This is the kind of linear-time, constant-space solution interviewers want to see before they ask “now what if the list isn’t sorted?” See Space Complexity of Algorithms: Worked Examples for the full O(1) vs O(n) derivation framework.
Unsorted List: Two Approaches Compared
Without the sorted guarantee, duplicates can appear anywhere in the list. A single forward pass won’t find them all, since the second copy of a value might appear anywhere after the first. Two options remain.
Hash Set Method
Track every value seen so far in a set. If the current node’s value is already in the set, remove the node. Otherwise, add the value and move on.
Steps:
- Initialise an empty set
seen, pointercurr = head, pointerprev = null. - While
currexists:- If
curr->datais inseen: setprev->next = curr->next(removecurr). - Otherwise: add
curr->datatoseen, updateprev = curr. - Advance
curr = curr->next.
- If
Time: O(n) average (each lookup and insert takes O(1) average time with C++ unordered_set or Python’s built-in set).
Space: O(n) worst case (all values distinct, every value lands in the set).
Nested-Loop Method
Use two loops. The outer loop selects each node as a reference. The inner loop scans every subsequent node and deletes any that match the reference value.
Steps:
curr = head. Whilecurrexists:runner = curr. Whilerunner->nextexists:- If
runner->next->data == curr->data: setrunner->next = runner->next->next(skip duplicate). - Otherwise: advance
runner = runner->next.
- If
- Advance
curr = curr->next.
Time: O(n²). Space: O(1).
The choice comes down to the constraint. Memory tight? Use nested loops. Speed matters? Use the hash set. This is the same tradeoff that appears in problems like finding all symmetric pairs in an array, which also offers a hash-map solution versus a brute-force O(n²) comparison.
Implementations
Sorted Linked List
C++:
#include <iostream>
using namespace std;
struct Node {
int data;
Node* next;
Node(int val) : data(val), next(nullptr) {}
};
void removeDuplicatesSorted(Node* head) {
Node* curr = head;
while (curr != nullptr && curr->next != nullptr) {
if (curr->data == curr->next->data) {
Node* dup = curr->next;
curr->next = curr->next->next;
delete dup;
} else {
curr = curr->next;
}
}
}
void printList(Node* head) {
while (head) {
cout << head->data;
if (head->next) cout << " -> ";
head = head->next;
}
cout << "\n";
}
int main() {
// Build: 1 -> 2 -> 2 -> 3 -> 5 -> 5
Node* head = new Node(1);
head->next = new Node(2);
head->next->next = new Node(2);
head->next->next->next = new Node(3);
head->next->next->next->next = new Node(5);
head->next->next->next->next->next = new Node(5);
removeDuplicatesSorted(head);
printList(head); // Output: 1 -> 2 -> 3 -> 5
return 0;
}
Python:
class Node:
def __init__(self, val):
self.val = val
self.next = None
def remove_duplicates_sorted(head):
curr = head
while curr and curr.next:
if curr.val == curr.next.val:
curr.next = curr.next.next
else:
curr = curr.next
return head
def print_list(head):
nodes = []
while head:
nodes.append(str(head.val))
head = head.next
print(" -> ".join(nodes))
# Test: 1 -> 2 -> 2 -> 3 -> 5 -> 5
head = Node(1)
head.next = Node(2)
head.next.next = Node(2)
head.next.next.next = Node(3)
head.next.next.next.next = Node(5)
head.next.next.next.next.next = Node(5)
remove_duplicates_sorted(head)
print_list(head) # Output: 1 -> 2 -> 3 -> 5
Unsorted — Hash Set
C++:
#include <unordered_set>
using namespace std;
void removeDuplicatesHashSet(Node* head) {
unordered_set<int> seen;
Node* prev = nullptr;
Node* curr = head;
while (curr != nullptr) {
if (seen.count(curr->data)) {
prev->next = curr->next;
} else {
seen.insert(curr->data);
prev = curr;
}
curr = curr->next;
}
}
Python:
def remove_duplicates_hash_set(head):
seen = set()
prev, curr = None, head
while curr:
if curr.val in seen:
prev.next = curr.next
else:
seen.add(curr.val)
prev = curr
curr = curr.next
return head
Unsorted — Nested Loops
C++:
void removeDuplicatesNestedLoop(Node* head) {
Node* curr = head;
while (curr != nullptr) {
Node* runner = curr;
while (runner->next != nullptr) {
if (runner->next->data == curr->data) {
Node* dup = runner->next;
runner->next = runner->next->next;
delete dup;
} else {
runner = runner->next;
}
}
curr = curr->next;
}
}
Python:
def remove_duplicates_nested_loop(head):
curr = head
while curr:
runner = curr
while runner.next:
if runner.next.val == curr.val:
runner.next = runner.next.next
else:
runner = runner.next
curr = curr.next
return head
Edge Cases to Test
Three inputs every interviewer expects your code to handle cleanly:
- Empty list (
head = null): the outer while condition (curr != null) fails on entry. Neither function touches any pointer or dereferences null. Both return without error. - Single node (e.g., a list containing just
5):curr->nextis null on the very first check. The loop exits immediately, leaving the single node untouched. - All duplicates (e.g.,
3 -> 3 -> 3, sorted): the sorted algorithm skips both duplicates in sequence and leaves the head node (3) as the sole survivor. The hash set algorithm adds3on the first node, then removes the second and third. Both paths yield a single-node list containing3.
Off-by-one bugs are most likely to surface at these boundaries. Testing the equilibrium index of an array teaches the same discipline: always trace through an empty array and a single-element array before claiming correctness.
Complexity at a Glance
| Variant | Time | Space | When to use |
|---|---|---|---|
| Sorted list: single pass | O(n) | O(1) | List is guaranteed sorted |
| Unsorted: hash set | O(n) avg | O(n) | Speed matters, memory available |
| Unsorted: nested loops | O(n²) | O(1) | Memory constrained, n is small |
The hash-set vs nested-loop decision is the clearest space-time tradeoff in DSA: pay with memory to buy speed, or pay with time to save memory. That same decision reappears in systems design interviews when the “list” is millions of records in a data pipeline. TinkerLLM (₹299) is where you build a working LLM project and see those memory-versus-throughput choices made in actual inference code, ending up with a GitHub repository that demonstrates the work rather than just discussing it.
Primary sources
Frequently asked questions
What is the time complexity of removing duplicates from a sorted linked list?
O(n) time and O(1) space. A single traversal is enough: compare each node to its successor and skip the next node if their values match. No extra data structure is needed because sortedness guarantees all duplicates are adjacent.
How do you remove duplicates from an unsorted linked list without extra memory?
Use two nested loops. The outer loop selects each node in turn; the inner loop scans all subsequent nodes and removes any that match the outer node's value. This costs O(n²) time but uses only O(1) auxiliary space.
What happens if all nodes in the linked list have the same value?
Both algorithms handle this correctly. For the sorted approach, the pointer skips every duplicate until only the first node remains. For the hash set approach, the first occurrence is added to the set and every subsequent node is removed from the list. The result is a single-node list.
Can you use a hash set on a sorted linked list too?
Yes, but it adds unnecessary O(n) space overhead. The sorted property means duplicates are always adjacent, so a single-pass comparison is both simpler and more space-efficient. Use the hash set only when the list is not guaranteed to be sorted.
Which approach is asked most often in placement coding rounds?
Both appear. Service-company coding tests typically ask the sorted variant to test basic traversal skills. Product-company technical rounds often ask the unsorted variant to see whether you recognise the space-time tradeoff and can articulate why you chose hash set over nested loops.
What is the difference between the sorted and unsorted duplicate-removal algorithms?
The sorted algorithm relies on the guarantee that duplicates are adjacent, so it only needs a single pointer and O(1) space. The unsorted algorithm has no such guarantee; it must either track seen values in a hash set (O(n) space) or recheck every previous node for each new node (O(n²) time).
How does pointer manipulation work when deleting a node from a linked list?
To delete a node, set the previous node's next pointer to the node after the one being deleted. In the sorted approach, curr->next = curr->next->next skips the duplicate. In the hash set approach, prev->next = curr->next removes curr from the chain. The deleted node is then garbage-collected in Python or freed with delete in C++.
A self-paced playground for building with LLMs.
TinkerLLM is FACE Prep's sister property. A guided environment for shipping real LLM applications, the kind of project that earns a paragraph on your resume, not a line.
Try TinkerLLM (₹299 launch)