Remove Duplicates from a List

problem-solving
Published

September 20, 2023

Python lists are versatile, but sometimes you end up with unwanted duplicates. This post explores several efficient methods to remove duplicates from a Python list, preserving the original order whenever possible.

Method 1: Using a Set

The simplest and often fastest way to remove duplicates while disregarding order is leveraging Python’s built-in set data structure. Sets, by definition, only contain unique elements.

my_list = [1, 2, 2, 3, 4, 4, 5, 1]

unique_set = set(my_list)

unique_list = list(unique_set)

print(f"Original list: {my_list}")
print(f"List with duplicates removed (order might change): {unique_list}")

This method is concise and efficient for large lists but doesn’t guarantee the original order of elements.

Method 2: Preserving Order with a Loop

If maintaining the original order is crucial, you can use a loop and a temporary list to track seen elements.

my_list = [1, 2, 2, 3, 4, 4, 5, 1]
unique_list = []
seen = set()

for item in my_list:
    if item not in seen:
        unique_list.append(item)
        seen.add(item)

print(f"Original list: {my_list}")
print(f"List with duplicates removed (order preserved): {unique_list}")

This approach iterates through the list. The seen set efficiently checks if an element has already been encountered. Only unique elements are appended to unique_list.

Method 3: List Comprehension (for concise code)

For those who prefer compact code, list comprehension offers a more elegant solution, achieving the same result as Method 2:

my_list = [1, 2, 2, 3, 4, 4, 5, 1]
unique_list = [item for i, item in enumerate(my_list) if item not in my_list[:i]]

print(f"Original list: {my_list}")
print(f"List with duplicates removed (order preserved): {unique_list}")

This method cleverly uses list comprehension and slicing (my_list[:i]) to check for duplicates up to the current element’s index. It’s concise but might be slightly less efficient than Method 2 for extremely large lists.

Method 4: Using dict.fromkeys() (Pythonic approach)

Leveraging dict.fromkeys() provides a somewhat less intuitive but Pythonic way to achieve the same result, albeit sacrificing order:

my_list = [1, 2, 2, 3, 4, 4, 5, 1]
unique_list = list(dict.fromkeys(my_list))

print(f"Original list: {my_list}")
print(f"List with duplicates removed (order might change): {unique_list}")

This method uses the fact that dictionary keys are unique. The order is not guaranteed here, much like Method 1.

Each method offers a different trade-off between efficiency, code readability, and order preservation. Choose the method best suited for your specific needs and context.