Heap sort stands as a powerful and efficient sorting algorithm, boasting a time complexity of O(n log n) in all cases – a significant advantage over algorithms like bubble sort or insertion sort, which can degrade to O(n²) in worst-case scenarios. This makes it a preferred choice for large datasets where performance is critical. This blog post will look at the complexities of heap sort, providing a clear explanation and practical Python implementation.
Understanding the Heap Data Structure
Before diving into the algorithm itself, it’s important to grasp the concept of a heap. A heap is a specialized tree-based data structure that satisfies the heap property:
- Max-Heap: The value of each node is greater than or equal to the value of its children. The root node contains the largest element.
- Min-Heap: The value of each node is less than or equal to the value of its children. The root node contains the smallest element.
We’ll focus on max-heap for our heap sort implementation, as it simplifies the process of finding the largest element. Heaps are often represented as arrays for efficient memory management and access.
Building the Heap
The first step in heap sort involves transforming the input array into a max-heap. This is achieved using a process called heapify. The heapify
function recursively ensures that the heap property is maintained for every subtree within the array.
def heapify(arr, n, i):
= i # Initialize largest as root
largest = 2 * i + 1 # left = 2*i + 1
l = 2 * i + 2 # right = 2*i + 2
r
# See if left child of root exists and is greater than root
if l < n and arr[l] > arr[largest]:
= l
largest
# See if right child of root exists and is greater than root
if r < n and arr[r] > arr[largest]:
= r
largest
# Change root, if needed
if largest != i:
= arr[largest], arr[i] # swap
arr[i], arr[largest]
# Heapify the root.
heapify(arr, n, largest)
The heapify
function takes the array, its size, and the index of the element to heapify as input. It recursively ensures the subtree rooted at that index satisfies the max-heap property.
Implementing the Heap Sort Algorithm
With the heapify
function in place, we can now implement the complete heap sort algorithm:
def heap_sort(arr):
= len(arr)
n
# Build a maxheap.
for i in range(n // 2 - 1, -1, -1):
heapify(arr, n, i)
# One by one extract an element from heap
for i in range(n - 1, 0, -1):
0] = arr[0], arr[i] # swap
arr[i], arr[0)
heapify(arr, i,
#Example Usage
= [12, 11, 13, 5, 6, 7]
arr
heap_sort(arr)print("Sorted array is:", arr)
The algorithm first builds a max-heap from the input array. Then, it iteratively extracts the maximum element (the root) and places it at the end of the array. The remaining array is then heapified again, repeating the process until the entire array is sorted.
Analyzing the Time Complexity
As mentioned earlier, heap sort has a time complexity of O(n log n) for all cases (best, average, and worst). The building of the heap takes O(n) time, and extracting the maximum element and heapifying the remaining array takes O(log n) time for each of the n elements.
Space Complexity
Heap sort is an in-place algorithm, meaning it sorts the array without requiring significant extra space. Its space complexity is therefore O(1). This makes it memory-efficient, especially when dealing with large datasets.