Heap Sort

Heap sort is a comparison-based sorting algorithm that uses a binary heap data structure. It has guaranteed $O(n \log n)$ time complexity and sorts in-place.

Algorithm Overview

  1. Build Max Heap: Convert array into max heap
  2. Extract Max: Repeatedly remove max element and rebuild heap

Key Properties

  • Time: $O(n \log n)$ for all cases
  • Space: $O(1)$ (in-place)
  • Not stable: Equal elements may change relative order
  • Not adaptive: Doesn't benefit from partially sorted data

Heap Structure

A max heap is a complete binary tree where each parent β‰₯ its children.

Array Representation

For node at index $i$:

  • Parent: $\lfloor (i-1)/2 \rfloor$
  • Left child: $2i + 1$
  • Right child: $2i + 2$

Visualization

Array: [90, 70, 60, 50, 40, 30, 20]

Heapify Operation

Maintain heap property by "bubbling down" a node.

 1// Heapify subtree rooted at index i
 2// n is size of heap
 3func heapify(arr []int, n, i int) {
 4    largest := i
 5    left := 2*i + 1
 6    right := 2*i + 2
 7    
 8    // If left child is larger than root
 9    if left < n && arr[left] > arr[largest] {
10        largest = left
11    }
12    
13    // If right child is larger than largest so far
14    if right < n && arr[right] > arr[largest] {
15        largest = right
16    }
17    
18    // If largest is not root
19    if largest != i {
20        arr[i], arr[largest] = arr[largest], arr[i]
21        
22        // Recursively heapify the affected subtree
23        heapify(arr, n, largest)
24    }
25}

Time: $O(\log n)$ - height of heap

Iterative Heapify

 1func heapifyIterative(arr []int, n, i int) {
 2    for {
 3        largest := i
 4        left := 2*i + 1
 5        right := 2*i + 2
 6        
 7        if left < n && arr[left] > arr[largest] {
 8            largest = left
 9        }
10        
11        if right < n && arr[right] > arr[largest] {
12            largest = right
13        }
14        
15        if largest == i {
16            break
17        }
18        
19        arr[i], arr[largest] = arr[largest], arr[i]
20        i = largest
21    }
22}

Build Max Heap

Convert unsorted array into max heap.

1func buildMaxHeap(arr []int) {
2    n := len(arr)
3    
4    // Start from last non-leaf node and heapify all nodes
5    // Last non-leaf node is at index (n/2 - 1)
6    for i := n/2 - 1; i >= 0; i-- {
7        heapify(arr, n, i)
8    }
9}

Time: $O(n)$ - not $O(n \log n)$!

Proof: Most nodes are near bottom (few swaps), fewer nodes near top (many swaps).

$$ T(n) = \sum_{h=0}^{\log n} \frac{n}{2^{h+1}} \times h = O(n) $$

Complete Heap Sort

 1func HeapSort(arr []int) {
 2    n := len(arr)
 3    
 4    // Step 1: Build max heap
 5    buildMaxHeap(arr)
 6    
 7    // Step 2: Extract elements one by one
 8    for i := n - 1; i > 0; i-- {
 9        // Move current root (max) to end
10        arr[0], arr[i] = arr[i], arr[0]
11        
12        // Heapify reduced heap
13        heapify(arr, i, 0)
14    }
15}
16
17// Example
18func main() {
19    arr := []int{12, 11, 13, 5, 6, 7}
20    fmt.Println("Before:", arr)
21    HeapSort(arr)
22    fmt.Println("After:", arr)
23    // Output: [5, 6, 7, 11, 12, 13]
24}

Step-by-Step Example

Input: [4, 10, 3, 5, 1]

Step 1: Build Max Heap

1Initial: [4, 10, 3, 5, 1]
2
3Heapify index 1: [4, 10, 3, 5, 1] (10 > 5, 1 - no change)
4Heapify index 0: [10, 5, 3, 4, 1] (swap 4 and 10, then 4 and 5)
5
6Max Heap: [10, 5, 3, 4, 1]

Step 2: Extract Max

 1[10, 5, 3, 4, 1] β†’ Swap 10 and 1 β†’ [1, 5, 3, 4 | 10]
 2Heapify: [5, 4, 3, 1 | 10]
 3
 4[5, 4, 3, 1 | 10] β†’ Swap 5 and 1 β†’ [1, 4, 3 | 5, 10]
 5Heapify: [4, 1, 3 | 5, 10]
 6
 7[4, 1, 3 | 5, 10] β†’ Swap 4 and 3 β†’ [3, 1 | 4, 5, 10]
 8Heapify: [3, 1 | 4, 5, 10]
 9
10[3, 1 | 4, 5, 10] β†’ Swap 3 and 1 β†’ [1 | 3, 4, 5, 10]
11
12Final: [1, 3, 4, 5, 10]

Complexity Analysis

Time Complexity

Build Heap: $O(n)$
Extract Max (n times): $n \times O(\log n) = O(n \log n)$

Total: $O(n + n \log n) = O(n \log n)$

All cases: Best, Average, Worst = $O(n \log n)$

Space Complexity

In-place: $O(1)$ auxiliary space
Recursion stack: $O(\log n)$ for recursive heapify

Iterative version: $O(1)$ total space

Min Heap Sort (Descending Order)

For descending order, use min heap:

 1func heapifyMin(arr []int, n, i int) {
 2    smallest := i
 3    left := 2*i + 1
 4    right := 2*i + 2
 5    
 6    if left < n && arr[left] < arr[smallest] {
 7        smallest = left
 8    }
 9    
10    if right < n && arr[right] < arr[smallest] {
11        smallest = right
12    }
13    
14    if smallest != i {
15        arr[i], arr[smallest] = arr[smallest], arr[i]
16        heapifyMin(arr, n, smallest)
17    }
18}
19
20func HeapSortDescending(arr []int) {
21    n := len(arr)
22    
23    // Build min heap
24    for i := n/2 - 1; i >= 0; i-- {
25        heapifyMin(arr, n, i)
26    }
27    
28    // Extract min elements
29    for i := n - 1; i > 0; i-- {
30        arr[0], arr[i] = arr[i], arr[0]
31        heapifyMin(arr, i, 0)
32    }
33}

Optimizations

1. Bottom-Up Heapify (Floyd's Method)

More efficient heap construction:

 1func buildMaxHeapFloyd(arr []int) {
 2    n := len(arr)
 3    
 4    // Build heap from bottom up
 5    for i := n/2 - 1; i >= 0; i-- {
 6        // Sift down
 7        child := 2*i + 1
 8        for child < n {
 9            // Find larger child
10            if child+1 < n && arr[child+1] > arr[child] {
11                child++
12            }
13            
14            // If parent is larger, done
15            if arr[i] >= arr[child] {
16                break
17            }
18            
19            arr[i], arr[child] = arr[child], arr[i]
20            i = child
21            child = 2*i + 1
22        }
23    }
24}

2. Ternary Heap

Use 3 children per node instead of 2:

  • Fewer levels: $\log_3 n$ instead of $\log_2 n$
  • More comparisons per level: 3 instead of 2
  • Trade-off: Better cache locality vs. more comparisons

Comparison with Other Sorts

AlgorithmTime (Best)Time (Avg)Time (Worst)SpaceStableIn-Place
Heap Sort$O(n \log n)$$O(n \log n)$$O(n \log n)$$O(1)$NoYes
Quick Sort$O(n \log n)$$O(n \log n)$$O(n^2)$$O(\log n)$NoYes
Merge Sort$O(n \log n)$$O(n \log n)$$O(n \log n)$$O(n)$YesNo
Insertion Sort$O(n)$$O(n^2)$$O(n^2)$$O(1)$YesYes

Advantages

βœ… Guaranteed $O(n \log n)$: No worst-case degradation like quick sort
βœ… In-place: $O(1)$ extra space
βœ… No recursion needed: Can be fully iterative
βœ… Predictable: Performance is consistent

Disadvantages

❌ Not stable: Equal elements may change order
❌ Not adaptive: Doesn't benefit from partial sorting
❌ Cache unfriendly: Random memory access pattern
❌ Slower in practice: Higher constant factors than quick sort

Applications

1. Priority Queue

Heap sort is based on heap data structure, which is perfect for priority queues.

2. K Largest/Smallest Elements

 1func FindKLargest(arr []int, k int) []int {
 2    // Build max heap
 3    buildMaxHeap(arr)
 4    
 5    result := make([]int, k)
 6    n := len(arr)
 7    
 8    // Extract k largest
 9    for i := 0; i < k; i++ {
10        result[i] = arr[0]
11        arr[0], arr[n-1-i] = arr[n-1-i], arr[0]
12        heapify(arr, n-1-i, 0)
13    }
14    
15    return result
16}

Time: $O(n + k \log n)$

3. Median Maintenance

Use two heaps (max heap for lower half, min heap for upper half) to maintain running median.

4. External Sorting

Heap sort is useful for external sorting when data doesn't fit in memory.

When to Use Heap Sort

βœ… Use when:

  • Need guaranteed $O(n \log n)$ performance
  • Memory is limited (in-place sorting)
  • Worst-case performance matters
  • Implementing priority queue
  • Finding k largest/smallest elements

❌ Don't use when:

  • Stability is required (use merge sort)
  • Data is nearly sorted (use insertion sort or tim sort)
  • Average-case performance is critical (use quick sort)
  • Cache efficiency matters (use quick sort or merge sort)

Complete Implementation with Comments

 1package main
 2
 3import "fmt"
 4
 5// HeapSort sorts array in ascending order
 6func HeapSort(arr []int) {
 7    n := len(arr)
 8    
 9    // Build max heap
10    // Start from last non-leaf node
11    for i := n/2 - 1; i >= 0; i-- {
12        heapify(arr, n, i)
13    }
14    
15    // Extract elements from heap one by one
16    for i := n - 1; i > 0; i-- {
17        // Move current root (max) to end
18        arr[0], arr[i] = arr[i], arr[0]
19        
20        // Call heapify on reduced heap
21        heapify(arr, i, 0)
22    }
23}
24
25// heapify maintains max heap property
26// n is size of heap, i is index of root
27func heapify(arr []int, n, i int) {
28    largest := i       // Initialize largest as root
29    left := 2*i + 1    // Left child
30    right := 2*i + 2   // Right child
31    
32    // If left child exists and is greater than root
33    if left < n && arr[left] > arr[largest] {
34        largest = left
35    }
36    
37    // If right child exists and is greater than largest so far
38    if right < n && arr[right] > arr[largest] {
39        largest = right
40    }
41    
42    // If largest is not root
43    if largest != i {
44        arr[i], arr[largest] = arr[largest], arr[i]
45        
46        // Recursively heapify the affected subtree
47        heapify(arr, n, largest)
48    }
49}
50
51func main() {
52    arr := []int{12, 11, 13, 5, 6, 7, 3, 2, 9, 1}
53    fmt.Println("Original:", arr)
54    
55    HeapSort(arr)
56    fmt.Println("Sorted:", arr)
57    
58    // Output:
59    // Original: [12 11 13 5 6 7 3 2 9 1]
60    // Sorted: [1 2 3 5 6 7 9 11 12 13]
61}

Visualization of Heapify

Common Interview Problems

1. Kth Largest Element

Use heap sort or partial heap construction.

2. Sort Nearly Sorted Array

If array is k-sorted (each element is at most k positions away), use min heap of size k+1.

3. Merge K Sorted Arrays

Use min heap to efficiently merge k sorted arrays.

Related Snippets

  • Binary Search
    Binary search is an efficient algorithm for finding a target value in a sorted …
  • Binary Tree
    A binary tree is a tree data structure where each node has at most two children, …
  • General Tree
    A tree is a hierarchical data structure consisting of nodes connected by edges, …
  • Heap Data Structure
    A heap is a specialized tree-based data structure that satisfies the heap …
  • Linked List Operations
    Optimal techniques for common linked list operations using pointer manipulation, …
  • LRU and LFU Cache
    Cache replacement policies determine which items to evict when the cache is …
  • Merge Sort
    Merge sort is a stable, comparison-based sorting algorithm that uses the …
  • Quadtree
    A Quadtree is a tree data structure where each internal node has exactly four …
  • Quick Sort
    Quick sort is an efficient, in-place, comparison-based sorting algorithm that …
  • String Similarity Algorithms
    Algorithms for measuring similarity between strings, used in spell checking, DNA …
  • Terrain & Heightmap Generation
    Procedural terrain generation creates realistic landscapes using mathematical …
  • Trie (Prefix Tree)
    A Trie (pronounced &quot;try&quot;) is a tree-like data structure used to store …
  • Wave Function Collapse
    Wave Function Collapse (WFC) is a procedural generation algorithm that creates …