Let’s go straight to the point. Heaps are not the most commonly known data structure out there. They have very specific properties with narrow use cases. But once I learned about them, I see them more and more useful.

When you need to get min/max from a collection of items, the first thing that comes to mind is to get it sorted and then take the first/last element. But you don’t need to keep everything sorted, you just need to keep the min/max at the top. This is where heaps come in handy. Especially if the collection is dynamic.

I am not going to explain how this datastructure works in detail, it is done well in a lot of places. Just know that heaps are a binary tree with the following properties, comparing to a sorted array:

Operation Sorted Array Heap
Insert O(n) O(log n)
Delete O(n) O(log n)
Get Min/Max O(1) O(1)
Extract Min/Max (1) O(n) O(log n)
Sort/Heapify (2) O(n log n) O(log n)

(1) Extracting min/max is not the same as getting it. It removes the element from the collection. (2) Heapify is a process of converting an unsorted array into a heap.

container/heap#

Golang’s stdlib has a container/heap package that does the heavy algorithmic lifting. The interface is simple.

type Interface interface {
	sort.Interface
	Push(x any) // add x as element Len()
	Pop() any   // remove and return element Len() - 1.
}

Don’t let the sort.Interface confuse you. It won’t sort anything - it just needs the same functions: Len(), Less(i, j int) bool, and Swap(i, j int). You can play around the the basic implementation in the official documentation. For testing I am going use the IntHeap from the example in the documentation.

type IntHeap []int

func (h IntHeap) Len() int           { return len(h) }
func (h IntHeap) Less(i, j int) bool { return h[i] < h[j] }
func (h IntHeap) Swap(i, j int)      { h[i], h[j] = h[j], h[i] }

func (h *IntHeap) Push(x any) {
	// Push and Pop use pointer receivers because they modify the slice's length,
	// not just its contents.
	*h = append(*h, x.(int))
}

func (h *IntHeap) Pop() any {
	old := *h
	n := len(old)
	x := old[n-1]
	*h = old[0 : n-1]
	return x
}

sorted array based heap#

Just for performance comparison, let’s implement a heap using a sorted array. I am going to keep the same Pop and Push interface as the container/heap package, but the implementation will be different. Instead of maintaining a binary tree, we will keep an array sorted at all times.

type IntHeapSortedArray []int

func (h IntHeapSortedArray) Len() int { return len(h) }
func (h *IntHeapSortedArray) Init() {
	sort.Ints(*h) // Ensure the array is sorted
}

func (h *IntHeapSortedArray) Push(x any) {
	*h = append(*h, x.(int))
	sort.Ints(*h)
}
func (h *IntHeapSortedArray) Pop() any {
	if len(*h) == 0 {
		panic("heap is empty")
	}
	x := (*h)[0]
	*h = (*h)[1:]
	return x
}

benchmarking comparison#

Consider the following benchmark. We have 2 heap implementations: one using container/heap and another using a sorted array. We will push and pop items from both heaps and compare their performance. We are going to use different item counts to see how they scale.

func BenchmarkHeaps(b *testing.B) {
	h1 := &IntHeap{}
	h3 := &IntHeapSortedArray{}

	heaps := []struct {
		name string
		h    SlimHeap
	}{
		{"container/heap", h1},
		{"sorted array", h3},
	}

	for _, heap := range heaps {
		for itemCount := 10; itemCount < 1001; itemCount *= 10 {
			b.Run(fmt.Sprintf("%s/itemCount=%d", heap.name, itemCount), func(b *testing.B) {
				for i := 0; i < b.N; i++ {
					for j := 0; j < itemCount; j++ {
						heap.h.Push(j * rand.Intn(1000))
					}

					for j := 0; j < itemCount; j++ {
						heap.h.Pop()
					}
					assert.Equal(b, 0, heap.h.Len())
				}
			})
		}
	}
}

While the itemCount size grows exponentially (10, 100, 1000), the benchmark execution time grows linearly for the container/heap implementation (498, 3087, 27326), while the sorted array implementation grows exponentially (635.8, 13085, 559916).

BenchmarkHeaps/container/heap/itemCount=10-8         	 2493511	       498.0 ns/op	     132 B/op	      16 allocs/op
BenchmarkHeaps/container/heap/itemCount=100-8        	  344109	      3087 ns/op	    1562 B/op	     195 allocs/op
BenchmarkHeaps/container/heap/itemCount=1000-8       	   43798	     27326 ns/op	   15944 B/op	    1993 allocs/op
BenchmarkHeaps/sorted_array/itemCount=10-8           	 1881627	       635.8 ns/op	     292 B/op	      18 allocs/op
BenchmarkHeaps/sorted_array/itemCount=100-8          	   92127	     13085 ns/op	    3098 B/op	     196 allocs/op
BenchmarkHeaps/sorted_array/itemCount=1000-8         	    2151	    559916 ns/op	   33864 B/op	    1996 allocs/op

One day I am going to learn how to better visualize benchmark results, but for now this needs to do