Sorting is a quintessential computer science task. Given a list or array of elements that can be ordered (e.g., numbers, ages, salaries, scores) we desire to reorder them, into either increasing or decreasing order. Sorting is used in many branches of computer science, from information retrieval to machine learning.

How does merge sort work?

Merge sort is an example of a “divide and conquer” algorithm. This means the given array is divided into smaller sub-arrays each of which is sorted and then the sorted sub-arrays are merged together (hence the name “merge sort”).

Below is a simple illustration of how merge sort works. We are given the array [35, 67, 12, 95, 21] and desire to sort it in increasing order, so that 12 becomes the first element and 95 becomes the last element. In other words, the array should become [12, 21, 35, 67, 95]. We will show how merge sort works by showing the first few steps.

In the above picture, red lines refer to divide steps where an array is broken down. Green lines refer to merge steps where two sorted arrays are brought back together.

First note that the array has 5 elements so we break it into approximately two equal pieces: the sub-array [35, 67, 12] and then the sub-array [95, 21]. Each of these sub-arrays is further divided until eventually we get five single arrays [35], [67], [12], [95], and [21]. Note that each of these final sub-arrays are sorted (limiting case, because they each contain only one element).

Next, the merge phase of “merge sort” begins. Given two sorted arrays we aim to merge them so that the relative positions of numbers in each array are maintained and the overall array is sorted. For instance, [35] and [67] are merged to yield [35,67]. Then [35,67] and [12] are merged to yield [12,35,67], and so on.

How efficient is merge sort?

Compared to insertion sort and bubble sort, merge sort is more efficient and has O(n*log(n)) worst case time complexity, meaning as the size of the array increases the time taken to sort it increases not quadratically like insertion sort or bubble sort but slower than that. This means that as the size of the array to be sorted gets larger and larger, the advantages of merge sort become more and more apparent.

What are the advantages of merge sort?

Merge sort has many advantages. First, because of the divide-and-conquer strategy, merge sort can be easily parallelized, meaning the smaller arrays can be sorted by different computers/processing units in parallel, and hence this is a very practical approach to sorting very large arrays (imagine arrays of millions of numbers).

In fact, even the merging part can be parallelized. This also makes merge sort a basis to sort arrays in external memory (i.e., arrays that will not fit into main memory).

A second feature of merge sort is that, not only is its worst-case time complexity O(n*log(n)), its average-case time complexity (and best-case time complexity) is also O(n*log(n)).

What are the disadvantages of mergesort?

Because of the way the algorithm is structured, you will need a temporary array to merge the two sorted arrays in each step. As a result, merge sort is not memory efficient like bubble sort or insertion sort.

How can we code merge sort in Python?

We will first write code to divide an array into two approximately equal-sized parts. Next, we will write code to merge two sorted arrays (we will assume that the inputs are already given in sorted order). Finally we will write the overall mergesort routine to breakup given arrays, call merge sort recursively on each sub-array, and then use the merge function just written to put them back together.

First, here is Python code to divide the array

Note that we are using the Python Math library to obtain access to the ceiling function. Given an array length of 3, this routine will make the left array to have 2 elements and the right array to have 1 element.

Next, we will write the routine to merge two arrays. Remember that we are assuming that the arrays are (individually) sorted already.

This routine has several interesting nuances. First, we are creating a new array of just the right size to hold the elements of both the left and right array. Then we are having a system of counters to progressively go down all three arrays: the left array, the right array, and the combined array. At each step we are checking to see which element is smaller and adding it to the result, and suitably advancing the counter. If one array is exhausted we bring in elements from the other array.

Finally, here is the mergesort routine itself:

Note that we divide the array and then call mergesort recursively on each and then merge the results. If the array is small enough (empty or just one element), we conclude that it is already sorted and thus return it.

You can invoke it as:

This yields, as expected:

In this blogpost we have learnt about the merge sort algorithm and how to code it in Python. Learn about a different sorting algorithm in this blog post about insertion sort. Once you have mastered merge sort, learn more about the state-of-the-art quicksort algorithm. Also learn about Python's enumerate() functionality.