Breder.org

Merge sort

Merge sort is a divide-and-conquer algorithm for sorting which has a time complexity of O(n log n). It goes as follows given a sequence S of n elements:

1. Base case: If the sequence has a single element or no elements, it is sorted and we're done.

2. Divide step: Split the sequence in the middle, obtaining subsequences A and B.

3. Recursion step: Apply merge sort to each of the subsequences A and B, which will now be sorted.

4. Merge step: Take the smallest element between the sorted A and B subsequences by checking the next element of each from the start. If there are no more elements on either A or B, take all the remaining elements of the other. Repeat this step until a sorted sequence is built with all elements of A and B, which will be the sorted sequence S.

Functional implementation

# NOTE: assuming both "a" and "b" are sorted sequences.
# "c" acts as the accumulator for the resulting merged sequence.
def merge_sorted_sequences(a, b, c=[]):
    # if either sequence is empty, return the other (base case)
    if len(a) == 0: return c + b
    if len(b) == 0: return c + a

    # take the smallest element of either "a" or "b" and build
    # up the resulting sorted sequence "c"
    if a[0] <= b[0]:
        return merge_sorted_sequences(a[1:], b, c + [a[0]])
    else:
        return merge_sorted_sequences(a, b[1:], c + [b[0]])


def merge_sort(arr):
    if len(arr) <= 1: return arr         # base case

    m = len(arr) // 2                    # midpoint of "arr"
    p, q = arr[:m], arr[m:]              # divide step
    a, b = merge_sort(p), merge_sort(q)  # recursion step
    return merge_sorted_sequences(a, b)  # conquer step

Source