Using merge sort to solve ‘Counting Inversions Problem’

Describe: Given a list contains n numbers, count all inversions in that list.
Example: [4, 2, 3, 5, 1]
There are 6 inversions
4-2
4-3
4-1
2-1
3-1
5-1

The easiest way is using two for-loops, but the complexity is O(n^2), which is unacceptable with a list of large amount of numbers.
There is more efficient and faster way is using merge sort. If you still haven’t know merge sort yet, please check out this. Using merge sort, we can simultaneously count the number of inversions and sort data. The complexity of this method is equal with complexity of merge sort itself.

Here’s the pseudocode.

screen-shot-2017-01-03-at-7-40-03-pm

Example: [4, 2, 3, 5, 1]

*LEFT = [4,2]
-left = 4
-right = 2
-merge([4], [2]) = [2,4] : 1 inversion (2 jumps over 4)
*RIGHT = [3,5,1]
-left = 3
-right = [5,1]
–left = 5
–right = 1
–merge([5], [1]) = [1,5] : 1 inversion (1 jumps over 5)
-merge([3], [1,5]) = [1,3,5] : 1 inversion (1 jumps over 3)
*MERGE([2,4], [1,3,5]) = [1,2,3,4,5]
res = []
step1:
right[0] res = [1], left = [2,4], right = [3,5]
step2:
left[0] res = [1,2], left = [4], right = [3,5]
step3:
right[0] res = [1,2,3], left=[4], right=[5]
There will be no more inversion since [4] < [5]
Total of inversions: 1+1+1+2+1 = 6

Implementation: Python2.7

"""
    Using merge sort algorithm to count inversions in a list of data
    @author: loctv
    @time: 05:51PM January 3, 2017 (+7)
"""

def sort_and_count(data):
    """
        repeatedly divide data into two parts
        return (number_of_inversions, sorted_data)
    """
    if len(data) == 1: #data contains only 1 element
        return (0, data)
    #divide data into two parts
    left_side = data[:len(data)//2]
    right_side = data[len(data)//2:]
    #recursive call on two parts
    print left_side, right_side
    left_count, left_side = sort_and_count(left_side)
    right_count, right_side = sort_and_count(right_side)
    total_count, sorted_data = merge_and_count(left_side, right_side)
    return total_count + left_count + right_count, sorted_data

def merge_and_count(left, right):
    """
        merge left and right
        count the number of inversions while merging two sides
    """
    result, count_inversion = [], 0
    while len(left) > 0 and len(right) > 0:
        if left[0] < right[0]:
            result.append(left[0])
            del left[0]
        else: #inversions appear
            result.append(right[0])
            #number of inversion = the number of element in left side
            count_inversion += len(left)
            del right[0]
    #append the rest elements in either left or right side into result
    while len(left) > 0:
        result.append(left[0])
        del left[0]
    while len(right) > 0:
        result.append(right[0])
        del right[0]
    return count_inversion, result

if __name__ == '__main__':
    DATA = [4, 2, 3, 5, 1]
    INVERSIONS, SORTED_DATA = sort_and_count(DATA)
    print INVERSIONS, SORTED_DATA

Advertisements