Insertion sort is an elementary sorting algorithm. It has a time complexity of Θ(n2), thus being slower than heapsort, merge sort and also shellsort. Insertion sort is well suited for sorting small data sets or for the insertion of new elements into a sorted sequence.
Let a0, ..., an-1 be the sequence to be sorted. At the beginning and after each iteration of the algorithm the sequence consists of two parts: the first part a0, ..., ai-1 is already sorted, the second part ai, ..., an-1 is still unsorted (i ∈ 0, ..., n).
In order to insert element ai into the sorted part, it is compared with ai-1, ai-2 etc. When an element aj with aj ≤ ai is found, ai is inserted behind it. If no such element is found, then ai is inserted at the beginning of the sequence.
After inserting element ai the length of the sorted part has increased by one. In the next iteration, ai+1 is inserted into the sorted part etc. While at the beginning the sorted part consists of element a0 only, at the end it consists of all elements a0, ..., an-1.
Example: The following table shows the steps for sorting the sequence 5 7 0 3 4 2 6 1. On the left side the sorted part of the sequence is shown in red. For each iteration, the number of positions the inserted element has moved is shown in brackets. Altogether this amounts to 17 steps.
5 | 7 | 0 | 3 | 4 | 2 | 6 | 1 | (0) | |
5 | 7 | 0 | 3 | 4 | 2 | 6 | 1 | (0) | |
0 | 5 | 7 | 3 | 4 | 2 | 6 | 1 | (2) | |
0 | 3 | 5 | 7 | 4 | 2 | 6 | 1 | (2) | |
0 | 3 | 4 | 5 | 7 | 2 | 6 | 1 | (2) | |
0 | 2 | 3 | 4 | 5 | 7 | 6 | 1 | (4) | |
0 | 2 | 3 | 4 | 5 | 6 | 7 | 1 | (1) | |
0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 | (6) |
The following function insertionsort sorts an integer array a[0], ..., a[n-1].
The sorting function is encapsulated in a class InsertionSorter. The method sort passes the array to be sorted to an array a, sets n to its length and calls insertionsort.
The statement to sort an array b with insertion sort is
InsertionSorter.sort(b);
The program follows.
The worst case occurs when in every step the proper position for the element that is inserted is found at the beginning of the sorted part of the sequence. I.e. in the while-loop sequences of length 1, 2, 3, ..., n-1 are scanned. Altogether, these are (n-1)·n / 2 ∈ Θ(n2) operations. This case occurs when the original sequence is sorted in decreasing order.
It is possible to find the inserting position of element ai faster, namely by binary search. However, moving the elements to the right in order to make room for the element to be inserted takes linear time anyway.
The exact number of steps required by insertion sort is given by the number of inversions of the sequence.
Definition: Let a = a0, ..., an-1 be a finite sequence. An inversion is a pair of index positions, where the elements of the sequence are out of order. Formally, an inversion is a pair (i, j), where i < j and ai > aj.1)
Example: Let a = 5, 7, 4, 9, 7. Then, (0, 2) is an inversion, since a0 > a2, namely 5 > 4. Also, (1, 2) is an inversion, since 7 > 4, and (3, 4) is an inversion, since 9 > 7. There are no other inversions in this sequence.
We now count the inversions (i, j) of a sequence a separately for each position j. The resulting value of vj gives the number of elements ai to the left of aj which are greater than aj.
For instance, for the sequence a = 5, 7, 4, 9, 7 we have v2 = 2, since the two inversions (0, 2) and (1, 2) have 2 as their second component. Here, 5 and 7 are greater than 4.
Definition: The inversion sequence v = v0, ..., vn-1 of a sequence a = a0, ..., an-1 is defined by
vj = |{ (i, j) | i < j ∧ ai > aj }|
for j = 0, ..., n-1.
Example: The above sequence a = 5, 7, 4, 9, 7 has the inversion sequence v = 0, 0, 2, 0, 1.
Obviously, we have vi ≤ i for all i = 0, ..., n-1. If all vi are 0, then and only then the sequence a is sorted. If the sequence a is a permutation, then it is uniquely determined by its inversion sequence v. The permutation n-1, ..., 0 has inversion sequence 0, ..., n-1.
Theorem: Let a = a0, ..., an-1 be a sequence and v = v0, ..., vn-1 be its inversion sequence. Then the number of steps T(a) required by insertion sort to sort sequence a is
T(a) = i = 0, ..., n-1 vi
Proof: Obviously, insertion sort takes vi steps in each iteration i. Thus, the overall number of steps is equal to the sum of all vi.
Example: The following table shows the sequence a of the first example and its corresponding inversion sequence. For instance, v5 = 4 since 4 elements to the left of a5 = 2 are greater than 2 (namely 5, 7, 3 and 4). The total number of inversions is 17.
Thus, insertion sort requires 17 steps to sort this sequence (see first example).
[Knu 73] D.E. Knuth: The Art of Computer Programming, Vol. 3 - Sorting and Searching. Addison-Wesley (1973)
[Sed 88] R. Sedgewick: Algorithms. 2nd edition, Addison-Wesley (1988)
[Web 1] http://www.bluffton.edu/~nesterd/java/SortingDemo.html Simulation of insertion sort and other sorting algorithms
1) If the sequence a is a permutation, the pair (ai, aj) corresponding to an inversion (i, j) may be called an inversion, instead of the index pair (i, j) [Knu 73].
Continue with: [Quicksort] or [up]