Insertion Sort
Insertion sort is a sorting algorithm that builds a final sorted array (sometimes called a list) one element at a time. Although sorting itself is a simple concept, it is a basic building block of more complex programs such as file search, data compression, and pathfinding. Running time is an important consideration when selecting a sorting algorithm, since efficiency is often measured in terms of speed. Insertion sort has an average and worst-case running time of $O(n^2)$, so in most cases a faster algorithm is more desirable.
The Sorting Problem
An algorithm that maps the following input/output pair is called a sorting algorithm:
- Input: An array $A$ that contains $n$ orderable elements $A[0], A[1], \ldots, A[n-1]$.
- Output: A sorted permutation of $A$, called $B$, such that $B[0] \leq B[1] \leq \cdots \leq B[n-1].$ $_\square$
Here is what it means for an array to be sorted.
An array $\langle a_0, a_1, \ldots, a_{n-1} \rangle$ is sorted if and only if for all $i<j$, $a_i \leq a_j.$
In other words, a sorted array is an array that is in a particular order. For example, $[a,b,c,d]$ is sorted alphabetically, $[1,2,3,4,5]$ is a list of integers sorted in increasing order, and $[5,4,3,2,1]$ is a list of integers sorted in decreasing order.
By convention, empty arrays and singleton arrays (arrays consisting of only one element) are always sorted. This is a key point for the base case of many sorting algorithms.
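This definition translates directly into code. The sketch below (the helper name `is_sorted` is our own, not a standard-library function) only needs to check adjacent pairs, which suffices by transitivity of $\leq$:

```python
def is_sorted(a):
    # An array is sorted iff every adjacent pair is in order;
    # a[i] <= a[j] for all i < j then follows by transitivity.
    return all(a[k] <= a[k + 1] for k in range(len(a) - 1))

print(is_sorted([]))                # True: empty arrays are sorted by convention
print(is_sorted([7]))               # True: singleton arrays are always sorted
print(is_sorted([1, 2, 3, 4, 5]))   # True
print(is_sorted([5, 4, 3, 2, 1]))   # False (sorted in *decreasing* order)
```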
Insertion Sort
The insertion sort algorithm iterates through an input array, and on each iteration it takes the next unsorted element, finds the position where it belongs among the elements already sorted, and places it there. This process grows a sorted list from left to right. The algorithm is as follows:
For each element $A[i]$, while $A[i-1] > A[i]$, swap $A[i]$ with $A[i-1]$, stopping when $A[i-1] \leq A[i]$ or $A[i]$ reaches the front of the array.
Sort the following array using insertion sort.
$A = [8,2,4,9,3,6]$^{[1]}
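A pass-by-pass trace of this example can be sketched in Python; the swap-based helper below and its trace format are our own illustration, not part of the original exercise:

```python
def insertion_sort_trace(a):
    a = list(a)  # work on a copy
    for i in range(1, len(a)):
        j = i
        # Swap a[j] leftward until the element to its left is not larger.
        while j > 0 and a[j - 1] > a[j]:
            a[j - 1], a[j] = a[j], a[j - 1]
            j -= 1
        print(a)  # array after inserting the i-th element
    return a

insertion_sort_trace([8, 2, 4, 9, 3, 6])
# [2, 8, 4, 9, 3, 6]
# [2, 4, 8, 9, 3, 6]
# [2, 4, 8, 9, 3, 6]
# [2, 3, 4, 8, 9, 6]
# [2, 3, 4, 6, 8, 9]
```

Each printed line shows the array after one more element has been inserted into the sorted prefix; note that inserting 9 changes nothing because it is already in place.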
Another way to visualize insertion sort is to think of a stack of playing cards.
You have the cards 3, 9, 6, 5, and 7. Write an algorithm or pseudo-code that arranges the values of the cards in ascending order.
The idea is to run the insertion step $n-1$ times, finding the index at which each element should be inserted.
Here is some pseudocode.^{[2]}
```
for i = 1 to length(A) - 1
    x = A[i]
    j = i - 1
    while j >= 0 and A[j] > x
        A[j+1] = A[j]
        j = j - 1
    end while
    A[j+1] = x
end for
```
The sorted array is [3, 5, 6, 7, 9]. $_\square$
Insertion Sort Implementation
Here is one way to implement insertion sort in Python. There are other ways to implement the algorithm, but all implementations stem from the same ideas. Insertion sort can sort any orderable list.
```python
def insertion_sort(array):
    for slot in range(1, len(array)):
        value = array[slot]
        test_slot = slot - 1
        while test_slot > -1 and array[test_slot] > value:
            array[test_slot + 1] = array[test_slot]
            test_slot = test_slot - 1
        array[test_slot + 1] = value
    return array
```
The above Python code sorts a list in increasing order. What single change could you make to have insertion sort sort in decreasing order?

Flip the second greater-than sign in line 5 to a less-than sign, so that line 5 reads:

```python
while test_slot > -1 and array[test_slot] < value:  # flip this sign
```
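Putting that one-sign change into the full function gives a descending sort; this is a sketch, and the name `insertion_sort_desc` is ours:

```python
def insertion_sort_desc(array):
    for slot in range(1, len(array)):
        value = array[slot]
        test_slot = slot - 1
        # Shift while the tested element is SMALLER than value,
        # so larger values end up toward the front.
        while test_slot > -1 and array[test_slot] < value:
            array[test_slot + 1] = array[test_slot]
            test_slot = test_slot - 1
        array[test_slot + 1] = value
    return array

print(insertion_sort_desc([8, 2, 4, 9, 3, 6]))  # [9, 8, 6, 4, 3, 2]
```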
Complexity of Insertion Sort
Insertion sort runs in $O(n)$ time in its best case and runs in $O(n^2)$ in its worst and average cases.
Best Case Analysis:
Insertion sort performs two operations: it scans through the list, comparing each pair of elements, and it swaps elements if they are out of order. Each operation contributes to the running time of the algorithm. If the input array is already in sorted order, insertion sort compares $O(n)$ elements and performs no swaps (in the Python code above, the inner loop is never triggered). Therefore, in the best case, insertion sort runs in $O(n)$ time.
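One way to check this claim is to instrument the sort with counters; in the sketch below (the counter names are ours), the comparison and shift counts are tallied explicitly:

```python
def insertion_sort_counted(array):
    comparisons = shifts = 0
    for slot in range(1, len(array)):
        value = array[slot]
        test_slot = slot - 1
        while test_slot > -1:
            comparisons += 1
            if array[test_slot] > value:
                array[test_slot + 1] = array[test_slot]
                shifts += 1
                test_slot -= 1
            else:
                break  # found the insertion point
        array[test_slot + 1] = value
    return comparisons, shifts

print(insertion_sort_counted([1, 2, 3, 4, 5]))  # sorted input:   (4, 0)
print(insertion_sort_counted([5, 4, 3, 2, 1]))  # reversed input: (10, 10)
```

On sorted input of length $n$ the inner loop tests each of the $n-1$ elements exactly once and never shifts, matching the $O(n)$ best case; on reversed input both counts reach $n(n-1)/2$.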
Worst and Average Case Analysis:
The worst case for insertion sort occurs when the input list is in decreasing order. To insert the last element, we need at most $n-1$ comparisons and at most $n-1$ swaps. To insert the second-to-last element, we need at most $n-2$ comparisons and at most $n-2$ swaps, and so on.^{[3]} The number of operations needed to perform insertion sort is therefore $2 \times (1+2+ \dots +(n-2)+(n-1))$. To put this sum in closed form, use the following summation:
$\sum_{q=1}^{p} q = \frac{p(p+1)}{2}.$
It follows that
$\frac{2(n-1)(n-1+1)}{2}=n(n-1).$
Dropping lower-order terms and constant factors, the algorithm's complexity is $O(n^2),$ as expected. When analyzing algorithms, the average case often has the same complexity as the worst case, and that holds here: on average, insertion sort takes $O(n^2)$ time.
Insertion sort has a fast best-case running time and is a good sorting algorithm to use if the input list is already mostly sorted. For larger or more unordered lists, an algorithm with a faster worst- and average-case running time, such as mergesort, would be a better choice.
Insertion sort is a stable sort with a space complexity of $O(1)$.
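Stability means equal elements keep their original relative order. A key-based variant makes this visible (a sketch; the function name and `key` parameter are our additions): because the inner loop uses a strict $>$ on the key, records with equal keys are never shifted past one another:

```python
def insertion_sort_by_key(array, key):
    array = list(array)  # work on a copy
    for slot in range(1, len(array)):
        value = array[slot]
        test_slot = slot - 1
        # Strict > on the key: equal keys are never shifted past
        # each other, which is exactly what makes the sort stable.
        while test_slot > -1 and key(array[test_slot]) > key(value):
            array[test_slot + 1] = array[test_slot]
            test_slot -= 1
        array[test_slot + 1] = value
    return array

records = [(2, "a"), (1, "x"), (2, "b"), (1, "y")]
print(insertion_sort_by_key(records, key=lambda r: r[0]))
# [(1, 'x'), (1, 'y'), (2, 'a'), (2, 'b')]  -- 'a' stays before 'b'
```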
References
- Devadas, S. 6.006 Introduction to Algorithms: Lecture 3, Fall 2011. Retrieved March 24, 2016, from http://ocw.mit.edu/courses/electrical-engineering-and-computer-science/6-006-introduction-to-algorithms-fall-2011/lecture-videos/MIT6_006F11_lec03.pdf
- Cormen, T., Leiserson, C., Rivest, R., & Stein, C. (2001). Introduction to Algorithms (2nd edition) (pp. 15-21). The MIT Press.
- Sinapova, L. Sorting Algorithms: Insertion Sort. Retrieved March 24, 2016, from http://faculty.simpson.edu/lydia.sinapova/www/cmsc250/LN250_Weiss/L11-InsSort.htm