Tag Archives: Insertion sort

Computer Algorithms: Bucket Sort

Introduction

What’s the fastest way to sort the following sequence [9, 3, 0, 5, 4, 1, 2, 6, 8, 7]? Well, the question is a bit tricky since the input is somehow “predefined”. First of all we have only integers, and fortunately they are all different. That’s great and we know that in practice it’s almost impossible to count on such lucky coincidence. However here we can sort the sequence very quickly.

First of all we can pass through all these integers and by using an auxiliary array we can just put them at their corresponding index. We know in advance that that is going to work really well, because they are all different.

There is only one major problem in this solution. That’s because we assume all the integers are different. If not – we can just put all them in one single corresponding index.

That is why we can use bucket sort.

Overview

Bucket sort it’s the perfect sorting algorithm for the sequence above. We must know in advance that the integers are fairly well distributed over an interval (i, j). Then we can divide this interval in N equal sub-intervals (or buckets). We’ll put each number in its corresponding bucket. Finally for every bucket that contains more than one number we’ll use some linear sorting algorithm.

The thing is that we know that the integers are well distributed, thus we expect that there won’t be many buckets with more than one number inside.

That is why the sequence [1, 2, 3, 2, 1, 2, 3, 1] won’t be sorted faster than [4, 3, 1, 2, 9, 5, 4, 8].

Pseudo Code

1. Let n be the length of the input list L;
2. For each element i from L
   2.1. If B[i] is not empty
      2.1.1. Put A[i] into B[i] using insertion sort;
      2.1.2. Else B[i] := A[i] 
3. Concatenate B[i .. n] into one sorted list;

Complexity

The complexity of bucket sort isn’t constant depending on the input. However in the average case the complexity of the algorithm is O(n + k) where n is the length of the input sequence, while k is the number of buckets.

The problem is that its worst-case performance is O(n^2) which makes it as slow as bubble sort.

Application

As the other two linear time sorting algorithms (radix sort and counting sort) bucket sort depends so much on the input. The main thing we should be aware of is the way the input data is dispersed over an interval.

Another crucial thing is the number of buckets that can dramatically improve or worse the performance of the algorithm.

This makes bucket sort ideal in cases we know in advance that the input is well dispersed.

Computer Algorithms: Radix Sort

Introduction

Algorithms always depend on the input. We saw that general purpose sorting algorithms as insertion sort, bubble sort and quicksort can be very efficient in some cases and inefficient in other. Indeed insertion and bubble sort are considered slow, with best-case complexity of O(n2), but they are quite effective when the input is fairly sorted. Thus when you have a sorted array and you add some “new” values to the array you can sort it quite effectively with insertion sort. On the other hand quicksort is considered one of the best general purpose sorting algorithms, but while it’s a great algorithm when the data is randomized it’s practically as slow as bubble sort when the input is almost or fully sorted.

Now we see that depending on the input algorithms may be effective or not. For almost sorted input insertion sort may be preferred instead of quicksort, which in general is a faster algorithm.

Just because the input is so important for an algorithm efficiency we may ask are there any sorting algorithms that are faster than O(n.log(n)), which is the average-case complexity for merge sort and quicksort. And the answer is yes there are faster, linear complexity algorithms, that can sort data faster than quicksort, merge sort and heapsort. But there are some constraints!

Everything sounds great but the thing is that we can’t sort any particular data with linear complexity, so the question is what rules the input must follow in order to be sorted in linear time.

Such an algorithm that is capable of sorting data in linear O(n) time is radix sort and the domain of the input is restricted – it must consist only of integers.

Overview

Let’s say we have an array of integers which is not sorted. Just because it consists only of integers and because array keys are integers in programming languages we can implement radix sort.

First for each value of the input array we put the value of “1” on the key-th place of the temporary array as explained on the following diagram.

Radix sort first pass
Radix sort first pass
Continue reading Computer Algorithms: Radix Sort

Computer Algorithms: Quicksort

Introduction

When it comes to sorting items by comparing them merge sort is one very natural approach. It is natural, because simply divides the list into two equal sub-lists then sort these two partitions applying the same rule. That is a typical divide and conquer algorithm and it just follows the intuitive approach of speeding up the sorting process by reducing the number of comparisons. However there are other “divide and conquer” sorting algorithms that do not follow the merge sort scheme, while they have practically the same success. Such an algorithm is quicksort.

Overview

Back in 1960 C. A. R. Hoare comes with a brilliant sorting algorithm. In general quicksort consists of some very simple steps. First we’ve to choose an element from the list (called a pivot) then we must put all the elements with value less than the pivot on the left side of the pivot and all the items with value greater than the pivot on its right side. After that we must repeat these steps for the left and the right sub-lists. That is quicksort! Simple and elegant!

Quicksort
Continue reading Computer Algorithms: Quicksort

Computer Algorithms: Merge Sort

Introduction

Basically sorting algorithms can be divided into two main groups. Such based on comparisons and such that are not. I already posted about some of the algorithms of the first group. Insertion sort, bubble sort and Shell sort are based on the comparison model. The problem with these three algorithms is that their complexity is O(n2) so they are very slow.

So is it possible to sort a list of items by comparing their items faster than O(n2)? The answer is yes and here’s how we can do it.

The nature of those three algorithms mentioned above is that we almost compared each two items from initial list.

Insertion sort and bubble sort make too many comparisons, exactly what merge sort tries to overcome!
Insertion sort and bubble sort make too many comparisons, exactly what merge sort tries to overcome!

This, of course, is not the best approach and we don’t need to do that. Instead we can try to divide the list into smaller lists and then sort them. After sorting the smaller lists, which is supposed to be easier than sorting the entire initial list, we can try to merge the result into one sorted list. This technique is typically known as “divide and conquer”.

Normally if a problem is too difficult to solve, we can try to break it apart into smaller sub-sets of this problem and try to solve them. Then somehow we can merge the results of the solved problems.

If it's too difficult to sort a large list of items, we can break it apart into smaller sub-lists and try to sort them!
If it's too difficult to sort a large list of items, we can break it apart into smaller sub-lists and try to sort them!

Continue reading Computer Algorithms: Merge Sort

Computer Algorithms: Shell Sort

Overview

Insertion sort is a great algorithm, because it’s very intuitive and it is easy to implement, but the problem is that it makes many exchanges for each “light” element in order to put it on the right place. Thus “light” elements at the end of the list may slow down the performance of insertion sort a lot. That is why in 1959 Donald Shell proposed an algorithm that tries to overcome this problem by comparing items of the list that lie far apart.

Insertion Sort vs. Shell Sort
Insertion sort compares every single item with all the rest elements of the list in order to find its place, while Shell sort compares items that lie far apart. This makes light elements to move faster to the front of the list.

In the other hand it is obvious that by comparing items that lie apart the list can’t be sorted in one pass as insertion sort. That is why on each pass we should use a fixed gap between the items, then decrease the value on every consecutive iteration. Continue reading Computer Algorithms: Shell Sort