
Computer Algorithms: Jump Search

Overview

In my previous article I discussed how the sequential (linear) search can be used on an ordered list, but there we were limited by the specific features of the given task. Obviously the sequential search on an ordered list is ineffective, because we consecutively check every one of its elements. Is there any way to optimize this approach? Well, because we know that the list is sorted, we can check only some of its items instead of all of them. Thus when an item is checked, if it is less than the desired value, we can skip some of the following items of the list by jumping ahead and then check again. If the checked element is greater than the desired value, we can be sure that the desired value is hiding somewhere between the previously checked element and the currently checked element. If not, again we can jump ahead.

A good approach is to use a fixed step. Let’s say the list length is n and the step’s length is k. Basically we check list(0), then list(k-1), then list(2k-1), etc. Once we find the interval where the value might be (list(m*k - 1) < x <= list((m+1)*k - 1)), we can perform a sequential search between the last two checked positions. By choosing this approach we avoid many of the weaknesses of the sequential search algorithm, because a large number of its comparisons are eliminated.

How to choose the step’s length

We know that it is good practice to use a fixed-size step. In fact, when the step is 1, the algorithm is the traditional sequential search. The question is what the length of the step should be, and whether there is any relation between the length of the list (n) and the length of the step (k). Indeed there is such a relation, and you can often see sources directly stating that the best length is k = √n. Why is that?

Well, in the worst case we do n/k jumps, and if the last checked value is greater than the desired one, we do at most k - 1 more comparisons. This means n/k + k - 1 comparisons in total. Now the question is for which values of k this function reaches its minimum. For those of you who remember your maths classes, this is found by taking the derivative with respect to k and setting it to zero: -n/k² + 1 = 0. Now it’s clear that for k = √n the minimum of the function is reached.
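Spelled out step by step:

f(k)   = n/k + k - 1
f'(k)  = -n/k² + 1 = 0   =>   k² = n   =>   k = √n
f''(k) = 2n/k³ > 0 for k > 0, so k = √n is indeed a minimum.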

Of course you don’t need to prove this every time you use this algorithm. Instead you can directly assign √n to be the step length. However it is good to be familiar with this approach when trying to optimize an algorithm.

Let’s consider the following list: (0, 1, 1, 2, 3, 5, 8, 13, 21, 34, 55, 89, 144, 233, 377, 610). Its length is 16, so the step is √16 = 4. Jump search will find the value of 55 with the following steps.
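check list(0) = 0 < 55    -> jump ahead
check list(3) = 2 < 55    -> jump ahead
check list(7) = 13 < 55   -> jump ahead
check list(11) = 89 > 55  -> 55 must lie between index 8 and index 11
sequential search: list(8) = 21, list(9) = 34, list(10) = 55 -> found at index 10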

[Figure: jump search basic implementation. Jump search skips some of the items of the list in order to improve performance.]

Implementation

Let’s see an example of jump search, written in PHP.
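A minimal sketch of such an implementation might look like this (the function name and signature are mine for illustration, not from the original article):

function jump_search($list, $x)
{
    $n = count($list);
    if ($n == 0) {
        return -1;
    }

    // the optimal step length, as derived above
    $step = (int)floor(sqrt($n));

    // jump ahead in steps of $step until we pass $x or reach the end
    $prev = 0;
    $curr = min($step, $n) - 1;
    while ($curr < $n - 1 && $list[$curr] < $x) {
        $prev = $curr + 1;
        $curr = min($curr + $step, $n - 1);
    }

    // sequential search within the interval where $x may hide
    for ($i = $prev; $i <= $curr; $i++) {
        if ($list[$i] == $x) {
            return $i;
        }
    }

    return -1; // not found
}

$list = array(0, 1, 1, 2, 3, 5, 8, 13, 21, 34, 55, 89, 144, 233, 377, 610);
echo jump_search($list, 55); // prints 10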

Beginning Algorithm Complexity and Estimation

Which is the Fastest Program?

When a programmer sees a chunk of code, he tends to evaluate it in a rather intuitive manner and to qualify it as “elegant” or not. This is quite easy, because it’s subjective and nobody knows exactly what “elegant” means. However, behind this there is a powerful mathematical approach to measuring a program’s effectiveness.

It’s a pity that most developers still think of big O notation as something from their university classes that is rarely useful in practice, and they barely use it in their jobs. But before describing big O notation, let me start with something really simple.

Let’s take the following example (note that all the examples are in PHP):

$n = 100;
$s = 0;
 
for ($i = 0; $i < $n; $i++) {
	for ($j = 0; $j < $n; $j++) {
		$s++;
	}
}

As you can see, there are two assignments and two nested loops. This is a widely used example from almost any algorithm book.

Constants, Languages, Compilers

First of all, the time needed to assign a value to a variable, to compare two values, or to increment a variable is constant. It depends on the computer resources, the compiler, and the language, but on a single machine it is constant when you compare two chunks of code. Each of these operations therefore adds constant time to the program, and we can label these times respectively a, b, c, d, e, f, g, h, i.

$n = 100; 	// a
$s = 0;		// b
$i = 0; 	// c
$i < $n; 	// d
$i++;		// e
$j = 0; 	// f
$j < $n;	// g
$j++;		// h
$s++;		// i

What Matters?

Actually, the most important thing here is the value of n: the greater the value of n, the more time the program takes to run. As we can see from the following table, multiplying the value of n by 10 makes the time roughly 100 times longer.

n		time
10		0.00002
100		0.002
...		...
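If you want to reproduce such measurements yourself (the exact numbers will of course differ from machine to machine), a quick sketch is:

foreach (array(10, 100, 1000) as $n) {
    $start = microtime(true);

    $s = 0;
    for ($i = 0; $i < $n; $i++) {
        for ($j = 0; $j < $n; $j++) {
            $s++;
        }
    }

    // elapsed time in seconds
    printf("n = %d\ttime = %f\n", $n, microtime(true) - $start);
}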

What happens, in fact, is that we can sum all of these contributions:

a + b + c + n*d + n*e + n*(f + n*g + n*h + n*i)

and by substituting:

a + b + c = k
d + e + f = l
g + h + i = m

the result is:

m*n² + l*n + k

Conclusion

Here the most important thing is the degree of n, because it dominates the program’s running time as n grows. Thus this chunk of code has quadratic complexity, or O(n²).

Of course there are constants, but in practice they are not so important. Take a look at these two functions:

f = 2*n²
g = 200*n

OK, for n = 1 the first one is faster, but as n grows the second function wins: at n = 100 the two are equal (2*100² = 200*100 = 20000), and beyond that point the second function is always the faster one!
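You can see the crossover with a few lines of PHP:

foreach (array(1, 10, 100, 1000) as $n) {
    // f grows quadratically, g linearly
    printf("n = %d:\tf = %d\tg = %d\n", $n, 2 * $n * $n, 200 * $n);
}

For n = 1000 this already prints f = 2000000 against g = 200000, so the smaller constant of the quadratic function no longer helps it.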