Posts Tagged ‘Iterative

16
Jun
14

Recursive versus Iterative Algorithm Results Follow-up

A few week ago I did some preliminary investigation into the performance difference between recursive and iterative procedures/algorithms. The problem I chose to solve was path finding in an imperfect maze as there are algorithms which have the same theoretical complexity. I also already had some code lying around for both generating mazes and finding paths. I created iterative and recursive versions of the depth first search algorithm to compare with each other and included breadth first search for comparison to a known “slower” algorithm.

The first results I got were interesting, both the dfs (short for depth first search) algorithms were fairly close in terms of rates until the size of the mazes got larger. The results however were inconsistent because of the small number of mazes used for testing and differences in the behavior of the two dfs algorithms. Clearly I needed to test across a larger set of mazes and fix the behavior difference before I’d get good results. It wasn’t a waste of time as it did illustrate that it is possible for both types of algorithm to out-perform the other given even minor differences in behavior.

So I carefully adjusted the code so that both implementations of the dfs algorithms behaved in the same manner. Again I ran some initial tests and this time the iterative version of dfs out-performed the others in general. Satisfied that the code is behaving correctly I up-scaled the tests to 1000 different mazes across many sizes so I could graph the results.

alg-rate

You can see the rate for all the algorithms drops of very quickly! It is quite difficult to see much detail in this graph so I did some number crunching to determine the average time per search for each algorithm and size of maze.

alg-msec-per-search

This graph makes the difference much more clear. Note the parabola shape, this is because the worst case for all the algorithms is to search the entire maze, which obviously is the square of the X axis in this graph. The interesting part is how they compare with each other. The iterative dfs graph is very shallow compared to the others, it is in fact quite difficult to see much of an increase. It is however increasing in a similar fashion just not anywhere near as dramatic as both recursive dfs and iterative bfs (breadth first search).

So why the huge difference between iterative dfs and the others? The big difference between the two dfs algorithms is the cost of recursion. The extra cost is per square visited in the maze, so a doubling in the processing overhead per square produces a times four increase in the total overhead.

To get a good idea of why the overhead is so comparably large in the recursive case we need to think about how it is implemented at the machine code level. Basically for every square visited in the maze there is a function call to the recursive function and a return at the end. A function call and return involves a number of relatively expensive operations. Firstly information from the calling code needs to be pushed onto the stack (usually important register information), then the processor can “call” the function code which basically pushes a return address onto the stack and jumps to the function code. The first thing the newly called function must do is allocate space on the stack for its local variables, then it can do whatever processing it requires. Once finished the function will place the return value in either the stack or a register, remove local variables from the stack and then return control to the code that called it. This involves restoring the stack to remove the local variable information and then using the return address stored there to return to the calling code. The calling code can then retrieve anything it pushed on the stack before the call and continue operation. Quite the involved process!

Lets contrast that with what an iterative algorithm needs to do differently. Firstly any iterative algorithm needs to set up storage space for the body of the algorithm. In this case a stack (in the data segment or heap) for storing the current search and some variables for tracking the current location. Then the body can begin, usually this will be a loop of some kind, either a for loop or a while loop. These loops involve checking a condition and jumping at either the beginning or end, but are only short-range jumps within the function. The body of the code simply does the search updating the stack as it goes.

So for every square of the maze visited the recursive version has to deal with function call overhead (stack operations and far calls) where the iterative version has to only check a condition and make a short jump. There is more to it with the effects of the L1 and L2 caches, but I’ll leave that out because it’s x86 specific.

So the cost of recursion is high, in this case almost as bad as iterative breadth first search and it doesn’t guarantee the shortest path. Also because of limited program stack space, the recursive function cannot search as large a maze lest there be a stack overflow. So in general I’d recommend people stick with writing iterative code except where the complexity of a recursive function is significantly better. Your code should be faster and put less strain on the limited stack space.

21
May
14

Recursive and Iterative procedures

Today I’m looking at two different types of procedures, Recursive and Iterative. Many problems can be implemented using both techniques which have both benefits and drawbacks.

For those not acquainted with the terminology an iterative procedure accomplishes computation with a block of code that repeats many times until a terminating condition is met. The repeating block of code is commonly called a loop and can take many different forms. One of the main drawbacks of an iterative design is that it can become quite complicated making it harder to understand its function. However this is a trade-off to get higher performance and to keep the size of the program stack small.

Recursive procedures however are quite different as they infrequently make use of loops. Instead they achieve repetition by invoking itself. The Fibonacci function is a classic case. Fib(x) = Fib(x-1) + Fib(x-2) where the first two entries are hard coded to 1. It’s important that recursive functions have a basis case or a maximum depth to the number of calls otherwise an error called a stack overflow can occur. Basically a stack overflow is simply when the program stack literally overflows with too much data. The program stack has to store the local variables and return addresses for completed functions, recursive functions typically use up a lot of stack space. Another disadvantage is that function calls in many modern OO langauges are relatively expensive in terms of time required, as some languages require a pointer or two be dereferenced to access the function.

Fortunately all recursive procedures can be converted into iterative ones, the Fibonacci function again a classic and trivial case. The code might look something like this…

public int fib(int z)
{
  int[] series = new int[z];
  series[0] = 1;
  series[1] = 1;
  for (int i=2; i<z; i++)
    series[i] = series[i-1] + series[i-2];
  return series[z-1]; // end of the array contains the result.
}

Normally I’d say Iterative functions are usually faster, and in the case of the Fibonacci function that can be demonstrated, but with more complicated problems and algorithms it is much harder to prove. It’s simpler to construct the code and measure the performance! So to get an idea of the impact of function call overhead I decided to implement a problem in Java using a recursive and iterative algorithm.

The problem I have chosen is finding a path within a maze. I chose it because I already had some maze generation code and the depth first search algorithm should have similar performance when implemented both ways. The maze generated is supposed to have loops in it (ie it is not perfect) so I had to code the two procedures carefully to ensure they would pick the same path. I included a breadth first search algorithm (implemented iteratively) as well for comparison, it is guaranteed to find the shortest path where as both the depth first search implementations are not.

The testing procedure involved generating a maze and determining the rate at which each algorithm could compute paths in the maze. The start and end points were set randomly but the random number generator seed was reset to be the same at the start of each algorithms run. This means that they should have to compute the same paths. I set the program to repeat this process 10 times to get some results for today, but I have a more powerful machine repeating the process 1000 times, just I have to wait several more days for the results.

In implementing depth first search in both techniques there was a minor difference in the algorithm that I couldn’t avoid at the time. The recursive algorithm is smarter than the iterative one in a subtle way. When moving back up the tree after hitting a dead end the recursive algorithm intrinsically remembers which directions it has tried at nodes it returns to. The iterative algorithm isn’t as clever, it checks all the directions again (in the same order). This should make the iterative algorithm slower as it does more processing when moving back up the maze after hitting a dead end. How much slower? I don’t know, but hopefully not much.

This slideshow requires JavaScript.

The results are interesting but unfortunately not definitive. You can see that the two depth first search algorithms are fairly close in terms of rates, although the recursive algorithm tends to be faster until the maze size gets larger. I have tried it a number of times to see how consistent the results are. I found that the breadth first search generally gives fairly consistent results, but the two depth first search algorithms can vary depending on the mazes generated. More often the recursive algorithm is faster but in some data sets the iterative one is faster.

I did some other small scale tests and it seems the processor has an effect on the results. I’m guessing the JIT compiler has different optimisations for different processors which is changing the results. It is entirely possible that the JIT is optimising out the expense in using recursive calls although I think this unlikely.

This leads me to believe I need more data, so I’ll wait until I get results from the machine testing 1000 different mazes. I did these smaller tests on my Macbook and was unable to test with mazes larger than 512×512 because of a stack overflow. The machine I’m doing the larger tests on is managing to run at a maze size of 1024×1024 because it has a different java run-time. (running under Linux) I might have a go at making the Iterative depth first search function more like the recursive function in terms of smarts but we’ll see if the larger tests reveal anything more.




Blogs I Follow

Enter your email address to follow this blog and receive notifications of new posts by email.


Mister G Kids

A daily comic about real stuff little kids say in school. By Matt GajdoĊĦ

Random Battles: my life long level grind

completing every RPG, ever.

Gough's Tech Zone

Reversing the mindless enslavement of humans by technology.

Retrocosm's Vintage Computing, Tech & Scale RC Blog

Random mutterings on retro computing, old technology, some new, plus radio controlled scale modelling.

ancientelectronics

retro computing and gaming plus a little more

Retrocomputing with 90's SPARC

21st-Century computing, the hard way

lazygamereviews

MS-DOS game reviews, retro ramblings and more...