Last time I used the in-line assembler to improve the speed of the sprite blitting functions with lots of success. Other functions such as line drawing however still suffered speed issues on systems with out a built-in FPU such as the 386sx.

The reason for the slow line drawing was simple, I had used a simple slope based algorithm that used floating point. On a system with a FPU this was quite fast, about as fast as anything else, but obviously it wasn’t going to cut the mustard when it came to the old processors I’m targeting. So I decided to give fixed point arithmetic a go.

Fixed point numbers are a variation on basic integers, using some of the bits to represent the fractional part. You can then use integer instructions to do some basic math that involves fractions. This is much faster for processors lacking hardware floating point support. Check out more details of fixed point here or at Wikipedia. It’s a very common technique for embedded systems, but was also used in games such as Doom for speed.

I however ran into a problem. Because the integer in turbo pascal is 16bits I didn’t have enough bits for addressing all the pixels and have enough precision to for the fractional part of the slope. For the integer part I needed to have a range of 0-319, requiring 9 bits unsigned. The fractional part was thus left with 7 bits, with the smallest representable fraction being 1/128th. The smallest conceivable slope in 320×200 is 1/320 which obviously is much smaller. I could have switched to using longints or storing the fractional part separately but this would have added extra overhead that would make line drawing slower.

I set the line drawing problem aside for the moment to look for a quick circle drawing algorithm. Something I hadn’t implemented yet. The SWAG archive came up with suitable options, but also had some algorithms for line drawing. In particular the Bresenham line drawing algorithm.

Not having coded my own low level line drawing routines before I hadn’t heard of it, but it uses all integer math in a clever way to produce the correct slope. I modified the algorithm I found to reduce the amount of calculations for the screen address when plotting points in the line. The resulting code was slightly faster than previous line drawing code, but not overwhelmingly so.

Returning to implementing circle drawing, the algorithms I found unfortunately used floating point math. They all use the basic formula for a circle, x^2 + y^2 = r^2, thus requiring the use of a square root function to calculate the co-ordinates. Unfortunately it is one of the more expensive floating point operations.

One algorithm I looked at used the square root function, but rounded it to an integer immediately. This got me thinking, if I could implement an approximation of square root using integer arithmetic I could draw circles quickly. So after a bit of research I wrote exactly that. Code follows.

function intSqrt(num: word):word; var xo,xn:word; begin {we're using Newtons method for approximating the sqrt} if (num=0) then begin intSqrt:=0; exit; end; xo := 0; xn := num; if xn=0 then xn:=1; while (abs(xo-xn) > 1) do begin xo := xn; xn := (xo + (num div xo)) shr 1; end; intSqrt := xn; end;

As stated in the comment, I’ve used Newton’s method for calculating the root, mostly because it’s a simple method I know. I did a little testing, and it turns out this returns pretty much the same result as rounding the built-in sqrt function that uses floating point, only quicker. Once I finished the circle drawing code it drew around 36% more circles in the same time as the BGI code. That’s not bad, but I think there may be more room for optimisation here.

Next time I hope to have optimised the circle routine, and I’ll test the code on some hardware to see how all the different libraries compare to each other on different platforms.

## 0 Responses to “Creating a benchmark: Part 4”