Archive for February, 2015

27
Feb
15

Creating a Benchmark: part 3

Last time I started the process of comparing the BGI to a hand-coded VGA library. I coded up a fairly lazy library written entirely in Pascal. Today I’ve re-coded some parts of that code in x86 assembly using Pascal’s inline assembler.

At work in the IDE

The first thing I wanted to tackle was the speed of the sprite blitting and filled boxes, as they didn’t seem to live up to their potential. I decided to replace the Pascal move (copy memory) and fillchar (fill memory) functions, as they are heavily used in the Pascal-only version.

Luckily there are some neat instructions which make copying and filling memory faster even on old processors like the 8086 and 80286. These are MOVS and STOS, both string instructions that have been part of the x86 instruction set since the 8086; the Z80 had comparable block instructions such as LDIR. Using them with the REP prefix makes them even better, as it eliminates most of the looping code.

MOVS copies a block of memory. You load ES:DI with the destination pointer, DS:SI with the source pointer, and CX with the number of bytes to copy. Then execute…

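  shr cx,1        { byte count -> word count; the odd bit lands in the carry flag }
@again:
  rep movsw       { copy CX words from DS:SI to ES:DI }
  jcxz @next      { CX = 0: every word has been copied }
  loop @again     { otherwise resume the interrupted copy }
@next:
  jnc @done       { carry clear: the byte count was even }
  movsb           { copy the single trailing byte }
@done: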
...

This code copies any number of bytes one word (16 bits) at a time, copying a single byte at the end if the count is odd. I’ve used JCXZ and LOOP to resume the copy because some older processors (the 8086 and 8088) have a bug where a REP string instruction can end early if an interrupt occurs at the wrong time. As far as I know this only affects instructions that also carry a second prefix such as a segment override, so it isn’t strictly necessary here, but it’s a safety measure.

STOS works in much the same way, except that it doesn’t read the data from a source pointer; it writes the contents of the accumulator register (AL or AX) to ES:DI instead.
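For readers who don't read x86 assembly, the word-at-a-time pattern behind both routines can be modelled in a few lines of Python. This is only a sketch of the logic, not the library code: the names copy_words and fill_words are made up for illustration, and a bytearray stands in for memory.

```python
def copy_words(dest: bytearray, dest_off: int, src: bytes, src_off: int, count: int) -> None:
    """Model of: shr cx,1 / rep movsw / movsb (hypothetical helper name)."""
    # shr cx,1 halves the count; the odd bit ends up in the carry flag
    words, odd = divmod(count, 2)
    for _ in range(words):
        # rep movsw: move one 16-bit word, then advance both pointers by 2
        dest[dest_off:dest_off + 2] = src[src_off:src_off + 2]
        dest_off += 2
        src_off += 2
    if odd:
        # movsb: move the single trailing byte of an odd-sized block
        dest[dest_off] = src[src_off]


def fill_words(dest: bytearray, dest_off: int, value: int, count: int) -> None:
    """Model of the STOS variant: the data comes from a register, not memory."""
    words, odd = divmod(count, 2)
    pattern = bytes((value, value))  # AX with the fill byte in both AL and AH
    for _ in range(words):
        # rep stosw: store one 16-bit word, then advance the destination by 2
        dest[dest_off:dest_off + 2] = pattern
        dest_off += 2
    if odd:
        # stosb: store the single trailing byte
        dest[dest_off] = value
```

Calling copy_words(buf, 1, b"ABCDE", 0, 5) moves two words and then one trailing byte, which is the same split the assembly makes with SHR and JNC.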

With these new memory copy and fill routines done, I tested the program to see if there was any improvement in performance. To my surprise there was none. The built-in functions for copying and filling memory must be about as good as what I wrote, so why were the blitting and box filling still slower than they should be?

It turned out that the loops in the filledBox and putImage functions were the culprit. The Pascal code for putImage’s main loop looked like this…

for i:= 0 to sizey do
         copymem(bseg,4+(i*sizex),cardseg,((y+i)*320)+x, sizex);

It didn’t look problematic until I considered the instructions required for calculating the offset into the image data and screen buffer. Every pass through the loop needs two multiplications, and multiplication is unfortunately a slow operation on these older processors. With some nifty assembly code I rewrote both the putImage and filledBox procedures mostly in assembly, avoiding multiplications in the main part of the loop altogether.
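The trick itself doesn't depend on assembly, so here is a rough Python model of it (not the actual code from the library). Both offsets are worked out once before the loop and then stepped by addition for each row. The name put_image, the 4-byte header (taken from the "4+" in the loop above) and copying exactly size_y rows are assumptions made for the sketch.

```python
SCREEN_WIDTH = 320   # bytes per row in the 320x200 buffer
HEADER_SIZE = 4      # assumed: image data starts 4 bytes in, as in the loop above


def put_image(screen: bytearray, image: bytes, x: int, y: int,
              size_x: int, size_y: int) -> None:
    """Copy size_y rows of size_x bytes without multiplying inside the loop."""
    src = HEADER_SIZE            # offset of the first image row
    dst = y * SCREEN_WIDTH + x   # the only multiplication, done once up front
    for _ in range(size_y):
        screen[dst:dst + size_x] = image[src:src + size_x]
        src += size_x            # next row of the image
        dst += SCREEN_WIDTH      # same column, one row further down the screen
```

The per-row work drops to two additions, which is the part of the saving that comes from removing the multiplications rather than from the faster copy itself.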

It took me about 2-3 days to get through all the work of re-writing two of the drawing functions in assembly, when it took about 1 day to write the basic VGA graphics to begin with, but boy did it pay off. Re-writing most of putImage and filledBox in assembly made putImage over 3 times faster and filledBox almost 2 times faster. Both are also now about twice as fast as the BGI implementation.

So the BGI is slow compared to raw x86 assembly after all, but it took significant effort to get that performance gain. I can still understand why the myriad of one-man shareware programmers just went with the BGI: it was easy to use and good enough for what they were doing.

Making a VGA library in straight Pascal was fairly easy to do, but it had some disadvantages compared with the BGI and wasn’t really quicker. I had to go to assembly before there was any significant performance gain. Coding in assembly is daunting to many programmers, and for me it is much more time consuming than writing in a higher level language. It will be quite some time before I really finish re-coding the library in assembly.

Next time I’ll have to tackle the line drawing functions, which use floating point numbers to draw the lines accurately. I’m planning on converting them to fixed point numbers to improve speed on machines without an FPU, like my old 386sx. I’m also hoping assembly will help speed things up there too.
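As a taste of what fixed point means here, the sketch below steps along a line using only integer adds and shifts. It is a generic 16.16 fixed-point DDA written in Python for illustration, not the routine that will end up in the library, and the name line_points is made up. In Turbo Pascal the same values would need a 32-bit Longint.

```python
FRAC_BITS = 16            # 16.16 fixed point: 16 integer bits, 16 fraction bits
ONE = 1 << FRAC_BITS      # the value 1.0
HALF = ONE >> 1           # the value 0.5, added so the shift rounds to nearest


def line_points(x0: int, y0: int, x1: int, y1: int) -> list:
    """Return the pixels on a line, using integer arithmetic only."""
    dx = x1 - x0
    dy = y1 - y0
    steps = max(abs(dx), abs(dy))   # one pixel per step along the longer axis
    if steps == 0:
        return [(x0, y0)]
    # The two divisions happen once, before the loop
    x_inc = (dx * ONE) // steps
    y_inc = (dy * ONE) // steps
    x = x0 * ONE + HALF
    y = y0 * ONE + HALF
    points = []
    for _ in range(steps + 1):
        # Shifting right drops the fraction and leaves the pixel coordinate
        points.append((x >> FRAC_BITS, y >> FRAC_BITS))
        x += x_inc
        y += y_inc
    return points
```

For example, line_points(0, 0, 4, 2) advances x by 1.0 and y by 0.5 each step. With 16 fraction bits the accumulated rounding error stays well under half a pixel for anything that fits on a 320x200 screen.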

19
Feb
15

Hardware pickups

Recently I’ve been able to pick up some interesting hardware and I thought I’d share some photos of it with you. It was also an opportunity to try out some better lighting in the hope of getting better pictures.

387sx 33MHz

This unfortunately wasn’t the greatest photo due to the reflectivity of the packaging.

First up is a pair of 80387sx co-processors from the late 80’s or early 90’s. Very few people actually bought and installed these chips, as floating point arithmetic was mostly only used in scientific applications or Computer Aided Design. Consequently chips like these can be quite rare, and this pair clock in at 33MHz, making them some of the faster 387’s.

Intel weren’t the only company making co-processors for the 386. Others included Cyrix, Chips and Technologies, IIT, ULSI and Weitek. Most of these were faster than the Intel part, but had some compatibility issues, and some were of completely different designs.

Interestingly the NPUs, as they were known, could be clocked asynchronously from the CPU. They could also operate whilst the CPU was busy doing something else, which gave machines fitted with them some very crude parallel capabilities.

Here we have a Sun Microsystems mainboard from a Sparcstation IPX. The machine came in a neat lunchbox form-factor that was actually impressively small. This particular board has a Weitek Sparc processor that ran at about 40MHz. These chips had an FPU on-die, so they would have been similar to the 486 in performance. The LSI chip and some of its supporting chips are likely the 1MB of system cache, which was quite large for the time. The Sun GX chip is a graphics controller which contained some basic drawing acceleration. These features made the IPX quite an impressive little workstation. Most of the chips and the board itself appear to have been manufactured in 1993.

It’s a shame I don’t have the rest of the machine, I’d like to be able to run this little beast. I’m not even sure I can get RAM for it, or if what I have is compatible. I’ll have to keep an eye out for the chassis and other parts.

Mechanical Keyboard

This might look like an ordinary keyboard, but it is a proper mechanical switch keyboard that came with the next piece of hardware (a PC clone). Despite its very plain looks it feels fantastic to type on and has that distinctive mechanical sound. It has the larger DIN plug, which actually suits many machines up to and including many Pentium based machines. It is a bit grubby but otherwise in good condition.

Lastly we have an interesting PC clone. This one was made by Microbyte, which turns out to have been an Australian company based in Adelaide that made PC clones such as this one. It is clear that they designed and built their own boards and wrote their own PC compatible BIOS. Quite an achievement for what must have been a small engineering company. Unfortunately I found very little information about them online.

My machine is a PC230sx, which has a 386sx@20MHz with a Trident VGA card. It has SIPP memory fitted for both the main memory and video memory. They installed an unusually large amount for the VGA, a full 1MB of video memory. The system RAM is 2MB in total.

When I bought this machine I didn’t think it had a hard disk, but it turns out that it has a Seagate ST3144A, which is 130MB. Probably an impressive and expensive drive in its day. This drive still works; I just had to configure the CMOS with the drive’s details, which are handily written all over the machine.

You may notice the socket for a WD33C93 chip, which was a SCSI controller chip. This would have to be one of the few older machines with the capability for on-board SCSI. I’m not sure why the chip is missing here, but these machines were apparently commonly fitted with SCSI drives instead. Looking in the BIOS seems to indicate that they were supported for booting. I may have to find one of these chips and see if I can get SCSI to work.

Between the VGA chip and VLSI chips lies an extremely long header where the expansion riser card would normally be inserted. This machine doesn’t have the riser card, so I can’t plug in a sound card or anything else, which is a bit of a shame. I’m surprised the machine works without it, as I’ve seen many other machines which don’t work correctly or at all when it is missing.

This board has some stickers that look like they were written by a service technician. They are attached to a part of the board under the floppy drive where there is a blank area containing no visible traces or chips. The first sticker reports an invalid opcode at a particular memory address, which could indicate a problem with RAM or software.

Fortunately, after testing the machine I’ve found that the only problem so far is a malfunctioning COM1. The rest of the machine appears to be functional, and the IDE hard drive boots DOS ok. I have noticed that the floppy drive light stays on, something which sometimes indicates incorrect installation of the cable. In this case the cable is correct, and the drive even reads disks, so there is likely a jumper setting on the drive that needs correcting.

I benchmarked this machine with Topbench to see how it compares to others. It was only marginally faster than a 286@16MHz with a 287 co-processor. I think there may be a few factors that contribute to this. Firstly, I think the RAM must be a similar speed to that in the 286, which slows down the memory and opcode tests. It does perform better in the 3d games test, which I found interesting as that has some floating point arithmetic. Luckily this is perfect for testing my homebrew platform game.

Finally I’m pleased with how the extra lighting has improved the pictures, but my technique still needs work. Perhaps another source of lighting is called for, or perhaps finally a step up to a better camera.

12
Feb
15

Creating a Benchmark: part 2

A couple of weeks ago I created a basic benchmarking program for measuring the speed of the Borland Graphics Interface and its drivers. I’m primarily interested in how they compare not only to each other, but also to a hand-coded implementation. So this week I created a VGA graphics unit by hand and made a benchmark program around it.

I chose to code for VGA 320x200x256 as it is the easiest mode to code for and matches more of the BGI drivers. You simply set graphics mode 13h (h for hex) with the video BIOS, which sets up a linear buffer for drawing at the memory location A000h:0000h. Each pixel is a single byte, so drawing a pixel doesn’t require bit masking unless you want it to. Drawing a pixel simply involves changing the byte at the offset given by this simple formula: (y*320) + x.
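To make the formula concrete, here is a minimal Python model in which a bytearray stands in for the 64000 bytes at A000h:0000h. The names put_pixel, pixel_offset and pixel_offset_shifts are made up for illustration and the clipping check is an addition of the sketch. The shift version uses the fact that 320 = 256 + 64, a common way of avoiding the multiplication.

```python
WIDTH, HEIGHT = 320, 200   # mode 13h resolution, one byte per pixel


def pixel_offset(x: int, y: int) -> int:
    """Offset of pixel (x, y) from the start of the linear buffer."""
    return y * WIDTH + x


def pixel_offset_shifts(x: int, y: int) -> int:
    """Same offset without a multiply: y*320 = y*256 + y*64."""
    return (y << 8) + (y << 6) + x


def put_pixel(framebuffer: bytearray, x: int, y: int, colour: int) -> None:
    """Set one pixel to a palette index, ignoring off-screen coordinates."""
    if 0 <= x < WIDTH and 0 <= y < HEIGHT:
        framebuffer[pixel_offset(x, y)] = colour
```

With these definitions put_pixel(fb, 319, 199, 15) writes the last byte of the buffer, at offset 63999.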

Given this information it was no problem at all to code up a basic graphics unit. I didn’t use much assembly code to implement the unit, partly out of laziness, opting instead to implement it mostly in Pascal. I haven’t implemented all the graphical functions in the BGI, simply because there are way too many.

Here are the results when tested under Dosbox with 3000 cycles. I used pretty much the same code to perform the measurement to ensure as much consistency as possible. It’s quite interesting to see that implementing your own graphics unit doesn’t really provide that much extra performance for most functions, and in this case blitting sprites is actually slower using my code! I suspect this is because I used a built-in Pascal function for copying memory that may not be super fast. I did note it is still faster than both the VESA and SVGA256M drivers in the same mode.

So is it worth implementing your own graphics driver instead of using the BGI? The answer is a sorta, maybe. I haven’t optimised my graphics code in this case, so it surely could be a bit faster, but I did manage to use less memory for storing the sprites, and my code was much smaller. However, the Graph unit and VGA256 actually seem to have decent performance in comparison, so if you need compatibility with other cards that are more difficult to code for, or simply don’t have time to code a graphics unit of your own, then the Graph/BGI implementation isn’t too bad.

Code and DOS binary are available here.

05
Feb
15

Arctic Adventure for DOS

The title screen

Arctic Adventure was released in 1991, when the author George Broussard had just merged his company with Apogee. It is a sequel to the first game: Pharaoh’s Tomb, and shares the same game engine that was originally developed by Todd Replogle for Monuments of Mars. It shares most of its technical aspects with both of these games, as it uses exactly the same technologies.

Map Screen

Again CGA graphics and PC Speaker sound were used, with about the same level of technical skill, so both are roughly equivalent to the other games. The only really big change is using the white, cyan, and magenta CGA palette instead, which is quite appropriate given the Arctic theme. I noted that this time there was no performance warning for older machines, but I haven’t noted any significant improvement. So best to avoid the slowest 8088 and PCjr machines.

Game Screen

Unlike the other two games you start in an over-world style map which allows you to choose which level you wish to attempt. You need to gather keys and a boat to gain access to many of the levels, but you can attempt them in any order otherwise. Whilst you can only save at this screen, it’s quite nice being able to return to this map screen without penalty so you can save your game, or choose another level if one is vexing you too much.

Not as easy as it looks

Entering a level you’ll find collision issues similar to those the other games suffered from. The spikes in particular feel the most unfair, as they will kill you without even touching your character. However, overall it suffers from this much less than Pharaoh’s Tomb, as you no longer have a limited number of lives. You simply return to the start of the level with everything you brought with you when you first arrived. This makes death much less annoying, as you can still progress even if you die many times, and you can choose another level when you get frustrated.

Looks simple enough

The levels themselves are a mix of easier and harder puzzles, some of which are more a test of your platforming skills. They contain the same types of enemies and hazards as Pharaoh’s Tomb, just re-skinned. It seems that the designer has made better use of these features, as I didn’t run into the same problems as much, and the levels are much more enjoyable to play.

Like the other games Arctic Adventure was made freeware back in 2009, and is the better game of the three. It isn’t as frustrating as Pharaoh’s Tomb, but is more challenging than Monuments of Mars. Unfortunately it still suffers from some issues with the collision detection making some levels extra hard. If I had to pick a favourite, I’d probably favour Monuments of Mars, but Arctic Adventure is still quite enjoyable.



