Posts Tagged ‘software-development

21
Jan
13

Code Optimisation

Recently I started reading a book about code optimisation that focuses on the x86 architecture and assembly language. The book is called Graphics Programming Black book and is written by Michael Abrash who is well known for his work writing technical articles and coding on games such as Doom and Quake. Whilst the book has a graphics focus, the skills you can learn from the book can be applied to any high level language or assembly language you may happen to be using. I’ve dabbled with assembly code in the past and have found some parts of the book quite interesting. You can find the book on-line to read via html or PDF download. You can find it pretty easily if you just google Michael Abrash and the name of the book.

After my recent testing of data structure performance and beginning to read this book, I decided that I should look over all the Java code I had in order to try and optimise it. I have been for some time working on writing some android games, and in the process have built for myself a nice graphics library and the basics of some games. In testing and tuning my code I’ve found some good rules of thumb for making fast code in Java not just for the android platform but in general.

The first thing I’d recommend you do is to not use Iterator classes anywhere in your code, but use for loops instead. Iterators are objects and add overhead to your applications in multiple ways. Firstly the occupy heap space and need to be garbage collected when finished with. Now this may not seem all that bad, but if you have several nested loops, this could mean thousands of Iterator objects being created and destroyed on a regular basis in your code. Secondly it adds extra method call overhead every time you call hasNext() or next(), slowing down your loops. Again for small numbers of loops it seems insignificant, but when added up over a large number of loops the time taken to make the calls can become quite significant. For these reasons, even when using the API data structures such as ArrayList, I would avoid using Iterators.

In very time critical code, it is best to build your own data structures out of primitive arrays rather than to rely on the API code. The JIT compiler seems quite adept and optimising simple array based code to run a lot faster than pretty much any API or object based solution. It’s also best to try and minimise the number of method calls in your important loops to keep overhead down.

An interesting problem I recently had was to do with path finding within a maze in one of the games I am writing. Obviously I had a randomly generated maze that whilst being perfect to start with, has cycles (or loops) introduced to it to make the game more playable. When I first wrote the path finder I used a depth first approach that works quite well on perfect mazes as there is only one path from each point to each other point. The problem I had with it was two fold, the algorithm was slow in the larger mazes and with cycles being introduced it often chose longer paths than it had to. This perplexed me no end, so I went in search of what the problem might be, optimising some of the code to get better speed, and writing some extra code to try and get a better path. After having tried without any luck I decided to try using the breadth first search algorithm implemented iteratively. It turned out to be a whole bunch faster! I investigated into why this might happen given that depth first search should be faster to find a path (even if it’s not optimal). Upon comparing the code of the two algorithms I came to a simple conclusion. The depth first search code took a lot longer to execute a single iteration because it had many more method calls and comparisons than the simple breadth first search does. So whilst on paper the depth first search looked good, its performance is poor in comparison, because the simpler breadth first search could get through more iterations in the same amount of time. The breadth first search is so much better I was able to generate larger mazes and still find paths within a reasonable response time, even with worst case paths that searched most of the maze!

It’s interesting how sometimes the unexpected solution turns out to be the best. It shows you should always test your code, and spend a bit of time optimising your critical code. It will in the end make the experience of the end user all that much better, and enable you to do more with less.

Advertisements
06
Jan
13

Memory Allocation – Heap and Stack

A question I recently saw online was about how memory is allocated and used within the Java programming language. There was a little confusion about what the heap and stack are. Both the stack and heap are common to many languages such as pascal, C, and many more.

Stack space is a special construct in hardware used for storing temporary variables, return values and return addresses for functions. Most languages use the stack to store any local variables that will fall out of scope when a procedure (or method in OO language) finishes or returns. Java has the same style of stack construct for the virtual machine used for the same purpose. The Java language will however only store primitives (char, int, byte etc) and references to objects on the stack.

Heap space is a software construct and will be different depending on the language. It’s used for dynamically allocating memory usually for longer term storage of larger chunks of data. In languages like pascal and C, data on the heap is usually handled and referred to by a primitive called pointers. A pointer basically holds the memory location of whatever it is pointing to, they can be difficult to program.

Java doesn’t have a pointer primitive so it uses a different mechanism. Instead of having pointers, Java has references which are subtly different. A reference is different in that the memory location information is not available to the programmer where as it is with a pointer. This means you can’t do pointer arithmetic or try to access memory directly. The intent of this is to protect programmers from themselves and to make Java more secure. Java references are used for storing objects and arrays on the heap. The garbage collector keeps track of how many references are pointing to each item in the heap, and disposes of items that are no longer in use.

Older languages that target DOS and the older systems will allocate global variables to what is called the data segment. DOS and earlier processors used segmented memory addressing and in part comes from the 8086 machines assembly language. There are three types of segment, code, stack, and data. Code segments are for storing the machine code for the program and are pretty straight forward, there can be many of them but on DOS and 8086 machines the maximum size of each is 64Kb. The stack segment is singular and unique and is for the processor stack, there can be only one again with a maximum size of 64Kb. The data segment is as I said for statically defined data, and each segment is 64K in size. Borland Turbo Pascal only allowed one data segment and I suspect most other languages did the same. The heap would be outside the segments and would use the remaining memory, it was typically managed by a system library that was a part of the language used.

Java is fairly similar to native languages but does not use a segmented model, as many modern languages don’t. This is partly because hardware and software has changed in such a way as they are no longer needed.

03
Oct
12

The C/C++ programming language

C and it’s derivative C++are perhaps both the most popular programming languages currently in use. C has it’s root in the early 1970’s in AT&T’s Bell Labs, it quickly became used for low level programming of kernels, and many of the early Unix operating systems were written with it. C++ was developed based on C later the same decade to add the object oriented model of programming which was relatively new at the time. You can find much more information about both the languages on Wikipedia here for C and here for C++.

I’ve not got as much expertise with either language as I do with others, I’ve written a few small applications and found that it was indeed possible to write nice neat code that was readable and worked well. However I’ve also had many issues in particular trying to read other peoples code and working with the build systems often used by implementations of both languages.

C is by far the simpler of the two languages, and is often compared to it’s contemporary Pascal which was developed around the same time. Some people regard Pascal as inferior to C (and I imagine that there are many that believe the opposite) but the reality is that both languages are very similar in capability. I’ve written embedded software with it back when I was at Uni and found it was well suited to the task, and would work well developing software even for larger systems. The main problem I have with the language is the extensive use of symbols, that many programmers abuse and end up with unreadable code. Fortunately it isn’t as bad as it could be, and providing people comment their code it is somewhat readable.

C++ introduced the object oriented model, but did not exclude people from continuing to use the procedural model. This is interesting as it allows people to mix and match, writing parts of their software in the methodology that suites it. I wrote an experimental program many years ago using C++ and was able to generate some reasonably readable code. The problem is that it is also quite easy to write some terribly difficult to read code as well. With the object oriented features utilizing pointers much more, some sections of code start to look more like random binary code than something that is readable. This makes the design of the software much more important.

The build system is different from platform to platform, but most of them are make systems describing how to build the software. I found it a bit difficult to deal with make files for similar reasons I find the languages difficult. There are lots of difficult to read symbols around that make reading a single line an exercise in de-coding. On the other hand the various versions of make are generally more powerful than the base compilers. Make technically isn’t necessary to build software in either language, you could manually write a script, but it is the most commonly used type of tool, although Microsoft Visual Studio has moved away from using it.

One of the main problems I’ve found with both languages is that the standard API varies greatly from platform to platform.  This results in large amounts of confusing conditional defines to try and help code be a bit more cross platform. It’s also been difficult for me to get used to the large number of symbols commonly used. It often takes me some time to work out where and how pointers are being de-referenced. I’ve also found finding and reading documentation for the API has been difficult as well. I understand there is a Javadoc like system (doxygen) that can be used to generate documentation, I’d like to see this (or something like javadoc) used for the API documentation to make it a bit easier. It’s not that API’s are hard to use, just the documentation is.

Most of what I consider a problem is nothing to do with either language, but with how many programmer (especially amateur ones) use them. Comments are lacking from code, in some cases the only comment is the header describing the license for the software. People seem to often over use the define pre-processor construct which makes reading code very difficult.Variable names are often confusing, short, and not really descriptive, which wouldn’t be as much of a problem if there was a comment describing what it is for, but this basically never happens. There were such bad examples of code in early Unix systems that an obfuscation competition was started in response to some terrible C code. It still runs today, and you can find their website here.

I think most of the software would be much more readable if people applied more professional coding standards, and used the automated documentation creation tools like doxygen so comments within source files can be the documentation. Sometimes I wonder if people make it difficult for the sake of looking smarter or making it harder for everyone else to understand.

03
Jul
12

Using the inline assembler in Pascal

I’ve been asked recently how you go about adding assembly instructions in old pascal program using the compilers inline facility. Assembly is important if you want to write your own hardware handling routines, or need to do something as fast as possible. It can be difficult writing assembly code, but by and large it is actually a fairly simple language. With the old Borland Turbo Pascal 6 and 7 you did this using one of a few different structures.

You could use the asm directive to write assembly in a code block of its own, and this is my preferred way. The only issue is you are limited to using 286 instructions as these are the only ones that turbo pascal ever supported. So if you required 386 instructions or higher you were out of luck!

Here is some sample code I have used. It could be more efficient, but as a sample it suffices.

procedure copymem16(srcseg,srcofs,desseg,desofs,size:word);
var c,of1,of2:word;
begin
 c:=0;
 while c< size do
 begin
    of1:=c+ srcofs;
    of2:=c+ desofs;
    asm
     push es
     push di
     mov ax,srcseg
     mov es,ax
     mov di,of1
     mov bx,[es:di]
     mov ax,desseg
     mov es,ax
     mov di,of2
     mov [es:di],bx
     pop di
     pop es
    end;
   c:=c+2;
 end;
end;

procedure copymem(source,dest:pointer;size:word);
var c,of1,of2,srcseg,srcofs,desseg,desofs:word;
begin
 srcseg:= seg(source^);
 srcofs:= ofs(source^);
 desseg:= seg(dest^);
 desofs:= ofs(dest^);
 if (size mod 2) = 0 then
  begin
   copymem16(srcseg,srcofs,desseg,desofs,size);
   exit;
  end;
 for c:=0 to (size-1) do
  begin
    of1:=c+ srcofs;
    of2:=c+ desofs;
    asm
     push es
     push di
     mov ax,srcseg
     mov es,ax
     mov di,of1
     mov bh,[es:di]
     mov ax,desseg
     mov es,ax
     mov di,of2
     mov [es:di],bh
     pop di
     pop es
    end;
  end;
end;

There are some other ways of adding assembler if you need them.

You can use the inline statement to add bytes directly into your executable, but this means you need to actually write the code directly in hex instead of using the standard mnemonics for instructions. This makes this method very cumbersome, but good if you only want to add one or two instructions.

You can also link in obj files that you have made with other assembler tools. There are a lot of pitfalls to this method however. You need to convert the obj files into binary format for including in the executable, and they must be self-contained and use a small memory model. That is one segment for code, and one segment for data at the most. The stack space used by the rest of the program would be shared with the imported code. To call any of the code you have to know and work out the call address yourself (which will depend on where in ram you stored it!) in order to be able to use it. It can make having more than one procedure or function very difficult, especially if you make any changes to your code. If you want the assembler routines to be callable like pascal ones you have to be careful to use the pascal call model, and to not clobber any local variables in the stack. Finally you can’t access Pascal global variable or functions easily within your assembler code. You will generally have to push pointers on the stack (as a method of parameter passing) to be able to find out where the data is.

There are a couple of advantages to this last method. The main one being you can write using whatever assembler you see fit even using instructions for a processor Pascal doesn’t support. Also if you do it correctly you can load your code dynamically and throw it out of memory when you are done with it. This is how the BGI graphics drivers work, and a simplified version of how the overlay system works.

However if you don’t need better than 286 instructions (or want your product to support the 286) you might as well use the asm blocks as this allows you to have much more flexibility in what you can do.

17
Jun
12

The Java Programming language

The first language I learned at university was Java. Java is a high level object-oriented language, it is used in web pages in the form of applets. Some features of Java were very unique at the time it was created. The compiler does not generate native code, but rather generates something called byte code. Byte code basically a special form of assembly language for a machine that does not exist, in essence a virtual machine. Of course this special code will not function on normal hardware, so special interpreter software is used to run it. Interpreters for many platforms exist, for windows, mac OS, and all the unix platforms. This in and of itself isn’t that remarkable as many BASIC interpreters also generate a binary format to represent instructions in a similar way. Java code is typically very portable and is in use across many platforms today. Many programs can run across multiple platforms without modification.

Java was the major language I learned for using object-oriented and I found that it was pretty easy to get the hang of. All the major important aspects of object-oriented languages are represented. Inheritance and abstraction are implemented through extending classes and implementing interfaces. Something that helped me learn the language easier was the simpler form memory management. You can hold something called a reference to an object, which is simply Java speak for having a pointer to an object. A reference is created by the constructor implicitly and can be used freely. When the reference is no longer needed you can let it go, and the Java garbage collector will free the object at an appropriate time. You can’t perform pointer arithmetic on references (or change them manually) which means that memory references can’t be tampered with, which protects Java programs from themselves.

The API is very well documented which has been very handy to me over the years. You can generate your own documentation using the javadoc tool which I would definitely recommend you do. The API is full of very useful classes, in particular the util, nio and swing packages will contain classes you will use on a regular basis. I’ve found many of the data storage classes in util are good. Especially the vector and hashmap classes.

Over the course of my university degree I have written many Java programs for assignments. The language made implementing and studying computer science algorithms, data structures and software designs much simpler. It also helped my coding style and documentation improve and become more readable. I still use Java in much of my programming today as I can write simpler code without worrying about some of the specifics of the platform I’m writing for. I still like to write code in native programming languages partly for the challenge and partly for speed.

23
May
12

Some Interesting articles…

In doing some reading around the internet about processors I recently read an article which explains the design of modern microprocessors in a very nice and easy to understand way. You can find it here. Also of interest from the same author is a short piece about exception handling that you can find here. I tend to agree with him that exceptions are a bit of a problem in programming languages, and also avoid using them much like he does. I find exceptions generally only useful to me when trying to debug code that is throwing an exception. What I typically do is try to catch it at some point and print out the stack trace and useful variables, sometimes including actions and information from other relevant portions of code. I only do this with java as you are kind of stuck with the API throwing exceptions in some cases, otherwise I try not to use them at all.

21
May
12

Turbo Pascal for DOS

Turbo Pascal 6 about Screen

Turbo Pascal 6 about Screen

When I was a teenager I mostly wrote programs using gwbasic or qbasic. Around the time when I was about 15 or 16, my computer studies teacher gave me a copy of Borland turbo pascal 6 to try out and learn. It was a much different language to BASIC, and I unfortunately didn’t have any books to help me learn about it. Fortunately my teacher gave me a couple of sample programs so I could learn the structure and basic syntax of the language. I also compiled and inspected the code samples that came with it, and found them informative. However with out more good examples and information about hardware there were many tasks that were difficult.

One of  the main differences in pascal is the way that variables and functions are declared. Unlike C and other languages, variables can not be declared in the middle of a piece of code. There is a separate declaration portion to define them and their data types before they are used any where. This is good in some ways as it forces you to think about what you are going to need for your program, function or unit.

One of the very new things that I learned about that is very common, but not available in BASIC, was pointers. For the uninitiated a pointer is basically a number that “points” to a memory location in the computer. They are useful for dynamically allocating memory or dealing with large chunks of data. They are necessary in pascal to avoid filling up your data segment. Unfortunately they are also one of the easier things to get wrong. I avoided them when I first started writing in pascal as I had no knowledge about how they were used. Also I had very little need as my programs typically did not fill the data segment, but as the programs grew and I started using bit-mapped graphics, they became necessary.

A Open Program

An Open Program

Graphics and sound were also a very big change for me. Graphics was through a unit (library) called graph which utilised something called BGI (the borland graphics interface). Unfortunately I couldn’t get it to support the graphics modes that I was after at the time (320x200x256), as there was no BGI driver available for it in the language. Bit-mapped graphics also required that I used what was unfamiliar; pointers. So for quite a while graphics were right out of the question until I over came these hurdles later.

Later when I went to university I started to solve these problems, I found BGI drivers that supported VGA and VESA graphics modes, and learned the proper use of pointers for storing data as well as doing graphics. I later started writing in-line assembler into my programs to create my own hardware interface routines, the one I use the most being one I wrote for the adlib sound card.

Bob’s Fury

Bob's Fury Title Screen

Bob’s Fury Title Screen

Bob’s fury was one of the first projects I started back in 1999, it is a port of the original game that I wrote in QBasic. Arriving at university I had my first access to the internet and found solutions that had previously stopped me porting this to pascal. I found myself some BGI drivers that allowed the graphics mode I was after but also a higher resolution mode. I also found a nice pascal library that implemented the BASIC play statement. This made porting the sound from the original much easier.

Bob's Fury gameplay

Bob’s Fury Gameplay

I have continued to work on this game to this day, and it has changed a lot since the early days. I’ve added many features to the engine that weren’t possible in QBasic, and created more content for it. More recently I have been adding some of my own hardware interfaces, such as one for the keyboard and one for adlib sound cards, although I am yet to make some decent music for it, or the utility to create it. I’ve actually completed most of the game, but need to now create some new original levels for it as well as some story.

Romulus

Romulus in action

Romulus in action

My brothers and myself occasionally play a game called VGA planets. One of the enduring problems for us was that it would take too long with us playing so many of the human players, and there have not been any reasonably good AI players who also play as the shareware player. I know it’s sorta strange we liked playing the restricted version, but for us it introduced strategy elements that are not present in the registered version. So to help solve the problem I started to implement my own AI that would be better than many of the basic AI utilities out there.

I implemented it to behave the same way a player would as best as possible. The player reads result files and outputs turn files instead of modifying the host files directly like many other AI programs do. In this way Romulus cannot cheat, much the same way that spacelord doesn’t cheat. My focus has been to make the player reasonable good at taking advantage of the various abilities, and to be able to respond to various messages and threats such as ion storms. I’ve even attempted to implement a basic system of diplomacy which allows the player to select who it wants to wage war on.

It still needs quite a bit of work on it before I consider it complete. There are also a few segments I need to redesign and implement as the player doesn’t always do what I’d like it to. The main thing I need to do is to add more of the special abilities to the player.

Pascal Resources

There are many pascal resources around the web, but one of the best resources you can find for writing pascal on DOS is called the SWAG. Basically it is a collection of documentation, forum/BBS questions and answers, and code snippets sorted into different categories covering all the major aspects of programming in pascal. Information about programming the x86 hardware is some of the best and most useful content, especially if you are programming for DOS. I would highly recommend getting it if you can.

Pascal isn’t dead

There is a nice open source project called free pascal that implements a full pascal language including object-oriented extensions. Fortunately it is mostly syntax compatible with your old Borland pascal version 6 and 7. So you can use it to port old programs. I have noted it will also generate code for multiple platforms, including most of the operating systems and architectures that I am interested in. There is also a Delphi equivalent called Lazarus that will help you create new applications! The free pascal website is very useful, and has documentation to help you port your old programs and for the API.




Blogs I Follow

Enter your email address to follow this blog and receive notifications of new posts by email.


Mister G Kids

A daily comic about real stuff little kids say in school. By Matt Gajdoš

Random Battles: my life long level grind

completing every RPG, ever.

Gough's Tech Zone

Reversing the mindless enslavement of humans by technology.

Retrocosm's Vintage Computing, Tech & Scale RC Blog

Random mutterings on retro computing, old technology, some new, plus radio controlled scale modelling.

ancientelectronics

retro computing and gaming plus a little more

Retrocomputing with 90's SPARC

21st-Century computing, the hard way

lazygamereviews

MS-DOS game reviews, retro ramblings and more...