Archive for the 'NetBSD' Category

16
Apr
18

SS20 Desktop: Renewed Vigour

Last time I had finally started to get to grips with the system hanging issues, having found that much of the problem came down to SMP kernel issues related to the on-board SCSI that are still present in NetBSD releases. I was fortunate to be given the chance to try out a patch that made SMP much more stable (although not perfect). This gave me essentially 4 different configuration options. After thinking about it, I decided it would be prudent to take some measurements to determine the best way to go.

I have three Mbus modules (pictured above): a dual CPU SuperSparc @ 50MHz, a single CPU SuperSparc @ 60MHz and a single CPU HyperSparc @ 90MHz. The clock speeds can be a little misleading as there is a little more to each module. The SuperSparc modules each come with 1MB of cache per CPU whereas the HyperSparc has only 256KB, and the dual CPU module runs on a slower Mbus @ 40MHz whilst the other two run at 50MHz. Additionally the rough guide to Mbus modules, an essential site for anyone with a Sun machine like mine, suggested that the SuperSparc CPUs would actually perform better on a per clock basis. Given all this it’s not really clear which will be the best performer. From here on in I’ll abbreviate SuperSparc to SS and HyperSparc to HS.

Today we’re going to look at the results of some of the intensive benchmarks I’ve put the modules through, and at the end the best choice of configuration given the hardware I have on hand. All the tests are run with the same OS (NetBSD 7.1) and hardware with the exception of the Mbus modules under test.

The first set of benchmarks is aimed at measuring basic CPU speed. The benchmarks I’ve used are Dhrystone (version 2.1), Whetstone and both the double and single precision versions of the Linpack benchmark. These tests measure the single threaded performance of the modules.

Just looking at these charts it’s obvious that the HS is the fastest of the three modules. Given its higher clock speed that is to be expected, but it also attained higher scores per clock for all the tests except Whetstone. The Linpack tests show a large difference, with the HS running about 12% faster per clock for double precision and about 22% faster per clock for single precision. The Dhrystone test showed a much more subdued advantage, with the HS only about 7% faster per clock. The Whetstone test showed the HS doing floating point arithmetic about 11% slower per clock.
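To make the "per clock" comparisons concrete, here is a minimal Python sketch of the arithmetic. The function names are my own invention for illustration, and it simply assumes a benchmark that reports a single higher-is-better score.

```python
def per_clock(score, mhz):
    """Return the benchmark score per MHz of CPU clock."""
    return score / mhz


def per_clock_advantage(score_a, mhz_a, score_b, mhz_b):
    """Percentage by which module A beats module B per clock.

    Positive means A does more work per MHz, negative means B does.
    The scores must come from the same benchmark to be comparable.
    """
    a = per_clock(score_a, mhz_a)
    b = per_clock(score_b, mhz_b)
    return (a / b - 1.0) * 100.0
```

As a made-up example (not one of my measurements), a 90MHz module scoring 160 against a 60MHz module scoring 100 would come out a little under 7% ahead per clock, even though its raw score is 60% higher.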

Both SS modules performed about the same relative to their clock rate, which indicates the Mbus speed wasn’t a large factor in these tests, and that the data size was likely smaller than the L2 cache (1MB). I would have expected the dual 50MHz module to be slower in single threaded tasks as its Mbus is slowed to 40MHz (as opposed to the 50MHz the others use).

I’m not sure how I feel about the results here, as the data set size for the tests was almost certainly too small to exceed even the capacity of the HS’s 256KB cache. I’m not sure what to make of the Linpack results, but the Dhrystone and Whetstone results seem to indicate the HS core is better at integer and string operations and the SS core is better at floating point.

I selected the next benchmark because it offered speed measurements over a range of data sizes. The Sieve of Eratosthenes is a simple algorithm for finding prime numbers within a finite numerical space. Rather than explain it myself, I’ll point you to Wikipedia for more details. One of its key features is that it is quite hard on a CPU’s memory bandwidth, and its use of the cache is quite sub-optimal. I omitted testing the 50MHz SS module.
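For anyone who hasn’t seen the algorithm, here is a small Python sketch of a textbook Sieve of Eratosthenes. This is not the code from the benchmark program I ran, just an illustration of why the algorithm strides across its whole flag array over and over.

```python
def sieve(limit):
    """Return all primes <= limit using the Sieve of Eratosthenes."""
    if limit < 2:
        return []
    # One flag byte per candidate number. This array is the data set
    # whose size decides whether a run fits in the cache or not.
    is_prime = bytearray([1]) * (limit + 1)
    is_prime[0] = is_prime[1] = 0
    n = 2
    while n * n <= limit:
        if is_prime[n]:
            # Strike out every multiple of n, striding through the array.
            for multiple in range(n * n, limit + 1, n):
                is_prime[multiple] = 0
        n += 1
    return [i for i, flag in enumerate(is_prime) if flag]
```

Each pass touches flags spread right across the array with a different stride, so once the array is bigger than the cache most of those accesses end up going out to main memory.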

The results are quite interesting. The HS enjoys an advantage of about 14% per clock when the data set fits within its cache, but suffers quite a performance drop once the data set gets larger. Despite being 30MHz slower, the SS is faster for data sets small enough for its cache but too large to fit in the HS’s cache. I suspect this gap would be widest at just below 1MB data size, but the program didn’t allow control over that. The worst data point shows the HS as 44% slower per clock. This is quite surprising: as the SS is not much faster than the Mbus (only 10MHz faster), I didn’t expect its advantage at that data size to be so large. Once the data size exceeds 1MB the HS starts to catch up again, but the data points don’t get large enough to know if it ever achieves equal relative performance again. I’d imagine that once the data is large enough both modules would perform close to the same as memory bandwidth becomes the limiting factor.

The next benchmark is similar in that it measures over a range of data sizes, but the algorithm is significantly different. The algorithm used is heapsort, a relatively efficient sorting algorithm used in many places. You can find more details on Wikipedia. One of its characteristics is that it is much more cache friendly. Again I omitted testing the dual 50MHz SS.
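As with the sieve, here is a short Python sketch of a textbook in-place heapsort for reference. It is not the benchmark’s own source, and the function names are just ones I picked for illustration.

```python
def sift_down(a, root, end):
    """Restore the max-heap property below root, looking only at a[:end]."""
    while True:
        child = 2 * root + 1
        if child >= end:
            return
        # Pick the larger of the two children.
        if child + 1 < end and a[child] < a[child + 1]:
            child += 1
        if a[root] >= a[child]:
            return
        a[root], a[child] = a[child], a[root]
        root = child


def heapsort(a):
    """Sort the list a in place in ascending order."""
    n = len(a)
    # Build a max-heap, starting from the last parent node.
    for root in range(n // 2 - 1, -1, -1):
        sift_down(a, root, n)
    # Repeatedly move the current maximum to the end and shrink the heap.
    for end in range(n - 1, 0, -1):
        a[0], a[end] = a[end], a[0]
        sift_down(a, 0, end)
```

Calling heapsort on a list sorts it in place, so there is no second buffer involved and the working set is just the array being sorted.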

Looking at the graphs, this test really needs some points at larger data sizes. I can only really guess, but I’d imagine that the performance would eventually converge as memory bandwidth becomes the dominant factor. The previous test indicates that there may even be a window in which the SS performs better, but without actual data we will never know.

Given that I’ll be using this machine as a desktop workstation I ran a benchmark known as x11perf. It simply tests the maximum speed of components of the X11 protocol. It’s often known just as X for short, and is basically the software that Unix systems use to interface to video displays. The chart shows performance relative to the dual 50MHz SS (the yellow line represents it). A 2 is twice as fast, and 0.5 is half as fast. Each point on the X axis is a test, such as line drawing; there are so many tests (over 300) that it wasn’t practical to separate and chart them individually. Out of interest I ran the dual 50MHz SS with an MP kernel to see if it made any appreciable difference.

There are some quite interesting features of this chart. Firstly you’ll notice that both the faster modules have tests that are significantly slower than the dual SS (30-35% slower at worst). This is because those tests are CPU bound, and with a dual CPU module the X server and client can each have a whole CPU to themselves. Typically those tests involve little actual drawing to screen, like plotting points.

In general the dual 50MHz SS is slower than the faster modules. The SS @ 60MHz is about 1.15 times faster on average and the HS is 1.75 times faster on average. The HS is in general the best on the raw performance numbers, with some odd exceptions. Some tests seem to favour the SS @ 60MHz, which I’d put down to its larger cache.
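The relative figures work roughly like the following Python sketch. It assumes each run has been reduced to a dictionary of test name to operations per second, and it uses a plain arithmetic mean for the average; both of those are assumptions made for the sake of illustration, and the function names are hypothetical.

```python
def relative_performance(results, baseline):
    """Map each test name to results[test] / baseline[test].

    Both arguments are dicts of test name -> operations per second.
    Tests missing from the baseline, or with a zero baseline rate,
    are skipped rather than dividing by zero.
    """
    return {
        test: rate / baseline[test]
        for test, rate in results.items()
        if baseline.get(test)
    }


def average_ratio(ratios):
    """Plain arithmetic mean of the per-test ratios."""
    return sum(ratios.values()) / len(ratios)
```

With made-up rates, a baseline of 1000 dots and 200 lines per second against a module doing 700 dots and 400 lines per second gives ratios of 0.7 and 2.0: slower on the CPU bound test, twice as fast on the drawing test.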

Relative to their clock speed, the 60MHz SS does better than the HS, but I’d imagine this is due to the SBus limiting the maximum throughput to the frame buffer. The SBus only runs @ 25MHz, so it is almost certainly going to slow down a faster CPU when drawing.

The last test is one called Ramspeed. It’s basically designed to measure memory bandwidth. I opted for the more general integer and floating point tests over the specific reading and writing tests as they are more likely to represent a computational load. There are 4 tests. Copy creates two buffers and copies data from one to the other. Scale does the same, but multiplies each number by a constant as it is copied. Add creates three buffers and adds two of them together, storing the result in the third. Finally, Triad also creates three buffers and adds two of them together, but scales one of them by a constant factor before storing the result in the third. All buffers are the same size. The tests I’ve chosen only use buffers that are 32MB in size, so much larger than the caches of either of the modules. You can select the buffer size, and some of the tests available in the program cover a range of sizes.
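In code terms the four tests boil down to very simple loops. The Python sketch below follows the usual STREAM-style definitions of Copy, Scale, Add and Triad; I’m assuming Ramspeed does the equivalent, and the function names are mine rather than anything from its source.

```python
def copy_kernel(src, dst):
    """Copy: dst[i] = src[i]."""
    for i in range(len(src)):
        dst[i] = src[i]


def scale_kernel(src, dst, k):
    """Scale: dst[i] = k * src[i]."""
    for i in range(len(src)):
        dst[i] = k * src[i]


def add_kernel(a, b, out):
    """Add: out[i] = a[i] + b[i]."""
    for i in range(len(a)):
        out[i] = a[i] + b[i]


def triad_kernel(a, b, out, k):
    """Triad: out[i] = a[i] + k * b[i]."""
    for i in range(len(a)):
        out[i] = a[i] + k * b[i]
```

Copy and Scale read one buffer and write another, while Add and Triad read two and write a third, so the later tests put more streams of memory traffic in flight at once.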

The results are pretty bad for the HS. It achieves slightly better speed only for the copy operations, which shouldn’t be surprising as the Mbus should be the limiting factor. For the other tests, however, the SS performs quite a bit better, so much so that I ran the tests many times just to make sure. This would appear to be down to the memory and cache architecture of the modules, not just the cache size, although that is certainly playing an important role in the HS failing to perform. The HS does have a significantly smaller L1 cache, with only an 8K instruction cache versus the 20K instruction and 16K data L1 caches in the SS core.

Having now spent a couple of weeks testing these modules I think we’re starting to get a picture of what these chips can do relative to each other. The HS is clearly faster as long as the data isn’t larger than its cache. The SS on the other hand isn’t as fast at its peak, largely due to a lower clock speed, but handles larger data sets significantly better. The X11 test showed that it is quite beneficial to have multiple CPUs in a workstation, even if only for basic X11 applications. However it also showed the HS to be quite a good choice. I think the tests also show there was some merit to the idea that the SS modules perform better relative to their clock speed, but also that this is highly dependent on the work load.

So what am I going with and what would I recommend? With the hardware I have I’ll use the HS @ 90MHz for running the machine as a workstation, as that makes it snappier to use in general. The flip side is that if I were to use the machine for a computational load, such as compiling a number of packages, number crunching, or a basic server, the two SS modules would almost certainly perform much better as long as the job could be divided between the CPUs. Even the SS @ 60MHz has a good chance of doing computation better on its own. The HS on its own is disadvantaged by not being able to multi-task as well. I have noticed that X is in general less responsive when the machine is under load (compared to both SS modules together), so a second HS module would probably be a nice addition in the future.

If money were no object and I could have any parts at all, both Ross and Sun had decent offerings. The fastest SS is 85-90MHz, and two of these would certainly be quite fast. However I’d imagine they probably wouldn’t be as fast as any pair of HS modules over 125MHz. So in the end the HS modules would be the way to go if you had access to anything. As it stands, looking around online it’s actually really hard to find faster modules for a reasonable price. Among the SS modules those over 60MHz are quite expensive and largely not available. The HS parts have a similar problem, but you can get 90MHz to 133MHz parts at fairly decent prices, although faster modules still command a high price, and slower modules wouldn’t be worth it. Again, with what’s available the HS seems the way to go.

I’ve tried to be as thorough as possible, but if you want to see the raw data and the Gnumeric spreadsheet with calculations and charts you can find them here.

14
Dec
17

SS20 Desktop: Kernel Issues

Over the past few weeks I’ve been continuing my work trying to get the latest NetBSD working on my Sparcstation 20. The system had been hanging and I’d had trouble working out why, so I turned to reading as much as I could to see if I could find any clues. On the mailing list I found someone suggesting that not all SCSI drives co-operate with the on-board controller when running an MP (multi-processor) kernel on later versions, so I looked through my collection of SCA drives to see if I had a different model I could try. I found an IBM Ultrastar disk of around 18GB, so I swapped the Fujitsu drive (model MAJ3182MC) out for it. Surprisingly this made my system behave much better. It would install and run on the uni-processor kernel with no issues at all, where the Fujitsu drives seemed to cause the system to hang frequently under disk access.

However booting with an MP kernel would still hang within about 20 minutes or during disk access, so it was at this point I joined the mailing list to ask others what I could do to resolve the issue. The people on the list are quite friendly and have been very helpful in troubleshooting. It seems that there are some kernel bugs related to MP that are present in 7.1 and at least partially resolved in more recent versions of the kernel. As with most open source OSes, the current stable release is a version or two behind where the developers are currently working. It seems that there is some possibility of the fix being back-ported to 7.1, and I tested out a patched MP kernel that was greatly improved in this respect. It still hung, but after a much longer period of time, and only when provoked by a specific program. Feedback from the mailing list also seems to indicate that choosing not to use the on-board SCSI is another way that I could work around the problem.

So I now have multiple options for running my system. I could switch to using a single processor: I’d have the option of either a 60MHz SuperSparc (currently installed with a dual 50MHz module) or a 75MHz Ross HyperSparc, and everything should work well. Alternatively I could acquire an SBus SCSI card to connect my hard drives, or forgo a local disk entirely by using network booting and an NFS share, both of which avoid having to use the on-board SCSI. Finally I could use the system as it is now with the patched 7.1 kernel, which worked well enough that this is quite feasible. I’m leaning towards booting the machine over the network at the moment.

In the short term with Christmas approaching, I’ll be putting the project aside until I have more time in the new year.

16
Nov
17

SS20 Desktop: Some minor progress

Whilst it has been some time since the original post, I have been a busy beaver trying to get the old Sparcstation 20 running. I’ve been making an effort to get the hardware working with some mixed success, and have made much better progress with the software.

The hardware is of course the much more pressing matter for obvious reasons. I had a recurrence of the stack underrun errors, along with general problems booting. Of course this led me to suspect the hardware, so this week I went about trying to work out what exactly was causing the issue. One way to help determine which part is at fault is to strip the system back to the minimum and gradually add components, testing the system in between. With most components removed the stack underrun symptom didn’t disappear, and trying each memory stick individually didn’t improve things, so I began to fear the worst, as surely not all the RAM I have is faulty. It was at this point I decided to run the set-defaults command to reset the computer’s configuration, despite not seeing anything there that should cause any issues. Funnily enough this seemed to do the trick: the machine got to the OpenBoot prompt without any errors and passed all the diagnostic tests with everything installed. I had to scale back to a 17GB Fujitsu HDD as the larger one didn’t cooperate with the system.

At this point I breathed a big sigh of relief as my hardware is probably in working condition. It’s booting the OS (NetBSD 7.1) and seems to run fine with one problem: random system hangs. There doesn’t seem to be any pattern, such as the machine being loaded down or accessing the network. I’m guessing that the kernel is having some issue and tries to hand control back to the system ROM, but this somehow hangs or fails. I might try running the machine without the X server in case it is stopping any errors from being displayed. I looked into the kernel messages and noted a few devices that may also be the culprit. The kernel is detecting the on-board graphics (it comes up as sx0 in the messages) even though I do not have a VSIMM installed, as I’m using an SBus graphics board instead. The audio chip in my machine is listed as a DBRI, which is known to have issues with the current kernel driver. If you try to play audio in any manner the system hangs. It’s been a bug for a while, although it kinda worked under NetBSD 4.0 when I last had that running. With this in mind I’m building my own kernel with the drivers for these two devices and other unnecessary devices removed.

I’ve had much more luck getting software to build in my emulated machine, and I’ve got a fairly large collection of software to try out. I did have trouble much earlier on, when either QEMU or the emulated machine would hang during a build. I can’t be sure if that’s down to the emulation or if it’s a genuine issue with the OS, and a possible cause of my problems on the real machine. Whilst I haven’t really changed anything in the emulation, it hasn’t hung for quite a while, so it’s anybody’s guess as to the cause when it did happen.

Progress has been slow, but I’m gradually getting there! I’ve seen some cheap Ross HyperSparc 90MHz modules that I’m considering buying as an upgrade.

01
Jun
17

SparcStation Desktop project.

Unfortunately I’ve been neglecting my poor old Sun hardware, mostly because of time and space constraints. I thought I’d try to go some way to correcting this by actually beginning the process of setting up the SparcStation 20 as a vintage desktop workstation. I’d been planning on doing this for ages, as long ago as when I built the replacement server machine.

Hardware wise I’ve not acquired anything new, although everything needed a test and some basic cleaning to get it working. I’m still having issues, but I’m unsure if it’s a hardware fault or a problem with the software I’m installing. We’ll get to the software in a moment; first we’ll look at the hardware installed.

At the moment I have 3 CPUs in the machine. They are all V8 SuperSparcs, with two 50MHz chips on one module and a 60MHz one on a module of its own. Each module has 1MB of cache memory, which doesn’t sound like much now, but was a large amount when these machines first appeared.

Frame Buffer

I’ve currently got about 304MB of memory installed. I had more, but unfortunately one of the sticks is no longer detected. I’d like to have a VSIMM as that would allow me to use the built-in cg14 frame buffer (graphics card), which is probably the best performing one available for machines of its type. I managed to purchase a 2MB TGX+ frame buffer and an adapter to connect it to a VGA screen, which is running at an odd resolution of 1152×900 at 8 bits per pixel. It’s obviously not the fastest, but it does the job. I’ve selected a 136GB 10K RPM SCA drive for the hard disk, certainly a bit of overkill, but it would just be sitting on my shelf otherwise.

The initial issue I had was stack underrun errors after the boot screen came up and the machine attempted to boot. My first instinct was of course failed memory, which led me to find the undetected memory module. But no matter which memory I ran, I had the same problem. After some poking into the system environment (kinda like the BIOS settings in a PC but without the nice interface) I found some items that were not at their defaults, and changing them back seems to have fixed the stack underrun.

Dual CPU MBUS module

Unfortunately that’s not the end of the issues, as after installing and running NetBSD for a while the machine will hang, reset or have a watchdog timer trigger. This certainly could be faulty RAM, but the power supply is also a potential suspect, as is the operating system itself. I need to follow this up with some more testing, but unfortunately I don’t have a spare PSU to test with.

Software wise I’m much more prepared and have had much more success. I’ve been using QEMU, which does full-system emulation for a number of old and different platforms, including Sparc systems. QEMU has been useful for building packages and the kernel specifically for my machine, something I had done ages ago when I first intended to do the install.

At the time I built for NetBSD 6.1.4, which is the OS I’ve installed and tried out on the machine. It’s out of date by quite some margin now, so I’ve set up a new virtual machine to start work on getting 7.1 packages and kernel built. It has a bunch of improved hardware support, particularly in the frame buffer acceleration, so I’m keen to see how it goes. I’m still building packages I want for it, but I’m happy with 7.1 under QEMU so far. I’m hoping the improved hardware support helps with the hang/watchdog/reset issues.

When it’s all done, I’ll post about what it’s like to use the machine for specific tasks, like say browsing the web and checking email.

18
Apr
15

Xsokoban on NetBSD

Having been busy and stressed out lately I haven’t had much time for tinkering or gaming. This is where smaller games that you can play as a quick distraction can help, as they are easy to squeeze in between other jobs, and today’s game is one such game.

Xsokoban is obviously a clone of the old classic game Sokoban. The original was made for the PC-8801 way back in 1982 by Thinking Rabbit. It has since been implemented and ported to pretty much every system, and Xsokoban is one such port for Unix systems running X Windows.

There isn’t anything really remarkable about this particular port. The quality of the graphics is quite reasonable, and it works well on pretty much any system I’ve tried it on including my older machines such as the Sparcstation 20. I’ve even played it over SSH via my comparatively slow ADSL2 connection, and it worked really quite well.

Interestingly the game has been studied in the field of computer science. It turns out it’s quite a difficult problem computationally, being PSPACE-complete. Lots of different researchers have worked on different algorithms for solving and producing optimal solutions. The complexity certainly makes me feel better about getting temporarily stuck on level 6!

As far as enjoyment goes, it’s really enjoyable for the puzzle solver in me, even when I’m stuck. It doesn’t take a huge commitment to play for a short while, and is quite challenging! This particular port isn’t any better or worse than any other, so don’t go out of your way, but NetBSD and FreeBSD users will find it easy to get running.


20
Oct
14

3rd Anniversary and Work on the Sparcstation

This weekend marks the third year I’ve been writing this blog. The first thing I wrote about was my Sparcstation 20, which I had just acquired at the time. I installed NetBSD 4.0.1 on it, which was reasonable then, but has become quite out of date now. So 186 posts and 3 years later I’m in the process of upgrading the machine to NetBSD 6.1.5.

Machine without the PSU

This has been a long time coming, and there are a number of reasons for the upgrade. Firstly, the older version of NetBSD was becoming more difficult to keep software up to date on. I had stuck with 4.0.1 for some time because of performance issues I had when trying out 6.1.2 last year. But some packages didn’t update properly lately, and I had been left with some software working and other software just becoming broken. I could have stuck with an older version of pkgsrc, but that has problems as well.

Another reason is I’ve received the hardware required to use the machine as a desktop machine with screen, keyboard and mouse instead of a headless server. I retired the machine from active server duty and built a replacement server quite recently to facilitate both upgrading the OS and hardware to try and make it a practical desktop workstation. I was very fortunate to receive a donation of a keyboard and mouse suitable for the machine, and have since bought the frame buffer card and adapter to complete the hardware necessary.

Frame Buffer

I got the hardware up and running last weekend and powered up the machine with everything set up for the first time. I was happy that without upgrading the OS, I had the display, keyboard and mouse all working with an X server with little effort. I was impressed that the X server seemed quite speedy compared to what I expected. However, the X server (Xsun) was really outdated and didn’t seem to support everything thrown at it.

So I began to install NetBSD 6.1.4. I found it was best to use the serial console for the install as the install disk does not handle the Sun console on the frame buffer properly. It seems that it just doesn’t have the TERMCAP entries for the Sun console, as once the system is installed the console works fine. The install worked pretty much the same as the older version with a few minor changes. The performance of 6.1.4 seemed better than the last time I tried an upgrade, but still isn’t as fast as the older 4.0.1 release.

So I’ve begun building the system from sources to take advantage of the V8 SuperSparc. I’m assuming the binary distributions you download are actually built for the slower V7 Sparc that is common in some of the older and slower machines. The build process is surprisingly easy to follow. We will see if there is any significant difference when it’s finished building.

 

08
Sep
14

Building a Replacement Server

I’ve been using my old SparcStation 20 for about 3 years for storing my source repositories, allowing VPN access and web serving among other functions. I originally set it up like this as an interesting project to see if I could make good use of exceptionally old hardware with more modern software (NetBSD in this case) and it turned out to be quite handy. The experience as a whole has been a very positive one.

Sun Keyboard and Mouse

Now the time has come to not so much retire the SparcStation, but move it into a new function as a vintage workstation. I was very fortunate to receive a donation of a type 5c keyboard and mouse suitable for use with it, so all I have to get is a frame buffer card and I can plug in a screen and use it as a desktop. Fortunately frame buffer cards are much easier to find than keyboard/mouse combinations so I shouldn’t have an issue finding one.

Having decided to build a new server machine, I went looking through my collection of old hardware to see what I could build out of my spare parts. I already had the large tower case recently donated, so I checked out what was installed in it. Turns out it was a Duron 800, which is quite reasonable, but after measuring its power consumption (about 70W without hard drives) I decided I could make a machine that was cheaper to run with some other parts.

Obviously I want something more efficient than the SparcStation, which uses around 130W with everything installed. It turned out to be quite difficult to find x86 hardware that is efficient once everything is installed. After looking at what I have and doing a bit of research I decided to try out the old Coppermine Celeron 800MHz as it had quite a low TDP. Powered up with a graphics card but no hard disks it used about 60W. Unfortunately it didn’t want to boot, and no amount of prodding got it to work.

Looking in my collection of old hardware I didn’t have many alternatives. I could use a socket 7 based system, but that would likely be _slower_ than the sparc and may use a similar amount of power. I have some Pentium II boards, but I wanted them as spares for my Win98 system. In the end I used some suitable socket 478 (Pentium 4) hardware, which initially looked bad efficiency wise. The P4 of course was known for running hot, and hence also using lots of power.

My older brother donated an MSI socket 478 mainboard to me some time ago without a CPU. I looked through my collection of CPUs and found a Celeron 2.4GHz and heat sink. I installed it and 1GB of DDR and it worked with little effort, but the power consumption without hard disks was 80-something watts, not ideal. I decided to press on with this hardware as I had no other vintage parts that would be suitable, and that power usage ought to be the worst case for the board and processor. That, and I don’t have money for new hardware at the moment.

Machine Assembled

So I assembled the machine in the chassis with a Pioneer DVD drive and two Western Digital hard drives. I selected two 80GB ATA WD drives as they turned out to have the best power consumption and reasonable capacity. All together, just sitting at the BIOS screen the machine used about 100W. Again a worst case and not that great a saving, but at least it’s significantly faster.

I decided to stick with NetBSD for this build for a few reasons. Firstly, it is simpler to migrate the configuration and data from the old machine. Secondly, I like NetBSD because of how light it is and how easy it is to work with. I downloaded the latest version (6.1.4 as of this writing) and went through the install process. Installation was fairly easy, but I couldn’t get X to work correctly on my hardware. I didn’t have a local X server before, so I didn’t worry about getting it to work beyond XDMCP.

After installation I measured the power consumption of the machine at idle, and I was pleasantly surprised that it dropped to about 65-70 watts, a nice improvement over the Sparc. Power usage peaks at about 100W when the machine is under full load, as I first thought. After setting up the hard disks to power down after being idle for a while I managed to reduce the idle figure to just below 60W.
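For a rough idea of what those wattage figures mean over a year, here is a small Python sketch of the arithmetic. It assumes the machine runs 24 hours a day at a constant draw, which is a simplification, and the function names are just for illustration. The 130W figure is the one quoted earlier for the SparcStation, and 60W is the idle figure for the new server with the disks spun down.

```python
HOURS_PER_YEAR = 24 * 365


def annual_kwh(watts):
    """Energy used in a year at a constant draw, in kilowatt-hours."""
    return watts * HOURS_PER_YEAR / 1000.0


def annual_saving_kwh(old_watts, new_watts):
    """Difference in yearly energy use between two constant draws."""
    return annual_kwh(old_watts) - annual_kwh(new_watts)
```

Under those assumptions, going from 130W to 60W works out at a little over 600kWh a year. What that is worth depends entirely on local electricity prices, so I have left cost out of it.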

I’m now happy with the hardware I have set up, although I could use modern hardware and save even more power. I’m currently in the process of setting up the software. I’m rebuilding the kernel and userland for NetBSD. It’s a surprisingly easy process, and well worth it especially for older hardware. I’m not ready to deploy the machine yet, but it looks like it will work well.



