Posts Tagged ‘Hardware’

16 Apr 18

SS20 Desktop: Renewed Vigour

Last time I had finally started to get to grips with the system hanging issues, having found that much of the problem came down to SMP kernel issues related to the on-board SCSI that are still present in NetBSD releases. I was fortunate to be given the chance to try a patch that made SMP much more stable (although not perfect). This gave me essentially four different configuration options. After thinking about it, I decided it would be prudent to take some measurements to work out which of them is the best way to go.

I have three Mbus modules (pictured above): a dual CPU SuperSparc @ 50MHz, a single CPU SuperSparc @ 60MHz and a single CPU HyperSparc @ 90MHz. The clock speeds can be a little misleading as there is a little more to each module. The SuperSparc modules each come with 1MB of cache per CPU whereas the HyperSparc has only 256KB, and the dual CPU module runs on a slower Mbus @ 40MHz whilst the other two run at 50MHz. Additionally the rough guide to Mbus modules, an essential site for anyone with a Sun machine like mine, suggested that the SuperSparc CPUs would actually perform better on a per clock basis. Given all this it’s not really clear which will be the best performer. From here on in I’ll abbreviate SuperSparc to SS and HyperSparc to HS.

Today we’re going to look at the results of some of the intensive benchmarks I’ve put the modules through, and at the end the best choice of configuration given the hardware I have on hand. All the tests are run with the same OS (NetBSD 7.1) and hardware with the exception of the Mbus modules under test.

The first set of benchmarks is aimed at measuring basic CPU speed. The benchmarks I’ve used are Dhrystone (version 2.1), Whetstone and both the double and single precision versions of the Linpack benchmark. These tests measure the single threaded performance of the modules.

Just looking at these charts it’s obvious that the HS is the fastest of the three modules. Given its higher clock speed that is to be expected, but it also attained higher scores per clock for all the tests except Whetstone. The Linpack tests show a large difference, with the HS running about 12% faster per clock for double precision and about 22% faster per clock for single precision. The Dhrystone test showed a much more subdued advantage of only about 7% per clock. In the Whetstone test, which exercises floating point arithmetic, the HS was about 11% slower per clock.
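For anyone wanting to reproduce the "per clock" comparison, the arithmetic is just each score divided by its clock speed, then a ratio of the two. Below is a minimal sketch in Python; the function name and the example scores are made up for illustration and are not the measured results from the charts.

```python
def per_clock_advantage(score_a, mhz_a, score_b, mhz_b):
    """Percentage by which module A beats module B per MHz of clock.

    A negative result means A is slower than B per clock.
    """
    return ((score_a / mhz_a) / (score_b / mhz_b) - 1.0) * 100.0


# Hypothetical benchmark scores (NOT the real measurements).
hs_score, hs_mhz = 180.0, 90
ss_score, ss_mhz = 100.0, 60

advantage = per_clock_advantage(hs_score, hs_mhz, ss_score, ss_mhz)
print(f"HS per-clock advantage over SS: {advantage:.1f}%")
```

Normalising this way removes the raw clock speed difference, which is why a module can win on raw score but still lose per clock.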

Both SS modules performed about the same relative to their clock rate, which indicates the Mbus speed wasn’t a large factor in these tests, and that the data size was likely smaller than the L2 cache (1MB). I would have expected the dual 50MHz module to be slower in single threaded tasks as its Mbus is slowed to 40MHz (as opposed to the 50MHz the others use).

I’m not sure how I feel about the results here. The data set size for the tests was almost certainly too small to even exceed the capacity of the HS’s 256KB cache. I’m not sure what to make of the Linpack results, but the Dhrystone and Whetstone results seem to indicate the HS core is better at integer and string operations and the SS core is better at floating point.

I selected the next benchmark because it offered speed measurements over a range of data sizes. The Sieve of Eratosthenes is a simple algorithm for finding prime numbers within a finite numerical space. Rather than explain it myself, look here on Wikipedia for more details. One of its key features is that it is quite hard on a CPU’s memory bandwidth, and its use of the cache is quite sub-optimal. I omitted testing the 50MHz SS module.
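As a rough illustration of why the sieve is hard on the cache, here is a minimal sketch of the algorithm in Python. This is not the benchmark program’s code, just the textbook method; the function name is my own.

```python
def sieve(limit):
    """Return a list of all primes <= limit (textbook Sieve of Eratosthenes)."""
    if limit < 2:
        return []
    # One flag byte per number; 1 means "still considered prime".
    is_prime = bytearray([1]) * (limit + 1)
    is_prime[0] = is_prime[1] = 0
    p = 2
    while p * p <= limit:
        if is_prime[p]:
            # Strike out every multiple of p. Each pass walks the whole
            # array with a stride of p, so once the array is bigger than
            # the cache most of these writes go out to main memory.
            for multiple in range(p * p, limit + 1, p):
                is_prime[multiple] = 0
        p += 1
    return [n for n in range(limit + 1) if is_prime[n]]


print(sieve(30))
```

The flag array is the "data set" in this kind of test: the larger the limit, the more memory each striding pass has to touch.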

The results are quite interesting. The HS enjoys an advantage of about 14% per clock when the data set fits within its cache, but suffers quite a performance drop once the data set gets larger. Despite being 30MHz slower, the SS is faster for data sets small enough for its cache but too large to fit in the HS’s cache. I suspect this gap would be widest at just below 1MB data size, but the program didn’t allow control over that. The worst data point shows the HS as 44% slower per clock. This is quite surprising: as the SS is not much faster than the Mbus (only 10MHz faster), I didn’t expect the advantage at that data size to be so large. Once the data size exceeds 1MB, the HS starts to catch up again, but the data points don’t get large enough to know if it ever achieves equal relative performance again. I’d imagine that once the data is large enough both modules would perform close to the same as memory bandwidth becomes the limiting factor.

The next benchmark is similar in that there is measurement over a range of data sizes, but the algorithm is significantly different. The algorithm used is heapsort, a relatively efficient sorting algorithm used in many places. You can find more details here on Wikipedia. One of its characteristics is that it is much more cache friendly. Again I omitted testing the dual 50MHz SS.
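For reference, a minimal heapsort sketch in Python is below. It is the standard in-place version and not the benchmark’s own source; the function names are mine.

```python
def sift_down(a, root, end):
    """Restore the max-heap property for the subtree at root, within a[:end]."""
    while True:
        child = 2 * root + 1              # left child index
        if child >= end:
            return
        # Pick the larger of the two children.
        if child + 1 < end and a[child + 1] > a[child]:
            child += 1
        if a[root] >= a[child]:
            return
        a[root], a[child] = a[child], a[root]
        root = child


def heapsort(a):
    """Sort list a in place in ascending order and return it."""
    n = len(a)
    # Build a max-heap, starting from the last parent node.
    for start in range(n // 2 - 1, -1, -1):
        sift_down(a, start, n)
    # Repeatedly move the largest element to the end and shrink the heap.
    for end in range(n - 1, 0, -1):
        a[0], a[end] = a[end], a[0]
        sift_down(a, 0, end)
    return a


print(heapsort([5, 2, 9, 1, 5, 6]))
```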

Looking at the graphs this test really requires some points at larger data sizes. I can only really guess, but I’d imagine that the performance would eventually converge given that memory bandwidth would eventually become the dominant factor. The previous test indicates that there may even be a window in which the SS performs better, but without actual data we will never know.

Given that I’ll be using this machine as a desktop workstation I ran a benchmark known as x11perf. It simply tests the maximum speed of components of the X11 protocol. X11 is often known just as X for short, and is basically the software that unix systems use to interface to video displays. The chart shows performance relative to the dual 50MHz SS (the yellow line represents it). A 2 is twice as fast, and 0.5 is half as fast. Each point on the X axis is a test, like line drawing for instance; there are so many tests (over 300) it wasn’t practical to separate and chart them individually. Out of interest I also ran the dual 50MHz SS with an MP kernel to see if it made any appreciable difference.
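The relative figures on a chart like this can be produced by dividing each test’s rate by the baseline module’s rate for the same test. Here is a small sketch, assuming each result is a rate in operations per second; the numbers below are invented purely to show the calculation and are not x11perf output.

```python
def relative_speed(rates, baseline_rates):
    """Per-test speed relative to the baseline (2.0 = twice as fast)."""
    return [r / b for r, b in zip(rates, baseline_rates)]


# Hypothetical rates in operations per second for three tests.
baseline = [1000.0, 400.0, 250.0]    # stands in for the dual 50MHz SS
candidate = [2000.0, 200.0, 250.0]   # stands in for another module

ratios = relative_speed(candidate, baseline)
mean_ratio = sum(ratios) / len(ratios)
print(ratios, round(mean_ratio, 2))
```

A plain arithmetic mean of the ratios is used here for simplicity; it is an assumption, not necessarily how the averages quoted in this post were worked out.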

There are some quite interesting features of this chart. Firstly you’ll notice that both the faster modules have tests that are significantly slower than the dual SS (30-35% slower at worst). This is because those tests are CPU bound, and with a dual CPU module both the X server and client can have a whole CPU to itself. Typically those tests involve little actual drawing to screen, like plotting points.

In general the dual 50MHz SS is slower than the faster modules. The SS @ 60MHz is about 1.15 times faster on average and the HS is 1.75 times faster on average. The HS is in general the best on the raw performance numbers, with some odd exceptions. Some tests seem to favour the SS @ 60MHz, which would be down to cache size.

Relative to their clock speed, the 60MHz SS does better than the HS, but I’d imagine this would be due to the SBus limiting the maximum throughput to the frame buffer. The SBus only runs @ 25MHz so is almost certainly going to slow down a faster CPU when drawing.

The last test is one called Ramspeed. It’s basically designed to measure memory bandwidth. I opted for the more general integer and floating point tests over the specific reading and writing tests as they are more likely to represent a computational load. There are 4 tests. Copy creates two buffers and copies data from one to the other. Scale creates two buffers and copies data from one to the other, but scales each number by some constant. Add creates three buffers and adds two of them together, storing the result in the third. Finally, Triad creates three buffers and adds two of them together (scaling one by a constant factor), storing the result in the third buffer. All buffers are the same size. The tests I’ve chosen only test with buffers that are 32MB in size, so much larger than the caches of either of the modules. You can select the buffer size, and some tests available in the program test a range of sizes.
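To make the kernels concrete, here is a rough Python sketch of what each one computes. It illustrates the operations only and is not Ramspeed’s code; the function names are my own, and the Add kernel is included on the assumption that it is the fourth test, as in the similar STREAM benchmark.

```python
def copy_kernel(a):
    """Copy: b[i] = a[i]"""
    return [x for x in a]


def scale_kernel(a, m):
    """Scale: b[i] = m * a[i]"""
    return [m * x for x in a]


def add_kernel(a, b):
    """Add (assumed fourth test): c[i] = a[i] + b[i]"""
    return [x + y for x, y in zip(a, b)]


def triad_kernel(a, b, m):
    """Triad: c[i] = a[i] + m * b[i]"""
    return [x + m * y for x, y in zip(a, b)]


a = [1, 2, 3]
b = [4, 5, 6]
print(copy_kernel(a), scale_kernel(a, 3), add_kernel(a, b), triad_kernel(a, b, 2))
```

Each kernel does very little arithmetic per element, so with buffers far larger than the cache the run time is dominated by how fast memory can be read and written.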

The results are pretty bad for the HS. It achieves slightly better speed only for the copy operations, which shouldn’t be surprising as the Mbus should be the limiting factor. However for the other tests the SS performs quite a bit better, so much so in fact that I ran the tests many times just to make sure. This would appear to be down to the memory and cache architecture of the modules, not just the cache size, although that is certainly playing an important role in the HS failing to perform. The HS does have a significantly smaller L1 cache, having only an 8k instruction cache versus a 20k instruction and 16k data L1 cache in the SS core.

Having now spent a couple of weeks testing these modules I think we’re starting to get a picture of what these chips can do relative to each other. The HS is clearly faster as long as the data isn’t larger than its cache. The SS on the other hand isn’t as fast at its peak, largely due to a lower clock speed, but handles larger data sets significantly better. The X11 test showed that it is quite beneficial to have multiple CPUs in a workstation, even if only for basic X11 applications. However it also shows the HS being quite a good choice. I think the tests also show there was some merit to the idea that the SS modules perform better relative to their clock speed, but they also show this is highly dependent on the work load.

So what am I going with, and what would I recommend? With the hardware I have I’ll use the HS @ 90MHz for running the machine as a workstation as that makes it snappier to use in general. The flip side is that if I were to use the machine for a computational load, such as compiling a number of packages, number crunching, or a basic server, the two SS modules would almost certainly perform much better as long as the job could be divided between the CPUs. Even the SS @ 60MHz has a good chance of doing computation better on its own. The HS on its own is disadvantaged by not being able to multi-task as well. I have noticed that X is in general less responsive when the machine is under load (compared to both SS modules together), so a second HS module would probably be a nice addition in the future.

If money were no object and I could have any parts at all, both Ross and Sun had decent offerings. The fastest SS is 85-90MHz, and two of these would certainly be quite fast. However I’d imagine they probably wouldn’t be as fast as any pair of HS modules over 125MHz. So in the end the HS modules would be the way to go if you had access to anything. As it stands, looking around online it’s actually really hard to find faster modules for a reasonable price. Among the SS modules those over 60MHz are quite expensive and largely not available. The HS parts have a similar problem, but you can get 90MHz to 133MHz parts at fairly decent prices, although faster modules still command a high price, and slower modules wouldn’t be worth it. Again, with what’s available the HS seems the way to go.

I’ve tried to be as thorough as possible, but if you want to see the raw data and the gnumeric spreadsheet with calculations and charts you can find them here.

20 Mar 18

Trying the Campbell Cassette Interface

Some time ago I acquired an interesting bit of vintage tech, the Campbell Scientific C20 cassette interface. Since then it had been sitting on my bench looking lonely, so I decided that I should at least try it out before I salvage any of the many useful chips it has inside. I have found a user manual for it along with information confirming that it is indeed what I thought it was, an interface for reading data encoded on audio cassettes by data loggers.

Not having any audio cassettes with appropriately encoded data, however, created a very simple issue: how exactly should I get it to do anything at all? It turns out that whilst the C20 is primarily designed for reading from tape, it is possible to get it to write one as well, so I thought I might as well look at the encoding on my oscilloscope and record a sample of the audio.

So I connected my oscilloscope to the output and a serial line to my old MS-DOS machine. After twiddling with the serial settings both on the machine and in Kermit I managed to get a welcome message and menu from the device which confirms that at least the CPU and serial lines are working. Unfortunately this is about as far as I have gotten.

The manual is exceptionally useful, providing not only information about basic use, but also more detailed technical information and example programs in BASIC for operating the device. I’ve translated one of these programs for writing data to tape. It seems the device is receiving the data I’m sending, but nothing appears on the output that I can see. I’ve not worked out if it’s something as simple as not connecting the scope correctly or if there is some hardware failure.

So unfortunately not as much to report as I’d like, but time has been quite limited and something is better than nothing. I’ll keep trying in the short term.

14 Dec 17

SS20 Desktop: Kernel Issues

Over the past few weeks I’ve been continuing my work trying to get the latest NetBSD working on my Sparcstation 20. The system has been hanging and I’d had trouble working out why, so I turned to reading as much as I could to see if I could find any clues. I found someone on the mailing list suggesting that not all SCSI drives are co-operative with the on-board controller when running an MP (multi-processor) kernel on later versions, so I looked through my collection of SCA drives to see if I had a different model I could try. I found I had an IBM Ultrastar disk that is around 18GB in size, so I swapped the Fujitsu drive (model MAJ3182MC) out for it. Surprisingly this made my system behave much better. It would install and run on the uni-processor kernel with no issues at all, where the Fujitsu drives seemed to cause the system to hang frequently under disk access.

However booting with an MP kernel would still hang within about 20 minutes or during disk access, so it was at this point I joined the mailing list to ask others what I could do to resolve the issue. The people on the list are quite friendly and have been very helpful in troubleshooting. It seems that there are some kernel bugs related to MP that are present in 7.1 and are at least partially resolved in more recent versions of the kernel. As with most open source OSes, the current stable release is a version or two behind where the developers are currently working. It seems that there is some possibility of the fix being back-ported to 7.1, and I tested out a patched MP kernel that was greatly improved in this respect. It still hung, but after a much longer period of time, and only when provoked by a specific program. Feedback from the mailing list also seems to indicate that choosing not to use the on-board SCSI is another way that I could work around the problem.

So I now have multiple options for running my system. I could switch to using a single processor, where I’d have the option of either a 60MHz SuperSparc (currently installed with a dual 50MHz module) or a 75MHz Ross HyperSparc, and everything should work well. Alternatively I could acquire an SBus SCSI card to connect my hard drives, or forgo a local disk entirely by using network booting and an NFS share, both of which avoid having to use the on-board SCSI. Finally I could use the system as it is now with the patched 7.1 kernel, which worked well enough that this is quite feasible. I’m leaning towards booting the machine over the network at the moment.

In the short term with Christmas approaching, I’ll be putting the project aside until I have more time in the new year.

16 Nov 17

SS20 Desktop: Some minor progress

Whilst it has been some time since the original post, I have been a busy beaver trying to get the old Sparcstation 20 running. I’ve been making an effort to get the hardware working with some mixed success, and have made much better progress with the software.

The hardware is of course the much more pressing matter for obvious reasons. I had a recurrence of the stack underrun errors along with general problems booting. Of course this led me to suspect the hardware, so this week I went about trying to work out what exactly was causing the issue. One way to help determine which part is at fault is to strip the system back to the minimum and gradually add components, testing the system in between. Having removed most components the stack underrun symptom didn’t disappear, and trying each memory stick individually didn’t improve things, so I began to fear the worst, as surely not all the RAM I have is faulty. It was at this point I decided to run the set-defaults command to reset the computer’s configuration, despite not seeing anything there that should cause any issues. Funnily enough this seemed to do the trick, getting the machine to the OpenBoot prompt without any errors and passing all the diagnostic tests with everything installed. I had to scale back to a 17GB Fujitsu HDD as the larger one didn’t cooperate with the system.

At this point I breathed a big sigh of relief as my hardware is probably in working condition. It’s booting the OS (NetBSD 7.1) and seems to run fine with one problem: random system hangs. There doesn’t seem to be any pattern, such as the hangs happening when the machine is loaded down or during network access. I’m guessing that the kernel is having some issue and tries to hand control back to the system ROM, but this somehow hangs or fails. I might try running the machine without the X server in case it is stopping any errors from being displayed. I looked into the kernel messages and noted a few devices that may also be the culprit. The kernel is detecting the on-board graphics (comes up as sx0 in the messages) even though I do not have a VSIMM installed, as I’m using an SBus graphics board instead. The audio chip in my machine is listed as a DBRI, which is known to have issues with the current kernel driver. If you try to play audio in any manner the system hangs. It’s been a bug for a while, although it kinda worked under NetBSD 4.0 when I last had that running. With this in mind I’m building my own kernel with the drivers for these two devices and other unnecessary devices removed.

I’ve had much more luck getting software to build in my emulated machine, and I’ve got a fairly large collection of software to try out. I did have trouble much earlier on, when either QEMU or the emulated machine would hang during a build. I can’t be sure if that’s down to the emulation or if it’s a genuine issue with the OS, and a possible cause of my problems on the real machine. Whilst I haven’t really changed anything in the emulation, it hasn’t hung for quite a while, so it’s anybody’s guess as to the cause when it did happen.

Progress has been slow, but I’m gradually getting there! I’ve seen some cheap Ross HyperSparc 90MHz modules that I’m considering buying as an upgrade.

19 Oct 17

Motherboard: EPoX EP-8K3A

Today’s motherboard is a Socket 462 board (also known as Socket A), an EPoX EP-8K3A made in early 2002. The CPUs that fit this socket type have an exposed die that makes direct contact with the heat sink. This is generally good for heat dissipation, but makes installing or removing a heat sink a risky business. Here’s a photo of the board.

The board has a VIA KT333 chip-set, which at the time was one of the first to support the then new DDR333 standard. VIA chip-sets were very common at the time, especially where AMD CPUs were installed. An interesting feature was its ability to run the memory and FSB clocks asynchronously, although in practice this wasn’t that useful. If the memory was slower it became a bottleneck for the entire system. If it was faster the CPU wouldn’t have been able to make full use of the extra bandwidth, although that bandwidth could be used by other devices such as a graphics or sound card. Also noteworthy is the fact that this is a single memory channel board; later systems made use of a dual channel architecture, which had a memory bandwidth advantage.
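To put rough numbers on the bandwidth mismatch, peak bandwidth is simply transfers per second multiplied by the bus width. The sketch below assumes a 64-bit bus for both memory and FSB, and assumes a 266MT/s FSB as an example of a CPU of the era; the function name is mine and the figures are theoretical peaks, not measurements from this board.

```python
def peak_bandwidth_mb_s(mega_transfers_per_sec, bus_width_bits=64):
    """Theoretical peak bandwidth in MB/s for a given transfer rate."""
    return mega_transfers_per_sec * bus_width_bits // 8


ddr333 = peak_bandwidth_mb_s(333)   # DDR333 memory, 64-bit channel
fsb266 = peak_bandwidth_mb_s(266)   # assumed 266MT/s front side bus

print(f"DDR333: {ddr333} MB/s, FSB 266: {fsb266} MB/s")
print(f"Bandwidth the CPU can't reach: {ddr333 - fsb266} MB/s")
```

On those assumptions the memory can supply more than the FSB can carry, which is the headroom left over for other devices.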

It has the usual suspects as far as peripherals go. It has a HDD/FDD controller, serial/parallel ports, two USB ports, AC97 audio and a game port, which would have covered most users’ needs at the time. It lacks on-board LAN and USB 2.0, which would have been nice to have, but are easily added via the 6 PCI slots. There were two models; one had extra IDE ports connected to a RAID controller along with a diagnostic module that displayed the status on a two digit seven segment display. I have the board without these extra features, which doesn’t worry me as I can add a RAID card if needed.

EPoX was known for making boards for the enthusiast and over-clocker, and this board doesn’t disappoint on that front. You can see the voltage regulation circuitry has more capacitors and chokes than contemporary boards. They called this three-phase, but that’s not really a good description; basically it has three separate voltage regulator circuits just for the CPU core voltage. This results in a power supply with less noise on the line, and with the larger capacitor bank it also handles spikes in workload and power drain better. It probably increased the board’s reliability over the long term, even if you didn’t over-clock. I found a review of the board that was written at the time it was released that has more details.

By the time this board was made jumpers were mostly a thing of the past, with everything generally under software control in the BIOS settings. With the front panel connectors clearly marked this board would have been quite easy for an end user to install and set up. This board would have been favoured by technicians partly because of this, but also because it would have almost certainly been more reliable, was fairly cheap, and was even forward compatible with processors and RAM that were yet to be released.

For end users this would have been a great work horse board: it was cheap, reliable, and had extensive upgrade options. However now, as an old board, there are better socket A boards from the era with more features, better compatibility and faster chip-sets more capable of over-clocking. It would still be good in a vintage PC build, but not for a high performance machine of the era.

24 Aug 17

Motherboard: Another unknown Socket 3

Today I’m looking at another 486 socket 3 motherboard that unfortunately I can’t identify. Unlike the last one, this one actually had its model number on the silk screen, but the OEM who put it into a machine has covered the silk screen label with either white paint or correction fluid so that it is unreadable. Obviously this is a massive pain as I have no chance of finding a manual for this board, which is needed because of the large number of jumpers. I suspect they didn’t want end users finding out that it was a low quality board. Here’s a photo.

Again it’s a later 486 board as it has PCI slots rather than VLB slots. Reading the date codes on the chips reveals it was made in mid 1995, around the same time as the other socket 3 I have. The chipset was made by UMC, which I’m unfamiliar with. After having done some forum lurking over at VOGONS and reading some of the Red Hill Guide, it seems that it’s a fairly common chipset found on a variety of boards. I can’t comment on the performance myself, but others have had success getting decent performance out of their chipsets.

There are very few integrated peripherals. It has an old school DIN keyboard connector and two IDE ports, but strangely no floppy disk controller, serial or parallel ports. This ultimately wouldn’t have saved much money for the end user as they’d have to use add-in cards to replace the functionality. Weirdly the IDE ports each use different styles of socket, another sign of cheapness.

The cache chips and system ROM are all socketed, which is a good sign that the cache is probably not a fake. The EPROM unfortunately had the sticker missing, exposing the window for the UV erasable chip. I’ve since put my own sticker over the window to protect it.

In an effort to identify it, I decided to pull the ROM chip and read it in my TL866 universal programmer. I was hoping to find a string that had the model name in it directly, but after an extensive search I only found the BIOS version string, “2A4X5B05”, which was enough to identify the manufacturer as Biostar but not the model.
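Searching a ROM dump for readable text can be done with a few lines of Python, much like the unix strings tool. This is only a sketch of the approach rather than exactly what I ran; the function name is made up, and the sample bytes are a stand-in for a real dump file read from the programmer.

```python
import re


def printable_strings(data, min_len=4):
    """Return runs of printable ASCII at least min_len bytes long."""
    pattern = re.compile(rb"[\x20-\x7e]{%d,}" % min_len)
    return [m.group().decode("ascii") for m in pattern.finditer(data)]


# Stand-in for the ROM contents; a real dump would be loaded with
# open(path, "rb").read() using whatever file the programmer saved.
sample = b"\x00\xff2A4X5B05\x00\x01ab\x00"

print(printable_strings(sample))
```

Raising min_len cuts down on the random short matches that turn up in compressed or binary regions of a ROM image.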

Another unfortunate feature of this board is its real time clock chip with integrated battery. The idea is great in theory, but results in an unusable board when the battery runs flat, which it has. Some of these RTC chips had the option of an external battery, but unfortunately this isn’t one of them, so my only options are to either replace the chip (it’s not socketed) or hack it open and attach an external battery. Unfortunately this board doesn’t even remember the settings through a warm reboot, preventing it from actually booting an OS.

Like many 486 boards much of the basic configuration is done with jumpers. This usually means looking them up in the manual, but this board does have the basic settings for voltage, FSB speed and L2 cache size. Still there are obviously many more jumpers that are undocumented on the board, so the manual would be really handy. Luckily the silk screen has enough information you could install a CPU and not make the magic smoke escape.

At the time this board was made it was fairly low end, and Windows 95 was just around the corner. It would have probably performed ok with MS-DOS and Windows 3.1, but would have been inadequate for Windows 95 when it came out later the same year. Most 486 machines didn’t really perform well with Windows 95 so that’s hardly a surprise. The lack of integrated peripherals is probably the worst point with this particular board, as you’ll need add-on cards even for basics such as a floppy drive and serial port (which you’ll need for a mouse). Otherwise it would have made a serviceable, but not powerful machine.

01 Jun 17

SparcStation Desktop project.

Unfortunately I’ve been neglecting my poor old Sun hardware, mostly because of time and space constraints. I thought I’d try to go some way to correcting this by actually beginning the process of setting up the SparcStation 20 as a vintage desktop work station. I’d been planning on doing this for ages, as long ago as when I built the replacement server machine.

Hardware wise I’ve not acquired anything new, although everything needed a test and some basic cleaning to get it working. I’m still having issues, but I’m unsure if it’s a hardware fault or a problem with the software I’m installing. We’ll get to the software in a moment; first we’ll look at the hardware installed.

At the moment I have 3 CPUs in the machine. They are all V8 SuperSparcs, with two 50MHz chips on one module and a 60MHz one on a module on its own. Each module has 1MB of cache memory which doesn’t sound like much now, but was a large amount when these machines first appeared.

Frame Buffer


I’ve currently got about 304MB of memory installed. I had more but unfortunately one of the sticks that was in it is no longer detected. I’d like to have a VSIMM as that would allow me to use the built in cg14 frame buffer (graphics card), which is probably the best performing one available for machines of its type. I managed to purchase a 2MB TGX+ frame buffer and an adapter to connect it to a VGA screen, which is doing an odd resolution of 1152×900 at 8 bits per pixel. It’s obviously not the fastest, but it does the job. I’ve selected a 136GB 10K RPM SCA drive for the hard disk, certainly a bit of overkill, but it would just be sitting on my shelf otherwise.

The initial issue I had was stack underrun errors after the boot screen came up and the machine attempted to boot. My first instinct was of course failed memory, which led me to find the undetected memory module. But no matter which memory I ran I had the same problem. After some poking into the system environment (kinda like the BIOS settings in a PC but without the nice interface) I found some items that were not at their defaults, and changing them back seems to have fixed the stack underrun.

Dual CPU MBUS module


Unfortunately that’s not the end of the issues, as after installing and running NetBSD for a while the machine will hang, reset or have a watchdog timer trigger. This certainly could be faulty RAM, but the power supply is also a potential suspect, as is the operating system itself. I need to follow this up with some more testing, but unfortunately I don’t have a spare PSU to test with.

Software wise I’m much more prepared and have had much more success. I’ve been using Qemu, which does full-system emulation for a number of old and different platforms, including Sparc systems. Qemu has been useful for building packages and the kernel specifically for my machine, something I had done ages ago when I first intended to do the install.

At the time I built for NetBSD 6.1.4, which is the OS I’ve installed and tried out on the machine. It’s out of date by quite some margin now, so I’ve set up a new virtual machine to start work on getting 7.1 packages and kernel built. It has a bunch of improved hardware support, particularly in the frame buffer acceleration, so I’m keen to see how it goes. I’m still building packages I want for it, but I’m happy with 7.1 under Qemu so far. I’m hoping the improved hardware support helps with the hang/watchdog/reset issues.

When it’s all done, I’ll post about what it’s like to use the machine for specific tasks, like say browsing the web and checking email.



