In the work leading up to my two-month hiatus, the emulator became substantially more stable and more capable on almost a weekly basis. We are now at Release 0.14, which was pushed out in early October. There are still problems, and more features to implement, but the emulator is now in a very usable state.
The last blog post ("It's Alive...," 3 June 2013) seemed to strike a chord. Several people are now using the emulator, others have contacted us with comments, and we have received offers of additional B5500 material. There is more on this subject below.
Significant Changes and Improvements
Shortly before the last blog post, we resolved a very nasty problem with the so-called "R+7" aspect of subroutine stack linkage. That one fix made the emulator about an order of magnitude more stable than it had been prior to that point. It enabled us to begin using the system as you would a real B5500 under the control of its MCP operating system. Since then, the following major enhancements and fixes have been implemented:
Initially, all that we had were the SPO and Head-per-Track disk peripheral units. This made it impossible to run anything but programs we could load from the Mark XIII system tape images using the ColdLoader utility. A high priority was to implement card input. The current driver emulates the Burroughs B129 1400-card/minute reader. Card decks are ordinary ASCII text files. You load one or more files into the reader using a standard file picker dialog, then press the reader's START button. The MCP senses the reader's change in status, and starts reading cards, just as it worked on a real B5500.
"Dummy" Line Printer
Nigel started working on the implementation for a line printer peripheral unit, and ran into problems getting a prototype to work. It turned out the problems were mine in the way that printer I/Os were being initiated and terminated in the IOUnit and CentralControl modules. In the process of fixing those, I literally threw together a very basic diagnostic printer driver out of pieces of the SPO and card reader implementations. After getting my IOUnit problems fixed, I took out the diagnostic stuff, and well, we're still using that. It works fine for simple output, but at some point will need to be replaced by something with a better user interface and more complete functionality.
Card Punch
After getting the card reader and preliminary line printer units to work, it was a straightforward task to clone a card punch peripheral driver out of those. Besides, I was beginning to work on the Card-Load-Select mode of loading the system, and needed a way to output card decks for programs like the COOL and COLD loaders.
Improved Console Display
The B5500 had a very minimalist operator console -- just a few buttons and lights. There were lots more lights on the maintenance panels in the Distribution and Display unit, but those were usually hidden from view behind the "skins" of the mainframe cabinets. With the emulator, though, it was often difficult to see from the console what was happening with the system (or whether anything was happening at all), so I have added some annunciators to the console that show the activity of the I/O Units, external interrupts, and the individual peripheral devices. These are really helpful to gauge the activity of the system, and they make a nice show, besides. The extra lights can be disabled if you are a purist and want the console to look like it did on a real B5500.
Smaller User Interface Windows
The initial designs of several of the windows for the peripheral devices were just too large. They worked fine on the 23-inch monitor I typically use for development and testing, but on other systems, particularly laptops, the windows crowded each other out and made it difficult to see what was happening overall with the system. The windows for the SPO, card reader, and card punch have all been reduced in size to better accommodate smaller displays.
RTS Presence-Bit Bug
RTS is the Return from Subroutine instruction. It is typically used to exit from what Burroughs termed "accidental-entry" procedures, but what the rest of the world refers to as "thunks." This type of subroutine is used to implement the semantics of Algol Call-by-Name parameters. It turns out that such subroutines can return data descriptors, and a requirement of the RTS instruction is to throw a Presence Bit interrupt (page fault) if the returned descriptor points to an absent memory segment.
That requirement was poorly documented, and we weren't checking for descriptor absent status in the emulator. The result was that the emulator could use the address field of an absent descriptor as if that field contained a valid memory address. This error only showed up while running some FORTRAN programs, and it proved to be very difficult to trace the symptoms back to the cause. It took several long, frustrating days of tracing and debugging before I finally found an obscure reference to the P-bit requirement, after which the fix was obvious and simple -- as is often the case with such problems.
Floating-Point Arithmetic Bugs
One of the first things I tried once the card reader and line printer were working was an Algol number-crunching program from my student days, for which I still had listings of source and output from a B5500 run in 1970. The program does orthonormalization of vectors to compute rheological parameters for two-phase flow in a round pipe (and before you ask, no, I don't understand what that means anymore).
Getting the program to run was no problem, but the results from the emulator were not even close to those on the 1970 listing. I was getting at best one digit of agreement between the two. I have a few other programs and listings from that era, and they were showing similar problems for cases involving complex calculations. Programs doing simpler calculations showed quite good agreement, however, so that smelled like some sort of rounding or normalization problem in the emulator.
After being deviled by this for months and frustrated in a couple of attempts to find the problems, earlier this month I wrote an Algol program for the B5500 that generated a variety of numeric-word bit patterns, computed all of the combinations of those bit patterns for the add, subtract, multiply and floating-divide operators, and dumped the results in octal. I then converted that program to the modern Unisys MCP architecture (which uses the same numeric format) and generated an equivalent set of results.
Comparing the two sets of results indeed revealed a number of cases of off-by-one differences in mantissa values, and a few cases where the differences were even worse. Knowing the bit patterns that generated these differences, I was able to trace the evaluation of those specific patterns in the emulator, and found several problems with rounding and normalization. All but one of the problems were in add/subtract, which internally is the same operation with some sign manipulation. The remaining problem was due to rounding when multiplying two integers -- which by their nature should never have their product rounded.
After correcting these issues, the results from the emulator now match -- to the digit -- the results in all of my listings from 1970 that I've been able to check thus far.
Card Load Select Bug
By default, the B5500 booted from disk. A push-button switch on the operator console would cause it to load from cards, however. Loading the MCP from disk has been working for months, but attempting to load from cards would read the binary boot loader card plus the first card of the program being loaded, then hang. After previously making several runs at the problem, earlier this month I finally found the cause -- hardware load proceeds much like any other I/O, but it is initiated as a special mode of the I/O Unit. It generates a result descriptor, but not a completion interrupt.
The problem turned out to be that the emulator was not suppressing the completion interrupt. Load from disk worked, either because of the timing involved, or more likely because the MCP's KERNEL bootstrap was smart enough to ignore the extraneous interrupt. In contrast, the binary one-card loader is pretty dumb, and apparently became confused by the pending interrupt left by the hardware load mechanism. Booting the system from cards now works.
Hosting Site and Wiki
While the emulator runs entirely within a web browser, it requires a web server from which it can be loaded into the browser. Not everyone has the wherewithal to set up and operate their own web server, so we have set up a web site to support the following:
- Host the current release of the emulator.
- Make available the Mark XIII tape images containing the Burroughs system software.
- Make available releases of the emulator source code for downloading.
- Serve as a central source for emulator utilities and information.
You are welcome to visit and use this site at http://www.phkimpel.us/B5500/.
We have also created a number of wiki pages on the project's GitHub site (formerly the Google Code site, http://code.google.com/p/retro-b5500/) describing how to set up and use the emulator and its components. There are links to these wiki pages under the main Help link on the hosting site.
Browser Status and Performance
One of the goals of this project has been to have the emulator execute programs at the speed of a real B5500, or as close to that as can be practical in its browser-based environment. Throughout the development of the emulator, especially in the Processor module, we have been concerned about the emulator's potential performance, and have tried to keep the coding as lean as possible. With the emulator becoming reasonably stable and the ability to compile and run various programs through the card reader, we are starting to get a feel for its performance.
We could see this in the difference of the emulator's performance when running in Google Chrome compared to Mozilla Firefox. Earlier this summer, Chrome was ahead in implementing the 4ms HTML5 DOM standard for setTimeout(), and the effective speed was much closer to the B5500 than with Firefox. Apparently Firefox made a change to its timer granularity in version 22, and the emulator performance is now better with Firefox than it is with Chrome.
In that latter case, the Processor will resume sooner than it should (and not actually throttle the performance very much), but the throttling mechanism does its computations based on total cycles accumulated vs. total elapsed time, so at the end of the next throttling cycle, it will typically compute an even larger delay. Eventually the delay will grow to the point that it exceeds the threshold, and setTimeout() will be called to do some real throttling. This approach generates jitter in the execution of the Processor, but the delays and non-delays average out, and it happens fast enough (15ms is approximately the refresh interval of most monitors) that the jitter usually is not noticeable.
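The delay calculation described above can be sketched roughly as follows. This is only an illustration, not the emulator's actual code: the function names, the 1 MHz clock rate, the 4ms threshold, and the slice sizes are all assumptions chosen to make the arithmetic easy to follow. The wall clock is simulated so the example runs without real timers.

```javascript
// Assumed values for illustration only; none are taken from the emulator source.
const CYCLES_PER_MS = 1000;   // assumes a 1 MHz emulated clock
const THRESHOLD_MS = 4;       // assumed smallest delay worth handing to setTimeout()

// Returns the delay (in ms) to request before the next time slice, or 0 when
// the computed delay does not exceed the threshold and the Processor should
// simply resume at once.
function throttleDelay(totalCycles, elapsedMs, thresholdMs = THRESHOLD_MS) {
    const emulatedMs = totalCycles / CYCLES_PER_MS;  // time a real machine would have needed
    const delay = emulatedMs - elapsedMs;            // how far ahead of real time we are
    return delay > thresholdMs ? delay : 0;
}

// Deterministic walk-through using a simulated wall clock (hypothetical helper).
function simulate(slices, cyclesPerSlice, hostMsPerSlice) {
    let totalCycles = 0;
    let elapsedMs = 0;
    const delays = [];
    for (let i = 0; i < slices; i++) {
        totalCycles += cyclesPerSlice;   // cycles accumulated by this slice
        elapsedMs += hostMsPerSlice;     // host time the slice actually took
        const delay = throttleDelay(totalCycles, elapsedMs);
        delays.push(delay);
        elapsedMs += delay;              // in a browser: setTimeout(resume, delay)
    }
    return delays;
}

console.log(simulate(4, 5000, 2));   // delays: 0, 6, 0, 6
```

Because the calculation uses running totals rather than per-slice figures, a delay that is skipped is not lost: it carries into the next slice, where the larger accumulated value crosses the threshold. In this run the skipped 3ms from the first slice shows up as part of the 6ms delay after the second.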
With this new approach to throttling in place, the emulator is still running a little slower than a real B5500, but only by 7-8%. Further improvements in apparent performance will probably require detailed tuning of clock accumulation in the individual instructions. That can wait. In any case, B5500 instruction timings are difficult to model, because the Processor overlapped execution and memory access whenever it could, and the crossbar memory access mechanism in Central Control could generate random delays due to conflicting access to a memory module by multiple Processors and I/O Units.
Performance of the emulator also depends somewhat on the underlying platform. Firefox and Chrome remain the two browsers we have found that support the features the emulator needs to run. Apple Safari through 6.0 does not yet support IndexedDB, although the emulator should work in Firefox on a Mac. It does not work on Microsoft Internet Explorer through IE10. We have not yet tried Opera.
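A quick way to see whether a given browser offers IndexedDB is a feature check along these lines. This is a minimal sketch, not the emulator's own start-up test: the function name is hypothetical, and it covers only IndexedDB, the one requirement named above, so passing it does not guarantee the emulator will run. The prefixed property names are the vendor-specific forms some browsers used before the unprefixed API settled.

```javascript
// Returns true when the given global object exposes IndexedDB, either under
// the standard name or under one of the older vendor-prefixed names.
function hasIndexedDB(scope) {
    if (scope === null || scope === undefined) {
        return false;
    }
    const candidates = ["indexedDB", "mozIndexedDB", "webkitIndexedDB", "msIndexedDB"];
    return candidates.some((name) => scope[name] !== undefined && scope[name] !== null);
}

// In a browser, pass the window object: hasIndexedDB(window)
console.log(hasIndexedDB({ indexedDB: {} }));   // true
console.log(hasIndexedDB({}));                  // false
```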
I have tried the emulator on a variety of Windows systems using Firefox, and it runs everywhere I have tried. It runs well, if slightly slower than on my quad-core Optiplex 390, on a five-year-old Dell D830 with a 2 GHz Pentium Core Duo T7250 under 32-bit Windows 7, and
Other Participants
One of the gratifying things about this project is the interest that other people have shown in it. We have been somewhat surprised at the number of people who have picked up the emulator and started using it, without much apparent difficulty, and only then let us know what they were doing. A few of those people have become more intensely involved with the project:
- Fausto Saporito of Naples, Italy has been an early user of the emulator, and has contributed a number of FORTRAN benchmarks, including Whetstone and an arctangent program that appears to do a good job of measuring floating-point loss of significance for a processor. Fausto has also single-handedly transcribed the Mark XVI FORTRAN source from the listing on bitsavers.org. The current version is in the project's Subversion repository on Google Code.
- Tim Sirianni of Eureka, California, US, stunned us by reporting that he had the TS (timesharing) MCP running and was using the CANDE timesharing editor, sort of. We don't have datacom working yet in the emulator, which is where the "sort of" comes in. Tim found a way to use the SPO as a CANDE terminal, but it is awkward to use, and not an approach for the less-than-determined.
- Paul Cumberworth of Adelaide, Australia has transcribed the patches for our Mark XVI ESPOL compiler source and gotten those to compile with the base source. These are also available in our Subversion repository.
A copy of the scanned listing is available on our hosting site at http://www.phkimpel.us/PickUp/APL-B5500-Listing-19710111.pdf. It is about 44MB in size.
Fausto and Hans Pufal of Angouleme, France, have volunteered to transcribe the APL listing. Hans has helped us before, having previously transcribed the Mark XVI source code we used to create our ESPOLXEM cross-compiler. Fausto is starting from one end of the scanned APL listing and Hans from the other. At last report, they had only 20 pages to go until they meet in the middle, à la the Mont Blanc tunnel. Their progress to date is available in the Subversion repository. Once their transcription is complete, it will need to be proofread and corrected before it can be used. We will also need to get datacom working in the emulator.
Current Efforts
We have some known problems and a couple of high-priority features requiring attention.
- A proper datacom interface will be required to run the TSMCP and CANDE, as well as the APL interpreter. I am currently working on a very basic, one-terminal implementation of the B249 Data Transmission Control Unit and B487 Data Transmission Terminal Unit. Supporting external terminals in a browser environment is extremely difficult (browsers are quite determined to be clients, not servers), so this initial implementation will simply host a single terminal as a user interface to the B249/B487, somewhat similar to the way the SPO currently works. That should be adequate for most users. We hope to have this feature available soon.
- After datacom, the next priority is support in the emulator for magnetic tapes. We think we know how to approach this, but detailed design work has not yet begun.
- The B5500 would support two processors, but our attempts to get the second processor working have thus far been a failure. This has been especially frustrating, because the differences between P1 (the control processor, which is currently working) and P2 (which could only run Normal-State user programs under control of P1) are very minor. In fact, the two processors on a real B5500 were physically identical, and either one could be designated as P1 by means of a mechanical switch. I have made three serious runs at this problem, most recently last weekend, and come up short each time. I made some progress this last time, finding a problem in the way P2 was handled by the SFI (Store for Interrupt) instruction. With that change, P2 now runs for a few seconds before somehow failing. The problem is obviously subtle, and is proving difficult to trap, even with special code inserted into the emulator to do so. Getting P2 to work is a relatively low priority, so this problem has been set aside for now.
- The other major deficiency in the Processor implementation at present is that the double-precision arithmetic operators have never been finished. Their single-precision equivalents are currently standing in for them. One of Fausto's benchmarks requires double precision, and the compilers require double precision in order to properly compile double-precision literals, so the priority of this issue is rising.
1 Has anyone else noticed that in the days of the B5500, "core" meant memory, but now it means processor?