Monday, August 24, 2015

Emulator Release 1.03 and Browser Disk Space Usage

Nigel and I are pleased to announce that version 1.03 of the retro-B5500 emulator was released on 22 August 2015. All changes have been posted to the repository for our GitHub project. The hosting site has also been updated with this release, and for those of you running your own web server, a zip file of the source can be downloaded from the GitHub repository [updated 2022-05-07].

This is a minor release containing corrections for a few issues we have discovered since version 1.02 was released in June. The most significant correction addresses excessive disk storage usage for emulated B5500 disk units.


Excessive Browser Disk Usage Fixed


The emulator uses an HTML5 API known as IndexedDB to provide persistent storage for B5500 disk devices. IndexedDB is a non-relational (NoSQL) database mechanism. It stores JavaScript objects, indexed either by an external key value, or by possibly multiple data items internal to the object being stored. Instead of tables, IndexedDB has "object stores," which serve much the same purpose.

The emulator creates a small IndexedDB database named "retro-B5500-Config" to store system configuration data. It creates one additional IndexedDB database for each emulated disk system created by the configuration interface. Each of these subsystem databases contains a "CONFIG" store for its internal configuration, plus an "EUn" store for each disk Electronics Unit in the subsystem. Within an EU store, each disk sector is represented as an object, indexed by its zero-relative sector number. An SU is modeled simply as the range of sector numbers it represents.
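As a rough illustration of the layout just described, here is a minimal sketch of creating the object stores for one disk subsystem. The helper names createSubsystemStores and openSubsystem, the euCount parameter, and the version number 1 are hypothetical and not taken from the emulator source; the sketch also assumes out-of-line keys, so that the sector number is supplied as the second argument to put().

```javascript
// Hypothetical helper: create the stores for one disk subsystem database.
// "db" is the IDBDatabase object delivered to an onupgradeneeded handler.
function createSubsystemStores(db, euCount) {
    var names = ["CONFIG"];
    var x;

    db.createObjectStore("CONFIG");         // internal configuration data
    for (x = 0; x < euCount; ++x) {
        // No keyPath is given, so keys are out-of-line: each sector object
        // is stored under the sector number passed to put().
        db.createObjectStore("EU" + x);
        names.push("EU" + x);
    }
    return names;
}

// Hypothetical wiring in a browser; "idbFactory" would be window.indexedDB.
function openSubsystem(idbFactory, name, euCount, onOpen) {
    var req = idbFactory.open(name, 1);     // version 1 is an assumption

    req.onupgradeneeded = function(ev) {
        createSubsystemStores(ev.target.result, euCount);
    };
    req.onsuccess = function(ev) {
        onOpen(ev.target.result);
    };
}
```

Object stores can only be created inside the upgrade handler, which is why the store creation is separated from the open call.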

The Problem


One of the puzzling (and quite disappointing) things several of us have noticed about the emulator is the very large amount of physical disk space it uses to store B5500 disk sector data. The amount of physical disk space required for the IndexedDB database is typically at least 30 times the size of the data being stored.

Each browser manages IndexedDB data internally in its own proprietary way, and neither of the two browsers that are known to support the emulator, Mozilla Firefox and Google Chrome, makes it easy to see from the outside what's there or how it's stored. Firefox implements IndexedDB on top of SQLite. It is easy to see the EU structures in the SQLite database, but the objects themselves are very opaque, probably because they are compressed. Chrome implements IndexedDB using LevelDB, an open-source database manager developed by Google and based on its BigTable technology. Its internal structures are opaque from the outside, also due to compression.

In the past few months, several of us who have larger disk subsystems have been having increasing problems with Firefox accessing those subsystems. The first symptoms were that B5500 operations would simply grind to a halt in the middle of running ordinary work. The MCP was still active, but one of the Disk File Control Units (DFCU, i.e., DKA or DKB) would be hung, along with one of the I/O Units. A halt/load would bring the system back up, but often the system would lock up again fairly quickly. If you had multiple disk subsystems, deleting one of them would usually bring the system back to normal operation. This behavior was seen only with Firefox, not with Chrome.

After a couple of weekends of very frustrating investigation, I finally discovered that (a) Firefox was aborting a disk I/O due to a "Quota Exceeded" error, and (b) aborts are not reported to the IndexedDB onerror event, but rather to its onabort event, and the emulator was not catching the onabort event. The aborted I/O resulted in that I/O never being reported as complete, so the IO Unit stayed busy, and the MCP considered the I/O to still be in progress. Additional I/Os for that DCHA or EU queued up behind the aborted I/O, and eventually everything in the system went idle waiting for I/Os to complete. Not detecting the onabort event in the disk driver was a design error on my part.
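To illustrate the point about onabort, here is a minimal sketch of watching all three outcomes of an IndexedDB transaction, so that an aborted I/O is always reported as finished. The helper name watchTransaction and its finish callback are hypothetical and are not the emulator's actual driver code.

```javascript
// Hypothetical helper: call finish(error) exactly once when the transaction
// ends, whether it completes, reports an error, or is aborted.
function watchTransaction(txn, finish) {
    var done = false;

    function report(err) {
        if (!done) {            // an error event is normally followed by an
            done = true;        // abort event, so guard against a second call
            finish(err);
        }
    }

    txn.oncomplete = function() {
        report(null);
    };
    txn.onerror = function(ev) {
        report((ev && ev.target && ev.target.error) || txn.error ||
                new Error("IndexedDB error"));
    };
    txn.onabort = function() {
        // txn.error describes why the transaction aborted, for example a
        // DOMException named "QuotaExceededError"; it is null if abort()
        // was called explicitly.
        report(txn.error || new Error("IndexedDB transaction aborted"));
    };
}
```

With something like this in place, the driver can turn the reported error into an I/O completion with an error status instead of leaving the I/O pending forever.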

The quota error reports that the emulator's "source" (web site) is using more local disk storage than it is allowed. I have not been able to find out exactly what that storage limit is in Firefox, and either the limit or the way it is enforced has been changing over the last few Firefox releases. Based on what a few of us have observed, the limit appears to be somewhere in the range of 500MB to 1GB of disk usage. With a 30X inflation factor, that corresponds to only about 17-33MB of B5500 data, which means that just loading the SYSTEM, SYMBOL1, and SYMBOL2 tape images would put you in range of the limit.

Knowing we were hitting a quota error and approximately where that error is triggered was useful, but it did not help in knowing what to do about it. My working assumption was that the method we were using to represent sector data was responsible for the 30X inflation factor, and I was starting to think about different representations in order to reduce that factor.

The situation came to a head about two weeks ago when Firefox 40.0 was released. It refused to open most of my disk subsystems. I couldn't even halt/load the MCP in order to dump the files to tape. Fortunately, I had held off upgrading to FF 40 on my main development system until I found out about this problem, so I was able to dump its disk subsystem while still on FF 39.

At this point something obviously had to be done about the way the emulator was storing disk sectors, so last weekend I started to build a testbed to evaluate different ways of representing sector data. As reported in more detail in this post on the forum, I used the old B5500ColdLoader script as a basis for my testbed, and decided to start by creating a disk subsystem using the current sector representation to obtain a baseline disk space usage.

To generate that baseline, I simply loaded the entire SYSTEM tape image, which is 12.4MB in size. When I checked the size of the resulting IndexedDB database to determine its inflation factor, I found -- surprise, surprise -- it wasn't inflated at all! In fact, it was only 10MB -- about 80% of the size of the raw data.

That result made me start looking very carefully at the way the ColdLoader and the emulator's disk driver stored data to IndexedDB. I had thought they were the same, but found there was one of those differences that you wouldn't believe could cause a problem until the evidence pushes the realization in your face.

The short story is that each IO Unit has an internal 16KB buffer it uses to convert between the 6-bit character codes in B5500 memory and the 8-bit ASCII codes used by the device drivers. The buffer is an HTML5 Uint8Array, which is a form of TypedArray object. The IO Unit passes this buffer during calls to the drivers, along with a length that indicates how much of the buffer is valid.

Since each disk sector is modeled as a separate object in IndexedDB, each sector must be stored to IndexedDB separately. Therefore, the disk driver must extract each sector of a multi-sector write from the IO Unit buffer and store it with a separate IndexedDB put() call. It implemented that extract using the subarray() method of Uint8Array:
eu.put(buffer.subarray(bx, bx+240), segAddr);
Alas, the resulting object that gets passed to put() is not a 240-byte Uint8Array, but rather an object that consists of the underlying array plus the starting and ending index values, as if it were something like this:
{data: buffer, start: bx, end: bx+240}
Thus, what got stored in the EU object store for that sector was not 240 bytes as I had assumed, but a copy of the entire 16KB IO Unit buffer, plus the two index values. No wonder we were seeing at least a 30X inflation factor in the data stored.
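The behavior of subarray() can be seen without IndexedDB at all. The following sketch uses arbitrary values for the buffer offset and compares the view returned by subarray() with the copy returned by slice(); the variable names view and copy are just for illustration.

```javascript
var buffer = new Uint8Array(16384);         // stands in for the IO Unit buffer
var bx = 480;                               // arbitrary offset for illustration

var view = buffer.subarray(bx, bx + 240);   // a view onto the same storage
var copy = buffer.slice(bx, bx + 240);      // a new array with its own storage

console.log(view.length);                   // 240
console.log(view.byteOffset);               // 480
console.log(view.buffer.byteLength);        // 16384: the whole IO Unit buffer
console.log(copy.buffer.byteLength);        // 240

view[0] = 0x41;                             // writing through the view...
console.log(buffer[bx]);                    // 65: ...changes the original buffer
```

The view reports a length of 240, but the storage behind it is still the full 16KB buffer, and it is that underlying storage that travels along when the object is cloned for storage.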

Firefox obviously does some compression on this object, but the degree of compression depends largely on what is in the IO Unit buffer. That buffer initially contains all zero bytes, and is simply overwritten by the data for each I/O. Thus, while the end of the buffer could be expected still to contain zeroes, the extent to which the front of the buffer had been overwritten by non-zero data, and therefore how much it could be compressed, would depend on the length of previous I/Os.

While storing these inflated sector objects had thus far only caused a problem in Firefox, they had the same effect in Chrome, and the sizes of Chrome's LevelDB databases were comparable to those of Firefox's SQLite databases. Chrome computes its quota threshold differently, however, based on the workstation's total available disk space. Apparently no one running the emulator has yet hit the Chrome quota limit.

The Solution


The solution to this problem is quite simple -- stop dragging that 16KB buffer around and storing it as part of every sector object. I did that by declaring a 240-byte Uint8Array object local to the disk driver, extracting sector data from the IO Unit buffer into that local object, and doing the IndexedDB put() on that local object, thus:
var sectorBuf = new Uint8Array(240);

...
while (segAddr <= endAddr) {
    for (x=0; x<240; ++x) {
        sectorBuf[x] = buffer[bx++];
    }
    eu.put(sectorBuf, segAddr);
    ++segAddr;
}
where bx is the current offset into the IO Unit buffer, segAddr is the current disk sector address, endAddr is the ending sector address for the I/O, and x is just a local index variable.

This approach introduces a little extra overhead to copy the bytes of each sector from the IO Unit buffer to the local buffer, but that overhead should be more than offset by the elimination of the 16KB IO Unit buffer in the stored object, the need to compress that buffer, and the extra I/O overhead to read and write an inflated sector object.
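For comparison, the byte-by-byte copy can also be expressed with the TypedArray set() method. This is only a sketch of an alternative and not the code in release 1.03; the function name writeSectors, its parameter list, and the SECTOR_SIZE constant are hypothetical.

```javascript
var SECTOR_SIZE = 240;                      // bytes per sector, as in the text

// Hypothetical variant of the write loop: copy each sector into a reusable
// 240-byte array with set(), then store that array under its sector address.
function writeSectors(eu, buffer, bx, segAddr, endAddr) {
    var sectorBuf = new Uint8Array(SECTOR_SIZE);

    while (segAddr <= endAddr) {
        // The subarray() view is used only as the source of the copy;
        // the object handed to put() is backed by just 240 bytes.
        sectorBuf.set(buffer.subarray(bx, bx + SECTOR_SIZE));
        eu.put(sectorBuf, segAddr);
        bx += SECTOR_SIZE;
        ++segAddr;
    }
    return bx;                              // offset of the next unwritten byte
}
```

Reusing one sectorBuf for every sector is safe in either version of the loop, because put() makes its copy of the value during the call, before the next sector overwrites the array.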

The nice thing about this solution is that it can be used with existing disk subsystems. Sector objects already present in the subsystem will be big and fat, but will still work. When a sector is written, it will be stored in the new, more space-efficient format. Rewriting a sector will release some space to the underlying database's available pool, but it will not reduce the total amount of disk space used by the subsystem database. To do that, some sort of compaction process needs to take place, such as the vacuum command for SQLite. LevelDB will apparently compact its databases gradually over time automatically.

Experience to date with this solution indicates that my initial testbed experience is holding up. By completely dumping a disk subsystem to tape images, deleting and recreating it (to get an empty subsystem database), and reloading the dump, I am seeing reductions in disk space usage of 30-60X. In Firefox, the physical size of the reloaded databases continues to be about 80% of the size of the tape images used to load the databases. To cite one concrete example, the disk subsystem I use for most of my emulator development and testing went from 1.9GB down to 48MB, a reduction of almost 40X. The dump tapes used to reload the subsystem totaled 58.7MB.

Even though this solution was easy to implement, works with existing disk subsystems, and adequately resolves the problems we have been having with recent versions of Firefox, I have not entirely given up on the idea of a new representation for sector data. There is a lot of overhead in the IO Unit involved in translating between 6- and 8-bit characters, packing and unpacking words, etc., and it may be that another approach will yield both better space and better processing efficiency.

Implementation


Since the fix for IndexedDB disk space usage in 1.03 is compatible with existing disk subsystems, there is nothing you need to do in preparation for moving to this release. If you are already having trouble with Firefox quotas, or have large disk subsystems and want to avoid having problems in the future, then you should consider dumping and reloading your disk subsystems. This section describes how to do that.

If you have already upgraded to Firefox 40 and cannot access your disk subsystems at all, then dumping the data is going to be problematic. See the next section for some ideas on things you can try and some background on where IndexedDB data is stored.

Assuming you can halt/load from your disk subsystem, however, then here are the steps to shrink the amount of disk space that subsystem is using.
  1. Dump the B5500 disk data: 
    1. Mount a blank tape in one of your tape drives. Make sure it is write-enabled (it will be by default) and the drive is in REMOTE status. 
    2. On the SPO, enter the following command: "?DUMP TO MYDUMP =/=; END". You may substitute any other tape name for MYDUMP, up to seven characters long.
    3. After the drive rewinds, click LOCAL and UNLOAD on the drive, then indicate you want to save the tape image.
    4. You can either save the page directly in Firefox (make sure you save it as type Text, not HTML), or copy/paste the text of the image to a text editor and save it from there. Make sure your editor is set not to trim trailing spaces from lines in the file or to replace spaces with tabs.
    5. If Library/Maintenance is not finished, mount another blank tape and repeat steps 3 and 4 above as necessary.
  2. Start the emulator, but do not power it on.
  3. Delete and recreate the disk subsystem:
    1. Enter the System Configuration tool by clicking the B5500 logo on the Operator Console panel.
    2. Select the appropriate system configuration, then click the EDIT button next to the Storage name for that configuration.
    3. In the Disk Storage Configuration window that opens, click the yellow DELETE button. Click through the "are you sure you want to do this" prompts. The Disk Storage Configuration window should close.
    4. Back on the System Configuration window, click the NEW button next to the Storage name field. The name of the subsystem just deleted will probably still display in the pull-down list. Enter the name of the disk subsystem you just deleted in the pop-up dialog that displays.
    5. Configure the new disk subsystem as desired in the resulting Disk Storage Configuration window. Click SAVE on that window when finished, then click SAVE on the System Configuration window.
  4. Cold-start the new disk subsystem. See the instructions for doing this in the Getting Started wiki page. You can either use the standard SYSTEM tape image to do this, or the first tape of the dump you just created. In that latter case, you will need to modify the cold-start deck to use the tape name of the dump.
  5. Once the system has halt/loaded, enter on the SPO, "?LOAD FROM MYDUMP =/=; END". Library/Maintenance will refuse to overwrite a few system files, such as the running MCP. Do not forget to CI the Intrinsics file once it has been reloaded.
This may read like a lot of work, but it usually goes quickly. You should repeat this process for each of your disk subsystems.

Recovering from Quota Exceeded Problems


If Firefox considers your disk storage already to have exceeded the quota limit, then you may not be able to run the emulator in order to dump the data. There are a couple of things you can try, however.
  • If you have multiple disk subsystems, try moving all but one of them out of the folder where Firefox maintains IndexedDB databases. That may reduce the disk space usage below the quota threshold, and allow you to halt/load from the remaining disk subsystem and dump it to tape. You may then be able to swap in the remaining disk subsystems one at a time and dump them as well.
  • If some disk subsystems are too large for the first suggestion to work, you can try downgrading temporarily to an earlier version of Firefox, and see if you can halt/load and dump the subsystem using that version. You can download older Firefox releases from https://ftp.mozilla.org/pub/mozilla.org/firefox/releases/.
On Windows 7, Firefox stores its IndexedDB data in the following location:
\Users\<user>\AppData\Roaming\Mozilla\Firefox\Profiles\<profile ID>\storage\default\<source>\idb\
where <user> is your Windows user name, <profile ID> is the profile name Firefox assigns (e.g., 0qus6gtz.default), and <source> is the host name of the web site from which you have loaded the emulator, e.g.,
http+++localhost
http+++www.phkimpel.us
Within the idb\ folder, each IndexedDB database is represented by a directory and an SQLite file. The names of these files are derived from the disk subsystem name as follows:

  • The first part of the name is a 10-digit number, probably derived from a numeric hash of the name.
  • The second part of the name is the disk subsystem name, but with the letters rearranged. This part consists of the characters from the first half of the name, with the remaining characters inserted between them in reverse order. For example, B5500DiskUnit is converted to 1182897429Bt5i5n0U0kDsi.
  • The directory name has the extension ".files" and the SQLite file has the extension ".sqlite".
If the database is presently open and being updated, there may be additional SQLite files with the same name but different extensions. 
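The rearrangement of the subsystem name described above can be sketched as follows, which may help when trying to match files in the idb\ folder to subsystems. The function name rearrangeName is hypothetical. The sketch reproduces the one example given above; its behavior for names with an even number of characters is an assumption, and it does not attempt to compute the 10-digit numeric prefix.

```javascript
// Hypothetical helper: interleave the first half of a database name with the
// remaining characters taken in reverse order. For odd-length names, the
// first half is taken to include the middle character.
function rearrangeName(name) {
    var half = Math.ceil(name.length/2);
    var front = name.substring(0, half);
    var back = name.substring(half).split("").reverse().join("");
    var result = "";
    var x;

    for (x = 0; x < front.length; ++x) {
        result += front.charAt(x);
        if (x < back.length) {
            result += back.charAt(x);
        }
    }
    return result;
}

console.log(rearrangeName("B5500DiskUnit"));    // Bt5i5n0U0kDsi
```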

I have found that you can move the file and directory for a database into and out of the idb\ folder, and Firefox will adapt to that, but it is best to do this when Firefox is completely shut down.

For Ubuntu Linux, the IndexedDB data is stored in this path under your home directory. Most Linux distributions will probably be similar:
~/.mozilla/firefox/<profile ID>/storage/default/<source>/idb/
For Apple Macintosh OS X, the IndexedDB data is stored in this path under your home directory:
~/Library/Application Support/Firefox/Profiles/<profile ID>/storage/default/<source>/idb/
Prior to Firefox 38, IndexedDB data was stored in the .../storage/persistent/ folder instead of .../storage/default/ for all host systems, and the directory for the database did not have the ".files" extension.


Other Corrections in Release 1.03


In addition to the disk space usage fix described above, this release has the following:

  1. Added onabort traps in B5500DiskUnit to catch QuotaExceeded errors. These are reported to the MCP as unit not-ready conditions, which will cause an error message to be printed on the SPO.
  2. Modified the delay-deviation adjustment mechanism in B5500SetCallback to avoid oscillating between positive and negative cumulative deviations. This should improve the responsiveness of the emulator's internal timing and multi-threading.
  3. Corrected tape reel angular motion in B5500MagTapeDrive, especially during reverse tape movement.
  4. Fixed a bug with reporting memory parity error during tape I/O (should that error ever occur, which at present it won't, as the emulator does not generate memory parity errors).
  5. Reset the Algol Glyphs option for card punch CPA in the default system configuration template. Newly-created configurations will no longer have this option set by default.
  6. Fixed tools/B5500LibMaintDecoder to examine an entire .bcd tape image file instead of just the first 64KB.
  7. Added USE SAVEPBT to the default cold-start options in tools/COLDSTART-XIII.card. This will cause printer-backup tapes to be marked as saved when they are released and not printed automatically.
  8. Eliminated the extraneous "schema update successful" alert when altering a disk subsystem configuration. This means there is one less dialog box you must click through when changing that configuration.
  9. Committed minor corrections supplied by Richard Fehlinger to source/B65ESPOL/SOURCE.alg_m.