B5500 Emulation Project: SWITCH vs. CASE, Part 2

This post continues the discussion of SWITCH vs. CASE as implemented in the Burroughs B5500 Extended Algol compiler. In Part 1, we examined the code generated for each of these constructs and analyzed their differences. In this second part, I will analyze what was wrong with the program I wrote to explore those constructs and describe how to fix it.

To briefly recap the discussion thus far, I wrote a small Algol program shortly after going to work for Burroughs in 1970. The ostensible purpose of this program was to examine the code for both the SWITCH and CASE constructs of the language to determine which was more efficient.

Today, being surrounded by such a glut of inexpensive, incredibly powerful computing devices that they literally have become hazardous waste, it is easy to forget how precious and expensive computer time was a few decades ago, and how difficult it often was to come by. The start of my career at Burroughs was blighted by assignment to a boring documentation project that offered no opportunity to program. Recreational programming was rarely an option in those days, so when the subject of SWITCH vs. CASE came up within another group in the office, I leaped at the chance to get a coding fix and help them decide which construct they should use.

Capitalizing on this opportunity, I decided to go beyond writing a program simply to analyze SWITCH and CASE, and add a few more things to see how they worked as well, including a couple of Stream Procedures that would attempt to dump the program's Program Reference Table, or PRT. In hindsight, that was more than a little foolish, as Stream Procedures, improperly used, could compromise the health of the entire system, and they were quite easy to use improperly.

I managed to demonstrate that ease on my first attempt by making a couple of really dumb mistakes in coding the Stream Procedures. I did not get the desired results, and the program aborted with an Invalid Index (array bounds violation) fault. Now, decades later, the goal is to figure out what happened and fix it.

The Dumb Mistakes

While the program and listing discussed in Part 1 were successful in illuminating how SWITCH and CASE worked, I didn't get the PRT dump I wanted. I got part of a dump before the program aborted with the Invalid Index interrupt, but it turns out that dump was not of the program's PRT.

What the program tried to do was copy the PRT to the array A in the program, and then format the words from that array as octal to a printer file. Copying the PRT was to be the responsibility of the MOVEPRT Stream Procedure. It turns out that I was on the right track, but made two serious errors.

The basic idea of the procedure is simple -- get the address of a word with a known offset within the PRT, adjust that address downward to the beginning of the PRT, then copy some number of words from that address to the destination array. The first variable declared in an Algol program is always at PRT offset 25 octal, so I chose that location as the base. The procedure has four parameters: a descriptor containing the address of the first variable, a descriptor for the destination array, and two integers, the first representing the number of words to copy divided by 64, and the second the number of words modulo 64. The reason for having two parameters is that repeat counts in B5500 Character Mode are limited to six bits -- values 0-63 -- so the div/mod parameters will be used as repeat counts for two nested loops.

The mistakes in this procedure are all on line 19 of the program:

SI ~ LOC PRT25; SI ~ SI - 21;

The first statement was intended to assign to the source index (SI) the address of the word at PRT offset 25 octal (the variable I in the program). Alas, what it assigned was the address in the stack of the PRT25 parameter itself. Dumb. I was confused about the semantics of LOC -- that keyword should not be there. The second statement was intended to back down that address by 21 decimal (25 octal) words to the beginning of the PRT. Alack, what it did was back down the address by 21 characters. Dumber.

The rest of the procedure was written correctly. The destination index (DI) was assigned the address of the array A, and the appropriate number of words gets moved by the "DS~...WDS" constructs. The parentheses indicate repeat loops, which are preceded by their repeat count, limited to the range 0-63. The hardware forces word transfers to begin on a word boundary, so even though the adjustment to SI above left it pointing in the middle of a word, whole words starting on word boundaries were transferred to the array.

At this point the alert reader may have noticed that the destination array is declared in the inner block as A[0:I], and the call on MOVEPRT at line 50 in the program uses I as the number of words to transfer, but I is never assigned a value in the program. How could that ever work? The answer lies in the second control card of the deck, "?COMMON = 100". That command stores the specified value in the first scalar variable declared in the program, before execution of the program begins. In this case, that store is to the integer I at PRT offset 25. Thus, A has dimensions of [0:100] and a length of 101; 100 words will be moved by MOVEPRT.

But what got moved to the array A? Since SI was adjusted backwards by 21 characters (two words plus five characters), it is pointing into the third word below the location of the PRT25 parameter in the stack. A word-oriented transfer adjusts the address, if necessary, forward to the next word boundary, so the transfer actually began two words below the location of that parameter and continued for 100 words. What the output in the original listing shows is a piece of the program's stack, starting one word below the stack frame for the call on MOVEPRT:

The first word in the output (all zeroes) is whatever was at top-of-stack before MOVEPRT was called.
The second word (beginning with a 6) is the Mark Stack Control Word (MSCW) that starts the stack frame for the call. The primary purpose of this word is to link to the prior stack frame, which is at address 12262 octal.
The third word (beginning with a 5) is the parameter PRT25. This is a data descriptor pointing to the variable I in the program, at address 13325 octal.
The fourth word is a data descriptor for the array A. The data for this array is present in memory at address 11737 octal.
The fifth word (value 1) is the value of I (100) divided by 64 and truncated to an integer. This word is the parameter N1.
The sixth word is the value 36 (100 mod 64 = 36, or 44 octal), although that may not be very obvious, as it is in B5500 floating-point notation. The RDV syllable that implements the Algol MOD operator produces a result in floating-point format, even if it is an integer value. This word is the parameter N2. The fact that this value is not a normalized integer is another problem, as discussed below.
The seventh word is the Return Control Word (RCW). The primary purpose of this word is to hold the procedure return address (12163 octal) and to link back to the MSCW (at 12264 octal).
Any local variables for the procedure would appear after the RCW, but this procedure has none. What we see in the rest of the output is whatever was left in the stack by prior push-pop activity.

The corrected statements for line 19 should look like this:

SI ~ PRT25;   8(SI ~ SI - 21);

Removing the LOC keyword causes SI to load the address of the variable passed as the parameter PRT25 (i.e., I), not the address of the parameter word itself. Adding a repeat of eight around the adjustment to SI decrements the index backward by 21 words instead of 21 characters. This could also have been written "21(SI~SI-8)", but the former involves larger decrements with fewer loop iterations, so is more efficient.

There is another bug concerning the MOVEPRT procedure, but it is in the call on line 50, not in the procedure itself. I discovered this as I was looking over the program's output in the original listing. As mentioned above, the value of N2 is in floating point format from that MOD operator used in the call, not in normalized integer format. The problem is that B5500 Character Mode doesn't know about floating-point numbers, and when presented with a dynamic repeat count, it simply takes the low-order six bits from the word. The low-order six bits of that floating-point word are zero, so wherever N2 is used in the Stream Procedure, the repeat will be executed zero times instead of 36 as intended. The same problem occurs with the MOD operator in the call to BINOCT on line 51.

One way to fix this problem is to compute the modulo count and assign it to an integer variable, then pass that integer variable as the parameter. Another way to force integer normalization is an integer divide (Algol DIV operator) by one on the result of the MOD operator. Since the variable K was not being used for anything at that point in the program, a quick-and-dirty solution is to use that variable in an in-line assignment solely for the side effect of generating an integer result, thus:

MOVEPRT (I, A[*], I DIV 64, (K ~ I MOD 64));
BINOCT (I DIV 64, (K ~ I MOD 64), A[*], B[*]);

Perhaps the best way to deal with the limit of 63 for Character Mode repeat counts is to do the div/mod inside the Stream Procedure, thus:

STREAM PROCEDURE MOVEPRT (PRT25, A, N);
  VALUE N;
BEGIN
  LOCAL N1, N2;
  SI ~ LOC N;   SI ~ SI + 6;          % POINT TO 7TH CHAR OF N
  DI ~ LOC N1;  DI ~ DI + 7;          % POINT TO 8TH CHAR OF N1
  DS ~ CHR;                           % MOVE SIX BITS TO N1
  DI ~ LOC N2;  DI ~ DI + 7;          % POINT TO 8TH CHAR OF N2
  DS ~ CHR;                           % MOVE SIX BITS TO N2
  
  SI ~ PRT25;   8(SI ~ SI - 21);      % POINT TO PRT (NO CHANGE)
  DI ~ A;                             % POINT TO ARRAY A (NO CHANGE)
  N1(2(DS ~ 32 WDS));   DS ~ N2 WDS;  % MOVE WORDS (AN IMPROVEMENT)
END MOVEPRT;

In this approach, the N1 and N2 parameters have been replaced by a single parameter, N. It still relies on the value of N being an normalized integer, but that is easier to accomplish in the call than with the MOD operator. N1 and N2 are now declared as words local to the Stream Procedure; they will be allocated in the stack frame for that procedure. Actually, since Character Mode did not use the stack as such, these locals are passed as hidden parameters by the caller. This allocation is done automatically by the compiler. The hidden parameter words are guaranteed to have a zero value upon entry to the procedure.

The partitioning of the count N into div-64 (in N1) and mod-64 (in N2) portions works as follows:

The address of the parameter word for N in the stack is stored in the Source Index, SI. This is an example of proper use of the LOC keyword -- SI is loaded with the location (address) of N rather than the contents of N. That address is then advanced by six characters to point to the second-lowest order character in the word. Assuming the count will be less than 4096, that character holds the binary value of the count, div 64.
Similarly, the Destination Index, DI, gets the address of the word allocated for N1. This address is advanced by seven characters to point to the low-order character in the word.
One character is moved to the Destination String (DS). This implicitly references SI and DI, and advances both by the number of characters moved. That results in the div-64 value being stored in the low-order six bits of N1.
Next, DI gets the address of the word allocated for N2, and that address is advanced to the low-order character in that word.
One character is again moved to DS. SI was left pointing to the next (low-order) character of N by the prior move, so this moves the mod-64 value to the low-order six bits of N2.

At this point, the values of N1 and N2 can be used in the same manner as in the original procedure. There is another optimization that can be made, however. The original procedure moved the mod-64 portion of the data by means of "N2(DS~WDS)". This specifies a loop repeating N2 times, moving one word on each iteration. It is significantly more efficient to write this as "DS~N2 WDS", which moves N2 words in one operation, avoiding the loop management overhead.

A similar technique can be used with BINOCT to pass a single parameter for the count and have it partitioned into div-64 and mod-64 values within the procedure.

Now that we have the problems with the Stream Procedures corrected, the next issue is the Invalid Index interrupt that aborted the program. The fault occurred during evaluation of one of the list elements in the WRITE statement at line 55, which is part of the FOR J loop that starts on line 52. The termination message on the listing gives us a strong clue as to what the problem is:

-INVALD INDEX CASESW /PAULROS= 3, S =   3, A =  63, 201 GEQ 201

As mentioned earlier, "S=3, A=63" refers to the decimal segment number and word offset within the segment where the interrupt occurred. The "= 3" preceding that refers to the program's mix number (equivalent to a PID in Unix/Windows). "201 GEQ 201" describes the bounds violation. The index value used was 201, which being zero-relative, exceeds the 201-word length of the array.

The problem is either the terminating value for the FOR loop, or the dimensionality of the array B -- take your pick. The loop is attempting to format the words from the array A as octal values. The BINOCT call at line 51 does the translation from binary to a character string representing the octal values of the words, but since we are translating from three-bit octades to six-bit characters, the destination string must have twice as much space as the source. Therefore, B should have twice the length of A, but it doesn't:

ARRAY A[0:I], B[0:2|I];

The variable I has the value 100, so A has a length of 101, and B has a length of 201. Oops.

As the joke goes, there are only two hard things in Computer Science: cache invalidation, naming things, and off-by-one errors.

To fix this, we need either to make that length at least 202 or to terminate the loop one iteration earlier. The latter is probably the more correct solution (since we moved only I words from the PRT into A), but I chose the former. With either of those corrections, the program will complete without throwing the Invalid Index interrupt.

There is still one more bug in this program -- well, okay -- at least one more bug. If you look at the program's output on the original listing, it consists of two columns. The left column is intended to be the PRT offset in octal, with the right column containing the octal value of the word at that offset. Alas, the left column is all zeroes. The PRT offset is not being formatted properly.

The offset is formatted by the BINOCT call on line 54. One word is formatted from the address of J to two words at the address of Y. Since Z is declared immediately after Y, the second word of the formatted result (the one with the significant half of the offset) will be in Z. Alas, the WRITE statement at line 55 references Y, which coming from the high-order bits of J, contains all zeroes. The solution is simply to substitute Z for Y in the WRITE statement's list.

In sum, the fixes to the program from 1970 required the following changes (shown in red) to five lines:

13: ARRAY A[0:I], B[0:2|I+1];
19:   SI ~ PRT25;   8(SI ~ SI - 21);
50:   MOVEPRT (I, A[*], I DIV 64, (K ~ I MOD 64));
51:   BINOCT (I DIV 64, (K ~ I MOD 64), A[*], B[*]);
55:       WRITE (PR, F1, Z, B[J|2], B[J|2+1]);

That is a lot of bugs for a 55-line program, but perhaps is not so bad for a first attempt. Considering what I have been able to learn from it, not only in 1970, but more recently in the retro-B5500 emulator project, I count this as a very successful effort. That I finally got the program to work properly is just gravy.

Resources

In addition to the original listing cited in Part 1, you may be interested in the following files and documents, generated from the retro-B5500 emulator. Note that the listings from 1970 were produced by a B5500 running the Mark X system software release, probably with some local-site patches. The emulator is running the base Mark XIII software release from late 1971, so you should expect to see some slight differences in the output.

CASESW-PAULROSE-DECK.card --
The original card deck, as discussed in Part 1.
CASESW-PAULROSE-20131106-OUTPUT.txt --
Printer output from the emulator for that original card deck, also discussed in Part 1, with a compile listing showing the generated code, incorrect PRT dump, and Invalid Index abort.
CASESW-PAULROSE-CORR.card --
Card deck for the corrected program.
CASESW-PAULROSE-CORR-20131110-OUTPUT.txt --
Printer output from the emulator for the corrected program, with a standard compile listing, omitting the generated code.
CASESW-PAULROSE-CORR-20131110-SPO.txt --
SPO output for the compile and run of the corrected program.
CASESW-PAULROSE-LOCMOD.card --
Card deck for the corrected program with the revised method for computing div- and mod-64 repeat counts within Stream Procedures.
CASESW-PAULROSE-LOCMOD-20140308-OUTPUT.txt --
Printer output from the version of the corrected program with the revised Stream Procedures.

Postscript

About two weeks after my fling with the B5500 in October, 1970, I was released from the durance vile of that boring documentation project and transferred into a group at the Benchmark Facility at Burroughs Great Valley Laboratories, also in suburban Philadelphia. There, in preparation for eventually becoming the software tech at a customer site, I began learning a new machine, the Burroughs B6500. Now 43 years later, I'm still learning the B6500 -- in the guise of its modern direct descendants bearing the Unisys ClearPath logo.

That little piece of Algol discussed in these two posts was the last time I wrote a program for (or even used) a B5500 until I started working on the emulator two years ago. It's gratifying to have finally been able to get that last program right.

I have yet to code another SWITCH.

B5500 Emulation Project

Saturday, March 29, 2014

SWITCH vs. CASE, Part 2

The Dumb Mistakes

Resources

Postscript

No comments:

Post a Comment