Saturday, March 29, 2014

SWITCH vs. CASE, Part 2

This post continues the discussion of SWITCH vs. CASE as implemented in the Burroughs B5500 Extended Algol compiler. In Part 1, we examined the code generated for each of these constructs and analyzed their differences. In this second part, I will analyze what was wrong with the program I wrote to explore those constructs and describe how to fix it.

To briefly recap the discussion thus far, I wrote a small Algol program shortly after going to work for Burroughs in 1970. The ostensible purpose of this program was to examine the code for both the SWITCH and CASE constructs of the language to determine which was more efficient.

Today, being surrounded by such a glut of inexpensive, incredibly powerful computing devices that they literally have become hazardous waste, it is easy to forget how precious and expensive computer time was a few decades ago, and how difficult it often was to come by. The start of my career at Burroughs was blighted by assignment to a boring documentation project that offered no opportunity to program. Recreational programming was rarely an option in those days, so when the subject of SWITCH vs. CASE came up within another group in the office, I leaped at the chance to get a coding fix and help them decide which construct they should use.

Capitalizing on this opportunity, I decided to go beyond writing a program simply to analyze SWITCH and CASE, and add a few more things to see how they worked as well, including a couple of Stream Procedures that would attempt to dump the program's Program Reference Table, or PRT. In hindsight, that was more than a little foolish, as Stream Procedures, improperly used, could compromise the health of the entire system, and they were quite easy to use improperly.

I managed to demonstrate that ease on my first attempt by making a couple of really dumb mistakes in coding the Stream Procedures. I did not get the desired results, and the program aborted with an Invalid Index (array bounds violation) fault. Now, decades later, the goal is to figure out what happened and fix it.

The Dumb Mistakes

While the program and listing discussed in Part 1 were successful in illuminating how SWITCH and CASE worked, I didn't get the PRT dump I wanted. I got part of a dump before the program aborted with the Invalid Index interrupt, but it turns out that dump was not of the program's PRT.

What the program tried to do was copy the PRT to the array A in the program, and then format the words from that array as octal to a printer file. Copying the PRT was to be the responsibility of the MOVEPRT Stream Procedure. It turns out that I was on the right track, but made two serious errors.

The basic idea of the procedure is simple -- get the address of a word with a known offset within the PRT, adjust that address downward to the beginning of the PRT, then copy some number of words from that address to the destination array. The first variable declared in an Algol program is always at PRT offset 25 octal, so I chose that location as the base. The procedure has four parameters: a descriptor containing the address of the first variable, a descriptor for the destination array, and two integers, the first representing the number of words to copy divided by 64, and the second the number of words modulo 64. The reason for having two parameters is that repeat counts in B5500 Character Mode are limited to six bits -- values 0-63 -- so the div/mod parameters will be used as repeat counts for two nested loops.

The mistakes in this procedure are all on line 19 of the program:
SI ~ LOC PRT25; SI ~ SI - 21;
The first statement was intended to assign to the source index (SI) the address of the word at PRT offset 25 octal (the variable I in the program). Alas, what it assigned was the address in the stack of the PRT25 parameter itself. Dumb. I was confused about the semantics of LOC -- that keyword should not be there. The second statement was intended to back down that address by 21 decimal (25 octal) words to the beginning of the PRT. Alack, what it did was back down the address by 21 characters. Dumber.

The rest of the procedure was written correctly. The destination index (DI) was assigned the address of the array A, and the appropriate number of words gets moved by the "DS~...WDS" constructs. The parentheses indicate repeat loops, which are preceded by their repeat count, limited to the range 0-63. The hardware forces word transfers to begin on a word boundary, so even though the adjustment to SI above left it pointing in the middle of a word, whole words starting on word boundaries were transferred to the array.

At this point the alert reader may have noticed that the destination array is declared in the inner block as A[0:I], and the call on MOVEPRT at line 50 in the program uses I as the number of words to transfer, but I is never assigned a value in the program. How could that ever work? The answer lies in the second control card of the deck, "?COMMON = 100". That command stores the specified value in the first scalar variable declared in the program, before execution of the program begins. In this case, that store is to the integer I at PRT offset 25. Thus, A has dimensions of [0:100] and a length of 101; 100 words will be moved by MOVEPRT.

But what got moved to the array A? Since SI was adjusted backwards by 21 characters (two words plus five characters), it is pointing into the third word below the location of the PRT25 parameter in the stack. A word-oriented transfer adjusts the address, if necessary, forward to the next word boundary, so the transfer actually began two words below the location of that parameter and continued for 100 words. What the output in the original listing shows is a piece of the program's stack, starting one word below the stack frame for the call on MOVEPRT:
  • The first word in the output (all zeroes) is whatever was at top-of-stack before MOVEPRT was called.
  • The second word (beginning with a 6) is the Mark Stack Control Word (MSCW) that starts the stack frame for the call. The primary purpose of this word is to link to the prior stack frame, which is at address 12262 octal.
  • The third word (beginning with a 5) is the parameter PRT25. This is a data descriptor pointing to the variable I in the program, at address 13325 octal.
  • The fourth word is a data descriptor for the array A. The data for this array is present in memory at address 11737 octal.
  • The fifth word (value 1) is the value of I (100) divided by 64 and truncated to an integer. This word is the parameter N1.
  • The sixth word is the value 36 (100 mod 64 = 36, or 44 octal), although that may not be very obvious, as it is in B5500 floating-point notation. The RDV syllable that implements the Algol MOD operator produces a result in floating-point format, even if it is an integer value. This word is the parameter N2. The fact that this value is not a normalized integer is another problem, as discussed below.
  • The seventh word is the Return Control Word (RCW). The primary purpose of this word is to hold the procedure return address (12163 octal) and to link back to the MSCW (at 12264 octal). 
  • Any local variables for the procedure would appear after the RCW, but this procedure has none. What we see in the rest of the output is whatever was left in the stack by prior push-pop activity.
 The corrected statements for line 19 should look like this:

SI ~ PRT25;   8(SI ~ SI - 21);

Removing the LOC keyword causes SI to load the address of the variable passed as the parameter PRT25 (i.e., I), not the address of the parameter word itself. Adding a repeat of eight around the adjustment to SI decrements the index backward by 21 words instead of 21 characters. This could also have been written "21(SI~SI-8)", but the former involves larger decrements with fewer loop iterations, so is more efficient.

There is another bug concerning the MOVEPRT procedure, but it is in the call on line 50, not in the procedure itself. I discovered this as I was looking over the program's output in the original listing. As mentioned above, the value of N2 is in floating point format from that MOD operator used in the call, not in normalized integer format. The problem is that B5500 Character Mode doesn't know about floating-point numbers, and when presented with a dynamic repeat count, it simply takes the low-order six bits from the word. The low-order six bits of that floating-point word are zero, so wherever N2 is used in the Stream Procedure, the repeat will be executed zero times instead of 36 as intended. The same problem occurs with the MOD operator in the call to BINOCT on line 51.

One way to fix this problem is to compute the modulo count and assign it to an integer variable, then pass that integer variable as the parameter. Another way to force integer normalization is an integer divide (Algol DIV operator) by one on the result of the MOD operator. Since the variable K was not being used for anything at that point in the program, a quick-and-dirty solution is to use that variable in an in-line assignment solely for the side effect of generating an integer result, thus:

MOVEPRT (I, A[*], I DIV 64, (K ~ I MOD 64));
BINOCT (I DIV 64, (K ~ I MOD 64), A[*], B[*]);

Perhaps the best way to deal with the limit of 63 for Character Mode repeat counts is to do the div/mod inside the Stream Procedure, thus:

STREAM PROCEDURE MOVEPRT (PRT25, A, N);
  VALUE N;
BEGIN
  LOCAL N1, N2;
  SI ~ LOC N;   SI ~ SI + 6;          % POINT TO 7TH CHAR OF N
  DI ~ LOC N1;  DI ~ DI + 7;          % POINT TO 8TH CHAR OF N1
  DS ~ CHR;                           % MOVE SIX BITS TO N1
  DI ~ LOC N2;  DI ~ DI + 7;          % POINT TO 8TH CHAR OF N2
  DS ~ CHR;                           % MOVE SIX BITS TO N2
  
  SI ~ PRT25;   8(SI ~ SI - 21);      % POINT TO PRT (NO CHANGE)
  DI ~ A;                             % POINT TO ARRAY A (NO CHANGE)
  N1(2(DS ~ 32 WDS));   DS ~ N2 WDS;  % MOVE WORDS (AN IMPROVEMENT)
END MOVEPRT;

In this approach, the N1 and N2 parameters have been replaced by a single parameter, N. It still relies on the value of N being an normalized integer, but that is easier to accomplish in the call than with the MOD operator. N1 and N2 are now declared as words local to the Stream Procedure; they will be allocated in the stack frame for that procedure. Actually, since Character Mode did not use the stack as such, these locals are passed as hidden parameters by the caller. This allocation is done automatically by the compiler. The hidden parameter words are guaranteed to have a zero value upon entry to the procedure.

The partitioning of the count N into div-64 (in N1) and mod-64 (in N2) portions works as follows:
  • The address of the parameter word for N in the stack is stored in the Source Index, SI. This is an example of proper use of the LOC keyword -- SI is loaded with the location (address) of N rather than the contents of N. That address is then advanced by six characters to point to the second-lowest order character in the word. Assuming the count will be less than 4096, that character holds the binary value of the count, div 64.
  • Similarly, the Destination Index, DI, gets the address of the word allocated for N1. This address is advanced by seven characters to point to the low-order character in the word.
  • One character is moved to the Destination String (DS). This implicitly references SI and DI, and advances both by the number of characters moved. That results in the div-64 value being stored in the low-order six bits of N1.
  • Next, DI gets the address of the word allocated for N2, and that address is advanced to the low-order character in that word.
  • One character is again moved to DS. SI was left pointing to the next (low-order) character of N by the prior move, so this moves the mod-64 value to the low-order six bits of N2.
At this point, the values of N1 and N2 can be used in the same manner as in the original procedure. There is another optimization that can be made, however. The original procedure moved the mod-64 portion of the data by means of "N2(DS~WDS)". This specifies a loop repeating N2 times, moving one word on each iteration. It is significantly more efficient to write this as "DS~N2 WDS", which moves N2 words in one operation, avoiding the loop management overhead.

A similar technique can be used with BINOCT to pass a single parameter for the count and have it partitioned into div-64 and mod-64 values within the procedure.

Now that we have the problems with the Stream Procedures corrected, the next issue is the Invalid Index interrupt that aborted the program. The fault occurred during evaluation of one of the list elements in the WRITE statement at line 55, which is part of the FOR J loop that starts on line 52. The termination message on the listing gives us a strong clue as to what the problem is:

-INVALD INDEX CASESW /PAULROS= 3, S =   3, A =  63, 201 GEQ 201

As mentioned earlier, "S=3, A=63" refers to the decimal segment number and word offset within the segment where the interrupt occurred. The "= 3" preceding that refers to the program's mix number (equivalent to a PID in Unix/Windows). "201 GEQ 201" describes the bounds violation. The index value used was 201, which being zero-relative, exceeds the 201-word length of the array.

The problem is either the terminating value for the FOR loop, or the dimensionality of the array B -- take your pick. The loop is attempting to format the words from the array A as octal values. The BINOCT call at line 51 does the translation from binary to a character string representing the octal values of the words, but since we are translating from three-bit octades to six-bit characters, the destination string must have twice as much space as the source. Therefore, B should have twice the length of A, but it doesn't:

ARRAY A[0:I], B[0:2|I];

The variable I has the value 100, so A has a length of 101, and B has a length of 201. Oops.

As the joke goes, there are only two hard things in Computer Science: cache invalidation, naming things, and off-by-one errors.

To fix this, we need either to make that length at least 202 or to terminate the loop one iteration earlier. The latter is probably the more correct solution (since we moved only I words from the PRT into A), but I chose the former. With either of those corrections, the program will complete without throwing the Invalid Index interrupt.

There is still one more bug in this program -- well, okay -- at least one more bug. If you look at the program's output on the original listing, it consists of two columns. The left column is intended to be the PRT offset in octal, with the right column containing the octal value of the word at that offset. Alas, the left column is all zeroes. The PRT offset is not being formatted properly.

The offset is formatted by the BINOCT call on line 54. One word is formatted from the address of J to two words at the address of Y. Since Z is declared immediately after Y, the second word of the formatted result (the one with the significant half of the offset) will be in Z. Alas, the WRITE statement at line 55 references Y, which coming from the high-order bits of J, contains all zeroes. The solution is simply to substitute Z for Y in the WRITE statement's list.

In sum, the fixes to the program from 1970 required the following changes (shown in red) to five lines:

13: ARRAY A[0:I], B[0:2|I+1];
19:   SI ~ PRT25;   8(SI ~ SI - 21);
50:   MOVEPRT (I, A[*], I DIV 64, (K ~ I MOD 64));
51:   BINOCT (I DIV 64, (K ~ I MOD 64), A[*], B[*]);
55:       WRITE (PR, F1, Z, B[J|2], B[J|2+1]); 

That is a lot of bugs for a 55-line program, but perhaps is not so bad for a first attempt. Considering what I have been able to learn from it, not only in 1970, but more recently in the retro-B5500 emulator project, I count this as a very successful effort. That I finally got the program to work properly is just gravy.

Resources

In addition to the original listing cited in Part 1, you may be interested in the following files and documents, generated from the retro-B5500 emulator. Note that the listings from 1970 were produced by a B5500 running the Mark X system software release, probably with some local-site patches. The emulator is running the base Mark XIII software release from late 1971, so you should expect to see some slight differences in the output.

Postscript

About two weeks after my fling with the B5500 in October, 1970, I was released from the durance vile of that boring documentation project and transferred into a group at the Benchmark Facility at Burroughs Great Valley Laboratories, also in suburban Philadelphia. There, in preparation for eventually becoming the software tech at a customer site, I began learning a new machine, the Burroughs B6500. Now 43 years later, I'm still learning the B6500 -- in the guise of its modern direct descendants bearing the Unisys ClearPath logo.

That little piece of Algol discussed in these two posts was the last time I wrote a program for (or even used) a B5500 until I started working on the emulator two years ago. It's gratifying to have finally been able to get that last program right.

I have yet to code another SWITCH.

Sunday, March 23, 2014

SWITCH vs. CASE, Part 1

I wrote a small program in 1970, and finally got it to work a few months ago. Here's the story...

The story has gotten a little long in the telling, so I have divided it into two parts. The nature of the division will become clear shortly. This post represents Part 1. Part 2 will be published next week.

Background

In the Spring of 1970, I graduated from the University of Delaware with a degree in Chemical Engineering and a well-developed aversion to anything having to do with chemistry, which persists to this day. I had become increasingly interested in computers and software, though. During my senior year, the University acquired a Burroughs B5500, which really grabbed my interest. To make a long story short, upon graduation I went to work for the Burroughs Corporation near Philadelphia, Pennsylvania.

Just as I began that job, an economic recession took hold in the United States, and Burroughs was hurting a bit. To avoid having to lay me off, my management assigned me to a technical documentation project for a message-switching system. That assignment was really boring, carried on into October, and offered no opportunities to program -- let alone use -- computers. It was depressing, but not having much in the way of options, I stuck with it.

Another group in the office was just starting a project to redesign a law-enforcement information system. That system was based on the B5500 and written in Algol. I would overhear the other group's discussions, and with my job not requiring a lot of concentration, occasionally pay attention. At some point a question arose within this group as to which was more efficient, the Algol SWITCH construct or the then-new CASE construct.
  • SWITCH is a standard Algol construct, and acts somewhat like an array of labels -- you use it with a one-relative index value in a <designational expression>, generally as part of a GO TO statement to implement a multi-way branch. In its simplest form it is similar to the "computed go-to" of FORTRAN or "go-to depending on" of COBOL. 
  • The B5500 Algol CASE statement was a Burroughs extension to standard Algol, and is much like a modern case or C-style switch statement, but the cases are not labeled -- each statement in the body of the CASE is implicitly numbered starting from zero, only one of which is selected for execution based on the index value. BEGIN/END pairs could be used, of course, to create a block or compound statement as one selection in the CASE body. 
  • A significant difference between the two is that using an out-of-range index value with a SWITCH effectively made its GO TO  a no-op. Using an out-of-range index with a CASE statement terminated the program.

I got caught up in this discussion, and someone (probably me, as I was desperate for an opportunity -- any opportunity -- to do some programming) suggested writing a test to try both approaches and examine the code generated by each one. In any case, I spent a little time playing hookey from what I was supposed to be doing and teamed up with one of the programmers from the other group, Rose, who had an account on the division's B5500. I probably volunteered to write the program and keypunch it. More likely, I begged to do it.

The Program

Young whippersnapper that I was, I decided to expand my charter a bit and try to find out more about the B5500 while I had the chance. I had done some Algol programming on the B5500 at Delaware, and as a student employee in the Computer Center there had even made some minor changes to the XALGOL compiler. I had only a rough idea, though, of the inner workings of the machine and its instruction set. Thus, I decided I would put in a few constructs I was interested in knowing more about, including Stream Procedures. Not only that, but in a moment of complete recklessness, I decided I would try to have the program dump its own PRT.

On the B5500, the PRT, or Program Reference Table, is an area of memory that stores the the global variables plus some information the MCP uses to manage execution of the program. In an Algol program, the PRT holds the declarations of the outer block; in COBOL, the Data Division declarations. The processor's R register points to the base of this area. The other major data area for a program is the stack, which immediately precedes the PRT in memory. That arrangement allows the R register to serve as a limit register for the S (top-of-stack) address register, to detect stack overflows.

In order to access the PRT as a vector of words, it was necessary to use a Stream Procedure. The intended purpose of Stream Procedures in Burroughs Extended Algol is to provide access to the processor's Character Mode capabilities. Character Mode operations consist largely of manipulations of a source address (SI, the Source Index) and a destination address (DI, the Destination Index), plus various ways to test data and transfer (or stream) it from source to destination. Most operations can start and end in the middle of a word; data transfers can take place in units of words, characters, or bits.

Character Mode was sort of a last-minute add-on to the design of the B5000, the original product that became the B5500. It was taken from the design for an earlier machine, possibly the 4111, which was never built. It was quite powerful, and gave efficient character and bit manipulation capabilities to what otherwise was a strictly word-oriented scientific machine. The most significant characteristic of Character Mode, though, is that all of the nice bounds protection built into Word Mode was bypassed. Stream Procedures could potentially address and manipulate anything in memory. This was a capability that could be used for good or evil, and it was. You could implement sophisticated parsing and data movement operations in Character Mode via Stream Procedures. You could also crash the system with it.

Thus, my self-expanded charter for that test program was more than a little dangerous, especially since I didn't really know what I was doing. I don't think Rose knew all of what I was trying to do, either. There was exactly one B5500 for my whole division of Burroughs. It was quite busy, doing everything from payroll to (quite literally) rocket science, and having it crashed by a new-hire trying to satisfy his curiosity would not exactly have been a career-enhancing move. Oblivious to these risks, I wrote the program, keypunched it onto cards, and put it in the inter-office mail to be run at division headquarters. Here is what the card deck would have looked like, using the character-coding conventions of our B5500 emulator:

 1: ?USER=LANZA  ;COMPILE CASESW   /PAULROSE  NO 85410800  ALGOL
 2: ?COMMON = 100
 3: ?DATA
 4: $CARD LIST SINGLE PRT DEBUGN
 5: %    CASE VS. SWITCH      10/01/70              ROSE & PK
 6: BEGIN
 7: INTEGER I, J, K;
 8: FILE OUT PR 18 (2,15);
 9: BEGIN         %%% INNER BLOCK %%%
10: REAL X, Y, Z;
11: LABEL L1, L2, L3;
12: SWITCH S ~ L1, L2, L3;
13: ALPHA ARRAY A[0:I], B[0:2|I];
14: FORMAT F1 (X20,O,X5,2O);
15: 
16: STREAM PROCEDURE MOVEPRT (PRT25, A, N1, N2);
17:   VALUE N1, N2;
18: BEGIN
19:   SI ~ LOC PRT25;   SI ~ SI - 21;
20:   DI ~ A;
21:   N1(2(DS ~ 32 WDS));   N2(DS ~ WDS);
22: END MOVEPRT;
23: 
24: STREAM PROCEDURE BINOCT (N1, N2, S, D);
25:   VALUE N1, N2;
26: BEGIN
27:   SI ~ S;
28:   DI ~ D;
29:   N1(32(32(DS~ 3 RESET; 3(IF SB THEN DS ~ SET ELSE DS ~ RESET;
30:                           SKIP SB))));
31:   N2(16(DS ~ 3 RESET;   3(IF SB THEN DS ~ SET ELSE DS ~ RESET;
32:                           SKIP SB)));
33: END BINOCT;
34: 
35: L1:
36:   J ~ 3;   GO TO S[J];
37: L2:
38:   CASE J MOD 10 OF
39:     BEGIN
40:       J ~ 3;
41:       K ~ J;
42:       X ~ K +J;
43:       Y ~ X ~ SQRT(X);
44:       ;
45:       Z ~ 2|Y + 6.0;
46:       ;
47:       K ~ 5000;
48:     END CASE;
49: L3:
50:   MOVEPRT (I, A[*], I DIV 64, I MOD 64);
51:   BINOCT (I DIV 64, I MOD 64, A[*], B[*]);
52:   FOR J ~ 0 STEP 1 UNTIL I DO
53:     BEGIN
54:       BINOCT (0, 1, J, Y);
55:       WRITE (PR, F1, Y, B[J|2], B[J|2+1]);
56:     END;
57:   END INNER BLOCK;
58: END.
59: ?END
Note that in our emulator, we use the tilde (~) to represent the B5500 left-arrow for assignment, and the vertical bar (|) to represent the small-cross for multiplication.

There may have been an initial compile that had syntax errors -- I don't clearly remember -- but on 3 October 1970 this program compiled successfully and ran. The card deck above was reconstructed from a listing of that run, which you can view here. Since we were interested in how the two constructs worked at the machine level, that listing includes the generated code, enabled by the DEBUGN option on the $-card at line 4. The instruction mnemonics are described in the B5500 Reference Manual. The notations on the listing are mine from 1970.

Alas, while the program ran, it did not do what I had intended, and as you can see from the end of the listing, it aborted with an Invalid Index interrupt (i.e., an array bounds violation) at "S=3,A=63". That stands for "segment 3, word offset 63" (decimal). If you look on the listing at the four-digit numbers on the far right of the source code lines, offset 63 for segment 3 is within the code generated for sequence number 00051000 (the WRITE statement on line 55 in the card deck above). Specifically, the fault occurred at the instruction "0373 DESC 0036" (Descriptor Call on PRT offset 36 octal, the array "B"). The 0373 is an octal syllable offset from the beginning of the segment. At four syllables per word, offset 63 (decimal) times four is 252, which is 374 octal. Interrupt state is stored by the processor in a manner similar to that for a subroutine call, so the "return" address for the interrupt is one after the syllable that caused the fault. No return is possible from this type of error, however.

It turns out that I was really lucky with this run, because the Stream Procedure MOVEPRT did not work properly at all. Fortunately, what it did was benign, and while it accessed memory locations it shouldn't have, at least it did not overwrite anything it shouldn't have. The invalid index did not have anything to do with the dumb mistakes I made in MOVEPRT -- that was due to an entirely different dumb mistake, as will be seen in Part 2.

The remainder of this Part 1 post will analyze how the SWITCH and CASE constructs work, and what we learned about them from this little program. Next week, Part 2 will analyze the dumb mistakes I made in writing the program, what actually happened when it ran, and what it took to make the program work properly.

SWITCH Declarations and Invocations

The bulk of the program above consists of an inner block that begins at line 9. Execution for that inner block begins at line 35. The SWITCH S is declared at line 12 to select among three labels, L1, L2, and L3, based on index values 1, 2, and 3, respectively. That SWITCH is used on line 36, where the integer variable J is set to 3 just before the statement "GO TO S[J]".

If you look at the listing, the SWITCH declaration generates some code. The way that code works will be clearer if we first examine what happens when the SWITCH is invoked. From line 36 in the program:

J ~ 3;   GO TO S[J];
    0160  LITC  0003  0014
    0161  LITC  0026  0130
    0162  ISD         4121
    0163  OPDC  0026  0132
    0164  LITC  0021  0104
    0165  ISD         4121
    0166  LITC  0035  0164
    0167  LBU         6131
    0170  NOP         0055
    0171  NOP         0055
    0172  NOP         0055

The first three syllables handle the assignment to J by (a) pushing the literal value 3 onto the stack, (b) pushing a literal value for PRT offset 26 octal (J) onto the stack, and (c) executing the Integer Store Destructive (ISD) syllable. The "destructive" indicates that both the PRT offset and value will be deleted from the stack when ISD completes.

The code for the SWITCH invocation starts at offset 0163:
  •  The Operand Call (OPDC) syllable copies the value of the switch index at PRT offset 26 octal (J) and pushes that value onto the stack. Then the offset for PRT location 21 octal is pushed, followed by another integer-store syllable. This copies the value of J into a word in the lower part of the PRT that is reserved for the MCP and compilers -- variables in the outer block of an Algol program are assigned higher PRT locations beginning at 25 octal.
  • At offset 0166, the value 35 octal is pushed onto the stack, followed by a Long Backward Unconditional (LBU) branch instruction. The "long" branches compute a destination address by taking from the stack an offset in words that is relative to the location of the branch itself, and branching to the first syllable in that destination word. The LBU is at offset 0167 octal, which is word 35 octal plus syllable 3. Going backwards 35 octal words lands us at offset 0 in the code segment, which is the start of the code for the SWITCH declaration, discussed immediately below.
  • The No Operation (NOP) syllables after the LBU have a purpose, which will be explained shortly.

Note that the code shown above is not quite what you will see on the listing. The Burroughs Algol compilers were (and still are) strictly one-pass affairs. The compiler generates code as it is reading and parsing the input source lines. This means that forward references, such as a branch to a point in the program that has not been encountered yet, cannot be resolved until later.

To deal with this, the compiler resorts to what I call "back-patching." When it encounters a forward reference, it makes an entry in its symbol table for the as-yet unresolved destination point and reserves syllables at the current place in the instruction stream for the instructions that will ultimately need to go there. It often stores linkage data in that reserved space, so that multiple references to the same unresolved address can be chained together. After the compiler encounters the destination point in the input source, it reaches back into the previously-emitted instruction stream to fix up the syllables that had been reserved earlier, overwriting those syllables with the correct opcodes and address offsets.

This behavior can be really confusing when you first look at it in a code listing, especially since the data that is initially emitted in the reserved spaces is often formatted as if it were instructions, and the fix-ups are output in the code listing intermixed with whatever else the compiler is generating at the moment. You have to pay attention to the octal offset on the left side of the lines of generated code to understand what is really happening. In the examples here, I have unraveled all of that out-of-order generation so the code will be easier to follow.

With that introduction, the code generated for the SWITCH declaration is this:

SWITCH S ~ L1, L2, L3;  
    0000 LITC 0000 0000  
    0001 OPDC 0021 0106  
    0002 GEQ       0125  
    0003 OPDC 0021 0106  
    0004 LITC 0003 0014  
    0005 GTR       0225  
    0006 LOR       0215  
    0007 OPDC 0021 0106  
    0010 DUP       2025  
    0011 ADD       0101  
    0012 BFC       0231  
    0013 LITC 0156 0670  
    0014 BFW       4231  
    0015 LITC 0031 0144  
    0016 LFU       6231  
    0017 LITC 0033 0154  
    0020 LFU       6231  
    0021 LITC 0047 0234  
    0022 LFU       6231  

The LBU syllable from the SWITCH invocation branches to offset 0000 in the SWITCH declaration.
  • The SWITCH declaration begins by pushing a zero onto the stack, followed by the value from PRT offset 21 (the copy of the value of the switch index, J). 
  • The Greater-Than-or-Equal (GEQ) syllable tests whether the second word in the stack is greater than or equal to to the top word in the stack; if so it pushes a one in the stack, otherwise it pushes a zero. In both cases, the original values are popped from the stack before the result value is pushed. On the B5500, a binary value is considered to be "true" if its low-order bit is set, so the zero and one values correspond to Algol FALSE and TRUE, respectively. All other bits in the word are ignored. Another way to remember this is that on the B5500 and its descendants, the truth is always odd.
  • At offset 0003, the value of the switch index at PRT location 21 octal is again pushed onto the stack, followed by a literal 3 and the Greater-Than (GTR) syllable. This works just like GEQ except for the difference in the relation of the two top-of-stack values being tested. 
  • We now have two Boolean values on the top of the stack, which are combined using the Logical OR (LOR) syllable. Once again, both original values are popped from the stack before the result value is pushed. What these two tests and the LOR have accomplished is to determine whether zero is greater-than-or-equal to the switch index (i.e., index<1) or the index is greater than 3. Since the SWITCH has three elements, this tests whether the index falls outside the valid range for the SWITCH.
  • With the result of the LOR remaining on top-of-stack, an OPDC 21 at offset 0007 pushes another copy of the index on top of that, followed by Duplicate (DUP) and Add (ADD) syllables. DUP simply makes a copy of the word on top-of-stack, and the ADD sums the two copies, effectively multiplying the value of the index by two. 
  • This is followed at offset 0012 by a Branch Forward Conditional (BFC) syllable. Unlike the "long" branch above which only branches to word boundaries, this is a syllable-oriented branch. The word at top of stack is the number of syllables to branch, relative to the location of the syllable after the branch. With a conditional branch, the second word in the stack holds the condition, which in this case is the result of the LOR above. The branch takes place if the condition is false (i.e., its low-order bit is zero), which is what you want for IF statements. Whether the branch occurs or not, both the branch offset and condition are popped from the stack.
  • Following the conditional branch, starting at offset 0013, are four pairs of literal-call/branch syllables. If the condition resulting from the LOR above is true (i.e., the index is out of bounds), the branch will not take place, so control simply proceeds in sequence. This will result in a branch forward of 156 octal syllables, or to offset 173 octal, which is the first instruction after the code generated for the SWITCH invocation on line 36. Thus, if J is out of bounds, the GO TO on line 36 is effectively a no-op, as the semantics of Algol require.
  • If the conditional branch is taken due to J being within bounds, one of the next three pairs of literal-call/branch syllables will be selected. As it takes two syllables to effect a branch -- one to push an offset and one to do the branch -- the value of the switch index had to be multiplied by two (DUP, ADD) to obtain the correct offset. The LFU opcode is a Long Forward Unconditional word-oriented branch. 
  • Note that the literal values for the relative offsets used by the branches will take control to the locations for labels L1, L2, and L3, respectively. The compiler emits NOPs as necessary to align the code following these labels on word boundaries, as the long branches require.

So much for how the SWITCH works in this program. But wait -- it is possible to use a SWITCH in multiple places in a program. In the code above, if the switch index is out of bounds, there is a branch to a fixed location. How can that be right if you use the SWITCH more than once?

The answer to that is an example of back-patching at its finest. For the first invocation of the SWITCH, the compiler generates exactly what is shown above. If it encounters a second invocation of the SWITCH, the compiler changes its strategy and fixes up both the code for the SWITCH declaration and first use of the SWITCH to use the new strategy. The fixed-up code for the SWITCH declaration will look like this:

SWITCH S ~ L1, L2, L3;  
    0000  LITC  0000  0000  
    0001  OPDC  0021  0106  
    0002  GEQ         0125  
    0003  OPDC  0021  0106  
    0004  LITC  0003  0014  
    0005  GTR         0225  
    0006  LOR         0215  
    0007  OPDC  0021  0106  
    0010  DUP         2025  
    0011  ADD         0101  
    0012  BFC         0231  
    0013  LITC  0017  0074  
    0014  RTS         1235
    0015  LITC  0034  0160  
    0016  RTS         1235
    0017  LITC  0035  0164  
    0020  RTS         1235
    0021  LITC  0036  0170  
    0022  RTS         1235  

and an invocation of the SWITCH will look like this:

J ~ 3;   GO TO S[J];
    0160  LITC  0003  0014
    0161  LITC  0026  0130
    0162  ISD         4121
    0163  OPDC  0026  0132
    0164  LITC  0021  0104
    0165  ISD         4121
    0166  OPDC  0032  0152
    0167  XRT         0061
    0170  LOD         2021
    0171  BFW         4231
    0172  NOP         0055

The differences from the original code are shown in red. What the compiler has done is convert the code for the SWITCH into a subroutine.
  • It is not shown here, but the compiler also allocates four additional PRT locations to hold Program Control Words (PCWs) for the entry point to the subroutine that SWITCH S has become and the locations of the labels L1, L2, and L3. Like all control words for the B5500, these will have their high-order (flag) bit set. The branch syllables are sensitive to the flag bit and can accept either an integer offset or a PCW on the top-of-stack as a branch destination. Executing an OPDC that references a PCW results in a subroutine call to that code location (OPDC is the do-it-all kid -- it will also index arrays if its top-of-stack operand is a data descriptor).
  • With the code for the SWITCH declaration reconfigured as a subroutine, testing the bounds of the index and selecting a pair of syllables to branch works the same as before, but instead of directly branching, the selected syllables return a PRT offset value as the subroutine result. RTS is the Return from Subroutine syllable, which takes the value on top-of-stack (the result of the selected LITC syllable in this case), cuts back the stack used by the subroutine, and branches to the return address, leaving the original top-of-stack value at the new top-of-stack location.
  • The code that invokes the SWITCH now does an operand call on the PCW for the SWITCH's subroutine, which will return the PRT offset for one of the label PCWs, as selected by the index value. XRT (Set Variant) extends the range of PRT addressing for the next syllable (it is not needed in this small program, but might be in programs having more than 512 words in their PRT). LOD (Load Operand) takes a PRT offset on top-of-stack and replaces it with the value at that PRT location, which will be one of the label PCWs. BFW (Branch Forward) is an unconditional syllable-oriented branch. It can accept either an integer offset or a PCW as its operand.
  • Note how the NOPs at offset 0170-0171 have been overwritten by the extra code needed to implement the subroutine-based switch invocation code. The one-pass compiler could not know whether the SWITCH would be referenced more than once, so initially emitted code that efficiently supported the simpler scenario.

In this example, the PCW for the SWITCH's subroutine is at PRT offset 32 octal. That subroutine will return one of 17, 34, 35, or 36 octal, depending upon the value of J. Offsets 34, 35, and 36 represent the PCWs for labels L1, L2, and L3 respectively, but offset 17, which is returned if J is out of bounds, is in the area of the PRT reserved for the MCP and compilers. What is that? It turns out that the word at PRT offset 17 octal always contains a zero. Thus, if the SWITCH index is out of bounds, the BFW will branch forward zero syllables, which is effectively a no-op, and control continues with the next statement in the program.

Switches can be even more complex than we have seen here, as the elements of a SWITCH declaration can themselves be designational expressions (e.g., invocations of some other SWITCH) which must be evaluated at run time. Investigation of how that works is left as an exercise, dear reader, to you.

CASE Statements

In contrast to SWITCH, the implementation of CASE statements is simple and straightforward. For each statement (which may be a compound statement or a block) that is immediately subordinate to the CASE statement, the compiler determines a relative branch offset. It then constructs a one-dimensional array of those offsets, indexed by the zero-relative case value. The array is stored in the object file for the program, and is pointed to by a data descriptor that is placed in the PRT. Indexing that descriptor by the CASE expression yields the offset to the appropriate statement.

The CASE statement in the program above has seven subordinate statements, and the compiler generates syllable offsets of 0, 5, 10, 17, 40, 26, 40, and 35 decimal, respectively, for them. The code to execute the CASE statement looks like this (again, with the back-patching unraveled for clarity):

CASE J MOD 10 OF
      0170  OPDC  0026  0132
      0171  LITC  0012  0050
      0172  RDV         7001
  BEGIN
      0173  OPDC  0042  0212
      0174  BFW         4231
    J ~ 3;
      0175  LITC  0003  0014
      0176  LITC  0026  0130
      0177  ISD         4121
      0200  LITC  0043  0214
      0201  BFW         4231
    K ~ J;
      0202  OPDC  0026  0132
      0203  LITC  0027  0134
      0204  ISD         4121
      0205  LITC  0036  0170
      0206  BFW         4231
    X ~ K +J;
      0207  OPDC  0027  0136
      0210  OPDC  0026  0132
      0211  ADD         0101
      0212  LITC  0032  0150
      0213  STD         0421
      0214  LITC  0027  0134
      0215  BFW         4231
    Y ~ X ~ SQRT(X);
      0216  MKS         0441
      0217  OPDC  0032  0152
      0220  OPDC  0043  0216
      0221  LITC  0032  0150
      0222  SND         1021
      0223  LITC  0033  0154
      0224  STD         0421
      0225  LITC  0016  0070
      0226  BFW         4231
    ;
    Z ~ 2|Y + 6.0;
      0227  LITC  0002  0010
      0230  OPDC  0033  0156
      0231  MUL         0401
      0232  DESC  1777  7777
      0233  ADD         0101
      0234  LITC  0034  0160
      0235  STD         0421
      0236  LITC  0005  0024
      0237  BFW         4231
    ;
    K ~ 5000;
      0240  DESC  1777  7777
      0241  LITC  0027  0134
      0242  ISD         4121
      0243  LITC  0000  0000
      0244  BFW         4231
  END CASE;

The statement begins by computing the CASE index. The OPDC pushes the value of J onto the stack, LITC pushes the value 10 decimal onto the stack, and RDV (Remainder Divide) implements the MOD operator. The heavy lifting is done by the next syllable, another OPDC for PRT offset 42 octal:
  • That PRT location contains the data descriptor for the array of syllable offsets. 
  • When OPDC detects that its operand on the top-of-stack is an unindexed data descriptor, it applies the second word in the stack as an index to that descriptor, computes the memory address of the indexed element of the array, and loads the value at that address onto top-of-stack after popping both the descriptor and index value. 
    • A data descriptor contains the length, address, and presence status of the data for the array it describes.
    • If the value of the index value is less than zero or greater than or equal to the length of the array stored in the descriptor, OPDC will raise an Invalid Index interrupt and quit. Unless this fault is trapped by the program, the MCP will terminate the program.
    • Initially, the presence status of the descriptor is "absent" to indicate that the contents of the array it describes are not present in memory. Thus, the first time we execute this CASE statement, the OPDC will simply raise a Presence Bit interrupt and quit, allowing the hardware to branch to the appropriate interrupt vector.
    • The MCP will respond to the P-bit interrupt by allocating an area in memory of appropriate size (seven words in this case), reading the array element values from the object file on disk into that new area, fixing up the descriptor with the address of the new area, setting the presence bit [2:1] in the descriptor to indicate it now points to a real memory address, and exiting back into the program to restart the OPDC syllable, which this time will complete the index-and-load operation.
    • It is possible that the array of syllable offsets may be forced out of memory later due to pressure from other memory allocation activity, in which case the MCP simply deallocates the memory area and fixes up the descriptor for it in the program's PRT to point back to the copy of the data in the object code file. The area does not need to be written out to disk, since the system considers it to be read-only, and therefore not dirty. The next time the CASE statement is executed (if ever), the same P-bit process will bring the array back into memory from the code file.
The final result, whatever machinations are involved, is that the syllable offset ends up on top-of-stack. The syllable after the OPDC, BFW, uses that offset to branch to the beginning of the selected subordinate statement in the CASE statement. The BFW is at syllable offset 174 octal in its code segment, so the offsets in the array identified above would yield branch locations of 175, 202, 207, 216, 245, 227, 245, and 240 octal, respectively. Recall that syllable branches are relative to the syllable after the branch.

After the end of each of the subordinate statements, the compiler inserts a branch to the code for the statement following the CASE statement. These are shown in blue in the code above.

Note that two of the subordinate statements in the CASE statement are empty statements, represented by just their delimiting semicolons. For these the compiler generates no code, just an offset that branches around the CASE statement to the syllable at 0245.

SWITCH vs. CASE

So which is better, SWITCH or CASE? I have forgotten what Rose and I concluded in 1970, but looking at it afresh, my answer now is that it depends on what you are trying to do and how you are using the multi-way branch. Both constructs are very efficient in certain cases.

SWITCH must store the index value in PRT cell 21 octal and then fetch it three times. If the SWITCH is referenced more than once in a program, it must be invoked in a subroutine call, which is not an inexpensive operation. Those issues aside, the actual dispatch to a location based on the index value is very efficient. You can also do things with a SWITCH that you cannot with a CASE statement, including nesting designational expressions that have expressions for indexes, all of which will be evaluated dynamically at the time the SWITCH is invoked.

CASE is more efficient in the way that it computes the ultimate branch location, but there is significant overhead the first time it is used to handle the P-bit interrupt for its array of offsets and bring it into memory. That overhead can be repeated later, possibly multiple times, if the array is pushed out of memory. CASE is an in-line construct, so it is not subject, lexically, to multiple references. Of course, code segments can be pushed out, too, so the code for the SWITCH is not entirely immune to overlay by memory allocation pressure.

Using an out-of-bounds index with a SWITCH results in no branch occurring at all. This can be considered a feature or a bug, depending on your point of view. Because CASE obtains its branch offset by indexing an array, an out-of-bounds index causes an Invalid Index interrupt, which unless trapped, will abort the program. The programmer may need to insert bounds tests before the CASE statement to protect against aborts. Eventually, a numbered CASE statement with provision for a default case was implemented in Algol for the B6700/7700, but that does not appear ever to have been (officially) implemented for B5500 Algol.

I matured as a programmer during the post-Dijkstra, Go-To-Considered-Harmful era, so my personal preference would be to ignore the relatively minor performance-difference issues, use the CASE statement, and eschew the SWITCH, labels, and go-to statements altogether.


That completes the analysis of CASE vs. SWITCH. Tune in next week for Part 2, where I will deconstruct my less-than-stellar 1970 programming abilities, show what went wrong in the execution of the program, and demonstrate how to fix it.

Resources

In addition to the original listing, cited earlier in this post, you may be interested in the following files and documents, generated from the retro-B5500 emulator. Note that the listings from 1970 were produced by a B5500 running the Mark X system software release, probably with some local-site patches. The emulator is running the base Mark XIII software release from late 1971, so you should expect to see some slight differences in the output.
  • CASESW-PAULROSE-DECK.card --
    The original card deck, as displayed above.
  • CASESW-PAULROSE-20131106-OUTPUT.txt --
    Printer output from the emulator for that original card deck, with a compile listing showing the generated code, incorrect PRT dump, and Invalid Index abort. The emulator had exactly the same problems with my program that a real B5500 did.