Sunday, March 23, 2014

SWITCH vs. CASE, Part 1

I wrote a small program in 1970, and finally got it to work a few months ago. Here's the story...

The story has gotten a little long in the telling, so I have divided it into two parts. The nature of the division will become clear shortly. This post represents Part 1. Part 2 will be published next week.

Background

In the Spring of 1970, I graduated from the University of Delaware with a degree in Chemical Engineering and a well-developed aversion to anything having to do with chemistry, which persists to this day. I had become increasingly interested in computers and software, though. During my senior year, the University acquired a Burroughs B5500, which really grabbed my interest. To make a long story short, upon graduation I went to work for the Burroughs Corporation near Philadelphia, Pennsylvania.

Just as I began that job, an economic recession took hold in the United States, and Burroughs was hurting a bit. To avoid having to lay me off, my management assigned me to a technical documentation project for a message-switching system. That assignment was really boring, carried on into October, and offered no opportunities to program -- let alone use -- computers. It was depressing, but not having much in the way of options, I stuck with it.

Another group in the office was just starting a project to redesign a law-enforcement information system. That system was based on the B5500 and written in Algol. I would overhear the other group's discussions, and with my job not requiring a lot of concentration, occasionally pay attention. At some point a question arose within this group as to which was more efficient, the Algol SWITCH construct or the then-new CASE construct.
  • SWITCH is a standard Algol construct, and acts somewhat like an array of labels -- you use it with a one-relative index value in a <designational expression>, generally as part of a GO TO statement to implement a multi-way branch. In its simplest form it is similar to the "computed go-to" of FORTRAN or "go-to depending on" of COBOL. 
  • The B5500 Algol CASE statement was a Burroughs extension to standard Algol, and is much like a modern case or C-style switch statement, but the cases are not labeled -- each statement in the body of the CASE is implicitly numbered starting from zero, only one of which is selected for execution based on the index value. BEGIN/END pairs could be used, of course, to create a block or compound statement as one selection in the CASE body. 
  • A significant difference between the two is that using an out-of-range index value with a SWITCH effectively makes its GO TO a no-op, whereas using an out-of-range index with a CASE statement terminates the program.
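
The difference in out-of-range behavior can be sketched in a few lines of Python. This is only a rough model, not Algol: the names `switch_goto`, `case_select`, and `InvalidIndex` are made up for illustration, and labels and statements are represented as plain strings.

```python
class InvalidIndex(Exception):
    """Stands in for the B5500 Invalid Index interrupt (hypothetical name)."""


def switch_goto(labels, index):
    """Model GO TO S[index]: one-relative; out of range means no branch."""
    if 1 <= index <= len(labels):
        return labels[index - 1]
    return None  # the GO TO acts as a no-op and control falls through


def case_select(statements, index):
    """Model CASE index OF ...: zero-relative; out of range aborts."""
    if not 0 <= index < len(statements):
        raise InvalidIndex(index)
    return statements[index]


labels = ["L1", "L2", "L3"]
print(switch_goto(labels, 3))  # L3
print(switch_goto(labels, 4))  # None
```

Calling `case_select` with an index equal to the number of statements raises `InvalidIndex`, which is the analog of the program being terminated.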

I got caught up in this discussion, and someone (probably me, as I was desperate for an opportunity -- any opportunity -- to do some programming) suggested writing a test to try both approaches and examine the code generated by each one. In any case, I spent a little time playing hookey from what I was supposed to be doing and teamed up with one of the programmers from the other group, Rose, who had an account on the division's B5500. I probably volunteered to write the program and keypunch it. More likely, I begged to do it.

The Program

Young whippersnapper that I was, I decided to expand my charter a bit and try to find out more about the B5500 while I had the chance. I had done some Algol programming on the B5500 at Delaware, and as a student employee in the Computer Center there had even made some minor changes to the XALGOL compiler. I had only a rough idea, though, of the inner workings of the machine and its instruction set. Thus, I decided I would put in a few constructs I was interested in knowing more about, including Stream Procedures. Not only that, but in a moment of complete recklessness, I decided I would try to have the program dump its own PRT.

On the B5500, the PRT, or Program Reference Table, is an area of memory that stores the global variables plus some information the MCP uses to manage execution of the program. In an Algol program, the PRT holds the declarations of the outer block; in COBOL, the Data Division declarations. The processor's R register points to the base of this area. The other major data area for a program is the stack, which immediately precedes the PRT in memory. That arrangement allows the R register to serve as a limit register for the S (top-of-stack) address register, to detect stack overflows.

In order to access the PRT as a vector of words, it was necessary to use a Stream Procedure. The intended purpose of Stream Procedures in Burroughs Extended Algol is to provide access to the processor's Character Mode capabilities. Character Mode operations consist largely of manipulations of a source address (SI, the Source Index) and a destination address (DI, the Destination Index), plus various ways to test data and transfer (or stream) it from source to destination. Most operations can start and end in the middle of a word; data transfers can take place in units of words, characters, or bits.

Character Mode was sort of a last-minute add-on to the design of the B5000, the original product that became the B5500. It was taken from the design for an earlier machine, possibly the 4111, which was never built. It was quite powerful, and gave efficient character and bit manipulation capabilities to what otherwise was a strictly word-oriented scientific machine. The most significant characteristic of Character Mode, though, is that all of the nice bounds protection built into Word Mode was bypassed. Stream Procedures could potentially address and manipulate anything in memory. This was a capability that could be used for good or evil, and it was. You could implement sophisticated parsing and data movement operations in Character Mode via Stream Procedures. You could also crash the system with it.

Thus, my self-expanded charter for that test program was more than a little dangerous, especially since I didn't really know what I was doing. I don't think Rose knew all of what I was trying to do, either. There was exactly one B5500 for my whole division of Burroughs. It was quite busy, doing everything from payroll to (quite literally) rocket science, and having it crashed by a new-hire trying to satisfy his curiosity would not exactly have been a career-enhancing move. Oblivious to these risks, I wrote the program, keypunched it onto cards, and put it in the inter-office mail to be run at division headquarters. Here is what the card deck would have looked like, using the character-coding conventions of our B5500 emulator:

 1: ?USER=LANZA  ;COMPILE CASESW   /PAULROSE  NO 85410800  ALGOL
 2: ?COMMON = 100
 3: ?DATA
 4: $CARD LIST SINGLE PRT DEBUGN
 5: %    CASE VS. SWITCH      10/01/70              ROSE & PK
 6: BEGIN
 7: INTEGER I, J, K;
 8: FILE OUT PR 18 (2,15);
 9: BEGIN         %%% INNER BLOCK %%%
10: REAL X, Y, Z;
11: LABEL L1, L2, L3;
12: SWITCH S ~ L1, L2, L3;
13: ALPHA ARRAY A[0:I], B[0:2|I];
14: FORMAT F1 (X20,O,X5,2O);
15: 
16: STREAM PROCEDURE MOVEPRT (PRT25, A, N1, N2);
17:   VALUE N1, N2;
18: BEGIN
19:   SI ~ LOC PRT25;   SI ~ SI - 21;
20:   DI ~ A;
21:   N1(2(DS ~ 32 WDS));   N2(DS ~ WDS);
22: END MOVEPRT;
23: 
24: STREAM PROCEDURE BINOCT (N1, N2, S, D);
25:   VALUE N1, N2;
26: BEGIN
27:   SI ~ S;
28:   DI ~ D;
29:   N1(32(32(DS~ 3 RESET; 3(IF SB THEN DS ~ SET ELSE DS ~ RESET;
30:                           SKIP SB))));
31:   N2(16(DS ~ 3 RESET;   3(IF SB THEN DS ~ SET ELSE DS ~ RESET;
32:                           SKIP SB)));
33: END BINOCT;
34: 
35: L1:
36:   J ~ 3;   GO TO S[J];
37: L2:
38:   CASE J MOD 10 OF
39:     BEGIN
40:       J ~ 3;
41:       K ~ J;
42:       X ~ K +J;
43:       Y ~ X ~ SQRT(X);
44:       ;
45:       Z ~ 2|Y + 6.0;
46:       ;
47:       K ~ 5000;
48:     END CASE;
49: L3:
50:   MOVEPRT (I, A[*], I DIV 64, I MOD 64);
51:   BINOCT (I DIV 64, I MOD 64, A[*], B[*]);
52:   FOR J ~ 0 STEP 1 UNTIL I DO
53:     BEGIN
54:       BINOCT (0, 1, J, Y);
55:       WRITE (PR, F1, Y, B[J|2], B[J|2+1]);
56:     END;
57:   END INNER BLOCK;
58: END.
59: ?END

Note that in our emulator, we use the tilde (~) to represent the B5500 left-arrow for assignment, and the vertical bar (|) to represent the small-cross for multiplication.

There may have been an initial compile that had syntax errors -- I don't clearly remember -- but on 3 October 1970 this program compiled successfully and ran. The card deck above was reconstructed from a listing of that run, which you can view here. Since we were interested in how the two constructs worked at the machine level, that listing includes the generated code, enabled by the DEBUGN option on the $-card at line 4. The instruction mnemonics are described in the B5500 Reference Manual. The notations on the listing are mine from 1970.

Alas, while the program ran, it did not do what I had intended, and as you can see from the end of the listing, it aborted with an Invalid Index interrupt (i.e., an array bounds violation) at "S=3,A=63". That stands for "segment 3, word offset 63" (decimal). If you look on the listing at the four-digit numbers on the far right of the source code lines, offset 63 for segment 3 is within the code generated for sequence number 00051000 (the WRITE statement on line 55 in the card deck above). Specifically, the fault occurred at the instruction "0373 DESC 0036" (Descriptor Call on PRT offset 36 octal, the array "B"). The 0373 is an octal syllable offset from the beginning of the segment. At four syllables per word, offset 63 (decimal) times four is 252, which is 374 octal. Interrupt state is stored by the processor in a manner similar to that for a subroutine call, so the "return" address for the interrupt is one after the syllable that caused the fault. No return is possible from this type of error, however.
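
The address arithmetic in that paragraph is easy to get wrong by hand, so here is a small Python check of it. The function name is invented for this example; the only facts used are the ones above (four syllables per word, and a stored address one syllable past the fault).

```python
SYLLABLES_PER_WORD = 4


def first_syllable_of_word(word_offset):
    """Convert a word offset within a code segment to a syllable offset."""
    return word_offset * SYLLABLES_PER_WORD


stored = first_syllable_of_word(63)  # the "A=63" from the abort message
faulting = stored - 1                # the stored address is one past the fault
print(oct(stored), oct(faulting))    # 0o374 0o373
```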

It turns out that I was really lucky with this run, because the Stream Procedure MOVEPRT did not work properly at all. Fortunately, what it did was benign, and while it accessed memory locations it shouldn't have, at least it did not overwrite anything it shouldn't have. The invalid index did not have anything to do with the dumb mistakes I made in MOVEPRT -- that was due to an entirely different dumb mistake, as will be seen in Part 2.

The remainder of this Part 1 post will analyze how the SWITCH and CASE constructs work, and what we learned about them from this little program. Next week, Part 2 will analyze the dumb mistakes I made in writing the program, what actually happened when it ran, and what it took to make the program work properly.

SWITCH Declarations and Invocations

The bulk of the program above consists of an inner block that begins at line 9. Execution for that inner block begins at line 35. The SWITCH S is declared at line 12 to select among three labels, L1, L2, and L3, based on index values 1, 2, and 3, respectively. That SWITCH is used on line 36, where the integer variable J is set to 3 just before the statement "GO TO S[J]".

If you look at the listing, the SWITCH declaration generates some code. The way that code works will be clearer if we first examine what happens when the SWITCH is invoked. From line 36 in the program:

J ~ 3;   GO TO S[J];
    0160  LITC  0003  0014
    0161  LITC  0026  0130
    0162  ISD         4121
    0163  OPDC  0026  0132
    0164  LITC  0021  0104
    0165  ISD         4121
    0166  LITC  0035  0164
    0167  LBU         6131
    0170  NOP         0055
    0171  NOP         0055
    0172  NOP         0055

The first three syllables handle the assignment to J by (a) pushing the literal value 3 onto the stack, (b) pushing a literal value for PRT offset 26 octal (J) onto the stack, and (c) executing the Integer Store Destructive (ISD) syllable. The "destructive" indicates that both the PRT offset and value will be deleted from the stack when ISD completes.
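
For readers who have not used a stack machine, here is a toy Python model of those three syllables. It is a sketch under simplifying assumptions: the PRT is a dictionary, OPDC only handles plain operands, and ISD converts with `int()` rather than reproducing the real syllable's integerization rules. The `run` function and the tuple encoding are invented for this example.

```python
def run(program, prt):
    """Execute a list of (mnemonic, operand) pairs against a dict-based PRT."""
    stack = []
    for op, arg in program:
        if op == "LITC":    # push a literal value
            stack.append(arg)
        elif op == "OPDC":  # push the operand stored at a PRT offset
            stack.append(prt[arg])
        elif op == "ISD":   # Integer Store Destructive
            address = stack.pop()  # PRT offset is on top of the stack
            value = stack.pop()    # value to store is just below it
            prt[address] = int(value)
        else:
            raise ValueError(f"unmodeled syllable: {op}")
    return stack


prt = {}
leftover = run([("LITC", 3), ("LITC", 0o26), ("ISD", None)], prt)
print(prt[0o26], leftover)  # 3 []
```

The empty list at the end is the "destructive" part: both the offset and the value are gone from the stack once the store completes.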

The code for the SWITCH invocation starts at offset 0163:
  •  The Operand Call (OPDC) syllable copies the value of the switch index at PRT offset 26 octal (J) and pushes that value onto the stack. Then the offset for PRT location 21 octal is pushed, followed by another integer-store syllable. This copies the value of J into a word in the lower part of the PRT that is reserved for the MCP and compilers -- variables in the outer block of an Algol program are assigned higher PRT locations beginning at 25 octal.
  • At offset 0166, the value 35 octal is pushed onto the stack, followed by a Long Backward Unconditional (LBU) branch instruction. The "long" branches compute a destination address by taking from the stack an offset in words that is relative to the location of the branch itself, and branching to the first syllable in that destination word. The LBU is at offset 0167 octal, which is word 35 octal plus syllable 3. Going backwards 35 octal words lands us at offset 0 in the code segment, which is the start of the code for the SWITCH declaration, discussed immediately below.
  • The No Operation (NOP) syllables after the LBU have a purpose, which will be explained shortly.
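
The word-relative arithmetic of the "long" branches can be checked with a short Python sketch. The function name and its `backward` flag are made up here; the model simply follows the description above (the offset counts words from the word containing the branch, and control lands on the first syllable of the destination word).

```python
SYLLABLES_PER_WORD = 4


def long_branch_target(branch_syllable, word_count, backward):
    """Return the syllable offset that a long (word-oriented) branch reaches."""
    word = branch_syllable // SYLLABLES_PER_WORD
    target_word = word - word_count if backward else word + word_count
    return target_word * SYLLABLES_PER_WORD


# LBU at 0167 with 35 octal on the stack: back to the SWITCH declaration.
print(oct(long_branch_target(0o167, 0o35, backward=True)))  # 0o0
```

Applying the same function forward, to the LFU at 0016 with an offset of 31 octal, gives 0160, which is where the code following label L1 begins.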

Note that the code shown above is not quite what you will see on the listing. The Burroughs Algol compilers were (and still are) strictly one-pass affairs. The compiler generates code as it is reading and parsing the input source lines. This means that forward references, such as a branch to a point in the program that has not been encountered yet, cannot be resolved until later.

To deal with this, the compiler resorts to what I call "back-patching." When it encounters a forward reference, it makes an entry in its symbol table for the as-yet unresolved destination point and reserves syllables at the current place in the instruction stream for the instructions that will ultimately need to go there. It often stores linkage data in that reserved space, so that multiple references to the same unresolved address can be chained together. After the compiler encounters the destination point in the input source, it reaches back into the previously-emitted instruction stream to fix up the syllables that had been reserved earlier, overwriting those syllables with the correct opcodes and address offsets.

This behavior can be really confusing when you first look at it in a code listing, especially since the data that is initially emitted in the reserved spaces is often formatted as if it were instructions, and the fix-ups are output in the code listing intermixed with whatever else the compiler is generating at the moment. You have to pay attention to the octal offset on the left side of the lines of generated code to understand what is really happening. In the examples here, I have unraveled all of that out-of-order generation so the code will be easier to follow.
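
Here is a minimal Python sketch of the back-patching idea. It is not how the Burroughs compiler is actually organized: the class and method names are invented, and the pending references are kept in a dictionary rather than chained through the reserved syllables as described above.

```python
class OnePassEmitter:
    """Toy one-pass code emitter that fixes up forward branches later."""

    def __init__(self):
        self.code = []     # emitted syllables, in order
        self.defined = {}  # label -> position, once the label has been seen
        self.pending = {}  # label -> positions reserved for forward branches

    def emit(self, syllable):
        self.code.append(syllable)
        return len(self.code) - 1

    def branch_to(self, label):
        if label in self.defined:
            self.emit(("BRANCH", self.defined[label]))
        else:
            # Destination unknown so far: reserve a slot and remember it.
            slot = self.emit(("RESERVED", label))
            self.pending.setdefault(label, []).append(slot)

    def define(self, label):
        position = len(self.code)
        self.defined[label] = position
        # Reach back and overwrite every slot reserved for this label.
        for slot in self.pending.pop(label, []):
            self.code[slot] = ("BRANCH", position)


emitter = OnePassEmitter()
emitter.branch_to("L3")       # forward reference, slot 0 is reserved
emitter.emit(("NOP", None))
emitter.define("L3")          # L3 turns out to be at position 2
print(emitter.code)           # [('BRANCH', 2), ('NOP', None)]
```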

With that introduction, the code generated for the SWITCH declaration is this:

SWITCH S ~ L1, L2, L3;  
    0000 LITC 0000 0000  
    0001 OPDC 0021 0106  
    0002 GEQ       0125  
    0003 OPDC 0021 0106  
    0004 LITC 0003 0014  
    0005 GTR       0225  
    0006 LOR       0215  
    0007 OPDC 0021 0106  
    0010 DUP       2025  
    0011 ADD       0101  
    0012 BFC       0231  
    0013 LITC 0156 0670  
    0014 BFW       4231  
    0015 LITC 0031 0144  
    0016 LFU       6231  
    0017 LITC 0033 0154  
    0020 LFU       6231  
    0021 LITC 0047 0234  
    0022 LFU       6231  

The LBU syllable from the SWITCH invocation branches to offset 0000 in the SWITCH declaration.
  • The SWITCH declaration begins by pushing a zero onto the stack, followed by the value from PRT offset 21 (the copy of the value of the switch index, J). 
  • The Greater-Than-or-Equal (GEQ) syllable tests whether the second word in the stack is greater than or equal to the top word in the stack; if so it pushes a one onto the stack, otherwise it pushes a zero. In both cases, the original values are popped from the stack before the result value is pushed. On the B5500, a binary value is considered to be "true" if its low-order bit is set, so the zero and one values correspond to Algol FALSE and TRUE, respectively. All other bits in the word are ignored. Another way to remember this is that on the B5500 and its descendants, the truth is always odd.
  • At offset 0003, the value of the switch index at PRT location 21 octal is again pushed onto the stack, followed by a literal 3 and the Greater-Than (GTR) syllable. This works just like GEQ except for the difference in the relation of the two top-of-stack values being tested. 
  • We now have two Boolean values on the top of the stack, which are combined using the Logical OR (LOR) syllable. Once again, both original values are popped from the stack before the result value is pushed. What these two tests and the LOR have accomplished is to determine whether zero is greater-than-or-equal to the switch index (i.e., index<1) or the index is greater than 3. Since the SWITCH has three elements, this tests whether the index falls outside the valid range for the SWITCH.
  • With the result of the LOR remaining on top-of-stack, an OPDC 21 at offset 0007 pushes another copy of the index on top of that, followed by Duplicate (DUP) and Add (ADD) syllables. DUP simply makes a copy of the word on top-of-stack, and the ADD sums the two copies, effectively multiplying the value of the index by two. 
  • This is followed at offset 0012 by a Branch Forward Conditional (BFC) syllable. Unlike the "long" branch above, which only branches to word boundaries, this is a syllable-oriented branch. The word at top of stack is the number of syllables to branch, relative to the location of the syllable after the branch. With a conditional branch, the second word in the stack holds the condition, which in this case is the result of the LOR above. The branch takes place if the condition is false (i.e., its low-order bit is zero), which is what you want for IF statements. Whether the branch occurs or not, both the branch offset and condition are popped from the stack.
  • Following the conditional branch, starting at offset 0013, are four pairs of literal-call/branch syllables. If the condition resulting from the LOR above is true (i.e., the index is out of bounds), the branch will not take place, so control simply proceeds in sequence. This will result in a branch forward of 156 octal syllables, or to offset 173 octal, which is the first instruction after the code generated for the SWITCH invocation on line 36. Thus, if J is out of bounds, the GO TO on line 36 is effectively a no-op, as the semantics of Algol require.
  • If the conditional branch is taken due to J being within bounds, one of the next three pairs of literal-call/branch syllables will be selected. As it takes two syllables to effect a branch -- one to push an offset and one to do the branch -- the value of the switch index had to be multiplied by two (DUP, ADD) to obtain the correct offset. The LFU opcode is a Long Forward Unconditional word-oriented branch. 
  • Note that the literal values for the relative offsets used by the branches will take control to the locations for labels L1, L2, and L3, respectively. The compiler emits NOPs as necessary to align the code following these labels on word boundaries, as the long branches require.
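
The steps above can be traced with a small Python model of the dispatch. It is only a sketch: `switch_dispatch` and `is_true` are names invented for this example, and the function returns the syllable offset of the literal-call/branch pair that gets executed rather than simulating the stack.

```python
def is_true(word):
    """B5500 truth test: only the low-order bit matters ("truth is odd")."""
    return word & 1 == 1


def switch_dispatch(index, element_count=3, bfc_at=0o12):
    """Return the syllable offset of the LITC/branch pair selected for index."""
    below = 1 if 0 >= index else 0             # LITC 0; OPDC 21; GEQ
    above = 1 if index > element_count else 0  # OPDC 21; LITC 3; GTR
    out_of_range = below | above               # LOR
    offset = index + index                     # OPDC 21; DUP; ADD
    next_syllable = bfc_at + 1                 # BFC is relative to this point
    if is_true(out_of_range):
        return next_syllable                   # no branch: fall into pair at 0013
    return next_syllable + offset              # branch over 2*index syllables


print([oct(switch_dispatch(i)) for i in range(5)])
# ['0o13', '0o15', '0o17', '0o21', '0o13']
```

Indexes 1, 2, and 3 select the pairs at 0015, 0017, and 0021, while 0 and 4 both end up at 0013, the pair that skips past the GO TO.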

So much for how the SWITCH works in this program. But wait -- it is possible to use a SWITCH in multiple places in a program. In the code above, if the switch index is out of bounds, there is a branch to a fixed location. How can that be right if you use the SWITCH more than once?

The answer to that is an example of back-patching at its finest. For the first invocation of the SWITCH, the compiler generates exactly what is shown above. If it encounters a second invocation of the SWITCH, the compiler changes its strategy and fixes up both the code for the SWITCH declaration and first use of the SWITCH to use the new strategy. The fixed-up code for the SWITCH declaration will look like this:

SWITCH S ~ L1, L2, L3;  
    0000  LITC  0000  0000  
    0001  OPDC  0021  0106  
    0002  GEQ         0125  
    0003  OPDC  0021  0106  
    0004  LITC  0003  0014  
    0005  GTR         0225  
    0006  LOR         0215  
    0007  OPDC  0021  0106  
    0010  DUP         2025  
    0011  ADD         0101  
    0012  BFC         0231  
    0013  LITC  0017  0074  
    0014  RTS         1235
    0015  LITC  0034  0160  
    0016  RTS         1235
    0017  LITC  0035  0164  
    0020  RTS         1235
    0021  LITC  0036  0170  
    0022  RTS         1235  

and an invocation of the SWITCH will look like this:

J ~ 3;   GO TO S[J];
    0160  LITC  0003  0014
    0161  LITC  0026  0130
    0162  ISD         4121
    0163  OPDC  0026  0132
    0164  LITC  0021  0104
    0165  ISD         4121
    0166  OPDC  0032  0152
    0167  XRT         0061
    0170  LOD         2021
    0171  BFW         4231
    0172  NOP         0055

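The differences from the original code are the four literal-call/RTS pairs at offsets 0013-0022 in the declaration and the four syllables at offsets 0166-0171 in the invocation. What the compiler has done is convert the code for the SWITCH into a subroutine.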
  • It is not shown here, but the compiler also allocates four additional PRT locations to hold Program Control Words (PCWs) for the entry point to the subroutine that SWITCH S has become and the locations of the labels L1, L2, and L3. Like all control words for the B5500, these will have their high-order (flag) bit set. The branch syllables are sensitive to the flag bit and can accept either an integer offset or a PCW on the top-of-stack as a branch destination. Executing an OPDC that references a PCW results in a subroutine call to that code location (OPDC is the do-it-all kid -- it will also index arrays if its top-of-stack operand is a data descriptor).
  • With the code for the SWITCH declaration reconfigured as a subroutine, testing the bounds of the index and selecting a pair of syllables to branch works the same as before, but instead of directly branching, the selected syllables return a PRT offset value as the subroutine result. RTS is the Return from Subroutine syllable, which takes the value on top-of-stack (the result of the selected LITC syllable in this case), cuts back the stack used by the subroutine, and branches to the return address, leaving the original top-of-stack value at the new top-of-stack location.
  • The code that invokes the SWITCH now does an operand call on the PCW for the SWITCH's subroutine, which will return the PRT offset for one of the label PCWs, as selected by the index value. XRT (Set Variant) extends the range of PRT addressing for the next syllable (it is not needed in this small program, but might be in programs having more than 512 words in their PRT). LOD (Load Operand) takes a PRT offset on top-of-stack and replaces it with the value at that PRT location, which will be one of the label PCWs. BFW (Branch Forward) is an unconditional syllable-oriented branch. It can accept either an integer offset or a PCW as its operand.
  • Note how the NOPs at offset 0170-0171 have been overwritten by the extra code needed to implement the subroutine-based switch invocation code. The one-pass compiler could not know whether the SWITCH would be referenced more than once, so initially emitted code that efficiently supported the simpler scenario.

In this example, the PCW for the SWITCH's subroutine is at PRT offset 32 octal. That subroutine will return one of 17, 34, 35, or 36 octal, depending upon the value of J. Offsets 34, 35, and 36 represent the PCWs for labels L1, L2, and L3 respectively, but offset 17, which is returned if J is out of bounds, is in the area of the PRT reserved for the MCP and compilers. What is that? It turns out that the word at PRT offset 17 octal always contains a zero. Thus, if the SWITCH index is out of bounds, the BFW will branch forward zero syllables, which is effectively a no-op, and control continues with the next statement in the program.
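
A rough Python sketch of the subroutine-style SWITCH follows. The PRT is modeled as a dictionary, the label PCWs are stand-in strings, and the function names are invented; only the PRT offsets (17, 34, 35, and 36 octal) and the zero at offset 17 come from the description above.

```python
# Toy PRT: offset 17 octal always holds zero; 34-36 hold the label PCWs.
PRT = {0o17: 0, 0o34: "PCW L1", 0o35: "PCW L2", 0o36: "PCW L3"}

# Operands of the LITC syllables at offsets 0013, 0015, 0017, and 0021.
RETURN_VALUES = [0o17, 0o34, 0o35, 0o36]


def switch_subroutine(index):
    """Return the PRT offset that the SWITCH subroutine leaves on the stack."""
    if index < 1 or index > 3:
        return RETURN_VALUES[0]  # out of bounds: offset of the zero word
    return RETURN_VALUES[index]


def go_to_switch(index):
    """Model OPDC (call), LOD, and BFW for GO TO S[index]."""
    operand = PRT[switch_subroutine(index)]  # LOD replaces offset with contents
    if operand == 0:
        return "fall through"                # BFW forward by zero syllables
    return operand                           # BFW to the label's PCW


print(go_to_switch(2), "/", go_to_switch(7))  # PCW L2 / fall through
```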

Switches can be even more complex than we have seen here, as the elements of a SWITCH declaration can themselves be designational expressions (e.g., invocations of some other SWITCH) which must be evaluated at run time. Investigation of how that works is left as an exercise, dear reader, to you.

CASE Statements

In contrast to SWITCH, the implementation of CASE statements is simple and straightforward. For each statement (which may be a compound statement or a block) that is immediately subordinate to the CASE statement, the compiler determines a relative branch offset. It then constructs a one-dimensional array of those offsets, indexed by the zero-relative case value. The array is stored in the object file for the program, and is pointed to by a data descriptor that is placed in the PRT. Indexing that descriptor by the CASE expression yields the offset to the appropriate statement.

The CASE statement in the program above has eight subordinate statements (two of them empty), and the compiler generates syllable offsets of 0, 5, 10, 17, 40, 26, 40, and 35 decimal, respectively, for them. The code to execute the CASE statement looks like this (again, with the back-patching unraveled for clarity):

CASE J MOD 10 OF
      0170  OPDC  0026  0132
      0171  LITC  0012  0050
      0172  RDV         7001
  BEGIN
      0173  OPDC  0042  0212
      0174  BFW         4231
    J ~ 3;
      0175  LITC  0003  0014
      0176  LITC  0026  0130
      0177  ISD         4121
      0200  LITC  0043  0214
      0201  BFW         4231
    K ~ J;
      0202  OPDC  0026  0132
      0203  LITC  0027  0134
      0204  ISD         4121
      0205  LITC  0036  0170
      0206  BFW         4231
    X ~ K +J;
      0207  OPDC  0027  0136
      0210  OPDC  0026  0132
      0211  ADD         0101
      0212  LITC  0032  0150
      0213  STD         0421
      0214  LITC  0027  0134
      0215  BFW         4231
    Y ~ X ~ SQRT(X);
      0216  MKS         0441
      0217  OPDC  0032  0152
      0220  OPDC  0043  0216
      0221  LITC  0032  0150
      0222  SND         1021
      0223  LITC  0033  0154
      0224  STD         0421
      0225  LITC  0016  0070
      0226  BFW         4231
    ;
    Z ~ 2|Y + 6.0;
      0227  LITC  0002  0010
      0230  OPDC  0033  0156
      0231  MUL         0401
      0232  DESC  1777  7777
      0233  ADD         0101
      0234  LITC  0034  0160
      0235  STD         0421
      0236  LITC  0005  0024
      0237  BFW         4231
    ;
    K ~ 5000;
      0240  DESC  1777  7777
      0241  LITC  0027  0134
      0242  ISD         4121
      0243  LITC  0000  0000
      0244  BFW         4231
  END CASE;

The statement begins by computing the CASE index. The OPDC pushes the value of J onto the stack, LITC pushes the value 10 decimal onto the stack, and RDV (Remainder Divide) implements the MOD operator. The heavy lifting is done by the next syllable, another OPDC for PRT offset 42 octal:
  • That PRT location contains the data descriptor for the array of syllable offsets. 
  • When OPDC detects that its operand on the top-of-stack is an unindexed data descriptor, it applies the second word in the stack as an index to that descriptor, computes the memory address of the indexed element of the array, and loads the value at that address onto top-of-stack after popping both the descriptor and index value. 
    • A data descriptor contains the length, address, and presence status of the data for the array it describes.
    • If the index value is less than zero or greater than or equal to the length of the array stored in the descriptor, OPDC will raise an Invalid Index interrupt and quit. Unless this fault is trapped by the program, the MCP will terminate the program.
    • Initially, the presence status of the descriptor is "absent" to indicate that the contents of the array it describes are not present in memory. Thus, the first time we execute this CASE statement, the OPDC will simply raise a Presence Bit interrupt and quit, allowing the hardware to branch to the appropriate interrupt vector.
    • The MCP will respond to the P-bit interrupt by allocating an area in memory of appropriate size (eight words in this case, one per subordinate statement), reading the array element values from the object file on disk into that new area, fixing up the descriptor with the address of the new area, setting the presence bit [2:1] in the descriptor to indicate it now points to a real memory address, and exiting back into the program to restart the OPDC syllable, which this time will complete the index-and-load operation.
    • It is possible that the array of syllable offsets may be forced out of memory later due to pressure from other memory allocation activity, in which case the MCP simply deallocates the memory area and fixes up the descriptor for it in the program's PRT to point back to the copy of the data in the object code file. The area does not need to be written out to disk, since the system considers it to be read-only, and therefore not dirty. The next time the CASE statement is executed (if ever), the same P-bit process will bring the array back into memory from the code file.
The final result, whatever machinations are involved, is that the syllable offset ends up on top-of-stack. The syllable after the OPDC, BFW, uses that offset to branch to the beginning of the selected subordinate statement in the CASE statement. The BFW is at syllable offset 174 octal in its code segment, so the offsets in the array identified above would yield branch locations of 175, 202, 207, 216, 245, 227, 245, and 240 octal, respectively. Recall that syllable branches are relative to the syllable after the branch.
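
The whole sequence (bounds check, presence check, load, branch) can be modeled in Python. This is a sketch with several assumptions: the `Descriptor` class and its method names are invented, the MCP's work is collapsed into a list copy, and the order in which the hardware checks bounds and presence is assumed rather than taken from the reference manual.

```python
class InvalidIndex(Exception):
    """Stands in for the Invalid Index interrupt (hypothetical name)."""


class Descriptor:
    """Toy data descriptor for a read-only array kept in the code file."""

    def __init__(self, code_file_copy):
        self.code_file_copy = list(code_file_copy)
        self.length = len(self.code_file_copy)
        self.present = False       # initially "absent"
        self.memory = None
        self.p_bit_interrupts = 0

    def index(self, i):
        if i < 0 or i >= self.length:
            raise InvalidIndex(i)
        if not self.present:
            # Presence Bit interrupt: the MCP brings the array into memory.
            self.p_bit_interrupts += 1
            self.memory = list(self.code_file_copy)
            self.present = True
        return self.memory[i]

    def overlay(self):
        """Memory pressure: drop the in-memory copy; nothing is written out."""
        self.memory = None
        self.present = False


OFFSETS = [0, 5, 10, 17, 40, 26, 40, 35]  # decimal offsets listed above
BFW_AT = 0o174                            # location of the dispatching BFW


def case_target(descriptor, index):
    """Syllable branches are relative to the syllable after the branch."""
    return BFW_AT + 1 + descriptor.index(index)


d = Descriptor(OFFSETS)
print([oct(case_target(d, i)) for i in range(8)])
# ['0o175', '0o202', '0o207', '0o216', '0o245', '0o227', '0o245', '0o240']
```

After the eight lookups above, `d.p_bit_interrupts` is 1; calling `d.overlay()` and indexing again bumps it to 2, which mirrors the repeated overhead when the array is pushed out of memory.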

After the end of each of the subordinate statements, the compiler inserts a branch to the code for the statement following the CASE statement. These are the LITC/BFW pairs that close each statement's code in the listing above.

Note that two of the subordinate statements in the CASE statement are empty statements, represented by just their delimiting semicolons. For these the compiler generates no code, just an offset that branches around the CASE statement to the syllable at 0245.
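
As a cross-check, the exit branches from the six non-empty statements can be run through the same syllable-branch arithmetic in Python. The table below is transcribed from the listing above (location of each closing BFW and the literal pushed just before it); the function name is invented for this example.

```python
# (offset of the BFW, operand of the LITC that precedes it), all in octal.
EXIT_BRANCHES = [
    (0o201, 0o43),
    (0o206, 0o36),
    (0o215, 0o27),
    (0o226, 0o16),
    (0o237, 0o05),
    (0o244, 0o00),
]


def bfw_target(branch_at, offset):
    """A syllable branch is relative to the syllable after the branch."""
    return branch_at + 1 + offset


targets = {oct(bfw_target(at, offset)) for at, offset in EXIT_BRANCHES}
print(targets)  # {'0o245'}
```

Every exit lands on the same syllable, 0245, which is also where the offset of 40 decimal sends the two empty statements.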

SWITCH vs. CASE

So which is better, SWITCH or CASE? I have forgotten what Rose and I concluded in 1970, but looking at it afresh, my answer now is that it depends on what you are trying to do and how you are using the multi-way branch. Both constructs are very efficient in certain cases.

SWITCH must store the index value in PRT cell 21 octal and then fetch it three times. If the SWITCH is referenced more than once in a program, it must be invoked in a subroutine call, which is not an inexpensive operation. Those issues aside, the actual dispatch to a location based on the index value is very efficient. You can also do things with a SWITCH that you cannot with a CASE statement, including nesting designational expressions that have expressions for indexes, all of which will be evaluated dynamically at the time the SWITCH is invoked.

CASE is more efficient in the way that it computes the ultimate branch location, but there is significant overhead the first time it is used to handle the P-bit interrupt for its array of offsets and bring it into memory. That overhead can be repeated later, possibly multiple times, if the array is pushed out of memory. CASE is an in-line construct, so it is not subject, lexically, to multiple references. Of course, code segments can be pushed out, too, so the code for the SWITCH is not entirely immune to overlay by memory allocation pressure.

Using an out-of-bounds index with a SWITCH results in no branch occurring at all. This can be considered a feature or a bug, depending on your point of view. Because CASE obtains its branch offset by indexing an array, an out-of-bounds index causes an Invalid Index interrupt, which, unless trapped, will abort the program. The programmer may need to insert bounds tests before the CASE statement to protect against aborts. Eventually, a numbered CASE statement with provision for a default case was implemented in Algol for the B6700/7700, but that does not appear ever to have been (officially) implemented for B5500 Algol.

I matured as a programmer during the post-Dijkstra, Go-To-Considered-Harmful era, so my personal preference would be to ignore the relatively minor performance-difference issues, use the CASE statement, and eschew the SWITCH, labels, and go-to statements altogether.


That completes the analysis of CASE vs. SWITCH. Tune in next week for Part 2, where I will deconstruct my less-than-stellar 1970 programming abilities, show what went wrong in the execution of the program, and demonstrate how to fix it.

Resources

In addition to the original listing, cited earlier in this post, you may be interested in the following files and documents, generated from the retro-B5500 emulator. Note that the listings from 1970 were produced by a B5500 running the Mark X system software release, probably with some local-site patches. The emulator is running the base Mark XIII software release from late 1971, so you should expect to see some slight differences in the output.
  • CASESW-PAULROSE-DECK.card --
    The original card deck, as displayed above.
  • CASESW-PAULROSE-20131106-OUTPUT.txt --
    Printer output from the emulator for that original card deck, with a compile listing showing the generated code, incorrect PRT dump, and Invalid Index abort. The emulator had exactly the same problems with my program that a real B5500 did.
