Tuesday, May 29, 2018

The CUBE Library Tapes

Nigel and I are very pleased to report that three magnetic tapes in the collection of the Computer History Museum have been successfully read, and the resulting image files made available to several of us in the community working on Burroughs B5500 emulators and software restoration.

These tapes represent version 13 of the B5500 CUBE Library from February 1972. The files from these tapes are now available in the B5500-software repository:
https://github.com/retro-software/B5500-software/tree/master/CUBE-Library-13

CUBE, the Cooperating Users of Burroughs Equipment, was a U.S.-based user organization, active from the 1960s through the early 1990s. After the merger of Burroughs and Sperry Univac in 1986, CUBE and USE (the Sperry user group) eventually merged to form the UNITE organization.

One of the functions that CUBE performed during the 1960s and early '70s was to maintain a library of programs for the B5000 and B5500 computer systems. These programs were donated both by Burroughs and by CUBE members. The library was freely available on magnetic tape to CUBE members and Burroughs support staff. The library eventually grew to occupy three reels of 7-track, odd-parity magnetic tape.

The remainder of this post describes the acquisition and preparation of the files from these tapes for the B5500-software repository.


Acquisition and Imaging


Tapes for version 13 of the library were acquired several years ago by Jim Haynes from the B5500 site at the University of California at Santa Cruz. He donated them to the Computer History Museum in Mountain View, California. The CHM was finally able to read these tapes in May 2018, producing binary images in .tap (taput) format. For information on the .tap format, see [1].

The .tap format was not originally designed to support the even- and odd-parity encoding available on 7-track tapes, and there is some difference of opinion on how it should be used for 7-track images. The CUBE Library tape images produced by the CHM do not record the parity. Each 6-bit character frame is stored right-justified in 8-bit bytes with two leading zero bits. The data is encoded in B5500 Internal Code (BIC). See Appendix A in [2].
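
For readers who want to inspect the images themselves, here is a minimal Python sketch of a .tap reader. It assumes the classic SIMH layout as I understand it from [1]: each record is framed by a 4-byte little-endian length before and after the data, data is padded to an even number of bytes, a zero length word is a tape mark, and 0xFFFFFFFF marks the end of the medium. The function names are hypothetical and are not part of any tool mentioned in this post.

```python
import struct
from typing import Iterator, Optional

TAPE_MARK = 0x00000000
END_OF_MEDIUM = 0xFFFFFFFF


def read_tap_records(image: bytes) -> Iterator[Optional[bytes]]:
    """Yield each data record as bytes, and None for each tape mark.

    Assumes the classic SIMH .tap framing; other .tap variants may differ.
    """
    pos = 0
    while pos + 4 <= len(image):
        (header,) = struct.unpack_from("<I", image, pos)
        pos += 4
        if header == END_OF_MEDIUM:
            break
        if header == TAPE_MARK:
            yield None
            continue
        length = header & 0x00FFFFFF          # low 24 bits hold the record length
        data = image[pos:pos + length]
        pos += length + (length & 1)          # skip the pad byte after odd-length data
        (trailer,) = struct.unpack_from("<I", image, pos)
        pos += 4
        if trailer != header:
            raise ValueError("leading and trailing length words disagree")
        yield data


def is_six_bit(record: bytes) -> bool:
    """True if every byte has its two high-order bits clear (no parity recorded)."""
    return all(b & 0xC0 == 0 for b in record)
```

Running `is_six_bit` over every record is a quick way to confirm that an image follows the right-justified, parity-free convention described above. Translating the 6-bit values to readable text still requires the BIC table in Appendix A of [2].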

The .tap format is not presently supported by the retro-b5500 emulator. Using a program I wrote for the modern Unisys MCP systems [3], I converted the .tap files to the .bcd format [4] used by retro-b5500 and several other emulators.

The tape images are identified as follows:
  • CUBE_LBR: labeled CASTC, 56-word (448-character) blocks, 2.8MB
  • CUBEA13: labeled CUBEA13/FILE000, B5500 Library/Maintenance, 12.0MB
  • CUBEB13: labeled CUBEB13/FILE000, B5500 Library/Maintenance, 10.4MB

The CUBE_LBR Tape Image


CUBE_LBR is in CAST format. The tape label has a creation date of 1976-06-10, but that is probably the date the tape was last copied. It does not appear that any of the files had been updated since at least the late 1960s. In fact, it appears that most of the programs were originally written for the B5000, as they have dates that precede first customer shipment of the B5500 in February 1965. See the appendix on CAST tapes below for more information on this format.

Imaging of the CUBE_LBR tape detected errors in two areas of the tape. The first error was in the PTS041A module, in the block containing sequence numbers "KPKP0024" through "KPKP0028". These records are part of a large comment. The second error was in the PTS051 module, in the block containing sequence numbers "ANTP  11" through "ANTP  13", and including the two blank records on either side of those sequence numbers. These records occurred at the end of another large comment.

There were actually three blocks with errors in this second area of the tape, but two appear to be duplicates of the middle one, probably introduced by positioning errors during tape error retry, either when the tape was written or when it was recently imaged. It was not unusual for tape drives to detect erroneous blocks differently when reading in the reverse direction, so this could explain how the duplication occurred. The two duplicated blocks, at five records each, caused module boundaries to be offset by 10 records in all the modules after that point on the tape.

Some research showed that identical comments were part of similar routines in other modules on the tape. I was able to correct the .tap image using a hex editor, drop the two duplicated blocks, and reconstruct the corrupted records from the comments elsewhere on the tape image. The .bcd image and extracted files discussed below were then generated from that corrected .tap image.
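
The correction described above was done by hand with a hex editor. As a rough illustration of how such duplicates could be located mechanically, the sketch below checks the record number that begins each text block (see the appendix on CAST tapes below) against the value expected from the preceding block. It assumes that a duplicated block repeats the leading record number of the original, which this post does not confirm. The function name is hypothetical.

```python
from typing import Iterable, List, Tuple


def find_sequence_breaks(first_record_numbers: Iterable[int],
                         records_per_block: int = 5) -> List[Tuple[int, int, int]]:
    """Return (block_index, expected, actual) for each out-of-sequence block.

    The input is the 1-relative record number taken from the first word of
    each text block, in tape order.
    """
    breaks = []
    expected = 1
    for index, actual in enumerate(first_record_numbers):
        if actual != expected:
            breaks.append((index, expected, actual))
        # Resynchronize on what the tape actually holds.
        expected = actual + records_per_block
    return breaks
```

For example, the sequence 1, 6, 11, 11, 11, 16 would be reported with breaks at block indexes 3 and 4, which is the pattern two duplicated blocks would leave behind under this assumption.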

The CUBEA13 and CUBEB13 Tape Images


CUBEA13 and CUBEB13 are standard B5500 Library/Maintenance tapes. These two tapes were imaged without error. Their tape labels indicate creation dates of 1972-08-07 and 1972-08-05 respectively, but none of the files on these tapes has a creation date later than February 1972. Files from these tapes can be loaded to disk using the B5500 ?LOAD and ?ADD control card commands. See page 4-15ff in [5].

The individual files have been extracted from these tape images and converted to standard text file format in the Files/ subdirectory of the repository discussed below. Directories of the files on each tape are also discussed below.


Library Contents


CUBEA13 and CUBEB13 contain identical copies of an index file for the library (CUBELIB/INDEX) and an Algol program that can sort and format the index in multiple ways (CUBELST/Q000007). This program was used to generate listings of the index that you can view in the CUBELIB.LIST.txt and CUBELIB.LIST-1.txt files in the repository. These files appear to match the scans of listings [6] and [7] on bitsavers, respectively.

The CUBE-Library-13 node of the B5500-software repository contains the .tap and .bcd tape images, directories of the files on each tape, and the two index listings cited above. The Files/ subdirectory of that node in the repository contains the individual files from the tape images, converted to standard text file format and encoded in ASCII. The following substitutions used by the retro-b5500 emulator were made for the five B5500 character glyphs that do not have ASCII equivalents:
  • ~  left-arrow (Algol assignment operator)
  • |  small-cross (Algol multiply operator)
  • {  less-than-or-equal operator
  • }  greater-than-or-equal operator
  • !  not-equal operator
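
When reading the converted sources, it can help to see the original glyphs. The following is a minimal Python sketch that maps the five ASCII substitutes listed above to Unicode characters for display. The choice of Unicode code points and the function name are my own assumptions, not something defined by retro-b5500.

```python
# ASCII substitute -> Unicode glyph, for display purposes only.
B5500_GLYPHS = str.maketrans({
    "~": "\u2190",  # left-arrow (Algol assignment operator)
    "|": "\u00d7",  # small-cross (Algol multiply operator)
    "{": "\u2264",  # less-than-or-equal operator
    "}": "\u2265",  # greater-than-or-equal operator
    "!": "\u2260",  # not-equal operator
})


def show_b5500_glyphs(line: str) -> str:
    """Return the line with the five ASCII substitutes shown as B5500-style glyphs."""
    return line.translate(B5500_GLYPHS)
```

The mapping is applied to every character, including those inside string literals and comments, so the result is suitable for viewing only and should not be fed back to the emulator.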

The disk file system for the B5500 used a two-part name, the Multi-file identifier (MFID) and the File Identifier (FID), written MFID/FID. In the CUBE Library index, the MFID is the name of the program or package. The FID is a seven-character string termed the CUBE ID. The first two characters of this ID are a letter-number code denoting a classification scheme. The remaining characters (usually numeric) are a unique value within the classification code. The classifications are shown in the index listing CUBELIB.LIST.txt.

On the B5500, these files would be named MFID/FID. In Windows and Unix-like file systems, however, treating the MFID as a directory name would result in a large number of directories containing a single file. This would make the library files a little tedious to navigate, and would make it more difficult to find files by the CUBE ID in their second name. Therefore, the text files in the Files/ subdirectory have been named in the form MFID-FID.ext, where ".ext" attempts to identify the files by language or usage:
  • .alg: B5500 Extended Algol
  • .cob: B5500 COBOL (not COBOL-68)
  • .dat: Data or text in ASCII (no binary encoding)
  • .for: B5500 FORTRAN IV
  • .gtl: GTL (Georgia Tech Language)
  • .mca: Westinghouse Research MCALGOL
  • .sno: SNOBOL
  • .wipl: WIPL (Wisconsin Interactive Problem-Solving Language)
  • .xal: B5500 XALGOL (Compatible Algol)

Most files on the CUBEA13 and CUBEB13 tapes use the CUBE ID as the FID, but quite a few of the files (especially data files) do not. These are noted within the file descriptions in the index. To add to the confusion, files on the CUBE_LBR CAST tape have only one name. The CUBE ID, where available in the library index, has been appended as the FID to the names of these in the Files/ subdirectory of the repository.
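
The naming scheme can be summarized with a short Python sketch. The helper names are hypothetical, and the sketch assumes that neither the MFID nor the FID contains a hyphen or a period.

```python
from typing import Tuple


def repo_file_name(mfid: str, fid: str, ext: str) -> str:
    """Form the MFID-FID.ext name used in the Files/ subdirectory."""
    return f"{mfid}-{fid}.{ext}"


def parse_repo_file_name(name: str) -> Tuple[str, str, str]:
    """Split an MFID-FID.ext name back into (MFID, FID, ext)."""
    stem, _, ext = name.rpartition(".")
    mfid, _, fid = stem.rpartition("-")
    return mfid, fid, ext


def split_cube_id(cube_id: str) -> Tuple[str, str]:
    """Split a seven-character CUBE ID into (classification code, unique value)."""
    if len(cube_id) != 7:
        raise ValueError("a CUBE ID is seven characters long")
    return cube_id[:2], cube_id[2:]
```

Under these rules, the B5500 name PTS051/T200019 with the Algol extension would become PTS051-T200019.alg, and its CUBE ID would split into classification T2 and unique value 00019. Files whose FID is not a CUBE ID still follow the MFID-FID.ext pattern, but `split_cube_id` would not be meaningful for them.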

In preparing the files for this repository I noticed a number of discrepancies between the files actually on the tape images and the entries in the index:

The following files are on CUBE_LBR but are not in the index:
  • MRS115, MRS117, MRS125, MRS138, MSS002, ORS023, ORS029, ORS033, ORS036, PTS074, URS046

The following files are in the index but not on any tape:
  • PTS061/T200023
  • SYSX-O400001 (noted in the index as available separately)

The following files have different names between CUBE_LBR and the index (as noted above, several other files have different names on disk but they are identified as such in the index):
  • DSS029 on tape is DS029/T300001 in index
  • DSS030 on tape is DS030/T300002 in index
  • DSS031 on tape is DS031/T300003 in index
  • PTS024 on tape is PTS024R/E200008 in index

Library Highlights


The CUBE Library contains quite a bit of interesting software. The CUBE_LBR CAST tape image contains the Burroughs Mathematical Library, written entirely in Extended Algol. This tape has both individual subroutines that can be included in a program at compile time (see the discussion on CAST tapes and "$$" cards below) and full programs. Some of the modules contain sample data.

CUBE_LBR contains routines for special functions, differentiation, integration, interpolation, curve and surface fitting, matrix operations, statistical analysis, and a rather daunting set of programs for civil and chemical engineering (e.g., PTS051/T200019, "Non-Adiabatic Flash Calculations for Hydrocarbon Mixtures"), plus more.

As mentioned earlier, most of the programs on this tape appear to have been written for the B5000, as most have dates in the index that precede the introduction of the B5500 (and its Head-per-Track disks) in February 1965. I have been unable to locate any references to disk files in these programs. If this assertion of their origin is correct, this is the only B5000 software we have found to date.

I tried compiling the PTS051 program mentioned above. It produced syntax errors due to declaring procedure array parameters by value, something neither the B5000 nor B5500 supported. It is likely that early versions of the Algol compiler simply ignored call-by-value specifications for array parameters, and enforcement of that restriction was implemented in a later version. Removing the call-by-value specification for those arrays allowed the program to compile and run. See the compile deck and listing in the CAST-Examples subdirectory of the repository.

The CUBEA13 and CUBEB13 tapes contain many more mathematical programs and procedures, including double-precision transcendental functions written by NASA, a complex-number pre-processor for Algol programs, polynomial operations, Fourier approximation, matrix operations, statistical analysis, simulation, and a variety of miscellaneous utilities. I was particularly pleased to see a couple of disk directory utilities I had last used in 1970 at the University of Delaware.

The following programs and packages on these two tapes are of particular note:

MCALGOL -- An enhancement to Extended Algol by Larry McGuown at Westinghouse Research in Pittsburgh, Pennsylvania.

SNOBOL -- An implementation of the string-processing language by John Chambers at the University of Wisconsin. This had previously been transcribed by Richard Cornwell from the scan of a listing on bitsavers.org. That transcription is now in the versioned history of SNOBOL-L200010.alg [corrected 2019-01-31] in the repository.

APL -- An APL interpreter for the B5500, written by Gary Kildall (of CP/M fame) and others at the University of Washington. A slightly earlier version of this program had also been previously transcribed by Hans Pufal and Fausto Saporito from a listing sent to me by Ed Vandergriff. Since that transcription has a separate provenance, it is being maintained separately in the repository. Documentation for this version is available on bitsavers.org.

GTL -- Georgia Tech Language. Another clone of Extended Algol by Martin Alexander at the Georgia Institute of Technology in Atlanta, Georgia. This compiler has significant extensions for strings, records, complex arithmetic, list processing, plex processing, and extended I/O features. Documentation is also available on bitsavers.org.

ALTRAN -- A program to translate Burroughs Algol to FORTRAN! This was written in GTL, probably at Georgia Tech. The author is unknown, but it appears to be based on an eponymous B5500 Algol program by Wayne Wilner. Wilner went on to become one of the architects, with Bob Barton, for the Burroughs B1000-series interpreter-based computer systems.

OMNITAB -- A command-driven program for statistical manipulations, originally written by the U.S. National Bureau of Standards (now the National Institute of Standards and Technology) in Washington, D.C., and converted for the B5500 by the Naval Air Test Center in Patuxent River, Maryland.

R/C -- REMOTE/CARD, a remote text editing and job submission program, more like RJE than timesharing, written by Ron Brody at the Burroughs Research Center in Paoli, Pennsylvania. This program and its documentation were also previously transcribed by Richard Cornwell from the scan of a listing on bitsavers.org. That transcription is now in the versioned history of RCSY94-Z100006.alg in the repository.

WIPL -- Wisconsin Interactive Problem-Solving Language, written by Ed Harris and Bob Janoski at the University of Wisconsin. This is a somewhat BASIC-like interactive programming language.

ELIZA -- The famous (or infamous) robotic psychiatrist. This version is written in GTL by Charles Fricks, probably at Georgia Tech.

XREF/JONES -- A documentation and cross-reference utility written by Glen Jones and Joan Dunshee at Burroughs in Pasadena, California. This is an earlier version of the same program on the Mark XIII SYSTEM tape. Documentation for the program can be generated by running the program against its own source file.

There's lots more -- the library is 320 files in all, so this collection of software is going to keep us busy for a while.

Appendix: CAST Tapes


The CUBE_LBR tape image in the library is in CAST format, a sequential source archive originating from the days of the B5000. The B5000 did not have the large Head-per-Track disks that were introduced with the B5500, only two relatively small drums, so source programs had to be maintained either as card decks or on tape.

The CAST format allows multiple source modules to be maintained as a single file. CAST files were originally on tape, but on the B5500, could also be stored on disk. These files are maintained by a standard Burroughs utility program, MAKCAST/DISK. The Algol and COBOL compilers understand this format and can compile programs and include individual routines directly from CAST tapes or disk files.

CAST tapes have a directory on the front of the tape that identifies the files stored on that tape. The directory includes the relative record number of the start of each source module. This allows MAKCAST/DISK and the compilers to use the Algol SPACE statement to position the tape to individual files relatively efficiently. If the CAST file is on disk, SPACE provides random access to the modules. If the file is on tape, it can take up to five minutes to traverse a full reel.

Here is what I have deduced for the format of CAST tapes:
  1. The tape is labeled with standard B5500 tape labels.
  2. Tapes are written with fixed-length 448-character (56 word) blocks.
  3. The first three blocks on the tape contain a directory of the files on the tape:
    • The first word of the first directory block appears to be a binary count of the number of blocks in the directory. This appears to be a fixed value of 3, however, and is hard-wired into the MAKCAST/DISK utility program.
    • Entries in the tape directory are variable length, consisting of N+4 characters, where N is the number of characters in the library module name.
    • The first character in an entry is the binary length of the module name. This length is followed immediately by the characters of the name.
    • Following the name are three characters that specify a big-endian 18-bit binary number -- the 1-relative logical record number on the tape where the module starts. This number is relative to the first non-directory block on the tape (i.e., the block following the directory blocks).
    • Directory entries are not split across tape blocks. If there is insufficient room at the end of a block for the next entry, a zero-length entry is inserted at the end of that block and the entry is stored at the beginning of the next block.
  4. The remainder of the tape after the directory blocks consists of blocks containing the text of the library modules.
  5. The first word of each of these text blocks is the big-endian binary value of the 1-relative record number of the first logical record in the block, using the same relative basis as in the 18-bit directory record numbers.
  6. The remainder of the block consists of five logical records of 88 characters (11 words) each (thus 5x88+8=448). The first 80 characters of a logical record hold a card image. The last eight characters of a logical record do not appear to be used and are zero.
  7. The library on tape is terminated by a physical tape mark and ending tape label.
  8. A 2400-foot reel of tape could hold almost 110,000 records at 800 BPI. The maximum capacity of a library is limited by the three directory blocks and the size of the 18-bit record number in the directory entries.
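
The capacity figure in item 8 can be checked with a little arithmetic. The sketch below assumes an inter-record gap of 0.75 inch, which is the usual figure for 7-track tape but is not stated in this post, and it ignores the leader, the tape labels, and the directory blocks. The function name is hypothetical.

```python
REEL_FEET = 2400
DENSITY_BPI = 800          # characters per inch
BLOCK_CHARS = 448          # 56 words of 8 characters
RECORDS_PER_BLOCK = 5
GAP_INCHES = 0.75          # assumed inter-record gap

# Largest value an 18-bit directory record number can hold.
MAX_DIRECTORY_RECORD = 2**18 - 1


def records_per_reel(reel_feet: float = REEL_FEET,
                     density_bpi: float = DENSITY_BPI,
                     gap_inches: float = GAP_INCHES) -> int:
    """Estimate how many 88-character logical records fit on one reel."""
    block_inches = BLOCK_CHARS / density_bpi + gap_inches   # 0.56 + 0.75 = 1.31
    blocks = int(reel_feet * 12 / block_inches)
    return blocks * RECORDS_PER_BLOCK
```

With these assumptions, `records_per_reel()` gives 109,920, consistent with "almost 110,000". That is well under the 262,143 limit of an 18-bit record number, so at 800 BPI the reel length would be reached before the record-number field overflowed.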

The MAKCAST/DISK program is described on page 5-5ff in [5].
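
The layout deduced above can be sketched as a small Python parser. This is an illustration under stated assumptions, not the logic of MAKCAST/DISK. It takes blocks as bytes holding one 6-bit character per byte (as in the .tap images), treats the leading word of a block as a plain unsigned integer, and assumes that only the first directory block begins with the count word. Names are returned as raw 6-bit codes, since decoding them needs the BIC table in [2]. All function names are hypothetical.

```python
from typing import List, Tuple

BLOCK_CHARS = 448
WORD_CHARS = 8
RECORD_CHARS = 88
CARD_CHARS = 80
DIRECTORY_BLOCKS = 3


def to_int(chars: bytes) -> int:
    """Combine 6-bit characters, most significant first, into an integer."""
    value = 0
    for c in chars:
        value = (value << 6) | (c & 0x3F)
    return value


def parse_directory(blocks: List[bytes]) -> List[Tuple[bytes, int]]:
    """Return (name as 6-bit codes, 1-relative starting record) per module."""
    entries = []
    for index, block in enumerate(blocks[:DIRECTORY_BLOCKS]):
        # Assumption: only the first directory block starts with the count word.
        pos = WORD_CHARS if index == 0 else 0
        while pos < len(block):
            n = block[pos]                      # binary length of the module name
            if n == 0 or pos + n + 4 > len(block):
                break                           # zero-length entry: go to next block
            name = block[pos + 1:pos + 1 + n]
            start = to_int(block[pos + 1 + n:pos + 4 + n])   # 3 chars = 18 bits
            entries.append((name, start))
            pos += n + 4
    return entries


def parse_text_block(block: bytes) -> Tuple[int, List[bytes]]:
    """Return (first record number, five 80-character card images)."""
    first_record = to_int(block[:WORD_CHARS])
    cards = [block[off:off + CARD_CHARS]
             for off in range(WORD_CHARS, BLOCK_CHARS, RECORD_CHARS)]
    return first_record, cards
```

To locate a module, take its starting record number from the directory and find the text block whose leading record number is the largest one not exceeding it. The module then begins at the corresponding one of the five card images in that block.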

The Algol and COBOL compilers use "$$" cards to include source modules from a CAST tape or disk file into the program being compiled. These cards are described on page 4-41ff in [5].

Sample card decks showing basic use of the MAKCAST/DISK utility program and use of "$$" cards during compilation can be found in the CAST-Examples subdirectory of the repository.


References


[1] TAP Tape Image Format: http://simh.trailing-edge.com/docs/simh_magtape.pdf
[2] B5500 Internal Code Table: http://bitsavers.org/pdf/burroughs/B5000_5500_5700/1021326_B5500_RefMan_May67.pdf, Appendix A.
[3] TAPBCD Tape Image Conversion Program: https://github.com/retro-software/B5500-software/tree/master/Unisys-Emode-Tools/TAPBCD.alg_m
[4] BCD Tape Image Format: http://www.piercefuller.com/oldibm-shadow/tool.html
[5] B5500/B5700 Operations Manual: http://bitsavers.org/pdf/burroughs/B5000_5500_5700/1024916_B5500_B5700_OperMan_Sep68.pdf
[6] CUBE Library Index Listing: http://bitsavers.org/pdf/burroughs/B5000_5500_5700/listing/CUBE_13_Library_Feb72.pdf
[7] CUBE Library Index Listing: http://bitsavers.org/pdf/burroughs/B5000_5500_5700/listing/CUBE_Library_Listing.pdf
