The Wang 2200 was a pretty rock-solid machine, especially compared to some of the micros that followed it in the mid-to-late 1970s. However, that doesn't mean that the Wang computer and its peculiar dialect of BASIC were perfect.

Here is a short collection of outright bugs and some surprising artifacts of Wang BASIC. Unless otherwise noted, all are from Wang BASIC as implemented on the 2200T CPU; not all of them will behave identically with other microcode variations.

In particular, Wang BASIC-2 suffers none of the bugs here (just some of the unusual dialect issues), and was an amazingly robust interpreter. That was due, in part, to the higher expectations and coding practices of the Wang engineers, and to the VP's writable microcode store, which enabled frequent updates and bug fixes.

In all cases below, lines that begin with ":" indicate the lines which the user has typed; the other lines are the computer's response.

#1 – Make the Interpreter Spew Trash

This trick comes courtesy of Carl Coffman. Wang BASIC, like most interpreted BASICs, has both program mode and immediate mode execution. Immediate mode execution means that a single line program can be entered and executed immediately without being part of a stored program. Some examples are as follows:

:A=1:PRINT A+1
 1     1
 2     4
 3     9

Although branching is not allowed as it makes no sense for a one line program, the FOR/NEXT construct is allowed. However, the Wang BASIC interpreter didn't count on sneaky folks performing a FOR/NEXT loop as two separate immediate commands. Type the following two commands into a real Wang 2200 (or use the emulator) and watch the interpreter spew out some junk to the screen until it comes back to its senses. Miraculously, the program that was in memory appears unharmed. That is simply a bug.

:FOR I=1 TO 10
:NEXT I

Interestingly, this fails only for this case. Try this program -- it works.

       ^ERR 26

This can be taken one step further. The immediate mode NEXT I can branch back into a suspended program.

:10 FOR I=1 TO 3
:30 STOP
:40 PRINT "Here"
:50 NEXT I
:60 PRINT "Done"

Note that the string "Here" isn't printed, meaning that the NEXT I really does loop right back to line 20. However, when the terminal count is reached, it drops back into immediate mode instead of continuing on at line 60.

One final "FOR loop funny" is that FOR loops with the same index can be nested, although the index variable gets modified by the inner loop and the outer loop abides by it.

:10 FOR I=1 TO 30 STEP 10
:20 FOR I=I TO I+2
:30 PRINT I,
:40 NEXT I
:60 NEXT I
 1               2               3
 13              14              15
 25              26              27

What this shows is that the FOR loop control stack doesn't check for nesting on the same index.
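The output above can be reproduced with a small Python model, offered only as a sketch. It assumes two things that the listing implies but that I haven't confirmed from the microcode: the loop limit is evaluated once when the FOR is executed, and NEXT leaves the index untouched when the next step would pass the limit. The function name is made up for illustration.

```python
def run_same_index_loops():
    """Model of the program above: both loops share the single variable I."""
    rows = []
    i = 1                                  # 10 FOR I=1 TO 30 STEP 10
    outer_limit, outer_step = 30, 10
    while True:
        inner_limit = i + 2                # 20 FOR I=I TO I+2 (limit fixed here)
        row = []
        while True:
            row.append(i)                  # 30 PRINT I,
            if i + 1 > inner_limit:        # 40 NEXT I: done, I keeps its value
                break
            i += 1
        rows.append(row)
        if i + outer_step > outer_limit:   # 60 NEXT I: the outer loop sees the
            break                          # value left behind by the inner loop
        i += outer_step
    return rows


for row in run_same_index_loops():
    print(row)
```

Under those assumptions the outer loop resumes from 3, then 15, then 27, which is why the rows start at 1, 13, and 25 rather than 1, 11, and 21.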

Also see Stupid Trick #21.

#2 – Code Entry for People with Poor Peripheral Vision

This isn't a bug, just misapplication of a feature. Wang BASIC allowed specifying the width of a terminal or printer since various options existed, so BASIC had no way of knowing a priori what the correct width was. The SELECT statement was used for this purpose. SELECT could be applied to a specific device, and it could be applied in specific situations, such as console output, listing, or printing.

In immediate mode type this command: SELECT CO 005(1). Then start typing in your program. All output appears in the first column, since the above command claimed that the CRT had only one column.

#3 – "Marked GOTO"

Wang BASIC has a feature called a "marked GOSUB". Rather than using a line number as the target of the GOSUB command, a numeric label, from 0 to 255, is used instead. An example would help.

:10 PRINT "Starting"
:20 GOSUB'5
:30 END
:100 DEFFN'5
:110 PRINT "Hello"

OK. However, this can be abused by using the RETURN CLEAR command. The RETURN CLEAR command pops the return address off of the call stack. It can be useful for aborting a deeply nested subroutine when some error occurs. By combining this command with the marked GOSUB, we make a "marked GOTO".

:20 IF I<4 THEN 30:GOSUB'3
:30 I=I+1:GOTO 10
:40 PRINT "This should never happen":END
:80 DEFFN'3
:90 RETURN CLEAR: REM we never intend to return
:110 I=I+2
:120 GOTO 10:REM we never run out of stack space

#4 – Code in a REM

This isn't a bug at all; Wang documented it. However, Wang BASIC is the only BASIC I've seen that can terminate a REM with a statement separator, and it is surprising.

:10 REM This is a comment:PRINT "Hello"

#5 – FORM=STOP

Surprisingly, the following is a legal Wang BASIC fragment:

:100 FORM=STOP

This is legal despite the fact that Wang variables can have only a single letter or a single letter plus one digit, and despite the fact that the keyword STOP is used. This is because the code isn't an assignment at all, but gets parsed by the interpreter as a FOR statement.

:100 FOR M = S TO P

Yes, it isn't much, but back in 1978 I was excited when I figured this one out.
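To make the parse concrete, here is a hypothetical Python sketch. It is not how the Wang microcode works internally; it only assumes that spaces are discarded before matching and that both expressions are simple variables (one letter, optionally followed by one digit). The name `parse_for` is mine.

```python
import re


def parse_for(statement):
    """Parse 'FOR v = v TO v' after discarding spaces; return None if no match."""
    text = statement.replace(" ", "")
    var = r"[A-Z][0-9]?"   # one letter, optionally followed by one digit
    match = re.fullmatch(rf"FOR({var})=({var})TO({var})", text)
    if match is None:
        return None
    index, start, limit = match.groups()
    return {"index": index, "start": start, "limit": limit}


print(parse_for("FORM=STOP"))
print(parse_for("FOR M = S TO P"))
```

Both spellings produce the same result, with M as the index, S as the start value, and P as the limit.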

#6 – Comma Chameleon

Wang's PRINTUSING allowed a number of formatting options, including inserting commas into numeric fields. It didn't require you to follow conventions, though.

:10 PRINT USING 20, 123456789
:20 % #,#,#,#,#,#,#,#,#

Commas aren't allowed after the decimal point, however, and are assumed to separate the next image specifier.

#7 – Microcode Abuse

The $GIO instruction allows executing synthetic microcode programs for performing fast I/O to peripherals, such as serial ports and reel-to-reel tape drives. However, the keyboard and display are just peripherals as far as the microcode is concerned. With some cunning, I'm sure some interesting feats could be achieved, such as high speed screen drawing, but I must admit to just speculating here. Some rainy day I'll make an attempt and report my results here.

#8 – PACK/UNPACK Numeric Scaling

The PACK and UNPACK commands allow encoding a number or an array of numbers into a string array, using whatever precision is desired. Part of the PACK operation is a format specifier saying whether to encode the number as fixed or float, and how many digits of precision to keep. UNPACK does the reverse, but since the format of the PACK operation isn't encoded in the packed data, the format must also be specified by the UNPACK instruction to recover the data.

Where the abuse comes in is that it is possible to specify different formats when PACKing and UNPACKing. One trick would be to scale an array of numbers by a power of 10 like this:

10 DIM A$(100)10:REM 1000 bytes
20 DIM B(125), C(125):REM 125 numbers each
30 FOR I=1 TO 125:B(I)=RND(1):NEXT I
40 PACK (+#.###########) A$() FROM B()
50 UNPACK (+##.##########) A$() TO C()

This program sets up an array of 125 random numbers in B(), then scales them by a factor of 10 and puts them in C().

I timed this and it is about 50% faster than using the scalar MAT multiply function:

40 MAT C=(10)*B

However, for this trick to work, one must know the magnitude of the numbers being operated on; otherwise, severe truncation could result. But that just leads to the next idea: if you want to truncate an array of numbers to some number of digits (say, two), this mechanism would be many times faster than looping over the array and doing something like:

100 FOR I=1 TO 125
110 C(I) = INT(B(I)*100)/100
120 NEXT I
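The reason mismatched formats scale or truncate is that the packed data holds only digits, and the format alone says where the decimal point sits. The Python sketch below models just that idea. It is a simplification under stated assumptions: non-negative values only, digits beyond the format are truncated rather than rounded, and the digit string stands in for Wang's real packed byte layout, which I am not reproducing. The function names are hypothetical.

```python
from decimal import Decimal, ROUND_DOWN


def pack(value, int_digits, frac_digits):
    """Keep only the digits the format allows; the decimal point is not stored."""
    total = int_digits + frac_digits
    scaled = (Decimal(value) * 10 ** frac_digits).to_integral_value(rounding=ROUND_DOWN)
    return str(int(scaled)).zfill(total)[-total:]


def unpack(digits, frac_digits):
    """Reinterpret stored digits with a (possibly different) decimal point position."""
    return Decimal(int(digits)).scaleb(-frac_digits)


stored = pack("0.123456789", 1, 11)          # like (+#.###########)
scaled = unpack(stored, 10)                  # like (+##.##########): ten times larger
truncated = unpack(pack("0.98765", 1, 2), 2)  # keep only two fraction digits
print(stored, scaled, truncated)
```

In this model, unpacking with one fewer fraction digit multiplies by 10, and packing with only two fraction digits does the same job as INT(B(I)*100)/100.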

#9 – Exposing a Terminal Disease

This one is a real bug. Not that interesting, but a bug anyway.

The bug isn't in the terminal, which is simply a video display driven from the CRT controller in the CPU. It isn't a CRT controller bug, since every time a character is sent to the controller, the cursor is automatically incremented and is supposed to just wrap modulo the screen width. It is up to the BASIC microcode to know when to insert carriage returns and line feeds. This is especially true in light of the earlier SELECT 005(<width>) discussion.

The CRT interprets some bytes with ASCII values less than 32 as control codes. Some of those control codes are just ignored, some cause a carriage return, a line feed, clearing the screen, or cursor motions. HEX(08) is the code for cursor left, and HEX(09) is the code for cursor right. Normally the cursor gets bumped to the next line of the display when a character is emitted in the last column of the display, but if these two codes were used before that, the CR/LF is emitted at the wrong time.

:10 PRINT HEX(03);:REM clear the screen
:20 PRINT HEX(0909090909090909);:REM move the cursor right 8 spaces
:30 PRINT "!";
:40 FOR I=1 TO 62:PRINT "a";:NEXT I
:50 FOR I=1 TO 8:PRINT "b";:NEXT I

The "!" came out first. Note that the a's wrap around on the same line until 64 characters total have been output, then an implicit CR/LF is issued, causing the final 7 b's to appear on the second line.

Explanation of Bug

The code in the BASIC interpreter that figures out where the cursor is on the display doesn't account for the horizontal cursor motions properly. Apparently it knew that HEX(03) caused the cursor to get reset, but it didn't account for the HEX(09)'s adjusting the cursor's horizontal offset.
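The mismatch can be modeled in a few lines of Python. This is a sketch of the behavior as described here, not of the actual microcode. It assumes a 64-column screen, a controller that wraps the cursor on the same row, and an interpreter that counts only printable characters and issues a CR/LF after the 64th one. The names `render` and `counted` are mine.

```python
WIDTH = 64  # columns on the CRT


def render(stream):
    """Return the screen rows produced by sending `stream` through the model."""
    rows = [[" "] * WIDTH]
    row = col = counted = 0          # `counted` is the interpreter's column guess
    for byte in stream:
        if byte == 0x03:             # clear screen: both sides agree on a reset
            rows = [[" "] * WIDTH]
            row = col = counted = 0
        elif byte == 0x09:           # cursor right: the controller moves,
            col = (col + 1) % WIDTH  # but the interpreter's count does not
        else:
            rows[row][col] = chr(byte)
            col = (col + 1) % WIDTH  # the controller wraps on the same row
            counted += 1
            if counted == WIDTH:     # the interpreter thinks the line is full
                rows.append([" "] * WIDTH)
                row += 1
                col = counted = 0
    return ["".join(r) for r in rows]


program_output = bytes([0x03]) + bytes([0x09] * 8) + b"!" + b"a" * 62 + b"b" * 8
screen = render(program_output)
for line in screen:
    print(line.rstrip())
```

In this model the first row ends up with seven a's and one b wrapped in front of the "!", and the remaining seven b's land on the second row.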

#10 – Input/Output = Output/Input

When someone walks away from their Wang for a minute, type this as an immediate mode command: SELECT PRINT 001. When the user comes back to the computer, he will be able to list and edit the program as normal. However, when he runs his program, nothing shows up. This is because this command has redirected the output of PRINT statements to the keyboard! The keyboard, of course, can't use the data, but it does happen to return the right status so that BASIC thinks the data has been printed and doesn't lock up the machine.

Typing SELECT PRINT 005 puts everything back right.

Similarly, one could enter SELECT LIST 001. Everything operates as before, except the LIST command mysteriously doesn't work when it is invoked.

Another one is: PRINT HEX(03);":";:SELECT CO 001. In this case, PRINTing and LISTing work OK, but none of the keystrokes of the user and error messages appear. The gibberish at the beginning clears the screen and prints a fake prompt so that the evidence of what just happened isn't on screen.

More maliciously, wait until a user has been entering their program for 15 minutes without saving their work and type in SELECT CI 005.

This tells the computer to take console input from the CRT (005) instead of the keyboard (001). Since the CRT can't return the proper status, the machine locks up waiting for input that will never come. A RESET will clear the screen, but it is still left with console input selected from the CRT. The only way out of it is to power cycle.

#11 – Inert Code

More an oddity than anything, this "trick" exposes a strange feature of Wang BASIC. The BASIC manual mentions that the PRINTUSING image statement can't have any literal '#' in it for obvious reasons, but it also says it can't have any colons either.

The reason for that restriction is that the entire line is parsed, and anything after a colon must be a legal statement. Despite that requirement, those statements after the colon are never executed.

:10 PRINT USING 20, 123
:20 %This is a test ###:PRINT "HI"
:30 PRINT "Done"
This is a test 123

#12 – Copy Machine

Someone who wrote this code might be surprised if they expected the final value of A$ to be "aabcdefg...". The code appears to set A$ to the alphabet, then copy bytes 1 through 25 of the string to bytes 2 through 26.

:10 DIM A$26
:20 A$="abcdefghijklmnopqrstuvwxyz"
:30 PRINT A$
:40 STR(A$,2,25)=STR(A$,1,25)
:50 PRINT A$

Explanation of Bug

What has happened is that the BASIC interpreter copies the first byte of the string to the location of the second byte. Then the second byte is copied to the location of the third byte, etc. Notice that each time the "a" is copied to the next position. Just about anybody who has written a block move routine will have made this mistake once. If the range is overlapping, an in-place copy routine like this must copy from the end of the string to the beginning; otherwise, this behavior results.

This could be turned into a feature. While the above code could be more directly accomplished with this:

40 INIT("a") A$

The INIT statement can deal only with a single fill byte. If you want to fill with a pattern, do this:

:10 DIM A$33
:20 A$="abc"
:30 STR(A$,4,30)=STR(A$,1,30)
:40 PRINT A$

It should be noted that Wang didn't consider this to be a bug. It was documented behavior, and the exact same behavior is seen in BASIC-2 as well.
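The forward, byte-at-a-time copy described above is easy to model. The Python sketch below uses 0-based offsets where the BASIC code uses 1-based positions. The helper name `forward_copy` is made up, and the space padding of A$ is an assumption that doesn't affect the result.

```python
def forward_copy(buf, dst, src, length):
    """Copy `length` bytes inside `buf` one at a time, lowest address first."""
    for k in range(length):
        buf[dst + k] = buf[src + k]
    return buf


alphabet = bytearray(b"abcdefghijklmnopqrstuvwxyz")
forward_copy(alphabet, 1, 0, 25)           # STR(A$,2,25)=STR(A$,1,25)
smeared = alphabet.decode()

pattern = bytearray(b"abc" + b" " * 30)    # DIM A$33 : A$="abc"
forward_copy(pattern, 3, 0, 30)            # STR(A$,4,30)=STR(A$,1,30)
filled = pattern.decode()

print(smeared)
print(filled)
```

Each byte is read only after an earlier step has already overwritten it, so the single "a" smears across the whole string, while the three-byte seed repeats with period three.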

#13 – Spacing Out

The Wang interpreter tokenizes its input (that is, PRINT becomes a single byte in memory). However, it is conservative about spaces, even spaces that appear before the line number. It can be used for indenting, or making program lines line up. For example:

:10 REM Normal practice of putting in limited spacing
:50 FOR A=1 TO 10
:100 PRINT "A=";A
:150 PRINT "A*A=";A*A
:200 NEXT A
:250 END
10 REM Normal practice of putting in limited spacing
50 FOR A=1 TO 10
100 PRINT "A=";A
150 PRINT "A*A=";A*A
200 NEXT A
250 END

By using some spacing, things look nicer:

: 10 REM Spaces make line numbers line up, plus indenting
: 50 FOR A=1 TO 10
:100    PRINT "A=";A
:150    PRINT "A*A=";A*A
:200 NEXT A
:250 END
 10 REM Spaces make line numbers line up, plus indenting
 50 FOR A=1 TO 10
100    PRINT "A=";A
150    PRINT "A*A=";A*A
200 NEXT A
250 END

This can be abused:

                                                10 A=1
                              20 FOR B=1 TO 10
                     30 A=A*2
          40 PRINT A

One restriction is that lines that are named as a THEN/GOTO/GOSUB target must not have any leading spaces, otherwise the syntax check that runs at the beginning of a program run won't find the named line and will complain.

More fun can be had by abusing the Wang keyboard. The keyboard has an "index" and "reverse index" key. They correspond to HEX(0A) and HEX(0C), respectively. HEX(0A) is the linefeed character, and HEX(0C) is the reverse linefeed character. By typing

10 REM This <LF>code lives       <RLF>on two <LF>lines.

(<LF>=linefeed, <RLF>=reverse linefeed), the listing looks like:

10 REM This on two
code lives         lines.

For some reason, the first <LF> also causes a carriage return when it is listed. When you use the line editor to edit this line, things get mighty confusing.

OK, we're just getting started. <LF> and <RLF> would seem to be the only two magic characters that could be entered into a line. It turns out that the special function keys can be defined to insert a string literal, even during program entry. For example, enter this bit of code.

1000 DEFFN'0 "TAN(A)"

Whenever the special function key 0 is pressed, TAN(A) appears in a flash as if the user had typed it in keystroke by keystroke. This form of DEFFN' command is even more general, as HEX() constants can be used as well. This is where the abuse comes in.

1000 DEFFN'0 HEX(03)

HEX(03) is the clear screen command. We can use this feature to mess up the listing in arbitrary ways. Here's a simple one. "<SF0>" below means press the special function key 0 with the above mapping defined.

10 FOR A=1 TO 10:REM<SF0>

This program can be run and it prints a list of the numbers 1 to 10. If you list it, or any line, the screen is cleared.

It is even possible to make the listing show an entirely different program by placing a decoy program after the <SF0>. For example, map <SF0> to be HEX(08), backspace, and enter this program (each "#" is the mapped backspace and each "." is a real space):

10 FOR I=1 TO 10:REM ####################for(i=0;i<10;i++) [
20 PRINT I;:REM #############..printf("%d ",i);
30 NEXT I:REM ###########]...........

When you run the program, it produces a list of the numbers 1 to 10, but when you list it, it looks like C code!

10 for(i=0;i<10;i++) {
20   printf("%d ",i);
30 }
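The decoy works because the CRT simply obeys the backspaces as the line is listed. Here is a rough Python model of that for line 20. It assumes the listing sends the line to the screen character for character, with each keyword occupying exactly its spelled width. That assumption matches the 13 backspaces used on that line, but I haven't verified it for every keyword. The function name is hypothetical.

```python
def list_line(text):
    """Show what the screen holds after `text`; chr(8) moves the cursor left."""
    screen = []
    col = 0
    for ch in text:
        if ch == chr(8):
            col = max(col - 1, 0)
        else:
            if col < len(screen):
                screen[col] = ch     # overwrite what was already displayed
            else:
                screen.append(ch)
            col += 1
    return "".join(screen)


line20 = "20 PRINT I;:REM " + chr(8) * 13 + '  printf("%d ",i);'
shown = list_line(line20)
print(shown)
```

The 13 backspaces return the cursor to just after the line number, and the decoy text is at least as long as the real code, so nothing of the original statement stays visible.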

#14 – Token on Something

This builds on the previous trick, Spacing Out.

Wang BASIC internally uses codes HEX(00) to HEX(7F) to represent the ASCII characters of the program. The keywords of the language, such as PRINT and END, even TO, may be entered as sequences of ASCII characters, but when the interpreter first parses the line, these each get converted to a single byte representation. HEX(80) to HEX(FF) are used for encoding these keywords. This trick, common in many BASIC interpreters, saves a lot of space and makes subsequent interpretation much faster.

Some Wang keyboards have keywords as the shifted equivalent of some key on the keyboard (say, SHIFT-P is PRINT). All that is happening is that the one byte token representing a keyword is directly deposited in the command line, rather than having to be converted later.

The bug comes about because not all codes are used. HEX(F0) to HEX(FE) apparently are not committed. Since the DEFFN'0 HEX(..) mechanism allows depositing an arbitrary byte in the input buffer, the bug becomes manifest.

1000 DEFFN'0 HEX(F1)

Then type

10 <SF0>

All sorts of garbage spews out, and you get a syntax error number. The syntax error can be avoided by putting the magic byte into a REM comment, or inside the quotes of a PRINT statement. Different codes produce different spew.

#15 – Hidden Code

This trick builds (to a crescendo) on the previous two tricks.

Wang BASIC converts all line number references, such as the one at the beginning of each line, or after a GOTO, GOSUB, or THEN, to a three byte sequence, FF ab cd, where ab cd is the BCD representation of the line number. For example, line 1520 would be encoded as FF 15 20.
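As a quick illustration of that encoding, here is a small Python sketch. The function names are mine, and the decoder deliberately does no validation, which is the point of this trick.

```python
def encode_line_ref(line_number):
    """Encode a legal line number (0..9999) as FF followed by two BCD bytes."""
    if not 0 <= line_number <= 9999:
        raise ValueError("line numbers run from 0 to 9999")
    digits = f"{line_number:04d}"
    return bytes([0xFF, int(digits[:2], 16), int(digits[2:], 16)])


def decode_line_ref(ref):
    """Show the two payload bytes as four hex digits, legal BCD or not."""
    assert ref[0] == 0xFF
    return f"{ref[1]:02X}{ref[2]:02X}"


print(encode_line_ref(1520).hex())                   # ff1520
print(decode_line_ref(bytes([0xFF, 0xC1, 0xC1])))    # C1C1, not a decimal number
```

Any nibble above 9 in the last two bytes is not a valid BCD digit, so such a line number can never be produced by typing digits at the keyboard.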

Using the same DEFFN'0 HEX(....) trick, we can insert arbitrary line numbers, including illegal ones.

10 PRINT "Normal code"
1000 DEFFN'0 HEX(FFC1C1)

This shows up as "<1<1" (a choice other than C1C1 would produce something else weird). Enter this illegal line by hitting <SF0> then entering your executable code:

<SF0> PRINT "Magic code here":RETURN

When you LIST the program, the illegal line number doesn't appear, and the reference to that line appears to be a syntax error. Then when you RUN, the hidden line shows that it is still present.

10 PRINT "Normal code"
20 GOSUB <1<1:END
1000 DEFFN'0 HEX(FFC1C1)
Normal code
Magic code here

Other illegal lines may or may not show up in the listing, depending on where they fall. If the chosen line is FF 00 CC, for example, it will appear between line 99 and line 100 of your listing. The reason the code didn't show up at all in the original example is that when you type LIST, the interpreter must silently convert it to be LIST 0,9999, and since C1C1 is after 9999, it doesn't show up. One can still get those hidden lines to list out by explicitly naming the line or giving a range using an even greater illegal line number.
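The listing behavior follows if line numbers are ordered and range-checked by their raw two-byte BCD value. That is my assumption about the mechanism, inferred from the behavior described above. A Python sketch, with hypothetical names:

```python
def listed(raw_line_keys, first=0x0000, last=0x9999):
    """Return the raw line keys a LIST over first..last would show, in order."""
    return [key for key in sorted(raw_line_keys) if first <= key <= last]


program = [0x0010, 0x1000, 0xC1C1]                 # lines 10, 1000, and the hidden one
plain_list = listed(program)                       # plain LIST, treated as LIST 0,9999
wide_list = listed(program, last=0xFFFF)           # a range past any legal line
ordering = sorted([0x0100, 0x00CC, 0x0099])        # where FF 00 CC falls

print([f"{k:04X}" for k in plain_list])
print([f"{k:04X}" for k in wide_list])
print([f"{k:04X}" for k in ordering])
```

Under that assumption, C1C1 sorts after 9999 and is skipped by a plain LIST, while 00CC sorts between 0099 and 0100.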

#16 – Transposition Error

Page 13 of the Matrix manual has a very specific warning about not doing a matrix transpose operation under some conditions:

With any 32K System 2200, an array to be transposed must not be the first variable or array defined in the program; it must be preceded in the program by a variable or array defined with at least 8* (column-dimension-of-array-to-be-transposed - 1) bytes.

That sounded like an invitation to me, so here is a sample program and the results:

:10 DIM A(3,3)
:20 DIM B(3,3)
:30 FOR R=1 TO 3:FOR C=1 TO 3
:40 A(R,C)=3*(R-1)+C
:70 MAT B=TRN(A)
 1               2               3
 4               5               6
 7               8               9
 0               0               0               0
 0               0               0               0
 0               0
 0               0               0               0
 0               0               0               0
 0               0
 ... (a 10x10 array of 0's is printed)
 0               0               0               0
 0               0               0               0
 0               0
 1               4               7
 2               5               8
-0.00010000E-;>  3               6

The A(,) array has been redimensioned to a 10x10 array. B(3,1) is an illegal value, and it isn't really the transpose. Different sized arrays result in different illegal values. The illegal value doesn't appear to depend on the array values.

Update 2006/03/07:

Although the manual cautions only about TRN operation on a 32 KB system, other matrix operations are at risk too.

:_   (despite the prompt, the machine is hung)

A warm reset won't fix the hang; it needs to be power cycled.

:10 DIM A(4,4)
 1       1       1       1
 1       1       1       1
 1       1       1       1
 1       1       1       1

No problems there.

:10 DIM A(4,4)

 1       0       0       0
 0       1       0       0
 0       0       1       0
 0       0       0       1
(nothing there)

Note that

10 MAT A=IDN(10,10)

hangs, while

10 MAT A=IDN(9,9)

is OK, and

10 DIM A(9,9)
20 MAT A=IDN(9,9)

hangs. The unused portion of the array is what matters.

Explanation of Bug

What is magic about 32 KB, and why does it matter that the array be the first one declared? The 2200 CPU has a 16 bit address, but the unit of memory addressing is the nibble. Also, the variable table (where each entry in the table consists of the variable name and its value) is located at the end of memory. The first variable appearing in the program will be located in the highest location in memory. Apparently some of the MAT routines do pointer arithmetic and don't handle the possibility of pointers wrapping off the end of memory.
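The arithmetic behind that explanation is short enough to spell out in Python. The constants come from the text above. The `advance` helper and the example pointer value are hypothetical and only illustrate what wrapping a 16-bit nibble pointer looks like.

```python
ADDRESS_BITS = 16
NIBBLES_PER_BYTE = 2


def memory_nibbles(kilobytes):
    """Number of addressable nibbles in a memory of the given size."""
    return kilobytes * 1024 * NIBBLES_PER_BYTE


def advance(pointer, nibbles):
    """Add to a 16-bit pointer, discarding any carry out of the top bit."""
    return (pointer + nibbles) & (2 ** ADDRESS_BITS - 1)


print(memory_nibbles(32), 2 ** ADDRESS_BITS)   # 65536 65536
print(hex(advance(0xFFF0, 0x20)))              # 0x10
```

A 32 KB machine uses the entire 16-bit nibble address space, so a variable at the very top of memory sits right at the wraparound point. On smaller machines, a pointer that runs past the end of RAM doesn't wrap back to address zero.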

#17 – PRINT (ab)USING

Here are some more oddities about the PRINTUSING and image statement that I've figured out.

10 PRINT "HI":% This is pretty much like a REM

In the example above, the image declaration (the "%" and later) can't be used by a PRINTUSING statement, as it must be the first statement after the line number, but in this atypical usage, it functions like a REM statement. When line 10 is executed, the PRINT happens, but the image is ignored.

The BASIC manual states that image statements must not use field declarations for numbers that are wider than 16 digits. Let's see what happens.

:10 % ###.####################
:20 PRINTUSING 10, 123 + 1/3

Explanation of Bug

You can see that the mantissa is internally represented as 16 nibbles. You can see that there are only 13 significant digits, with the trailing digits filled with zeros. Filling in leading "#"s with spaces or zeros as appropriate, each "#" of the PRINTUSING causes the next nibble to be printed. However, when there are more than 16 digits after the first significant digit, the nibble pointer wraps around and starts printing the digits again.
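The wraparound described above can be sketched in Python. This is a model built from the explanation, not a capture from a real machine. It assumes a 16-nibble mantissa holding 13 significant digits followed by zeros, read circularly, and an image with 3 integer and 20 fraction positions. The function name is mine.

```python
def print_digits(mantissa_nibbles, count):
    """Emit `count` digits, reading the mantissa circularly."""
    size = len(mantissa_nibbles)
    return "".join(mantissa_nibbles[k % size] for k in range(count))


mantissa = "1233333333333" + "000"      # 13 digits of 123 + 1/3, zero filled to 16
digits = print_digits(mantissa, 3 + 20)
formatted = digits[:3] + "." + digits[3:]
print(formatted)
```

In this model, the first 16 digit positions come straight from the mantissa, and the remaining 7 repeat it from the beginning.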

One other oddity of PRINTUSING is that if the width of the image exactly matched the declared width of the output device, then each PRINTUSING referring to the image would produce two line feeds after each statement.

#18 – Executing Buggy Code

Whenever a program is going to be run, the BASIC interpreter does a pass through the program source to ensure that all line number references (GOTO, GOSUB, IF ... THEN, KEYIN, ON...GOTO, ON...GOSUB, PRINTUSING) are valid and that all PRINTUSING line references start with an image specification. If any line is referenced by the program but doesn't exist, the program doesn't even begin running. Also, if any line contains a syntax error, the program will refuse to run. I believe that first pass is also used to build up symbol tables for the variables used by the program. For larger programs, there is a noticeable delay from the time "RUN" is entered until the program actually does anything visible because of this first pass.

Normally, this checking is a good thing. With other BASICs, it is maddening to be running a program for a while only to hit a syntax error just because the code path never meandered onto that particular line in error. Sometimes, though, it is nice to be able to run a program even though there is an error. There is a way around the strict error checking.

Wang BASIC has an unusual feature called the "named GOSUB". Named subroutines can optionally take an argument list. A subroutine is marked with an integer label from 0 to 255, like this:

100 DEFFN'3(N)
110 PRINT "The square root of"; N; "is"; SQR(N)

This can be called from within a program, like this:

10 GOSUB'3(5)

What is particularly Wang-ish is that there are 16 special function keys (32 if used in combination with the SHIFT key), and these can be used to directly call the routine with whatever parameters are on the command line. This feature makes the 2200 act like a very sophisticated calculator.

:2 (then press special function key 3)
The square root of 2 is 1.4142135624

To get back to the story, in order to make these keys as responsive as possible, Wang BASIC doesn't go through the usual syntax check. Bringing this all together, if you want to execute a program that has some syntax errors that you feel you can ignore for the time being, just start your program with a function key:

20 ... (the rest of your program on following lines) ...

Press special function key 0 to run the program. The first line "catches" the special function key 0 action, then tells BASIC to ignore the fact that it was in a subroutine and to just continue on.

Be warned, though, that the first pass that has been skipped won't be able to build up a proper symbol table, so if your program references any new variables, who knows what will happen:

:20 PRINT "This is a test"
:30 A=1
:40 PRINT "A=";A
:50 PRINT "B=";B
:_(press special function key 0)
This is a test
A= 0
B= 1


#19 – COMmotion and disArray

This trick comes courtesy of Georg Schäfer, of Bergisch Gladbach, Germany. He writes:

Yesterday I played a little with your emulator (works great!) and especially wanted to try an old trick that I learned on our old Model B at School back in 1976. I wasn't sure whether this will work on a Model T, because my machine still waits for repair, and it doesn't work with BASIC 2 on my VP, but it did on your emulator!

It was the very, very first command I ever saw executed on the Wang and even on any computer, and I still can remember the moment, when a guy from another class showed it to us.

It was a 8 K machine, and he typed:

    COM A$(87,87)1

You can try it on your emulator, it shows the same nice effect :-))).

For the 32 K machine you can use the following:

    COM A$(228,141)1

We called this special system error the "NESY-Error", because the name of the "inventor" was Jasper Neumann.

This causes the BASIC interpreter to repeatedly emit

    ^ERR 01

until the machine is reset. Using DIM instead of COM does not have the same effect.

On the 32 KB machine, END reports 32070 bytes free, yet 228*141*1 is slightly more than this at 32148 bytes. Using COM A$(114,141)2 or COM A$(57,141)4 also triggers the bug.
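The sizes are easy to check. The Python snippet below just does the multiplication for the three 32 KB variants, using the free-space figure quoted above. It ignores any per-array overhead, which I don't know.

```python
FREE_BYTES_32K = 32070   # reported by END on the 32 KB machine, per the text


def array_bytes(rows, cols, element_length):
    """Raw data size of a string array, ignoring any bookkeeping overhead."""
    return rows * cols * element_length


variants = [(228, 141, 1), (114, 141, 2), (57, 141, 4)]
sizes = [array_bytes(*dims) for dims in variants]
overshoot = [size - FREE_BYTES_32K for size in sizes]
print(sizes, overshoot)
```

All three declarations ask for the same 32148 bytes, 78 more than the machine reports as free.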

#20 – Array Dumbensions

This is a bug that is related to the previous trick – by dimensioning an array slightly bigger than what is allowed, the interpreter corrupts itself. On a 32KB 2200T CPU, type the following:

:10 DIM A$(200,10)16
:20 DIM B$(4)20
:30 PRINT "This is line 30"
:40 PRINT "This is line 40"
This is line 30
This is line 40

^ERR 01
10 DIM A$(200,10)16
20 DIM B$(4)20
30 PRINT "This is line 30"

What happened to line 40? Run it again and line 30 disappears. If the DIM statements are COM statements instead, the results are a bit different.

#21 – The Point of No Return

While disassembling the Wang 3300 Extended BASIC interpreter, I spotted a bug. I also knew that even though the 3300 CPU and the 2200 CPU were very different, the BASIC interpreter implementations have a lot in common. So I tried out the following code, and indeed the 2200 has the same bug:

:10 GOTO 30
:30 GOSUB 20

At this point text is spewed out and an ERR 15 message appears (although the error message might change depending on the exact state of the memory when the program runs).

I strongly suspect Stupid Trick #1 is a manifestation of the same bug.

:10 FOR I=1 TO 10

Explanation of Bug

When the interpreter sees the GOSUB statement, it locates the start of the next statement, pushes that location on a stack, then jumps to the subroutine; when the RETURN is seen, this stack is popped of any unfinished FOR loops, then the return address of the GOSUB is retrieved and execution resumes there. In the code above, there is no next line to return to – some meaningless address is pushed on the stack and RETURN causes the interpreter to resume interpretation there.

In the FOR code example, when the interpreter sees the FOR loop, it pushes a small data structure on a stack; this structure contains a token identifying the structure as belonging to a FOR loop, the address of the index variable (so the symbol table doesn't have to be searched on each iteration), the end and step values, plus the address of the instruction immediately after the FOR instruction. If there is no ":" after the FOR command, the interpreter grabs the address of the next line, but there is none, just like in the GOSUB bug described here.
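For readers who think better in code, the control stack described above might be modeled like this in Python. Every class and field name here is made up for illustration. The real structures are nibble-packed in the interpreter's memory and certainly don't look like this.

```python
from dataclasses import dataclass


@dataclass
class ForFrame:
    index_address: int    # where the index variable lives
    limit: float
    step: float
    resume_address: int   # statement right after the FOR


@dataclass
class GosubFrame:
    return_address: int   # statement right after the GOSUB


def do_return(stack):
    """Discard unfinished FOR frames, then pop the GOSUB frame and return its address."""
    while stack and isinstance(stack[-1], ForFrame):
        stack.pop()
    if not stack:
        raise RuntimeError("RETURN without GOSUB")
    return stack.pop().return_address


stack = [GosubFrame(return_address=0x1234), ForFrame(0x8000, 10.0, 1.0, 0x1250)]
resume_at = do_return(stack)
print(hex(resume_at), stack)
```

The bug corresponds to the case where `return_address` or `resume_address` is captured when no following statement exists. The frame then holds a meaningless address, and the interpreter later resumes there.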

#22 – Transdimensional Array Assignment

In Wang BASIC, matrices are declared to have either one or two dimensions. I've always thought of the MAT operations as working on 2D arrays, but it is also legal to use some of them on 1D arrays (also known as vectors) if it makes sense:

:10 DIM A(5)

Some operations, like TRN, IDN, and INV require 2D arrays in general, but surprisingly the following is legal:

10 DIM A(1),B(1)

This roundabout way to compute a reciprocal:

:10 DIM A(1)
:20 DIM B(1)
:30 A(1)=3
:40 MAT B=INV(A)
:50 PRINT B(1)

Lastly, there is this oddity, where a 1D array is assigned to a 2D array.

10 DIM A(1)
20 DIM B(1,1)
30 MAT B=A

Explanation of Bug

The symbol table entry for an array consists of a field indicating that the item is a 1D or 2D numeric array, a byte holding the first letter of the variable name, a nibble holding the optional second character of the name, total array size, first array dimension, second array dimension, then all the bytes required for the array values. Note that even vectors (1D arrays) have a slot holding the 2nd dimension, which is set to the value 1. In some instances the interpreter checks to see if the variable is 1D or 2D by looking at its symbol table type. In other cases it simply assumes it is a 2D array that happens to have a "flat" second dimension. For things like CON and ZER, that works out fine with no surprises, but that shortcut does lead to this weirdity.
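Here is a hypothetical Python sketch of the two kinds of check described above. The field and function names are mine, and the real entry is a packed record rather than an object. The only detail taken from the text is that a vector's second dimension is stored as 1.

```python
from dataclasses import dataclass


@dataclass
class ArrayEntry:
    name: str
    is_2d: bool   # the type field in the symbol table entry
    dim1: int
    dim2: int     # always present; stored as 1 for vectors


def declare(name, dim1, dim2=None):
    """Model DIM A(n) or DIM A(n,m)."""
    return ArrayEntry(name, dim2 is not None, dim1, 1 if dim2 is None else dim2)


def compatible_by_type(a, b):
    """A strict check that also looks at the 1D/2D type field."""
    return a.is_2d == b.is_2d and (a.dim1, a.dim2) == (b.dim1, b.dim2)


def compatible_by_dims(a, b):
    """The shortcut: compare only the two stored dimensions."""
    return (a.dim1, a.dim2) == (b.dim1, b.dim2)


a = declare("A", 1)        # 10 DIM A(1)
b = declare("B", 1, 1)     # 20 DIM B(1,1)
print(compatible_by_type(a, b), compatible_by_dims(a, b))
```

A routine that takes the shortcut sees A(1) and B(1,1) as the same shape, which would explain why MAT B=A is accepted.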

#23 – Nothing For All Intents

There is an odd thing in the BASIC-2 Language Reference Manual, pointed out by Steve Powell, KCML expert. If you go to the index, there is a single entry under the letter Y, on page 451. The entry is "yurt", which is a domed tent used by the nomadic people of central Asia. The reference is to the next page, 452, but it is blank. What could it mean?

I note that all other letters of the alphabet have at least one entry. Was this perhaps a placeholder? Or a workaround for some bug in their indexing software?

#24 – Fancy Filenames

The Wang disk catalog uses eight 8 bit bytes to store the file name. Although there are a limited number of characters one can type directly from the keyboard, it is possible to produce any byte at all by mapping a hex string to a special function key. In that way, it is possible to produce names with symbols and, by setting the msb of a byte, an underlined character. In fact, it would even be possible to include control codes as part of the file name.

The down side is that when you want to load the file, you'd have to set up the right function key mapping again to specify the file you want to load.

:10 REM This is a test
:SAVE DCF "<press special function key 9>"

    COOL       P   00017  00019  00003

#25 – Phantom of the Keyword

This trick is via Jasper Neumann, the very same one mentioned back in Trick #19.

Wang BASIC stores its keywords as a single byte in memory. For example, the PRINT keyword is encoded as HEX(A0). That is why tokenized words always appear a certain way no matter how they were entered. For instance, if you enter extra spaces or fail to include a space after the keyword, they both get normalized to the same representation in memory.

:10 P  R  I   N  T "This is line 10"
:20 PRINT"This is line 20"
:10 PRINT "This is line 10"
:20 PRINT "This is line 20"
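
To make the normalization concrete, here is a toy Python model of it. Only the PRINT to HEX(A0) mapping comes from the text above; the function name, the one-entry keyword table, and the rule that spaces outside of quotes are simply dropped are assumptions made for illustration, not a description of the real microcode.

```python
# Toy model of keyword tokenization. Only PRINT -> 0xA0 is from the article;
# everything else (names, space handling) is an illustrative assumption.
KEYWORDS = {"PRINT": 0xA0}

def tokenize(text: str) -> bytes:
    """Drop spaces outside string literals, then replace keywords with tokens."""
    # Pass 1: remove spaces that are not inside double quotes.
    squeezed = []
    in_quotes = False
    for ch in text:
        if ch == '"':
            in_quotes = not in_quotes
        if ch == " " and not in_quotes:
            continue
        squeezed.append(ch)
    s = "".join(squeezed)

    # Pass 2: scan left to right, emitting one byte per recognized keyword.
    out = bytearray()
    i = 0
    in_quotes = False
    while i < len(s):
        if s[i] == '"':
            in_quotes = not in_quotes
        if not in_quotes:
            for word, token in KEYWORDS.items():
                if s.startswith(word, i):
                    out.append(token)
                    i += len(word)
                    break
            else:
                # No keyword starts here: keep the character as-is.
                out.append(ord(s[i]))
                i += 1
        else:
            # Inside a string literal: keep everything, spaces included.
            out.append(ord(s[i]))
            i += 1
    return bytes(out)

spaced = tokenize('P  R  I   N  T "This is line 10"')
tight = tokenize('PRINT"This is line 10"')
```

In this model both spellings collapse to the same bytes, which is why a listing can only ever show one canonical form of the line.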

Inside the machine there is a ROM which encodes the mapping between the sequence of characters for a keyword and its encoding. Inspecting this table reveals that the 2200 A/B/C/S/T machines had a keyword which was not documented by Wang: DRAM.

The DRAM keyword is not accepted by a 2200T CPU (and produces an ERR 51 code), but it does work on the 2200B. The syntax is:

    DRAM hh

Where hh are two hex digits, eg "00" or "0A" or "1C". In response, the command produces a dump of one page (256 nibbles, also known as 128 bytes) of DRAM.

:DRAM 00


There are some oddities. The hh argument is optional; if it isn't given, then 00 is assumed. That is reasonable enough.

But you can also give just a single digit, and oddly, it is interpreted as the high nibble. That is, DRAM A is interpreted as if you had typed DRAM A0.

Another oddity is that the parser doesn't strictly check that hex digits are entered. DRAM 0Z is accepted as a legal command, as is DRAM GG. Seemingly any single or pair of letters and digits is allowed.
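
The argument handling described above can be summarized in a few lines of Python. This is only a sketch of the observed behavior for valid hex digits; the function name is made up, and since the article doesn't say which page a non-hex argument such as "GG" selects, the sketch refuses those rather than guessing.

```python
def dram_page(arg: str = "") -> int:
    """Model how the DRAM argument appears to map to a page number."""
    arg = arg.strip().upper()
    if arg == "":
        return 0x00                      # no argument: page 00 is assumed
    if len(arg) > 2 or any(c not in "0123456789ABCDEF" for c in arg):
        # The real parser accepts things like "0Z"; the result is not modeled.
        raise ValueError("only one or two hex digits are modeled here")
    if len(arg) == 1:
        return int(arg, 16) << 4         # a single digit is the HIGH nibble
    return int(arg, 16)

# One page is 256 nibbles, which is 128 bytes.
PAGE_NIBBLES = 256
PAGE_BYTES = PAGE_NIBBLES // 2
```

So `dram_page("A")` gives the same page as `dram_page("A0")`, not `dram_page("0A")`, which is the surprising part.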

#26 – Some Summation (link)

This trick comes via Carl Barnes.

Imagine you have an array of 250 numbers and you want to find their sum. The obvious way to do it is as follows:

:10 DIM A(250)
    ... (A is initialized somehow) ...
:100 REM Calculate sum of all entries in A()
:110 T=0
:120 FOR I=1 TO 250:T=T+A(I):NEXT I
:130 PRINT "Sum is";T

That works just fine. But there is a faster way to do it.

:10 DIM A(250),B(1,250),T(1)
    ... (A is initialized somehow) ...
:100 REM Calculate sum of all entries in A()
:110 MAT B=CON
:120 MAT T=B*A
:130 PRINT "Sum is";T(1)

Even though the first way is only summing the entries, while the second method has to multiply each entry and then sum, the second way is significantly faster. This is because the first way manages the loop overhead in BASIC and each index operation requires converting the floating point quantity in I to an array offset. The second way must do the multiply, but the loop overhead is managed in microcode and the array indexing is all done efficiently with simple pointers.

CPU       Simple    Clever
2200T     3.2 s     0.65 s
2200VP    0.19 s    0.07 s
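
For anyone who wants to convince themselves that the matrix product really is the sum, here is a small Python sketch of the same arithmetic. The helper name and the sample data are made up; it says nothing about speed, only that a 1x250 row of ones times a 250x1 column yields a 1x1 result holding the total.

```python
def mat_mul(b, a):
    """Multiply matrix b (rows x n) by matrix a (n x cols), as lists of lists."""
    n = len(a)
    cols = len(a[0])
    return [[sum(row[k] * a[k][j] for k in range(n)) for j in range(cols)]
            for row in b]

values = list(range(1, 251))      # hypothetical contents of A()
a = [[v] for v in values]         # A as a 250 x 1 column
b = [[1] * 250]                   # B as a 1 x 250 row of ones (MAT B=CON)
t = mat_mul(b, a)                 # T is 1 x 1

loop_total = 0
for v in values:                  # the "obvious" loop, for comparison
    loop_total += v
```

Each term of the product is `1 * A(I)`, so the multiply contributes nothing except the opportunity to let the microcode run the loop.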

#27 – String Sushi (link)

Admittedly, it would be a rare occasion where this trick would be useful. Wang BASIC allows redimensioning an array so long as the resulting array is not larger, in total, than the original array allocation. Let's use that ability to rearrange one string into N fixed size substrings without actually moving any data.

:10 DIM A$(1)16
:40 MAT REDIM A$(4)4
:60 MAT REDIM A$(1)16
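
The idea of reinterpreting one allocation under different dimensions, without moving any bytes, can be sketched in Python with memoryview. The helper name and the sample string are hypothetical, and this is an analogy for what MAT REDIM does rather than a model of the interpreter.

```python
storage = bytearray(b"ABCDEFGHIJKLMNOP")   # one 16-byte allocation (sample data)

def view(buf, count, size):
    """Return `count` zero-copy windows of `size` bytes over the same buffer."""
    if count * size > len(buf):
        raise ValueError("redimensioned array may not exceed the original allocation")
    mv = memoryview(buf)
    return [mv[i * size:(i + 1) * size] for i in range(count)]

pieces = view(storage, 4, 4)      # like MAT REDIM A$(4)4
whole = view(storage, 1, 16)      # like MAT REDIM A$(1)16

# Writing through one view is visible through the other: no data was copied.
pieces[1][0:4] = b"wxyz"
```

After the write, `bytes(whole[0])` shows the change in the middle of the original string, because both views share the same 16 bytes.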





#28 – Fast Multiway Value Check (link)

Say you need to check if a string is one of N values and then branch based on which value it matched. For example, the user must enter a single letter in response to a menu prompt, and a branch is taken to the corresponding routine. A straightforward way of doing that would be as follows.

:10 DIM A$1
:20 PRINT "Which action?"
:30 PRINT "   (R)ead"
:40 PRINT "   (W)rite"
:50 PRINT "   (T)est"
:60 PRINT "   (Q)uit"
:70 INPUT A$
:80 IF A$="R" THEN 200
:90 IF A$="W" THEN 300
:100 IF A$="T" THEN 400
:110 IF A$="Q" THEN 500
:120 PRINT "Bad input":GOTO 70

That works fine but is a bit verbose. Here is the idiomatic way to do it in Wang BASIC. Lines 80 and 90 are the important ones.

    10 DIM A$1
    20 PRINT "Which action?"
    30 PRINT "   (R)ead"
    40 PRINT "   (W)rite"
    50 PRINT "   (T)est"
    60 PRINT "   (Q)uit"
    70 INPUT A$
    80 ON POS("RWTQ"=A$) GOTO 200,300,400,500
    90 PRINT "Bad input":GOTO 70
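
The same position-as-index idea, sketched in Python for readers who don't have a 2200 handy. The function names and the handler labels are made up; the point is only that a 1-based position of 0 means "no match" and falls through to the error case, just as ON ... GOTO does.

```python
def pos(haystack: str, ch: str) -> int:
    """1-based position of the first match, or 0 if there is none (like POS)."""
    return haystack.find(ch) + 1

def dispatch(choice: str) -> str:
    """Pick a handler by position, mirroring ON POS(...) GOTO 200,300,400,500."""
    targets = ["read", "write", "test", "quit"]   # stand-ins for lines 200..500
    if len(choice) != 1:          # A$ is dimensioned to exactly one character
        return "bad input"
    n = pos("RWTQ", choice)
    if n == 0:                    # ON 0 GOTO ... falls through to the next line
        return "bad input"
    return targets[n - 1]
```

Adding a menu choice means adding one letter to the string and one target to the list, rather than another IF line.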

Here is another slightly more complicated example, which works for BASIC-2. It would need some modifications for Wang BASIC. The user inputs a three digit disk drive address. Of course only certain addresses are valid, and this short bit of code detects if the address is legal or not, and if legal, the disk address is returned in binary in STR(B$,1,2).

:10 INPUT "Which disk drive",A$
:20 MAT SEARCH "310320326330350360370B10B20B26B30B50B60B70",=STR(A$,,3) TO B$ STEP 3
:30 IF STR(B$,,2)=HEX(0000) THEN 10
:40 PRINT VAL(STR(B$,,2),2)
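
The STEP 3 part matters: without it, a search could match across the boundary of two table entries. Here is a Python sketch of a stepped search. The function name is made up, and it assumes the search reports a 1-based offset into the table, with 0 for "not found" to mirror the HEX(0000) test on line 30; it is not a full model of MAT SEARCH.

```python
TABLE = "310320326330350360370B10B20B26B30B50B60B70"

def search_step(table: str, key: str, step: int) -> int:
    """Return the 1-based offset of key at a step-aligned position, else 0."""
    for start in range(0, len(table) - len(key) + 1, step):
        if table[start:start + len(key)] == key:
            return start + 1
    return 0
```

For example, "103" does occur in the table text (it straddles "310" and "320"), but `search_step(TABLE, "103", 3)` returns 0 because only offsets 1, 4, 7, and so on are examined.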

#29 – Detecting the Screen Size (link)

The first Wang machines had a 64x16 screen display; later an 80x24 option was developed. Although programs can declare how wide the display should be via a SELECT PRINT 005(64) or a SELECT PRINT 005(80), how would a program know which one to specify? Even the Wang BASIC microcode needs to figure it out so it can set a sensible default width value.

The solution is not at all obvious. The 64x16 display controller card is write-only. PRINT statements direct a byte stream to the card and hardware on the card is responsible for making the right thing happen. There is no status which can be read from the card. The designers of the 80x24 display card added a bit of logic to respond to read requests to pull bit 5 of the input bus low when the card is addressed. Because the 64x16 display controller logic doesn't respond to read requests, that bit remains pulled high. Thus reading the card and looking at that one bit can be used to tell the two apart.

:10 REM determine if this display is 80x24 or 64x16
:20 $GIO /005 (7601, A$):A$=A$ AND HEX(10)
:30 IF STR(A$,1,1)=HEX(10) THEN W=80:ELSE W=64
:40 PRINT "The screen is";W;"characters wide"

Unfortunately, Wang BASIC doesn't support the $GIO command 7601, so this works only under BASIC-2. I'm not sure if Wang BASIC can determine the display width.

#30 – Boolean Weirdness (link)

One of the interesting features of Wang BASIC, and one which as far as I know is unique to it, is its set of string binary operators. As a refresher, Wang BASIC has commands such as:

:100 AND(A$,B$)

This function does a bit-wise logical AND on the corresponding bytes of A$ and B$, saving the result back in A$. There are also functions for OR, XOR, a generic BOOLn operator which provides 16 different options, as well as byte-wise add with and without carry, and bitwise rotation.

BASIC-2 preserves those operations and extends them with a new, more flexible syntax. As we shall see, that syntax can be misleading.

:10 A$=ALL(07)
:20 B$=ALL(11)
:50 A$=A$ XOR B$

Lines 10 and 20 initialize the strings A$ and B$ to all HEX(07) and HEX(11), respectively, as shown by the output of lines 30 and 40. Line 50 exclusive-ors each byte of the A$ string to the corresponding byte of the B$ string and saves it back to the A$ string. The boolean operators are AND, OR, XOR, and BOOL0 through BOOLF. The BOOLh operators can perform any of the 16 logically possible functions of two boolean values. In that regard, AND, OR, and XOR are simply shorthand versions of BOOL8, BOOLE, and BOOL6 respectively.
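
To see why the hex digit after BOOL can name any two-input function, think of it as a four-entry truth table. The Python sketch below assumes one particular assignment of truth table rows to the bits of h (bit 2x+y holds the output for inputs x and y). That assignment reproduces the AND=8, OR=E, XOR=6 equivalences given above, but for the asymmetric functions the real operand order may differ, so treat it as an illustration only.

```python
def boolh(h: int, a: int, b: int) -> int:
    """Apply two-input boolean function number h (0..15) bitwise to bytes a, b."""
    result = 0
    for bit in range(8):
        x = (a >> bit) & 1
        y = (b >> bit) & 1
        # Assumed encoding: bit (2*x + y) of h is the output for inputs (x, y).
        out = (h >> (2 * x + y)) & 1
        result |= out << bit
    return result

# The example from the listing: HEX(07) XOR HEX(11)
example = boolh(0x6, 0x07, 0x11)
```

With this encoding BOOL0 always yields 00 and BOOLF always yields FF, the two trivial functions at either end of the sixteen.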

There are some other operators which are not boolean, but have a similar structure, and these operate byte by byte as well. These are ADD, ADDC, SUB, SUBC, DAC, and DSC. We'll ignore those non-boolean operators for now and focus on some of the unexpected features using the boolean operators only.

First up, the first operand can be omitted. In this case, it may seem that what is going on is the left operand of the operator is assumed to be the same as the receiving variable. For instance, the following code will produce exactly the same result as the previous example.

:10 A$=ALL(07)
:20 B$=ALL(11)
:50 A$=XOR B$

That is a bit unusual in that arithmetic statements don't allow such a shortcut, but it is easy enough to understand. However, picking at the corners we find out something else is going on.

:10 A$=ALL(07)
:20 B$=ALL(11)
:30 C$=ALL(82)
:40 A$=A$ XOR B$ OR C$

Here the result is the same as ((A$ XOR B$) OR C$). It is nice that the operators can be chained like that. Parentheses are not allowed, however. OK, let's go a bit further.

:10 A$=ALL(01)
:20 A$=A$ XOR A$ XOR A$

Hey, wait a second. (0x01 ^ 0x01 ^ 0x01) should be 0x01. What is going on? It turns out that the syntax is misleading. In arithmetic statements, the values to the right of the equals sign are evaluated, and the resulting value is assigned to the receiving variable on the left. But that is not how the string operators work!

Instead, each operator works on the receiving variable before the next operator is evaluated. So this expression:

:20 A$=A$ XOR A$ XOR A$

is actually executed in three phases like this:

:20 A$=A$:A$=A$ XOR A$:A$=A$ XOR A$

Now we can see why the first operand can be skipped if it is the same as the receiving variable; such an assignment just wastes time. We can also see that A$'s value changes to all 00 after the first XOR, so when the second XOR is performed, it is now XOR'ing with 00 instead of 01.
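
Here is a small Python sketch contrasting ordinary expression evaluation with the operate-on-the-receiver behavior described above. The function names and the dictionary of variables are made up for the illustration; only XOR is modeled.

```python
def xor_bytes(x: bytes, y: bytes) -> bytes:
    """Byte-wise exclusive-or of two equal-length strings."""
    return bytes(p ^ q for p, q in zip(x, y))

def conventional(env, first, rest):
    """What the syntax suggests: evaluate the right side, then assign once."""
    acc = env[first]
    for name in rest:
        acc = xor_bytes(acc, env[name])
    return acc

def basic2_style(env, receiver, first, rest):
    """What happens: copy the first operand, then apply each operator in place."""
    env[receiver] = env[first]
    for name in rest:
        # Each step rereads the operand, which may be the receiver itself.
        env[receiver] = xor_bytes(env[receiver], env[name])
    return env[receiver]

env = {"A$": bytes([0x01]) * 16}                 # A$=ALL(01)
expected = conventional(env, "A$", ["A$", "A$"])
actual = basic2_style(env, "A$", "A$", ["A$", "A$"])
```

The first function yields all 01 for `A$=A$ XOR A$ XOR A$`; the second yields all 00, matching what the machine does.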

In Wang BASIC, the left side of an assignment can be a list of variables. The right of the equals sign is evaluated, and the resulting value is assigned to each of the variables in the list on the left hand side. For example,

:10 A,B=7
:30 A,B=A-5
:50 A$,B$,C$="test"
:60 PRINT A$,B$,C$
 7               7
 2               2
test            test            test

Using this in combination with the boolean operators, things get a bit weirder.

:10 A$=ALL(01)
:20 B$=ALL(02)
:30 A$,B$,C$=A$ XOR B$

01 xor 02 should produce the value 03, which does end up in C$, but A$ and B$ have different values. Unlike the numerical multiple assignment case, where the right side is evaluated and then assigned to each receiver on the left, here the interpreter performs the right hand set of operations on each of the receivers in turn, in reverse order, starting with the last receiver in the list and moving towards the first. It is as if the interpreter sees the following sequence of statements:

   :30 C$=A$:C$=C$ XOR B$ --> 01 XOR 02 --> 03
   :31 B$=A$:B$=B$ XOR B$ --> 01 XOR 01 --> 00
   :32 A$=A$:A$=A$ XOR B$ --> 01 XOR 00 --> 01

What if we swap the receiving variables? Is this interpretation correct?

:10 A$=ALL(01)
:20 B$=ALL(02)
:30 B$,A$,C$=A$ XOR B$

Again, pretend that the statement was duplicated for each receiver, in reverse order:

   :30 C$=A$:C$=C$ XOR B$ --> 01 XOR 02 --> 03
   :31 A$=A$:A$=A$ XOR B$ --> 01 XOR 02 --> 03
   :32 B$=A$:B$=B$ XOR B$ --> 03 XOR 03 --> 00
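
The "pretend the statement was duplicated for each receiver, in reverse order" model is easy to check mechanically. The Python sketch below implements just that model (the function names are made up) and reproduces both expansions shown above; it is a model of the observed results, not of the interpreter's internals.

```python
def xor_bytes(x: bytes, y: bytes) -> bytes:
    """Byte-wise exclusive-or of two equal-length strings."""
    return bytes(p ^ q for p, q in zip(x, y))

def multi_assign(env, receivers, first, ops):
    """Apply `first XOR ops...` to each receiver in turn, last receiver first."""
    for r in reversed(receivers):
        env[r] = env[first]                       # e.g. C$=A$
        for name in ops:
            env[r] = xor_bytes(env[r], env[name]) # e.g. C$=C$ XOR B$
    return env

def fresh():
    """A$=ALL(01), B$=ALL(02), C$ starts out as all 00 (16 bytes each)."""
    return {"A$": bytes([1]) * 16, "B$": bytes([2]) * 16, "C$": bytes(16)}

case1 = multi_assign(fresh(), ["A$", "B$", "C$"], "A$", ["B$"])  # A$,B$,C$=A$ XOR B$
case2 = multi_assign(fresh(), ["B$", "A$", "C$"], "A$", ["B$"])  # B$,A$,C$=A$ XOR B$
```

In the first case the model leaves A$=01, B$=00, C$=03; in the second, A$=03, B$=00, C$=03, matching the two tables.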

The real lesson: The syntax seduces you into thinking the interpreter will do something very different than it actually does, so don't do this. There are some other corners here which I haven't even mentioned! For instance, the receivers and operands can use whole array notation, eg,

:50 A$() = A$() XOR B$()

so an entire array can be operated on at once. But what if the receiver was a compound assignment like this:

:50 A$(),C$ = A$() XOR C$

where A$ is a 1000 character array and C$ is a 16 character simple string? Or what if the source and destination overlap? The ADDC and SUBC operators work from the end of the string to the front, but it seems the others operate from the left to the right.

:10 A$=ALL(01)
:20 STR(A$,2,15)=XOR STR(A$,1,15)
:40 A$=ALL(00):STR(A$,16,1)=HEX(01)
:50 STR(A$,1,15)=ADD STR(A$,2,15)


#31 – Recovering a Scrambled Program (link)

This trick comes from Joaquin "Elio" Fernandez, of Sisteco, Argentina.

Wang BASIC had an option to "protect" a program when saved to tape or disk. Such programs could be loaded and run, but they couldn't be listed. However, the protection was trivial to crack. The actual image of the program on tape or disk was unaffected; a bit in the header blocks of the program was simply set to indicate the program was protected. Just reading the blocks and rewriting them with that bit clear allowed one to load the modified version and LIST it just fine.

BASIC-2 added a much more strongly encrypted form of protection called program scrambling. The program blocks saved to disk were "scrambled" via some encryption function. Tantalizingly, there is no explicit password, so if the encryption function was known, it would allow cracking all scrambled programs.

Elio worked for years as a Wang technician in Sisteco, Argentina, repairing a variety of Wang boards. He discovered a way to recover a scrambled program via a small bug in BASIC-2. But decrypting a whole program is a tedious, mind-numbing exercise.

First let's create an example program that we will save in scrambled mode.

:10 REM this is the contents of line 10
:20 REM and this happens to be line 20
:30 PRINT "Finally we have line 30"

     ^ERR A06
Finally we have line 30


Next we create a file which initially is empty. This is where we will eventually reconstruct the unprotected version of the program.


We load the protected program but don't run it. Then we do a LOAD RUN on the unprotected program. LOAD RUN would normally load the new program and run it, but in this case the new program is empty and it produces a P35 error, which we just ignore. Now comes the bug. Hit the EDIT then RECALL buttons and you will be presented with the first line of the protected program:


               ^ERR P35
*.âREM this is the contents of line 10_

You'll notice two "garbage" characters before the REM. These are actually the two bytes which encode the line number in BCD. In this case they are HEX(0010). For some characters you can simply look up the character in a map of all the characters and figure it out, but some are ambiguous, in which case since you are in the line editor, you can edit the line as follows:



That is, all the stuff after the two bytes was replaced by a double quote character, and before the two characters A$=" was inserted. The first four digits of the HEXPRINT output is the line number.
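
Reading the line number out of those first four hex digits is just a packed BCD decode: each nibble is one decimal digit. A minimal Python sketch follows; the function name is made up, and it assumes the most significant digits come first, as the HEX(0010) example for line 10 suggests.

```python
def bcd_line_number(two_bytes: bytes) -> int:
    """Decode a 2-byte packed-BCD line number, e.g. HEX(0010) -> 10."""
    if len(two_bytes) != 2:
        raise ValueError("a line number is exactly two bytes")
    number = 0
    for byte in two_bytes:
        hi, lo = byte >> 4, byte & 0x0F
        if hi > 9 or lo > 9:
            raise ValueError("not a valid BCD digit")
        number = number * 100 + hi * 10 + lo
    return number

line_10 = bcd_line_number(bytes.fromhex("0010"))
```

In other words, the four hex digits printed by HEXPRINT can be read directly as the decimal line number, with no conversion needed.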

OK, so now you have the line number and the text of the line (you wrote it down before editing it, right?). Now retype the line; make sure it is exactly the same number of bytes as the original, otherwise the subsequent steps will fail:

:10 REM this is the contents of line 10
:SAVE F "A10"

Now repeat the procedure, but this time load "A10" instead of the empty "A" file.


*. REM and this happens to be line 20

This time there is no error (because the program is non-empty), but hitting EDIT RECALL brings up line 20 of the protected program. Once again, write down, exactly, what is in the line, and use the HEXPRINT trick to figure out the line number. Then enter line 20, and save the program again, this time as "A20". By giving each version a new name it is possible to restart from an earlier point if you messed it up. Of course you will fill up your disk, so you can delete the older ones once you have made progress.

Just repeat this procedure one more time and you'll end up with the third and final line, and "A30" will be a copy of the original "SECRET" file but without protection.

One complicating factor: when you do "LOAD RUN" to bring in the unprotected version on top of the protected version, it will actually attempt to execute the program, which might cause complications. So it is probably best to save the real first line somewhere, and replace it with "10 STOP:REM xxxx", using enough "x" padding to take up exactly as much space as the original line (keep in mind that atoms take a single byte). The whole trick relies on the unprotected version ending exactly on the right byte before the first undecoded line of the protected version, such that EDIT/RECALL will bring up that next scrambled line.

Note that the script that comes with the Wang 2200 emulator has the ability to directly dump the contents of scrambled program files. Of course the trick is one needs to have a virtual disk image to use that program. It would be possible to write a BASIC program which unscrambled such files directly on a real 2200 disk.

#32 – Prime $GIO Example Program (link)

This is simply a hack for BASIC-2's version of $GIO. Enter this program into a VP or MVP:

10 DIM P$(16)16
20 P$(1)=HEX(4050 4072 4069 406D 4065 4020 406E 4075)
30 P$(2)=HEX(406D 4062 4065 4072 4073 403A 400D 400A)
40 P$(3)=HEX(4032 400D 400A 0200 0301 0B02 0400 0504)
50 P$(4)=HEX(0600 0705 1800 0F30 1BF2 0F30 1BF2 0F30)
60 P$(5)=HEX(1BF2 0F30 1BF2 0F31 1BF2 1924 1C20 E02A)
70 P$(6)=HEX(1C30 D07B 1924 1F24 E037 116C 117D 1965)
80 P$(7)=HEX(19B0 1944 19C7 1CD0 E031 1CC0 E031 1800)
90 P$(8)=HEX(1BE1 1BD1 1BC1 1B81 1B11 0F3B 1911 1C1F)
100 P$(9)=HEX(E050 0131 0F3A 1980 1C8F E050 0830 19C0)
110 P$(10)=HEX(1CCF E050 0C30 19D0 1CDF E050 0D30 19E0)
120 P$(11)=HEX(1800 1BE2 1BD2 1BC2 1B82 1B12 0F0D 42F0)
130 P$(12)=HEX(0F30 1CEF E062 1CDF E063 1CCF E064 1C8F)
140 P$(13)=HEX(E065 E066 42E0 42D0 42C0 4280 4210 0101)
150 P$(14)=HEX(1911 1E1B D079 112C 113D 1F0C D068 111E)
160 P$(15)=HEX(19C7 19E3 1CE0 E070 1CC0 E06D 1CD0 E06D)
170 P$(16)=HEX(E025 400A E025 1100 0000 0000 0000 0000)
180 $GIO /005 (P$(),R$) D$

What comes out when run?

Prime numbers:

The program tears through the early primes, but gets slower and slower as the candidate numbers get larger. If you are extremely patient, it will eventually print all primes less than 65536 before stopping. Note that the $GIO program doesn't check if there is any key press pending; to stop the program one must reset the CPU.

The hack here is that $GIO was not intended for general-purpose computing. It was meant to do mostly short sequences of commands to perform I/O with peripherals which aren't directly supported by BASIC-2 syntax. The $GIO instruction set doesn't even have an add operation.

As a comparison, a simple BASIC-2 program can find and print all prime numbers less than 1000 in under 8 seconds. This $GIO program takes 4m:20s to do the same! With a bit more code and some additional complexity, the program could be sped up a lot, but I don't really care enough to do it.

See this zip file for the $GIO program shown above, for the Python program that generated it, and for the simple BASIC-2 prime finder used in the speed comparison.

#99 – Your Code Here (link)

If you recall any bugs in Wang BASIC, surprising features, or abusive coding practices, I'd like to hear about it.