People who worked on TOS are very clever.
(Originally published in Atariscne news)
Sure, it’s very easy to make fun of them. We all know the shortcomings of The Operating System (which is what the acronym stands for, according to a recent Leonard Tramiel interview): allegedly slow, allegedly buggy, gets in the way when you want to write games or demos. We’ve all heard (or said) things like these at some point in our lives.
But, without going into much detail, as someone who’s spent quite some time poring over the TOS source code, I can say that I’ve been very pleasantly surprised by it. For sure, any programmer who gets seriously close to the ST hardware can beat TOS on speed in almost every respect. But what people don’t realise is that the TOS team had to stuff in as much generic functionality as they could (because that’s what an Operating System does: it sacrifices speed for reusability and generality), inside a limited amount of space, on an unknown platform that was being developed as they were programming it, in a very limited amount of time (it was reported that they shipped v1.0 in 6 months).
One person from that team was Dave Staugas. You might recognise him from a humble art package he developed.
But this is not the focus of this article. Instead we’re going to talk about this little snippet coming from a hex dump of TOS:
1C 1E 00 00 1E 1E 20 20 44 61 76 65 20 53 74 61 Dave Sta
55 67 61 73 20 6C 6F 76 65 73 20 42 65 61 20 48 Ugas loves Bea H
61 62 6C 69 67 20 4E 75 08 00 00 01 67 2C 3D 79 ablig Nu g,=y
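If you want to reproduce the text column of a dump like this yourself, here is a minimal Python sketch. The hex bytes are copied from the three rows above; the helper name `ascii_column` and the choice of `.` for non-printable bytes are my own conventions, not anything taken from TOS or a particular hex viewer.

```python
# Hex bytes copied from the three dump rows above.
ROWS = [
    "1C 1E 00 00 1E 1E 20 20 44 61 76 65 20 53 74 61",
    "55 67 61 73 20 6C 6F 76 65 73 20 42 65 61 20 48",
    "61 62 6C 69 67 20 4E 75 08 00 00 01 67 2C 3D 79",
]

def ascii_column(row: str) -> str:
    """Render one dump row as text, showing '.' for non-printable bytes."""
    data = bytes.fromhex(row)  # fromhex() ignores the spaces between bytes
    return "".join(chr(b) if 0x20 <= b < 0x7F else "." for b in data)

for row in ROWS:
    print(row, " ", ascii_column(row))
```

The message starts in the middle of the first row and runs into the third, which is why it is easy to spot when scrolling through the ROM in a hex viewer.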
I’m sure everyone has heard or seen this hidden message. The question we’ll try to answer is: how did he pull it off? Because even back then, if someone tried to hide a message the naive way, it’d get found out by others and then removed.
Let’s take this step by step. In assembly language the naive way to include a message like this would be to just include the string, like so:
dc.b "Dave StaUgas loves Bea Hablig"
And just be done with it. But again, this is easily found out by others at any point. A second thing to try would be a bit more sneaky:
dc.b $44, $61, $76, $65, $20, $53, $74, $61, $55, $67, $61, $73, $20, $6C, $6F, $76, $65, $73, $20, $42, $65, $61, $20, $48, $61, $62, $6C, $69, $67, $20
So pretty much just grab the hex values of the message and include those. But, again, this is too easy - people in the 80s used to be much more hardcore about code, so this would very easily be recognised by others as ASCII characters and be removed.
This is what Dave did (note: this is wild speculation on my side, but I’m confident that this is pretty much what happened). Firstly, he found a really big file that had gnarly code and that probably nobody wanted to touch for fear of breaking something. The file in question is responsible for printing text in GEM - and if you’ve seen the capabilities of GEM text printing, you can imagine the complexity. I’ll just say that the file is about 2500 lines, which was gigantic for the machines of the time.
He then jumped to somewhere around the middle of the file and found a good spot for the code, next to some tables:
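For the curious, the hex version of the string can be generated rather than typed by hand. This is only an illustrative sketch: the helper name `to_dc_b` is made up, and the trailing space in `MESSAGE` is there because the `dc.b` line above ends with `$20`.

```python
# The message, including the trailing space that the dc.b line above ends with.
MESSAGE = "Dave StaUgas loves Bea Hablig "

def to_dc_b(text: str) -> str:
    """Format a string as a dc.b line of hex byte values ($xx, 68000 assembler style)."""
    return "dc.b " + ", ".join(f"${b:02X}" for b in text.encode("ascii"))

print(to_dc_b(MESSAGE))
```

Anyone who knows that `$41`-`$5A` and `$61`-`$7A` are letters will read a line like that almost as easily as the quoted string, which is exactly the problem.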
wrmappin: dc.b op0,op0,op3,op3 ; replace mode
dc.b op4,op4,op7,op7 ; transparent mode
dc.b op6,op6,op6,op6 ; XOR mode
dc.b op1,op1,o_D,o_D ; inverse transparent mode
(although to be honest, any place in that file would do)
Finally, he typed in an innocent piece of code:
blt_adj: move.l -(a0),d0
neg.w -(a1)
moveq.l #$65,d3
movea.l (a3),a0
moveq.l #$61,d2
subq.w #2,-(a7)
bsr blt_adj+$81
movea.l $6F76(a4),a0
bcs blt_adj+$87
movea.l d2,a0
bcs blt_adj+$79
movea.l a0,a0
bsr blt_adj+$7E
bge blt_adj+$87
beq blt_adj+$40
rts
On the surface, this code looks legit: it has an entry point and a seemingly reasonable name “blt_adj”. “Oh, it probably adjusts the blit, no probs”. And he also returns from that subroutine with that rts in there. It looks the part.
But the code is complete garbage. If called it will very likely crash. But let’s assemble it and observe the output:
1148 000006F8 2020 blt_adj: move.l -(a0),d0
1149 000006FA 4461 neg.w -(a1)
1150 000006FC 7665 moveq.l #$65,d3
1151 000006FE 2053 movea.l (a3),a0
1152 00000700 7461 moveq.l #$61,d2
1153 00000702 5567 subq.w #2,-(a7)
1154 00000704 6173 bsr blt_adj+$81
1155 00000706 206C6F76 movea.l $6F76(a4),a0
1156 0000070A 6573 bcs blt_adj+$87
1157 0000070C 2042 movea.l d2,a0
1158 0000070E 6561 bcs blt_adj+$79
1159 00000710 2048 movea.l a0,a0
1160 00000712 6162 bsr blt_adj+$7E
1161 00000714 6C69 bge blt_adj+$87
1162 00000716 6720 beq blt_adj+$40
1163 00000718 4E75 rts
Now, let’s see the hex codes from the TOS ROM again:
$20, $20, $44, $61, $76, $65, $20, $53, $74, $61, $55, $67, $61, $73, $20, $6C, $6F, $76, $65, $73, $20, $42, $65, $61, $20, $48, $61, $62, $6C, $69, $67, $20
Looks familiar? That’s because they are the exact same bytes. So what our friend Dave did there was to find valid instructions from the 68000 processor that, when assembled, produce the ASCII characters he wanted to encode. Of course, being a mere human, he couldn’t get it 100%, so his surname is spelled “StaUgas” with a capital “U”. He got the rest spot on though.
Another thing worth mentioning here. In hindsight one could very easily detect this snippet of code because it has no comments. The rest of the 2500 lines of code are very meticulously commented, because as I said the file contains very gnarly code, and the comments must have helped a lot whenever someone had to pick it up and debug something. Under this lens, the above piece of code sticks out like a fly in milk.
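To double check the claim, here is a small Python sketch that joins the opcode words of the routine and decodes them as ASCII. It assumes every branch is assembled in its short (one-word) form, which is what the branch targets in the source imply. The helper names are mine, and the `moveq` and short-branch formulas follow the standard 68000 encodings. The last check is my own guess at why the “U” is a capital: a lowercase “ug” would be `$7567`, which has bit 8 set and so is not a valid `moveq`.

```python
# Opcode words of blt_adj in order, assuming short (one-word) branches.
OPCODES = ["2020", "4461", "7665", "2053", "7461", "5567", "6173",
           "206C6F76", "6573", "2042", "6561", "2048", "6162", "6C69",
           "6720", "4E75"]

code = bytes.fromhex("".join(OPCODES))
message = code[:-2].decode("ascii")  # drop the final rts ($4E75)
print(repr(message))

def moveq(value: int, dreg: int) -> int:
    """Encode moveq #value,Dn as 0111 nnn0 dddddddd."""
    return 0x7000 | (dreg << 9) | (value & 0xFF)

def is_valid_moveq(word: int) -> bool:
    """A moveq opcode has 0111 in the top nibble and bit 8 clear."""
    return (word & 0xF100) == 0x7000

def short_branch_target(offset: int, opcode: int) -> int:
    """Target of a short Bcc/BSR: opcode offset + 2 + signed 8-bit displacement."""
    disp = opcode & 0xFF
    if disp >= 0x80:
        disp -= 0x100
    return offset + 2 + disp

print(hex(moveq(0x65, 3)))                      # the "ve" word
print(hex(short_branch_target(0x0C, 0x6173)))   # bsr at blt_adj+$0C
print(is_valid_moveq(0x7567))                   # would lowercase "ug" be a moveq?
```

The odd-looking branch targets such as `blt_adj+$81` fall straight out of this: the displacement byte is simply whatever ASCII character was needed in that position.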
However, this has either never been discovered, or it has and people went “Great stuff, Dave” and let it be. Heck, I’d certainly do that too. Bea, I hope you stayed with him, just look what he did for you!
I’ll end with the same comment I wrote at the beginning: People who worked on TOS are very clever.