Just Some Thoughts

How Pitfall Builds its World

Games for the Atari 2600 were quite constrained. When Warren Robinett first pitched the idea that would become Adventure, a game where you would explore a world with many rooms and pick up items to help you along the way, he was denied because it wasn't thought feasible. And it made sense to do so. This was the late 70s; there had never been a game with multiple screens before. This was in the days of Space Invaders and Pac Man, when everything in a game was in front of the player at all times, so the fact that Adventure was able to have 30 rooms when it was finally released in 1980 was quite impressive.

screenshot of Adventure

The manual for adventure even had to explain the concept. It read

Each area shown on your television screen will have one or more barriers or walls, through which you CANNOT pass. There are one or more openings. To move from one area to an adjacent area, move "off" the television screen through one of the openings, the adjacent area will be shown on your television screen.

It was quite an innovation to have multiple rooms, and the fact that Adventure managed to have 30 was revolutionary. But Pitfall!, made by David Crane and released in 1983, had 255, all of which were much more elaborate (graphically speaking) than anything in Adventure. In this article we'll talk about how this was done.

screenshot of Pitfall!

But in order to fully appreciate the difficulty of such a feat let's note the difficulties faced by programmers for the Atari. The console itself had only 128 bytes of RAM. That's 1024 bits. For comparison, this sentence alone takes up more space if encoded in ASCII, let alone the UTF format in which it’s actually encoded*. Suffice it to say there was not much space on the Atari.

But that's no matter, surely the cartidge itself offers sufficient space, right? Well, sort of. Atari 2600 cartidges at this point generally had 4 kilobytes of ROM, the vast majority of which had to be dedicated to the actual code. If we ignore the need to store code, we could dedicate 16 bytes per room, but of course we can't just ignore the space needed to store the code.

So how did Crane get over the limited space when making the game?

Procedural Generation

The way you make a large world without storing much data is by having some code generate it for you.

The biggest problem with this, however, is that you generally need to save the data you generated. This is what games such as Rogue and Minecraft do. They randomly generate worlds in order to give variety to players, but save the data once it's generated. The limitations of the Atari do not afford this luxury.

Crane overcame this in two ways. The first was in the way he represented a room's layout in memory, and the second was the way in which he generated those representations. The way these representations are generated actually obviate the need to store anything but the current room in memory, but we'll get to that later. First we will look at how the current room is represented.

Representing the Room

Crane used a single byte to represent the layout of the current room. That may seem incredible given all that's going on in any given room, but it's actually quite simple.

The byte that holds the layout of the current room is split into four parts:

Bits 0 to 2: Objects

The first three bits control which object spawns. This is complicated by two things, both of which are controlled by bits 3 to 5.

First, a room may contain a treasure (the case if bits 3 to 5 are 101). If it does contain a treasure, then the usual item determined by the bits does not appear and the corresponding treasure is put in its place.

Second, if there are crocodiles (the case if bits 3 to 5 are 100) then no objects will spawn. Additionally, bits 0 to 2 being 010, 011, 110 or 111 cause there to be a vine that allows the player to swing over the crocs. Otherwise there will be no vine, forcing the player to jump on the heads of the crocodiles to get across.

The rules for items and treasures are

BitsItemTreasure
000one rolling logmoney
001two rolling logssilver
010two rolling logsgold
011three rolling logsring
100one stationary logmoney
101three stationary logssilver
110firegold
111snakering

(this was quite tricky to figure out).

Bits 3 to 5: Pit Type

Bits 3 to 5 control the type of pit or pits the player encounters.

BitsPit Type
000one hole in the ground
001three holes in the ground
010zero holes in the ground
011zero holes in the ground
100crocodiles in the water
101shifting tar pit with treasure
110shifting tar pit
111shifting quicksand

Shifting tar pits without a treasure (bits are 110) will always have a vine, whereas if there is a treasure (bits are 101) the shifting tar pit will not have a vine.

Bits 6 to 7: Trees

Bits 6 and 7 determine the pattern of the trees. This doesn't affect the gameplay at all, but gives the player the sense of changing locations. The tree patterns are all very similar, so I won't go into detail here, but if you want to see what they are for yourself you can look at the trees in rooms 1, 2, 3, and 5 for bit patterns of 11, 10, 00, and 01 respectively.

Bit 7: Underground Wall

Bit 7 is reused to control whether the wall in the underground is drawn on the left or right. It doesn't control whether or not there is a wall, that's elsewhere in the code, but if there is a wall, then this bit being a 0 puts the wall to the left, and this bit being a 1 puts the wall to the right.

And that's how a single byte determines the layout of the current room. But like I mentioned, only the current room is ever stored in memory. How this is made possible is by the way the rooms were generated.

Linear Feedback Shift Registers

The bytes that describe the room are generated by something that Crane called a polynomial counter, but what we now call a linear feedback shift register, or LFSR.

An LFSR is a way to generate pseudo random numbers from a seed by taking a binary number, performing a logical shift either left or right one, and then computing the input bit through a linear function of the original bits. Typically this function is a series of XORs.

As an example, let's use the LFSR in Pitfall!

When the player starts the game the room byte is set to C4 in hex (11000100 in binary, 196 in decimal). This is the seed. When the player goes one room to the right, the byte is shifted to the left, and the low bit (bit 0) becomes the XOR of bits 3, 4, 5, and 7. The formula for this is

b0b3' + b4' + b5' + b7'

Where '+' denotes XOR and the prime denotes a bit in the previous state. This pattern has the desirable property of being a maximal-length LFSR, which means that it will produce every combination of 8 bits save for all zeros. This allows the world in Pitfall! to have both the greatest number of rooms as well as an equal likelihood for any given string of bits (save for, again, all zeros).

So when you move to the right after the first room, the byte goes from 11000100 to 10001001. All the bits get shifted left, then bit 0 get set to 1, as 1 = 0 + 0 + 0 + 1.

This was implemented in 6502 assembly like so:

; room' = room << 1 | (bit3 + bit4 + bit5 + bit7)
LOOP_ROOM:
    LDA ROOM
    ASL
    EOR ROOM
    ASL
    EOR ROOM
    ASL
    ASL
    EOR ROOM
    ASL
    ROL ROOM
    DEX
    BPL LOOP_ROOM

ROOM is the byte that describes the current room. Before getting into how this works, it's important to note the last two lines, and why this is a loop. Crane wanted it so that if Pitfall Harry (the hero in Pitfall!) is in the underground, then going over a room actually transports him over three rooms. DEX decrements the X register and BPL branches if the preceding calculation wasn't negative, so Crane implemented this behavior by setting the X register to 2 before calling this subroutine if Harry was underground. Otherwise the X register is set to 0 and there's no looping.

So that's why it's a loop. The rest of the code is, as assembly code for the Atari often is, a bit dense. This isn't an article about 6502 assembly, so I won't go into too much detail, but basically what's going on is that the ASL (arithmetic shift left) commands are moving the bits into the correct places, and the EOR (exclusive or) commands are XOR-ing the bits. Finally, the ROL (rotate left) command shifts the ROOM byte to the left while inputting the carry bit into bit 0. That carry bit is a result of the previous EOR's and ASL's. And all of this together produces the desired behavior.

If we want to see every room that this generates, we can use the following 6502 assembly, which loops through the above code until the byte gets back to what it started as and stores every generated byte in order at addresses $00 to $FF (0 to 255).

    LDA #0          ; initialize address offset to 0
    TAX

define ROOM $00
define SEED $C4

    LDA #SEED
    STA ROOM

LOOP_ROOM:          ; do all the LFSR stuff
    ASL
    EOR ROOM
    ASL
    EOR ROOM
    ASL
    ASL
    EOR ROOM
    ASL
    ROL ROOM

    LDA ROOM
    INX             ; increment address offset
    STA $00,X       ; store generated byte

    CMP #SEED       ; stop if we complete a cycle
    BEQ STOP

    JMP LOOP_ROOM   ; get next room byte

STOP:
    BRK

But this doesn't get to why Crane's design was so genius. The above details what happens when you go right, but what about when you go left, back to where you came? The eight bits that describe that room were never stored in memory; only the current room is in memory. So how does Pitfall! handle going left? Well, with this LFSR:

b7b4' + b5' + b6' + b0'

What's special about this LFSR is that it is the inverse of the previous one. Every time you go left, this LFSR undoes what was last done by the LFSR used when you go right. From here on we'll refer to this LFSR as the left LFSR, and the previous one as the right LFSR.

The left LFSR was implemented like so on the 6502:

; room' = room >> 1 | ((bit4 + bit5 + bit6 + bit0) * 128)
LOOP_ROOM:
    LDA ROOM
    ASL
    EOR ROOM
    ASL
    EOR ROOM
    ASL
    ASL
    ROL
    EOR ROOM
    LSR
    ROR ROOM
    DEX
    BPL LOOP_ROOM

Let's just appreciate what Crane did for a moment. He found an LFSR that was both invertable and maximal-length. That's some impressive programming. But I won't just ask you to take my word that these two LFSRs are inverses, I'll prove it to you.

Pitfall's LFSRs are Invertable. Proof:

Consider a sequence of eight bits B = b7 b6 b5 b4 b3 b2 b1 b0. We'll use Br to denote B after applying the right LFSR and Bl to denote B after applying the left LFSR. What we want to show is that Brl = Blr = B. That is, we want to show that the result of applying the right and then the left LFSR, or the left and then the right, is the same as doing nothing.

To show that Brl = B. Recall that the right LFSR is

b0b3' + b4' + b5' + b7'

Applying this to B = b7 b6 b5 b4 b3 b2 b1 b0 we get the following:

Bit 7 Bit 6 Bit 5 Bit 4 Bit 3 Bit 2 Bit 1 Bit 0
B b7 b6 b5 b4 b3 b2 b1 b0
Br__ b6 b5 b4 b3 b2 b1 b0 b3 + b4 + b5 + b7

Then, applying the left LFSR, which we recall is

b7b4' + b5' + b6' + b0'

To Br gets us

Bit 7 Bit 6 Bit 5 Bit 4 Bit 3 Bit 2 Bit 1 Bit 0
B b7 b6 b5 b4 b3 b2 b1 b0
Br b6 b5 b4 b3 b2 b1 b0 b3 + b4 + b5 + b7
Brl___ 2(b3 + b4 + b5) + b7 = b7 b6 b5 b4 b3 b2 b1 b0

Which establishes the fact that Brl = B. Showing that Blr = B is much the same, so is left as an exercise for the reader. ∎

The above can also be verified with some simple code, so if you wish to try that here's a small JavaScript program that'll do the job. Also included is a function that lists all the rooms.

And that's how Pitfall! builds its world. A simple representation combined with an invertable linear feedback shift register.


Postscript: How I Figured All This Out

You would think all the information about a game as influential and popular as Pitfall! would be widespread and readily available. This is not the case.

The game's use of a LFSR is widely known, but how it was implemented is, as far as I can tell, not detailed anywhere save for in the actual assembly. But when I found an analalyzed and commented version of the assembly the description given for the LFSR was actually incorrect! At least, the description of the left LFSR was incorrect. But it was wrong in a fairly obvious way, and it wasn't hard to figure out the correct way.

Much more involved was figuring out how the byte was actually translated into rendering the world. Nowhere, not in any talk, any webpage, any book, or in the commented source code, was a description of what series of bits corresponded to what patterns in the room. At first I tried to go through the assembly, but assembly written for the Atari is so optimized and hacky and uses so many tricks that it became obvious to me that that would be way too much of a hassle.

So how did I do it? Well, I wrote a little program to generate the sequence of the LFSR (the JavaScript program linked to above) and I compared it to the rooms. Doing this for bit 7, which controlled the side of the screen the underground wall was drawn on, was easy, as were bits 6 and 7 controlling the trees. But for the others it was rather tedious. This map was an invaluable resource.

I'm surprised that, as far as I can tell, I'm the first to detail how exactly Pitfall! rendered its world, but I'm also kind of disappointed. If you haven't seen this GDC talk about preserving the history of games you absolutely should. Unlike many other disciplines the history of software is not being well preserved, even though it should be the easiest to preserve. We have the original source code for basically zero games for the Atari, NES, SNES, ColecoVision, you name it. Disassemblies are invaluable, don't get me wrong, but they're not the original. And they show nothing about the original comments.

We got really lucky that Microsoft released the source code for MS DOS, and maybe if we're lucky Activision and Atari and Nintendo have all their original code somewhere in a vault, which they'll release freely into the public for the good of mankind, but I'm not holding my breath. Everyone who is able should be working to preserve whatever piece of history they can, 'cause it's not gonna preserve itself.