Kid Icarus Revealed: 1-1

An introduction to ROM hacking and NES architecture.

Game overview

Some time ago, I decided to do a bit of research on one of my favourite videogames: Kid Icarus. At that time, I had been working for over 2 years on a sequel to this magnificent game and its Game Boy version: a game for Windows that uses DirectX 8.1, titled “Kid Icarus: Erico”.

In parallel, I wanted to take a look at the original game’s internals. My objectives were basically:

- Make a level editor.
- Find out how the code works: locate and understand some key assembly routines, which would let me, for example, patch the ROM so that the lava does not hurt Pit, or the yellow water recharges your life in a single step, and so on.
- Translate the game into Spanish.

From now on, I’ll assume you have some knowledge of computer architecture and NES hardware and, at least, know what assembler is. Otherwise, read the notes at the end.

Game’s general overview

I started working on the editor. After spending an unhealthy number of hours, I came to understand the somewhat weird way in which level information is stored. To a programmer used to powerful systems, such as the PC, it may seem quite strange, but on a machine as limited as the NES, one cannot simply store the level directly, without any kind of compression.

The general scheme of the compression used is:

- Define 2x2 blocks (which I’ll call macros).
- Define structures made up from these macros.
- Define each screen as a list of structures, placed in a certain position, with a certain colour attribute.

To make this clearer, the macros from world 1 are shown in the image on the right. Each one is made up of 2x2 tiles. The tiles from world 1 are the following:

The tiles’ colours were chosen by me, so don’t spend much time on them. On the other hand, the macros’ colours are real, based on one of 4 possible palettes. Each level can define 4 palettes for background drawing. Looking only at the first 2 rows, the 4 possible palettes would be:

But you cannot just combine macros as you wish. To save ROM space, structures are built from these macros. Each world defines a small set of structures, and the screens are made up of them. Even if you edit the game, you can only use the structures that are already defined, or else you have to modify the structures themselves. Now on to an example:

Each level is formed by screens, meaning the area that fits on a TV screen. Levels are built by lining up screens. For example, the first 2 screens of level 2-1 are the following:

The previous screen, the second of level 2-1, is made up of 6 different structures (outlined in red). The ice platform inside the yellow rectangle might seem like a different structure, because it’s only 3 blocks wide, while the others are 4. In reality, this platform is 4 blocks wide too, but its rightmost block is hidden by the brown brick platform.

This works because the structures are drawn in a certain order. In the ROM, a screen is stored as a list of structures, and a structure drawn later may hide part of a previously drawn one. In this case, the ice platform is the 13th structure, while the brick platform is the 14th.

In total, there are 17 structures, but taken from only 6 different ones:
- 4 ice platforms 4x1
- 1 ice platform 1x1
- 3 brick platforms 4x1
- 4 stone blocks 2x4
- 4 columns 1x4

Each structure can use one of the 4 available palettes that each level defines. The structures, macros and tiles are the same for the 3 levels within a world. The fortresses are grouped apart in a different world which, by the way, uses a totally different level scheme (as should be obvious if you have played the game), since fortresses require information about the position of every room within the map and their accesses (stairs and doors). The way enemy positions are stored also changes, so deciphering the fortress system was a new challenge.

Macros

As I said before, each macro is a 2x2-tile graphic. As each tile is 8x8 pixels, a macro is 16x16 pixels, and it consists of 4 bytes stored sequentially in the ROM. Take for example the following macro. On the right you can see how it is actually made up of 4 tiles.

Each of the 4 bytes that make up a macro is a pointer to a tile. I’ll call them tile pointers. The above image shows the order in which they are stored. These pointers are 8 bits wide, meaning that they can point to a maximum of 256 tiles. Consequently, there cannot be more than 256 tiles available at the same time for backgrounds; even if there were more, they would not be accessible, because macros would not be able to point to them.

So, how is this pointer interpreted? Easy. It is the position of the tile inside the pattern table, loaded in VRAM. The pattern table is where all tiles are stored. There are 2 pattern tables in the NES. One is normally used for backgrounds, and the other one for sprites.

Remember the tiles for world 1 that I showed at the beginning of this document; they are shown again on the right. I have outlined in red the tiles used to build the macro. This is actually the pattern table used for backgrounds in world 1.

The above macro is defined in the ROM as the following 4 bytes:

7A 7A 6D 6E (hexadecimal)

If you look in the pattern table, you’ll see that 7A is the square block in the upper left corner of the macro. Then comes another 7A, which is the upper right corner. 6D and 6E are the 2 bottom tiles of the macro.

The macro is stored at ROM offset $1A308. You can go and see it :-) Open the ROM with a hex editor, go to position $1A308, and you’ll see 7A 7A 6D 6E there. I encourage you to try it if you’ve never done this. It’s cool :-)

Of course, if you change the pointers, you change the macro. It is as easy as it sounds. You just need to know the tile’s position inside the pattern table, and you’ll be able to edit the macros at will.
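The lookup just described can be sketched in a few lines of Python. This is only an illustration: the function name `read_macro` and the stand-in buffer are hypothetical, and the byte order (upper left, upper right, bottom left, bottom right) is the one described above. The data is passed in as a `bytes` object, so the same function should work on a real ROM image read from disk.

```python
def read_macro(rom: bytes, offset: int) -> dict:
    """Return the 4 tile pointers of the macro stored at `offset`.

    Assumes the order described in the text: upper left, upper right,
    bottom left, bottom right.
    """
    raw = rom[offset:offset + 4]
    if len(raw) != 4:
        raise ValueError("macro runs past the end of the data")
    return {
        "upper_left": raw[0],
        "upper_right": raw[1],
        "bottom_left": raw[2],
        "bottom_right": raw[3],
    }


# Stand-in for a ROM image: 8 zero bytes, then the example macro.
fake_rom = bytes(8) + bytes([0x7A, 0x7A, 0x6D, 0x6E])
macro = read_macro(fake_rom, 8)
print({corner: f"{tile:02X}" for corner, tile in macro.items()})
```

With an actual ROM file loaded into `rom`, the call would be `read_macro(rom, 0x1A308)`.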

Structures

As I said before:
- Each screen is made up of several structures.
- The structures are built with the macros defined for that world, as we’ve just seen.

The format in which macros are stored has already been discussed, and it is very straightforward. Structures are somewhat harder, as there are horizontal structures, vertical structures, and structures that are both. This means that the layout of the macros stored for a structure varies from one structure to another.

Go back to the image of the second screen of level 2-1. The columns are vertical structures. The platforms are horizontal structures. The purple stones are both, as they are more than one macro wide and more than one macro high, like this structure:

Structures are divided into rows. Each row consists of a certain number of macros. There can be one or more rows, each made up of one or more macros. For example, this structure from world 1 is 2x4 (2 macros per row, and 4 rows). It is stored at ROM offset $1A24C.

The format of each row is:
- The first byte holds the number of macros that make up the row.
- Then come all the macros to be drawn, from left to right. Each macro is represented by one byte, and I’ll call these bytes macro pointers.
Then another row may follow, and so on, until the byte FF is reached (which means end-of-structure-data).

Each macro pointer holds the position of the macro in the list of macros. For example, the previous structure uses the same macro throughout. This macro is at position 5A, so its macro pointer is 5A, and the first row of the structure is: 02 5A 5A

The three other rows follow, so the whole structure is:

02 5A 5A 02 5A 5A 02 5A 5A 02 5A 5A FF

Just to make sure it is clear enough, consider a structure with only one row, like the brick platform seen before. Horizontal structures like this are stored as:

- First byte holds how many macros will make the structure.
- The remaining bytes hold the macros to be drawn.
- Byte FF marks the end of the structure.

Vertical structures, like columns, that have several rows, each of which is made with a single macro, are stored like this:

- First byte is always one (width is one for vertical structures).
- Next byte holds macro to draw.
- Then comes a one, and macro, and so on (each one of these make one row).
- Byte FF marks the end of the structure, meaning there are no more rows.
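The row format above covers all three cases (horizontal, vertical, and both), so a single decoder is enough. What follows is a minimal sketch, not the game’s own routine: `parse_structure` is a hypothetical name, and the horizontal and vertical byte strings are invented examples that reuse macro pointer 5A purely for illustration. Only the 2x4 stone block comes from the text.

```python
def parse_structure(data: bytes, offset: int = 0) -> list:
    """Decode one structure into a list of rows of macro pointers."""
    rows = []
    pos = offset
    while data[pos] != 0xFF:              # FF marks end-of-structure-data
        count = data[pos]                 # number of macros in this row
        row = list(data[pos + 1:pos + 1 + count])
        if len(row) != count:
            raise ValueError("row runs past the end of the data")
        rows.append(row)
        pos += 1 + count                  # skip the count byte and the row
    return rows


stone_block = bytes.fromhex("025A5A025A5A025A5A025A5AFF")  # from the text
horizontal = bytes.fromhex("045A5A5A5AFF")                 # invented example
vertical = bytes.fromhex("015A015A015A015AFF")             # invented example

for name, raw in (("stone", stone_block), ("horizontal", horizontal),
                  ("vertical", vertical)):
    rows = parse_structure(raw)
    print(name, "rows:", len(rows), "widths:", [len(row) for row in rows])
```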

You can simply change the macro pointers to edit how a structure looks. But you can also change the number of rows, or the width. The only limitation is that you cannot change the total size in bytes that the structure had. Take the blue stones structure discussed above, which was:

02 5A 5A 02 5A 5A 02 5A 5A 02 5A 5A FF

You cannot use more bytes than those assigned to it. But you could, for example, make it 3x3, by replacing it with:

03 5A 5A 5A 03 5A 5A 5A 03 5A 5A 5A FF

But you cannot make it 3x4 by replacing it with:

03 5A 5A 5A 03 5A 5A 5A 03 5A 5A 5A 03 5A 5A 5A FF

Because the last 4 bytes would overwrite the information stored just after the structure!! Remember this if you didn’t know it before, because if you forget it and you plan on going into ROM hacking, black screens are going to come quickly ;-)

As a general rule, you CANNOT take more bytes than the ones that were originally reserved for a specific purpose in the ROM.
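A patching tool can enforce this rule before writing anything. The sketch below assumes the caller knows how many bytes were originally reserved; `patch_in_place` and the 4 trailing sentinel bytes are hypothetical, made up to stand for “whatever is stored just after the structure”.

```python
def patch_in_place(rom: bytearray, offset: int, reserved: int,
                   new_data: bytes) -> None:
    """Write `new_data` at `offset`, refusing data longer than `reserved`."""
    if len(new_data) > reserved:
        raise ValueError(
            f"new data is {len(new_data)} bytes, only {reserved} are reserved"
        )
    if offset + len(new_data) > len(rom):
        raise ValueError("patch runs past the end of the buffer")
    rom[offset:offset + len(new_data)] = new_data


original = bytes.fromhex("025A5A025A5A025A5A025A5AFF")               # 2x4, 13 bytes
three_by_three = bytes.fromhex("035A5A5A035A5A5A035A5A5AFF")         # 13 bytes
three_by_four = bytes.fromhex("035A5A5A035A5A5A035A5A5A035A5A5AFF")  # 17 bytes

# Stand-in buffer: the structure, then 4 bytes that belong to something else.
rom = bytearray(original + b"\xAA\xBB\xCC\xDD")

patch_in_place(rom, 0, len(original), three_by_three)   # same size: accepted

try:
    patch_in_place(rom, 0, len(original), three_by_four)
except ValueError as error:
    print("rejected:", error)
```

The 3x3 version is accepted because it is exactly 13 bytes long, while the 3x4 version is rejected before it can touch the 4 bytes that follow.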

I’ll give you the list of macros and structures for every world later on, to save you some precious time. Just remember that if you edit a macro or structure, it will change every time it appears in the game. It is definitely not possible to change just one particular occurrence of a structure in the game, due to the compression scheme I’m discussing.

Screens data

Now that we know how structures are made, we need to know how they are arranged to make screens. Rooms consist of a single screen, but levels are made up of several screens, one after another.

If you go back to the “general overview” section, and have a look at the first two screens of level 2-1, you’ll probably notice that one of them has more structures than the other. And you’ll be right. The first screen has 21 structures, while the second screen has only 17.

Macros are always 4 bytes. Structure data has a variable size, depending on the number of rows and columns. Screen data is also variable in size, because you can place a different number of structures in it.

Let’s go into the screen data format.

The first byte is a complete mystery to me. I don’t really know what it is supposed to do. Not all screens start with the same byte. Possible values seem to be 00, 01, 02 and 03 or, in other words, only the lower 2 bits seem to be used. But the most interesting thing is that changing it does not apparently affect the game. For now, we’ll ignore this byte.

After this magic byte comes the list of structures that will be on the screen. For each structure, 3 bytes are used.

- Byte 0 is divided into its upper 4 bits and lower 4 bits (Y/X position):
The higher nibble is the Y coordinate of the structure.
The lower nibble is the X coordinate of the structure.
- Byte 1 is the structure pointer.
- Byte 2 is the palette used to display the structure. As there are only 4 possible palettes, only the lower 2 bits are used, so this byte can be 00, 01, 02 or 03.

This way, a screen with 17 structures will take up 3*17 = 51 bytes, plus the first magic byte. We also have to add two more bytes, used to mark the end of the screen data. These bytes are always FD FF, and all screens end with them. So, the total number of bytes needed for a 17-structure screen is 51 + 1 + 2 = 54 bytes.
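The screen format can be decoded along the same lines. This is a hedged sketch: `parse_screen` and `screen_size` are hypothetical names, the sample bytes are invented rather than taken from the ROM, and the loop assumes that no structure entry ever begins with the two bytes FD FF.

```python
def parse_screen(data: bytes, offset: int = 0):
    """Decode one screen: first byte, structure entries, total size in bytes."""
    first_byte = data[offset]             # purpose unknown, kept as-is
    pos = offset + 1
    entries = []
    while data[pos:pos + 2] != b"\xFD\xFF":   # FD FF ends the screen data
        position, structure, palette = data[pos:pos + 3]
        entries.append({
            "y": position >> 4,           # higher nibble
            "x": position & 0x0F,         # lower nibble
            "structure": structure,
            "palette": palette & 0x03,    # only the lower 2 bits are used
        })
        pos += 3
    return first_byte, entries, pos + 2 - offset


def screen_size(structure_count: int) -> int:
    """First byte + 3 bytes per structure + the FD FF end mark."""
    return 1 + 3 * structure_count + 2


# Invented sample: first byte 01, two structures, then FD FF.
sample = bytes.fromhex("01 340201 A00503 FDFF")
first_byte, entries, size = parse_screen(sample)
print(first_byte, entries, size)
print(screen_size(17))
```

`screen_size(17)` reproduces the 54 bytes worked out above.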

Well, we now know how a screen is built!
From tiles we make macros, all of them being 2x2 tiles.
From macros we make structures, with different sizes.
From structures we make screens, with different number of structures.
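Putting the layers together, a screen can be expanded into a grid of macros by drawing its structures in list order, so that later ones overwrite earlier ones. The sketch below rests on several assumptions that the text does not confirm: X and Y are taken to be in macro units, the grid is 16x16 because a nibble holds 0 to 15, anything outside the grid is simply clipped, and the structure numbers and macro values are invented. `draw_screen` is a hypothetical name.

```python
def draw_screen(entries, structures, width=16, height=16):
    """Expand a screen into a grid of (macro, palette) cells.

    `entries` are dicts with x, y, structure and palette keys, in list order.
    `structures` maps a structure number to its rows of macro pointers.
    """
    grid = [[None] * width for _ in range(height)]
    for entry in entries:                       # list order is draw order
        rows = structures[entry["structure"]]
        for dy, row in enumerate(rows):
            for dx, macro in enumerate(row):
                x, y = entry["x"] + dx, entry["y"] + dy
                if x < width and y < height:    # clipping is an assumption
                    grid[y][x] = (macro, entry["palette"])
    return grid


ICE, BRICK = 0x10, 0x20                         # invented macro pointers
structures = {0: [[ICE] * 4], 1: [[BRICK] * 4]}  # two 4x1 platforms
entries = [
    {"x": 2, "y": 5, "structure": 0, "palette": 1},   # ice, drawn first
    {"x": 5, "y": 5, "structure": 1, "palette": 2},   # bricks, drawn later
]

grid = draw_screen(entries, structures)
visible_ice = sum(1 for cell in grid[5] if cell and cell[0] == ICE)
print("ice blocks still visible:", visible_ice)
```

This mirrors the ice platform from level 2-1: it is stored 4 blocks wide, but only 3 remain visible once the brick platform is drawn over its rightmost block.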

But there’s something left. What we know is OK for rooms, but we need to know how screens are combined to make up a level. We’ll take a look at this now, but the hard job is done.

Levels

Basically, a level is a sequence of screens. In the ROM, the level data is stored in a pointer table, with one pointer for each screen, so I’ll call them screen pointers. BUT they are different from the pointers seen previously. First of all, they are not 1-byte pointers, like tile pointers and macro pointers. Instead, they are 16-bit pointers, meaning that each one takes up 2 bytes.

Level 1-1 has 12 screens, so its level data is a sequence of 12 pointers, that is, 24 bytes.

Level Pointers

Now, what do these pointers point to? If you recall, tile pointers hold the ordinal position of the tile inside the pattern table. Macro pointers hold the ordinal position of the macro in the macro list defined for that world. Screen pointers hold the start address, in NES memory, of the screen data. So these are what are commonly referred to as pointers (or memory pointers) in the programming and assembler world. I assume you are familiar with pointers.

As I said, each screen pointer points to the address where the screen is stored at that time in NES memory, which is NOT the same as the ROM address. Remember that NES games normally use mappers, and Kid Icarus is one of those. It uses the fairly common MMC1 mapper. The NES can only address 32 KB of program ROM at a time, so Kid Icarus, whose ROM is 128 KB in size, is broken up into 16 KB chunks called banks. At any given time, only 2 banks are accessible.

In Kid Icarus, the last bank is always mapped, so you always have access to it. The MMC1 mapper is in charge of mapping one of the remaining 7 ROM banks into the other bank slot of the NES memory map. When the game needs to access another part of the ROM, it has to ask the MMC1 to map the bank in which it is stored.

So, the MMC1 swaps banks in and out of the limited memory map. This means that only one switchable bank of the ROM is accessible at a time, along with the last bank, which is always mapped.

Going back to level pointers

The levels table for World 1 is:


0001a540h: 1A 7C 34 7C 52 7C 49 78 C5 77 49 78 C5 77 49 78
0001a550h: 10 78 49 78 10 78 49 78 10 78 49 78 10 78 49 78
0001a560h: 10 78 49 78 10 78 49 78 49 78 BD 70 0B 71 3E 71
0001a570h: 80 71 80 71 A4 71 EC 71 3A 72 73 72 A3 72 D3 72
0001a580h: 00 70 FF FF 09 73 4B 73 81 73 A5 73 CF 73 F9 73
0001a590h: 23 74 56 74 7D 74 A4 74 C5 74 E0 74 F8 74 39 70
0001a5a0h: FF FF 2E 75 3E 71 76 75 97 75 C4 75 0C 76 39 76
0001a5b0h: 39 76 81 76 97 75 B1 76 B1 76 DB 76 DB 76 DB 76
0001a5c0h: 17 77 81 76 4D 77 8C 77 78 70 FF FF

Level 1-1 starts at the first marked pointer (BD 70, at ROM offset $1A56A). Pointers are stored low byte first, so it points to NES address $70BD, where the first screen’s data will be located. The next 16-bit word points to the second screen, and so on, until $FF $FF is reached, which means “end of level”. You can see that level 1-1 is composed of 12 pointers, 2 bytes each. The next marked pointer (09 73, at $1A584) points to the first screen of level 1-2, and the last marked pointer (2E 75, at $1A5A2) points to the first screen of level 1-3.

The data before the first marked pointer is different.
The first 6 bytes are 16-bit pointers too, which I call start level pointers. They point to the beginning of each level’s screen pointers.
The first start level pointer, $7C1A, points to the first pointer of level 1-1, which we already know is the first marked pointer. That is, to the beginning of the level.
The second start level pointer, $7C34, does the same for level 1-2, and the third start level pointer, $7C52, for level 1-3.
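The table can be walked mechanically. The sketch below embeds the dump shown above and reads the three levels one after another, starting at the first screen pointer of level 1-1, which sits at offset $2A within the dump (ROM $1A56A). The helper names `word` and `read_screen_pointers` are hypothetical. As a cross-check, the distance between two consecutive start level pointers should equal the size of the level in between, including its FF FF end mark.

```python
DUMP_ROM_OFFSET = 0x1A540
dump = bytes.fromhex("""
1A 7C 34 7C 52 7C 49 78 C5 77 49 78 C5 77 49 78
10 78 49 78 10 78 49 78 10 78 49 78 10 78 49 78
10 78 49 78 10 78 49 78 49 78 BD 70 0B 71 3E 71
80 71 80 71 A4 71 EC 71 3A 72 73 72 A3 72 D3 72
00 70 FF FF 09 73 4B 73 81 73 A5 73 CF 73 F9 73
23 74 56 74 7D 74 A4 74 C5 74 E0 74 F8 74 39 70
FF FF 2E 75 3E 71 76 75 97 75 C4 75 0C 76 39 76
39 76 81 76 97 75 B1 76 B1 76 DB 76 DB 76 DB 76
17 77 81 76 4D 77 8C 77 78 70 FF FF
""")


def word(data: bytes, pos: int) -> int:
    """Read a 16-bit value stored low byte first."""
    return data[pos] | (data[pos + 1] << 8)


def read_screen_pointers(data: bytes, pos: int):
    """Collect screen pointers from `pos` up to the FF FF end-of-level mark."""
    pointers = []
    while word(data, pos) != 0xFFFF:
        pointers.append(word(data, pos))
        pos += 2
    return pointers, pos + 2              # position just after FF FF


start_level_pointers = [word(dump, i) for i in (0, 2, 4)]

pos = 0x1A56A - DUMP_ROM_OFFSET           # first screen pointer of level 1-1
levels = []
for _ in range(3):
    pointers, pos = read_screen_pointers(dump, pos)
    levels.append(pointers)

for name, pointers in zip(("1-1", "1-2", "1-3"), levels):
    print(name, len(pointers), "screens, first at", f"${pointers[0]:04X}")
```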

Now notice that the first marked pointer is stored at ROM address $1A56A, yet the first start level pointer holds $7C1A, when it should be pointing to the marked pointer. As we’ll see next, the ROM address is one thing, and the real NES address at which things are located is another.

Briefly:
For World 1, all world data is copied to an expansion RAM chip inside the cart, starting at NES address $7000. World 1 data starts at ROM address $19950, so the byte stored there will be available at NES address $7000. As ROM address $1A56A is $C1A bytes away from $19950, the byte stored there will also be $C1A bytes away from $7000, which leads to the real NES address $7C1A.
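The translation is a fixed offset, which can be written down directly. This sketch only applies to World 1 data, using the two addresses given above; the function names are hypothetical, and other worlds would presumably need their own ROM start address.

```python
WORLD1_ROM_START = 0x19950   # where World 1 data begins in the ROM
WORLD1_NES_START = 0x7000    # where that data is copied to in NES memory


def rom_to_nes(rom_address: int) -> int:
    """Translate a World 1 ROM address into its NES address."""
    return rom_address - WORLD1_ROM_START + WORLD1_NES_START


def nes_to_rom(nes_address: int) -> int:
    """Translate a World 1 NES address back into its ROM address."""
    return nes_address - WORLD1_NES_START + WORLD1_ROM_START


print(f"${rom_to_nes(0x1A56A):04X}")   # the first pointer of level 1-1
print(f"${nes_to_rom(0x70BD):05X}")    # first screen of level 1-1, in the ROM
```

Going the other way is what a level editor needs: a screen pointer such as $70BD tells you where in the ROM file to look for that screen’s data.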

Understanding how the NES works, and being able to do these memory translations, is vital for ROM disassembling. And so, I’ll be talking a bit about this now.

Maybe you noticed that after the 3 start level pointers there are 36 bytes which I did not mention. Forget about them for the moment. I’m currently researching them, and it’s been quite hard. For now, I’ll just say that they have to do with which room to load when we enter a door during a level.

David Senabre Albujer. 2006