Runtime code relocation by Cadaver ---------------------------------- 0. Introduction Starting from Metal Warrior 4, Covert Bitops games have loaded some code during runtime, often to implement title screens or story scripting, or enemy logic that is not required at every point in the game. This way, the amount of all- time resident code in the "main load" or "engine" is kept smaller. Metal Warrior 4 and Hessian both employed a 2 KB fixed-address buffer for the loadable code; only one such code chunk would be in memory at once and would always be fully overwritten when the next chunk was loaded. In this case, the code in each chunk could be pre-assembled to the fixed address. To avoid having to define fixed maximum sizes for various data and to be able to utilize free memory better, Hessian already used dynamic memory allocation for resource data like level maps and sprites, but from Steel Ranger onward, also code is considered a dynamically allocated resource. Because now the code load address is not known beforehand, or previously loaded code can move in the memory as resource chunks are loaded or unloaded, runtime code relocation becomes necessary. 1. Relocation algorithm The principle of relocation is simple, as long as some limitations are kept. It only requires recognizing 3-byte long instructions, which contain a 16-bit absolute address. If the address is a reference to the loadable code itself, or data it has included in the same resource chunk, it must be relocated. Loadable code may also access the variables and routines of the resident engine code, or the C64's hardware registers, and these accesses need to be left alone. Handling this properly requires just a memory range check: if the address is within the dynamic allocation area, it should be relocated. There must also be a way to detect the end of the code. The BRK instruction ($00) was chosen for simplicity. When a new piece of loadable code is loaded into memory for the first time, it uses the addresses it was initially assembled to. In Covert Bitops games $8000 is used, which is guaranteed to be inside the dynamic allocation area. The relocation algorithm becomes: - Get next instruction, until BRK encountered - Is it a 3-byte long instruction? - If yes, check the address. Is it within the dynamic allocation area? - If yes, add a delta value to the address to match the code's current load address 2. Limitations The code to be relocated must be unambiguous, ie. tricks that use for example the BIT instruction to skip the next instruction can not be allowed, as this gives multiple interpretations of the instructions. Including data tables that contain addresses (either 16-bit, or separated into 8-bit low and high tables) is a problem, as they can not be detected for relocation by just going through code. The algorithm simply chooses to not support this. This can be circumvented somewhat, as the dynamic allocation system supports querying the current address of any "object" inside a resource chunk. The loadable code can define extra objects it needs, for example text strings, that it can use when executing. In Covert Bitops games, loadable code is handcrafted in assembly to correspond to the resource chunk format, as some extra data like the total data size and number of resource objects are needed. See examples from the Steel Ranger demo source code and the c64gameframework example game. 3. Optimizations A naive implementation would use a 256-byte table to store the length of every instruction. In practice, regularities in the legal instruction set can be observed. Instructions ?c - ?f are always 3 bytes, and instructions ?4 - ?7 are always 2 bytes. For the remaining instructions, 4 lengths can be packed into one byte, so the final table becomes only 32 bytes. 4. The code This is adapted from file.s in the c64gameframework project. Two zeropage pointers (ptr - the code to relocate, and relocDelta - the relocation offset to add) are used, and the instruction length table is below the code. CodeRelocLoop: ldy #$00 lda (ptr),y ;Read instruction beq CodeRelocDone ;BRK - done lsr lsr lsr bcc LookupLength and #$01 ;Opcodes xc - xf always 3 bytes ora #$02 ;Opcodes x4 - x7 always 2 bytes bne HasLength LookupLength: tax lda (ptr),y and #$03 tay lda instrLenTbl,x ;4 lengths packed into one byte DecodeLength: dey bmi DecodeLengthDone lsr ;Shift until has the wanted lsr bpl DecodeLength DecodeLengthDone: and #$03 HasLength: cmp #$03 ;3 byte long instructions need bne NotAbsolute ;relocation ldy #$02 lda (ptr),y ;Read absolute address highbyte cmp #>fileAreaStart ;If not reference to self, skip bcc NoRelocation cmp #>fileAreaEnd bcs NoRelocation dey lda (ptr),y ;Add relocation offset to the adc relocDelta ;address sta (ptr),y iny lda (ptr),y adc relocDelta+1 sta (ptr),y NoRelocation: lda #$03 NotAbsolute: clc ;Proceed to next instruction adc ptr ;(length of current in A) sta ptr bcc CodeRelocLoop inc ptr+1 bne CodeRelocLoop CodeRelocDone: rts instrLenTbl: dc.b $99,$99,$9a,$dd,$9b,$99,$9a,$dd dc.b $99,$99,$9a,$dd,$99,$99,$9a,$dd dc.b $aa,$99,$da,$dd,$aa,$99,$9a,$dd dc.b $aa,$99,$9a,$dd,$aa,$99,$9a,$dd