Rando SeedInfo ARCP Structure Proposal

This page describes an alternative way of storing ARC patching instructions in a TPRando seed GCI.

Note: You will see this section of the seed GCI referred to as the ARCP section (for arc patching).
This naming is inspired by what you already see in the game files, such as RARC, J3D2, BMDR, etc.

Motivation

A smaller GCI is generally more desirable than a large one, but the GCI's block count is especially important for console players.

If a user is putting 10 seed GCIs on their memory card as advertised, increases in block size are multiplied by 10.

That is to say, an increase in block size from 3 to 4 is more like changing from 30 to 40. Likewise, shaving off a block from the size is more like shaving off 10 blocks.

So it is reasonable to shave off blocks from the size if we are able to.

(This is why I think there should be an option to generate seed GCIs which leave out the image data.)

Benefits

  • With the current data of 332 patches (288 items and 44 message indexes), we cut the Arc Patch section of the seed GCI down to approximately 27% of its current size (from 0x2980 bytes to 0xB40).
    • This will likely allow seed GCIs to be reduced to a single block (not counting the imageData/comments block).
  • Only do as many my_DVDConvertPathToEntrynum calls as absolutely necessary instead of one per patch.
    • This goes from 332 calls to roughly 124.
  • Instead of scanning through every patch every time you might want to apply one, simply scan by entryNum then immediately apply all patches once/if you find a match.
  • Supports patching arcs in ways more complex than simply 1, 2, or 4 bytes.

Trade-offs

  • Need to generate the arc entryNum lookup table at runtime (only do this once).

At a high level

When "arcA" is loaded, we check if this arc needs to be patched. If it does, we apply the appropriate patches.

This means that we need a list of arc identifiers, with each entry mapping to a list of patches to apply:

  • arcsToPatch => [arcA, arcD, arcQ, arc7, arcC, arc2, ...]

Within the list, each arc must map to a list of patches:

  • arcA => [patch0, patch1, patch2, ...]

The problem

This would be simple enough, but the problem is that we must wait until runtime to generate the arc identifiers.

For example:

  • We want to patch /res/Stage/D_MN10/R00_00.arc.
  • At runtime, we are notified that arc 000005AC was just loaded.
  • We scratch our head, because we only know the string path of the arc, not its runtime identifier.

In other words, we need this:

  • arcsToPatch => [arcA, arcD, ...]
  • arcA => [patch0, patch1, patch2, ...]

but we have this:

  • arcsToPatch => ['/res/Stage/D_MN10/R00_00.arc', '/res/Stage/D_MN01/R03_00.arc', ...]
  • '/res/Stage/D_MN10/R00_00.arc' => [patch0, patch1, patch2, ...]

Fortunately, the game uses a function to convert between the filepath and its identifier, and we can use this as well.

So essentially, the complexity comes from the above conversion which must happen at runtime.

note

We can document all of the path-to-identifier mappings so that we know them at compile time. The problem is that if the user is playing on a modified ROM (such as tpgz), the identifiers for a given path may be different.

Current structure

The current approach is fairly straightforward.

We store an array which contains one entry for each patch.

Each entry specifies:

  • the arc's filepath
  • where to apply the patch
  • the patch data to apply

Size is 0x20 bytes.

| Offset | Type | Name | Description |
| --- | --- | --- | --- |
| 0x00 | u32 | offset | The offset of the byte where the item is stored from the start of the file. |
| 0x04 | u32 | arcFileIndex | The index of the file that contains the check. |
| 0x08 | u32 | replacementValue | Used to be item (byte), but can be more now. |
| 0x0C | char[0x12] | fileName | The name of the file where the check is stored. |
| 0x1E | u8 (enum) | fileDirectoryType | The type of directory where the check is stored. |
| 0x1F | u8 (enum) | replacementType | The type of replacement that is taking place. |

Here is an example:

| Offset | Type | Name | Value |
| --- | --- | --- | --- |
| 0x00 | u32 | offset | 0x8450C |
| 0x04 | u32 | arcFileIndex | (Placeholder space which is filled at runtime by entryNum of arc file) |
| 0x08 | u32 | replacementValue | 0x42 (Ball and Chain itemId) |
| 0x0C | char[0x12] | fileName | "D_MN11/R00_00.arc" |
| 0x1E | u8 (enum) | fileDirectoryType | 0x0 (Stage) |
| 0x1F | u8 (enum) | replacementType | 0x0 (Item) |
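
To make the 0x20-byte layout concrete, here is a minimal Python sketch that packs the example entry. It assumes big-endian byte order (the GameCube's PowerPC CPU is big-endian); the function name `pack_entry` is made up for illustration and is not part of the Randomizer.

```python
import struct

# Big-endian layout of one 0x20-byte entry, following the table above:
# u32 offset, u32 arcFileIndex, u32 replacementValue,
# char[0x12] fileName, u8 fileDirectoryType, u8 replacementType.
ENTRY_FORMAT = ">III18sBB"

def pack_entry(offset, replacement_value, file_name, directory_type, replacement_type):
    """Pack one patch entry (hypothetical helper, for illustration only)."""
    return struct.pack(
        ENTRY_FORMAT,
        offset,
        0,                          # arcFileIndex placeholder, filled at runtime
        replacement_value,
        file_name.encode("ascii"),  # struct pads the name with NUL bytes
        directory_type,
        replacement_type,
    )

entry = pack_entry(0x8450C, 0x42, "D_MN11/R00_00.arc", 0x0, 0x0)
print(len(entry))  # 32
print(entry.hex(" "))
```

More than half of the printed bytes are the file name, which is the problem the rest of this page addresses.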

Problems with the current structure

The main problem is the amount of space this takes up.

The expected patch count is currently 332.

0x20 bytes per patch * 332 patches => 0x2980 bytes

Yikes! A block is only 0x2000 bytes, so we are using more than a block for this portion of the seed data alone. Surely we can do better.

How to improve

The most obvious thing to look at is the fileName, which takes up 0x12 bytes per patch. Remember, the patch is only 0x20 bytes long, so more than half of each entry is the file name. Across 332 patches, this means each seed would have 0x1758 bytes (almost three quarters of a block!) of the following:

"D_MN11/R00_00.arc","D_MN11/R00_00.arc","D_MN11/R00_00.arc","D_MN11/R00_01.arc","D_MN11/R00_02.arc",...

And yes, you would have several copies of the same string if you needed to do multiple patches to the same arc.
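
The sizes quoted in this section can be reproduced with a few lines of Python. This is only a quick arithmetic check using the numbers already given above (332 patches, 0x20 bytes per entry, 0x12 bytes of fileName, 0x2000 bytes per block).

```python
PATCH_COUNT = 332
ENTRY_SIZE = 0x20
FILE_NAME_SIZE = 0x12
BLOCK_SIZE = 0x2000

total_size = PATCH_COUNT * ENTRY_SIZE
file_name_bytes = PATCH_COUNT * FILE_NAME_SIZE

print(hex(total_size))       # 0x2980
print(hex(file_name_bytes))  # 0x1758

# Fraction of one block used by each figure.
print(total_size / BLOCK_SIZE)
print(file_name_bytes / BLOCK_SIZE)
```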

Solution

Rather than looking at each patch and asking which ARC it affects, we can instead look at a given ARC and determine its patches. So instead of having one string per patch, we could have many patches which are pointed to by one string (generally speaking).

Essentially, this means changing the structure to be more like a hierarchy/tree.

Tree Structure

Example high-level representation:

{
res: {
Stage: {
D_MN01: {
R00_00: { patches: [] },
R01_00: { patches: [] },
R03_00: { patches: [] },
R05_00: { patches: [] },
R06_00: { patches: [] },
R07_00: { patches: [] },
R08_00: { patches: [] },
R09_00: { patches: [] },
R10_00: { patches: [] },
R11_00: { patches: [] },
R12_00: { patches: [] },
R13_00: { patches: [] },
},
D_MN01B: {
R51_00: { patches: [] },
},
D_MN04: {
R01_00: { patches: [] },
R03_00: { patches: [] },
R04_00: { patches: [] },
R06_00: { patches: [] },
R07_00: { patches: [] },
R09_00: { patches: [] },
R11_00: { patches: [] },
R14_00: { patches: [] },
R16_00: { patches: [] },
R17_00: { patches: [] },
},
D_MN05: {
R00_00: { patches: [] },
R01_00: { patches: [] },
R02_00: { patches: [] },
R03_00: { patches: [] },
R05_00: { patches: [] },
R09_00: { patches: [] },
R10_00: { patches: [] },
R11_00: { patches: [] },
R22_00: { patches: [] },
},
// ...
},
},
}
tip

That looks an awful lot like the game's directory structure.

Describing the tree will have some overhead, but we will be eliminating a ton of wasteful string data, so we will have plenty of space to work with.

Building the structure

  • We can treat each directory and file as a node.
    • The nodes themselves can be stored in an array.
  • We need to be able to look at a node and determine if it is a file or a directory.
  • If the node is a directory, we need to be able to find its children.
  • If the node is a file, we need to be able to find its patches.
  • We need to be able to determine the string name of each node.
    • For example, "res" => "Stage" => "D_MN01"

Let us define a node structure at a high-level:

| Name | Description |
| --- | --- |
| name | Something like "res" or "D_MN05". |
| isDir | Is this a directory or a file? |
| children | (directory only) Child nodes. |
| patches | (file only) Patches for this (arc) file. |

This is a little too abstract and needs to be broken down.

First, let's learn from the RARC structure and use a string table.

We will end up with something like this:

72 65 73 00 53 74 61 67 65 00 44 5F 4D 4E 30 35  res.Stage.D_MN05
00 44 5F 4D 4E 30 34 00 44 5F 4D 4E 30 31 00 44 .D_MN04.D_MN01.D
5F 4D 4E 30 31 42 00 44 5F 4D 4E 31 30 00 44 5F _MN01B.D_MN10.D_
4D 4E 31 30 42 00 44 5F 4D 4E 31 31 00 44 5F 4D MN10B.D_MN11.D_M
4E 31 31 42 00 44 5F 4D 4E 30 36 00 44 5F 4D 4E N11B.D_MN06.D_MN
30 36 42 00 44 5F 4D 4E 30 37 00 44 5F 4D 4E 30 06B.D_MN07.D_MN0
37 42 00 44 5F 4D 4E 30 38 00 44 5F 4D 4E 30 39 7B.D_MN08.D_MN09
00 52 5F 53 50 30 31 00 44 5F 53 42 31 30 00 46 .R_SP01.D_SB10.F
5F 53 50 31 30 38 00 52 5F 53 50 31 30 39 00 46 _SP108.R_SP109.F
5F 53 50 31 32 31 00 46 5F 53 50 31 30 39 00 46 _SP121.F_SP109.F
5F 53 50 31 31 31 00 46 5F 53 50 31 31 33 00 44 _SP111.F_SP113.D
5F 53 42 30 33 00 46 5F 53 50 31 31 35 00 46 5F _SB03.F_SP115.F_
53 50 31 31 30 00 44 5F 53 42 30 32 00 46 5F 53 SP110.D_SB02.F_S
50 31 32 32 00 46 5F 53 50 31 32 34 00 44 5F 53 P122.F_SP124.D_S
42 30 34 00 46 5F 53 50 31 31 38 00 46 5F 53 50 B04.F_SP118.F_SP
31 31 34 00 44 5F 53 42 30 30 00 46 5F 53 50 31 114.D_SB00.F_SP1
31 37 00 46 5F 53 50 31 31 36 00 62 6D 67 72 65 17.F_SP116.bmgre
73 35 00 62 6D 67 72 65 73 31 00 62 6D 67 72 65 s5.bmgres1.bmgre
73 36 00 62 6D 67 72 65 73 34 00 62 6D 67 72 65 s6.bmgres4.bmgre
73 32 00 62 6D 67 72 65 73 38 00 62 6D 67 72 65 s2.bmgres8.bmgre
73 37 00 52 32 32 5F 30 30 00 52 30 30 5F 30 30 s7.R22_00.R00_00
00 52 30 39 5F 30 30 00 52 30 32 5F 30 30 00 52 .R09_00.R02_00.R
30 35 5F 30 30 00 52 30 33 5F 30 30 00 52 30 31 05_00.R03_00.R01
5F 30 30 00 52 31 30 5F 30 30 00 52 31 31 5F 30 _00.R10_00.R11_0
30 00 52 31 34 5F 30 30 00 52 30 34 5F 30 30 00 0.R14_00.R04_00.
52 30 36 5F 30 30 00 52 30 37 5F 30 30 00 52 31 R06_00.R07_00.R1
37 5F 30 30 00 52 31 36 5F 30 30 00 52 30 38 5F 7_00.R16_00.R08_
30 30 00 52 31 32 5F 30 30 00 52 31 33 5F 30 30 00.R12_00.R13_00
00 52 35 31 5F 30 30 00 52 31 35 5F 30 30 00 00 .R51_00.R15_00..
note

Notice that we only need one copy of "R01_00" even though it is used in D_MN01, D_MN04, D_MN05, and I'm sure plenty of others.
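
As a rough sketch of the idea, the following Python builds a string table that stores each distinct name once and reads a name back from its offset. The helper names `build_string_table` and `read_string` are made up for this example.

```python
def build_string_table(names):
    """Return (table_bytes, offsets) with each distinct name stored once."""
    table = bytearray()
    offsets = {}
    for name in names:
        if name not in offsets:
            offsets[name] = len(table)
            table += name.encode("ascii") + b"\x00"
    return bytes(table), offsets

def read_string(table, offset):
    """Read the null-terminated string that starts at the given offset."""
    end = table.index(b"\x00", offset)
    return table[offset:end].decode("ascii")

# "R01_00" appears twice in the input but is only written once.
table, offsets = build_string_table(
    ["res", "Stage", "D_MN05", "R01_00", "D_MN04", "R01_00"]
)
print(offsets["Stage"])                       # 4
print(read_string(table, offsets["D_MN05"]))  # D_MN05
```

A node then only needs to store the offset of its name, not the name itself.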

Revisiting the structure

| Name | Type | Description |
| --- | --- | --- |
| strTableOffset | u16? | Offset in string table |
| isDir | ? | Is this a directory or a file? |
| children | ? | (directory only) Child nodes. |
| patches | ? | (file only) Patches for this (arc) file. |

Let's look at "patches" now.

A node can have an arbitrary number of patches, so let's go ahead and pull that out into its own table.

| Name | Type | Description |
| --- | --- | --- |
| strTableOffset | u16? | Offset in string table |
| isDir | ? | Is this a directory or a file? |
| children | ? | (directory only) Child nodes. |
| patchTableIndex | u16? | (file only) Patches for this (arc) file. |
| numPatches | u8? | (file only) Number of patches for this (arc) file. |

Let's look at "children" now.

A child is a Node, and we already have a table for this. In fact, this structure we are describing is an entry in that table.

| Name | Type | Description |
| --- | --- | --- |
| strTableOffset | u16? | Offset in string table |
| isDir | ? | Is this a directory or a file? |
| nodeTableIndex | u16? | (directory only) Index of first child node. |
| numChildren | u8? | (directory only) Number of child nodes. |
| patchTableIndex | u16? | (file only) Patches for this (arc) file. |
| numPatches | u8? | (file only) Number of patches for this (arc) file. |

Let's see how much room this takes up:

| Name | Type |
| --- | --- |
| isDir | 1 bit |
| strTableOffset | 15 bits (10 is actually plenty here) |
| nodeTableIndex or patchTableIndex | u16 |
| numChildren or numPatches | u8 |

This takes up 5 bytes, which we can round up to 8. If we can get this down to 4 bytes, the space we need for the node table will be cut in half.

We'll come back to this.

Patches

When the ARC file is loaded, its path such as /res/Stage/D_MN01/R01_00.arc is converted to a u32 id called entryNum.

Let's imagine we have a table which we will refer to as the RuntimeTable containing entries like the following:
(A better table name will be at end of this article once I come up with one.)

| Name | Type |
| --- | --- |
| entryNum | u32 |
| patchTableIndex | u16 |
| numPatches | u16 |

Whenever an ARC is loaded, we can look at its entryNum then scan the above table. If we find a match, we can use patchTableIndex and numPatches alongside the patch table itself to handle applying the appropriate patches.
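
A minimal sketch of that lookup in Python follows. The entryNum values and patch names are made up, and `patches_for_arc` is a hypothetical helper rather than an existing Randomizer function.

```python
from collections import namedtuple

RuntimeEntry = namedtuple("RuntimeEntry", "entry_num patch_table_index num_patches")

def patches_for_arc(runtime_table, patch_table, entry_num):
    """Return the patches for the arc with this entryNum, or [] if none."""
    for entry in runtime_table:
        if entry.entry_num == entry_num:
            start = entry.patch_table_index
            return patch_table[start:start + entry.num_patches]
    return []

# Made-up values purely for illustration.
runtime_table = [RuntimeEntry(0x5AC, 0, 2), RuntimeEntry(0x5B0, 2, 1)]
patch_table = ["patch0", "patch1", "patch2"]

print(patches_for_arc(runtime_table, patch_table, 0x5AC))  # ['patch0', 'patch1']
print(patches_for_arc(runtime_table, patch_table, 0x123))  # []
```

The scan touches one entry per arc rather than one entry per patch, and every patch for a matching arc is found in a single step.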

Ideally, the only data we would have in the seed GCI's ARC patch section are the above RuntimeTable and the patch table. Unfortunately, we must wait until runtime to accurately convert filepaths to entryNums.

Let's look at the chunks we have mentioned so far:

| Name | Needed when? |
| --- | --- |
| nodeTable | Not needed after creating runtimeTable |
| stringTable | Not needed after creating runtimeTable |
| patchTable | Needed |
| runtimeTable | Generated at runtime |

Generating the RuntimeTable

{
res: {
Stage: {
D_MN01: {
R11_00: { patches: [] },
R12_00: { patches: [] },
R13_00: { patches: [] },
},
D_MN01B: {
R51_00: { patches: [] },
},
D_MN04: {
R01_00: { patches: [] },
R03_00: { patches: [] },
R04_00: { patches: [] },
},
},
},
},

Let's pretend the above represents the ARCs which we want to patch.

To generate the runtimeTable, we need to navigate through the tree and convert each File node into the following:

| Name | Type |
| --- | --- |
| entryNum | u32 |
| patchTableIndex | u16 |
| numPatches | u16 |

I'm going to keep track of a variable called currentPatchIndex to make a point later.

Here is how that (depth-first) traversal would look:

  • currentPatchIndex is 0.
  • At root. Not a file. numChildren is 1.
  • At res. Not a file. numChildren is 1.
  • At Stage. Not a file. numChildren is 3.
  • At D_MN01. Not a file. numChildren is 3.
  • At R11_00. Is a file.
    • currentPatchIndex and the node's patchTableIndex property are both 0. Copy into RuntimeTableEntry0.
    • the node's numPatches property is 1. Copy into RuntimeTableEntry0.
    • Increase currentPatchIndex by the number of patches (1).
      • currentPatchIndex becomes 1.
  • At R12_00. Is a file.
    • currentPatchIndex and the node's patchTableIndex property are both 1. Copy into RuntimeTableEntry1.
    • the node's numPatches property is 3. Copy into RuntimeTableEntry1.
    • Increase currentPatchIndex by the number of patches (3).
      • currentPatchIndex becomes 4.
  • At R13_00. Is a file.
    • currentPatchIndex and the node's patchTableIndex property are both 4. Copy into RuntimeTableEntry2.
    • the node's numPatches property is 1. Copy into RuntimeTableEntry2.
    • Increase currentPatchIndex by the number of patches (1).
      • currentPatchIndex becomes 5.
  • (That was the last entry in D_MN01, so will go to next child of Stage)
  • At D_MN01B. Not a file. numChildren is 1.
  • At R51_00. Is a file.
    • currentPatchIndex and the node's patchTableIndex property are both 5. Copy into RuntimeTableEntry3.
    • the node's numPatches property is 2. Copy into RuntimeTableEntry3.
    • Increase currentPatchIndex by the number of patches (2).
      • currentPatchIndex becomes 7.
  • (That was the last entry in D_MN01B, so will go to next child of Stage)
  • At D_MN04. Not a file. numChildren is 3.
  • At R01_00. Is a file.
    • currentPatchIndex and the node's patchTableIndex property are both 7. Copy into RuntimeTableEntry4.
    • the node's numPatches property is 4. Copy into RuntimeTableEntry4.
    • Increase currentPatchIndex by the number of patches (4).
      • currentPatchIndex becomes 11.
  • At R03_00. Is a file.
    • currentPatchIndex and the node's patchTableIndex property are both 11. Copy into RuntimeTableEntry5.
    • the node's numPatches property is 1. Copy into RuntimeTableEntry5.
    • Increase currentPatchIndex by the number of patches (1).
      • currentPatchIndex becomes 12.
  • At R04_00. Is a file.
    • currentPatchIndex and the node's patchTableIndex property are both 12. Copy into RuntimeTableEntry6.
    • the node's numPatches property is 1. Copy into RuntimeTableEntry6.
    • Increase currentPatchIndex by the number of patches (1).
      • currentPatchIndex becomes 13.
  • (That was the last entry in D_MN04)
  • (That was the last entry in Stage)
  • (That was the last entry in res)
  • (That was the last entry of the root)
  • We are done.

The key takeaways are the following:

  1. The traversal is deterministic (will always be done in the same order).
  2. currentPatchIndex and the node's patchTableIndex are equal every step of the way.

Thus we can conclude:

  • We do not need to store the patchTableIndex in the node data.
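
The traversal above can be sketched in Python as follows. This is only an illustration under a few assumptions: the tree is written as a nested dict (directories are dicts, files map to their numPatches) rather than the real node tables, and `fake_convert` is a hypothetical stand-in for my_DVDConvertPathToEntrynum that hands out made-up numbers.

```python
def build_runtime_table(tree, convert_path_to_entry_num):
    """Walk the tree depth-first; return (entryNum, patchTableIndex, numPatches) rows."""
    rows = []
    current_patch_index = 0

    def visit(node, path):
        nonlocal current_patch_index
        for name, child in node.items():
            child_path = path + "/" + name
            if isinstance(child, dict):  # directory: descend into its children
                visit(child, child_path)
            else:                        # file: child is its numPatches
                entry_num = convert_path_to_entry_num(child_path + ".arc")
                rows.append((entry_num, current_patch_index, child))
                current_patch_index += child

    visit(tree, "")
    return rows

tree = {"res": {"Stage": {
    "D_MN01": {"R11_00": 1, "R12_00": 3, "R13_00": 1},
    "D_MN01B": {"R51_00": 2},
    "D_MN04": {"R01_00": 4, "R03_00": 1, "R04_00": 1},
}}}

# Fake converter: numbers the paths in the order they are requested.
seen = []
def fake_convert(path):
    seen.append(path)
    return len(seen) - 1

rows = build_runtime_table(tree, fake_convert)
print([row[1] for row in rows])  # [0, 1, 4, 5, 7, 11, 12]
```

The patch table index of each row comes from the running counter alone, which is why the node itself does not need to store it.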

Let's look at what a Node might look like now:

| Name | Type |
| --- | --- |
| isDir | 1 bit |
| strTableOffset | 15 bits (10 is actually plenty here) |
| firstChildNodeIndex | (directory only) u16 |
| numChildren or numPatches | u8 |

  • In the case of a File, we only need 24 bits.
  • In the case of a Directory, we need 40 bits.

If we can get it down to <= 32 bits in the case of a Directory, then we can cut the nodeTable size in half.

DirInfoTable

Here is something we can do:

  • Create another table called dirInfoTable which has entries like the following:

| Name | Type |
| --- | --- |
| firstChildNodeIndex | u16 |
| numChildren | u16 |

  • Then change Node to look like this:

| Name | Type |
| --- | --- |
| isDir | 1 bit |
| strTableOffset | 15 bits (10 is actually plenty here) |
| dirInfoIndex (dir) or numPatches (file) | u8 |

We have pulled out the data which is only needed for the Directory nodes into their own table. There is only one entry for a directory node, and 0 for a file node. Since the majority of our nodes are files, this saves quite a bit of space.

So now both the File node and the Directory node only need 24 bits.

note

We can store dirInfoIndex as a u8 because there is a maximum of 90 directory nodes in the game based on the exhaustive list of arc files which is well under the 255 max for a u8.

Probably won't be doing anywhere close to 255 patches on an individual arc, so u8 for numPatches should be fine as well.

But we can do even better.

Storing this as 3 bytes would force us to round up to 4, but we can store it in 2 arrays so that we only use 3 bytes while still having the data nicely aligned.

First array will contain NodeInfoA (size: 2 bytes):

| Name | Type |
| --- | --- |
| isDir | 1 bit |
| Reserved bits (more on this later) | 3 bits |
| strTableOffset | 12 bits |

Second array will contain NodeInfoB (size: 1 byte):

| Name | Type |
| --- | --- |
| dirInfoIndex (dir) or numPatches (file) | u8 |
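
Here is a small Python sketch of splitting nodes into the two parallel arrays. It assumes big-endian storage with isDir in the most significant bit of NodeInfoA; the node values are made up, and `pack_node_arrays` is a hypothetical helper.

```python
import struct

def pack_node_arrays(nodes):
    """nodes: list of (is_dir, str_table_offset, info_b) tuples."""
    info_a = bytearray()
    info_b = bytearray()
    for is_dir, str_table_offset, b_value in nodes:
        # isDir in the top bit, 12-bit string table offset in the low bits.
        word = (int(is_dir) << 15) | (str_table_offset & 0x0FFF)
        info_a += struct.pack(">H", word)
        info_b.append(b_value)
    return bytes(info_a), bytes(info_b)

# Two directories and one file with 3 patches (illustrative values).
nodes = [(True, 0x000, 0), (True, 0x004, 1), (False, 0x0A1, 3)]
info_a, info_b = pack_node_arrays(nodes)
print(info_a.hex(" "))  # 80 00 80 04 00 a1
print(info_b.hex(" "))  # 00 01 03
```

Three nodes take 9 bytes in total, and every u16 in the first array still sits on a 2-byte boundary.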

At this point, we have all of the pieces we need to describe the following:

  • arcsToPatch => [arcA, arcD, arcQ, arc7, arcC, arc2, ...]

Now we just need to discuss this part:

  • arcA => [patch0, patch1, patch2, ...]

Patches Part 2

A patch is made up of the following information:

  • Where should we overwrite bytes
  • What value should we write there

Patch offset

The largest arc file is /res/Object/Demo28_01.arc which is 3603200 bytes when uncompressed, or 0x0036FB00.

This means 3 bytes will always be enough to specify the offset at which we will write the patch.

Patch contents

The first thing to note is that the patch we want to write could be 1, 2, or 4 bytes, so we will need a way to specify how many bytes we should write.

Let's use a u8 enum for this.

Here is what we have so far:

| Name | Type |
| --- | --- |
| patchType | u8 |
| offset | 3 bytes |
| Remaining space | 4 bytes |

We can use the patchType enum to specify what is in the remaining space.

For example, if we needed to patch an itemId which is 1 byte, we could have something like the following:

[00 05 E6 EC 00 00 00 45]

meaning:

  • patchType: 0 (ItemId)
  • offset: 0x05E6EC
  • value: 0x45 (patchType indicates that the value is 1 byte)

Example 2:

[01 02 FB 6C 00 00 AB CD]

meaning:

  • patchType: 1 (ItemMessage)
  • offset: 0x02FB6C
  • value: 0xABCD (patchType indicates that the value is 2 bytes)
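
Both examples can be decoded with a short Python sketch. It assumes big-endian byte order and that the value is right-aligned in the remaining 4 bytes, as the two examples suggest; the `VALUE_SIZES` mapping only covers the two patch types shown here.

```python
def decode_patch(raw):
    """Split an 8-byte patch into (patchType, offset, remaining 4 bytes)."""
    patch_type = raw[0]
    offset = int.from_bytes(raw[1:4], "big")
    return patch_type, offset, raw[4:8]

# How many trailing bytes of the remaining space hold the value, per patchType.
VALUE_SIZES = {0: 1, 1: 2}  # 0 = ItemId, 1 = ItemMessage

def decode_value(raw):
    """Return (patchType, offset, value) for the simple patch types above."""
    patch_type, offset, remaining = decode_patch(raw)
    size = VALUE_SIZES[patch_type]
    return patch_type, offset, int.from_bytes(remaining[-size:], "big")

print([hex(v) for v in decode_value(bytes.fromhex("0005E6EC00000045"))])
# ['0x0', '0x5e6ec', '0x45']
print([hex(v) for v in decode_value(bytes.fromhex("0102FB6C0000ABCD"))])
# ['0x1', '0x2fb6c', '0xabcd']
```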

If we have a type of patch that only needs to write 1, 2, 3, or 4 bytes, we can fit that into the remaining space.

But what if we want to write more than 4 bytes?

Patch contents extended

Let's create another chunk and call it patchExtensions. It is a stream of bytes which contains data for patches that are too big to fit into the 2nd half of the patch.

A patch's patchType will indicate if a patch uses the patchExtensions chunk.

For example:

[AB 01 A6 6F 01 23 00 0C]

meaning:

  • patchType: 0xAB (LongPatch) (enum value was chosen arbitrarily)
  • offset: 0x01A66F
  • patchExtensionsOffset: 0x0123
  • patchBytelength: 0x000C

Notice that the 2nd group of 4 bytes has a completely different meaning than before. That is the power of using the patchType enum -- the remaining 4 bytes can be interpreted according to the value of the enum.

The patchExtensions chunk is a stream of bytes, so according to the above Patch, we should start at byte 0x123 in the extensions chunk and copy 0xC bytes into the arc data starting at offset 0x01A66F.
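
A minimal sketch of applying such a LongPatch, assuming big-endian fields. The buffers and the patch below are tiny made-up values so the example can run; `apply_long_patch` is a hypothetical helper.

```python
def apply_long_patch(arc_data, patch, extensions):
    """Copy patchBytelength bytes from the extensions chunk into arc_data."""
    offset = int.from_bytes(patch[1:4], "big")
    ext_offset = int.from_bytes(patch[4:6], "big")
    length = int.from_bytes(patch[6:8], "big")
    arc_data[offset:offset + length] = extensions[ext_offset:ext_offset + length]

arc_data = bytearray(0x20)        # pretend arc contents, all zero
extensions = bytes(range(0x10))   # pretend extensions chunk: 00 01 02 ... 0F
# patchType 0xAB, arc offset 0x10, patchExtensionsOffset 0x4, patchBytelength 0x3
patch = bytes.fromhex("AB 00 00 10 00 04 00 03")

apply_long_patch(arc_data, patch, extensions)
print(arc_data[0x10:0x13].hex(" "))  # 04 05 06
```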

Here is an example of another kind of patch you might use:

[CD 0B FA E0 00 F3 00 FF]

meaning:

  • patchType: 0xCD (LongPatchSkipBytes)
  • offset: 0x0BFAE0
  • patchExtensionsOffset: 0x00F3
  • skipIfByteIs: 0xFF

We didn't specify the byteLength, but let's check what the data looks like in the extensions section:

[00 08 67 FF FF 63 FF FF 12 34]

Let's assume that LongPatchSkipBytes means that the first 2 bytes in the extensions section will indicate the length.

In this case, the byte length is 8.

The skipIfByteIs is 0xFF, so we will copy the next 8 bytes, but we will skip over any bytes which have a value of 0xFF.
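
One possible reading of LongPatchSkipBytes is sketched below. It assumes that "skipping" a byte means leaving the corresponding byte of the arc untouched while still advancing past it; that interpretation, the helper name, and the small arc buffer are all assumptions made for this example.

```python
def apply_long_patch_skip_bytes(arc_data, patch, extensions):
    """Copy bytes from the extensions chunk, leaving arc bytes alone where
    the source byte equals skipIfByteIs (assumed interpretation)."""
    offset = int.from_bytes(patch[1:4], "big")
    ext_offset = int.from_bytes(patch[4:6], "big")
    skip_value = patch[7]
    # The first 2 bytes at ext_offset give the length of the data that follows.
    length = int.from_bytes(extensions[ext_offset:ext_offset + 2], "big")
    source = extensions[ext_offset + 2:ext_offset + 2 + length]
    for i, byte in enumerate(source):
        if byte != skip_value:
            arc_data[offset + i] = byte

arc_data = bytearray(b"\xAA" * 12)  # pretend arc contents
extensions = bytes.fromhex("00 08 67 FF FF 63 FF FF 12 34")
# patchType 0xCD, arc offset 0x2, patchExtensionsOffset 0x0, skipIfByteIs 0xFF
patch = bytes.fromhex("CD 00 00 02 00 00 00 FF")

apply_long_patch_skip_bytes(arc_data, patch, extensions)
print(arc_data.hex(" "))  # aa aa 67 aa aa 63 aa aa 12 34 aa aa
```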

Those were just some examples. You can really do whatever you want with the enum, and the good news is that you can easily add new enum types without breaking backwards compatibility.

note

This extension section is just an idea. It wouldn't be included until/unless we actually need it.

Patch Content Optimization

Our patches currently look like the following:

| Name | Type |
| --- | --- |
| patchType | u8 |
| offset | 3 bytes |
| Remaining space | 4 bytes |

In terms of our current seed's ARCP section size, these actually take up the majority of the space, so if we can improve this we will get some pretty significant gains.

Of our 332 patches currently, 288 only need one byte of the "Remaining space" bytes (meaning they waste 3 bytes), and the other 44 are message indexes (meaning they waste 2 bytes).

Let's create another chunk called patchContent and write the values that we would have put in the "Remaining space" in the above table there in a back-to-back fashion.

As we iterate through the patches (which are now 4 bytes long), we can use patchType to determine how many bytes to read from the patchContent. We will keep track of our current position in patchContent (which is essentially a data stream) as we do this.

So patches look like this now:

| Name | Type |
| --- | --- |
| patchType | u8 |
| offset (to apply patch in arc) | 3 bytes |
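
The pairing of 4-byte patches with the patchContent stream could look roughly like this in Python. It assumes big-endian values and only the two patch types discussed so far; `expand_patches` is a hypothetical name.

```python
VALUE_SIZES = {0: 1, 1: 2}  # patchType -> bytes taken from patchContent

def expand_patches(patch_table, patch_content):
    """Yield (patchType, offset, value), reading values from the stream in order."""
    position = 0
    for i in range(0, len(patch_table), 4):
        patch_type = patch_table[i]
        offset = int.from_bytes(patch_table[i + 1:i + 4], "big")
        size = VALUE_SIZES[patch_type]
        value = int.from_bytes(patch_content[position:position + size], "big")
        position += size
        yield patch_type, offset, value

patch_table = bytes.fromhex("00 05 E6 EC 01 02 FB 6C")
patch_content = bytes.fromhex("45 AB CD")
expanded = list(expand_patches(patch_table, patch_content))
print([tuple(hex(v) for v in patch) for patch in expanded])

# Size comparison for 288 one-byte and 44 two-byte patches.
old_size = 332 * 8
new_size = 332 * 4 + 288 * 1 + 44 * 2
print(hex(old_size), hex(new_size))  # 0xa60 0x6a8
```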

Special String Values

Earlier we described the string table. The keen eye may have noticed that it had bmgres4 in it, but nothing like Msgus.

This is because the exact name that should be used in place of Msgus depends on the TP region you are playing (US, PAL, JP) and will be filled in at runtime by the Randomizer.

We can use a bit in the NodeInfoA entry to indicate that it is a string enum and not a value in the string table as follows:

| Name | Type |
| --- | --- |
| isDir | 1 bit |
| isStringEnum | 1 bit |
| Reserved bits (more on this later) | 2 bits |
| strTableOffset or stringEnum | 12 bits |

So for example:

[80 05] is a directory node, and its name is stored at offset 0x5 in the string table.

[C0 AA] is a directory node. It uses a string enum rather than the string table, and its enum is 0xAA. The Randomizer was compiled for the US version, so it knows that the value of enum 0xAA is Msgus.
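
Decoding those two example entries can be sketched in Python like this. It assumes a big-endian u16 with isDir in the top bit and isStringEnum in the next bit, which matches the two examples; the `STRING_ENUMS` table is a hypothetical stand-in for whatever the Randomizer compiles in for its region.

```python
# Hypothetical enum table for a US build; 0xAA is the example value used above.
STRING_ENUMS = {0xAA: "Msgus"}

def decode_node_info_a(raw):
    """Decode a 2-byte NodeInfoA entry into its fields."""
    word = int.from_bytes(raw, "big")
    return {
        "isDir": bool(word & 0x8000),
        "isStringEnum": bool(word & 0x4000),
        "value": word & 0x0FFF,  # strTableOffset or stringEnum
    }

first = decode_node_info_a(bytes.fromhex("8005"))
second = decode_node_info_a(bytes.fromhex("C0AA"))
print(first)   # {'isDir': True, 'isStringEnum': False, 'value': 5}
print(second)  # {'isDir': True, 'isStringEnum': True, 'value': 170}
print(STRING_ENUMS[second["value"]])  # Msgus
```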

That should be all of the areas we need to discuss regarding the inner structure.

The entire structure is split up into the chunks we discussed above, so we will use a header to indicate things like the offset to a chunk and how many entries are in it.

| Offset | Type | Name | Description |
| --- | --- | --- | --- |
| 0x00 | char[4] | offset | Always "ARCP" |
| 0x04 | u8 | majorVersion | This is independent of the randomizer version |
| 0x05 | u8 | minorVersion | This is independent of the randomizer version |
| 0x06 | u16 | totalSize | Total byte size of ARCP section |
| 0x08 | u16 | nodeInfoAOffset | Offset to NodeInfoA table |
| 0x0A | u16 | nodeInfoBOffset | Offset to NodeInfoB table |
| 0x0C | u16 | numNodes | Number of entries in NodeInfoA and NodeInfoB tables |
| 0x0E | u16 | dirInfoOffset | Offset to DirInfo table |
| 0x10 | u16 | numDirInfos | Number of entries in DirInfo table |
| 0x12 | u16 | strTableOffset | Offset to string table |
| 0x14 | u16 | patchTableOffset | Offset to Patch table |
| 0x16 | u16 | numPatches | Number of entries in Patch table |
| 0x18 | u16 | patchExtOffset | Offset to PatchExtensions chunk |
| 0x1A | u8[6] | padding/reserved | Currently unused, rounds header to 0x20 bytes |

  • The section title of "ARCP" (short for Arc Patch) is to make it easier to visually understand what you are looking at when inspecting in a hex editor. This is inspired by how many of the files are already handled in TP. Might be useful some other way at some point as well. We also have room for it.

Major version is a u8 which gets incremented every time there is a change which breaks backwards compatibility.

  • The benefit of storing a version number at this level rather than just at the top SeedInfo level is that the Randomizer can check the ArcPatch section's version and run the appropriate routines based off of that (if that version of the Randomizer was supporting multiple ARCP major versions at the same time, for example).

Minor version number is more for debugging purposes. This would be incremented whenever a non-breaking change is made, such as adding a patchType enum or a string enum such as the one which is used for Msgus. Non-breaking in the sense that version 42.4 is essentially a superset of 42.3.

  • Incrementing the major or minor version of the ARCP section would also increment the version number of SeedInfo as appropriate.

  • totalSize is the total number of bytes of the ARCP section. Each chunk will be rounded up to a multiple of 0x10, so this value will also always be a multiple of 0x10.

  • patchExtOffset will be 0x0000 if there is no patchExtensions chunk (because it is not needed). Or realistically it will always be 0x0000 until we actually have something that needs the extensions chunk.
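
Rounding a chunk size up to a multiple of 0x10 is a one-liner. The Python sketch below uses a hypothetical helper name, `align_up`, and made-up chunk sizes.

```python
def align_up(size, alignment=0x10):
    """Round size up to the next multiple of alignment (a power of two)."""
    return (size + alignment - 1) & ~(alignment - 1)

print(hex(align_up(0x25)))  # 0x30
print(hex(align_up(0x20)))  # 0x20

# Made-up chunk sizes; the total of aligned chunks is itself a multiple of 0x10.
chunk_sizes = [0x20, 0xF8, 0x7C, 0x13, 0x1E7]
total_size = sum(align_up(size) for size in chunk_sizes)
print(total_size % 0x10)    # 0
```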

Now we are ready to put everything together.

Structure Definition

The ARCP section will be broken into the following chunks:

| Name | Type |
| --- | --- |
| Header | object |
| NodeInfoA | array |
| NodeInfoB | array |
| DirInfo | array |
| StrTable | chunk |
| PatchTable | array |
| PatchContent | chunk |
| PatchExtensions | chunk |

At runtime, this will transform into another block of data:

| Name | Type | Source |
| --- | --- | --- |
| RuntimeHeader | object | generated |
| ArcList | array | generated |
| PatchTable | array | generated |
| PatchExtensions | chunk | copied |

note

We may not actually want to add the PatchExtensions part until/unless it becomes necessary, but we can leave space for it in the header to make it easily backwards compatible.


Structures in GCI

Header (size: 0x20):

| Offset | Type | Name | Description |
| --- | --- | --- | --- |
| 0x00 | char[4] | offset | Always "ARCP" |
| 0x04 | u8 | majorVersion | This is independent of the randomizer version |
| 0x05 | u8 | minorVersion | This is independent of the randomizer version |
| 0x06 | u16 | totalSize | Total byte size of ARCP section |
| 0x08 | u16 | numNodes | Number of entries in NodeInfoA and NodeInfoB tables |
| 0x0A | u16 | nodeInfoAOffset | Offset to NodeInfoA table |
| 0x0C | u16 | nodeInfoBOffset | Offset to NodeInfoB table |
| 0x0E | u16 | numDirInfos | Number of entries in DirInfo table |
| 0x10 | u16 | dirInfoOffset | Offset to DirInfo table |
| 0x12 | u16 | strTableOffset | Offset to string table |
| 0x14 | u16 | numPatches | Number of entries in Patch table |
| 0x16 | u16 | patchTableOffset | Offset to Patch table |
| 0x18 | u16 | patchContentOffset | Offset to Patch content stream |
| 0x1A | u16 | numArcs | Number of nodes of type "File" |
| 0x1C | u16 | patchExtOffset | Offset to PatchExtensions chunk (0 if unused) |
| 0x1E | u8[2] | padding/reserved | Currently unused, rounds header to 0x20 bytes |
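
As a sanity check on the layout, here is a Python sketch that parses this header with the struct module. It assumes big-endian byte order; the field name `magic` for the first 4 bytes and the sample values are made up for illustration.

```python
import struct
from collections import namedtuple

# char[4], two u8 version bytes, twelve u16 fields, 2 bytes of padding.
HEADER_FORMAT = ">4sBB12H2x"

ArcpHeader = namedtuple("ArcpHeader", [
    "magic", "major_version", "minor_version", "total_size",
    "num_nodes", "node_info_a_offset", "node_info_b_offset",
    "num_dir_infos", "dir_info_offset", "str_table_offset",
    "num_patches", "patch_table_offset", "patch_content_offset",
    "num_arcs", "patch_ext_offset",
])

def parse_header(data):
    """Parse the 0x20-byte ARCP header from the start of data."""
    header = ArcpHeader._make(struct.unpack_from(HEADER_FORMAT, data))
    if header.magic != b"ARCP":
        raise ValueError("not an ARCP section")
    return header

# Sample header with made-up values: version 1.0, totalSize 0x40, 5 nodes at 0x20.
sample = struct.pack(HEADER_FORMAT, b"ARCP", 1, 0, 0x40, 5, 0x20, *([0] * 9))
header = parse_header(sample)
print(struct.calcsize(HEADER_FORMAT))        # 32
print(header.num_nodes, hex(header.node_info_a_offset))  # 5 0x20
```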

NodeInfoA (size: 0x2):

| Type | Name |
| --- | --- |
| 1 bit | isDir |
| 1 bit | isStringEnum |
| 2 bits | Reserved/unused bits |
| 12 bits | strTableOffset (u16 & 0xFFF) or stringEnum (u16 & 0xFF) |

NodeInfoB (size: 0x1):

| Offset | Type | Name |
| --- | --- | --- |
| 0x0 | u8 | dirInfoIndex (dir) or numPatches (file) |

DirInfo (size: 0x4):

| Offset | Type | Name |
| --- | --- | --- |
| 0x0 | u16 | firstChildIndex |
| 0x2 | u16 | numChildren |

StrTable:

Back-to-back null-terminated strings.

PatchTable (size: 0x4):

| Offset | Type | Name |
| --- | --- | --- |
| 0x0 | u8 | patchType |
| 0x0 & 0x00FFFFFF | u32 (3 bytes) | offset |

PatchContent:

Stream of bytes.

PatchExtensions:

Optional chunk of bytes. Patches can point to data in here.

Generated Structures

RuntimeHeader:

Not really in the scope of this article to define an exact structure for this, but it will need something to do the following:

  • pointer/offset to ArcList
  • pointer/offset to PatchTable
  • pointer/offset to PatchExtensions
  • way to free data

ArcList:

| Offset | Type | Name |
| --- | --- | --- |
| 0x0 | u32 | entryNum (returned from my_DVDConvertPathToEntrynum) |
| 0x4 | u16 | patchTableIndex |
| 0x6 | u16 | numPatches |

PatchTable (size: 0x8):

| Offset | Type | Name |
| --- | --- | --- |
| 0x0 | u8 (enum) | patchType |
| 0x0 & 0x00FFFFFF | u32 (3 bytes) | offset |
| 0x4 | 4 bytes | remainingSpace |

PatchExtensions:

Copied directly from ARCP section. Optional chunk of bytes. Patches can point to data in here.

Other thoughts

There is another optimization which can be done. You can make a change such that:

  • [bmgres1,bmgres4,bmgres5,bmgres6,bmgres7,bmgres8]

changes to something more like:

  • [bmgres] => [1,4,5,6,7,8]

for this and similar strings, but this adds a lot of complexity (to generating the GCI) and saves very little space, so it is not really worth it.

Here is validated example data which you can view in a hex editor:

Download arcpExampleData.bin