ScriptParseTree Asset

From COD Engine Research

The ScriptParseTree asset is new in Black Ops 2, which is the only Call of Duty game it has appeared in so far. It is Treyarch's answer to GSCs (keep in mind that in Black Ops 1, GSCs were still rawfiles). Compared to IW's equivalent scriptfile asset, the ScriptParseTree gives slightly more information and is thus easier to reverse. The basic structure is identical to that of rawfiles:

struct ScriptParseTree
{
  const char *name;
  int len;
  char *buffer;
};

This is simply an uncompressed buffer holding the GSC data. The binary data is then passed to the GSC Virtual Machine (VM) to be run. The binary data begins with the following header:

#define GSCMagic 0x804753430D0A0006ULL
 
struct GSC_OBJ
{
  char magic[8];
  unsigned int source_crc;
  unsigned int include_offset;
  unsigned int animtree_offset;
  unsigned int cseg_offset;
  unsigned int stringtablefixup_offset;
  unsigned int exports_offset;
  unsigned int imports_offset;
  unsigned int fixup_offset;
  unsigned int profile_offset;
  unsigned int cseg_size;
  unsigned __int16 name;
  unsigned __int16 stringtablefixup_count;
  unsigned __int16 exports_count;
  unsigned __int16 imports_count;
  unsigned __int16 fixup_count;
  unsigned __int16 profile_count;
  char include_count;
  char animtree_count;
  char flags;
};

Following the header is a section of null-terminated strings (undetermined size/count) called the stringtable section. This section is padded at the end to a 4-byte boundary. The following sections will reference back to this section when a string is used. Keep in mind that the offsets are simply offsets into the buffer.
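Before trusting any of the header offsets, a dumper would normally check the magic first. The following is only a sketch: it assumes the eight magic bytes appear in the file in the same order as they are written in GSCMagic (most significant byte first), and the name gsc_has_magic is made up for this example.

```c
#include <stddef.h>
#include <string.h>

/* Assumption: the magic is stored in the file in the order it is written
   in the GSCMagic constant, i.e. 80 'G' 'S' 'C' 0D 0A 00 06. */
static const unsigned char kGscMagic[8] = {
    0x80, 'G', 'S', 'C', 0x0D, 0x0A, 0x00, 0x06
};

/* Returns 1 if the buffer is long enough and starts with the magic,
   otherwise 0. (Hypothetical helper, not taken from the game.) */
int gsc_has_magic(const unsigned char *buffer, size_t len)
{
    if (buffer == NULL || len < sizeof kGscMagic)
        return 0;
    return memcmp(buffer, kGscMagic, sizeof kGscMagic) == 0;
}
```

If the check fails, the buffer is most likely not a compiled GSC (or the byte-order assumption above does not hold for that file).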

Include Section

The first known section after the string table is the include section, and this should be the first section dumped as well. This section replaces the following lines in a raw GSC:

#include maps\mp\_utility;
#include common_scripts\utility;

The include section is probably the easiest section to reverse. Each include exists as a single integer which holds the offset of the include's string in the buffer.
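As a rough sketch of the above, an include name can be resolved by reading the 4-byte entry and treating it as an offset into the same buffer. The function name and parameters are hypothetical, and big-endian byte order is an assumption made for the example; swap the shifts for a little-endian file.

```c
#include <stddef.h>
#include <stdint.h>

/* Assumption: multi-byte values are stored big-endian. */
static uint32_t read_u32_be(const unsigned char *p)
{
    return ((uint32_t)p[0] << 24) | ((uint32_t)p[1] << 16) |
           ((uint32_t)p[2] << 8)  |  (uint32_t)p[3];
}

/* Returns a pointer to the name of include number `index`, or NULL if
   the entry or the string offset falls outside the buffer.
   include_offset and include_count come from the GSC_OBJ header.
   Note: the sketch does not verify that the string is null-terminated
   inside the buffer. */
const char *gsc_include_name(const unsigned char *buffer, size_t len,
                             uint32_t include_offset,
                             unsigned include_count, unsigned index)
{
    size_t entry;
    uint32_t string_offset;

    if (index >= include_count)
        return NULL;
    entry = (size_t)include_offset + 4u * (size_t)index;
    if (entry > len || len - entry < 4)
        return NULL;
    string_offset = read_u32_be(buffer + entry);
    if (string_offset >= len)
        return NULL;
    return (const char *)(buffer + string_offset);
}
```

When dumping, each returned name would be printed as "#include name;".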

Using Animtree Section

Although this section doesn't follow the include section, it is what needs to be dumped second. When a GSC wants to directly access an animtree, the following code is added (for animtree "mp_vehicles")

#using_animtree( "mp_vehicles" );

Then animations can be accessed from the tree as follows (Using the %)

destructible_anims[ "car" ] = %veh_car_destroy;

Knowing this, the binary data section can be reversed fairly easily. The structure for their definitions in this section is

struct GSC_ANIMNODE_ITEM
{
  unsigned int name;
  unsigned int address;
};
 
struct GSC_ANIMTREE_ITEM
{
  unsigned __int16 name;
  unsigned __int16 num_tree_address;
  unsigned __int16 num_node_address;
  char pad[2];
  GSC_ANIMNODE_ITEM nodes[num_node_address];
};

XRefs are explained in more detail in the code section. Each animtree that is included will have a GSC_ANIMTREE_ITEM struct, and every time an animation is used it will be defined in the references.

StringTable Reference Section

The stringtable reference section not only defines any function arguments that are defined with function declarations, but any local variable a function might create. Although they are defined the same, the difference between the two is highlighted in the code section. This section maps instances in code of a string (whether it be an actual string, or a literal name) to the string itself in the stringtable. Each reference will be defined as follows

struct GSC_STRINGTABLE_ITEM
{
  unsigned __int16 string;
  char num_address;
  char type;
  unsigned int references[num_address];
};

The references will be explained in more detail in the code section.
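Since each GSC_STRINGTABLE_ITEM is variable-sized, the section has to be walked item by item. The sketch below looks up which string a given code position refers to. It assumes big-endian values and that items are packed back to back with no extra padding; the function name is hypothetical.

```c
#include <stddef.h>
#include <stdint.h>

/* Assumption: multi-byte values are stored big-endian. */
static uint16_t read_u16_be(const unsigned char *p)
{
    return (uint16_t)(((unsigned)p[0] << 8) | (unsigned)p[1]);
}

static uint32_t read_u32_be(const unsigned char *p)
{
    return ((uint32_t)p[0] << 24) | ((uint32_t)p[1] << 16) |
           ((uint32_t)p[2] << 8)  |  (uint32_t)p[3];
}

/* Walks the stringtable reference section and returns the stringtable
   offset (the `string` field) of the item that lists `code_pos` among
   its references, or -1 if none is found or the data is truncated. */
long gsc_string_for_reference(const unsigned char *buffer, size_t len,
                              uint32_t fixup_offset, unsigned fixup_count,
                              uint32_t code_pos)
{
    size_t pos = fixup_offset;
    unsigned i, j;

    for (i = 0; i < fixup_count; ++i) {
        uint16_t string;
        unsigned num_address;

        if (pos > len || len - pos < 4)
            return -1;
        string = read_u16_be(buffer + pos);
        num_address = buffer[pos + 2];      /* buffer[pos + 3] is `type` */
        pos += 4;
        if (len - pos < 4u * (size_t)num_address)
            return -1;
        for (j = 0; j < num_address; ++j) {
            if (read_u32_be(buffer + pos + 4u * j) == code_pos)
                return (long)string;
        }
        pos += 4u * (size_t)num_address;
    }
    return -1;
}
```

A dumper that hits a string-carrying opcode would call this with the aligned code offset and then read the null-terminated string at the returned offset.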

Imports Section

If a function is called by a function in this GSC, it must be defined here in the imports section to import it into use, even if that function is located inside this GSC. First the struct

struct GSC_IMPORT_ITEM
{
  unsigned __int16 name;
  unsigned __int16 name_space;
  unsigned __int16 num_address;
  char param_count;
  char flags;
  unsigned int references[num_address];
};

name holds the name of the called function. References are explained in the code section. If the function is not located in this GSC or in a "using" GSC, then the function must be given a path with the name of the GSC that it is defined in, and this is what name_space points to, e.g.

thread maps\mp\gametypes\_spawning::init();

Exports Section

This is the section that will be cycled through when dumping the GSC. Every function that is defined in this GSC is here, and then any incoming import requests search here for the offset of code needed to continue execution.

struct GSC_EXPORT_ITEM
{
  unsigned int checksum;
  unsigned int address;
  unsigned __int16 name;
  char param_count;
  char flags;
};

The address field is an offset in the code section defining where this function begins.

Executable Data

This section is by far the trickiest section. This is where the actual code the functions are compiled into is stored. The first thing that will be needed is a list of operation codes (OPcodes). These opcodes should be the same for all systems.

enum OPCodes : unsigned char
{
  OP_End = 0x0,
  OP_Return = 0x1,
  OP_GetUndefined = 0x2,
  OP_GetZero = 0x3,
  OP_GetByte = 0x4,
  OP_GetNegByte = 0x5,
  OP_GetUnsignedShort = 0x6,
  OP_GetNegUnsignedShort = 0x7,
  OP_GetInteger = 0x8,
  OP_GetFloat = 0x9,
  OP_GetString = 0xA,
  OP_GetIString = 0xB,
  OP_GetVector = 0xC,
  OP_GetLevelObject = 0xD,
  OP_GetAnimObject = 0xE,
  OP_GetSelf = 0xF,
  OP_GetLevel = 0x10,
  OP_GetGame = 0x11,
  OP_GetAnim = 0x12,
  OP_GetAnimation = 0x13,
  OP_GetGameRef = 0x14,
  OP_GetFunction = 0x15,
  OP_CreateLocalVariable = 0x16,
  OP_SafeCreateLocalVariables = 0x17,
  OP_RemoveLocalVariables = 0x18,
  OP_EvalLocalVariableCached = 0x19,
  OP_EvalArray = 0x1A,
  OP_EvalLocalArrayRefCached = 0x1B,
  OP_EvalArrayRef = 0x1C,
  OP_ClearArray = 0x1D,
  OP_EmptyArray = 0x1E,
  OP_GetSelfObject = 0x1F,
  OP_EvalFieldVariable = 0x20,
  OP_EvalFieldVariableRef = 0x21,
  OP_ClearFieldVariable = 0x22,
  OP_SafeSetVariableFieldCached = 0x23,
  OP_SafeSetWaittillVariableFieldCached = 0x24,
  OP_ClearParams = 0x25,
  OP_CheckClearParams = 0x26,
  OP_EvalLocalVariableRefCached = 0x27,
  OP_SetVariableField = 0x28,
  OP_CallBuiltin = 0x29,
  OP_CallBuiltinMethod = 0x2A,
  OP_Wait = 0x2B,
  OP_WaitTillFrameEnd = 0x2C,
  OP_PreScriptCall = 0x2D,
  OP_ScriptFunctionCall = 0x2E,
  OP_ScriptFunctionCallPointer = 0x2F,
  OP_ScriptMethodCall = 0x30,
  OP_ScriptMethodCallPointer = 0x31,
  OP_ScriptThreadCall = 0x32,
  OP_ScriptThreadCallPointer = 0x33,
  OP_ScriptMethodThreadCall = 0x34,
  OP_ScriptMethodThreadCallPointer = 0x35,
  OP_DecTop = 0x36,
  OP_CastFieldObject = 0x37,
  OP_CastBool = 0x38,
  OP_BoolNot = 0x39,
  OP_BoolComplement = 0x3A,
  OP_JumpOnFalse = 0x3B,
  OP_JumpOnTrue = 0x3C,
  OP_JumpOnFalseExpr = 0x3D,
  OP_JumpOnTrueExpr = 0x3E,
  OP_Jump = 0x3F,
  OP_JumpBack = 0x40,
  OP_Inc = 0x41,
  OP_Dec = 0x42,
  OP_Bit_Or = 0x43,
  OP_Bit_Xor = 0x44,
  OP_Bit_And = 0x45,
  OP_Equal = 0x46,
  OP_NotEqual = 0x47,
  OP_LessThan = 0x48,
  OP_GreaterThan = 0x49,
  OP_LessThanOrEqualTo = 0x4A,
  OP_GreaterThanOrEqualTo = 0x4B,
  OP_ShiftLeft = 0x4C,
  OP_ShiftRight = 0x4D,
  OP_Plus = 0x4E,
  OP_Minus = 0x4F,
  OP_Multiply = 0x50,
  OP_Divide = 0x51,
  OP_Modulus = 0x52,
  OP_SizeOf = 0x53,
  OP_WaitTillMatch = 0x54,
  OP_WaitTill = 0x55,
  OP_Notify = 0x56,
  OP_EndOn = 0x57,
  OP_VoidCodePos = 0x58,
  OP_Switch = 0x59,
  OP_EndSwitch = 0x5A,
  OP_Vector = 0x5B,
  OP_GetHash = 0x5C,
  OP_RealWait = 0x5D,
  OP_VectorConstant = 0x5E,
  OP_IsDefined = 0x5F,
  OP_VectorScale = 0x60,
  OP_AnglesToUp = 0x61,
  OP_AnglesToRight = 0x62,
  OP_AnglesToForward = 0x63,
  OP_AngleClamp180 = 0x64,
  OP_VectorToAngles = 0x65,
  OP_Abs = 0x66,
  OP_GetTime = 0x67,
  OP_GetDvar = 0x68,
  OP_GetDvarInt = 0x69,
  OP_GetDvarFloat = 0x6A,
  OP_GetDvarVector = 0x6B,
  OP_GetDvarColorRed = 0x6C,
  OP_GetDvarColorGreen = 0x6D,
  OP_GetDvarColorBlue = 0x6E,
  OP_GetDvarColorAlpha = 0x6F,
  OP_FirstArrayKey = 0x70,
  OP_NextArrayKey = 0x71,
  OP_ProfileStart = 0x72,
  OP_ProfileStop = 0x73,
  OP_SafeDecTop = 0x74,
  OP_Nop = 0x75,
  OP_Abort = 0x76,
  OP_Object = 0x77,
  OP_ThreadObject = 0x78,
  OP_EvalLocalVariable = 0x79,
  OP_EvalLocalVariableRef = 0x7A,
  OP_DevblockBegin = 0x7B,
  OP_DevblockEnd = 0x7C,
  OP_Breakpoint = 0x7D,
  OP_AutoBreakpoint = 0x7E,
  OP_ErrorBreakpoint = 0x7F,
  OP_WatchBreakpoint = 0x80,
  OP_NotifyBreakpoint = 0x81,
  OP_Count = 0x82,
};

Notice that 0x7B is the highest opcode, but 0x7F is still valid. In truth, anything over 0x7B will register as NOP. The executable does some odd jumping around for each OP code, so a recap of each OP code is given below where possible. In the following descriptions, "currentFunctionDataOffset" is defined as an integer that points to the current OP code being read. Keep in mind that an OP code is read, the data offset is increased by 1, and then the following code is run. OP codes are parsed in the order they are read, but if a dump is being done then parts are going to be dumped in reverse order. Here is an example of a series of OP codes...

OP_PreScriptCall
OP_GetString		"1350"
OP_GetString		"scr_veh_health_tank"
OP_CallBuiltin 		2, "setdvar"
OP_clearparams

This would be the "asm" of the GSC, and would look like this...

setdvar( "scr_veh_health_tank", "1350" );
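The "dumped in reverse order" idea can be sketched with a small expression stack: value opcodes push text, and the call opcode pops its arguments, so the most recently pushed value becomes the first argument. Everything here (ExprStack, push, call_builtin, the fixed sizes) is hypothetical scaffolding for illustration, not how any particular dumper is written.

```c
#include <stdio.h>
#include <string.h>

#define MAX_STACK 16
#define MAX_EXPR  128

/* A tiny stack of already-formatted expressions. */
typedef struct {
    char items[MAX_STACK][MAX_EXPR];
    int  count;
} ExprStack;

static void push(ExprStack *s, const char *text)
{
    if (s->count < MAX_STACK) {
        snprintf(s->items[s->count], MAX_EXPR, "%s", text);
        s->count++;
    }
}

/* Pops `argc` entries and pushes back "name( a, b )". The entry on top
   of the stack is printed as the first argument. */
static void call_builtin(ExprStack *s, const char *name, int argc)
{
    char out[MAX_EXPR];
    size_t used;
    int i;

    if (argc > s->count)
        argc = s->count;
    snprintf(out, sizeof out, "%s(", name);
    for (i = 0; i < argc; ++i) {
        used = strlen(out);
        s->count--;
        snprintf(out + used, sizeof out - used, "%s%s",
                 i ? ", " : " ", s->items[s->count]);
    }
    used = strlen(out);
    snprintf(out + used, sizeof out - used, "%s", argc ? " )" : ")");
    push(s, out);
}
```

Feeding it the sequence above (push "1350", push "scr_veh_health_tank", then call_builtin with "setdvar" and 2) leaves the setdvar call shown above on the stack, ready to be printed with a trailing ";".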

OP_End

If FunctionDeclaration->crc32 equals the returned crc, then break. The CRC32 is calculated as follows:

static const unsigned int kCrc32Table[] = {
    0x00000000, 0x77073096, 0xee0e612c, 0x990951ba, 0x076dc419, 0x706af48f,
	0xe963a535, 0x9e6495a3,	0x0edb8832, 0x79dcb8a4, 0xe0d5e91e, 0x97d2d988,
	0x09b64c2b, 0x7eb17cbd, 0xe7b82d07, 0x90bf1d91, 0x1db71064, 0x6ab020f2,
	0xf3b97148, 0x84be41de,	0x1adad47d, 0x6ddde4eb, 0xf4d4b551, 0x83d385c7,
	0x136c9856, 0x646ba8c0, 0xfd62f97a, 0x8a65c9ec,	0x14015c4f, 0x63066cd9,
	0xfa0f3d63, 0x8d080df5,	0x3b6e20c8, 0x4c69105e, 0xd56041e4, 0xa2677172,
	0x3c03e4d1, 0x4b04d447, 0xd20d85fd, 0xa50ab56b,	0x35b5a8fa, 0x42b2986c,
	0xdbbbc9d6, 0xacbcf940,	0x32d86ce3, 0x45df5c75, 0xdcd60dcf, 0xabd13d59,
	0x26d930ac, 0x51de003a, 0xc8d75180, 0xbfd06116, 0x21b4f4b5, 0x56b3c423,
	0xcfba9599, 0xb8bda50f, 0x2802b89e, 0x5f058808, 0xc60cd9b2, 0xb10be924,
	0x2f6f7c87, 0x58684c11, 0xc1611dab, 0xb6662d3d,	0x76dc4190, 0x01db7106,
	0x98d220bc, 0xefd5102a, 0x71b18589, 0x06b6b51f, 0x9fbfe4a5, 0xe8b8d433,
	0x7807c9a2, 0x0f00f934, 0x9609a88e, 0xe10e9818, 0x7f6a0dbb, 0x086d3d2d,
	0x91646c97, 0xe6635c01, 0x6b6b51f4, 0x1c6c6162, 0x856530d8, 0xf262004e,
	0x6c0695ed, 0x1b01a57b, 0x8208f4c1, 0xf50fc457, 0x65b0d9c6, 0x12b7e950,
	0x8bbeb8ea, 0xfcb9887c, 0x62dd1ddf, 0x15da2d49, 0x8cd37cf3, 0xfbd44c65,
	0x4db26158, 0x3ab551ce, 0xa3bc0074, 0xd4bb30e2, 0x4adfa541, 0x3dd895d7,
	0xa4d1c46d, 0xd3d6f4fb, 0x4369e96a, 0x346ed9fc, 0xad678846, 0xda60b8d0,
	0x44042d73, 0x33031de5, 0xaa0a4c5f, 0xdd0d7cc9, 0x5005713c, 0x270241aa,
	0xbe0b1010, 0xc90c2086, 0x5768b525, 0x206f85b3, 0xb966d409, 0xce61e49f,
	0x5edef90e, 0x29d9c998, 0xb0d09822, 0xc7d7a8b4, 0x59b33d17, 0x2eb40d81,
	0xb7bd5c3b, 0xc0ba6cad, 0xedb88320, 0x9abfb3b6, 0x03b6e20c, 0x74b1d29a,
	0xead54739, 0x9dd277af, 0x04db2615, 0x73dc1683, 0xe3630b12, 0x94643b84,
	0x0d6d6a3e, 0x7a6a5aa8, 0xe40ecf0b, 0x9309ff9d, 0x0a00ae27, 0x7d079eb1,
	0xf00f9344, 0x8708a3d2, 0x1e01f268, 0x6906c2fe, 0xf762575d, 0x806567cb,
	0x196c3671, 0x6e6b06e7, 0xfed41b76, 0x89d32be0, 0x10da7a5a, 0x67dd4acc,
	0xf9b9df6f, 0x8ebeeff9, 0x17b7be43, 0x60b08ed5, 0xd6d6a3e8, 0xa1d1937e,
	0x38d8c2c4, 0x4fdff252, 0xd1bb67f1, 0xa6bc5767, 0x3fb506dd, 0x48b2364b,
	0xd80d2bda, 0xaf0a1b4c, 0x36034af6, 0x41047a60, 0xdf60efc3, 0xa867df55,
	0x316e8eef, 0x4669be79, 0xcb61b38c, 0xbc66831a, 0x256fd2a0, 0x5268e236,
	0xcc0c7795, 0xbb0b4703, 0x220216b9, 0x5505262f, 0xc5ba3bbe, 0xb2bd0b28,
	0x2bb45a92, 0x5cb36a04, 0xc2d7ffa7, 0xb5d0cf31, 0x2cd99e8b, 0x5bdeae1d,
	0x9b64c2b0, 0xec63f226, 0x756aa39c, 0x026d930a, 0x9c0906a9, 0xeb0e363f,
	0x72076785, 0x05005713, 0x95bf4a82, 0xe2b87a14, 0x7bb12bae, 0x0cb61b38,
	0x92d28e9b, 0xe5d5be0d, 0x7cdcefb7, 0x0bdbdf21, 0x86d3d2d4, 0xf1d4e242,
	0x68ddb3f8, 0x1fda836e, 0x81be16cd, 0xf6b9265b, 0x6fb077e1, 0x18b74777,
	0x88085ae6, 0xff0f6a70, 0x66063bca, 0x11010b5c, 0x8f659eff, 0xf862ae69,
	0x616bffd3, 0x166ccf45, 0xa00ae278, 0xd70dd2ee, 0x4e048354, 0x3903b3c2,
	0xa7672661, 0xd06016f7, 0x4969474d, 0x3e6e77db, 0xaed16a4a, 0xd9d65adc,
	0x40df0b66, 0x37d83bf0, 0xa9bcae53, 0xdebb9ec5, 0x47b2cf7f, 0x30b5ffe9,
	0xbdbdf21c, 0xcabac28a, 0x53b39330, 0x24b4a3a6, 0xbad03605, 0xcdd70693,
	0x54de5729, 0x23d967bf, 0xb3667a2e, 0xc4614ab8, 0x5d681b02, 0x2a6f2b94,
	0xb40bbe37, 0xc30c8ea1, 0x5a05df1b, 0x2d02ef8d
};
 
unsigned int crc(unsigned char* pData, unsigned int length) {
    unsigned char* pCur = (unsigned char*)pData;
    unsigned int remaining = length, _crc = ~0;

    /* Standard table-driven update, one byte at a time. */
    for (; remaining--; ++pCur)
        _crc = (_crc >> 8) ^ kCrc32Table[(_crc ^ *pCur) & 0xff];

    return ~_crc;
}

The CRC32 is computed as crc(functionStart, currentFunctionDataOffset - functionStart). If the returned crc does not equal FunctionDeclaration->crc32, then write "return;" to the stack.
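The table above looks like the standard reflected CRC-32 table (polynomial 0xEDB88320), and the crc function uses the usual initial value and final inversion. Assuming that is right, a table-free bitwise version should give identical results and is handy for cross-checking a copied table. The name crc32_bitwise is made up for this sketch.

```c
#include <stddef.h>
#include <stdint.h>

/* Bitwise CRC-32: reflected polynomial 0xEDB88320, initial value
   0xFFFFFFFF, result inverted at the end. Assumed to match the
   table-driven crc() above. */
uint32_t crc32_bitwise(const unsigned char *data, size_t length)
{
    uint32_t crc = 0xFFFFFFFFu;
    size_t i;
    int bit;

    for (i = 0; i < length; ++i) {
        crc ^= data[i];
        for (bit = 0; bit < 8; ++bit) {
            /* XOR in the polynomial only when the low bit is set. */
            crc = (crc >> 1) ^ (0xEDB88320u & (0u - (crc & 1u)));
        }
    }
    return ~crc;
}
```

The well-known check value for this CRC variant is 0xCBF43926 for the nine ASCII bytes "123456789", which makes a quick sanity test.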

OP_Return

The return OP is exactly as expected. Any OP codes in the stack go after a "return". There is no data skipped; it is simply a 1 byte opcode. Keep in mind that when the game reaches a return it leaves the function. For a complete dump, however, data must continue to be read, and sometimes this data is unnecessary.

OP_GetUndefined & OP_GetZero

This simply represents the "undefined"/0 element in GSCs. It is a simple 1 byte opcode, and once read it is put right onto the stack. When dumping from the stack, simply print "undefined"/0 and continue.

OP_GetByte & OP_GetNegByte

These are used to hold any numbers that are small enough to be held in 1 byte. These are 2 byte opcodes in that the first byte is the opcode and the second byte is an unsigned value. The opcode is added to the stack once read. When dumping from the stack the value is simply printed and then continued. Keep in mind that OP_GetNegByte should get a negative symbol before the value is printed.

OP_GetUnsignedShort & OP_GetNegUnsignedShort

These are used to hold any numbers that are small enough to be held in 2 bytes. After reading the opcode, the following is used to determine the offset of the unsigned short value

(++currentFunctionDataOffset) &= 0xFFFFFFFE;

Then "currentFunctionDataOffset" will be the offset of the unsigned short value to use. Then skip another 2 bytes for that value to continue reading. The opcode is added to the stack once read. When dumping from the stack the value is simply printed and then continued. Keep in mind that OP_GetNegUnsignedShort should get a negative symbol before the value is printed.
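The alignment expression shows up for every opcode with a 2-byte operand, so it may be clearer as a helper. In this sketch pos is the offset just past the opcode byte (i.e. the value of "currentFunctionDataOffset" after the opcode has been read); the helper names are hypothetical.

```c
#include <stddef.h>

/* Equivalent of "(++currentFunctionDataOffset) &= 0xFFFFFFFE":
   rounds `pos` up to the next even offset, which is where the
   2-byte operand is stored. */
size_t gsc_align2(size_t pos)
{
    return (pos + 1) & ~(size_t)1;
}

/* Offset of the next opcode after an opcode with a 2-byte operand. */
size_t gsc_skip_short_operand(size_t pos)
{
    return gsc_align2(pos) + 2;
}
```

For example, if the opcode byte sits at 0x10, pos is 0x11 and the operand is read from 0x12; if the opcode sits at 0x11, pos is already 0x12 and no padding byte is skipped.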

OP_GetInteger & OP_GetFloat

OP_GetInteger is used for any numbers that are large enough to warrant a signed integer, while OP_GetFloat is used for bigger numbers and decimals. After reading the opcode, the following is used to determine the offset of the value to get

currentFunctionDataOffset += 3;
currentFunctionDataOffset &= 0xFFFFFFFC;

At this point "currentFunctionDataOffset" will be the offset of the value to get, then skip another 4 bytes past the value to continue reading. The opcode is then added to the stack. When dumping, the value is simply printed and then continued.
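A sketch of the same steps for OP_GetFloat follows: align to 4 bytes, read the value, then step past it. Big-endian byte order and a 32-bit IEEE-754 float are assumptions, and the function names are hypothetical. OP_GetInteger would be read the same way, minus the float conversion.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Equivalent of "+= 3; &= 0xFFFFFFFC": rounds the offset just past
   the opcode up to the next 4-byte boundary. */
size_t gsc_align4(size_t pos)
{
    return (pos + 3) & ~(size_t)3;
}

/* Reads the 4-byte operand of OP_GetFloat and advances *pos past it.
   Assumptions: big-endian storage, 32-bit IEEE-754 float on the host.
   Returns 1 on success, 0 if the operand would run past the buffer. */
int gsc_read_float_operand(const unsigned char *code, size_t len,
                           size_t *pos, float *out)
{
    size_t at = gsc_align4(*pos);
    uint32_t bits;

    if (at > len || len - at < 4)
        return 0;
    bits = ((uint32_t)code[at] << 24) | ((uint32_t)code[at + 1] << 16) |
           ((uint32_t)code[at + 2] << 8) | (uint32_t)code[at + 3];
    memcpy(out, &bits, sizeof *out);   /* reinterpret the bits as a float */
    *pos = at + 4;
    return 1;
}
```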

OP_GetString & OP_GetIString

These are very clearly used to reference strings in code. Strings are found by their references. To get the offset of the reference use the following

(++currentFunctionDataOffset) &= 0xFFFFFFFE;

In the argument section there should be an argument with a reference that is equal to "currentFunctionDataOffset" at this point. The name of the argument is the offset of the string to use in the string pool from the beginning of the GSC. Add another 2 to currentFunctionDataOffset to skip past the reference to continue reading. Keep in mind that OP_GetIString is used to get localized strings, and an & should be added before the quotes.

OP_GetLevelObject & OP_GetLevel & OP_GetSelf & OP_GetSelfObject & OP_GetGame & OP_GetGameRef & OP_GetAnimObject

All of these are simple 1 byte opcodes that are added to the stack once they are read. They represent (and should be printed as) the "game", "level", "anim" and "self" objects in GSC code.

OP_GetAnimation

These are used to reference animations in an animtree that was included to this GSC. The name of the animation is found by a reference. To get the offset of the reference

currentFunctionDataOffset += 3;
currentFunctionDataOffset &= 0xFFFFFFFC;

There should be an animtree in the animtree include section that has an animation with a reference equal to "currentFunctionDataOffset" at this point. Add another 4 bytes to skip past the reference to continue reading opcodes. After this is read it is added to the stack. When dumping, the referenced animation name should be found and dumped (without quotes) preceded by a % sign.

OP_GetFunction

This is used to reference a function without calling it. This allows you to set a variable to a function, and then call the variable later. There should be a function call with a reference equal to "currentFunctionDataOffset", and this is the function to use. This is then added to the stack. After that the following code is performed

currentFunctionDataOffset += 3;
currentFunctionDataOffset &= 0xFFFFFFFC;
currentFunctionDataOffset += 4;

to continue reading. When dumping the function name should be printed after "::".

OP_CreateLocalVariable

This opcode is used at the beginning of any function with arguments or local variables to define them. After the opcode is a single byte that is the number to define. Each variable argument is found by getting the reference for it in the argument section. To get the offset of each reference

(++currentFunctionDataOffset) &= 0xFFFFFFFE;

This is run first to find a reference offset. There should be an argument with a reference equal to "currentFunctionDataOffset". Then 2 is added to skip past the current reference before the above is started again. The function declaration states the number of arguments; any extra are local variables. Keep in mind the stack should be clear after this is run.

OP_EvalLocalVariableCached & OP_EvalLocalVariableRefCached

This is a simple 2 byte opcode used to reference an argument or local variable. The second byte is the index of the local variable to use, defined by OP_CreateLocalVariable. This is then added to the stack to be dumped. To dump it, the local variable must be looked up, the argument found and the name referenced.

OP_EvalArray & OP_EvalArrayRef

These are simple 1 byte opcodes used to reference elements in an array. It is simply added to the stack to be dumped. When dumping, the next entity on the stack is the array to use and the next entity is the index in the array (Indexes are not always numbers).

OP_ClearArray

This is a 1 byte opcode used to set an array element to "undefined". This is not added to the stack, instead the stack is processed. The first entity on the stack is the array to use and the second is the array index. This element is then set to undefined. At the end of this, the stack should be empty.

OP_EmptyArray

This is a simple 1 byte opcode used to denote an empty array ("[]"). It is added right to the stack and simply printed when dumping.

OP_EvalFieldVariable & OP_EvalFieldVariableRef

These opcodes are used to reference any field variable. The name of the field variable is found by the offset of a reference. To find the offset of the reference

(++currentFunctionDataOffset) &= 0xFFFFFFFE;

There should then be an argument in the argument section that has a reference equal to "currentFunctionDataOffset". Skip another 2 bytes past the reference to continue reading opcodes. The opcode is then added to the stack to be dumped later. The next entity on the stack is the item the field variable is for.

OP_ClearFieldVariable

This opcode is used to set a field variable element to "undefined". The name of the field variable is found by the offset of a reference. To find the offset of a reference

(++currentFunctionDataOffset) &= 0xFFFFFFFE;

There should then be an argument in the argument section that has a reference equal to "currentFunctionDataOffset". Skip another 2 bytes past the reference to continue reading opcodes. This opcode is not added to the stack, instead the stack is processed. The only entity on the stack should be the item the field variable is for. Once this is processed the stack should be empty.

OP_checkclearparams & OP_clearparams

Used at the beginning of a function with no arguments. Clears the stack without processing it.

OP_SetVariableField

This is a 1 byte opcode used to set an entity to anything else. This is not added to the stack, instead the stack is processed. The first entity on the stack is the entity to set (left side of the "=" sign) with the second being the entity setting with (right side). At the end of this, the stack should be empty.

OP_CallBuiltin

This opcode is used to call a GSC function that is predefined in the executable. The name of the GSC function is found by a reference. The offset of the reference is actually the offset of this opcode. In the function call section should be a definition with a reference to this opcode. This opcode is then added to the stack to be dumped. When dumping, there should be an entity on the stack for each expected argument. To get to the next opcode, perform the following

currentFunctionDataOffset += 4;
currentFunctionDataOffset &= 0xFFFFFFFC;
currentFunctionDataOffset += 4;

Before adding the last 4, "currentFunctionDataOffset" will point to a value that holds the offset of the function to use in memory.

OP_ScriptMethodCall

This opcode is used to call a GSC function on an entity that is defined in another GSC. The name of the GSC function is found by a reference. The offset of the reference is actually the offset of this opcode. In the function call section should be a definition with a reference to this opcode. This opcode is then added to the stack to be dumped. When dumping, the first entity on the stack will be the entity the function is being called upon, followed by an entity for each argument in the function. To get to the next opcode, perform the following

currentFunctionDataOffset += 4;
currentFunctionDataOffset &= 0xFFFFFFFC;
currentFunctionDataOffset += 4;

Before adding the last 4, "currentFunctionDataOffset" will point to a value that holds the offset of the function to use in memory.

OP_wait

This is a simple 1 byte opcode used for the "wait()" function. This is not added to the stack, instead the stack is processed. The first (and only) entity on the stack should be the amount of time to wait (in GSC source, wait takes a value in seconds). At the end of this, the stack should be empty.

OP_waittillframeend

This is a simple 1 byte opcode used for "waittillframeend". This is not added to the stack, and is always run with an empty stack.

OP_PreScriptCall

This is a simple 1 byte opcode used before all instructions and is equivalent to NOP.

OP_Script Call Functions

These opcodes are used to call a function and are very similar. The names of the function to call are found by a reference, and for all of these opcodes the reference is the offset of the opcode. "Method" opcodes have an extra entity on the stack as the entity the function is called upon. "Thread" opcodes are threading the function they call. The "Pointer" opcodes do not have a reference with the function's name, instead the function is a variable and an extra entity is on the stack. There will be an entity on the stack for each expected argument. To read past OP_ScriptFunctionCall, OP_ScriptMethodCall, OP_ScriptMethodCall2, and OP_ScriptMethodThreadCall, do

currentFunctionDataOffset += 4;
currentFunctionDataOffset &= 0xFFFFFFFC;
currentFunctionDataOffset += 4;

The "Pointer" opcodes are all 2 byte opcodes. All these opcodes are added to the stack to be dumped.

OP_clearparams

This is a 1 byte opcode and is used most often to process the stack. The stack is simply processed and should be clear by the end of this running.

OP_CastFieldObject

Not very well understood, skipping it seems to work fine.

OP_BoolNot

This is a simple 1 byte opcode used for the logical NOT operator ("!"). This is simply added to the stack to be dumped. When dumping, the next entity on the stack is the value to invert.

OP_jump Codes

Rather than have logical AND and OR operators in this language, it is similar to PPC in that there are "branches" or "jumps". Rather than using an AND operator, a series of conditional jumps are used to produce the same result. OP_jump is a simple absolute jump value. After reading the opcode,

(++currentFunctionDataOffset) &= 0xFFFFFFFE;

At this point "currentFunctionDataOffset" will point to a "short" data value. This will be an offset in the GSC to jump to (Usually just a few bytes away). Skip another 2 bytes past this offset to continue reading opcodes. This opcode is then added to the stack to be dumped. The next entity in the stack is the condition to jump on.

Logical and Bitwise Operators

OP_inc & OP_dec are skipped by performing this operation

(++currentFunctionDataOffset) &= 0xFFFFFFFE;
currentFunctionDataOffset += 2;

They are added to the stack to be dumped. When dumping, the next member of the stack is the entity to shift, followed by "++" for increase and "--" for decrease.

The next few operators are all 1 byte operators. They are simply added to the stack to be dumped. When dumping, the next entity on the stack is the right side of the operator with the next entity being the left side.

Opcode             Operator
OP_bit_or          |
OP_bit_and         &
OP_equality        ==
OP_inequality      !=
OP_less            <
OP_greater         >
OP_less_equal      <=
OP_greater_equal   >=
OP_plus            +
OP_minus           -
OP_multiply        *
OP_divide          /
OP_mod             %
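For a dumper, the table boils down to a lookup from opcode byte to operator text. The sketch below uses the numeric values from the OPCodes enum earlier on this page, on the assumption that the names in the table correspond to the similarly named enum entries (e.g. OP_equality to OP_Equal, OP_mod to OP_Modulus). The function name is hypothetical.

```c
#include <stddef.h>

/* Returns the operator text for a binary-operator opcode, or NULL if
   the opcode is not one of the operators listed in the table.
   Values are taken from the OPCodes enum; the name mapping is assumed. */
const char *gsc_binary_operator(unsigned char opcode)
{
    switch (opcode) {
    case 0x43: return "|";   /* OP_Bit_Or */
    case 0x45: return "&";   /* OP_Bit_And */
    case 0x46: return "==";  /* OP_Equal */
    case 0x47: return "!=";  /* OP_NotEqual */
    case 0x48: return "<";   /* OP_LessThan */
    case 0x49: return ">";   /* OP_GreaterThan */
    case 0x4A: return "<=";  /* OP_LessThanOrEqualTo */
    case 0x4B: return ">=";  /* OP_GreaterThanOrEqualTo */
    case 0x4E: return "+";   /* OP_Plus */
    case 0x4F: return "-";   /* OP_Minus */
    case 0x50: return "*";   /* OP_Multiply */
    case 0x51: return "/";   /* OP_Divide */
    case 0x52: return "%";   /* OP_Modulus */
    default:   return NULL;
    }
}
```

When dumping, the right side is popped first and the left side second, and the result is printed as "left operator right".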

OP_size

This opcode is used to get the ".size" of an array entity. Simply added to the stack for dumping, the next stack entity is the array entity.

OP_waittillmatch & OP_waittill & OP_notify

These 1 byte opcodes are used in combination with OP_endon for cross-thread syncing. Rather than being added to the stack, this processes the stack. The first entity on the stack will be the entity the function is being referenced to. The constant strings are next on the stack and should be the only remaining items on the stack. Following these opcodes is an OP_SafeSetWaittillVariableFieldCached for each variable to pass (used to pass data at the point of the sync). This is a 2 byte opcode, with the second byte being the index of the local variable defined at the beginning of the function.

OP_endon

This 1 byte opcode is used in combination with the above codes for cross-thread syncing. Rather than being added to the stack, this processes the stack. The first entity on the stack will be the entity this function is being referenced to. The constant strings to end on are then placed on the stack.

OP_voidCodepos

Equivalent to NOP.

OP_switch & OP_endswitch

When a switch is used by the game, only the needed case is run at a time. However when dumping, all cases must be processed. There are a few points of interest when using the switch, the first is found by doing

currentFunctionDataOffset += 3;
currentFunctionDataOffset &= 0xFFFFFFFC;

At this point "currentFunctionDataOffset" is a partial pointer to an integer, the case definition location (The switch case code is directly after this integer, with the case definitions directly after that). After setting "currentFunctionDataOffset" to this integer, perform this to get to the actual case definitions

currentFunctionDataOffset += 7;
currentFunctionDataOffset &= 0xFFFFFFFC;

Now "currentFunctionDataOffset" will point to directly after the switch case code. The first integer is the switch case count. Following this are 8 bytes for each case in the switch. The first short in these 8 bytes is the "switch case type". There are currently 2 known switch types, integer (0x80) and string (0). If the type is integer, then the short after the type is the index for this case. If the type is a string, then there will be an argument in the argument section with the offset of the short after the type. The 4 bytes after these 2 shorts are the offset for the section of code for this case. If the type is 0 and the short for the reference is 0, then this is the "default" case. The OP_endswitch is found at the end of the section of code for this switch, just before the case definitions. It is a 1 byte opcode, and after reading this the switch should be ended and the case definitions skipped.
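The case-type rules can be written down as a small classifier. This sketch takes the two shorts of a case entry already decoded (so it makes no byte-order assumption) and also shows how many bytes the case definitions occupy. The enum and function names are hypothetical.

```c
#include <stddef.h>
#include <stdint.h>

typedef enum {
    GSC_CASE_INTEGER,   /* type 0x80: second short is the case index */
    GSC_CASE_STRING,    /* type 0: second short is a string reference */
    GSC_CASE_DEFAULT,   /* type 0 and reference 0 */
    GSC_CASE_UNKNOWN    /* any other type value */
} GscCaseKind;

/* Classifies one 8-byte case entry from its first two shorts. */
GscCaseKind gsc_classify_case(uint16_t type, uint16_t value)
{
    if (type == 0x80)
        return GSC_CASE_INTEGER;
    if (type == 0)
        return value == 0 ? GSC_CASE_DEFAULT : GSC_CASE_STRING;
    return GSC_CASE_UNKNOWN;
}

/* Size in bytes of the case definitions: the 4-byte case count
   followed by 8 bytes per case. */
size_t gsc_case_table_size(uint32_t case_count)
{
    return 4u + 8u * (size_t)case_count;
}
```

After OP_endswitch is read, skipping gsc_case_table_size(count) bytes from the aligned start of the case definitions should land on the next opcode.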

OP_vector

This is a simple 1 byte opcode. Vectors are a standard l-type in GSC and simply defined in raw text by doing (number1, number2, number3). Once this opcode is read it is added to the stack to be dumped. When dumping there must be at least 3 more entities on the stack, one for each item in the vector.

OP_GetHash

This opcode is used to reference dvar entities in GSC. In GSC when referencing a dvar it appears as #"dvarName". Rather than storing the dvarName directly, the hash is stored and any dvar with a matching hash is used. To get the hash, perform

currentFunctionDataOffset += 3;
currentFunctionDataOffset &= 0xFFFFFFFC;

Then "currentFunctionDataOffset" will point to an integer that is the dvar's hash. This is then added to the stack to be dumped. To continue reading opcodes, skip another 4 bytes past the hash.

The hash used is the DJB2 hash algorithm, applied to the lowercased dvar name. This is the implementation:

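#include <ctype.h>  /* tolower */

unsigned int DvarHash(const char* dvar) {
    unsigned int hash = 5381;

    /* Each character is lowercased first, so the hash is case-insensitive. */
    while (*dvar != 0)
        hash = (33 * hash) + tolower((unsigned char)*dvar++);

    return hash;
}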

OP_GetSimpleVector

This 2 byte opcode is used to store very simple vectors. The second byte in the opcode is the flags for the vector elements. Each element in the vector is 0 by default. The first element is "-1" if the bitflag 0x10 is set, and "1" if 0x20 is set. The second element is "-1" if the bitflag 4 is set, and "1" if 8 is set. The third element is "-1" if the bitflag 1 is set, and "1" if 2 is set.
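The flag layout can be decoded with one small helper per component. This is a sketch with hypothetical function names; the text does not say what happens if both bits of a pair are set, so giving the "-1" bit priority here is purely an assumption.

```c
/* Decodes one component from its "negative" and "positive" flag bits.
   Assumption: if both bits are set, the negative bit wins. */
static int gsc_vector_component(unsigned flags, unsigned neg_bit,
                                unsigned pos_bit)
{
    if (flags & neg_bit)
        return -1;
    if (flags & pos_bit)
        return 1;
    return 0;
}

/* Expands the flags byte of OP_GetSimpleVector into three components. */
void gsc_decode_simple_vector(unsigned char flags, int out[3])
{
    out[0] = gsc_vector_component(flags, 0x10, 0x20);
    out[1] = gsc_vector_component(flags, 0x04, 0x08);
    out[2] = gsc_vector_component(flags, 0x01, 0x02);
}
```

For instance, a flags byte of 0x24 (0x20 | 0x04) decodes to the vector (1, -1, 0).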

OP_isdefined

This is a simple 1 byte opcode that is added to the stack once it's read. It is used for the "isDefined()" function. The next entity on the stack is the entity to check for existence.

OP_vectorscale

This is a simple 1 byte opcode that is added to the stack once it's read. It is used for the "vector_scale()" function. The next entity on the stack is the vector entity to scale. The following entity is the scalar entity.

OP_anglestoup

This is a simple 1 byte opcode that is added to the stack once it's read. It is used for the "anglesToUp()" function. The next entity on the stack is the angles to convert.

OP_anglestoright

This is a simple 1 byte opcode that is added to the stack once it's read. It is used for the "anglesToRight()" function. The next entity on the stack is the angles to convert.

OP_anglestoforward

This is a simple 1 byte opcode that is added to the stack once it's read. It is used for the "anglesToForward()" function. The next entity on the stack is the angles to convert.

OP_angleclamp180

This is a simple 1 byte opcode that is added to the stack once it's read. It is used for the "angleClamp180()" function. The next entity on the stack is the angle to clamp.

OP_vectorstoangle

This is a simple 1 byte opcode that is added to the stack once it's read. It is used for the "vectorsToAngle()" function. The next entity on the stack is the vector to convert.

OP_abs

This is a simple 1 byte opcode that is added to the stack once it's read. It is used for the "abs()" or "absolute value" function. The next entity on the stack is the integer to convert.

OP_gettime

This is a simple 1 byte opcode that is added to the stack once it's read. It is used for the "getTime()" function.

OP_getdvar

This is a simple 1 byte opcode that is added to the stack once it's read. It is used for the "getDvar()" function. The next entity on the stack is the dvar to get, typically an OP_GetHash for the dvar's hash.

OP_getdvarint

This is a simple 1 byte opcode that is added to the stack once it's read. It is used for the "getDvarInt()" function. The next entity on the stack is the dvar to get, typically an OP_GetHash for the dvar's hash.

OP_getdvarfloat

This is a simple 1 byte opcode that is added to the stack once it's read. It is used for the "getDvarFloat()" function. The next entity on the stack is the dvar to get, typically an OP_GetHash for the dvar's hash.

OP_getdvarred

This is a simple 1 byte opcode that is added to the stack once it's read. It is used for the "getDvarRed()" function to grab the red component of a color dvar. The next entity on the stack is the dvar to get, typically an OP_GetHash for the dvar's hash.

OP_getdvargreen

This is a simple 1 byte opcode that is added to the stack once it's read. It is used for the "getDvarGreen()" function to grab the green component of a color dvar. The next entity on the stack is the dvar to get, typically an OP_GetHash for the dvar's hash.

OP_getdvarblue

This is a simple 1 byte opcode that is added to the stack once it's read. It is used for the "getDvarBlue()" function to grab the blue component of a color dvar. The next entity on the stack is the dvar to get, typically an OP_GetHash for the dvar's hash.

OP_GetFirstArrayKey

This is a simple 1 byte opcode that is added to the stack once it's read. It is used for the "getFirstArrayKey()" function.

OP_GetNextArrayKey

This is a simple 1 byte opcode that is added to the stack once it's read. It is used for the "getNextArrayKey()" function.

OP_skipdev

This opcode is used to denote developer code sections. These sections are outlined by "/#" and "#/" in raw text and are only run when the dvars "developer" and "developer_script" are set to true. The section starts at this opcode, and the end is defined at

(++currentFunctionDataOffset) &= 0xFFFFFFFE;

At this point there is a short value that holds the length of this code section; the end marker is found that many bytes further on.

Fixup Section

This section and the following one have never been seen in a retail GSC file. It was previously thought that their offsets in the header were just repeats of the file's size. However, a missing section's offset is set to the start of the next section, so for these two sections the offsets always point to the end of the file. The fixup section simply patches the location given by offset with data, whose length is given by address.

struct GSC_FIXUP_ITEM
{
  unsigned int offset;
  unsigned int address;
  char data[address];
};

Profile Section

This section has never been seen in a retail gsc file.

struct GSC_PROFILE_ITEM
{
  unsigned int name;
  unsigned int address;
};

???

Source Format

The source format for GSCs is very well known. Simply a code file (text) with a .gsc extension.