CASC: Difference between revisions

From wowdev
(Added 3 missing keys)
(Remove hardcoded table styles)
 
(356 intermediate revisions by 19 users not shown)
Line 1: Line 1:
CASC is the name of the new file system that Blizzard has created to replace the outdated format of [[MPQ]].
<div style='border: 1px solid black; padding: 10px;'>
'''Missing something?''' This page was recently split up into separate pages. For the content transfer part of [[NGDP]], see [[TACT]]. This page should now only contain information on the local filesystem format called CASC.
</div>
 
CASC is the name of the new file system that Blizzard created to replace the outdated format of [[MPQ]]. It is used in WoW versions {{Template:Sandbox/VersionRange|min_expansionlevel=6|min_inclusive=6}}.


=CASC v1=
=CASC v1=
Line 8: Line 12:
The remainder of this article will refer exclusively to the system called CASC v2 as 'CASC'. While many parts of the file system are identical between v1 and v2, there are enough changes to make explaining both formats at once inadvisable.
The remainder of this article will refer exclusively to the system called CASC v2 as 'CASC'. While many parts of the file system are identical between v1 and v2, there are enough changes to make explaining both formats at once inadvisable.


=NGDP=
=Journal-based Data Files=
CASC was introduced simultaneously with a new system for managing configuration, blob, and installation files called NGDP, or Next Generation Download Protocol. When the acronym 'NGDP' is used in conjunction with the term CASC, it is typically referring to the hosted components of the CASC file system, and its ability to stream data on the fly.
During the installation process for a Blizzard game, the program will download the required files as requested by root, encoding, download, and install. It stores the downloaded data fragments in data files in "INSTALL_DIR\Data\data\". The program will record the content hash ([[BLTE]]-compressed hash), size, and position of the file as well as the number of the data file that it is in. It places those four parameters into journal files with the extension '.idx'.
 
==NGDP URLs==
As of October 14th, 2014, the following generic NGDP URLs are known:
* http://us.patch.battle.net:1119/(ProgramCode)/cdns - a table of domains available with game data per region
* http://us.patch.battle.net:1119/(ProgramCode)/versions - a table of the current game version, build config, and cdn config per region
* http://us.patch.battle.net:1119/(ProgramCode)/bgdl - similar to versions, but tailored for use by the Battle.net App background downloader process
* http://us.patch.battle.net:1119/(ProgramCode)/blobs - contains InstallBlobMD5 and GameBlobMD5
* http://us.patch.battle.net:1119/(ProgramCode)/blob/game - a blob file that regulates game functionality for the Battle.net App
* http://us.patch.battle.net:1119/(ProgramCode)/blob/install - a blob file that regulates installer functionality for the game in the Battle.net App
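The URL pattern above can be sketched in a few lines of Python. This is only an illustration: the helper name <code>ngdp_urls</code> and the constant names are made up for this example, and the host, port and endpoint names are simply the ones listed above.

```python
# Hypothetical helper: builds the generic NGDP URLs listed above
# for one program code. Host, port and endpoints come from the list.
NGDP_HOST = "us.patch.battle.net"
NGDP_PORT = 1119
NGDP_ENDPOINTS = ("cdns", "versions", "bgdl", "blobs", "blob/game", "blob/install")


def ngdp_urls(program_code):
    """Return a dict mapping endpoint name -> full URL for a program code."""
    base = "http://{}:{}/{}".format(NGDP_HOST, NGDP_PORT, program_code)
    return {name: "{}/{}".format(base, name) for name in NGDP_ENDPOINTS}


if __name__ == "__main__":
    for name, url in ngdp_urls("wow").items():
        print(name, url)
```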
 
Keep in mind that Blizzard's CDN caching is sometimes unreliable: after an update to the above files, it might switch back and forth between the old and new versions of the files for a few hours.
 
==NGDP Program Codes==
 
As of September 14th, 2015, the following program codes are known to support NGDP:
{| border="1" cellpadding="2" style="background:#FCFCFC; color:black"
! width="90" | Program
! width="250" | Description
|-
| agent || Battle.net Agent
|-
| bna || Battle.net App
|-
| bnt || Heroes of the Storm Alpha (Deprecated)
|-
| clnt || Client (?)
|-
| d3 || Diablo 3 Retail
|-
| d3cn || Diablo 3 China
|-
| d3t || Diablo 3 Test
|-
| demo || ? (Partial)
|-
| hero || Heroes of the Storm Retail
|-
| herot || Heroes of the Storm Test
|-
| hsb || Hearthstone
|-
| pro || Overwatch Retail
|-
| prodev || Overwatch Dev
|-
| sc2 || StarCraft II (Partial)
|-
| s2 || StarCraft II
|-
| s2t || StarCraft II Test (Partial)
|-
| s2b || StarCraft II Beta
|-
| test || ? (Partial)
|-
| storm || Heroes of the Storm (Deprecated)
|-
| war3 || Warcraft III (Partial)
|-
| wow || World of Warcraft Retail
|-
| wowt || World of Warcraft Test
|-
| wow_beta || World of Warcraft Beta
|}
 
=CASC Online=
 
==Standard URL Hash Format==
URL Format: http://(cdnsHost)/(cdnsPath)/(pathType)/(FirstTwoHexOfHash)/(SecondTwoHexOfHash)/(FullHash)
 
For WoW, a cdnsHost of dist.blizzard.com.edgesuite.net should always be acceptable, and so far the cdnsPath of "tpr/wow" has never changed. If you have any doubts, check the NGDP URL for 'cdns', which contains both pieces of information.
 
Known path types are:
* config - contains the three types of config files: Build configs, CDN configs, and Patch configs
* data - contains archives, indexes, and unarchived standalone files (typically binaries, mp3s, and movies)
* patch - contains patch files
 
Blizzard regularly cleans old builds from the CDN so any example files mentioned in this article might be unavailable at the time of reading.
Example URL: http://dist.blizzard.com.edgesuite.net/tpr/wow/config/0a/6f/0a6f07f48525c4203cb2fdbf6a7d7e9a
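As a rough sketch, the URL format can be assembled from a hash like this. The function name <code>cdn_url</code> and its <code>suffix</code> parameter (used for the '.index' extension) are assumptions made for this example only.

```python
# Hypothetical helper: builds a CDN URL following the standard hash format
# (host)/(path)/(pathType)/(first two hex)/(second two hex)/(full hash).
def cdn_url(cdns_host, cdns_path, path_type, hex_hash, suffix=""):
    hex_hash = hex_hash.lower()
    if len(hex_hash) != 32:
        raise ValueError("expected a 32-character hex hash")
    return "http://{}/{}/{}/{}/{}/{}{}".format(
        cdns_host, cdns_path, path_type,
        hex_hash[0:2], hex_hash[2:4], hex_hash, suffix)


if __name__ == "__main__":
    print(cdn_url("dist.blizzard.com.edgesuite.net", "tpr/wow", "config",
                  "0a6f07f48525c4203cb2fdbf6a7d7e9a"))
```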
 
==Config Files==
 
===Build Config===
Example file: http://dist.blizzard.com.edgesuite.net/tpr/wow/config/0a/6f/0a6f07f48525c4203cb2fdbf6a7d7e9a
 
Some of the files listed in this file are explained later on in this article.
 
{| border="1" cellpadding="2" style="background:#FCFCFC; color:black"
! width="140" | Value name
! width="650" | Description
|-
| root || Content hash of the '''decoded''' root file; look this up in encoding to get the '''encoded''' CDN key/Content hash
|-
| download || Content hash of the '''decoded''' download file, look this up in encoding to get the '''encoded''' CDN key/Content hash
|-
| install || Content hash of the '''decoded''' install file, look this up in encoding to get the '''encoded''' CDN key/Content hash
|-
| encoding || First key is the content hash of the '''decoded''' encoding file, second one is the CDN key
|-
| encoding-size || Encoding sizes
|-
| build-name || Name of the build
|-
| build-playbuild-installer || Type of installer for the Battle.net app to use
|-
| build-product || Product name
|-
| build-uid || Program code (see NGDP Program Codes)
|-
| patch || Unknown
|-
| patch-size || Unknown
|-
| patch-config || Patch config file location (see Patch Config)
|}
 
===CDN Config===
Example file: http://dist.blizzard.com.edgesuite.net/tpr/wow/config/8b/52/8b52f64f8f031ebf0cb7dec0048f018e
 
{| border="1" cellpadding="2" style="background:#FCFCFC; color:black"
! width="140" | Value name
! width="650" | Description
|-
| archives || CDN keys of all archives (and by appending .index to the hash their indexes)
|-
| archive-group || CDN key of the combined index file (see Archive-Group Index)
|-
| patch-archives || CDN keys of patch archives (needs research)
|-
| patch-archive-group || CDN key of probably the combined patch index file (needs research)
|-
| builds || List of build configs this config supports
|}
 
===Patch Config===
This configuration file was added after all of the others. It first appeared in CASC v1 for Heroes of the Storm in August 2014. It then appeared in WoW for CASC v2 around build 19000 (approximately October 1st, 2014).
The purpose of this file is to reduce redundant downloads. It achieves this by directing the system to download patch files to apply and update previously downloaded material.
The structure and purpose of all of the fields of this file are unknown at this time.
 
Example file: http://dist.blizzard.com.edgesuite.net/tpr/wow/config/b6/c8/b6c844d423c0c3e8620a171828080b06
 
==Data Files==
 
Example index: http://dist.blizzard.com.edgesuite.net/tpr/wow/data/00/72/0072651343c29797b9da4aad2d0c93fa.index
 
Example archive: http://dist.blizzard.com.edgesuite.net/tpr/wow/data/00/72/0072651343c29797b9da4aad2d0c93fa
 
==Patch Files==
 
=File References=
Files are referred to by many different pieces of data in CASC. A quick summary of them:
* Filename: The file's real name. Note that one file can have many names - essentially, one header hash can map to many different name hashes.
* Locale Flag:
* Content Flag:
* Name Hash: The file's name, after being hashed with the Jenkins Hash.
* Header Hash: The MD5 of the BLTE header of the compressed file.
* Content Hash: The MD5 of the entire file in its uncompressed state; the purest representation of the data.
=BLTE encoded files=
Any files stored inside the data files are BLTE encoded, which means before reading anything in the file, first you have to decode it. The documentation below refers to ''decoded'' files!
 
It consists of these chunks in the following order:
* Header
* ChunkInfo (only if Header.headerSize > 0)
* Data
 
 
To read a BLTE encoded file:
# Read the Header chunk
# Read the ChunkInfo chunk if Header.headerSize > 0
# Read each of the Data chunks and combine them to create the complete file
 
'''Note:''' If there is no ChunkInfo struct, there is just one Data chunk.
 
 
*'''Header'''
{| border="1" cellpadding="2" style="background:#FCFCFC; color:black"
! width="80" | Offset (Hex)
! width="200" | Type
! width="150" | Name
! width="900" | Description
|-
| 0x00 || char[4] || FileSignature || "BLTE"
|-
| 0x04 || uint32_t [BE] || headerSize || Size of the BLTE header (BLTE header = Header + ChunkInfo).
|}
 
 
*'''ChunkInfoEntry'''
{| border="1" cellpadding="2" style="background:#FCFCFC; color:black"
! width="80" | Offset (Hex)
! width="200" | Type
! width="150" | Name
! width="900" | Description
|-
| 0x00 || uint32_t [BE] || compressedSize || Compressed size of the chunk (the compression mode byte is included).
|-
| 0x04 || uint32_t [BE] || decompressedSize || Decompressed size of the chunk.
|-
| 0x08 || char[16] || checksum || The checksum of the compressed chunk (the compression mode byte is included).
|}
 
 
*'''ChunkInfo'''
{| border="1" cellpadding="2" style="background:#FCFCFC; color:black"
! width="80" | Offset (Hex)
! width="200" | Type
! width="150" | Name
! width="900" | Description
|-
| 0x00 || uint8_t [BE] || flags || Flags of some sort.
|-
| 0x01 || uint24_t [BE] || chunkCount || The number of chunks.
|-
| 0x04 || ChunkInfoEntry[chunkCount] || chunks || The chunk info for the chunks in the file.
|}
 
 
*'''Data'''
{| border="1" cellpadding="2" style="background:#FCFCFC; color:black"
! width="80" | Offset (Hex)
! width="200" | Type
! width="150" | Name
! width="900" | Description
|-
| 0x00 || char || encodingMode || Available values: N, Z, F, E
|-
| 0x01 || char[ChunkInfo.compressedSize] || data || The encoded data.
|}
 
 
 
Example implementation as Binary Template can be found here: [[BLTE-Template]]
 
 
'''Encoding modes:'''
* N: Plain data.
* Z: Zlib encoded data.
* F: Recursively encoded BLTE data.
* E: encrypted: one of salsa20, arc4, rc4.
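The reading steps and structures above can be sketched in Python as follows. This is a minimal sketch under assumptions, not a reference implementation: it assumes chunkCount sits directly after the one-byte flags field, it handles only the N and Z modes, and it does not verify the per-chunk checksum. The names <code>decode_blte</code> and <code>_decode_chunk</code> are invented for this example.

```python
import struct
import zlib


def _decode_chunk(chunk):
    """Decode one Data chunk: first byte is the encoding mode."""
    mode, payload = chunk[:1], chunk[1:]
    if mode == b"N":          # plain data
        return payload
    if mode == b"Z":          # zlib encoded data
        return zlib.decompress(payload)
    raise NotImplementedError("encoding mode %r is not handled in this sketch" % mode)


def decode_blte(blob):
    if blob[:4] != b"BLTE":
        raise ValueError("missing BLTE signature")
    (header_size,) = struct.unpack_from(">I", blob, 4)
    if header_size == 0:
        # No ChunkInfo: there is just one Data chunk after the 8-byte header.
        return _decode_chunk(blob[8:])
    # ChunkInfo: flags (1 byte), chunkCount (3 bytes, big endian), entries.
    chunk_count = int.from_bytes(blob[9:12], "big")
    pos = header_size          # Data chunks start after Header + ChunkInfo
    out = bytearray()
    for i in range(chunk_count):
        comp_size, decomp_size, _checksum = struct.unpack_from(">II16s", blob, 12 + i * 24)
        data = _decode_chunk(blob[pos:pos + comp_size])
        if len(data) != decomp_size:
            raise ValueError("chunk %d has an unexpected decompressed size" % i)
        out += data
        pos += comp_size
    return bytes(out)
```

Modes F and E would need recursion and a key lookup respectively; they are deliberately left out here.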
 
struct
{
  unsigned char key_name_length;              // 0x8
  unsigned char key_name[key_name_length];
  unsigned char IV_length;                    // 0x4
  unsigned char IV[IV_length];
  char type; // 'S': salsa20, 'A': arc4
} E_chunk;
 
key_name is resolved by the client to the actual key. Keys are distributed via keyrings, and some keys are hardcoded. WoW has dbfilesclient/tactkey.db2 and dbfilesclient/tactkeylookup.db2.
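A minimal sketch of reading the E_chunk fields in Python is shown here. It assumes the structure starts immediately after the 'E' encoding mode byte and that everything after the type byte is the encrypted payload; neither is confirmed above. The function name <code>parse_e_chunk_header</code> is invented for this example, and no decryption is attempted.

```python
# Hypothetical parser for the E_chunk structure described above.
def parse_e_chunk_header(payload):
    """Return (key_name, iv, enc_type, encrypted_data) from an 'E' chunk payload."""
    pos = 0
    key_name_length = payload[pos]          # usually 0x8
    pos += 1
    key_name = payload[pos:pos + key_name_length]
    pos += key_name_length
    iv_length = payload[pos]                # usually 0x4
    pos += 1
    iv = payload[pos:pos + iv_length]
    pos += iv_length
    enc_type = chr(payload[pos])            # 'S': salsa20, 'A': arc4
    pos += 1
    return key_name, iv, enc_type, payload[pos:]
```

The returned key_name would then be looked up in a key table such as the list of known keys.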
 
known keys:
key_name          key                              used for  seen in
FB680CB6A8BF81F3  62D90EFA7F36D71C398AE2F1FE37BDB9  salsa20  overwatch 0.8.0.24919_retailx64 (hardcoded)
402CD9D8D6BFED98  AEB0EADEA47612FE6C041A03958DF241  salsa20  overwatch 0.8.0.24919_retailx64 (hardcoded)
DBD3371554F60306  34E397ACE6DD30EEFDC98A2AB093CD3C  salsa20  overwatch 0.8.0.24919_retailx64 (streamed from server)
11A9203C9881710A  2E2CB8C397C2F24ED0B5E452F18DC267  salsa20  overwatch 0.8.0.24919_retailx64 (streamed from server)
A19C4F859F6EFA54  0196CB6F5ECBAD7CB5283891B9712B4B  salsa20  overwatch 0.8.0.24919_retailx64 (streamed from server)
87AEBBC9C4E6B601  685E86C6063DFDA6C9E85298076B3D42  salsa20  overwatch 0.8.0.24919_retailx64 (streamed from server)
DEE3A0521EFF6F03  AD740CE3FFFF9231468126985708E1B9  salsa20  overwatch 0.8.0.24919_retailx64 (streamed from server)
8C9106108AA84F07  53D859DDA2635A38DC32E72B11B32F29  salsa20  overwatch 0.8.0.24919_retailx64 (streamed from server)
49166D358A34D815  667868CD94EA0135B9B16C93B1124ABA  salsa20  overwatch 0.8.0.24919_retailx64 (streamed from server)
1463A87356778D14  69BD2A78D05C503E93994959B30E5AEC  salsa20  overwatch (streamed from server)
5E152DE44DFBEE01  E45A1793B37EE31A8EB85CEE0EEE1B68  salsa20  overwatch (streamed from server)
9B1F39EE592CA415  54A99F081CAD0D08F7E336F4368E894C  salsa20  overwatch (streamed from server)
3ECB6A12785050FA  BDC51862ABED79B2DE48C8E7E66C6200  salsa20  WOW-20740patch7.0.1_Beta (db2 id: 16, lookup in db)
                  AA0B5C77F088CCC2D39049BD267F066D  salsa20  WOW-20740patch7.0.1_Beta (db2 id: 25, lookup streamed from server)
D1E9B5EDF9283668  8E4A2579894E38B4AB9058BA5C7328EE  salsa20  WOW-20740patch7.0.1_Beta (db2 id: 39, lookup streamed from server)
B76729641141CB34  9849D1AA7B1FD09819C5C66283A326EC  salsa20  WOW-20740patch7.0.1_Beta (db2 id: 40, lookup streamed from server)
                  D514BD1909A9E5DC8703F4B8BB1DFD9A  salsa20  WOW-20740patch7.0.1_Beta (db2 id: 41, lookup streamed from server)
23C5B5DF837A226C  1406E2D873B6FC99217A180881DA8D62  salsa20  WOW-20740patch7.0.1_Beta (db2 id: 42, lookup streamed from server)
 
=States of CASC Data=
CASC data comes in all forms and sizes.
 
==Key CASC Files==
 
===Root===
File signature: None
The purpose of Root is to translate Content Hashes into file names.
 
 
===Encoding===
File signature: "EN"
 
The encoding file contains data which is used to map content hash to file key.
 
The file contains the following in order:
* File header
* String block #1
* Table A header
* Table A entries
* Table B header
* Table B entries
* String block #2
 
 
====Encoding Header Structure====
*'''The beginning of the file is composed of this structure of 0x16 bytes. Structure names were invented by the author of this page.'''
{| border="1" cellpadding="2" style="background:#FCFCFC; color:black"
! width="80" | Offset (Hex)
! width="200" | Type
! width="150" | Name
! width="900" | Description
|-
| 0x00 || char[2] || FileSignature || "EN"
|-
| 0x02 || uint8_t || UNK || ???
|-
| 0x03 || uint8_t || checksumSizeA || The length of the checksums in table A.
|-
| 0x04 || uint8_t || checksumSizeB || The length of the checksums in table B.
|-
| 0x05 || uint16_t || flagsA || Flags for table A.
|-
| 0x07 || uint16_t || flagsB || Flags for table B.
|-
| 0x09 || uint32_t [BE] || numEntriesA ||  The number of entries in table A.
|-
| 0x0D || uint32_t [BE] || numEntriesB ||  The number of entries in table B.
|-
| 0x11 || uint8_t || UNK || ???
|-
| 0x12 || uint32_t [BE] || stringBlockSize ||  The size of string block #1.
|}
 
 
====Encoding Table Header Block Structure====
*'''Each of the tables has numEntries entries of this structure of 0x20 bytes. They are used to locate which entry in the next part of the table contains a hash and to verify the integrity of that entry once it is read.'''
{| border="1" cellpadding="2" style="background:#FCFCFC; color:black"
! width="80" | Offset (Hex)
! width="200" | Type
! width="150" | Name
! width="900" | Description
|-
| 0x00 || char[checksumSizeA] || firstHash || The hash of the first file in the entry.
|-
| 0x10 || char[checksumSizeA] || blockHash || The checksum of the entry.
|-
|}
 
 
====Encoding Table Entry Block Structure====
*'''Each of the tables has numEntries entries of 4096 bytes, each of which contains these structures, followed by padding.'''
{| border="1" cellpadding="2" style="background:#FCFCFC; color:black"
! width="80" | Offset (Hex)
! width="200" | Type
! width="150" | Name
! width="900" | Description
|-
| 0x00 || uint16_t || keyCount || The number of keys.
|-
| 0x02 || uint32_t [BE] || fileSize || The decompressed size of the file.
|-
| 0x06 || char[checksumSizeA] || hash || The hash of the file.
|-
| 0x16 || char[checksumSizeA*keyCount] || keys || The file keys belonging to the file. This can be used to look up the location of the file in the .IDX files.
|}
 
 
====Encoding Layout Table Header Block Structure====
*'''Each of the tables has numEntries entries of this structure of 0x20 bytes. They are used to locate which entry in the next part of the table contains a hash and to verify the integrity of that entry once it is read.'''
{| border="1" cellpadding="2" style="background:#FCFCFC; color:black"
! width="80" | Offset (Hex)
! width="200" | Type
! width="150" | Name
! width="900" | Description
|-
| 0x00 || char[checksumSizeB] || firstKey || The key of the first file in the entry.
|-
| 0x10 || char[checksumSizeB] || blockHash || The checksum of the entry.
|-
|}
 
 
====Encoding Layout Table Entry Block Structure====
*'''Each of the tables has numEntries entries of 4096 bytes, each of which contains these structures, followed by padding.'''
{| border="1" cellpadding="2" style="background:#FCFCFC; color:black"
! width="80" | Offset (Hex)
! width="200" | Type
! width="150" | Name
! width="900" | Description
|-
| 0x00 || char[checksumSizeB] || key || The key of the file.
|-
| 0x10 || uint32_t [BE] || stringIndex || The index into string block #1.
|-
| 0x14 || char || UNK || ???
|-
| 0x15 || uint32_t [BE] || fileSize || The compressed size of the file.
|}
 
 
====String blocks====
The two string blocks contain descriptions of file layouts, providing information about the sections and compression mode of the files.
 
* Block #1 is referenced by the layout table (see above).
* Block #2 is the description of the encoding file itself.
 
 
The string uses the following format:
 
: <encoding_mode>:{<comma-separated subchunks>}
 
: '''Note:''' Usually ''<encoding_mode>'' is b for BLTE in the top chunk.
 
 
It specifies each subchunk in this form:
 
: <size>=<encoding_mode>
 
 
''<size>'':
 
: Value refers to the number of bytes that chunk (at a minimum, see below) contains.
 
: The value might contain K, M or *.
 
: * If K is present, multiply the number by 1024.
: * If M is present, multiply the number by 1048576.
: * If * is present, the chunk is "greedy" and it contains the rest of the bytes in the file in addition to any number specified.
 
 
''<encoding_mode>'':
 
: Values will be either n, z, f, or e.
: It can also include a specifier (ex: =z:{6,mpq}) for encoder parameters (ex: z:{6, mpq} means level == 6 and windowBits == 0).
 
 
: '''n''' None
 
: '''z''' Zlib
:: '''Parameters:'''
:: ''level'' - default value: 9
:: ''windowBits'' - default value: 15 ('''note:''' the value mpq means windowBits == 0)
 
: '''f'''  Frame
 
: '''c''' Crypt
 
 
'''Example:'''
 
b:{64=n,256K*=z}
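A rough Python sketch of parsing such a string follows. It only splits the top level into (size, greedy, encoding) tuples and leaves nested encoder parameters such as z:{6,mpq} as raw text. All function names are invented for this example, and treating a bare * as size 0 is an assumption.

```python
def parse_size(token):
    """Turn '64', '256K' or '256K*' into (byte_count, is_greedy)."""
    greedy = token.endswith("*")
    if greedy:
        token = token[:-1]
    multiplier = 1
    if token.endswith("K"):
        multiplier, token = 1024, token[:-1]
    elif token.endswith("M"):
        multiplier, token = 1048576, token[:-1]
    number = int(token) if token else 0     # assumption: bare '*' means 0 + rest
    return number * multiplier, greedy


def split_top_level(text):
    """Split on commas that are not inside nested {...} groups."""
    parts, depth, start = [], 0, 0
    for i, ch in enumerate(text):
        if ch == "{":
            depth += 1
        elif ch == "}":
            depth -= 1
        elif ch == "," and depth == 0:
            parts.append(text[start:i])
            start = i + 1
    parts.append(text[start:])
    return parts


def parse_profile(profile):
    """Parse e.g. 'b:{64=n,256K*=z}' into (mode, [(size, greedy, encoding), ...])."""
    mode, _, body = profile.partition(":")
    if not (body.startswith("{") and body.endswith("}")):
        raise ValueError("expected <mode>:{...}")
    chunks = []
    for sub in split_top_level(body[1:-1]):
        size, _, encoding = sub.partition("=")
        chunks.append((*parse_size(size), encoding))
    return mode, chunks
```

For the example string, this yields a 64-byte plain chunk followed by a greedy zlib chunk of at least 256 * 1024 bytes.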
 
 
'''010 Template:'''
 
https://gist.github.com/heksesang/fdda3e4f8a5ed53b71ed
 
'''Function for parsing encoder profiles:'''
 
https://gist.github.com/heksesang/b15057fe3f093eebee3a
 
 
===Install===
File signature: "IN"
 
====Install Header Structure====
*'''The beginning of the file is composed of this structure of 0x0A bytes. Structure names were invented by the author of this page.'''
{| border="1" cellpadding="2" style="background:#FCFCFC; color:black"
! width="80" | Offset (Hex)
! width="70" | Type
! width="90" | Name
! width="900" | Description
|-
| 0x00 || char[2] || FileSignature || "IN"
|-
| 0x02 || uint32 || UNK || ???
|-
| 0x06 || uint32 || numEntries ||  The number of entries in the body of the file
|}
 
====Install Header Entry Structure====
*'''The remainder of the header is populated by these header entries, each a variable size (due to the strings). Structure names were invented by the author of this page.'''
{| border="1" cellpadding="2" style="background:#FCFCFC; color:black"
! width="70" | Type
! width="90" | Name
! width="900" | Description
|-
| char[] || FlagName || The name of the optional flag for the entry
|-
| uint16 || FlagType || A number shared amongst specific flags. For example, languages are '3'. Regions are '5'. Architecture type is '0'.
|-
| byte[28] || FileFlags || This appears to be a bit array represented in hex form. Each bit appears to represent an entry of this file; if the bit is enabled, then the flag named by FlagName is active for that file.
|}
 
====Install Entry Structure====
*'''The rest of the file is populated by these normal entries, each a variable size (due to the strings). Structure names were invented by the author of this page.'''
{| border="1" cellpadding="2" style="background:#FCFCFC; color:black"
! width="70" | Type
! width="90" | Name
! width="900" | Description
|-
| char[] || FileName ||  The name of the file.
|-
| char[16] || MD5 || The MD5 of the uncompressed (?) file.
|-
| byte[28] || Size || The size of the file.
|}
 
===Download===
This file has this structure:
* Header
* Entries[Header.EntryCount]
* Tags[Header.TagCount]
 
====Download Header====
{| border="1" cellpadding="2" style="background:#FCFCFC; color:black"
! width="120" | Type
! width="150" | Name
! width="810" | Description
|-
| char[2] || Signature || The signature for this file (always "DL")
|-
| char[3] || unk || ???
|-
| int [BE] || EntryCount || The number of file entries in this file
|-
| short [BE] || TagCount || The number of tag entries in this file
|}
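Read with Python's <code>struct</code> module, the header above could look like this. The sketch assumes "int" and "short" are 4 and 2 bytes and treats them as unsigned; the names <code>DOWNLOAD_HEADER</code> and <code>parse_download_header</code> are invented for this example.

```python
import struct

# Signature (2), unknown (3), EntryCount (4, BE), TagCount (2, BE) = 11 bytes.
DOWNLOAD_HEADER = struct.Struct(">2s3sIH")


def parse_download_header(data):
    """Return (entry_count, tag_count) from the start of a download file."""
    signature, _unk, entry_count, tag_count = DOWNLOAD_HEADER.unpack_from(data, 0)
    if signature != b"DL":
        raise ValueError("not a download file")
    return entry_count, tag_count
```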
 
====Download Entry====
{| border="1" cellpadding="2" style="background:#FCFCFC; color:black"
! width="120" | Type
! width="150" | Name
! width="810" | Description
|-
| char[16] || Hash || This hash is found in every node of the encoding file. (Reverse lookup)
|-
| char[10] || unk || ???
|}
 
====Download Tag====
{| border="1" cellpadding="2" style="background:#FCFCFC; color:black"
! width="120" | Type
! width="150" | Name
! width="810" | Description
|-
| string || Name || A C-String indicating this tag's Name.
|-
| short [BE] || Type || Hash type
|-
| char[N] || Bits || A bit array of N = Header.EntryCount / 8 + (Header.EntryCount % 8 > 0 ? 1 : 0) bytes, i.e. one bit per file entry, rounded up to whole bytes. Apply Schroeppel's 8-bit reverse function to each byte to get the bits in order.
|}
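The size formula and the bit reversal can be sketched as follows. <code>reverse_byte</code> uses the well-known multiply-and-modulus byte reversal attributed to Schroeppel. The helper <code>entry_has_tag</code> assumes that, after reversal, bit (index % 8) of byte (index / 8) belongs to entry number index; that mapping is an assumption of this example, as are all function names.

```python
def tag_bits_size(entry_count):
    """Number of bytes in a tag's Bits array (one bit per entry, rounded up)."""
    return entry_count // 8 + (1 if entry_count % 8 > 0 else 0)


def reverse_byte(b):
    """Reverse the bit order of one byte (Schroeppel's multiply/modulus trick)."""
    return ((b * 0x0202020202) & 0x010884422010) % 1023


def entry_has_tag(bits, index):
    """Assumed mapping: reverse each byte, then read bits from the low end."""
    reversed_byte = reverse_byte(bits[index // 8])
    return bool((reversed_byte >> (index % 8)) & 1)
```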
 
===Patch===
{| border="1" cellpadding="2" style="background:#FCFCFC; color:black"
! width="120" | Type
! width="150" | Name
! width="810" | Description
|-
| char[2] || Signature || The signature for this file (always "PA")
|-
| char || unk1 || 1
|-
| char || size_a || <= 0x10
|-
| char || size_b || <= 0x10
|-
| char || size_c || <= 0x10
|-
| char || size_d || <= 0x18
|-
| char[19] || unk || ???
|-
| char[16] || Encoding file || The hash for encoding file (same as second string in build config file)
|-
| int || Uncompressed || Uncompressed encoding file size in bytes
|-
| int || Compressed || Compressed encoding file size in bytes
|-
| char || EncodingFormatLength || Length of the following string
|-
| char[EncodingFormatLength] || EncodingFmt || Encoding string (same format as string blocks in encoding file)
|-
| char[] || ??? || byte array until the end of the file
|}
 
 
The header plus entries must be smaller than 0x10000 bytes (at least in wow-18179). The MD5 is only checked over header+entries, so the file itself may be larger.
 
struct PatchManifest_Header
{
  uint16_t_BE magic; // 'PA'
  uint8_t unk1; // 1
  uint8_t key_size_a; // <= 0x10
  uint8_t size_b; // <= 0x10
  uint8_t size_c; // <= 0x10
  uint8_t size_d; // (size_d - 0xc) <= 0x12.
  uint16_t_BE entry_count; // (key_size_a + 20) * entry_count + sizeof (PatchManifest_Header) < 0x10000
  uint8_t unk2; // flags
} header;
#if encoding_information_apparently_added_after_18179
uint8_t encoding_key[16];
uint32_t_BE decoded_size;
uint32_t_BE encoded_size;
uint8_t encoding_format_length;
char encoding_format[encoding_format_length];
#endif
struct PatchManifest_Entry
{
  uint8_t key[header.key_size_a];
  uint8_t md5_of_entry_data[0x10];
  uint32_t_BE offset_entry_data; // in this file
} entries[header.entry_count]; // sorted ascending by key
// at positions given in PatchManifest_Entry
struct entry_data // maximum size: 2^header.size_d!
{
  struct
  {
    uint8_t num_entries; // <= 0x10.
    uint8_t key[header.key_size_a];
    uint40_t_BE unk; // yes, 5 bytes!
    struct
    {
      uint8_t key[header.key_size_a];
      uint40_t_BE unk1;
      uint8_t key[header.key_size_c];
      uint40_t_BE unk2;
    } sub_entries[num_entries];
  } entries[]; // count unspecified: read until the next sub_entries[].num_entries would be 0
                // OR entry_data would be bigger than 2^header.size_d
};
// in my example file (bd260d7f3a9008620a90033b561a6289), after the last
// entries_data which ended with num_entries == 0, there was further data.
// something above is thus not correct, or incomplete.
 
==Blizzard-Created Archives==
In its natural state, the vast majority of the data for any CASC-based game exists in the archives.
 
===Archives===
Archives are extensionless 256 MB files that are usually only stored on the Blizzard CDNs. Their naming follows the standard URL hash format using the '/data/' path type.
 
The structure of the archives is presumably just file fragment after file fragment. You will never need to parse it because you can just look up offset + size of your file fragment in the index files and then take the piece directly out of the archive.
 
===Archive Indexes (.index)===
These '.index' files reveal to the user where the compressed game files are located within the archives. All indexes (except the Archive-Group index, see below) are named after their archive (only difference is these have an extension).
'.index' files are stored on the CDN using the standard hash naming scheme (remember they have an extension though). They are also located in the directory 'INSTALL_DIR/Data/indices/' for a WoW install. Note that the index files are not complete -- some HeaderHash entries obtained from the encoding file will not appear in the index. These can be fetched directly from the CDN using the HeaderHash in the standard hash naming scheme.
 
====Normal Index Entry Structure====
*'''The file is divided into 4 KB chunks populated by these standard index entries of 0x18 (hex) bytes. Each chunk is zero-padded to a full 4 KB, and there may be more than 0x18 bytes of padding at the end of a chunk, so be sure to check for all-null HeaderHash fields. The last chunk is a table of contents listing the LAST HeaderHash in each chunk, as well as some unknown footer data. Structure names were invented by the author of this page.'''
*'''NOTE: This structure uses big endian numbers.'''
{| border="1" cellpadding="2" style="background:#FCFCFC; color:black"
! width="80" | Offset (Hex)
! width="70" | Type
! width="90" | Name
! width="900" | Description
|-
| 0x00 || char[16] || HeaderHash || The MD5 of the BLTE header for the compressed fragment that this index entry represents
|-
| 0x10 || uint32 || Offset || Position of the fragment in the archive
|-
| 0x14 || uint32 || Size || Size of the fragment
|}
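A minimal sketch of walking one 4 KB chunk of a normal '.index' file is given here. It stops at the first all-null HeaderHash, treating everything after it as padding, and it does not handle the final table-of-contents chunk. The names <code>INDEX_ENTRY</code>, <code>CHUNK_SIZE</code> and <code>parse_index_chunk</code> are invented for this example.

```python
import struct

INDEX_ENTRY = struct.Struct(">16sII")   # HeaderHash, Offset, Size = 0x18 bytes
CHUNK_SIZE = 4096


def parse_index_chunk(chunk):
    """Return [(header_hash_hex, offset, size), ...] for one 4 KB chunk."""
    entries = []
    last_start = len(chunk) - INDEX_ENTRY.size
    for pos in range(0, last_start + 1, INDEX_ENTRY.size):
        header_hash, offset, size = INDEX_ENTRY.unpack_from(chunk, pos)
        if header_hash == bytes(16):
            break                        # zero padding reached
        entries.append((header_hash.hex(), offset, size))
    return entries
```

With the resulting offset and size, the fragment can be sliced directly out of the archive that shares the index's name.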
 
===Archive-Group Index (.index)===
Archive-group is actually a very special '.index' file. While virtually all '.index' files are under 2 MB, the archive-group '.index' file is always over 15 MB. It is essentially a merger of all .index files, with a structure change. There is a new uint16 field that serves as an index for the array of archives from this build's CDN config.
 
Therefore, it is critical that you identify this outlier - if you try to parse it as a regular '.index' purely because of its extension, your program will undoubtedly fail. You can identify it because it will be named the same as the 'archive-group' hash listed in the CDN config. Additionally, it will not be listed as an archive hash in the CDN config. As discussed before, the different file structure and irregular file size are also viable methods to avoid parsing this file (or to avoid parsing the other '.index' files).
 
====Merged Index Entry Structure====
*'''The entire file is populated by these 'merged' index entries of 0x1A (hex) bytes. Structure names were invented by the author of this page.'''
*'''NOTE: This structure uses big endian numbers.'''
{| border="1" cellpadding="2" style="background:#FCFCFC; color:black"
! width="80" | Offset (Hex)
! width="70" | Type
! width="90" | Name
! width="900" | Description
|-
| 0x00 || char[16] || HeaderHash || The MD5 of the BLTE header for the compressed fragment that this index entry represents
|-
| 0x10 || uint16 || ArchiveIndex || If you placed the hashes of the 'archives = ' line of the CDN config in an array, this number would be the index for that array
|-
| 0x12 || uint32 || Offset || Position of the fragment in the archive
|-
| 0x16 || uint32 || Size || Size of the fragment
|}
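The merged entry differs from the normal one only by the extra ArchiveIndex field, so a sketch of reading it looks very similar. Here <code>archives</code> is assumed to be the list of hashes from the 'archives = ' line of the CDN config, in order; the names <code>MERGED_ENTRY</code> and <code>parse_merged_entry</code> are invented for this example.

```python
import struct

MERGED_ENTRY = struct.Struct(">16sHII")  # HeaderHash, ArchiveIndex, Offset, Size = 0x1A bytes


def parse_merged_entry(data, pos, archives):
    """Return (header_hash_hex, archive_hash, offset, size) for one merged entry."""
    header_hash, archive_index, offset, size = MERGED_ENTRY.unpack_from(data, pos)
    # ArchiveIndex selects the archive from the CDN config's 'archives' list.
    return header_hash.hex(), archives[archive_index], offset, size
```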


==Journal-based Data Files==
==Shared Memory==
During the installation process for a Blizzard game, the program will download the required files as requested by root, encoding, download, and install. It stores the downloaded data fragments in data files in "INSTALL_DIR\Data\data\". The program will record the content hash (BLTE-compressed hash), size, and position of the file as well as the number of the data file that it is in. It places those four parameters into journal files with the extension '.idx'.
 
===Shared Memory===
The shared memory file is called 'shmem' and is usually located in the same folder as the data and .IDX journals. This file contains the path where the data files are stored, which is the current version of each of the .IDX files, and which areas of the data files have unused space. The file is recreated every time a client is started.
The shared memory file is called 'shmem' and is usually located in the same folder as the data and .IDX journals. This file contains the path where the data files are stored, which is the current version of each of the .IDX files, and which areas of the data files have unused space. The file is recreated every time a client is started.


====Shared Memory Header Structure====
===Shared Memory Header Structure===
*'''The first part of the header.'''
*'''The first part of the header.'''
{| border="1" cellpadding="2" style="background:#FCFCFC; color:black"
{| border="1" cellpadding="2" class="wikitable"
! width="80" | Offset (Hex)
! width="80" | Offset (Hex)
! width="70" | Type
! width="70" | Type
! width="90" | Name
! width="90" | Name
! width="900" | Description
! width="900" | Description
|-
|- style="vertical-align: top;"
| 0x00 || uint32_t || BlockType || A value indicating what type of block this is. For this block, the value is 4.
| 0x00 || uint32_t || BlockType || A value indicating what type of block this is. For this block, the value is either 4 or 5.<br>• If the type is 5, then the free space block contains no entries. (see below)
|-  
|-  
| 0x04 || uint32_t || NextBlock || The offset of the next block.
| 0x04 || uint32_t || NextBlock || The offset of the next block.
|-  
|- style="vertical-align: top;"
| 0x08 || char[0x100] || DataPath || The path of the data files. This is prefixed with "Global\" if the path is an absolute path.
| 0x08 || char[0x100] || DataPath || The path to the data files. It is always prefixed with "Global\". The path uses forward slashes (except the prefix).<br>• When the file is written by the Battle.Net client, it is an absolute path.<br>• When the file is written by the game, it is a relative path (relative from the game executable).
|}
|}




*'''Followed by a number of these entries. The count can be calculated like this: (NextBlock - 264 - idxFileCount * 4) / 8'''
*'''Followed by a number of these entries. The count can be calculated like this: (NextBlock - 264 - idxFileCount * 4) / 8'''
{| border="1" cellpadding="2" style="background:#FCFCFC; color:black"
{| border="1" cellpadding="2" class="wikitable"
! width="80" | Offset (Hex)
! width="80" | Offset (Hex)
! width="70" | Type
! width="70" | Type
Line 731: Line 48:


*'''Followed by a number of these entries. The count is equal to number of .IDX files (usually 16).'''
*'''Followed by a number of these entries. The count is equal to number of .IDX files (usually 16).'''
{| border="1" cellpadding="2" style="background:#FCFCFC; color:black"
{| border="1" cellpadding="2" class="wikitable"
! width="80" | Offset (Hex)
! width="80" | Offset (Hex)
! width="70" | Type
! width="70" | Type
Line 740: Line 57:
|}
|}


 
===Shared Memory Free Space Structure===
====Shared Memory Free Space Structure====
After a small header, this structure is split up into two equal parts.
After a small header, this structure is split up into two equal parts.
The first part contains entries with the number of unused bytes.
The first part contains entries with the number of unused bytes.
Line 749: Line 65:


*'''The header part of the structure.'''
*'''The header part of the structure.'''
{| border="1" cellpadding="2" style="background:#FCFCFC; color:black"
{| border="1" cellpadding="2" class="wikitable"
! width="80" | Offset (Hex)
! width="80" | Offset (Hex)
! width="70" | Type
! width="70" | Type
Line 764: Line 80:


*'''This is the number of unused bytes. There can be up to 1090 entries of these. If there are fewer, the rest of the area is padded.'''
*'''This is the number of unused bytes. There can be up to 1090 entries of these. If there are fewer, the rest of the area is padded.'''
{| border="1" cellpadding="2" style="background:#FCFCFC; color:black"
{| border="1" cellpadding="2" class="wikitable"
! width="80" | Offset (Hex)
! width="80" | Offset (Hex)
! width="70" | Type
! width="70" | Type
Line 770: Line 86:
! width="900" | Description
! width="900" | Description
|-  
|-  
| 0x00 || uint10* || DataNumber || This is always set to 0 in this part of the block.
| 0x00 || uint10* || DataNumber || Before Agent v8012? this was always set to 0. After Agent v8012? this appears to be 1 when the data file referenced by the equivalent entry in the unused byte positions section has not been created.
|-  
|-  
| 0x01 || uint30* || Count || The number of unused bytes.
| 0x01 || uint30* || Count || The number of unused bytes.
Line 777: Line 93:


*'''This is the position of the unused bytes. There can be up to 1090 entries of these. If there are fewer, the rest of the area is padded.'''
*'''This is the position of the unused bytes. There can be up to 1090 entries of these. If there are fewer, the rest of the area is padded.'''
{| border="1" cellpadding="2" style="background:#FCFCFC; color:black"
{| border="1" cellpadding="2" class="wikitable"
! width="80" | Offset (Hex)
! width="80" | Offset (Hex)
! width="70" | Type
! width="70" | Type
Line 788: Line 104:
|}
|}


===.IDX Journals===
=== Special Case: Block Type 5 ===
There is a special case after a game update by the Battle.Net client. It seems that the client only writes a template file without free space entries and lets the game later complete the file. The following applies in this case:
 
* the first block type is set to 5 (instead of 4) with the same layout
* file size is always 16,384 bytes long
* offset to next block is set to 4,096<br>''This value is misleading: the free space block still starts at offset 340, not at offset 4,096 (340 is the value given by the block size field of the first block).''
* the free space block has an entry count of zero but is still padded to 1090 entries as usual
* after the terminator block, the file is additionally padded to 16,384 bytes
 
==.IDX Journals==
Example file path: INSTALL_DIR\Data\data\0e00000054.idx
Example file path: INSTALL_DIR\Data\data\0e00000054.idx


.IDX journals contain references. There used to be one .IDX file per journal, and the naming scheme used to have two separate meanings. The '0e' part of the file name used to designate which archive the .IDX file was associated with. This changed halfway through the Warlords Beta, and the current .IDX names are just iteration numbers.
.IDX journals contain a mapping from keys to the location of their data in the local CASC archives. There used to be one .IDX file per journal, and the naming scheme used to have two separate meanings. The '0e' part of the file name used to designate which archive the .IDX file was associated with. This changed halfway through the Warlords Beta.  Now there are 16 indices total, and the first byte of the hex filename says which of the 16 indices it is, while the remainder of the hex filename is just a version number that increments when a new set of files is added to the local archives.
 
To determine which of the 16 indices a key is bucketed in, the key is hashed by xoring together each 4-bit nibble in the first 9 bytes of the key:
 
  uint8_t cascGetBucketIndex(const uint8_t k[16]) {
    uint8_t i = k[0] ^ k[1] ^ k[2] ^ k[3] ^ k[4] ^ k[5] ^ k[6] ^ k[7] ^ k[8];
    return (i & 0xf) ^ (i >> 4);
  }
 
To determine the bucket index of the cross-reference entries at the start of each data file, use that same function and add 1 to the result, modulo 16 for the number of indices.
 
  uint8_t cascGetBucketIndexCrossReference(const uint8_t k[16]) {
    uint8_t i = cascGetBucketIndex(k);
    return (i + 1) % 16;
  }
 
 
===.IDX Header Structure===


====.IDX Header Structure====
The header is little-endian:
???


====.IDX Entry Structure====
{| border="1" cellpadding="2" class="wikitable"
*'''The rest of the file is populated by these normal entries, each 0x10 bytes in size. Structure names were invented by the author of this section because official names were not available.'''
*'''Note: .IDX files are chunked into groups of 0x1000 bytes. If a chunk is not filled to exactly 0x1000 bytes, the gap will be filled with '00's.'''
{| border="1" cellpadding="2" style="background:#FCFCFC; color:black"
! width="80" | Offset (Hex)
! width="80" | Offset (Hex)
! width="70" | Type
! width="70" | Type
Line 805: Line 143:
! width="900" | Description
! width="900" | Description
|-
|-
| 0x00 || char[9] || HeaderHash || The MD5 of the BLTE header of the compressed file
| 0x00 || uint32 || HeaderHashSize || The number of bytes to use for the hash at +04; usually 0x10.
|-
| 0x04 || uint32 || HeaderHash || This should equal the value of pc after calling hashlittle2 on the following HeaderHashSize bytes of the file with an initial value of 0 for pb and pc.
|-
| 0x08 || uint16 || Unk0 || Must be 7
|-
| 0x0a || uint8 || BucketIndex || The bucket index of this file; should be the same as the first byte of the hex filename.
|-
| 0x0b || uint8 || Unk1 || Must be 0
|-  
|-  
| 0x09 || uint10* || DataNumber || The number of the data file to read from
| 0x0c || uint8 || EntrySizeBytes || Must be 4
|-  
|-  
| 0x10.25 || uint30* || Offset || The position to begin reading from in the data file
| 0x0d || uint8 || EntryOffsetBytes || Must be 5
|-  
|-  
| 0x14 || uint32 || Size || The amount to read from the data file
| 0x0e || uint8 || EntryKeyBytes || Must be 9
|-
| 0x0f || uint8 || ArchiveFileHeaderBytes || Must be 30
|-
| 0x10 || uint64 || ArchiveTotalSizeMaximum || The maximum size of a casc installation; 0x4000000000, or 256GiB.
|-
| 0x18 || char[8] || padding || The header is padded with zeroes to the next 0x10-byte boundary.
|-
| 0x20 || uint32 || EntriesSize || This is the length in bytes of the entries in the index file.
|-
| 0x24 || uint32 || EntriesHash || This should equal the value of pc after calling hashlittle2 on the following EntriesSize bytes of the file with an initial value of 0 for pb and pc.
|}
|}
*'''* designates unusual data types. It is probably easiest to read the DataNumber as a Byte (and put it into a UInt16) and the Offset as a UInt32. Then use bit-shifting and a mask on Offset to update DataNumber and apply a mask to update Offset.


===.XXX Data Files===
===.IDX Entry Structure===
*'''The rest of the file is populated by these normal entries, each 0x12 bytes in size. Structure names were invented by the author of this section because official names were not available.'''
{| border="1" cellpadding="2" class="wikitable"
! width="80" | Offset (Dec)
! width="70" | Type
! width="90" | Name
! width="900" | Description
|-
| 00 || char[9] || Key || The first 9 bytes of the key for this entry.
|-
| 09 || uint40* || Offset || Unlike the other little-endian integers in this file, this is a big-endian 5-byte integer.  The top 10 bits are the number of the archive (data.%03d), and the bottom 30 bits are the offset in that archive to the file data.
|-
| 14 || uint32 || Size || The length of the file in bytes.
|}
*'''* designates unusual data types. In C#, you can read the Offset by reading a Byte, reading a big-endian UInt32, shifting the byte left 32 bits, and ORing them together.  Use a 30-bit mask (0x3fffffff) to get the file offset, and right shift the value 30 bits to get the archive number.
 
==.XXX Data Files==
Example file path: INSTALL_DIR\Data\data\data.015
Example file path: INSTALL_DIR\Data\data\data.015


These files consist of a sequence of headers with corresponding BLTE data.
These files consist of a sequence of headers with corresponding [[BLTE]] data.


Most .xxx archives begin with 16 special index cross-linking files. These files have no data and have encoding keys of XXYYbba1af16c50e1900000000000000, where XX is the index number and YY is the .xxx number. The purpose of these files is unclear.


*'''The data header.'''
*'''The data header.'''
{| border="1" cellpadding="2" style="background:#FCFCFC; color:black"
{| border="1" cellpadding="2" class="wikitable"
! width="80" | Offset (Hex)
! width="80" | Offset (Hex)
! width="200" | Type
! width="200" | Type
Line 828: Line 200:
! width="900" | Description
! width="900" | Description
|-
|-
| 0x00 || char[0x10] || BlteHash || The hash of the BLTE header (see the section above). If the BLTE header doesn't include a block table, this is a hash of the complete BLTE file. This hash is in reverse, so reverse it before you use it. This is also the same hash that is used as file key on CDN and in the .idx files.
| 0x00 || char[0x10] || BlteHash || Encoding key of the file, in reversed byte order. Note that only as many bytes (final bytes in this reversed order) of this key as are contained in the .idx files (9) must be accurate, and the remaining 7 bytes may be 0s or otherwise altered.
|-  
|-  
| 0x10 || uint32_t || Size || The size of this header + the following data.
| 0x10 || uint32_t || Size || The size of this header + the following data.
|-  
|-  
| 0x14 || char[0x0A] || UNK || Unknown bytes. Most likely not needed by the games.
| 0x14 || char[0x02] || Flags?? || Unknown. Mostly 0. Set to 1,0 by Agent.exe on index cross-linking files, possibly indicating data-less metadata files.
|-
| 0x16 || uint32_t || ChecksumA || hashlittle(first 0x16 bytes of the header, 0x3D6BE971)
|-
| 0x1A || uint32_t || ChecksumB || Checksum of the first 0x1A bytes of the header. The exact algorithm seems to vary over time. The current implementation is described below
|}
|}
The algorithm for calculating ChecksumB is equivalent to the following C code:
    // Table is extracted from Agent.exe 8020. Hasn't changed for quite a while
    uint32_t TABLE_16C57A8[0x10] = {
        0x049396b8, 0x72a82a9b, 0xee626cca, 0x9917754f, 0x15de40b1, 0xf5a8a9b6, 0x421eac7e, 0xa9d55c9a,
        0x317fd40c, 0x04faf80d, 0x3d6be971, 0x52933cfd, 0x27f64b7d, 0xc6f5c11b, 0xd5757e3a, 0x6c388745,
    };
   
    // Arguments:
    //  header: Pointer to the memory containing the header
    //  archive_index: Number of the data file the record is stored in (e.g. xxx in data.xxx)
    //  archive_offset: Offset of the header inside the archive file
    // Precondition: Header is at least 0x1e bytes (i.e. a full header)
    // Precondition: checksum_a has already been calculated and stored in the header
    // Assumption: Code is written assuming little endian.
    uint32_t checksum(uint8_t *header, uint16_t archive_index, uint32_t archive_offset) {
        // Top two bits of the offset must be set to the bottom two bits of the archive index
        uint32_t offset = (archive_offset & 0x3fffffff) | ((uint32_t)(archive_index & 3) << 30);
   
        uint32_t encoded_offset = TABLE_16C57A8[(offset + 0x1e) & 0xf] ^ (offset + 0x1e);
   
        uint32_t hashed_header = 0;
        for (int i = 0; i < 0x1a; i++) { // offset of checksum_b in header
            ((uint8_t *)&hashed_header)[(i + offset) & 3] ^= header[i];
        }
   
        uint32_t checksum_b = 0;
        for (int j = 0; j < 4; j++) {
            uint32_t i = (uint32_t)j + 0x1a + offset;
            ((uint8_t *)&checksum_b)[j] = ((uint8_t *)&hashed_header)[i & 3] ^ ((uint8_t *)&encoded_offset)[i & 3];
        }
        return checksum_b;
    }




*'''The BLTE data.'''
*'''The BLTE data.'''
{| border="1" cellpadding="2" style="background:#FCFCFC; color:black"
{| border="1" cellpadding="2" class="wikitable"
! width="80" | Offset (Hex)
! width="80" | Offset (Hex)
! width="200" | Type
! width="200" | Type
Line 843: Line 252:
! width="900" | Description
! width="900" | Description
|-
|-
| 0x00 || char[Header.Size - 30] || Data || The BLTE file data. See the BLTE section above.
| 0x00 || char[Header.Size - 30] || Data || The [[BLTE]] file data. See the [[BLTE]] page.
|}
|}
=hashpath=
hashpath (string path) → uint32_t
{
  string normalized = toupper (path).replace (from: '/', to: '\\')
  uint32_t pc = 0, pb = 0;
  hashlittle2 (normalized, strlen (normalized), &pc, &pb);
  return pc;
}
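
The normalization step of hashpath can be sketched in C as follows. This is only an illustration: the function name <code>hashpath_normalize</code> is hypothetical, and the hashing itself is left to <code>hashlittle2</code> from Bob Jenkins' lookup3.c, which is not reproduced here.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>

/* Hypothetical helper: upper-cases the path in place and turns every '/'
 * into '\\', as the hashpath pseudocode does before hashing.
 * Returns the length of the string, i.e. the length to pass to hashlittle2. */
size_t hashpath_normalize(char *path) {
    assert(path != NULL);
    size_t n = 0;
    for (; path[n] != '\0'; n++) {
        unsigned char c = (unsigned char)path[n];
        path[n] = (c == '/') ? '\\' : (char)toupper(c);
    }
    return n;
}
```

The normalized buffer and the returned length would then be handed to <code>hashlittle2</code> with pc and pb both initialized to 0, and pc is the resulting name hash.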


[[Category:Format]]
[[Category:Format]]

Latest revision as of 17:48, 16 March 2024

Missing something? This page was recently split up into separate pages. For the content transfer part of NGDP, see TACT. This page should now only contain information on the local filesystem format called CASC.

CASC is the name of the new file system that Blizzard created to replace the outdated format of MPQ. It is used in WoW versions ≥ WoD.

CASC v1

The CASC file system made its debut in the Heroes of the Storm Technical Alpha, which was hosted on Blizzard's servers in late January. The form of CASC that Heroes of the Storm uses is designated by Blizzard as "CASC". In contrast, World of Warcraft's "build-playbuild-installer" config line clearly states it is generated by "ngdptool_casc2" (NGDP stands for Next Generation Download Protocol). These are the two most substantial changes between CASC v1 and CASC v2:

  • Sections of CASC v1 data files are grouped together in collections of files we call "packages". These packages all have the same root folder, and if all of the files are not properly added with the package's base directory, the extraction process will produce an incredibly mangled directory output. This system is completely removed in CASC v2.
  • CASC v1's Root file relates content hashes to file names. CASC v2's Root file relates content hashes to name hashes. Translating name hashes to file names requires use of the Jenkins Hash function [1], which in turn requires a listfile to generate the hashes. Essentially CASC v1 has its own listfile (in root). CASC v2 does not, and requires the user to provide names.

The remainder of this article will refer exclusively to the system called CASC v2 as 'CASC'. While many parts of the file system are identical between v1 and v2, there are enough changes to make explaining both formats at once inadvisable.

Journal-based Data Files

During the installation process for a Blizzard game, the program will download the required files as requested by root, encoding, download, and install. It stores the downloaded data fragments in data files in "INSTALL_DIR\Data\data\". The program will record the content hash (BLTE-compressed hash), size, and position of the file as well as the number of the data file that it is in. It places those four parameters into journal files with the extension '.idx'.

Shared Memory

The shared memory file is called 'shmem' and is usually located in the same folder as the data and .IDX journals. This file contains the path where the data files are stored, the current version of each of the .IDX files, and which areas of the data files have unused space. The file is recreated every time a client is started.

Shared Memory Header Structure

  • The first part of the header.
Offset (Hex) Type Name Description
0x00 uint32_t BlockType A value indicating what type of block this is. For this block, the value is either 4 or 5.
• If the type is 5, then the free space block contains no entries. (see below)
0x04 uint32_t NextBlock The offset of the next block.
0x08 char[0x100] DataPath The path to the data files. It is always prefixed with "Global\". The path uses forward slashes (except the prefix).
• When the file is written by the Battle.Net client, it is an absolute path.
• When the file is written by the game, it is a relative path (relative from the game executable).


  • Followed by a number of these entries. The count can be calculated like this: (NextBlock - 264 - idxFileCount * 4) / 8
Offset (Hex) Type Name Description
0x00 uint32_t Size The size of the block.
0x04 uint32_t Offset The offset of the block.


  • Followed by a number of these entries. The count is equal to number of .IDX files (usually 16).
Offset (Hex) Type Name Description
0x00 uint32_t Version The version number. Used to identify the .IDX filename.

Shared Memory Free Space Structure

After a small header, this structure is split up into two equal parts. The first part contains entries with the number of unused bytes. The second part contains entries with the position of the unused bytes.

There can be up to 1090 entries. Each of the two parts will always be 5450 bytes, so if there are fewer than 1090 entries, the rest of the bytes will be padded with '\0'.

  • The header part of the structure.
Offset (Hex) Type Name Description
0x00 uint32_t BlockType A value indicating what type of block this is. For this block, the value is 1.
0x04 uint32_t NextBlock The offset of the next block.
0x08 char[0x18] Padding Padding at the end of the header.


  • This is the number of unused bytes. There can be up to 1090 entries of these. If there are fewer, the rest of the area is padded.
Offset (Hex) Type Name Description
0x00 uint10* DataNumber Before Agent v8012? this was always set to 0. After Agent v8012? this appears to be 1 when the data file referenced by the equivalent entry in the unused byte positions section has not been created.
0x01 uint30* Count The number of unused bytes.


  • This is the position of the unused bytes. There can be up to 1090 entries of these. If there are fewer, the rest of the area is padded.
Offset (Hex) Type Name Description
0x00 uint10* DataNumber The number of the data file where the unused bytes are located.
0x01 uint30* Offset The position within the data file where the unused bytes are located.

Special Case: Block Type 5

There is a special case after a game update by the Battle.Net client. It seems that the client only writes a template file without free space entries and lets the game later complete the file. The following applies in this case:

  • the first block type is set to 5 (instead of 4) with the same layout
  • file size is always 16,384 bytes long
  • offset to next block is set to 4,096
    This value is misleading: the free space block still starts at offset 340, not at offset 4,096 (340 is the value given by the block size field of the first block).
  • the free space block has an entry count of zero but is still padded to 1090 entries as usual
  • after the terminator block, the file is additionally padded to 16,384 bytes

.IDX Journals

Example file path: INSTALL_DIR\Data\data\0e00000054.idx

.IDX journals contain a mapping from keys to the location of their data in the local CASC archives. There used to be one .IDX file per journal, and the naming scheme used to have two separate meanings. The '0e' part of the file name used to designate which archive the .IDX file was associated with. This changed halfway through the Warlords Beta. Now there are 16 indices total, and the first byte of the hex filename says which of the 16 indices it is, while the remainder of the hex filename is just a version number that increments when a new set of files is added to the local archives.

To determine which of the 16 indices a key is bucketed in, the key is hashed by xoring together each 4-bit nibble in the first 9 bytes of the key:

 uint8_t cascGetBucketIndex(const uint8_t k[16]) {
   uint8_t i = k[0] ^ k[1] ^ k[2] ^ k[3] ^ k[4] ^ k[5] ^ k[6] ^ k[7] ^ k[8];
   return (i & 0xf) ^ (i >> 4);
 }

To determine the bucket index of the cross-reference entries at the start of each data file, use that same function and add 1 to the result, modulo 16 for the number of indices.

 uint8_t cascGetBucketIndexCrossReference(const uint8_t k[16]) {
   uint8_t i = cascGetBucketIndex(k);
   return (i + 1) % 16;
 }
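
As a rough sketch of how the bucket index and the naming scheme fit together, the following C code derives the journal file name for a key. It assumes the name is two hex digits for the bucket followed by eight hex digits for the version, which matches the example path 0e00000054.idx; the function names are hypothetical, and the version would in practice come from the shared memory file.

```c
#include <assert.h>
#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>

/* Same computation as cascGetBucketIndex: XOR the first 9 key bytes,
 * then fold the two nibbles of the result together. */
uint8_t casc_bucket_index(const uint8_t key[16]) {
    uint8_t x = 0;
    for (int n = 0; n < 9; n++) {
        x ^= key[n];
    }
    return (uint8_t)((x & 0x0f) ^ (x >> 4));
}

/* Hypothetical helper: writes "<bucket as 2 hex digits><version as 8 hex digits>.idx".
 * Returns the number of characters written, or -1 if out is too small. */
int casc_idx_file_name(uint8_t bucket, uint32_t version, char *out, size_t out_size) {
    assert(out != NULL && bucket < 16);
    int n = snprintf(out, out_size, "%02x%08" PRIx32 ".idx", (unsigned)bucket, version);
    return (n < 0 || (size_t)n >= out_size) ? -1 : n;
}
```

For a key whose first nine bytes XOR to 0xe0, the bucket is 14, and with version 0x54 the resulting name is 0e00000054.idx.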


.IDX Header Structure

The header is little-endian:

Offset (Hex) Type Name Description
0x00 uint32 HeaderHashSize The number of bytes to use for the hash at +04; usually 0x10.
0x04 uint32 HeaderHash This should equal the value of pc after calling hashlittle2 on the following HeaderHashSize bytes of the file with an initial value of 0 for pb and pc.
0x08 uint16 Unk0 Must be 7
0x0a uint8 BucketIndex The bucket index of this file; should be the same as the first byte of the hex filename.
0x0b uint8 Unk1 Must be 0
0x0c uint8 EntrySizeBytes Must be 4
0x0d uint8 EntryOffsetBytes Must be 5
0x0e uint8 EntryKeyBytes Must be 9
0x0f uint8 ArchiveFileHeaderBytes Must be 30
0x10 uint64 ArchiveTotalSizeMaximum The maximum size of a casc installation; 0x4000000000, or 256GiB.
0x18 char[8] padding The header is padded with zeroes to the next 0x10-byte boundary.
0x20 uint32 EntriesSize This is the length in bytes of the entries in the index file.
0x24 uint32 EntriesHash This should equal the value of pc after calling hashlittle2 on the following EntriesSize bytes of the file with an initial value of 0 for pb and pc.
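
The header layout above can be read with a few explicit little-endian loads. The following C code is a minimal sketch, not a reference implementation: the struct and function names are made up for illustration, it only checks the fields the table marks as "Must be", and it does not verify HeaderHash or EntriesHash because that requires hashlittle2.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define IDX_HEADER_SIZE 0x28u /* bytes up to and including EntriesHash */
#define IDX_ENTRY_SIZE  0x12u /* 9-byte key + 5-byte offset + 4-byte size */

typedef struct {
    uint32_t header_hash_size;
    uint32_t header_hash;
    uint8_t  bucket_index;
    uint64_t archive_total_size_maximum;
    uint32_t entries_size;
    uint32_t entries_hash;
} IdxHeader;

uint32_t hdr_le32(const uint8_t *p) {
    return (uint32_t)p[0] | ((uint32_t)p[1] << 8) |
           ((uint32_t)p[2] << 16) | ((uint32_t)p[3] << 24);
}

/* Parses the fixed-size header; returns false if the buffer is too short
 * or one of the constant fields does not have its expected value. */
bool idx_parse_header(const uint8_t *buf, size_t len, IdxHeader *out) {
    assert(buf != NULL && out != NULL);
    if (len < IDX_HEADER_SIZE) return false;

    unsigned unk0 = (unsigned)buf[0x08] | ((unsigned)buf[0x09] << 8);
    if (unk0 != 7 || buf[0x0b] != 0) return false;
    if (buf[0x0c] != 4 || buf[0x0d] != 5 || buf[0x0e] != 9 || buf[0x0f] != 30) return false;

    out->header_hash_size = hdr_le32(buf + 0x00);
    out->header_hash = hdr_le32(buf + 0x04);
    out->bucket_index = buf[0x0a];
    out->archive_total_size_maximum =
        (uint64_t)hdr_le32(buf + 0x10) | ((uint64_t)hdr_le32(buf + 0x14) << 32);
    out->entries_size = hdr_le32(buf + 0x20);
    out->entries_hash = hdr_le32(buf + 0x24);
    return true;
}

/* Number of 0x12-byte entries that follow the header. */
size_t idx_entry_count(const IdxHeader *h) {
    return h->entries_size / IDX_ENTRY_SIZE;
}
```

Reading byte by byte keeps the parser independent of the host's endianness and alignment rules.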

.IDX Entry Structure

  • The rest of the file is populated by these normal entries, each 0x12 bytes in size. Structure names were invented by the author of this section because official names were not available.
Offset (Dec) Type Name Description
00 char[9] Key The first 9 bytes of the key for this entry.
09 uint40* Offset Unlike the other little-endian integers in this file, this is a big-endian 5-byte integer. The top 10 bits are the number of the archive (data.%03d), and the bottom 30 bits are the offset in that archive to the file data.
14 uint32 Size The length of the file in bytes.
  • * designates unusual data types. In C#, you can read the Offset by reading a Byte, reading a big-endian UInt32, shifting the byte left 32 bits, and ORing them together. Use a 30-bit mask (0x3fffffff) to get the file offset, and right shift the value 30 bits to get the archive number.
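
The same decoding can be sketched in C. The struct and function names below are invented for illustration; the field layout (9-byte key, big-endian 5-byte packed offset, little-endian size) is taken from the table above.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

typedef struct {
    uint8_t  key[9];   /* first 9 bytes of the key */
    uint16_t archive;  /* top 10 bits of the packed value: N in data.N */
    uint32_t offset;   /* bottom 30 bits: position inside that archive */
    uint32_t size;     /* length in bytes */
} IdxEntry;

/* Decodes one 0x12-byte entry. */
void idx_decode_entry(const uint8_t raw[18], IdxEntry *out) {
    assert(raw != NULL && out != NULL);
    memcpy(out->key, raw, 9);

    /* Bytes 9..13 form a big-endian 40-bit integer. */
    uint64_t packed = 0;
    for (int n = 0; n < 5; n++) {
        packed = (packed << 8) | raw[9 + n];
    }
    out->archive = (uint16_t)(packed >> 30);
    out->offset = (uint32_t)(packed & 0x3fffffffu);

    /* Bytes 14..17 are an ordinary little-endian uint32. */
    out->size = (uint32_t)raw[14] | ((uint32_t)raw[15] << 8) |
                ((uint32_t)raw[16] << 16) | ((uint32_t)raw[17] << 24);
}
```

For example, the offset bytes 03 C0 00 10 00 decode to archive 15 (data.015) and offset 0x1000.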

.XXX Data Files

Example file path: INSTALL_DIR\Data\data\data.015

These files consist of a sequence of headers with corresponding BLTE data.

Most .xxx archives begin with 16 special index cross-linking files. These files have no data and have encoding keys of XXYYbba1af16c50e1900000000000000, where XX is the index number and YY is the .xxx number. The purpose of these files is unclear.

  • The data header.
Offset (Hex) Type Name Description
0x00 char[0x10] BlteHash Encoding key of the file, in reversed byte order. Note that only as many bytes (final bytes in this reversed order) of this key as are contained in the .idx files (9) must be accurate, and the remaining 7 bytes may be 0s or otherwise altered.
0x10 uint32_t Size The size of this header + the following data.
0x14 char[0x02] Flags?? Unknown. Mostly 0. Set to 1,0 by Agent.exe on index cross-linking files, possibly indicating data-less metadata files.
0x16 uint32_t ChecksumA hashlittle(first 0x16 bytes of the header, 0x3D6BE971)
0x1A uint32_t ChecksumB Checksum of the first 0x1A bytes of the header. The exact algorithm seems to vary over time. The current implementation is described below.

The algorithm for calculating ChecksumB is equivalent to the following C code:

   // Table is extracted from Agent.exe 8020. Hasn't changed for quite a while
   uint32_t TABLE_16C57A8[0x10] = {
       0x049396b8, 0x72a82a9b, 0xee626cca, 0x9917754f, 0x15de40b1, 0xf5a8a9b6, 0x421eac7e, 0xa9d55c9a,
       0x317fd40c, 0x04faf80d, 0x3d6be971, 0x52933cfd, 0x27f64b7d, 0xc6f5c11b, 0xd5757e3a, 0x6c388745,
   };
   
   // Arguments:
   //  header: Pointer to the memory containing the header
   //  archive_index: Number of the data file the record is stored in (e.g. xxx in data.xxx)
   //  archive_offset: Offset of the header inside the archive file
   // Precondition: Header is at least 0x1e bytes (i.e. a full header)
   // Precondition: checksum_a has already been calculated and stored in the header
   // Assumption: Code is written assuming little endian.
   uint32_t checksum(uint8_t *header, uint16_t archive_index, uint32_t archive_offset) {
       // Top two bits of the offset must be set to the bottom two bits of the archive index
       uint32_t offset = (archive_offset & 0x3fffffff) | ((uint32_t)(archive_index & 3) << 30);
   
       uint32_t encoded_offset = TABLE_16C57A8[(offset + 0x1e) & 0xf] ^ (offset + 0x1e);
   
       uint32_t hashed_header = 0;
       for (int i = 0; i < 0x1a; i++) { // offset of checksum_b in header
           ((uint8_t *)&hashed_header)[(i + offset) & 3] ^= header[i];
       }
   
       uint32_t checksum_b = 0;
       for (int j = 0; j < 4; j++) {
           uint32_t i = (uint32_t)j + 0x1a + offset;
           ((uint8_t *)&checksum_b)[j] = ((uint8_t *)&hashed_header)[i & 3] ^ ((uint8_t *)&encoded_offset)[i & 3];
       }
       return checksum_b;
   }


  • The BLTE data.
Offset (Hex) Type Name Description
0x00 char[Header.Size - 30] Data The BLTE file data. See the BLTE page.
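
Putting the two tables together, a record in a data file can be split into its 30-byte header and the BLTE payload that follows. The C code below is a sketch under two assumptions that the tables do not state outright: that Size and the two checksums are stored little-endian, and that a Size smaller than 30 should be treated as invalid. All names are hypothetical, and the checksums are read but not verified.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define DATA_HEADER_SIZE 30u

typedef struct {
    uint8_t  key[16];    /* encoding key, restored to normal byte order */
    uint32_t size;       /* header size + BLTE data size */
    uint8_t  flags[2];
    uint32_t checksum_a;
    uint32_t checksum_b;
} DataRecordHeader;

uint32_t rec_le32(const uint8_t *p) {
    return (uint32_t)p[0] | ((uint32_t)p[1] << 8) |
           ((uint32_t)p[2] << 16) | ((uint32_t)p[3] << 24);
}

/* Reads the header at buf; returns false if it cannot be a valid record. */
bool data_parse_header(const uint8_t *buf, size_t len, DataRecordHeader *out) {
    assert(buf != NULL && out != NULL);
    if (len < DATA_HEADER_SIZE) return false;

    /* The key is stored reversed, so undo the reversal while copying. */
    for (int n = 0; n < 16; n++) {
        out->key[n] = buf[15 - n];
    }
    out->size = rec_le32(buf + 0x10);
    out->flags[0] = buf[0x14];
    out->flags[1] = buf[0x15];
    out->checksum_a = rec_le32(buf + 0x16);
    out->checksum_b = rec_le32(buf + 0x1a);
    return out->size >= DATA_HEADER_SIZE;
}

/* Length of the BLTE data that follows the header (Header.Size - 30). */
uint32_t data_payload_size(const DataRecordHeader *h) {
    return h->size - DATA_HEADER_SIZE;
}
```

Because only the first 9 bytes of the key are guaranteed to be accurate, a caller would normally compare just <code>key[0..8]</code> against the .idx entry.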