CASC
Missing something? This page was recently split up into separate pages. For the content transfer part of NGDP, see TACT. This page should now only contain information on the local filesystem format called CASC.
CASC is the name of the file system Blizzard created to replace the outdated MPQ format. It has been used in WoW since version 6.0.
CASC v1
The CASC file system debuted in the Heroes of the Storm Technical Alpha, which was hosted on Blizzard's servers in late January. The form of CASC that Heroes of the Storm uses is designated by Blizzard as "CASC". In contrast, World of Warcraft's "build-playbuild-installer" config line clearly states it is generated by "ngdptool_casc2" (NGDP stands for Next Generation Download Protocol). These are the two most substantial changes between CASC v1 and CASC v2:
- Sections of CASC v1 data files are grouped together in collections of files we call "packages". These packages all have the same root folder, and if files are not extracted relative to the package's base directory, the extraction process produces a badly mangled directory tree. This system is completely removed in CASC v2.
- CASC v1's Root file relates content hashes to file names. CASC v2's Root file relates content hashes to name hashes. Translating name hashes to file names requires use of the Jenkins Hash function [1], which in turn requires a listfile to generate the hashes. Essentially CASC v1 has its own listfile (in root). CASC v2 does not, and requires the user to provide names.
The remainder of this article will refer exclusively to the system called CASC v2 as 'CASC'. While many parts of the file system are identical between v1 and v2, there are enough changes to make explaining both formats at once inadvisable.
Journal-based Data Files
During the installation process for a Blizzard game, the program will download the required files as requested by root, encoding, download, and install. It stores the downloaded data fragments in data files in "INSTALL_DIR\Data\data\". The program records the hash of the BLTE-encoded data (the key), the size and position of the file, and the number of the data file it is in. It places those four parameters into journal files with the extension '.idx'.
Shared Memory
The shared memory file is called 'shmem' and is usually located in the same folder as the data and .IDX journals. This file contains the path where the data files are stored, the current version of each of the .IDX journals, and which areas of the data files contain unused space. The file is recreated every time a client is started.
- The first part of the header.
Offset (Hex) | Type | Name | Description |
---|---|---|---|
0x00 | uint32_t | BlockType | A value indicating what type of block this is. For this block, the value is either 4 or 5. • If the type is 5, then the free space block contains no entries. (see below) |
0x04 | uint32_t | NextBlock | The offset of the next block. |
0x08 | char[0x100] | DataPath | The path to the data files. It is always prefixed with "Global\". The path uses forward slashes (except the prefix). • When the file is written by the Battle.Net client, it is an absolute path. • When the file is written by the game, it is a relative path (relative from the game executable). |
- Followed by a number of these entries. The count can be calculated as (NextBlock - 264 - idxFileCount * 4) / 8, where 264 (0x108) is the size of the header above. (A short C sketch of reading this block follows these tables.)
Offset (Hex) | Type | Name | Description |
---|---|---|---|
0x00 | uint32_t | Size | The size of the block. |
0x04 | uint32_t | Offset | The offset of the block. |
- Followed by a number of these entries. The count is equal to the number of .IDX files (usually 16).
Offset (Hex) | Type | Name | Description |
---|---|---|---|
0x00 | uint32_t | Version | The version number. Used to identify the .IDX filename. |
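A minimal C sketch of reading this first block, under the layout above. The struct and function names are descriptive inventions, not official, and a little-endian host is assumed:
#include <stdint.h>
#include <string.h>

#define SHMEM_IDX_FILE_COUNT 16  /* there are usually 16 .idx journals */

struct ShmemDataBlock {          /* the 264-byte header described above */
    uint32_t block_type;         /* 4, or 5 for the update template special case */
    uint32_t next_block;         /* offset of the free space block */
    char     data_path[0x100];   /* "Global\..." */
};

/* Count of the {Size, Offset} entries that follow the header. */
static uint32_t shmem_entry_count(uint32_t next_block, uint32_t idx_file_count)
{
    return (next_block - 264 - idx_file_count * 4) / 8;
}

/* `data` points at the start of shmem; returns the current version of the
   .idx journal for `bucket`, i.e. the value used in its hex filename. */
static uint32_t shmem_idx_version(const uint8_t *data, unsigned bucket)
{
    struct ShmemDataBlock hdr;
    memcpy(&hdr, data, sizeof(hdr));

    uint32_t count = shmem_entry_count(hdr.next_block, SHMEM_IDX_FILE_COUNT);
    const uint8_t *versions = data + 264 + count * 8;  /* after the entries */

    uint32_t version;
    memcpy(&version, versions + bucket * 4, sizeof(version));
    return version;
}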
After a small header, this structure is split up into two equal parts. The first part contains entries with the number of unused bytes. The second part contains entries with the position of the unused bytes.
There can be up to 1090 entries. Each of the two parts will always be 5450 bytes, so if there are fewer than 1090 entries, the rest of the bytes will be padded with '\0'.
- The header part of the structure.
Offset (Hex) | Type | Name | Description |
---|---|---|---|
0x00 | uint32_t | BlockType | A value indicating what type of block this is. For this block, the value is 1. |
0x04 | uint32_t | NextBlock | The offset of the next block. |
0x08 | char[0x18] | Padding | Padding at the end of the header. |
- This is the number of unused bytes. There can be up to 1090 entries of these. If there are fewer, the rest of the area is padded.
Offset (Hex) | Type | Name | Description |
---|---|---|---|
0x00 | uint10* | DataNumber | Before Agent v8012? this was always set to 0. After Agent v8012? this appears to be 1 when the data file referenced by the equivalent entry in the unused byte positions section has not been created. |
0x01 | uint30* | Count | The number of unused bytes. |
- This is the position of the unused bytes. There can be up to 1090 entries of these. If there are fewer, the rest of the area is padded.
Offset (Hex) | Type | Name | Description |
---|---|---|---|
0x00 | uint10* | DataNumber | The number of the data file where the unused bytes are located. |
0x01 | uint30* | Offset | The position within the data file where the unused bytes are located. |
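The exact bit packing of these 5-byte (uint10 + uint30) entries is not spelled out above; the sketch below assumes they use the same big-endian encoding as the Offset field of the .IDX entries, with the top 10 bits holding the DataNumber and the low 30 bits holding the count or offset:
#include <stdint.h>

struct FreeSpan {
    uint32_t data_number;  /* data.%03d that contains the unused bytes */
    uint32_t count;        /* number of unused bytes */
    uint32_t offset;       /* where they start inside that data file */
};

/* 5-byte big-endian integer (assumption, see above). */
static uint64_t read_be40(const uint8_t *p)
{
    return ((uint64_t)p[0] << 32) | ((uint64_t)p[1] << 24) |
           ((uint64_t)p[2] << 16) | ((uint64_t)p[3] << 8) | (uint64_t)p[4];
}

/* `block` points at the free space block (header included); `i` < entry count. */
static struct FreeSpan free_span(const uint8_t *block, uint32_t i)
{
    const uint8_t *counts  = block + 0x20;        /* after the 0x20-byte header */
    const uint8_t *offsets = counts + 1090 * 5;   /* second 5450-byte half */

    uint64_t c = read_be40(counts  + i * 5);      /* top 10 bits: the 0/1 flag described above */
    uint64_t o = read_be40(offsets + i * 5);

    struct FreeSpan s;
    s.count       = (uint32_t)(c & 0x3fffffff);
    s.data_number = (uint32_t)(o >> 30);
    s.offset      = (uint32_t)(o & 0x3fffffff);
    return s;
}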
Special Case: Block Type 5
There is a special case after a game update by the Battle.Net client. It seems that the client only writes a template file without free space entries and lets the game later complete the file. The following applies in this case:
- the first block type is set to 5 (instead of 4) with the same layout
- the file is always 16,384 bytes long
- the offset to the next block is set to 4,096; this is misleading: the free space block is still found at offset 340, not at offset 4,096 (340 is indicated by the block size field of the 1st block)
- the free space block has an entry count of zero but is still padded to 1090 entries as usual
- after the terminator block, the file is additionally padded to 16,384 bytes
.IDX Journals
Example file path: INSTALL_DIR\Data\data\0e00000054.idx
.IDX journals contain a mapping from keys to the location of their data in the local CASC archives. There used to be one .IDX file per data file, and the naming scheme used to have two separate meanings: the '0e' part of the file name used to designate which archive the .IDX file was associated with. This changed halfway through the Warlords Beta. Now there are 16 indices total; the first byte of the hex filename says which of the 16 indices it is, while the remainder of the hex filename is just a version number that increments when a new set of files is added to the local archives.
To determine which of the 16 indices a key is bucketed in, the key is hashed by XORing together each 4-bit nibble in the first 9 bytes of the key:
uint8_t cascGetBucketIndex(const uint8_t k[16]) {
    uint8_t i = k[0] ^ k[1] ^ k[2] ^ k[3] ^ k[4] ^ k[5] ^ k[6] ^ k[7] ^ k[8];
    return (i & 0xf) ^ (i >> 4);
}
To determine the bucket index of the cross-reference entries at the start of each data file, use that same function and add 1 to the result, modulo 16 for the number of indices.
uint8_t cascGetBucketIndexCrossReference(const uint8_t k[16]) {
    uint8_t i = cascGetBucketIndex(k);
    return (i + 1) % 16;
}
.IDX Header Structure
The header is little-endian:
Offset (Hex) | Type | Name | Description |
---|---|---|---|
0x00 | uint32 | HeaderHashSize | The number of bytes to use for the hash at +04; usually 0x10. |
0x04 | uint32 | HeaderHash | This should equal the value of pc after calling hashlittle2 on the following HeaderHashSize bytes of the file with an initial value of 0 for pb and pc. |
0x08 | uint16 | Unk0 | Must be 7 (CascLib names this a versioning byte for the file) |
0x0a | uint8 | BucketIndex | The bucket index of this file; should be the same as the first byte of the hex filename. |
0x0b | uint8 | Unk1 | Must be 0 (CascLib names this "ExtraBytes") |
0x0c | uint8 | Spec.Size | Number of bytes used to encode the size of the entry in its record. Usually 4. |
0x0d | uint8 | Spec.Offset | Number of bytes used to encode the archive number and offset of the entry in its record. Usually 5. |
0x0e | uint8 | Spec.Key | Number of bytes used to encode the beginning of the key for an entry in its record. Usually 9. |
0x0f | uint8 | Spec.OffsetBits | Number of bits used to encode the archive offset in the record's Offset field. Usually 30. |
0x10 | uint64 | ArchiveTotalSizeMaximum | The maximum size of a casc installation; 0x4000000000, or 256GiB. |
0x18 | char[8] | padding | The header is padded with zeroes to the next 0x10-byte boundary. |
0x20 | uint32 | EntriesSize | The length in bytes of the entry block that follows the header. To get the number of populated entries, divide by (Spec.Size + Spec.Offset + Spec.Key). |
0x24 | uint32 | EntriesHash | This should equal the value of pc after calling hashlittle2 on the following EntriesSize bytes of the file with an initial value of 0 for pb and pc. |
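As a worked example of the two hash fields, the following sketch validates a loaded .IDX file using hashlittle2 from Bob Jenkins' lookup3.c. The struct mirrors the table above (the names are this page's, not official) and a little-endian host is assumed:
#include <stddef.h>
#include <stdint.h>
#include <string.h>

void hashlittle2(const void *key, size_t length, uint32_t *pc, uint32_t *pb); /* lookup3.c */

#pragma pack(push, 1)
typedef struct {
    uint32_t HeaderHashSize;            /* 0x00: bytes covered by HeaderHash, usually 0x10 */
    uint32_t HeaderHash;                /* 0x04: pc over the HeaderHashSize bytes at 0x08 */
    uint16_t Unk0;                      /* 0x08: must be 7 */
    uint8_t  BucketIndex;               /* 0x0a */
    uint8_t  Unk1;                      /* 0x0b: must be 0 */
    uint8_t  SpecSize;                  /* 0x0c: usually 4 */
    uint8_t  SpecOffset;                /* 0x0d: usually 5 */
    uint8_t  SpecKey;                   /* 0x0e: usually 9 */
    uint8_t  SpecOffsetBits;            /* 0x0f: usually 30 */
    uint64_t ArchiveTotalSizeMaximum;   /* 0x10 */
    uint8_t  Padding[8];                /* 0x18 */
    uint32_t EntriesSize;               /* 0x20: byte length of the entry block */
    uint32_t EntriesHash;               /* 0x24: pc over the EntriesSize bytes at 0x28 */
} IdxHeader;
#pragma pack(pop)

/* Returns the number of populated entries, or -1 if either hash check fails.
   `data`/`size` describe the whole .idx file in memory. */
int idx_validate(const uint8_t *data, size_t size)
{
    IdxHeader h;
    uint32_t pb = 0, pc = 0;

    if (size < sizeof(h))
        return -1;
    memcpy(&h, data, sizeof(h));
    if (size < sizeof(h) + h.EntriesSize)
        return -1;

    hashlittle2(data + 0x08, h.HeaderHashSize, &pc, &pb);
    if (pc != h.HeaderHash)
        return -1;

    pb = pc = 0;
    hashlittle2(data + 0x28, h.EntriesSize, &pc, &pb);
    if (pc != h.EntriesHash)
        return -1;

    return (int)(h.EntriesSize / (h.SpecSize + h.SpecOffset + h.SpecKey));
}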
.IDX Entry Structure
- The rest of the file is populated by these normal entries, each 0x12 bytes in size. Structure names were invented by the author of this section because official names were not available.
Every field is interpreted as little-endian unless specified otherwise.
Offset (Dec) | Type | Name | Description |
---|---|---|---|
00 | char[Header.Spec.Key] | Key | The first Header.Spec.Key bytes of the key for this entry. |
09 | char[Header.Spec.Offset]* | Offset | Unlike the other little-endian integers in this file, this is a big-endian Header.Spec.Offset-byte integer. The top (Header.Spec.Offset * 8 - Header.Spec.OffsetBits) bits are the number of the archive (data.%03d), and the bottom Header.Spec.OffsetBits bits are the offset in that archive to the file data. See the explanation below. |
14 | char[Header.Spec.Size] | Size | The length of the file in bytes. |
- * designates unusual data types. In C#, you can read the Offset by reading a Byte, reading a big-endian UInt32, shifting the byte left 32 bits, and ORing them together. (This applies when the offset is encoded on 5 bytes; if not, treat the bytes as a big-endian variable-length u64.)
- * Use a Header.Spec.OffsetBits-bit mask to get the file offset, then right-shift the value by Header.Spec.OffsetBits bits and mask with (1 << (Header.Spec.Offset * 8 - Header.Spec.OffsetBits)) - 1 to get the archive number.
Example Rust code to parse this file
use std::ops::Range;
use byteorder::{BigEndian, LittleEndian, ReadBytesExt};

pub struct Index {
    bucket : u8,
    spec : EntrySpec,
    entry_count : u32,
    buffer : Vec<u8>, /* .IDX file data after the header */
}

struct EntrySpec {
    size : u8,
    offset : u8,
    key : u8,
    offset_bits : u8,
}

pub struct Entry<'a>(&'a Index, Range<usize> /* range of bytes in the buffer for this entry */);

impl Entry<'_> {
    /* The truncated key: the first `spec.key` bytes of the record. */
    pub fn key(&self) -> &[u8] {
        let record = &self.0.buffer[self.1.clone()];
        &record[..self.0.spec.key as usize]
    }

    /* The file size in bytes (little-endian, unlike the Offset field). */
    pub fn size(&self) -> u64 {
        let spec = &self.0.spec;
        let record = &self.0.buffer[self.1.clone()];
        let start = (spec.key + spec.offset) as usize;
        let mut field = &record[start..start + spec.size as usize];
        field.read_uint::<LittleEndian>(spec.size as usize).unwrap()
    }

    /* (archive number, offset within that archive), decoded from the
       big-endian Offset field as described above. */
    pub fn offset(&self) -> (u64, u64) {
        let spec = &self.0.spec;
        let record = &self.0.buffer[self.1.clone()];
        let start = spec.key as usize;
        let mut field = &record[start..start + spec.offset as usize];
        let raw_value = field.read_uint::<BigEndian>(spec.offset as usize).unwrap();
        let archive_bits = spec.offset * 8 - spec.offset_bits;
        let offset_bits = spec.offset_bits;
        (
            (raw_value >> offset_bits) & ((1u64 << archive_bits) - 1),
            raw_value & ((1u64 << offset_bits) - 1),
        )
    }
}
010 Template
struct File {
int HeaderHashSize;
int HeaderHash;
short Version;
byte Bucket;
byte ExtraBytes;
struct {
byte Size;
byte Offset;
byte Key;
byte OffsetBits;
} EntrySpec;
int64 ArchiveTotalSizeMaximum;
byte _[8] <hidden=true>;
int EntriesSize;
int EntriesHash;
local int entryLength = EntrySpec.Size + EntrySpec.Offset + EntrySpec.Key;
local int entryCount = EntriesSize / entryLength;
local int archiveBits = EntrySpec.Offset * 8 - EntrySpec.OffsetBits;
local int offsetBits = EntrySpec.OffsetBits;
struct EntryInfo {
byte Key[EntrySpec.Key];
struct {
BigEndian();
BitfieldDisablePadding();
int ArchiveID : archiveBits;
int Offset : offsetBits;
LittleEndian();
} Offset;
byte Size[EntrySpec.Size];
} Entries[entryCount] <optimize=false>;
} file;
.XXX Data Files
Example file path: INSTALL_DIR\Data\data\data.015
These files consist of a sequence of headers with corresponding BLTE data.
Most .xxx archives begin with 16 special index cross-linking files. These files have no data and have encoding keys of XXYYbba1af16c50e1900000000000000, where XX is the index number and YY is the .xxx number. The purpose of these files is unclear. In Overwatch the cross-linking hash changes depending on the patch; for patch 128150 it was XXYY005386c9ab946d00000000000000.
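For illustration, a small sketch of constructing one of these cross-linking keys (pre-Overwatch-128150 pattern) for a given index and data file number; the function name is made up:
#include <stdint.h>
#include <string.h>

static void cross_link_key(uint8_t out[16], uint8_t index_number, uint8_t archive_number)
{
    /* XX YY bb a1 af 16 c5 0e 19 00 00 00 00 00 00 00 */
    static const uint8_t tail[14] = {
        0xbb, 0xa1, 0xaf, 0x16, 0xc5, 0x0e, 0x19,
        0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00
    };
    out[0] = index_number;    /* XX */
    out[1] = archive_number;  /* YY */
    memcpy(out + 2, tail, sizeof(tail));
}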
- The data header.
Offset (Hex) | Type | Name | Description |
---|---|---|---|
0x00 | char[0x10] | BlteHash | The encoding key of the file, in reversed byte order. Only the 9 bytes that appear in the .idx journals (the final 9 bytes in this reversed order) must be accurate; the remaining 7 bytes may be zero or otherwise altered. |
0x10 | uint32_t | Size | The size of this header + the following data. |
0x14 | char[0x02] | Flags?? | Unknown. Mostly 0. Set to 1,0 by Agent.exe on index cross-linking files, possibly indicating data-less metadata files. |
0x16 | uint32_t | ChecksumA | hashlittle(first 0x16 bytes of the header, 0x3D6BE971) |
0x1A | uint32_t | ChecksumB | Checksum of the first 0x1A bytes of the header. The exact algorithm seems to vary over time. The current implementation is described below |
The algorithm for calculating ChecksumB is equivalent to the following C code:
#include <stdint.h>

// Table is extracted from Agent.exe 8020. Hasn't changed for quite a while
uint32_t TABLE_16C57A8[0x10] = {
    0x049396b8, 0x72a82a9b, 0xee626cca, 0x9917754f,
    0x15de40b1, 0xf5a8a9b6, 0x421eac7e, 0xa9d55c9a,
    0x317fd40c, 0x04faf80d, 0x3d6be971, 0x52933cfd,
    0x27f64b7d, 0xc6f5c11b, 0xd5757e3a, 0x6c388745,
};

// Arguments:
//   header:         Pointer to the memory containing the header
//   archive_index:  Number of the data file the record is stored in (e.g. xxx in data.xxx)
//   archive_offset: Offset of the header inside the archive file
// Precondition: Header is at least 0x1e bytes (i.e. a full header)
// Precondition: checksum_a has already been calculated and stored in the header
// Assumption:  Code is written assuming little endian.
uint32_t checksum(uint8_t *header, uint16_t archive_index, uint32_t archive_offset)
{
    // Top two bits of the offset must be set to the bottom two bits of the archive index
    uint32_t offset = (archive_offset & 0x3fffffff) | ((uint32_t)(archive_index & 3) << 30);
    uint32_t encoded_offset = TABLE_16C57A8[(offset + 0x1e) & 0xf] ^ (offset + 0x1e);

    uint32_t hashed_header = 0;
    for (int i = 0; i < 0x1a; i++) { // 0x1a = offset of checksum_b in the header
        ((uint8_t *)&hashed_header)[(i + offset) & 3] ^= header[i];
    }

    uint32_t checksum_b = 0;
    for (int j = 0; j < 4; j++) {
        int i = j + 0x1a + offset;
        ((uint8_t *)&checksum_b)[j] =
            ((uint8_t *)&hashed_header)[i & 3] ^ ((uint8_t *)&encoded_offset)[i & 3];
    }
    return checksum_b;
}
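ChecksumA, by contrast, is simply lookup3's hashlittle over the first 0x16 bytes of the header with 0x3D6BE971 as the initial value; a one-line sketch, assuming lookup3.c is available:
uint32_t hashlittle(const void *key, size_t length, uint32_t initval); /* lookup3.c */
uint32_t checksum_a = hashlittle(header, 0x16, 0x3D6BE971);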
- The BLTE data.
Offset (Hex) | Type | Name | Description |
---|---|---|---|
0x00 | char[Header.Size - 30] | Data | The BLTE file data. See the BLTE page. |
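Putting the pieces together, here is a minimal sketch of walking every header in a data.xxx archive under the layout above; checksum verification is omitted and a little-endian host is assumed:
#include <stdint.h>
#include <stdio.h>

#pragma pack(push, 1)
struct DataHeader {          /* 0x1E bytes */
    uint8_t  blte_hash[16];  /* encoding key, reversed byte order */
    uint32_t size;           /* this header + the BLTE data */
    uint8_t  flags[2];
    uint32_t checksum_a;
    uint32_t checksum_b;
};
#pragma pack(pop)

static void walk_archive(FILE *f)
{
    struct DataHeader hdr;
    while (fread(&hdr, sizeof(hdr), 1, f) == 1) {
        if (hdr.size < sizeof(hdr))
            break;  /* zero padding or a corrupt record: stop */

        /* The truncated key as the .idx journals store it: the final 9 bytes
           of the reversed hash, put back into normal order. */
        printf("key ");
        for (int i = 0; i < 9; i++)
            printf("%02x", hdr.blte_hash[15 - i]);
        printf(": %u bytes of BLTE data\n", hdr.size - (uint32_t)sizeof(hdr));

        /* Skip the BLTE payload to reach the next header. */
        if (fseek(f, (long)(hdr.size - sizeof(hdr)), SEEK_CUR) != 0)
            break;
    }
}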