LMS File Format Overview

This page describes the general structure of files loaded by the LMS library. Each file starts with a header, which is followed by a sequence of blocks of different kinds. All blocks are aligned to 16 bytes, and the space between two blocks is padded with 0xAB bytes.
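
The alignment rule above can be sketched as a small helper for anyone writing these files. The names `ALIGNMENT`, `PAD_BYTE` and `pad_to_alignment` are made up for this example and are not part of the LMS library.

```python
ALIGNMENT = 16        # blocks are aligned to 16 bytes
PAD_BYTE = b"\xAB"    # filler byte used between blocks


def pad_to_alignment(data: bytes, alignment: int = ALIGNMENT) -> bytes:
    """Append 0xAB bytes until len(data) is a multiple of `alignment`."""
    remainder = len(data) % alignment
    if remainder == 0:
        return data  # already aligned, nothing to add
    return data + PAD_BYTE * (alignment - remainder)
```

A writer would call this on the output buffer after each block so that the next block header starts on a 16-byte boundary.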

The following file formats are used by LMS:

File Header

Offset  Size  Description
0x0     8     Magic number
0x8     2     Endianness (0xFEFF=Big, 0xFFFE=Little)
0xA     2     Unknown (always 0?)
0xC     1     Message encoding (0=UTF-8, 1=UTF-16, 2=UTF-32)
0xD     1     Version number
0xE     2     Number of blocks
0x10    2     Unknown (always 0?)
0x12    4     Filesize
0x16    10    Padding
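
As a rough illustration, the header layout above could be read with Python's `struct` module as sketched below. This assumes the endianness field behaves like a byte-order mark (bytes FE FF on disk mean big-endian, FF FE mean little-endian). `FileHeader` and `parse_file_header` are hypothetical names chosen for this example.

```python
import struct
from typing import NamedTuple

HEADER_SIZE = 0x20  # 0x16 + 10 bytes of padding
ENCODINGS = {0: "utf-8", 1: "utf-16", 2: "utf-32"}


class FileHeader(NamedTuple):
    magic: bytes
    byte_order: str   # struct prefix: "<" (little) or ">" (big)
    encoding: str
    version: int
    num_blocks: int
    file_size: int


def parse_file_header(data: bytes) -> FileHeader:
    if len(data) < HEADER_SIZE:
        raise ValueError("data is shorter than the 0x20-byte header")

    # Assumption: the field at 0x8 is a byte-order mark.
    bom = data[0x8:0xA]
    if bom == b"\xFE\xFF":
        order = ">"
    elif bom == b"\xFF\xFE":
        order = "<"
    else:
        raise ValueError(f"unexpected endianness marker: {bom.hex()}")

    # 8s magic, H bom, H unknown, B encoding, B version,
    # H block count, H unknown, I file size, 10x padding
    (magic, _bom, _unk1, enc, version,
     num_blocks, _unk2, file_size) = struct.unpack_from(
        order + "8sHHBBHHI10x", data, 0)

    if enc not in ENCODINGS:
        raise ValueError(f"unknown message encoding: {enc}")
    return FileHeader(magic, order, ENCODINGS[enc], version,
                      num_blocks, file_size)
```

The returned `byte_order` can be passed on to whatever code reads the blocks, since all multi-byte fields after the header presumably use the same byte order.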

Block Header

Offset  Size  Description
0x0     4     Block type
0x4     4     Block size (without header)
0x8     8     Padding
0x10    -     Block data
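
Combining the block header with the 16-byte alignment rule, walking all blocks of a file might look like the sketch below. It assumes the first block starts directly after the 0x20-byte file header and returns the block type as 4 raw bytes without interpreting it. `iter_blocks` is a hypothetical helper name.

```python
import struct
from typing import Iterator, Tuple

FILE_HEADER_SIZE = 0x20
BLOCK_HEADER_SIZE = 0x10
ALIGNMENT = 16


def iter_blocks(data: bytes, num_blocks: int,
                byte_order: str = "<") -> Iterator[Tuple[bytes, bytes]]:
    """Yield (block_type, block_data) for each block in the file."""
    offset = FILE_HEADER_SIZE  # assumption: blocks follow the header directly
    for _ in range(num_blocks):
        # 4-byte block type, 4-byte size; the 8 padding bytes are skipped
        block_type, size = struct.unpack_from(byte_order + "4sI", data, offset)
        start = offset + BLOCK_HEADER_SIZE
        yield block_type, data[start:start + size]
        # Round the end of the data up to the next 16-byte boundary,
        # which skips the 0xAB padding between blocks.
        end = start + size
        offset = (end + ALIGNMENT - 1) // ALIGNMENT * ALIGNMENT
```

Here `num_blocks` and `byte_order` would come from the file header; the size field does not include the header or the trailing padding, so both have to be accounted for separately.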

Hash Tables

Many items (such as messages in msbt files or colors in msbp files) are looked up by label. The labels are looked up with a hash table and are stored in a different block than the items themselves. In official files the hash table always has a fixed number of buckets (101 in msbt files, 29 in msbp files), even if it contains only a few labels.

The following hash algorithm is used:

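def calc_hash(label, num_buckets):
    """Return the bucket index for a label."""
    value = 0  # named 'value' to avoid shadowing the built-in hash()
    for char in label:
        # Multiply-and-add over the characters, truncated to 32 bits
        value = (value * 0x492 + ord(char)) & 0xFFFFFFFF
    return value % num_buckets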

The block with the labels contains the following data:

Offset  Size          Description
0x0     4             Number of buckets
0x4     8 per bucket  Hash table buckets
-       -             Labels

Hash Table Bucket

Offset  Size  Description
0x0     4     Number of labels
0x4     4     Offset to labels

Label

Offset  Size  Description
0x0     1     Length of label string
0x1     -     Label string (without null terminator)
-       4     Item index
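
Putting the three structures above together, a label lookup could be sketched as follows. Several points are assumptions rather than facts stated on this page: bucket offsets are taken to be relative to the start of the block data, the labels of one bucket are taken to be stored back to back starting at that offset, and label strings are decoded as ASCII. `parse_label_block` and `find_item_index` are hypothetical names.

```python
import struct
from typing import List, Optional, Tuple

Bucket = List[Tuple[str, int]]  # (label, item index) pairs


def calc_hash(label: str, num_buckets: int) -> int:
    value = 0
    for char in label:
        value = (value * 0x492 + ord(char)) & 0xFFFFFFFF
    return value % num_buckets


def parse_label_block(block: bytes, byte_order: str = "<") -> List[Bucket]:
    """Decode the block data of a label block into a list of buckets."""
    (num_buckets,) = struct.unpack_from(byte_order + "I", block, 0)
    buckets: List[Bucket] = []
    for i in range(num_buckets):
        # Each bucket entry is 8 bytes: label count, offset to labels
        count, offset = struct.unpack_from(byte_order + "II", block, 4 + 8 * i)
        entries: Bucket = []
        for _ in range(count):
            length = block[offset]  # 1-byte string length
            name = block[offset + 1:offset + 1 + length].decode("ascii")
            (index,) = struct.unpack_from(
                byte_order + "I", block, offset + 1 + length)
            entries.append((name, index))
            offset += 1 + length + 4  # assumption: labels are contiguous
        buckets.append(entries)
    return buckets


def find_item_index(buckets: List[Bucket], label: str) -> Optional[int]:
    """Return the item index for `label`, or None if it is not present."""
    if not buckets:
        return None
    for name, index in buckets[calc_hash(label, len(buckets))]:
        if name == label:
            return index
    return None
```

The index returned by `find_item_index` refers to an entry in the separate block that holds the items themselves. Only one bucket has to be scanned per lookup, which is the point of hashing the label first.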