Cairn
Index Format
The KnowledgeBank index is a single append-only file of authenticated-encryption envelopes. This page is the complete specification. Write your own reader in any language; the format is yours.
1. Overview
The Cairn KnowledgeBank index is a local-first append-only log. Each successful export appends one row describing the artefact written to disk. Rows are stored in the browser's Origin Private File System (OPFS) in a JSON-Lines-style file: each line is a separately encrypted, authenticated envelope whose plaintext is one JSON object.
This format is published so the user is never locked into Cairn. If the extension disappears, is uninstalled, or the user simply prefers their own tooling, the on-disk bytes and the user's passphrase are sufficient to reconstruct every index row outside the extension. The reference reader at the bottom of this page is the entire dependency.
This spec versions independently of the extension. The version token cairn/v1 appears in the AAD and the KDF parameters file. A future cairn/v2 reader will refuse to silently downgrade.
2. File layout
Three files live under the OPFS subdirectory cairn/. Nothing else is written to OPFS by the index module.
- cairn/library.jsonl.enc: the append-only encrypted log. One line per envelope. Each line is the base64 encoding of [12-byte AES-GCM nonce][ciphertext][16-byte GCM auth tag], concatenated in that order. Lines are terminated by a single \n. No header, no footer, no record separator beyond the newline.
- cairn/library.salt.bin: 16 random bytes, written once at first-paid-unlock, never rewritten. Plaintext. If this file is lost, the index becomes unrecoverable. This is by design.
- cairn/library.kdf.json: the KDF parameters as a single JSON object. Plaintext. Pinned at first-paid-unlock so the parameters cannot drift mid-library.
The encrypted log file is line-oriented so a reader can stream it without parsing JSON until it has a decrypted plaintext. A truncated or corrupted line fails GCM verification and is reported as a single bad row; the rest of the file remains readable.
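The line layout described above can be checked before any key is available. The following is a minimal sketch, not the extension's own code: the names Envelope and split_envelope_line are hypothetical, and treating a too-short or non-base64 line as an error is an assumption consistent with the "single bad row" behaviour.

```python
import base64
from typing import NamedTuple

NONCE_LEN = 12  # AES-GCM nonce length from the file layout
TAG_LEN = 16    # GCM auth tag length from the file layout


class Envelope(NamedTuple):
    nonce: bytes
    ciphertext: bytes
    tag: bytes


def split_envelope_line(line: str) -> Envelope:
    """Decode one line of library.jsonl.enc into nonce, ciphertext and tag.

    Raises ValueError when the line is not valid base64 or is too short
    to hold both a nonce and a tag.
    """
    try:
        blob = base64.b64decode(line.rstrip("\n"), validate=True)
    except ValueError as exc:  # binascii.Error is a ValueError subclass
        raise ValueError(f"not valid base64: {exc}") from exc
    if len(blob) < NONCE_LEN + TAG_LEN:
        raise ValueError(f"envelope too short: {len(blob)} bytes")
    return Envelope(
        nonce=blob[:NONCE_LEN],
        ciphertext=blob[NONCE_LEN:-TAG_LEN],
        tag=blob[-TAG_LEN:],
    )
```

Most AES-GCM libraries expect the ciphertext and tag still joined, so a real reader would usually pass everything after the nonce through unchanged; the three-way split here only makes the layout visible.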
3. Row schema
The plaintext of each envelope is a single JSON object on one line. Field order is not significant; field presence is. A reader that finds an unrecognised field must preserve it and continue. A reader that finds a missing required field must mark that row corrupt.
{
  "ts": "2026-05-29T14:32:08.412Z",
  "conversation_id": "f47ac10b-58cc-4372-a567-0e02b2c3d479",
  "file_name": "claude-2026-05-29-amber-thread.json",
  "sha256": "9f86d081884c7d659a2feaa0c55ad015a3bf4f1b2b0b822cd15d6c15b0f00a08",
  "size_bytes": 48213,
  "message_count": 42,
  "artefact_count": 3
}
- ts: ISO-8601 string in UTC with millisecond precision and a trailing Z. Always wall-clock at the moment the row was written.
- conversation_id: RFC 4122 UUID string. Source-of-truth is the Claude conversation identifier; the extension does not re-mint it.
- file_name: the base file name written to the Downloads folder. No path component. Bytes are UTF-8.
- sha256: lowercase hex SHA-256 of the file's raw bytes as written. 64 hex characters.
- size_bytes: non-negative integer, byte count of the file.
- message_count: non-negative integer, number of conversation turns captured.
- artefact_count: non-negative integer, number of artefacts (code blocks, attachments) captured alongside the conversation file.
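The field rules above translate fairly directly into a validator. This is a sketch under stated assumptions: the function name validate_row is hypothetical, and treating a malformed value (not just a missing field) as a corrupt row is a stricter reading than the spec text requires. Unrecognised fields are passed through untouched.

```python
import json
import re
import uuid
from datetime import datetime

REQUIRED_FIELDS = {
    "ts": str,
    "conversation_id": str,
    "file_name": str,
    "sha256": str,
    "size_bytes": int,
    "message_count": int,
    "artefact_count": int,
}
TS_RE = re.compile(r"\d{4}-\d{2}-\d{2}T\d{2}:\d{2}:\d{2}\.\d{3}Z")
SHA256_RE = re.compile(r"[0-9a-f]{64}")


def validate_row(plaintext: str) -> dict:
    """Parse one decrypted row; raise ValueError if the row is corrupt.

    Unrecognised fields are kept in the returned dict unchanged.
    """
    row = json.loads(plaintext)  # JSONDecodeError is a ValueError subclass
    if not isinstance(row, dict):
        raise ValueError("row is not a JSON object")
    for name, expected in REQUIRED_FIELDS.items():
        if name not in row:
            raise ValueError(f"missing required field: {name}")
        value = row[name]
        # bool is a subclass of int in Python, so reject it explicitly.
        if not isinstance(value, expected) or isinstance(value, bool):
            raise ValueError(f"wrong type for field: {name}")
    if not TS_RE.fullmatch(row["ts"]):
        raise ValueError("ts is not UTC ISO-8601 with milliseconds and Z")
    # Rejects well-shaped but impossible values such as month 13.
    datetime.strptime(row["ts"], "%Y-%m-%dT%H:%M:%S.%fZ")
    # Raises ValueError on a malformed UUID (the stdlib parser is lenient
    # about braces and missing hyphens).
    uuid.UUID(row["conversation_id"])
    if not SHA256_RE.fullmatch(row["sha256"]):
        raise ValueError("sha256 is not 64 lowercase hex characters")
    if "/" in row["file_name"] or "\\" in row["file_name"]:
        raise ValueError("file_name contains a path component")
    for name in ("size_bytes", "message_count", "artefact_count"):
        if row[name] < 0:
            raise ValueError(f"{name} is negative")
    return row
```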
4. Encryption envelope
Each row is sealed independently. Rotating a key requires re-sealing every row; partial rotation is not supported in cairn/v1.
- Cipher: AES-256-GCM. Key length 32 bytes. Tag length 16 bytes.
- Nonce: 12 random bytes per row, generated with a CSPRNG (crypto.getRandomValues). Never reused under the same key.
- Plaintext: the JSON-encoded row from section 3, UTF-8 bytes, no trailing newline.
- Associated data (AAD): the literal ASCII string cairn/v1/library/row (20 bytes). Verifying this AAD binds the envelope to this version and prevents cross-context replay.
- On-disk encoding: base64-standard (RFC 4648 §4, with padding) of the concatenation nonce || ciphertext || tag, followed by a single \n.
The reader recovers the row by: base64-decoding the line, splitting off the first 12 bytes as the nonce, treating the remainder as ciphertext || tag, calling AES-GCM decrypt with the derived key and the AAD cairn/v1/library/row, and parsing the resulting plaintext as JSON.
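The write side mirrors the recovery steps above. The sketch below shows how one line could be sealed; it is not the extension's implementation. The AES-GCM primitive is passed in as a callable so the example stays standard-library only, and both seal_row and the AeadEncrypt signature are hypothetical names.

```python
import base64
import json
import secrets
from typing import Callable

AAD = b"cairn/v1/library/row"  # 20 ASCII bytes, as given in the spec
NONCE_LEN = 12

# Hypothetical signature: encrypt(nonce, plaintext, aad) -> ciphertext || tag
AeadEncrypt = Callable[[bytes, bytes, bytes], bytes]


def seal_row(row: dict, encrypt: AeadEncrypt) -> str:
    """Return one on-disk line: base64 text followed by a single newline."""
    nonce = secrets.token_bytes(NONCE_LEN)  # fresh CSPRNG nonce for every row
    # Compact single-line JSON, UTF-8 encoded, no trailing newline.
    plaintext = json.dumps(
        row, ensure_ascii=False, separators=(",", ":")
    ).encode("utf-8")
    ct_and_tag = encrypt(nonce, plaintext, AAD)
    return base64.b64encode(nonce + ct_and_tag).decode("ascii") + "\n"
```

Assuming the Python cryptography package is the chosen AES-GCM library, AESGCM(key).encrypt takes its arguments in this order (nonce, data, associated data) and returns the ciphertext with the 16-byte tag appended, so it can be passed as the encrypt argument directly.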
5. KDF parameters
Key derivation is Argon2id with parameters pinned at first-paid-unlock. The parameters live in cairn/library.kdf.json so a future reader can derive the key without guessing.
{
  "version": "cairn/v1",
  "kdf": "argon2id",
  "t": 3,
  "m": 65536,
  "p": 4,
  "outLen": 32,
  "saltPath": "cairn/library.salt.bin"
}
- t: time cost (iterations). Fixed at 3 for v1.
- m: memory cost in KiB. 65536 KiB = 64 MiB.
- p: parallelism. Fixed at 4 for v1.
- outLen: derived key length in bytes. Fixed at 32 (AES-256).
- saltPath: OPFS-relative path to the 16-byte salt file.
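A reader may want to reject a parameter file it does not understand before spending 64 MiB on key derivation. The sketch below checks only the values the list above describes as fixed for v1; load_kdf_params is a hypothetical name, and accepting any positive m is an assumption, since the spec gives a value for m without calling it fixed.

```python
import json

# Values the spec describes as fixed for cairn/v1.
V1_FIXED = {"version": "cairn/v1", "kdf": "argon2id", "t": 3, "p": 4, "outLen": 32}


def load_kdf_params(text: str) -> dict:
    """Parse library.kdf.json and check the values pinned for v1."""
    params = json.loads(text)
    if not isinstance(params, dict):
        raise ValueError("KDF parameters must be a JSON object")
    for name, expected in V1_FIXED.items():
        if params.get(name) != expected:
            raise ValueError(
                f"{name}: expected {expected!r}, got {params.get(name)!r}"
            )
    m = params.get("m")
    if not isinstance(m, int) or isinstance(m, bool) or m <= 0:
        raise ValueError("m must be a positive integer (KiB)")
    if not isinstance(params.get("saltPath"), str):
        raise ValueError("saltPath must be a string")
    return params
```

If the argon2-cffi package is the chosen Argon2id library, the fields map onto argon2.low_level.hash_secret_raw as time_cost=t, memory_cost=m (also in KiB), parallelism=p, hash_len=outLen, with type=argon2.low_level.Type.ID.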
The derived 32-byte key is cached in chrome.storage.session for the browser session only and is never written to disk in plaintext. Closing the browser flushes the session store; reopening it requires re-entering the passphrase.
As a shape check, the passphrase correct horse battery staple with a 16-byte salt of 0x0f1e2d3c4b5a69788796a5b4c3d2e1f0 and the parameters above produces a 32-byte key whose hex representation begins a7 c4 9e 1b…. Treat these bytes as illustrative only, not as a verified test vector. Argon2id is deterministic, so any conforming library given the same passphrase, salt, and parameters derives the same key, and that key decrypts envelopes sealed by any other conforming implementation.
6. Reference reader
The complete decode loop in pseudocode. Port to any language with an Argon2id library and an AES-GCM library; thirty lines is the entire surface area.
# inputs: passphrase, opfs_root
# outputs: stream of plaintext rows
kdf_json = read_json(opfs_root / "cairn/library.kdf.json")
assert kdf_json["version"] == "cairn/v1"
assert kdf_json["kdf"] == "argon2id"
salt = read_bytes(opfs_root / kdf_json["saltPath"])
assert len(salt) == 16
key = argon2id(
    password = passphrase.encode("utf-8"),
    salt     = salt,
    t        = kdf_json["t"],       # 3
    m        = kdf_json["m"],       # 65536 KiB
    p        = kdf_json["p"],       # 4
    out_len  = kdf_json["outLen"],  # 32
)
AAD = b"cairn/v1/library/row"
for line in read_lines(opfs_root / "cairn/library.jsonl.enc"):
    blob       = base64_decode(line.strip())
    nonce      = blob[:12]
    ct_and_tag = blob[12:]
    plaintext  = aes_256_gcm_decrypt(key, nonce, ct_and_tag, AAD)
    row        = json_loads(plaintext)
    yield row
That is the entire reader. A reader written this way is independent of Cairn for the lifetime of the user's library.
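The pseudocode stops at the first line that fails to decrypt, whereas section 2 says a bad line is reported as a single bad row and the rest of the file stays readable. One way a Python port could honour that is sketched below. The AES-GCM primitive is injected as a callable so the example needs only the standard library; read_rows and the AeadDecrypt signature are hypothetical names, and key derivation is assumed to have happened already.

```python
import base64
import json
from typing import Callable, Iterable, Iterator, Tuple, Union

AAD = b"cairn/v1/library/row"
NONCE_LEN, TAG_LEN = 12, 16

# Hypothetical signature: decrypt(nonce, ciphertext_and_tag, aad) -> plaintext.
# It must raise an exception when GCM authentication fails.
AeadDecrypt = Callable[[bytes, bytes, bytes], bytes]


def read_rows(
    lines: Iterable[str], decrypt: AeadDecrypt
) -> Iterator[Tuple[int, Union[dict, Exception]]]:
    """Yield (line_number, row) for good lines and (line_number, error) for bad ones."""
    for number, line in enumerate(lines, start=1):
        line = line.strip()
        if not line:
            continue  # tolerate a blank trailing line
        try:
            blob = base64.b64decode(line, validate=True)
            if len(blob) < NONCE_LEN + TAG_LEN:
                raise ValueError("envelope too short")
            plaintext = decrypt(blob[:NONCE_LEN], blob[NONCE_LEN:], AAD)
            row = json.loads(plaintext)
        except Exception as exc:  # one bad row must not stop the stream
            yield number, exc
            continue
        yield number, row
```

Assuming the Python cryptography package, AESGCM(key).decrypt accepts (nonce, data, associated data) in this order and raises InvalidTag on a failed check, so it can be passed as decrypt without a wrapper. The broad except is deliberate here because the exception type depends on whichever AES-GCM library is plugged in.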
7. What this spec doesn't cover
- Conversation contents. The index row records metadata about an exported file; it does not store conversation text. The conversation file itself lives in the user's Downloads folder as plaintext per the user's request.
- Key escrow. There is no recovery channel. Losing the passphrase loses the library. Blacktrace cannot reset a passphrase and does not hold a copy.
- Search index format. The index-browse surface in the extension builds its in-memory search structure from the decrypted rows at boot. That structure is not persisted and is not part of this spec.
- Synchronisation. The index is single-device. No cloud sync, no multi-device merge protocol. Copying cairn/library.jsonl.enc + cairn/library.salt.bin + cairn/library.kdf.json to a second machine and re-entering the passphrase is the entire supported migration path.
- Tamper detection beyond per-row GCM. A row's authenticated encryption proves that row's integrity, but not that the set of rows is complete or in its original order: every row shares the same AAD and carries no sequence number, so a deleted, duplicated, or reordered line still verifies. The file as a whole has no Merkle root and no signed manifest. A reader that requires end-to-end tamper evidence should add its own outer signature.
- Forward secrecy. Rotating a compromised passphrase requires re-sealing every row with a key derived from the new passphrase. The format does not provide automatic forward secrecy across rows.
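For the tamper-detection item above, a reader that wants whole-file evidence could keep an outer tag next to the log. The sketch below uses HMAC-SHA256, which is a symmetric MAC rather than a true signature, and is not part of cairn/v1. The names manifest_tag, verify_manifest, and manifest_key are hypothetical, and the key is assumed to be independent of the row-encryption key.

```python
import hashlib
import hmac


def manifest_tag(log_bytes: bytes, manifest_key: bytes) -> str:
    """Hex HMAC-SHA256 over the whole encrypted log file."""
    return hmac.new(manifest_key, log_bytes, hashlib.sha256).hexdigest()


def verify_manifest(log_bytes: bytes, manifest_key: bytes, expected_hex: str) -> bool:
    """Constant-time comparison against a previously stored tag."""
    return hmac.compare_digest(manifest_tag(log_bytes, manifest_key), expected_hex)
```

Because the log is append-only, the stored tag has to be recomputed after every export; a deleted or reordered line then changes the tag even though each remaining row still passes GCM verification.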