DreamDB Specification — 0000: Overview

Status: Draft. This document establishes the vocabulary and conceptual model for the rest of the spec. Every subsequent document inherits the terms and stance defined here.

1. What DreamDB Is

DreamDB is a storage and retrieval protocol for multimodal signals (video, audio, text, vectors, and arbitrary derived features) anchored to a shared timeline.

It defines:

A data model for representing multimodal information as time-anchored, immutable, content-addressed objects.
An addressing scheme in which the semantic features of data (e.g. learned vector embeddings) deterministically encode the storage path of that data.
A backend HTTP contract that any cloud object store (S3, OSS, GCS, Azure Blob, MinIO, …) already provides or can be made to provide — Range reads, idempotent puts, optional conditional writes for refs.
A client protocol for ingesting, layering, querying, and streaming data, with search logic pushed to the SDK rather than executed by a centralized service.

DreamDB is not:

A database, in the sense of a managed service with its own query engine.
A specific implementation. DreamDB is the spec; implementations are conformant when they obey it.
An indexing library bolted onto an existing data store. Indexing is intrinsic to the addressing scheme — there is no separate index to keep in sync.

2. Why a Protocol, Not a Database

Classical databases (relational, document, vector) bundle three concerns: a data model, a query engine, and a storage layer. They couple these tightly because shared state simplifies optimization and consistency.

That coupling becomes the bottleneck at the scale DreamDB targets:

10B+ multimodal records. No single query engine can sit on the hot path of every read without becoming the bottleneck.
1,000+ concurrent collaborators. Shared mutable state requires global coordination; every "update" is a potential conflict.
Heterogeneous backends. The data outlives the storage technology. A protocol that survives backend migration is more useful than a database that does not.

A DreamDB space is, instead, a set of immutable files arranged on a backend in a way that the addressing scheme can interpret without a coordinator. The "engine" is a library that runs in the client. Any number of clients can read the same space, write new immutable layers to it, and exchange manifests that describe what exists — with no shared service mediating.

The closest analogues are IPFS (content-addressed object store with a wire protocol), Git (immutable, content-addressed history plus mutable references), and Parquet/Iceberg (storage-layout conventions read by many engines). DreamDB borrows from all three but specializes them for time-anchored multimodal signals.

3. The Architectural Shape

       ┌──────────────────────────────────────────┐
       │              Application                 │  ← user code
       └───────────────────┬──────────────────────┘
                           │  (function call)
       ┌───────────────────▼──────────────────────┐
       │               DreamDB SDK                │
       │ ┌──────────────────────────────────────┐ │
       │ │         DreamDB Protocol             │ │ ← the spec lives here:
       │ │     (the rules of the spec)          │ │   hashing, time-axis
       │ │                                      │ │   mapping, vector→path,
       │ │                                      │ │   Manifest parsing
       │ └────────────────┬─────────────────────┘ │
       │                  │                       │
       │ ┌────────────────▼─────────────────────┐ │ ← thin transport shim:
       │ │        Storage Connector             │ │   maps protocol calls
       │ │     (driver-like, no DreamDB         │ │   to HTTP. No DreamDB
       │ │      logic; just transport)          │ │   logic of its own.
       │ └────────────────┬─────────────────────┘ │
       └──────────────────┼───────────────────────┘
                          │  HTTP Range-GET / PUT
       ┌──────────────────▼───────────────────────┐
       │         Cloud Object Store               │ ← pure storage block
       │      (S3 / OSS / GCS / Azure Blob /      │   (NO DreamDB logic
       │       MinIO — anything that speaks       │    here)
       │       HTTP with Range and conditional    │
       │       writes)                            │
       └──────────────────────────────────────────┘

The architecture has three layers, two boundaries:

DreamDB Protocol (the upper sub-component of the SDK): all interesting logic lives here — content hashing, time-axis mapping, vector → spatial-path computation, Manifest parsing, Index Page tree traversal, query planning. This is what the spec normatively defines.
Storage Connector (the lower sub-component of the SDK): a thin in-process shim that translates Protocol-level operations into HTTP requests against the backend. It carries no DreamDB semantics of its own — its only job is transport. In Rust terms, this is a few hundred lines wrapping reqwest/hyper.
Cloud Object Store: a pure-storage backend that speaks HTTP with Range reads and idempotent PUTs, plus conditional writes where the optional refs layer (§5.2) is used. No DreamDB code runs here. S3, OSS, GCS, Azure Blob, MinIO, and any CDN-fronted equivalent qualify. This is the only kind of backend DreamDB v0 supports.

The two boundaries:

Function-call boundary between the Application and the SDK. The application links the SDK as a library; there is no DreamDB daemon, service, or RPC on this side.
HTTP boundary between the Storage Connector and the Cloud Object Store. The contract is HTTP semantics: which verbs, which status codes, which conditional headers, and which consistency guarantees. 0005-backend-interface.md makes that contract precise.

Two things that follow from this shape:

The DreamDB Protocol is portable. A future native HDFS or IPFS Connector would swap out the Storage Connector layer without touching the Protocol component above it. v0 ships only the HTTP Connector; alternatives are out of scope.
The conformance story is HTTP-shaped. "Is this backend DreamDB-compliant?" reduces to "does it respond to a defined set of HTTP requests with the required semantics?" — a story testable end-to-end with curl.

Everything above the SDK boundary is application code. Everything inside the SDK is normatively specified by DreamDB. Everything below the HTTP boundary is commodity storage. DreamDB defines the SDK and the HTTP contract — nothing else.
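
The split between the Protocol component and the Storage Connector can be sketched as a narrow interface. This is a minimal illustration, not the normative contract (which 0005-backend-interface.md defines): the names StorageConnector, InMemoryConnector, get_range, put, and read_header are hypothetical, and a dict stands in for the HTTP backend.

```python
from typing import Protocol


class StorageConnector(Protocol):
    """Hypothetical transport-only interface; carries no DreamDB semantics."""

    def get_range(self, key: str, start: int, end: int) -> bytes:
        """Return bytes [start, end) of an object, like an HTTP Range GET."""
        ...

    def put(self, key: str, data: bytes) -> None:
        """Store an object, like an HTTP PUT."""
        ...


class InMemoryConnector:
    """Stand-in backend for illustration; a real connector would issue HTTP."""

    def __init__(self) -> None:
        self._objects: dict[str, bytes] = {}

    def get_range(self, key: str, start: int, end: int) -> bytes:
        return self._objects[key][start:end]

    def put(self, key: str, data: bytes) -> None:
        self._objects[key] = data


def read_header(conn: StorageConnector, key: str, size: int) -> bytes:
    """Protocol-level code depends only on the interface, not on the backend."""
    return conn.get_range(key, 0, size)
```

Because read_header only sees the interface, swapping the in-memory stand-in for an HTTP-backed connector would not change any Protocol-level code.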

4. Core Vocabulary

The following terms are introduced informally here and given precise definitions in later documents (0001-data-model.md for the entities; 0002-content-addressing.md for addresses; 0003-time-encoding.md for time semantics).

Timeline. A monotonic axis of high-precision timestamps. Every piece of data in a DreamDB space is anchored to a point or interval on the timeline. The timeline is the only primary key.
Item. The atomic unit of a track: a single time-anchored payload of the track's modality. The kind of item depends on the kind of track (see below): a Frame for continuous signals, an Event for discrete events, a Constant for global constants.
Track. A collection of items of one modality, anchored to a timeline. A track has a kind that determines its time-anchoring discipline and its storage layout. DreamDB defines three track kinds:
- Continuous Signal Track. Dense items (Frames) over an interval of the timeline, typically at a regular cadence. Examples: video, audio, dense per-frame embeddings. Optimized for streaming consumption (Principle 4).
- Discrete Event Track. Sparse, irregularly-spaced items (Events), each anchored to its own timestamp or short interval. Examples: scene-change markers, transcript turns, annotations, sensor pings.
- Global Constant Track. Exactly one item (a Constant), anchored to the entire timeline span. Examples: title, author, license, source URI, calibration constants. The "name" of the constant is carried by the track's modality (e.g. title.text, author.text), so there is no separate name-within-a-track key.
A "video track" and an "audio track" of the same recording are two Continuous Signal Tracks occupying the same interval; a "title" and a "license" are two Global Constant Tracks on the same timeline.
Layer. A derived track placed over an existing track (or set of tracks) on the same time interval. Example: a vector embedding layer derived from a video track. Layers are how new information is added without modifying originals.
Manifest. An immutable, content-addressed object that enumerates the tracks and layers present on a timeline at a given moment in the space's history. Like a Git commit, manifests reference their parents and are themselves addressable.
Address. The protocol-level identifier for any object. Combines time anchor, modality, and (for spatially-indexed data) feature coordinates. Addresses are deterministic functions of content, not opaque identifiers.
Backend. The cloud object store hosting the bytes — S3, OSS, GCS, Azure Blob, MinIO, or any HTTP-speaking equivalent. Backends carry no DreamDB logic; they obey the HTTP contract defined in 0005-backend-interface.md.
Storage Connector. The thin in-SDK transport shim that translates DreamDB Protocol operations into HTTP requests against the Backend. Adds no DreamDB semantics of its own.
SDK. The client library; combines the DreamDB Protocol component (where all spec-defined logic lives) and the Storage Connector. Linked into the application as a library; no separate process or service.
Space. A single DreamDB instance — a set of timelines, tracks, layers, and manifests that share a content-addressing root and a backend.
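
As a rough mental model only, the track vocabulary might be written down as the data structures below. The precise definitions belong to 0001-data-model.md; the field names, the nanosecond unit, and the modality string are assumptions made for this sketch.

```python
from dataclasses import dataclass
from enum import Enum


class TrackKind(Enum):
    CONTINUOUS_SIGNAL = "continuous"   # items are Frames
    DISCRETE_EVENT = "event"           # items are Events
    GLOBAL_CONSTANT = "constant"       # exactly one item, a Constant


@dataclass(frozen=True)
class Item:
    """A time-anchored payload; start_ns == end_ns models a point anchor."""
    start_ns: int
    end_ns: int
    payload: bytes


@dataclass(frozen=True)
class Track:
    modality: str          # e.g. "title.text"; the naming format is assumed
    kind: TrackKind
    items: tuple[Item, ...]

    def __post_init__(self) -> None:
        # A Global Constant Track carries exactly one Constant.
        if self.kind is TrackKind.GLOBAL_CONSTANT and len(self.items) != 1:
            raise ValueError("a Global Constant Track holds exactly one item")
```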

5. The Four First Principles, in Spec Form

The principles in the project vision are aesthetic; this section translates each into a concrete obligation that the rest of the spec must honor.

5.1 Time is the sole primary key

The protocol defines no user-level identifier scheme. Tracks and items are addressed by their position on the timeline plus their modality, not by name or UUID.
All co-occurring tracks share the same timeline. There is no per-track clock that needs reconciliation.
The principle holds across all three track kinds, with progressively degenerate time anchors:
- Continuous Signal Tracks anchor each Frame to a point (or short, regular interval).
- Discrete Event Tracks anchor each Event to a point (or short, irregular interval).
- Global Constant Tracks anchor their single Constant to the entire timeline span — a degenerate but valid time anchor. The constant is "true at every point on the timeline."
The timeline's resolution must be fine enough that no two distinct items within a single modality can collide at the same timestamp under any realistic ingest scenario. (Resolution and format are fixed in 0003-time-encoding.md.)
Implication for design: Joining across modalities is a range query on the timeline, not a foreign-key lookup. Constants and Events are joined to signals by overlap on the timeline, not by ID. Any spec choice that would require non-time keys for a join is rejected.
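
To make "join by overlap" concrete, here is a small sketch under simplifying assumptions: anchors are half-open integer intervals in unspecified timeline ticks, a point anchor at t is stored as (t, t + 1), and the names Anchored and overlap_join are hypothetical. The nested loop is quadratic and only meant to show the predicate; a real SDK would use the address structure to narrow candidates.

```python
from typing import NamedTuple


class Anchored(NamedTuple):
    start: int   # inclusive, in timeline ticks (the unit is an assumption)
    end: int     # exclusive; a point anchor at t is stored as (t, t + 1)
    value: str


def overlap_join(left: list[Anchored], right: list[Anchored]) -> list[tuple[str, str]]:
    """Pair every left item with every right item whose interval overlaps it."""
    pairs = []
    for a in left:
        for b in right:
            if a.start < b.end and b.start < a.end:
                pairs.append((a.value, b.value))
    return pairs


frames = [Anchored(0, 40, "frame0"), Anchored(40, 80, "frame1")]
events = [Anchored(50, 51, "scene-change")]
constants = [Anchored(0, 80, "title=Final")]   # spans the whole timeline
```

Joining frames with events pairs only frame1 with the scene change, while joining frames with constants pairs the title with every frame, which is the "true at every point on the timeline" behavior described above.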

5.2 Immutability is the bedrock of collaboration

DreamDB separates state into two layers with distinct mutation disciplines. The content layer holds essentially all bytes and is strictly immutable. The refs layer is small, optional, and uses optimistic concurrency. Splitting them keeps the data plane absolutely lock-free while still allowing stable, advanceable names where backends can support them.

Content layer (Items, Tracks, Manifests).

Every content-layer object is content-addressed: its address is a function of its bytes. Two writers producing the same bytes produce the same address.
The protocol has no update or delete verb at this layer. Corrections are expressed as new layers that supersede earlier ones at the manifest level. The original bytes remain.
Backends MUST support idempotent put-by-hash: writing an already-present address with the same bytes is a no-op; writing different bytes to the same content-address is impossible by construction (the address is the hash). No CAS is required at this layer.
If two clients race to write the same content, the result is indistinguishable from one client writing once.
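
The idempotent put-by-hash behavior can be illustrated with an in-memory stand-in. SHA-256 is an assumption here (0002-content-addressing.md fixes the real hash function), and ContentStore is a hypothetical name.

```python
import hashlib


class ContentStore:
    """In-memory stand-in for a backend's content layer."""

    def __init__(self) -> None:
        self._objects: dict[str, bytes] = {}
        self.physical_writes = 0

    def put(self, data: bytes) -> str:
        # SHA-256 is an assumption; 0002 fixes the real hash function.
        address = hashlib.sha256(data).hexdigest()
        if address not in self._objects:     # already present: no-op
            self._objects[address] = data
            self.physical_writes += 1
        return address

    def get(self, address: str) -> bytes:
        return self._objects[address]
```

Two writers calling put with the same bytes receive the same address and cause one physical write, so a race between them is indistinguishable from a single write.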

Refs layer (named manifest pointers — optional).

A ref is a small named pointer that maps a stable name (e.g. main, release/v1) to the address of a Manifest. Refs are the only mutable thing in the system.
Updates to a ref use conditional write (CAS / If-Match): a writer reads ref → addr_old, computes addr_new, and attempts put(ref, addr_new) if current == addr_old. The loser retries against the new state. This is optimistic concurrency, not cooperative locking — no writer ever waits on another writer's lock.
Refs are optional. A Space can be addressed purely by manifest hash (dreamdb://<backend>/<manifest-hash>); the refs layer is a convenience for "I want a stable name that advances," not a requirement. Backends that do not expose a conditional-write primitive are still conformant — they support hash-addressed Spaces only, with manifest distribution handled out-of-band.
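Conditional-write primitives of this kind exist on the major object stores (S3 If-Match, GCS x-goog-if-generation-match, Azure Blob If-Match, Aliyun OSS x-oss-forbid-overwrite) and on local filesystems (O_CREAT|O_EXCL + rename). Note that x-oss-forbid-overwrite and O_CREAT|O_EXCL are create-if-absent primitives rather than compare-and-swap against an existing value: they guarantee only that a write does not clobber an existing object.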
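
The read, compute, conditional-write, retry cycle for refs might look like the following. This is a sketch against an in-memory stand-in; RefStore, compare_and_set, and advance_ref are hypothetical names, and on a real backend compare_and_set would map to an If-Match style conditional PUT.

```python
class RefStore:
    """In-memory stand-in for a backend with a conditional-write primitive."""

    def __init__(self) -> None:
        self._refs: dict[str, str] = {}

    def read(self, name: str):
        return self._refs.get(name)          # None if the ref does not exist

    def compare_and_set(self, name: str, expected, new: str) -> bool:
        # Mirrors an If-Match style write: succeeds only if unchanged.
        if self._refs.get(name) != expected:
            return False
        self._refs[name] = new
        return True


def advance_ref(store: RefStore, name: str, make_next, max_attempts: int = 10) -> str:
    """Optimistic update: read, compute, conditionally write, retry on loss."""
    for _ in range(max_attempts):
        old = store.read(name)
        new = make_next(old)                 # e.g. publish a child manifest
        if store.compare_and_set(name, old, new):
            return new
    raise RuntimeError("ref update kept losing the race")
```

A losing writer never blocks: it simply re-reads the ref and recomputes its proposal against the new state.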

Implication for design: The data plane is absolutely lock-free. The refs plane is coordination-free in the no-conflict case (the expected case for DreamDB's branch-per-writer pattern); rare conflicts are detected and retried, never blocked. No part of the spec may require cooperative locking between writers in either plane.

5.3 Retrieval is probabilistic localization, not scanning

Every DreamDB Item address has a two-part structure:

   <prefix>                           /  <payload-hash suffix>
   ───────────────────────────────       ───────────────────────
   what queries TARGET                   what disambiguates collisions
   (deterministic from time +            (deterministic from the
    modality, plus a spatial            item's bytes)
    component for feature-bearing
    items)

This structure supports two cheap lookup modes:

Lookup by time / by feature region — list-prefix(<prefix>) returns the set of items matching that prefix. One backend list operation; usually 1 entry, occasionally more.
Lookup by exact identity — get(<full address>) returns the one item with those bytes. Used when the full address is already known (e.g. from a manifest entry, or from a prior list-prefix hit).
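
The two lookup modes can be mimicked with a flat key space. The "<prefix>/<payload-hash>" string layout, the truncated SHA-256 suffix, and the example prefix are illustrative assumptions; the real address grammar is fixed in 0002-content-addressing.md.

```python
import hashlib

objects: dict[str, bytes] = {}   # stand-in for the backend's key space


def put_item(prefix: str, payload: bytes) -> str:
    # Layout and hash choice are assumptions made for this sketch.
    address = f"{prefix}/{hashlib.sha256(payload).hexdigest()[:16]}"
    objects[address] = payload
    return address


def list_prefix(prefix: str) -> list[str]:
    """Lookup by time or feature region: every address under the prefix."""
    return sorted(k for k in objects if k.startswith(prefix + "/"))


def get(address: str) -> bytes:
    """Lookup by exact identity."""
    return objects[address]


a = put_item("tl1/event.marker/000123", b"scene-change")
b = put_item("tl1/event.marker/000123", b"speaker-change")
# Same time-keyed prefix, different payload hashes: both items coexist.
```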

Item ≠ Object: the bucketing layer

At 10B+ scale, an additional distinction is essential. An Item is the protocol's logical addressable unit; an Object is the backend's physical storage unit. Multiple Items are grouped into one Object whenever one-Item-per-Object would be uneconomical. (Stored as 1B separate objects on S3, 1B vectors cost roughly $400 in GET request fees for a single full scan, assuming about $0.0004 per 1,000 GETs, and per-object metadata overhead exceeds the payload bytes.)

The full Item address therefore decomposes:

   <object-address>  ·  <intra-object-locator>
   ─────────────────     ──────────────────────
   targets one Object    locates the Item within it
   on the backend        (byte offset, array index,
                          time-within-bucket, etc.)

Both lookup modes still work: list-prefix targets a set of Objects; the SDK then extracts the matching Items from each.

This same pattern appears in three places in the spec — instances of one mechanism, not three special cases:

Items grouped	Object kind	Intra-object locator	Why bucketing is needed
video / audio frames	Fragment (GOP)	byte offset within fragment	VBR; per-frame closed-form byte-offset impossible
dense vectors at scale	Spatial Bucket	index into packed array	per-vector Object count fatal at 1B+ scale
high-volume events	Time-bucketed batch	time-ordered position	per-event Object count fatal at high event rates

For modalities where item count and item size are already well-matched to one-Item-per-Object (small embedding tracks, sparse low-volume events, all Constants), no bucketing is used — the Item is the Object; the intra-object locator is empty.
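
For the Spatial Bucket case, resolving an intra-object locator amounts to indexing into a packed array. The sketch below assumes a toy layout (little-endian float32 vectors of a fixed dimensionality, stored back to back); the actual bucket format is pinned in 0004, and pack_bucket and read_item are hypothetical names.

```python
import struct

DIM = 4  # vector dimensionality; a toy value for illustration


def pack_bucket(vectors: list[list[float]]) -> bytes:
    """Pack vectors back to back as little-endian float32 values."""
    return b"".join(struct.pack(f"<{DIM}f", *v) for v in vectors)


def read_item(bucket: bytes, index: int) -> tuple[float, ...]:
    """Resolve the intra-object locator (an array index) to one vector."""
    size = struct.calcsize(f"<{DIM}f")
    offset = index * size
    return struct.unpack_from(f"<{DIM}f", bucket, offset)
```

With a fixed record size the byte offset is index times record size, so the SDK can extract one Item from a fetched Object, or request just that byte range, without scanning.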

Per track kind

Time collisions are not a bug. Two distinct events at the same timestamp produce the same time-keyed prefix but different payload-hash suffixes, so they coexist. Lookup-by-time enumerates them all. Discrete Event Tracks rely on this; same-instant events are expected, not pathological.
For feature-bearing data (vectors, learned hashes), the prefix is augmented with a spatial-keyed component derived from the feature. Similar features produce similar prefixes (locality property quantified in 0004-spatial-indexing.md). A feature query computes the candidate prefix region — typically 1–16 Spatial Bucket Objects — fetches them, and runs exact comparison against the candidates inside.
For Discrete Event and Global Constant Tracks (no high-dimensional features), the prefix is purely time-keyed. Lookup is an O(1) prefix list per timestamp (or per time bucket).
For Continuous Signal Tracks of media modalities, the Fragment is the Object. Per-frame addressing resolves to (fragment-address, byte-range); the byte range is computed from the fragment-index in the manifest's track entry (small, cached, content-addressed). Cross-version conflicts are a layer-ordering concern (0008), not an addressing concern.
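
One way a spatial-keyed prefix can be derived from a feature is sign random projection, the family that §9 (OQ-2) names as the v0 default. The following is a generic sketch of that technique, not the bucket-key derivation pinned in 0004; the function names, the seed handling, and the bit-string encoding are assumptions.

```python
import random


def make_hyperplanes(dim: int, bits: int, seed: int = 0) -> list[list[float]]:
    """Random Gaussian hyperplanes; every client must share the same ones."""
    rng = random.Random(seed)
    return [[rng.gauss(0.0, 1.0) for _ in range(dim)] for _ in range(bits)]


def bucket_key(vector: list[float], planes: list[list[float]]) -> str:
    """One bit per hyperplane: which side of the plane the vector falls on."""
    bits = []
    for plane in planes:
        dot = sum(p * x for p, x in zip(plane, vector))
        bits.append("1" if dot >= 0.0 else "0")
    return "".join(bits)


def hamming(a: str, b: str) -> int:
    """Number of positions at which two equal-length keys differ."""
    return sum(x != y for x, y in zip(a, b))
```

For random hyperplanes, the probability that two vectors disagree on a given bit equals the angle between them divided by π, so vectors with high cosine similarity tend to share long key prefixes. The key depends only on direction: scaling a vector by a positive constant leaves it unchanged.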

The expected number of backend Object reads per query must be sub-linear in the size of the space — provably so for the spatial-indexing scheme adopted.

Implication for design: The address grammar fixed in 0002 MUST accommodate the <object-address> · <intra-object-locator> decomposition. Per-modality bucketing schemes (Fragment, Spatial Bucket, Event Batch) and the bucket-key derivation for spatially-indexed data are pinned in 0004 and 0007.

5.4 Data is stream

For media modalities (video, audio), compression is variable-bit-rate by default. VBR makes per-frame byte position a function of content, not of timestamp, so a closed-form byte_offset(t) cannot exist. Forcing CBR would discard quality; rejecting VBR would exclude essentially all real-world media. DreamDB therefore uses fragment-based encapsulation:

Continuous Signal Tracks of media modalities are stored as a sequence of Fragments — self-contained, decoder-ready chunks (typically a GOP, ~1–10 s of media). Each Fragment is a content-addressed Object on the backend.
The manifest's track entry holds a small fragment-index: an array with one (time_start_i, byte_size_i, fragment_address_i) entry per Fragment. This is the only "central index" in the system. It is small (KB to low MB even for hour-long tracks), content-addressed, immutable, and cached per session.
A seek to time T: consult the fragment-index (cached) → identify the containing Fragment → one backend GET for that Fragment → intra-fragment seek (microseconds, against the Fragment's own header).
The SDK never reads media bytes to discover a seek point. The fragment-index (small) and the matching Fragment (one read) are the only fetches.
A successful query against a media track returns a Fragment address + byte range that, fed to a streaming media decoder, plays back the result with no further translation step.
For Discrete Event Tracks and Global Constant Tracks, the streaming requirement is trivially satisfied: each Item (or bucket of Items) is its own Object, retrievable by a single backend read at its address.
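
The seek path described above reduces to a binary search over the cached fragment-index. This sketch assumes the index is sorted by time_start and that Fragments are contiguous; FragmentEntry and locate_fragment are hypothetical names, and the integer time unit is left unspecified.

```python
from bisect import bisect_right
from typing import NamedTuple


class FragmentEntry(NamedTuple):
    time_start: int        # timeline ticks; the unit is an assumption
    byte_size: int
    fragment_address: str


def locate_fragment(index: list[FragmentEntry], t: int) -> FragmentEntry:
    """Return the Fragment whose interval contains time t.

    Assumes the index is sorted by time_start and Fragments are contiguous.
    """
    starts = [entry.time_start for entry in index]
    pos = bisect_right(starts, t) - 1
    if pos < 0:
        raise ValueError("t precedes the first Fragment")
    return index[pos]
```

The lookup touches no media bytes. Since the index as described carries no end time for the last Fragment, this sketch does not detect a seek past the end of the track.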

The protocol must specify, for each supported media modality, a Fragment format and a fragment-index encoding. (0007-streaming-encapsulation.md defines fragment duration ranges, header layout, and per-modality container choice — likely fragmented MP4 / CMAF for video and audio.)

Implication for design: DreamDB's "central index" is the manifest's track entry, not anything inside the media file. No design step relies on parsing media bytes to seek. Fragmentation is mandatory for VBR media tracks; CBR-only modalities MAY skip fragmentation if the closed-form byte-offset function is exact.

6. A Worked Example, End to End

This section walks through a single ingest-and-query cycle in imprecise but suggestive terms. Precise definitions follow in later docs.

Ingest

A camera produces a 10-minute video. A client wishes to add it to a DreamDB space.

The client picks a timeline and a starting timestamp (e.g. the camera's wall-clock time at the start of recording).
The video is encoded as a sequence of Fragments (per 0007), each a self-contained ~2 s GOP. Each Fragment is hashed and written to the backend as an immutable Object.
The client builds a small fragment-index — [(time_start_i, byte_size_i, fragment_address_i)] — and embeds it in the new track entry.
The client publishes a new manifest that adds this video track (with its fragment-index) to the timeline. The manifest is content-addressed and references the previous manifest as its parent.
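
The ingest steps above can be sketched as follows. The JSON encoding, the field names, and SHA-256 are all assumptions made for illustration; the real manifest format and hash function are defined in later documents, and build_manifest is a hypothetical helper.

```python
import hashlib
import json
from typing import Optional


def address_of(data: bytes) -> str:
    # SHA-256 is an assumption; the real hash function is fixed in 0002.
    return hashlib.sha256(data).hexdigest()


def build_manifest(
    fragments: list[tuple[int, bytes]], parent: Optional[str]
) -> tuple[str, bytes]:
    """Build a manifest for one video track from (time_start, bytes) pairs."""
    fragment_index = [
        {
            "time_start": start,
            "byte_size": len(data),
            "fragment_address": address_of(data),
        }
        for start, data in fragments
    ]
    manifest = {
        "parent": parent,
        "tracks": [{"modality": "video", "fragment_index": fragment_index}],
    }
    # Canonical serialization, so equal manifests hash to equal addresses.
    encoded = json.dumps(manifest, sort_keys=True, separators=(",", ":")).encode()
    return address_of(encoded), encoded
```

Because the parent address is part of the hashed bytes, a child manifest always has a different address from its parent, which is what lets manifests form a history.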

So far, no vectors, no semantic indexing.

Layering an embedding track

A second client wishes to make the video searchable by visual content.

It reads the video track's Fragments (range reads against the immutable Fragment Objects).
It computes a vector embedding per video frame (or per N-frame chunk). These vectors form a new track, anchored to the same timeline.
For each vector, the client derives a spatial-bucket key from the vector's coordinates (per 0004). Vectors that share a key prefix land in the same bucket.
Vectors are packed into Spatial Bucket Objects (one Object per bucket; targeting ~1–10 MB per Object) and written to the backend at addresses derived from (timeline, modality, spatial-bucket-key, bucket-content-hash). Few large writes, not many small ones.
A new manifest is published, adding the embedding layer (and its enumeration of Bucket Objects) to the timeline.
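
The packing step might be sketched like this. The record layout (int64 time anchor followed by float32 coordinates), the address string format, the truncated SHA-256 digest, and the build_buckets helper are assumptions for illustration; key_of stands in for the spatial-bucket key derivation of 0004.

```python
import hashlib
import struct
from collections import defaultdict
from typing import Callable


def build_buckets(
    vectors: list[tuple[int, list[float]]],
    key_of: Callable[[list[float]], str],
    timeline: str,
    modality: str,
) -> dict[str, bytes]:
    """Group (time_anchor, vector) pairs by bucket key; return address -> bytes."""
    groups: dict[str, list[tuple[int, list[float]]]] = defaultdict(list)
    for anchor, vec in vectors:
        groups[key_of(vec)].append((anchor, vec))

    objects = {}
    for key, members in groups.items():
        # Each record: int64 time anchor followed by float32 coordinates.
        body = b"".join(
            struct.pack(f"<q{len(vec)}f", anchor, *vec)
            for anchor, vec in sorted(members)
        )
        digest = hashlib.sha256(body).hexdigest()[:16]   # hash choice is assumed
        objects[f"{timeline}/{modality}/{key}/{digest}"] = body
    return objects
```

Sorting the members before packing makes the bucket bytes, and therefore the bucket's address, independent of the order in which vectors were produced.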

The original video bytes were not touched. Both clients can do this kind of layering concurrently, on the same timeline, without coordination.

Query

A third client wants to find "the moment the goalkeeper saves the penalty kick" in the video.

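The client encodes the text query into a vector using the same embedding model that produced the layer; this presupposes a model that maps text and video frames into a shared vector space.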
It derives the same kind of spatial-bucket key from the query vector. The candidate region resolves to a small set of Spatial Bucket Objects.
It fetches those Bucket Objects (one or a few backend GETs) and runs exact comparison locally against the vectors inside.
It picks the best match. The matching vector carries a time anchor on the timeline.
The time anchor is mapped (via the manifest's fragment-index, per 0007) to the address of the containing video Fragment plus a byte offset inside it. One backend GET for the Fragment.
The Fragment is a valid streaming-container chunk. The client passes it to a media decoder and plays back the moment.
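
The local exact-comparison step can be sketched with cosine similarity, the metric implied by the v0 default scheme named in §9. The candidate representation (a list of time anchor and vector pairs already extracted from the fetched buckets) and the function names are assumptions.

```python
import math


def cosine(a: list[float], b: list[float]) -> float:
    """Cosine similarity; returns 0.0 if either vector has zero length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def best_match(query: list[float], candidates: list[tuple[int, list[float]]]) -> int:
    """Return the time anchor of the candidate most similar to the query."""
    if not candidates:
        raise ValueError("no candidates in the fetched buckets")
    anchor, _ = max(candidates, key=lambda c: cosine(query, c[1]))
    return anchor
```

The returned time anchor is what the fragment-index lookup then maps to a Fragment address.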

No centralized search service was involved. No file was downloaded in full. The query, the localization, and the playback all happened via deterministic address arithmetic and range reads against an immutable substrate.

This is the experience DreamDB is designed to make routine.

7. Reading Order for the Rest of the Spec

Each document builds on the vocabulary of the previous. Skipping is not recommended.

#	Document	What it pins down
0001	`data-model.md`	The entities (Timeline, Track, Frame, Layer, Manifest) precisely
0002	`content-addressing.md`	Hash function, what gets hashed, address syntax
0003	`time-encoding.md`	Timestamp format, resolution, ordering
0004	`spatial-indexing.md`	The vector → address scheme. The hardest doc.
0005	`backend-interface.md`	What backends MUST/MAY/SHOULD provide
0006	`protocol-operations.md`	The verbs: append, layer, query, stream
0007	`streaming-encapsulation.md`	Container formats; address → byte-offset functions
0008	`versioning-collab.md`	Manifest DAG, branching, conflict-free layering
0009	`conformance.md`	Test criteria for "is this implementation DreamDB-compliant?"

8. Non-Goals (for v0)

A managed service. DreamDB is a protocol; running a hosted DreamDB-as-a-service is out of scope.
Access control / encryption-at-rest. These are backend concerns. The protocol does not define an authorization model in v0. (May be revisited in a later spec series.)
A push-based coordination layer. v0 collaboration is via manifest exchange (pull-style), with the optional refs of §5.2 as the only mutable names. A pub/sub or notification layer is out of scope.
A canonical embedding model. The spec is agnostic to which model produced the vectors; it specifies how vectors map to addresses given a fixed dimensionality and metric.
Schema evolution for non-time data. DreamDB tracks are typed by modality, not by user schema. Modeling rich relational schemas is out of scope.

9. Open Questions (to be resolved in later docs)

These are flagged here so they are not silently decided:

OQ-1 (→ 0003 §2): Timeline epoch — absolute vs. Genesis-relative. Resolved: Genesis-relative canonical; Genesis Object's origin field carries an optional absolute Unix-ns back-reference. Decoupled from Timeline identity (the hash of the Genesis Object, regardless of origin format).
OQ-2 (→ 0004 §5): Spatial-indexing scheme. Resolved: dreamdb.lsh-cosine (Charikar's signature LSH) ships as v0 default; algorithm registry supports v0.1+ alternatives.
OQ-3 (→ 0005 §5): Backend consistency model. Resolved: strong read-after-write for content-addressed Objects (modern S3 default); linearizable CAS for refs; eventual consistency tolerated for prefix listings (rendered irrelevant by Manifest Supremacy doctrine).
OQ-4 (→ 0007 §5): Per-modality container. Resolved: CMAF / fragmented MP4 for video.* and audio.*. Per-codec details (AV1, HEVC, etc.) deferred to v0.1 (OQ-37).
OQ-5 (→ 0008 §8): Manifest distribution. Resolved: four modes — Refs (RefStore-Conformant backends), hash-addressed (works on every backend), pull-only (out-of-band hash exchange), federation across backends (operator-coordinated; formal federate verb deferred to v0.1, OQ-40).
OQ-22 (→ 0005): Local-filesystem development mode. v0's architecture (§3) commits to HTTP-speaking object stores; local-FS deployment runs MinIO or an equivalent HTTP gateway. Question: is a tiny file://-style Storage Connector (bypassing HTTP) acceptable for dev/test convenience, or is HTTP-only the cleaner contract? Lean: HTTP-only. Resolved in 0005.

Next: 0001-data-model.md — formalizing Timeline, Track, Frame, Layer, and Manifest as concrete data structures.