DreamDB v0.2.0 bec026

Spec 0014 — Streaming Extensions (Item Chunking + Adaptive Bitrate)

Status: Draft (Phase 3 design).
Depends on: spec/0001, spec/0002, spec/0007.

Motivation: spec/0007 §5 defines per-Fragment CMAF encapsulation for media but treats each Item's media payload as a single content-addressed Object. This is correct for small-to-medium clips (the UCF-101 demo proved it for 405 videos averaging ~5 MB). It is structurally inadequate for two production workloads:

  1. Large single Items. A 4 GB MRI study, a 6 GB raw video, a 30-minute training-session capture — these blow up the single-PUT path. A network blip aborts the whole upload; the SDK and connector each hold the entire blob in memory; no dedup is possible between Items that share intro/outro sequences.
  2. Adaptive-bitrate playback. A single-rendition Track gives every viewer the same bitrate; production video pipelines emit 3–5 renditions (240p/480p/720p/1080p/4K) so that a client on cellular gets a watchable stream while a client on fiber gets full quality. spec/0007's single-Fragment-per-Item layout has no notion of renditions.

This spec defines two layered solutions:

  • Path A — Item chunking: One logical Item becomes N content-addressed chunk Objects plus a small ItemManifest Object that lists them. Per-field opt-in via chunk_size. Solves ingest fragility, RAM pressure, and dedup at the storage layer. Applies to Image, Audio, AND Video — modality-agnostic.
  • Path B — Adaptive bitrate (CMAF/HLS): A new modality string suffix .renditions=N plus a RenditionPlaylist Object that enumerates per-rendition Track segments. Video-specific.

The two compose: chunking handles per-rendition storage; renditions handle per-bitrate switching. Path A SHIPS FIRST; Path B sits on top.


1. Purpose

DreamDB's strict immutability and content-addressing make small Items trivial (one Object, one PUT, one GET). Large Items and multi-rendition video need the same immutability guarantees with finer-grained storage units. This spec adds those units without disturbing the address grammar, the modality grammar, or the Track Object index.

By the end of this document the following are concrete:

  • Path A: chunk-size modality parameter, ItemManifest ObjectKind, FragmentEntry extension (is_manifest flag), reader stitching logic.
  • Path B: renditions=N modality parameter, RenditionPlaylist ObjectKind, HLS-compatible manifest emission, cross-rendition Time anchor alignment.
  • Address-grammar impact — new items/<hash> and playlist/<hash> slots under the per-Timeline tree.
  • Backwards compatibility — datasets with chunk_size = None and renditions = absent produce byte-identical FragmentEntries and Track Objects to today; existing v0 hashes do not change.
  • Range-request semantics — Service-Worker / SDK-side range stitching across chunks; per-rendition byte-range URI form.

What stays defined elsewhere:

  • Per-Fragment CMAF byte layout — spec/0007 §5.
  • Time-bucket placement and Track Object index — spec/0002, spec/0007.
  • The address-grammar slots that already exist (init/, vectors/, etc.) — spec/0002.

What this document does NOT define:

  • Live streaming (continuous tail-of-history HLS / DASH). spec/0014 is bounded-stream; live extends in a follow-up.
  • DRM / encryption. Per-Object encryption is application-layer.
  • Per-chunk authentication beyond BLAKE3. Hash verification per spec/0002 §5.1 is sufficient.
  • Transcoding pipelines. Producers are responsible for emitting the renditions; DreamDB stores them.

2. Path A — Item Chunking

2.1 Modality-string parameter

Image/Audio/Video modalities gain an optional chunk-size=<bytes> parameter:

video.h264.chunk-size=524288.frag-duration=2s.bucket=60s
image.jpeg.chunk-size=4194304
audio.aac.chunk-size=1048576

The parameter is OPTIONAL and TRAILING-OPTIONAL: modality tags without it parse identically to today. When present, the field's blob bytes are split into chunks of exactly chunk-size bytes (the last chunk may be shorter).

Permitted range: 64 KiB ≤ chunk-size ≤ 16 MiB. Below 64 KiB, the chunk-overhead-to-payload ratio is unfavorable; above 16 MiB, the original "one big Object" problems return.
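The parameter rules above can be sketched as a small parser. This is a minimal illustration, not the SDK's parser: the function name parse_chunk_size is hypothetical, and it assumes that modality parameters are separated by "." and that no parameter value itself contains a ".".

```python
from typing import Optional

MIN_CHUNK = 64 * 1024          # 64 KiB lower bound stated in this section
MAX_CHUNK = 16 * 1024 * 1024   # 16 MiB upper bound stated in this section


def parse_chunk_size(modality: str) -> Optional[int]:
    """Return chunk-size in bytes, or None when the tag is unchunked.

    Assumption (not from the spec): parameters are '.'-separated and
    values never contain a '.' themselves.
    """
    prefix = "chunk-size="
    for part in modality.split("."):
        if part.startswith(prefix):
            value = int(part[len(prefix):])
            if not MIN_CHUNK <= value <= MAX_CHUNK:
                raise ValueError(
                    f"chunk-size {value} outside [{MIN_CHUNK}, {MAX_CHUNK}]"
                )
            return value
    return None  # parameter absent: legacy single-Object behaviour
```

A tag without the parameter yields None, which is how an implementation could keep the unchunked path byte-identical to today.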

2.2 The ItemManifest Object

When chunking is enabled, each Item's blob produces N content-addressed chunk Objects plus one ItemManifest Object that enumerates them.

Address path:

<timeline>/<modality>/items/<itemmanifest-hash>

CBOR encoding (positional array, matching the protocol's deterministic-CBOR style):

;; [total_size, chunks]
;;   total_size: u64 — sum of chunk sizes
;;   chunks:     array of [size: u64, hash: bytes]
[
  <total_size: u64>,
  [
    [<chunk_0_size: u64>, <chunk_0_hash: bytes>],
    [<chunk_1_size: u64>, <chunk_1_hash: bytes>],
    …
  ]
]

Constraints:

  • Every chunk except the last MUST have size == modality.chunk_size.
  • The last chunk MAY be shorter (down to 1 byte for non-empty Items).
  • Sum of chunk sizes MUST equal total_size.
  • Chunks listed in byte order; readers concatenate in this order to reconstruct the blob.
  • Empty Items (zero-byte blobs) are encoded as total_size = 0 with an EMPTY chunks array ([]). The array is still present; it simply has no entries. The "minimum 1-byte" rule applies to non-degenerate Items only. Readers MUST accept this degenerate form.
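The constraints above can be sketched as a builder plus a validator over the decoded [total_size, chunks] structure. This is an illustration under assumptions: the function names are hypothetical, CBOR encoding is left out, and hashlib.blake2b stands in for BLAKE3 only because BLAKE3 is not in the Python standard library.

```python
import hashlib


def build_item_manifest(blob: bytes, chunk_size: int) -> list:
    """Split blob into fixed-size chunks and return [total_size, chunks]."""
    chunks = []
    for offset in range(0, len(blob), chunk_size):
        piece = blob[offset:offset + chunk_size]
        # Stand-in digest (assumption): the protocol uses BLAKE3.
        digest = hashlib.blake2b(piece, digest_size=32).digest()
        chunks.append([len(piece), digest])
    return [len(blob), chunks]  # zero-byte blob -> [0, []]


def validate_item_manifest(manifest: list, chunk_size: int) -> None:
    """Raise ValueError if the manifest violates the constraints listed above."""
    total_size, chunks = manifest
    if total_size == 0:
        if chunks:
            raise ValueError("empty Item must have an empty chunks array")
        return
    if not chunks:
        raise ValueError("non-empty Item must list at least one chunk")
    for size, _hash in chunks[:-1]:
        if size != chunk_size:
            raise ValueError("non-final chunk must equal chunk_size")
    last_size = chunks[-1][0]
    if not 1 <= last_size <= chunk_size:
        raise ValueError("final chunk must be 1..chunk_size bytes")
    if sum(size for size, _hash in chunks) != total_size:
        raise ValueError("chunk sizes do not sum to total_size")
```

Running the validator on a manifest the builder just produced is a cheap self-check before the manifest is written.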

2.3 Chunk Object storage

Each chunk is a content-addressed Object at the existing time-bucketed Fragment path:

<timeline>/<modality>/<time-bucket>/<chunk-hash>

Same path-grammar as today's single-Fragment-per-Item; the change is that one Item produces N such Objects instead of 1. Dedup happens at write time: two Items sharing an intro chunk produce the same chunk hash, second PUT is a no-op.

2.4 FragmentEntry extension (Track Object index)

Per spec/0002 §7.3, the Track Object's object_index carries one entry per Item. Existing 4-tuple positional CBOR:

[t_start, t_end, byte_size, fragment_address]

Path A extends this to a 5-tuple ONLY when chunking is enabled for the field:

[t_start, t_end, byte_size, fragment_address, is_manifest]

The 5th element is a boolean:

  • false: fragment_address points at a single chunk Object (the legacy case). Track Objects of unchunked modalities continue to emit 4-tuples and produce byte-identical encodings to today, which is CRITICAL for content-addressability of existing datasets.
  • true: fragment_address points at an ItemManifest Object. The reader fetches it, then range-fetches the listed chunks.

Encoder rule: emit 4-tuple iff is_manifest = false; emit 5-tuple iff is_manifest = true. Per spec/0002 §3.1.1, readers MUST accept both lengths and treat missing trailing fields as defaults. Crucially, a chunked-Track encoded with 5-tuples is decoded by a v0 reader (which only knows 4-tuples) as a regular single-fragment Track whose Fragment is the ItemManifest CBOR bytes — playback is broken but the read does not crash. This is the forward-compatibility story.
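The encoder and decoder rules can be sketched over already-decoded CBOR arrays (plain Python lists). The FragmentEntry class and both function names are hypothetical, and the integer types for t_start and t_end are an assumption; the spec fixes only the positional layout.

```python
from typing import NamedTuple


class FragmentEntry(NamedTuple):
    t_start: int            # assumption: integer timestamps
    t_end: int
    byte_size: int
    fragment_address: bytes
    is_manifest: bool = False   # default when the 5th element is absent


def decode_fragment_entry(items: list) -> FragmentEntry:
    """Accept both 4- and 5-element arrays; a missing flag means False."""
    if len(items) < 4:
        raise ValueError("FragmentEntry needs at least 4 elements")
    t_start, t_end, byte_size, address = items[:4]
    is_manifest = bool(items[4]) if len(items) >= 5 else False
    return FragmentEntry(t_start, t_end, byte_size, address, is_manifest)


def encode_fragment_entry(entry: FragmentEntry) -> list:
    """Emit a 5-element array only when is_manifest is true."""
    base = [entry.t_start, entry.t_end, entry.byte_size, entry.fragment_address]
    if entry.is_manifest:
        return base + [True]
    return base  # legacy shape, byte-identical to unchunked datasets
```

Never emitting an explicit false keeps the legacy encoding unique, which is what preserves existing hashes.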

2.5 Read path — stitching

When the SDK fetches an entry with is_manifest = true:

  1. Fetch the ItemManifest Object at fragment_address.
  2. Decode the CBOR; obtain the chunk list and total_size.
  3. If the caller requested a byte range [A, B):
     a. Identify which chunks i ∈ [start_chunk, end_chunk] overlap [A, B).
     b. For each overlapping chunk, range-fetch its bytes (full chunk if fully covered; partial range if partial).
     c. Concatenate in chunk order.
  4. If the caller requested the whole blob: range-fetch all chunks, concatenate.
  5. Verify that the concatenated byte count equals the expected length: total_size for a whole-blob read, B − A for a range read. A mismatch is a critical error (a chunk listed in the manifest is shorter or longer than declared) → surface as protocol corruption.

Chunk fetches MAY be issued in parallel (HTTP/2 multiplexing). The SDK SHOULD use a small concurrency cap (4–8) to avoid head-of-line blocking in the connector pool.
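The range-to-chunk mapping in step 3 can be sketched as follows. The function names are hypothetical, and the stitch helper takes chunk bytes already in memory purely for illustration; a real reader would issue (possibly parallel) range fetches from the plan instead.

```python
from typing import List, Tuple


def plan_range_fetch(chunk_sizes: List[int], start: int, end: int) -> List[Tuple[int, int, int]]:
    """Map blob range [start, end) to (chunk_index, offset_in_chunk, length)."""
    plan = []
    chunk_begin = 0
    for index, size in enumerate(chunk_sizes):
        chunk_end = chunk_begin + size
        lo = max(start, chunk_begin)
        hi = min(end, chunk_end)
        if lo < hi:  # this chunk overlaps the requested range
            plan.append((index, lo - chunk_begin, hi - lo))
        chunk_begin = chunk_end
    return plan


def stitch(chunks: List[bytes], start: int, end: int) -> bytes:
    """Concatenate the planned slices in chunk order and check the length."""
    sizes = [len(c) for c in chunks]
    out = b"".join(
        chunks[i][off:off + n] for i, off, n in plan_range_fetch(sizes, start, end)
    )
    if len(out) != end - start:
        raise ValueError("stitched length mismatch: protocol corruption")
    return out
```

Because every non-final chunk has the same size, a production reader could compute the first and last chunk index by integer division rather than scanning; the linear scan is kept here for clarity.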

2.6 Service-Worker streaming pattern

For browser playback of large chunked video, a Service Worker is the natural integration point:

On fetch('/item/<manifest-hash>'):
  1. Fetch + cache the ItemManifest (small, immutable).
  2. Parse the browser's Range header.
  3. Map the range to chunks; issue per-chunk range fetches against the backend.
  4. Stitch and return 206 Partial Content with the correct Content-Range.

The browser's <video> element issues range requests against this Service-Worker-intercepted URL; the SW transparently materializes the byte range from the chunks. Pseudocode in the existing chunking plan; conformance tests for the stitching logic ship in spec/0009.
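Steps 2 and 4 of the pattern above amount to parsing the Range header and producing a matching Content-Range. A real Service Worker would be written in JavaScript; the sketch below uses Python only to show the arithmetic. It handles single-range requests, and the function names are hypothetical.

```python
import re
from typing import Tuple


def parse_range_header(header: str, total_size: int) -> Tuple[int, int]:
    """Parse 'bytes=A-B', 'bytes=A-' or 'bytes=-N' into half-open (start, end)."""
    match = re.fullmatch(r"bytes=(\d*)-(\d*)", header.strip())
    if not match or match.groups() == ("", ""):
        raise ValueError("unsupported Range header")
    first, last = match.groups()
    if first == "":
        # Suffix form: the final N bytes of the blob.
        start = max(total_size - int(last), 0)
        end = total_size
    else:
        start = int(first)
        # HTTP ranges are inclusive; convert to a half-open end.
        end = min(int(last) + 1, total_size) if last else total_size
    if start >= end:
        raise ValueError("unsatisfiable range")  # maps to HTTP 416
    return start, end


def content_range(start: int, end: int, total_size: int) -> str:
    """Format the Content-Range value for a 206 response."""
    return f"bytes {start}-{end - 1}/{total_size}"
```

The half-open (start, end) pair is the form the chunk-mapping step consumes, and total_size comes straight from the cached ItemManifest.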

2.7 GC reachability

The reachability walk (spec/0006 §7.3.1) for chunked Tracks:

For each Track in reachable manifests:
  For each FragmentEntry in Track's object_index:
    If is_manifest = true:
      Mark fragment_address (ItemManifest) reachable.
      Fetch ItemManifest; decode.
      For each chunk's hash:
        Mark chunk Object reachable.
    Else:
      Mark fragment_address (chunk Object) reachable.

Without the transitive walk, GC could DELETE chunks still referenced by live ItemManifests, producing dangling references and silent data loss. The conformance suite (spec/0009 §7) MUST include a chunked-Track GC test.
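The walk above can be sketched over decoded structures. This is illustrative only: mark_reachable and fetch_manifest are hypothetical names, Track object_index values are plain lists of FragmentEntry arrays, and addresses are opaque bytes.

```python
from typing import Callable, Iterable, List, Set


def mark_reachable(
    tracks: Iterable[List[list]],
    fetch_manifest: Callable[[bytes], list],
) -> Set[bytes]:
    """Return every Object address reachable from the given Track indexes.

    tracks: one object_index (list of FragmentEntry arrays) per Track.
    fetch_manifest: maps an ItemManifest address to [total_size, chunks].
    """
    reachable: Set[bytes] = set()
    for object_index in tracks:
        for entry in object_index:
            address = entry[3]
            reachable.add(address)  # the Fragment or the ItemManifest itself
            is_manifest = len(entry) >= 5 and bool(entry[4])
            if is_manifest:
                _total_size, chunks = fetch_manifest(address)
                for _size, chunk_hash in chunks:
                    reachable.add(chunk_hash)  # the transitive step
    return reachable
```

Anything in the store that is absent from the returned set is a GC candidate; skipping the inner loop is exactly the failure mode described above.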

3. Path B — Adaptive Bitrate (CMAF Renditions)

3.1 Modality-string parameter

Video modalities gain an optional renditions=N parameter:

video.h264.renditions=4.frag-duration=2s.bucket=60s

N is the number of renditions (typical 3–5). The parameter is OPTIONAL and TRAILING-OPTIONAL. Modality tags without it produce single-rendition Tracks identical to today.

Renditions are typically labeled by resolution (240p/480p/720p/1080p/4K) or bitrate. The labels appear in the RenditionPlaylist Object (§3.2), not in the modality string — the modality string is structure-only.

3.2 The RenditionPlaylist Object

A RenditionPlaylist enumerates a Track's renditions and the per-rendition Track Object hashes.

Address path:

<timeline>/<modality>/playlist/<playlist-hash>

CBOR encoding:

{
  "version":   1,
  "renditions": [
    {
      "label":           "<string>",                ;; "240p", "1080p", "4K", etc.
      "bandwidth":       <unsigned int>,            ;; bits per second; for client selection
      "resolution":      [<width: uint>, <height: uint>] | null,   ;; null for audio
      "codec_string":    "<RFC-6381 codec ID>",     ;; e.g. "avc1.4d401e"
      "init_segment":    <multihash>,               ;; per-rendition init segment (spec/0007 §5.2)
      "track_object":    <multihash>,               ;; per-rendition Track Object
      "stream_role":     "video" | "audio",
    },
    …
  ],
  "default_rendition": <uint>,                      ;; index into renditions[]; client fallback
}

3.3 Per-rendition Track Objects

Each rendition is a standalone Track Object following spec/0007 §5 unchanged: same init-segment layout, same Fragment format, same time-bucket placement. The renditions all share the same Timeline and the same logical Item set; they differ only in the bytes (codec parameters, quality).

Critical alignment constraint: all renditions of the same logical Item MUST share t_start, t_end, and Item identity. The RenditionPlaylist allows a client to switch renditions mid-playback by:

  1. Resolving the current playback timestamp t.
  2. Finding the FragmentEntry in the target rendition's Track Object whose [t_start, t_end) covers t.
  3. Fetching that Fragment (or its chunks via Path A).

Anchor-aligned renditions are emitted by producers running the same chunker (CMAF segmenter) on the same source media at different encode parameters. ffmpeg's -force_key_frames and equivalent options pin keyframes deterministically; the producer is responsible for the alignment guarantee. DreamDB doesn't enforce it at write time, but mis-aligned renditions produce visible playback glitches (re-buffering at boundaries) and SHOULD be flagged by producers' QC pipelines.
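The switch procedure and the alignment constraint can be sketched as two helpers. Both names are hypothetical, and the sketch assumes each object_index is sorted by t_start with non-overlapping [t_start, t_end) intervals.

```python
from bisect import bisect_right
from typing import List, Optional


def find_entry(object_index: List[list], t: float) -> Optional[list]:
    """Return the FragmentEntry whose [t_start, t_end) covers t, else None.

    Assumption: object_index is sorted by t_start and non-overlapping.
    """
    starts = [entry[0] for entry in object_index]
    i = bisect_right(starts, t) - 1
    if i >= 0 and object_index[i][0] <= t < object_index[i][1]:
        return object_index[i]
    return None


def renditions_aligned(a: List[list], b: List[list]) -> bool:
    """True when two renditions expose identical (t_start, t_end) boundaries."""
    return [(e[0], e[1]) for e in a] == [(e[0], e[1]) for e in b]
```

A producer-side QC step could run renditions_aligned over every pair of renditions before publishing the RenditionPlaylist, since DreamDB does not enforce alignment at write time.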

3.4 Manifest registry reference

A video modality using renditions declares the RenditionPlaylist hash in the Manifest registry:

"registry": {
  "video.h264.renditions=4.frag-duration=2s.bucket=60s": {
    "kind":               "continuous",
    "object_kind":        "playlist",                          ;; NEW (this spec)
    "playlist":           <multihash-of-RenditionPlaylist>,
    "tracks":             [<rendition-0-track>, … <rendition-(N-1)-track>],  ;; for GC walk
  }
}

The tracks array is redundant with renditions[i].track_object in the playlist but lives in the registry so GC walks (spec/0006 §7.3.1) can identify reachable Track Objects without parsing the playlist.

3.5 HLS interop emission (optional)

An SDK MAY emit an HLS-compatible manifest from a RenditionPlaylist on demand. This is a translation, not a storage format: the canonical form is the CBOR RenditionPlaylist; HLS .m3u8 is for browser/CDN compatibility.

#EXTM3U
#EXT-X-STREAM-INF:BANDWIDTH=800000,RESOLUTION=480x270,CODECS="avc1.4d401e"
<URI to rendition-0 sub-playlist>
#EXT-X-STREAM-INF:BANDWIDTH=2400000,RESOLUTION=1280x720,CODECS="avc1.4d401e"
<URI to rendition-1 sub-playlist>
…

Each rendition sub-playlist enumerates its Fragments (or ItemManifests, when Path A is also active) with their byte offsets. The HLS emission is a Service-Worker or HTTP-shim concern; the SDK provides a builder helper but the wire format is HLS-standard.
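The translation from a decoded RenditionPlaylist to the top-level HLS playlist can be sketched as below. This is not the SDK's builder helper: build_master_playlist and the uri_for callback are hypothetical, and only the BANDWIDTH, RESOLUTION and CODECS attributes shown in the example above are emitted.

```python
from typing import Callable, List


def build_master_playlist(renditions: List[dict], uri_for: Callable[[int], str]) -> str:
    """Render renditions[] (decoded RenditionPlaylist entries) as HLS text.

    uri_for(index) is a caller-supplied (hypothetical) function that returns
    the URI of the rendition's sub-playlist.
    """
    lines = ["#EXTM3U"]
    for index, rendition in enumerate(renditions):
        attrs = [f"BANDWIDTH={rendition['bandwidth']}"]
        if rendition.get("resolution"):  # null for audio-only renditions
            width, height = rendition["resolution"]
            attrs.append(f"RESOLUTION={width}x{height}")
        attrs.append(f'CODECS="{rendition["codec_string"]}"')
        lines.append("#EXT-X-STREAM-INF:" + ",".join(attrs))
        lines.append(uri_for(index))
    return "\n".join(lines) + "\n"
```

Keeping URI construction behind a callback leaves the Service-Worker or HTTP-shim layer free to choose its own URL scheme.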

3.6 Path A × Path B composition

A Track using both chunk-size= AND renditions= produces N renditions, each of which uses chunked storage:

  • Each rendition has its own Track Object (per §3.3).
  • Each Track Object's FragmentEntries use 5-tuple encoding with is_manifest = true.
  • Each FragmentEntry's fragment_address is an ItemManifest hash.
  • Each ItemManifest enumerates chunks for that rendition's segments.

Dedup across renditions does NOT happen at the chunk level (different bytes per quality level). It DOES happen at the ItemManifest level when two Tracks happen to share an Item — e.g., a re-published Manifest where one rendition's bytes are unchanged.

4. Address-grammar additions

Two new Object Kinds and slots under the per-Timeline tree:

<timeline>/<modality>/items/<itemmanifest-hash>     ;; Path A — Item chunk index
<timeline>/<modality>/playlist/<playlist-hash>      ;; Path B — Rendition enumeration

These slots are NEW. spec/0002 §6.3's path parser (4-segment match arms for track/, index/, init/, vectors/, bucket/) extends with two more match arms for items/ and playlist/. spec/0009's conformance suite adds path-roundtrip vectors for both.
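The two new match arms can be sketched as a standalone function. This does not reproduce spec/0002 §6.3's parser: parse_new_slot is a hypothetical name, and the sketch assumes that the modality, slot and hash segments contain no "/" while the Timeline prefix may.

```python
from typing import Optional, Tuple


def parse_new_slot(path: str) -> Optional[Tuple[str, str, str, str]]:
    """Return (timeline, modality, slot, hash) for items/ and playlist/ paths.

    Returns None for any other path shape so that existing match arms
    (track/, index/, init/, ...) can handle it.
    """
    parts = path.rsplit("/", 3)  # keep any '/' inside the Timeline prefix
    if len(parts) != 4:
        return None
    timeline, modality, slot, digest = parts
    if slot not in ("items", "playlist"):
        return None
    if not timeline or not modality or not digest:
        return None
    return timeline, modality, slot, digest


def format_new_slot(timeline: str, modality: str, slot: str, digest: str) -> str:
    """Inverse of parse_new_slot, for path-roundtrip checks."""
    return f"{timeline}/{modality}/{slot}/{digest}"
```

A path-roundtrip vector then reduces to checking that formatting a parsed path reproduces the original string.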

5. Backwards compatibility

The backwards-compatibility story for both Path A and Path B is the same: datasets created before spec/0014 produce byte-identical encodings to today.

  • Path A: modalities without chunk-size= emit 4-tuple FragmentEntries; Track Object hashes unchanged. A v0.X-aware reader observing a 4-tuple treats it as is_manifest = false.
  • Path B: modalities without renditions= emit single Track Objects with no playlist; Manifest registry entries unchanged. A v0.X-aware reader observing a modality without playlist field treats it as a single-rendition Track.

A v0 reader (pre-spec/0014) observing a spec/0014 dataset:

  • 5-tuple FragmentEntry: treats as 4-tuple (per spec/0002 §3.1.1, trailing CBOR fields are ignored); fetches the ItemManifest hash and decodes its bytes as video — broken playback, no crash.
  • renditions=N modality: rejected as unknown modality unless the reader is permissive — operator-discretion. Recommended: producers gate spec/0014 features behind an opt-in flag until enough deployments have v0.X+ readers.

6. Storage and latency at scale

6.1 Path A bandwidth profile

A 4 GB MRI study with chunk-size=4 MiB:

  • 1024 chunks of 4 MiB each + 1 ItemManifest of ~36 KiB.
  • Ingest: 1024 PUTs issued in parallel (capped at, e.g., 8 concurrent) take ~5 minutes at 100 Mbps, the same as a single 4 GB PUT. Concurrency doesn't shorten total wire time, but chunking makes ingest resumable: a network blip aborts only the in-flight chunks, not the chunks already stored.
  • Read full blob: ~5 minutes again, but parallel and pipelinable into the decoder.
  • Read 100 MiB slice: 25 chunks of 4 MiB when the slice is chunk-aligned (26 otherwise, with partial range fetches at the two edges), ~100 MiB transferred. Versus reading 100 MB from a single 4 GB Object: 100 MB at 8x more requests because the connector can't predict the range overlap. Performance about-equal; resumability strictly better.
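The figures in this profile follow from simple arithmetic, sketched below. The helper names are hypothetical; the chunk count treats "4 GB" as 4 GiB, while the wire-time estimate uses decimal gigabytes and megabits and ignores protocol overhead.

```python
def chunk_count(total_bytes: int, chunk_size: int) -> int:
    """Number of chunks needed for a blob (ceiling division)."""
    return -(-total_bytes // chunk_size)


def transfer_seconds(total_bytes: float, megabits_per_second: float) -> float:
    """Idealised wire time, ignoring protocol overhead and latency."""
    return total_bytes * 8 / (megabits_per_second * 1_000_000)


def chunks_touched(start: int, end: int, chunk_size: int) -> int:
    """Chunks overlapped by the half-open byte range [start, end)."""
    return (end - 1) // chunk_size - start // chunk_size + 1


MIB = 1024 * 1024
study_chunks = chunk_count(4 * 1024 * MIB, 4 * MIB)           # 4 GiB study
ingest_minutes = transfer_seconds(4e9, 100) / 60              # 4 GB at 100 Mbps
aligned_slice = chunks_touched(0, 100 * MIB, 4 * MIB)         # chunk-aligned 100 MiB
unaligned_slice = chunks_touched(1, 100 * MIB + 1, 4 * MIB)   # shifted by one byte
```

The unaligned case touches one extra chunk, but partial range fetches at the edges keep the transferred bytes equal to the requested range.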

6.2 Path B storage cost

A 1-hour 1080p video at 5 Mbps with 4 renditions (240p/480p/720p/1080p) at typical bandwidths (0.4 / 1.0 / 2.5 / 5.0 Mbps):

  • Total bytes: (0.4 + 1.0 + 2.5 + 5.0) × 3600 / 8 = ~4.0 GB across all renditions.
  • vs single-rendition 1080p: 2.25 GB.
  • Storage premium: ~1.8× for adaptive playback. Industry-standard tradeoff.
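The storage figures above can be reproduced with a few lines. The rendition bitrates are the ones quoted in this section; the helper name is hypothetical, and container overhead is ignored.

```python
def rendition_bytes(megabits_per_second: float, seconds: float) -> float:
    """Bytes produced by a constant-bitrate stream of the given duration."""
    return megabits_per_second * 1_000_000 * seconds / 8


ladder_mbps = [0.4, 1.0, 2.5, 5.0]   # 240p / 480p / 720p / 1080p
one_hour = 3600

total = sum(rendition_bytes(rate, one_hour) for rate in ladder_mbps)
single_1080p = rendition_bytes(5.0, one_hour)
premium = total / single_1080p

print(f"all renditions: {total / 1e9:.2f} GB")
print(f"1080p only:     {single_1080p / 1e9:.2f} GB")
print(f"premium:        {premium:.2f}x")
```

The ratio depends only on the bitrate ladder, so it holds for any duration.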

6.3 Combined Path A × Path B at billion-scale

10⁶ hours of 1080p video, 4 renditions, chunk-size=2 MiB:

  • Total bytes: 10⁶ × 4 GB = 4 PB.
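  • Chunks: ~2 billion (4 PB ÷ 2 MiB ≈ 1.9 × 10⁹). ItemManifests: one per Item per rendition, so the count depends on Item duration (~4 million if each Item is one hour of one rendition).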
  • Most queries access ~1 hour of one rendition: 720 chunks fetched, ~1.4 GB transferred, streaming concurrent → playback starts in seconds, finishes when video does.

The scale targets exceed v0's single-Timeline ceilings (1B Items max per Track) — production Petabyte-scale deployments combine Path A + Path B with spec/0012 federation, sharding Timelines per-asset or per-day-range.

7. Out of scope

  • Live streaming. Continuous tail-of-history HLS (sliding-window playlists, EXT-X-ENDLIST absent) is not in v0.X. The protocol's append-only model handles live ingest; what's missing is producer-side tooling to publish playlist deltas at sub-segment cadence. Defer to v0.X+1.
  • Per-rendition independent compression. Renditions all use the same chunk-size in v0.X. Per-rendition tuning is plausible but adds parameter complexity; defer.
  • Subtitle / caption renditions. A stream_role: "text" entry in the playlist is the natural extension; defer until needed.
  • DASH MPD emission. HLS is the lingua franca; DASH support follows the same translation pattern but is operator-layer.
  • Cross-rendition byte dedup. Possible if encoders produce shared chunks across qualities (unlikely at the byte level; produced chunks differ even at matching keyframes due to quantization). No protocol support.

8. Open questions

  • OQ-57 (→ this spec): Should ItemManifest contain a top-level mime_type field? Currently the modality tag carries it. Adding to the manifest lets a generic reader serve any chunked blob without knowing its modality, but adds redundancy. Defer until measurement.
  • OQ-58 (→ spec/0009): Conformance tests for chunked read path: roundtrip, partial range, GC reachability, forward-compat 4-vs-5-tuple decode. Block v0.X on the test battery.
  • OQ-59 (→ this spec): Should the chunk Object path slot be <timeline>/<modality>/chunks/<hash> (new slot) or remain <timeline>/<modality>/<time-bucket>/<hash> (current Fragment slot, reused)? Current plan: reuse — same time-bucket placement, same address grammar. Chunks ARE Fragments in spec/0002's sense; only the FragmentEntry indirection changes. Defer if first implementation finds a problem.
  • OQ-60 (→ this spec): Per-chunk size in CBOR is currently u64. It is capped at 16 MiB (modality-level constraint), so u32 would suffice. Defer; the cost is 4 bytes per chunk × chunks per Item × number of Items.
  • OQ-61 (→ spec/0006): Streaming verb's interaction with renditions: when a client calls Stream(range), which rendition does the SDK choose? Probably accept a rendition_hint: <label> parameter; default to default_rendition. Defer to spec/0006 amendment.

Next: spec/0009 amendment with conformance vectors for the new ObjectKinds (ItemManifest, RenditionPlaylist, VectorCompressor, GraphIndex, GraphPage, FederationManifest).