DreamDB Specification — 0009: Conformance
Status: Draft. Builds on 0000–0008 (the entire v0 spec).
This document defines what "DreamDB-conformant" means for each component (Backend, Storage Connector, Protocol Component / SDK), enumerates the test-vector battery that any conformant implementation MUST pass, and resolves the meta-OQs about concrete tag values: OQ-11 (CBOR tag numbers), OQ-12 (multihash algorithm tags), OQ-17 (time-anchor CBOR tag).
1. Purpose
The DreamDB v0 spec defines a protocol composed of byte formats, address grammars, HTTP semantics, algorithms, and operational disciplines. An implementation is DreamDB-conformant if it correctly realizes the relevant subset for its role. This document:
- Defines the three conformance roles (Backend, Storage Connector, Protocol Component) and what each must implement.
- Resolves the spec's meta-OQs — concrete CBOR tag values, multihash algorithm tags, and other registry-style commitments deferred from earlier docs.
- Enumerates the test-vector categories that a conformant implementation MUST pass, organized by source spec doc.
- Specifies the test-vector exchange format — a structured JSON form that is portable across implementation languages.
- Defines the conformance reporting that an implementation publishes to advertise its level.
What this document does not do:
- Enumerate every individual test vector in full. The actual vectors live in an external repository (see §10) maintained alongside the spec; this document defines the categories and pass criteria.
- Replace earlier specs' normative requirements. Conformance is judged against the entire spec; this document is a roadmap, not a substitute.
2. Conformance Roles
DreamDB's three-layer architecture (0000 §3) admits three conformance roles, each independently testable:
| Role | What it implements | Tested by |
|---|---|---|
| Backend | HTTP semantics (0005) | HTTP-level test suite hitting a deployed backend |
| Storage Connector | HTTP-to-DreamDB transport shim (0005 §8) | Unit + integration tests against a reference backend |
| Protocol Component (SDK) | DreamDB Protocol verbs (0001–0008) | End-to-end tests + algorithmic/byte-format vectors |
A complete implementation consists of two of these roles (Connector + Protocol; the Backend is supplied by an existing object store) plus a binding to a chosen Backend that has been independently verified as Backend-conformant.
2.1 Backend conformance
A Backend is ContentStore-Conformant if it correctly implements:
- HTTP verbs PUT, GET (with Range), HEAD, LIST (with prefix and pagination), DELETE per 0005 §3.
- Strong read-after-write consistency for content-addressed paths (0005 §5.1).
- Lexicographic byte-order LIST results (0005 §3.5.1).
- HTTP/2 (or HTTP/3) connectivity (0005 §6).
A Backend is RefStore-Conformant if, additionally, it correctly implements:
- If-None-Match: * and If-Match: <etag> conditional writes per 0005 §4.
- Linearizable CAS semantics (0005 §5.2).
Each tier is independently certifiable; a Backend MAY be ContentStore-Conformant only.
2.2 Storage Connector conformance
A Storage Connector is conformant if it:
- Translates DreamDB protocol-layer requests into HTTP per 0005.
- Iterates LIST pagination to exhaustion (0005 §3.5.2).
- Round-trips ETags opaquely for If-Match and canonicalizes for cross-request comparison (0005 §8.5).
- Issues HTTP/2 multiplexed streams over a shared connection per (backend host, auth identity) pair (0005 §8.1).
- Translates bytes:<a>-<b> (DreamDB half-open) → Range: bytes=<a>-<b−1> (HTTP inclusive) (0005 §3.3).
- Exposes no DreamDB-semantic logic; bytes pass through opaquely (0005 §8.4).
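
The half-open to inclusive range translation is an easy place for an off-by-one error. A minimal sketch follows; the helper name `to_http_range` is hypothetical (not from the spec), and rejecting empty or inverted ranges is an assumption of this sketch rather than a stated requirement.

```python
def to_http_range(a: int, b: int) -> str:
    """Translate a DreamDB half-open range [a, b) into an HTTP Range value.

    HTTP byte ranges are inclusive at both ends, so the last byte is b - 1.
    Assumption: empty or inverted ranges are treated as caller errors.
    """
    if a < 0 or b <= a:
        raise ValueError(f"invalid half-open range: [{a}, {b})")
    return f"bytes={a}-{b - 1}"
```

For example, the first mebibyte `bytes:0-1048576` becomes `Range: bytes=0-1048575`.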
2.3 Protocol Component (SDK) conformance
A Protocol Component is conformant if it correctly implements:
- The eight verbs of 0006 §2 (Open, Resolve, Query, Stream, Get, Append, Layer, Publish).
- The per-session cache discipline of 0006 §3 (cache by content hash; never by path).
- Manifest Supremacy: never use list-prefix during steady-state reads (0005 §5.3.1).
- The integer-only time arithmetic discipline of 0003 §4.1.
- The f32-determinism discipline for spatial-key derivation (0004 §5.4).
- The deterministic CBOR encoding of all hashable Objects (0002 §3).
- The base2 spatial-key encoding, base32 hash encoding, and 16-char hex time encoding of 0002 §6 / §8 and 0003 §6.
- The byte-format layouts of 0007 for every Object kind it produces or reads.
- The DAG semantics of 0008: multi-parent Manifests, MUST-refuse-on-SpatialIndex-conflict merges, lex-greatest tiebreak for Constants.
- The operational disciplines of 0006 §4.1.1, §4.4.1, §7.3 (Ref freshness, Stream prefetch, GC).
3. Resolved Meta-OQs
This section resolves the placeholder values used in earlier docs.
3.1 Multihash algorithm tag for BLAKE3 (resolves OQ-12)
DreamDB aligns with the IPFS multihash table for BLAKE3 (0x1e). This commits DreamDB's algorithm-tag namespace to IPFS-compatible values for any future hash function as well — implementations MAY consult the IPFS multihash registry when adding hash algorithms in v0.1+.
A multihash for BLAKE3-256 on the wire is therefore:
Encoded for use in addresses as base32-lowercase-no-padding (per 0002 §8.1): 53 characters (33 bytes × 8 bits / 5 bits per char = ceil(52.8) = 53).
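
The 53-character figure can be checked mechanically. The sketch below assumes that the 33-byte form is one algorithm-tag byte (0x1e) followed by a 32-byte digest, and that the base32 alphabet is the RFC 4648 one; both are assumptions to verify against 0002 §8.1. The function name `encode_address_hash` is hypothetical, and the all-zero digest is a placeholder, not a real BLAKE3 output.

```python
import base64

BLAKE3_TAG = 0x1E  # multihash algorithm tag from §3.1


def encode_address_hash(digest: bytes) -> str:
    """Base32-lowercase-no-padding encoding of (tag byte || 32-byte digest).

    Assumption: the 33-byte wire form is tag + digest with no length byte.
    """
    if len(digest) != 32:
        raise ValueError("expected a 32-byte digest")
    raw = bytes([BLAKE3_TAG]) + digest
    return base64.b32encode(raw).decode("ascii").lower().rstrip("=")


# Placeholder digest for illustration only.
example = encode_address_hash(bytes(32))
```

With any 32-byte digest the result is 53 characters long, matching ceil(33 × 8 / 5).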
3.2 DreamDB CBOR tag value (resolves OQ-11, OQ-17)
DreamDB reserves one CBOR tag in the private-use range for foreign-CBOR-embedding scenarios: dreamdb.tag = 65521.
Selected for v0 because:
- Already designated for private use by RFC 8949 §6.
- No conflict with IANA-registered CBOR tags.
- Implementations don't need to coordinate with a registry; the tag is DreamDB-private.
- v0.1 MAY apply for IANA registration in the public-use range and renumber, with backward-compatibility readers.
3.2.1 Tag scope: foreign-CBOR contexts only
The tag is explicitly reserved for unstructured / schema-less CBOR contexts — situations where a DreamDB value (a hash, a time anchor, a modality tag) is embedded in a generic CBOR document outside the spec'd DreamDB schemas, and a reader needs the tag to disambiguate it from a generic integer / byte string / text string.
Inside the schema-typed fields of DreamDB Objects, this tag is NOT used. Those fields use plain CBOR types directly:
- Time anchors → plain CBOR uint (per 0003 §5).
- Hashes → plain CBOR byte string (per 0002 §3).
- Modality tags → plain CBOR text string.
- Spatial keys → plain CBOR text string (base2 chars).
The schema's field name (or array position) is the type indicator. Adding a tag inside DreamDB schemas would be redundant and would add 2–3 bytes per occurrence; at billion-scale, that is GB-class waste.
If your application doesn't embed DreamDB values in foreign CBOR documents, you never encounter dreamdb.tag. The v0 reference implementation does not produce or consume it on any path.
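
To make the per-occurrence overhead concrete, the following sketch encodes the head of a CBOR tag (major type 6) following the general head-encoding rules of RFC 8949. The function name `cbor_tag_head` is hypothetical, and the tag value 65521 is the one recorded in §13.

```python
def cbor_tag_head(tag: int) -> bytes:
    """Encode the head bytes of a CBOR tag (major type 6)."""
    if tag < 0 or tag >= 1 << 64:
        raise ValueError("tag number out of range")
    major = 6 << 5  # 0xC0
    if tag < 24:
        return bytes([major | tag])
    if tag < 1 << 8:
        return bytes([major | 24, tag])
    if tag < 1 << 16:
        return bytes([major | 25]) + tag.to_bytes(2, "big")
    if tag < 1 << 32:
        return bytes([major | 26]) + tag.to_bytes(4, "big")
    return bytes([major | 27]) + tag.to_bytes(8, "big")


head = cbor_tag_head(65521)  # three bytes: d9 ff f1
```

At three bytes per tagged value, one billion occurrences would cost roughly 3 GB, which is the scale of waste the scoping rule avoids.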
3.2.2 Maps vs. positional arrays in DreamDB schemas
0002 §3.1 specifies two encoding choices for schema-typed DreamDB Objects:
- Maps with string keys: low-volume schemas (Manifest top-level, Genesis, SpatialIndex Object, Track Object metadata). Self-describing; debugger-friendly.
- Positional CBOR arrays: high-volume schemas (Index Page leaf entries and internal entries per 0007 §7.3 / §7.5). Field order pinned in spec text; saves ~40 bytes per entry and ~40 GB on a 1B-entry Track.
Conformance test vectors (§5.1) MUST cover both encodings — including a positional-array round-trip that confirms the field order matches the spec.
4. Test-Vector Format
Conformance test vectors are exchanged as JSON documents with the following schema:
4.1 Test ID convention
Example IDs:
- 0002.cbor.deterministic.0001: first deterministic-CBOR test in 0002.
- 0004.lsh-cosine.f32-rounding.0003: third f32-rounding-edge-case test for dreamdb.lsh-cosine.
- 0005.http.range-translation.0001: first byte-range translation test.
- 0008.dag.refuse-spatial-index-incompat.0001: first MUST-refuse merge test.
This makes failures locatable and grep-friendly.
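
A runner can split IDs like the examples above into their parts. The pattern below is inferred only from those example IDs (a four-digit spec doc number, a dotted category, a four-digit sequence number); it is an assumption, not the normative grammar, and the names `VectorId` and `parse_vector_id` are hypothetical.

```python
import re
from typing import NamedTuple


class VectorId(NamedTuple):
    doc: str        # source spec doc, e.g. "0002"
    category: str   # dotted category, e.g. "cbor.deterministic"
    sequence: int   # 1-based sequence number within the category


# Assumed shape: <4 digits>.<category segments>.<4 digits>
_ID_RE = re.compile(r"^(\d{4})\.((?:[a-z0-9-]+\.)*[a-z0-9-]+)\.(\d{4})$")


def parse_vector_id(text: str) -> VectorId:
    match = _ID_RE.match(text)
    if match is None:
        raise ValueError(f"not a test-vector ID: {text!r}")
    doc, category, seq = match.groups()
    return VectorId(doc, category, int(seq))
```

Filtering a failure log by `doc` or `category` then reduces to comparing tuple fields.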
4.2 Categories
Each category specifies its own input/expected schema. The full schema definitions live in the conformance suite repo (§10); summary follows in §5–§8.
4.3 Pass / fail criteria
Vectors specify their own pass criteria. Common kinds:
- Byte-identical: produced output bytes equal expected bytes exactly.
- Functional equivalence: produced output is semantically equivalent (e.g. an array order-independent set comparison).
- Behavioral assertion: under specified conditions, the implementation makes (or does not make) certain HTTP requests; takes (or does not take) certain actions.
- Tolerance bound: produced metric (e.g. recall, latency) is within a stated bound of the expected value.
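
Three of these criteria reduce to simple comparisons. The sketch below is illustrative only: the kind strings and the `check` function are hypothetical names, not identifiers from the suite, and behavioral assertions are omitted because they need a request-recording harness.

```python
def check(kind: str, produced, expected, tolerance: float = 0.0) -> bool:
    """Evaluate one pass criterion; kind names are assumptions of this sketch."""
    if kind == "byte-identical":
        return bytes(produced) == bytes(expected)
    if kind == "set-equivalent":
        # Order-independent comparison of array contents.
        return set(produced) == set(expected)
    if kind == "tolerance":
        return abs(produced - expected) <= tolerance
    raise ValueError(f"unknown criterion kind: {kind}")
```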
5. Test Categories — Byte Formats and Algorithms
5.1 0002: Content addressing
| Category | Pass criterion | Coverage |
|---|---|---|
| cbor.deterministic.* | Byte-identical canonical encoding | All hashable Object types |
| multihash.encoding.* | Byte-identical multihash + base32 round-trip | BLAKE3 hashes (0x1e) |
| address.grammar.* | Path parse and re-emit produces same bytes | All address forms; URI scheme |
| modality.tag.parsing.* | Tag → (class, encoding, params) parse correctness | Built-in classes; reverse-DNS |
| spatial-key.base2.* | base2 encoding round-trips arbitrary-N bit strings | Bit lengths 1-256, prefix-preservation |
| paged-index.btree.* | B-tree page traversal returns expected leaf entries | Inline + paged forms; mixed |
5.2 0003: Time encoding
| Category | Pass criterion | Coverage |
|---|---|---|
| time-anchor.hex.* | u64 ↔ 16-char hex round-trip; lex order = numeric order | Min (0), max (2⁶⁴ − 1), mid, near-boundary values |
| time-bucket.placement.* | floor(t_start / D) produces correct bucket for boundary cases | t_start = 59.9 s, 60.0 s, 60.1 s, 59.999999999 s with D=60s |
| time-bucket.overflow.* | Overflow stress: t at 2⁶⁴ − 1, t + duration overflow detection, bucket-duration near 2⁶³ − 1, bucket-duration = 1 ns (degenerate floor(t/1)=t), bucket-duration = 2⁶³ − 1 (degenerate floor(t/huge)=0) | Edge u64 values; catastrophic-overflow guards |
| genesis.cbor.* | Genesis Object round-trips deterministically | Anchored/abstract; with/without horizon |
| duration-parsing.overflow.* | Duration suffix parsing rejects values whose ns count would exceed 2⁶³ − 1 | 1_000_000_000h, 2⁶³ s, etc. |
| integer-only.violation.* | Implementation rejects f64 in time arithmetic at compile/lint time, OR fuzz tests confirm no f64 path exists | Static analysis hooks |
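
The hex round-trip and bucket-placement criteria can be exercised with integer arithmetic alone. A minimal sketch, assuming nanosecond time anchors and lowercase hex digits (check both against 0003); the function names are hypothetical.

```python
U64_MAX = (1 << 64) - 1
NS_PER_S = 1_000_000_000
_HEX_DIGITS = set("0123456789abcdef")


def anchor_to_hex(t: int) -> str:
    """Encode a u64 time anchor as 16 fixed-width hex chars."""
    if not 0 <= t <= U64_MAX:
        raise ValueError("time anchor out of u64 range")
    return format(t, "016x")


def hex_to_anchor(s: str) -> int:
    if len(s) != 16 or not set(s) <= _HEX_DIGITS:
        raise ValueError("expected 16 lowercase hex chars")
    return int(s, 16)


def bucket_of(t_start_ns: int, duration_ns: int) -> int:
    """floor(t_start / D) using integer division only (no f64)."""
    if duration_ns <= 0:
        raise ValueError("bucket duration must be positive")
    return t_start_ns // duration_ns
```

Because the encoding is fixed-width and zero-padded, sorting the hex strings lexicographically gives the same order as sorting the integers. With D = 60 s, an anchor of 59.999999999 s falls in bucket 0 and 60.0 s falls in bucket 1.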
5.3 0004: Spatial indexing
| Category | Pass criterion | Coverage |
|---|---|---|
| lsh-cosine.hyperplanes.* | ChaCha20 seed → hyperplane table is bit-identical across implementations | Multiple seeds; D ∈ {64, 128, 768} |
| lsh-cosine.bit-derivation.* | (vector, hyperplanes) → spatial-key bits are bit-identical | Edge cases including <v, h_i> = 0 ties |
| lsh-cosine.f32-rounding.* | f32 left-fold dot product produces bit-identical results on every required hardware path | See §5.3.1: multi-architecture coverage |
| recall.locality.* | Empirical bit-collision rate matches 1 − θ/π within tolerance | Synthetic pairs at θ ∈ {1°, 5°, 10°, 30°, 60°, 90°} |
| multi-table.union.* | tables=L=4 query union has expected recall improvement | Recall ≥ 1 − (1−p)^4 ± tolerance |
| lineage.spatial-index-hash.* | Bucket header spatial_index_hash mismatch triggers abort, NOT silent decode | Deliberate mismatches |
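
The two recall formulas in the table are straightforward to evaluate when building expected values for tolerance-bound vectors. A small sketch; the function names are hypothetical, and p is taken to be the per-table hit probability as used in the multi-table.union row.

```python
import math


def bit_collision_probability(theta_deg: float) -> float:
    """Expected per-bit collision rate 1 - theta/pi for angle theta."""
    return 1.0 - math.radians(theta_deg) / math.pi


def union_recall_lower_bound(p: float, tables: int = 4) -> float:
    """Lower bound 1 - (1 - p)^L for a union over L tables."""
    return 1.0 - (1.0 - p) ** tables
```

For orthogonal vectors (θ = 90°) the per-bit collision rate is 0.5, and a per-table hit probability of 0.5 gives a four-table bound of 0.9375.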
5.3.1 Multi-architecture f32-determinism conformance (mandatory)
Floating-point determinism across heterogeneous hardware is the single most fragile conformance surface. SIMD optimizers (AVX-512 fused-multiply-add, ARM NEON's vector-friendly rounding modes, GPU paths) are at liberty to reorder f32 operations; the resulting drift is invisible until two SDKs disagree on a single bit and silently route data to different Spatial Buckets.
The discipline:
Pure-scalar reference is the canonical implementation. The dreamdb.lsh-cosine algorithm spec (0004 §5) is interpreted to mean the bytes the scalar reference path produces, with all SIMD optimizations disabled. Any SIMD path is conformant if and only if it produces bit-identical output to scalar.
Required hardware coverage: a conformant implementation MUST be tested on the following paths, all producing bit-identical results to the scalar reference:
| Hardware path | Notes |
|---|---|
| x86-64, AVX-512 | FMA, vectorized accumulator |
| x86-64, AVX2 | older Intel/AMD; subset of AVX-512 |
| x86-64, scalar (no SIMD) | reference path; canonical |
| ARM64, NEON | Apple Silicon, AWS Graviton, server ARM |
| ARM64, scalar (no SIMD) | reference path; canonical |
Test-vector design: vectors crafted to expose rounding-order sensitivity:
- Pairs with dot products at exactly 0.5 ulp (the rounding boundary).
- Vectors where left-fold and right-fold accumulation order produce different f32 results, and where the spec mandates left-fold (0004 §5.4).
- Adversarial inputs: extreme magnitude ranges, near-zero dot products, denormalized floats.
- Pairs at exactly the <v, h_i> = 0 tie boundary (0004 §5.1 specifies tie → 1).
Non-conformance: an implementation that produces different bits under SIMD vs. scalar on any required hardware path is non-conformant for that path. The implementation MUST disable SIMD optimization on that platform until parity is achieved. Compromised SIMD paths shipping silently is the failure mode this discipline exists to prevent.
This is the same model used by cryptographic library testing (BoringSSL, RustCrypto, libsodium): bit-identical scalar reference, hardware paths verified against it, no SIMD path ships without bit parity.
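
The accumulation-order sensitivity described above can be reproduced with a scalar emulation of f32 arithmetic. This is a sketch, not the reference implementation: it rounds every product and every partial sum to f32 via `struct`, and it assumes the bit rule "dot ≥ 0 → 1, otherwise 0" (only the tie → 1 case is stated in this document). All function names are hypothetical.

```python
import struct


def f32(x: float) -> float:
    """Round a Python float (f64) to the nearest f32, then widen it back."""
    return struct.unpack("<f", struct.pack("<f", x))[0]


def dot_left_fold(v, h) -> float:
    """((v0*h0 + v1*h1) + v2*h2) + ... with f32 rounding at every step."""
    acc = f32(0.0)
    for a, b in zip(v, h):
        acc = f32(acc + f32(f32(a) * f32(b)))
    return acc


def dot_right_fold(v, h) -> float:
    """Same terms accumulated from the last element backwards."""
    acc = f32(0.0)
    for a, b in zip(reversed(v), reversed(h)):
        acc = f32(acc + f32(f32(a) * f32(b)))
    return acc


def key_bit(dot: float) -> int:
    # Assumption: non-negative dot products (including the tie) map to 1.
    return 1 if dot >= 0.0 else 0


v = [-1.0, 1e8, -1e8]
h = [1.0, 1.0, 1.0]
left = dot_left_fold(v, h)    # -1 is absorbed when added to 1e8 in f32
right = dot_right_fold(v, h)  # 1e8 and -1e8 cancel first, leaving -1
```

Here the left fold yields 0.0 (a tie, so bit 1) while the right fold yields -1.0 (bit 0): the same inputs route to different Spatial Buckets depending on accumulation order, which is exactly what the vectors are designed to catch.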
5.4 0007: Object byte formats
| Category | Pass criterion | Coverage |
|---|---|---|
| bucket.inline.layout.* | Vector at index i lives at byte_offset = 160 + i × record_size | Multiple modalities; record_size variants |
| bucket.reference.layout.* | Reference resolves to correct Vector-Storage byte range | Multi-table cases |
| fragment.cmaf.* | CMAF Fragment can be played by reference decoder | H.264 / H.265 / AV1 / Opus / FLAC |
| time-batch.in-object-index.* | (time_anchor, byte_offset, byte_size) index round-trips | Variable-payload events |
| index-page.delta-encoding.* | Delta-encoded leaves round-trip through encode/decode | Page-boundary cases |
| dynamic-height.* | Adding items past height-N capacity produces height-(N+1) tree with old root preserved | Heights 1→2→3→...→8 |
| multi-bucket-merge.* | Query against spatial-key with 3 bucket splits returns merged result in t_start order | Time-overlapping splits |
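
The inline-layout criterion is a single affine formula. A sketch, assuming the 160-byte header from the bucket.inline.layout row and a hypothetical `record_size` of 256 bytes in the usage note (for instance, a raw 64-dimensional f32 vector with no per-record metadata); the function name is hypothetical.

```python
BUCKET_HEADER_SIZE = 160  # from the bucket.inline.layout pass criterion


def inline_record_range(i: int, record_size: int) -> tuple:
    """Half-open byte range [start, end) of record i in an inline Bucket."""
    if i < 0 or record_size <= 0:
        raise ValueError("index must be >= 0 and record_size must be > 0")
    start = BUCKET_HEADER_SIZE + i * record_size
    return start, start + record_size
```

With `record_size = 256`, record 0 occupies bytes [160, 416) and record 3 occupies [928, 1184).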
6. Test Categories — HTTP Contract (0005)
6.1 ContentStore HTTP semantics
| Category | Pass criterion | Coverage |
|---|---|---|
| http.put.idempotent.* | PUT same bytes twice → second is no-op (200/201/412) | With/without If-None-Match: * |
| http.range.translation.* | DreamDB bytes:<a>-<b> → HTTP Range: bytes=<a>-<b−1> | Boundaries 0, 1, 1MB, large |
| http.list.exhaust.* | Pagination iterates correctly across K×1000 boundary | K ∈ {0, 1, 2}; k×1000+1 cases |
| http.list.sort-order.* | Backend returns lex-byte-order across pages | Adversarial keys, ASCII paths |
| http.consistency.read-after-write.* | PUT → GET returns the bytes immediately | Multiple regions if applicable |
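
Iterating LIST to exhaustion is a loop over continuation tokens. The sketch below runs against an in-memory fake rather than a real backend; the token format, the `fetch_page` callback shape, and the function names are all hypothetical, and the page size of 1000 mirrors the K×1000 boundary in the table.

```python
from typing import Callable, Iterator, List, Optional, Tuple

Page = Tuple[List[str], Optional[str]]  # (keys, continuation token or None)


def list_all(fetch_page: Callable[[Optional[str]], Page]) -> Iterator[str]:
    """Yield every key, following continuation tokens until none remains."""
    token = None
    while True:
        keys, token = fetch_page(token)
        yield from keys
        if token is None:
            return


def make_fake_backend(keys, page_size: int = 1000):
    """In-memory stand-in that pages keys in lexicographic byte order."""
    ordered = sorted(keys, key=lambda k: k.encode("utf-8"))

    def fetch_page(token: Optional[str]) -> Page:
        start = int(token) if token is not None else 0
        chunk = ordered[start:start + page_size]
        nxt = start + page_size
        return chunk, (str(nxt) if nxt < len(ordered) else None)

    return fetch_page
```

A vector with 2001 keys forces three pages (1000 + 1000 + 1), covering the k×1000+1 case.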
6.2 RefStore HTTP semantics
| Category | Pass criterion | Coverage |
|---|---|---|
| http.refs.create.* | If-None-Match: * → 412 on existing (treat as success) | Initial create + idempotent re-create |
| http.refs.cas.* | If-Match: <etag> → 412 on mismatch (treat as conflict) | Concurrent advance |
| http.refs.linearizable.* | Two concurrent CAS attempts: exactly one succeeds | Stress test |
| http.etag.flavor.* | Round-trip ETag with quotes, with W/, multipart-style | Backend-specific flavors |
6.3 Connector behaviors
| Category | Pass criterion | Coverage |
|---|---|---|
| connector.http2.shared-connection.* | Multiple parallel requests share one HTTP/2 connection | 16-parallel-GET hot path |
| connector.retry.exponential.* | 5xx / 429 → exponential backoff up to budget | 5xx, 429, network errors |
| connector.412.tier-aware.* | 412 on ContentStore PUT = success; 412 on RefStore PUT = conflict | Both tiers |
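
The tier-aware 412 rule and the retry rule can be sketched as two small functions. This follows only the connector.412.tier-aware and connector.retry.exponential rows; the idempotent re-create case of §6.2 is not modelled. The tier labels, return strings, function names, doubling factor, and absence of jitter are all assumptions of this sketch.

```python
def interpret_put_status(tier: str, status: int) -> str:
    """Classify a PUT response as success, conflict, retry, or error."""
    if status in (200, 201):
        return "success"
    if status == 412:
        if tier == "content":
            return "success"   # identical bytes already stored
        if tier == "ref":
            return "conflict"  # CAS lost; caller must rebase
        raise ValueError(f"unknown tier: {tier}")
    if status == 429 or 500 <= status <= 599:
        return "retry"
    return "error"


def backoff_delays(base_s: float, budget_s: float) -> list:
    """Doubling delays whose cumulative sum stays within the budget."""
    delays, total, delay = [], 0.0, base_s
    while total + delay <= budget_s:
        delays.append(delay)
        total += delay
        delay *= 2
    return delays
```

With a 0.5 s base and a 4 s budget the schedule is 0.5 s, 1 s, 2 s; the next 4 s delay would exceed the budget, so the connector gives up.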
7. Test Categories — Verbs and SDK behavior (0006)
| Category | Pass criterion | Coverage |
|---|---|---|
| verb.append-publish.roundtrip.* | Standard transaction succeeds; new Manifest is reachable | Single-modality + multi-modality |
| verb.append.retry.* | Mid-Append failure + retry produces idempotent result | Failure injected at every step |
| verb.publish.concurrent.* | Two concurrent Publishes → one wins, loser rebuilds | Rebase + merge paths |
| verb.query.cold-vs-hot.* | Cold-start adds 1 round trip; hot path uses cached index | Latency-budget assertion |
| verb.query.no-list-prefix.* | Hot-path query issues ZERO LIST HTTP requests (Manifest Supremacy) | Steady-state queries |
| verb.gc.algorithm.* | Synthetic orphans + reachable mix → orphans deleted, reachable preserved | Threshold honored |
| verb.stream.prefetch.* | Simulated 200–500 ms tail-latency Fragments do not stall playback when lookahead ≥ 2 | Adversarial scheduling |
| verb.refs.freshness.* | HEAD-detected ETag mismatch triggers Resolve flow | Long-lived session |
| cache.content-hash-keyed.* | Different Tracks Object for same (timeline, modality) produce two cache entries | Cross-Manifest |
8. Test Categories — DAG and Versioning (0008)
| Category | Pass criterion | Coverage |
|---|---|---|
| dag.parents-array.* | Manifests with parents: [], [1], and [2+] round-trip | All cardinalities |
| dag.fast-forward.* | Detect when one parent is ancestor of the other | Linear, branching, multi-level |
| dag.spatial-index-incompat.* | MUST refuse merge with named conflicting hashes; auto-merging implementation is non-conformant | Edge cases |
| dag.constant-tiebreak.* | Lexicographically-greatest layer-Track address wins | Multi-layer Constants |
| dag.cas-rebase-loop.* | Bounded retries terminate; surface PublishConflict | Retry budget exhaustion |
| dag.feature-branch-load.* | 1000 writers on per-user Refs sustain ≥10× throughput vs. refs/main direct | Load test |
| dag.monotonic-ts.* | Writer fix-up of ts < max(parents.ts) produces monotonic chain; reader-side Timeline-Jump flag surfaces | Clock-skew injection |
| dag.history-walk.* | DAG traversal terminates; visits each Manifest at most once | Cyclic-input rejection (impossible by construction, but verify) |
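
The history-walk, fast-forward, and tiebreak criteria can be illustrated on a toy DAG. In this sketch Manifests are represented only by short string IDs in a `parents` dictionary, which is a stand-in for real Manifest hashes; comparing addresses by UTF-8 bytes is an assumption about what "lexicographically-greatest" means, and all names are hypothetical.

```python
def ancestors(parents: dict, start: str) -> set:
    """All Manifests reachable from start (inclusive), each visited once."""
    seen, stack = set(), [start]
    while stack:
        node = stack.pop()
        if node in seen:
            continue  # the visited set guarantees termination
        seen.add(node)
        stack.extend(parents.get(node, []))
    return seen


def is_fast_forward(parents: dict, older: str, newer: str) -> bool:
    """True when older is an ancestor of (or equal to) newer."""
    return older in ancestors(parents, newer)


def constant_tiebreak(addresses) -> str:
    """Pick the lexicographically greatest address (byte order assumed)."""
    return max(addresses, key=lambda a: a.encode("utf-8"))


# Diamond history: g <- a, g <- b, and m merges a and b.
history = {"g": [], "a": ["g"], "b": ["g"], "m": ["a", "b"]}
```

In the diamond, `g` is reachable from `m` through both parents but is visited once; `a` and `b` are siblings, so neither is a fast-forward of the other.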
8.5 Test Categories — Phase-3 ObjectKinds and Verbs
Phase-3 spec drafts (0010, 0012, 0013, 0014) introduce new ObjectKinds and one new verb. The conformance battery for them is OPTIONAL for pre-Phase-3 v0 SDKs and REQUIRED for SDKs claiming Phase-3 conformance. Specific vectors are deferred to a v0.X amendment of this spec but the categories are pinned now so implementations have known targets:
8.5.1 0010 — Vector compression
| Category | Pass criterion | Coverage |
|---|---|---|
| vc.qinco.roundtrip.* | encode(decode(encode(v))) == encode(v) for the reference VC Object | Multi-arch (scalar reference) |
| vc.qinco.relu-only.* | Implementations using GELU/SiLU in their MLP fail bit-identity vs reference | Negative-test vector |
| vc.bucket-header.200-byte.* | Buckets emitted with compress= use 200-byte header carrying vector_compressor_hash | Header layout |
| vc.lineage-mismatch-abort.* | Decoding a Bucket with wrong VC Object causes critical-error abort | Cache-mis-keying scenarios |
| vc.rerank-storage.* | Re-rank fetch path closes ≥95% of recall gap to exact f32 | BIGANN slice |
8.5.2 0012 — Federation
| Category | Pass criterion | Coverage |
|---|---|---|
| fed.federate.push.* | Push moves complete closure; idempotent retry is no-op | Cold + partial-retry |
| fed.federate.pull.* | Pull-manifest-list returns ordered hash list; receiver fetches in any order | Delta + full closure |
| fed.hash-verify-abort.* | Receiver served wrong bytes for known hash MUST abort | Negative test; backend-corruption sim |
| fed.capability.ed25519.* | Expired/forged capability tokens are rejected | Signature roundtrip + tamper |
| fed.scatter-gather.partial.* | default_quorum-satisfied response flagged partial:true | One shard offline |
| fed.shard-key.range-prune.* | Hybrid-range shards outside the query's anchor range are skipped | Multi-shard time queries |
| fed.router.fanout-reduction.* | Router-graph dispatches reduce contacted shards from N to K_router | 100-shard federation |
8.5.3 0013 — Graph indexing
| Category | Pass criterion | Coverage |
|---|---|---|
| graph.vamana.build.deterministic.* | Two builds from same (vectors, seed, alpha, L_build, passes) produce bit-identical GraphIndex + GraphPage hashes | Multi-arch |
| graph.vamana.search.deterministic.* | Greedy search trajectory is identical across implementations | Reference graph (dim=64, 10K nodes, R=8) |
| graph.page.header.lineage.* | GraphPage header's graph_index_hash matches the SDK's current GraphIndex | Cache-mis-keying |
| graph.compose-with-compressor.* | dreamdb.vamana-cosine + dreamdb.qinco-cosine reaches ≥0.97 recall@10 on test corpus | End-to-end |
| graph.entry-point.medoid.* | Approximate-medoid entry point selection is deterministic across implementations | Sample-size cap |
8.5.4 0014 — Streaming extensions
| Category | Pass criterion | Coverage |
|---|---|---|
| chunk.itemmanifest.roundtrip.* | Encode then decode produces byte-identical CBOR | Single-chunk + multi-chunk |
| chunk.fragmententry.4-tuple.* | Modalities without chunk-size emit 4-tuple FragmentEntries byte-identical to v0 | Backwards-compat |
| chunk.fragmententry.5-tuple.* | Modalities with chunk-size emit 5-tuple FragmentEntries with is_manifest=true | Forward case |
| chunk.stitching.range-fetch.* | Byte-range [A,B) fetches only the chunks overlapping range | Partial reads at every chunk boundary |
| chunk.total-size-verify.* | Manifest-declared total mismatches actual concatenated size → critical error | Negative test |
| chunk.gc.transitive.* | GC walks ItemManifest → chunk hashes transitively; no dangling refs | Mixed reachable/orphan |
| playlist.renditions.roundtrip.* | RenditionPlaylist CBOR roundtrips deterministically | 1–5 renditions, audio + video |
| playlist.cross-rendition.anchor.* | All renditions of the same Item share [t_start, t_end) | Producer-side aligned encode |
| playlist.hls.emission.* | HLS .m3u8 emission is byte-identical given the same RenditionPlaylist + offsets | Translation determinism |
8.5.5 New ObjectKind path-grammar coverage
| Category | Pass criterion | Coverage |
|---|---|---|
| path.parser.items.* | <timeline>/<modality>/items/<hash> parses to ItemManifest ObjectKind | All 6 new slots |
| path.parser.playlist.* | <timeline>/<modality>/playlist/<hash> parses to RenditionPlaylist | |
| path.parser.graph-index.* | graph-index/<hash> parses to GraphIndex | |
| path.parser.graph-page.* | <timeline>/<modality>/graph-page/<hash> parses to GraphPage | |
| path.parser.vector-compressor.* | vector-compressor/<hash> parses to VectorCompressor | |
| path.parser.federation-manifests.* | federation-manifests/<hash> parses to FederationManifest | |
8.6 Test Categories — Phase-4 ObjectKinds and Verbs
Phase-4 spec drafts (0015, 0016, 0017, 0018, 0019) introduce hybrid retrieval, streaming freshness, schema evolution, multi-tenancy, and encryption. Conformance for them is OPTIONAL for pre-Phase-4 SDKs and REQUIRED for SDKs claiming Phase-4 conformance. Specific vectors are deferred to a v0.X amendment of this spec; categories are pinned here as known targets.
8.6.1 0015 — Hybrid retrieval
| Category | Pass criterion | Coverage |
|---|---|---|
| hybrid.rrf.scale-invariance.* | Same RRF score regardless of sub-query score scales | BM25+cosine; cosine+bm25-plus |
| hybrid.planner.deterministic.* | Same Manifest + stats + query → same plan | Multi-implementation |
| hybrid.prefilter.threshold.* | Selectivity <1% → pre-filter chosen; >1% → post-filter | Adversarial stats |
| hybrid.required.elimination.* | required:true sub-query eliminates non-matching candidates | Boolean-AND semantics |
| hybrid.bm25.fp32-determinism.* | BM25 scores bit-identical across implementations | Scalar reference vectors |
| hybrid.colbert.maxsim.* | MaxSim aggregation matches reference (within fp32 left-fold discipline) | dim=128 reference corpus |
| hybrid.splade.encoder-validation.* | TextIndex with SPLADE algorithm validates encoder_hash against registry | Encoder mismatch → critical error |
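
Scale invariance follows from Reciprocal Rank Fusion using only ranks, never raw scores. The sketch below uses the common formulation score(d) = Σ 1/(k + rank(d)) with k = 60; whether 0015 pins this constant or this tie-breaking rule is not stated here, so both are assumptions, as are the function name and the sample scores.

```python
def rrf(sub_results, k: int = 60) -> dict:
    """Fuse per-sub-query score maps into RRF scores keyed by document ID."""
    fused = {}
    for scores in sub_results:
        # Rank by descending score; break ties by document ID (assumption).
        ranked = sorted(scores, key=lambda doc: (-scores[doc], doc))
        for rank, doc in enumerate(ranked, start=1):
            fused[doc] = fused.get(doc, 0.0) + 1.0 / (k + rank)
    return fused


# Illustrative scores only.
bm25 = {"a": 12.0, "b": 7.5, "c": 1.0}
cosine = {"a": 0.2, "b": 0.9, "c": 0.5}
bm25_rescaled = {doc: score * 1000.0 for doc, score in bm25.items()}
```

Rescaling the BM25 scores leaves every rank unchanged, so `rrf([bm25, cosine])` and `rrf([bm25_rescaled, cosine])` produce identical fused scores.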
8.6.2 0016 — Streaming freshness
| Category | Pass criterion | Coverage |
|---|---|---|
| hotshard.append.flush-threshold.* | Buffered items exceeding threshold trigger Track rewrite | Boundary cases |
| hotshard.append.ttl.* | TTL expiry forces flush even below threshold | Clock-skew injection |
| hotshard.read.merge.* | Query merges Track + HotShard candidates correctly | Time / vector / scalar predicates |
| hotshard.staleness.consumer-tolerance.* | max_staleness_seconds honored; refresh triggered on miss | All staleness levels |
| fresh-vamana.append-search.* | After N appends, recall@10 stays within 5% of full-rebuild baseline | N ∈ {10K, 100K, 1M} |
| fresh-vamana.consolidation.recall.* | Post-consolidation recall returns to full-rebuild baseline | After 10× threshold of appends |
| fresh-vamana.tombstone.delete.* | Tombstoned items absent from query results; remain in graph until consolidation | Mixed insert/delete |
| index-health.drift-metric.* | Drift metric monotonically tracks distribution shift | Synthetic drift injection |
| index-health.recommended-action.* | Threshold transitions produce correct recommended_action transitions | All policy levels |
8.6.3 0017 — Schema evolution
| Category | Pass criterion | Coverage |
|---|---|---|
| evolve.multi-version-registry.* | Registry with v1 + v2 both query correctly | Both versions, independent queries |
| evolve.reencode.resumable.* | Crash mid-Reencode + resume produces same final state as uninterrupted run | Failure injected per-batch |
| evolve.reencode.checkpoint-monotonic.* | last_anchor strictly increases across batches | Adversarial batch orderings |
| evolve.planner.all-versions-fallback.* | version_preference: "all" fills v2 coverage gap with v1 scores | Partial coverage scenarios |
| evolve.gc.decommission-v1.* | After v1 removed from registry, its Objects become eligible after safety threshold | Standard GC test |
| evolve.compatible-with.semantics.* | coverage: "complete" means every v1 Item present in v2; verifier asserts | Mixed coverage states |
8.6.4 0018 — Multi-tenant
| Category | Pass criterion | Coverage |
|---|---|---|
| tenant.quota.storage-507.* | Storage cap exceeded → HTTP 507 + correct DreamDB-Quota header | Multiple storage types |
| tenant.quota.rate-429.* | Rate limit exceeded → HTTP 429 + Retry-After header | Per-resource rate |
| tenant.token.tenant-id-mismatch.* | Token tenant_id ≠ Space tenant_id → HTTP 403 | All scope levels |
| tenant.token.cross-tenant-isolation.* | Token from tenant A can never access tenant B's paths | Negative test |
| tenant.usage-batch.publish-cadence.* | Batches emitted on schedule; chain of previous_batch links unbroken | Multi-window scenario |
| tenant.usage-batch.violations-recorded.* | 429/507 events surfaced in subsequent batch's violations array | Adversarial load |
| tenant.fair-share.anti-monopoly.* | One tenant cannot capture >50% capacity when other tenants are active | Multi-tenant load test |
| tenant.federation.cross-issuer.* | Federation hop preserves tenant_id; no escalation | Cross-issuer scenarios |
| tenant.offboarding.gc.* | After quota=0 + retention window, all tenant Objects reclaimed | Standard GC + retention |
8.6.5 0019 — Data-plane encryption (sketch-level)
Conformance for encryption is provisional pending the v0.X+1 promotion of spec/0019 from sketch to full draft. Provisional categories:
| Category | Pass criterion (preliminary) | Coverage |
|---|---|---|
| enc.convergent.dedup.* | Same plaintext + same salt → identical ciphertext hash | Multi-writer; multi-arch |
| enc.per-space.uniqueness.* | Same plaintext + per-space mode → different ciphertext hash per write | Negative test |
| enc.aead.tamper-detect.* | Modified ciphertext byte → AEAD tag verification fails | Bit-flip injection |
| enc.kms.dek-cache.* | Per-Object DEK cached after first KMS call; verified by reduced KMS RPS | Cache HIT/MISS ratios |
| enc.cross-arch.aes-roundtrip.* | AES-256-GCM round-trip bit-identical on x86-64 / ARM64 / Apple Silicon | Multi-arch |
| enc.metadata.federation-safe.* | EncryptionMeta header + ciphertext round-trip through federate verb intact | Single hop + multi-hop |
8.6.6 Phase-4 path-grammar coverage
| Category | Pass criterion | Coverage |
|---|---|---|
| path.parser.text-index.* | <timeline>/<modality>/text-index/<hash> parses to TextIndex | All grammar cases |
| path.parser.text-index.posting.* | <timeline>/<modality>/text-index/posting/<hash> parses to TextIndex page | Two-segment disambiguation |
| path.parser.multi-vec-index.* | <timeline>/<modality>/multi-vec-index/<hash> parses to MultiVectorIndex | All grammar cases |
| path.parser.hot-shard.* | <timeline>/<modality>/hot-shard/<hash> parses to HotShard | All grammar cases |
| path.parser.tenant-usage.* | tenant-usage/<hash> parses to TenantUsageBatch | Top-level namespace |
| path.parser.tenant-usage-refs.* | tenant-usage-refs/<tenant_id> parses to TenantUsageRef | Top-level refs namespace |
9. Test Categories — End-to-End Worked Example
A full end-to-end test that exercises the entire spec in one scenario:
10. Conformance Suite Repository
The actual test vectors and runner code live in an external repository (designed to live at dreamdb-conformance adjacent to the spec repo):
Vectors are JSON per §4 with binary expected-output files alongside (.expected.cbor, .expected.bin). The runners are reference implementations in Rust; ports to other languages are welcome.
A v0 implementation publishes a conformance report (a structured JSON document listing which test categories passed/failed) and a chosen Tier ("ContentStore-Conformant Backend," "Full Protocol Component," etc.) when claiming DreamDB compatibility.
11. Self-Certification
Until a formal conformance authority exists, implementations self-certify:
- Run the conformance suite against your implementation.
- Publish the resulting report (machine-readable JSON + human-readable summary).
- State your claimed Tier in the README.
- Cite the spec version (e.g. "DreamDB v0.1.0").
The community SHOULD maintain a public list of self-certified implementations and their reports. Discrepancies between claimed and actual conformance can be challenged via the public report; this is the same model used by Web Platform Tests, IETF interop reports, and similar.
A formal certification authority MAY emerge in v0.1+; v0 leans on community-maintained transparency.
12. Compatibility Across Spec Revisions
DreamDB spec versions follow semantic versioning:
- Patch (v0.1.x → v0.1.y): clarifications, OQ resolutions, conformance test additions. Existing implementations remain conformant.
- Minor (v0.1 → v0.2): feature additions (new modalities, new algorithms, new optional verbs). Existing implementations remain conformant for the subset they support; new features are opt-in.
- Major (v0 → v1): breaking changes. Implementations MAY support multiple major versions in parallel via a version-prefixed namespace (TBD in v1).
Within v0:
- v0 freezes the data model, address grammar, time encoding, hash function, and HTTP contract.
- v0.1+ MAY add: new spatial-indexing algorithms (dreamdb.pq-ivf, dreamdb.lsh-l2, dreamdb.learned-mlp-v1), additional modalities (AV1, HEVC, FLAC, sensor types), federation primitives, named tags, push-style notifications.
13. Open Questions Resolved by This Document
This document explicitly resolves:
- OQ-11 (concrete CBOR tag numbers): §3.2.
- OQ-12 (multihash algorithm tags, IPFS-aligned): §3.1.
- OQ-17 (concrete value for the time-anchor CBOR tag): §3.2, folded into the single dreamdb.tag = 65521 reserved for foreign-CBOR contexts; not used inside DreamDB schemas.
All other "OQ-NN → 0009" pointers from 0000–0008 are resolved by the test-vector categories enumerated in §5–§9.
14. Open Questions Not Resolved
Two operational decisions are deferred to community / operator coordination:
- OQ-42 (→ community / future spec): Public registry of conformance reports. v0 is self-certification only; a formal authority may emerge.
- OQ-43 (→ community / future spec): Publication cadence for the conformance suite repository. v0 is unscheduled — releases tied to spec revisions.
End of v0 specification. Implementations interested in v0 conformance should start with 0000-overview.md, build forward through 0001–0008, and verify against the test categories in this document.