DreamDB Specification — 0003: Time Encoding
Status: Draft. Builds on 0000-overview.md, 0001-data-model.md, and 0002-content-addressing.md. This document fixes the byte and string formats for <time-anchor> and <time-bucket>, the contents of the Timeline Genesis Object's origin field, and resolves 0000 OQ-1 (absolute vs. Genesis-relative).
1. Purpose
0001 introduced time as the sole primary key. 0002 defined the slots in the address grammar where time appears (<time-anchor>, <time-bucket>). This document pins down the actual bits and characters:
- The reference frame for timestamps (Genesis-relative, with an optional absolute back-reference).
- The resolution and range of time values.
- The CBOR encoding inside hashable Objects (Genesis, Track, Manifest, Index Pages).
- The string encoding inside dreamdb:// URIs and backend object keys.
- The placement-rule arithmetic for <time-bucket>.
What stays as defined elsewhere:
- The abstract data model of time anchors (point / interval / all-of-time): 0001 §3.
- The role of time as primary key: 0000 §5.1.
- The fragment-spans-bucket-boundary placement rule: 0002 §6.3.1.
- The two-part address structure: 0002 §4.
What this document does not fix:
- Multi-Timeline alignment (mapping anchors between two distinct Timelines): out of scope for v0 per 0001 §11.
- The byte format inside Fragments / Buckets / Time-batches: 0007.
2. Resolves OQ-1: Genesis-Relative is Canonical
The canonical form of every DreamDB time anchor is an integer count of resolution units measured from the Timeline's Genesis origin. t = 0 sits at the Genesis origin; positive values are after.
The Genesis Object's origin field carries an optional absolute back-reference — a Unix-nanosecond timestamp at which t = 0 sits in wall-clock time. Three modes are supported:
| Genesis origin field | Timeline mode | Meaning |
|---|---|---|
| Unix nanosecond integer | Anchored | t = 0 corresponds to the given Unix-ns instant. Wall-clock conversion is well-defined. |
| null | Abstract | The Timeline has no wall-clock anchor. Only relative comparisons within itself are meaningful. |
| (Reserved future tags) | (Future) | Other epoch references: TAI, GPS, monotonic-since-boot. Not in v0. |
Why Genesis-relative as the canonical form:
- Compactness. A 10-year recording at ns resolution is about 3.2 × 10¹⁷ ns, which fits in 59 bits, inside a 64-bit integer with room to spare. Genesis-relative integers stay small; absolute Unix-ns integers are already at ~1.8 × 10¹⁸ in 2026 and need 61 bits.
- Abstract timelines. Test data, simulations, fictional timelines, and offline replays don't have a meaningful wall-clock origin. Genesis-relative makes them first-class.
- Wall-clock conversion stays cheap when needed. A reader who wants absolute time fetches the (small, immutable) Genesis Object once and adds the offset.
- No system-clock dependence in the protocol. DreamDB never assumes the writer's clock matches anything in particular. The optional origin field declares an alleged absolute reference; no protocol verb requires it.
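The Anchored and Abstract modes can be sketched in Python as follows. This is an illustration, not a normative API: the function name to_unix_ns, the use of None for a null origin, and the ORIGIN constant are assumptions made here.

```python
from typing import Optional


def to_unix_ns(t: int, origin_unix_ns: Optional[int]) -> Optional[int]:
    """Convert a Genesis-relative anchor to Unix nanoseconds.

    Returns None for an Abstract Timeline (origin is null), because no
    wall-clock conversion is defined there.
    """
    if origin_unix_ns is None:
        return None
    # Integer addition only; Python ints are arbitrary precision.
    return origin_unix_ns + t


# Hypothetical Anchored Timeline whose origin is 2026-05-06T09:00:00Z.
ORIGIN = 1_778_058_000_000_000_000

anchored = to_unix_ns(152_481_000_000, ORIGIN)
abstract = to_unix_ns(152_481_000_000, None)
```

A reader would fetch the Genesis Object once, cache its origin, and reuse it for every conversion on that Timeline.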
3. Resolution: Nanoseconds (Implicit)
DreamDB v0 fixes time resolution at nanoseconds for every Timeline. The resolution is implicit — it is not stored in the Genesis Object. Every time anchor in v0 is an integer count of nanoseconds since the Genesis origin.
Why ns-only in v0:
- Covers the highest-rate modalities that matter (96 kHz audio, sub-µs sensor pings, video frame edges).
- Slow events on a ns timeline simply use larger numbers. The cost of the larger numbers is bounded: a u64 holds ~584 years of ns (§4).
- One resolution means there's no Timeline-vs-Timeline unit conversion to get wrong.
- Storing nothing about resolution saves a field per Genesis (small at billion-Genesis scale, but cleaner schema).
A v0.1+ spec extension MAY introduce coarser resolutions (ms, s, etc.) by adding a resolution field to Genesis with a forward-compat default of "ns"; v0 Genesis Objects (no resolution field) are interpreted as ns by all v0.1+ readers.
4. Range
v0 time anchors are unsigned 64-bit integers: 0 ≤ t < 2⁶⁴.
- At ns resolution, this gives ~584 years of range from the Genesis origin. Adequate for every practical recording, surveillance feed, simulation, or archive.
- Negative anchors (before the Genesis origin) are forbidden in v0. Writers SHOULD set the Genesis origin at or before the earliest anchor they expect to record. If items are discovered earlier than the origin, the correct response is to publish them on a new Timeline with an earlier origin (Genesis is immutable).
- A future spec version MAY introduce signed anchors via a bias scheme (e.g. anchors stored as bias + value so lex-order remains numeric-order). Compatibility is forward: v0 readers reading a v0 Genesis never encounter negative anchors.
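A minimal Python sketch of the v0 range rule, together with the arithmetic behind the ~584-year figure. The names check_anchor, U64_LIMIT, and RANGE_YEARS are assumptions for illustration, and the year length used for the estimate is a plain 365-day year.

```python
U64_LIMIT = 1 << 64  # v0 anchors satisfy 0 <= t < 2**64


def check_anchor(t: int) -> int:
    """Return t unchanged if it is a valid v0 anchor, otherwise raise."""
    if isinstance(t, bool) or not isinstance(t, int):
        raise TypeError("time anchors must be integers")
    if t < 0:
        raise ValueError("negative (before-Genesis) anchors are forbidden in v0")
    if t >= U64_LIMIT:
        raise ValueError("anchor exceeds the unsigned 64-bit range")
    return t


# Range estimate in whole 365-day years, computed with integers only.
NS_PER_YEAR = 365 * 86_400 * 1_000_000_000
RANGE_YEARS = U64_LIMIT // NS_PER_YEAR
```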
4.1 All time arithmetic is integer arithmetic
Time anchors, time-bucket indices, bucket-durations, and wall-clock back-references are all 64-bit integers. No protocol step may use floating-point arithmetic to compute a time value. This is a hard requirement, not a recommendation.
Why this matters:
- IEEE 754 double (f64) has a 52-bit stored mantissa (53 bits of precision). Integers above 2⁵³ ≈ 9.0 × 10¹⁵ are not all exactly representable, so larger values can lose ns-level precision on an f64 round-trip.
- Unix-nanosecond timestamps in 2026 already exceed 1.7 × 10¹⁸. Any f64 conversion silently quantizes to ~250 ns granularity at this magnitude (the f64 spacing there is 256 ns), destroying every precision guarantee this document makes.
- A writer that computes seconds × 1e9 in f64 to produce a ns-resolution anchor introduces drift on the very first conversion. The drift is invisible to the writer but corrupts cross-implementation comparisons forever.
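The precision loss described above can be observed directly. The following Python sketch (math.ulp requires Python 3.9+) uses an illustrative Unix-ns value from May 2026; the variable names are assumptions.

```python
import math

# An illustrative Unix-ns instant in May 2026, plus 1 ns.
t = 1_778_058_152_481_000_001

# Spacing between adjacent f64 values at this magnitude, in ns.
spacing = math.ulp(float(t))

# Round-tripping through f64 does not return the original integer.
round_tripped = int(float(t))
lost = t - round_tripped

# Float path: seconds held as f64, then multiplied by 1e9.
float_path = int(1_778_058_152.481 * 1e9)

# Integer path: whole seconds and nanoseconds combined with int ops.
int_path = 1_778_058_152 * 1_000_000_000 + 481_000_000
```

At this magnitude every f64 value is a multiple of 256, so the float path cannot land on an integer result that is not.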
The discipline:
- Anchor arithmetic. All +, -, floor-div, and mod operations between time anchors and bucket-durations MUST use 64-bit (or wider) integer paths. No f64, f32, or "approximate" math anywhere on the time axis.
- Wall-clock conversion. Computing t_relative = t_wall_clock_ns - origin_unix_ns MUST be integer subtraction. Any wall-clock representation (RFC-3339 strings, (seconds, nanoseconds) pairs, time.Time structs) MUST be converted to a single i64 ns count via integer multiplication and addition (e.g. seconds × 1_000_000_000 + nanoseconds) before any arithmetic, never via an f64 intermediary.
- Duration parsing. Parsing the bucket-duration suffix grammar (§8), e.g. 500ms, 60s, 5m, MUST use integer multiplication against the suffix's integer constant (1_000_000, 1_000_000_000, 60_000_000_000, …). No f64 path.
- Overflow guarding. Writers SHOULD use checked-multiplication primitives when normalizing durations to ns and reject any duration whose ns count would exceed 2⁶³ − 1. (24h fits with vast headroom; 1_000_000_000 h does not. Guard accordingly.)
- Display vs. computation. Output formatting (e.g. printing t as a decimal-fraction-of-second for human eyes) MAY use floats. Anything that consumes a time value back into protocol space MUST round-trip through an i64 representation first.
This rule applies to writers, readers, SDKs, and any tool that emits or interprets DreamDB time values. Storage formats already enforce integer representation (CBOR int, hex digits in addresses) — this discipline closes the loophole between storage and arithmetic. A conforming implementation has zero f64 ops on the time path.
4.1.1 Non-Rust SDKs and arbitrary-precision integers
Rust, Go, Java, C/C++, and most systems languages have native u64 / i64 types — the discipline is straightforward. Some target languages do not:
- JavaScript / TypeScript: native Number is f64. Time arithmetic in plain JS silently loses ns precision the moment Unix-ns values are involved. Conformant JS SDKs MUST use BigInt for all time values, with no exceptions.
- Python: int is arbitrary-precision (good), but time.time() returns f64 (bad). datetime.timestamp() is also f64. Conformant Python SDKs MUST convert wall-clock to ns via time.time_ns() (Python 3.7+) or equivalent ns-resolution APIs, never via time.time() * 1e9.
- PHP / older Lua / older shell: similar gotchas. Conformant SDKs MUST use the language's BigInt or arbitrary-precision-integer library throughout the time path.
- WebAssembly: usually backed by Rust/C++/AssemblyScript with native i64 support — fine.
Implementations targeting browser or scripting environments SHOULD include explicit conformance tests demonstrating BigInt round-trips for the standard test vectors. A JS SDK that "works for small Tracks but loses precision at production scale" is non-conformant; it produces silently-wrong addresses.
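For the Python case, an integer-only wall-clock conversion might look like the sketch below. The helper names datetime_to_unix_ns and now_unix_ns are assumptions; the standard-library calls are time.time_ns() and timedelta floor-division, which returns an int.

```python
import time
from datetime import datetime, timedelta, timezone

EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)


def datetime_to_unix_ns(dt: datetime) -> int:
    """Convert an aware datetime to Unix ns using integer arithmetic only.

    datetime carries microsecond precision, so the result is always a
    multiple of 1_000 ns. datetime.timestamp() is avoided because it
    returns a float.
    """
    if dt.tzinfo is None:
        raise ValueError("naive datetimes are ambiguous; pass an aware datetime")
    micros = (dt - EPOCH) // timedelta(microseconds=1)  # timedelta // timedelta -> int
    return micros * 1_000


def now_unix_ns() -> int:
    """Current wall-clock time as an integer ns count (Python 3.7+)."""
    return time.time_ns()
```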
5. CBOR Encoding (inside hashable Objects)
Time anchors appearing inside CBOR-encoded DreamDB Objects use plain CBOR integers in the canonical encoding — not the dreamdb.tag (per 0002 §3.2). The schema field name (t_start, t_end, coverage, t_min, t_max, origin, etc.) is already authoritative about the field's meaning, so the tag would be redundant.
The dreamdb.tag from 0002 §3.2 remains reserved for use in foreign or ambiguous CBOR contexts — for example, a DreamDB time anchor embedded inside a generic CBOR document outside the spec'd DreamDB schemas, where readers cannot otherwise distinguish it from a plain integer.
This decision is sized for billion-scale: the redundant tag would cost ~6 GB of additional storage on a 1B-fragment Track Object's index across all leaves. See §10.1 for the full sizing argument.
5.1 Point anchor
5.2 Interval anchor
5.3 "All of time" anchor (Constants)
Not encoded. A Constant Track's object_index is a single constant_address; the per-item time anchor is implicit — coverage is the entire Timeline span.
The Track-level coverage field (0001 §4) still carries [t_min, t_max) so readers know the Timeline span being asserted.
5.4 Genesis Object's origin field
The origin field, like all other time-typed fields in DreamDB schemas, is a plain CBOR integer. The Genesis Object does NOT carry a resolution field in v0 — all timestamps are nanoseconds (per §3).
6. String Encoding (inside addresses)
Time anchors and time buckets appearing as path segments in dreamdb:// URIs and backend object keys use a 16-character lowercase hex, big-endian, fixed-width format:
(One concrete example below in §9.1 with arithmetic spelled out.)
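A minimal Python sketch of this segment format. The function names are assumptions, and the strict rejection of uppercase or short input in the decoder is an illustrative choice, not something this document mandates.

```python
U64_LIMIT = 1 << 64


def encode_time_segment(t: int) -> str:
    """Encode an anchor or bucket index as 16 lowercase hex characters."""
    if not 0 <= t < U64_LIMIT:
        raise ValueError("value outside the unsigned 64-bit range")
    return format(t, "016x")


def decode_time_segment(segment: str) -> int:
    """Decode a 16-char lowercase hex segment back to an integer."""
    if len(segment) != 16 or any(c not in "0123456789abcdef" for c in segment):
        raise ValueError("expected exactly 16 lowercase hex characters")
    return int(segment, 16)


segment = encode_time_segment(152_481_000_000)
```

Because the width is fixed and the most significant digit comes first, sorting the strings sorts the underlying integers.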
Why 16-char fixed-width hex:
- Prefix-orderable. Lexicographic order on the string equals numeric order on the underlying integer. List-prefix queries on time-bucket segments (per 0002 §6.3.1) work correctly without requiring backends to interpret the encoding.
- Compact. 16 chars vs. 20 chars for fixed-width decimal (u64 max is 20 decimal digits). In an address that already runs to ~150 chars, the savings are marginal but free.
- Universal. Hex is supported everywhere; no encoding-table edge cases.
- Aligns with content-hash style. Although content hashes use base32 (per 0002 §8.1) for case-insensitivity in object keys, time anchors don't carry the same case-collision risk because they are already lowercase by construction. Mixing hex (for time) and base32 (for hashes) is intentional and follows the same role-based principle as base2 for spatial keys (0002 §6.3.2).
For interval anchors appearing in addresses: only the t_start is used (per the placement rule in 0002 §6.3.1 — the time-bucket of an interval is determined by its start). Address segments therefore only encode point values; intervals appear only inside CBOR-encoded Objects.
7. Time-Bucket Encoding
The <time-bucket> address segment (per 0002 §6.3) is the integer index of a time bucket, derived from the placement rule:
Both t_start and bucket-duration-in-ticks are unsigned 64-bit integers; the division is integer floor-division (per §4.1). No f64 conversion is involved at any step. The bucket-duration is declared in the modality's parameters (e.g. transcript.turn.bucket=10s → bucket-duration = 10 × 1_000_000_000 = 10_000_000_000 ns, computed by integer multiplication).
The result is a non-negative integer encoded as 16-char lowercase hex — same format as time anchors. Reusing one format keeps the address grammar uniform.
Example: for a Fragment with t_start = 60.0 s and bucket-duration = 60 s:
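A minimal Python sketch of this example follows. The function name time_bucket is an assumption for illustration; the arithmetic is the integer floor-division described above.

```python
NS_PER_S = 1_000_000_000


def time_bucket(t_start: int, bucket_duration_ns: int) -> int:
    """Bucket index = floor(t_start / bucket-duration), in integer arithmetic."""
    if t_start < 0 or bucket_duration_ns <= 0:
        raise ValueError("t_start must be >= 0 and bucket duration must be > 0")
    return t_start // bucket_duration_ns


t_start = 60 * NS_PER_S          # 60.0 s after the Genesis origin
bucket_duration = 60 * NS_PER_S  # bucket=60s

index = time_bucket(t_start, bucket_duration)
segment = format(index, "016x")  # same 16-char hex form as time anchors
```

A t_start exactly on a bucket boundary belongs to the bucket that starts there, so this Fragment lands in bucket 1 rather than bucket 0.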
8. Bucket Duration Encoding (Modality Parameters)
The bucket-duration carried in a modality tag (e.g. transcript.turn.bucket=10s) uses a small human-readable suffix grammar:
Examples: bucket=500ms, bucket=10s, bucket=5m, bucket=1h, bucket=24h.
The unit suffixes are a lexical convenience — at protocol level the duration always becomes an integer ns count via integer multiplication against the suffix's integer constant. The constants:
| Suffix | Multiplier (ns) |
|---|---|
| ns | 1 |
| us | 1_000 |
| ms | 1_000_000 |
| s | 1_000_000_000 |
| m | 60_000_000_000 |
| h | 3_600_000_000_000 |
| d | 86_400_000_000_000 |
Writers and readers MUST normalize via integer multiplication (per §4.1); no f64 path is permitted. Implementations SHOULD use checked multiplication and reject any duration whose ns count would exceed 2⁶³ − 1.
The set of bucket-duration unit suffixes is fixed in v0; user-defined units are forbidden. The suffixes m, h, and d are constant multiples of the SI second; calendar-relative units like "month" are deliberately excluded.
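A minimal Python sketch of duration normalization under these rules. The accepted shape (decimal digits followed by exactly one suffix), the rejection of a zero duration, and the name parse_bucket_duration are assumptions made for illustration; the multipliers and the 2⁶³ − 1 bound come from this section.

```python
import re

# Integer ns multipliers from the suffix table.
MULTIPLIER_NS = {
    "ns": 1,
    "us": 1_000,
    "ms": 1_000_000,
    "s": 1_000_000_000,
    "m": 60_000_000_000,
    "h": 3_600_000_000_000,
    "d": 86_400_000_000_000,
}
MAX_DURATION_NS = (1 << 63) - 1

# Assumed shape: ASCII decimal digits followed by one known suffix.
_DURATION_RE = re.compile(r"([0-9]+)(ns|us|ms|s|m|h|d)")


def parse_bucket_duration(text: str) -> int:
    """Normalize a duration such as '500ms' or '24h' to an integer ns count."""
    match = _DURATION_RE.fullmatch(text)
    if match is None:
        raise ValueError(f"unrecognized bucket duration: {text!r}")
    ns = int(match.group(1)) * MULTIPLIER_NS[match.group(2)]
    # Python ints never overflow, so the bound is checked explicitly.
    if ns > MAX_DURATION_NS:
        raise ValueError(f"bucket duration exceeds 2**63 - 1 ns: {text!r}")
    # Assumption: a zero duration is rejected, since placement divides by it.
    if ns == 0:
        raise ValueError("bucket duration must be positive")
    return ns
```

In a language with fixed-width integers the explicit bound check would be replaced by a checked multiplication that reports overflow.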
9. Worked Examples
9.1 Point anchor, full round-trip
A Discrete Event Track records a button-press at t = 152.481 s after Genesis origin.
- Tick value (ns): 152.481 × 10⁹ = 152_481_000_000
- In hex: 152_481_000_000 = 0x2380936A40 (10 hex digits)
- Zero-padded to 16 chars (the address-segment form): 0000002380936a40
The CBOR encoding inside an Index Page is just the integer:
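As an illustration, the bytes of that integer can be produced with a small encoder. This sketch assumes the canonical encoding referenced in §5 follows the shortest-form integer rule of RFC 8949; the function name cbor_uint is hypothetical.

```python
def cbor_uint(n: int) -> bytes:
    """Encode n as a CBOR unsigned integer (major type 0), shortest form."""
    if not 0 <= n < (1 << 64):
        raise ValueError("value outside the unsigned 64-bit range")
    if n < 24:
        return bytes([n])                      # value stored in the initial byte
    if n < 1 << 8:
        return b"\x18" + n.to_bytes(1, "big")  # 1-byte argument
    if n < 1 << 16:
        return b"\x19" + n.to_bytes(2, "big")  # 2-byte argument
    if n < 1 << 32:
        return b"\x1a" + n.to_bytes(4, "big")  # 4-byte argument
    return b"\x1b" + n.to_bytes(8, "big")      # 8-byte argument


encoded = cbor_uint(152_481_000_000)
encoded_hex = encoded.hex()
```

The value exceeds 2³², so it takes the 9-byte form: one initial byte plus an 8-byte big-endian argument.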
9.2 Interval anchor in CBOR
A scene-boundary Event spans t = [60.0 s, 60.1 s):
In the address (when this Event is placed): only t_start = 60_000_000_000 becomes the time-bucket key; the t_end lives in the Track Object's object_index.
9.3 Time-bucket placement across a boundary
Per 0002 §6.3.1, an Item that spans a bucket boundary is placed by t_start. With bucket-duration = 60 s and an interval anchor [59.9 s, 60.1 s):
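The placement arithmetic for this interval, sketched in Python with illustrative variable names:

```python
NS_PER_S = 1_000_000_000
bucket_duration = 60 * NS_PER_S  # bucket=60s

# Interval [59.9 s, 60.1 s), written as integer ns counts.
t_start = 59_900_000_000
t_end = 60_100_000_000

# Placement uses t_start only.
placement_bucket = t_start // bucket_duration

# Bucket containing the last nanosecond the half-open interval covers.
last_touched_bucket = (t_end - 1) // bucket_duration

segment = format(placement_bucket, "016x")
```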
Even though the Item's t_end extends into bucket 1, its placement segment is 0000000000000000. Time-range queries consult the Track Object's object_index (which records t_end = 60_100_000_000) and find this Item correctly via interval-overlap.
9.4 Genesis with absolute back-reference
A camera Genesis at start of recording, 2026-05-06T09:00:00.000000000Z:
A query for "what's at t = 152.481 s" computes the address segment 0000002380936a40 and prefix-lists. A query for "what's at wall-clock 2026-05-06T09:02:32.481Z" first reads the Genesis (1 small GET, cached), then performs the wall-clock conversion entirely in integer arithmetic (per §4.1):
All operands are i64; no f64 step appears anywhere. The result 152_481_000_000 is then encoded to 0000002380936a40 and the query proceeds identically to the Genesis-relative case.
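The conversion described above can be sketched in Python. The helper utc_to_unix_ns is an assumption for illustration; it relies on calendar.timegm, which returns an integer count of seconds for a UTC time tuple.

```python
import calendar

NS_PER_S = 1_000_000_000


def utc_to_unix_ns(year, month, day, hour, minute, second, nanosecond=0) -> int:
    """UTC calendar fields -> Unix ns, using integer arithmetic only."""
    seconds = calendar.timegm((year, month, day, hour, minute, second))
    return seconds * NS_PER_S + nanosecond


# Genesis origin: 2026-05-06T09:00:00.000000000Z
origin_unix_ns = utc_to_unix_ns(2026, 5, 6, 9, 0, 0)

# Wall-clock query: 2026-05-06T09:02:32.481Z
query_unix_ns = utc_to_unix_ns(2026, 5, 6, 9, 2, 32, 481_000_000)

t_relative = query_unix_ns - origin_unix_ns  # integer subtraction
segment = format(t_relative, "016x")         # address-segment form
```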
10. Billion-Scale Considerations
DreamDB's primary proof point is a billion-scale retrieval benchmark. Time-encoding choices at scale are not merely aesthetic — at 1B+ items, small per-anchor costs compound into GB-scale storage and bandwidth. This section gathers the time-encoding implications writers must understand to operate at that scale.
10.1 Per-anchor encoding cost
A Fragment-track Index Page leaf entry, sized concretely:
| Field | Plain CBOR uint | If dreamdb.tag-wrapped |
|---|---|---|
| t_start | ~9 B | ~12 B |
| t_end | ~9 B | ~12 B |
| byte_size | ~5 B | ~5 B |
| fragment_address | ~37 B | ~37 B |
| Map overhead | ~5 B | ~5 B |
| Per entry | ~65 B | ~71 B |
At 1B fragments → ~65 GB vs. ~71 GB total index storage. The 6 GB saved by encoding time anchors as plain CBOR ints inside schema-typed DreamDB fields (per §5) is real I/O and storage cost across millions of Index Pages. This is why §5 reserves the dreamdb.tag for ambiguous foreign-CBOR contexts and uses plain ints in the canonical DreamDB schema encoding.
A future deferred optimization (OQ-20): delta encoding of time anchors within Index Page leaves. Most fragments are sequentially ordered, so t_start_delta_from_page_t_min collapses to ~3-byte CBOR varints, saving an additional ~12 GB on a 1B-fragment track. Deferred to 0007 so the Index Page byte layout can pin the scheme alongside the rest of the Object format.
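The sizing figures in this subsection follow from simple integer arithmetic, reproduced here as a Python sketch. The approximate per-field byte counts are taken from the table above; the variable names are illustrative.

```python
FRAGMENTS = 1_000_000_000
GB = 10**9  # decimal gigabytes

# Approximate per-field sizes (bytes) for a plain-uint leaf entry.
plain_fields = {
    "t_start": 9,
    "t_end": 9,
    "byte_size": 5,
    "fragment_address": 37,
    "map_overhead": 5,
}
TAG_OVERHEAD = 3    # extra bytes per tag-wrapped anchor (12 B vs. 9 B)
DELTA_ANCHOR = 3    # approximate bytes per delta-encoded anchor

plain_entry = sum(plain_fields.values())
tagged_entry = plain_entry + 2 * TAG_OVERHEAD

plain_total_gb = plain_entry * FRAGMENTS // GB
tagged_total_gb = tagged_entry * FRAGMENTS // GB
saved_by_plain_gb = tagged_total_gb - plain_total_gb

# Delta encoding shrinks both anchors from ~9 B to ~3 B each.
saved_by_delta_gb = 2 * (plain_fields["t_start"] - DELTA_ANCHOR) * FRAGMENTS // GB
```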
10.2 Bucket-duration sizing
At 1B items, the choice of bucket-duration determines bucket-Object count on the backend. Concrete table for video at 60 fps over 10 years:
| bucket= | Total buckets per modality | Backend behavior |
|---|---|---|
| 1ms | 315B | Fatal: list-prefix unusable, request fees lethal |
| 100ms | 3.2B | Still unworkable |
| 1s | 315M | Borderline; list-prefix slow |
| 10s | 32M | Workable |
| 60s | 5.3M | Comfortable |
| 1h | 88K | Great for archival/cold queries |
| 1d | 3.7K | Too coarse for typical latency targets |
Writers SHOULD target bucket counts in the 10⁴ – 10⁷ range per modality. This keeps list-prefix latencies bounded, request fees tolerable, and Index Page traversals shallow.
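The bucket counts in the table can be reproduced with integer arithmetic. This Python sketch assumes a span of ten 365-day years; the names bucket_count, counts, and in_target are illustrative.

```python
NS_PER_S = 1_000_000_000
SPAN_NS = 10 * 365 * 86_400 * NS_PER_S  # ten 365-day years

BUCKET_NS = {
    "1ms": 1_000_000,
    "100ms": 100_000_000,
    "1s": NS_PER_S,
    "10s": 10 * NS_PER_S,
    "60s": 60 * NS_PER_S,
    "1h": 3_600 * NS_PER_S,
    "1d": 86_400 * NS_PER_S,
}


def bucket_count(span_ns: int, bucket_ns: int) -> int:
    """Number of buckets needed to cover span_ns (integer ceiling division)."""
    return -(-span_ns // bucket_ns)


counts = {label: bucket_count(SPAN_NS, ns) for label, ns in BUCKET_NS.items()}

# Which settings fall inside the recommended 10**4 .. 10**7 range.
in_target = {label: 10**4 <= n <= 10**7 for label, n in counts.items()}
```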
10.3 Bucket-duration ≠ Item duration
A common error at scale: setting bucket=2s because Fragments are 2 s long. Result: every Fragment falls in its own bucket. The bucket structure adds zero aggregation, but the address segment overhead is paid in full.
The SHOULD from 0002 §6.3.1 (max(item-duration) ≤ bucket-duration) is a floor, not a target. At billion-scale, target 10–100× that ratio so each bucket holds tens to hundreds of Fragments. Example: 2 s Fragments with bucket=120s → ~60 Fragments per bucket → meaningful aggregation, sane bucket count.
10.4 Genesis is one cached fetch — even at billion-scale
The Genesis Object is small (~100 bytes) and fetched once per Timeline per session. Caching it in the SDK eliminates Genesis as a hot-path concern, regardless of how many Items exist on the Timeline. No scaling concern; mentioned here only to head off unnecessary worry.
10.5 Integer arithmetic discipline matters more at scale
§4.1's no-f64 rule has a sharper edge at billion-scale: a 250 ns f64 quantization error compounds across 1B comparisons, producing visible drift in interval-overlap tests and bucket-placement decisions. Implementations targeting the billion-scale benchmark MUST audit their time arithmetic paths for f64 contamination — there is zero margin for "this case is small enough not to matter."
11. Comparison and Ordering Rules
- Two point anchors compare by their integer value.
- Two interval anchors compare lexicographically on
(t_start, t_end). - A point at
tand an interval[a, b)overlap iffa ≤ t < b. Two intervals[a, b)and[c, d)overlap iffa < d ∧ c < b. - An "all of time" anchor (Constants) overlaps every other anchor on the same Timeline.
These rules are protocol-level; they govern how time-range queries (0006) and bucket-placement decisions (0002 §6.3.1) interpret anchors.
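The point and interval rules can be expressed directly as integer predicates. In this Python sketch, intervals are represented as (t_start, t_end) tuples of integer ns; that representation and the function names are assumptions for illustration.

```python
from typing import List, Tuple

Interval = Tuple[int, int]  # half-open [t_start, t_end), integer ns


def point_in_interval(t: int, interval: Interval) -> bool:
    """A point t overlaps [a, b) iff a <= t < b."""
    a, b = interval
    return a <= t < b


def intervals_overlap(x: Interval, y: Interval) -> bool:
    """[a, b) and [c, d) overlap iff a < d and c < b."""
    a, b = x
    c, d = y
    return a < d and c < b


def sort_intervals(intervals: List[Interval]) -> List[Interval]:
    """Order intervals lexicographically on (t_start, t_end)."""
    return sorted(intervals)  # Python tuples compare lexicographically


scene = (60_000_000_000, 60_100_000_000)  # [60.0 s, 60.1 s)
```

Because intervals are half-open, two intervals that merely touch at an endpoint do not overlap.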
12. Out of Scope for this Document
- Wall-clock-skew handling. If a writer's system clock is incorrect, items may be tagged with anchors that don't match wall-clock reality. DreamDB does not detect or correct this. The Genesis origin field documents the writer's claim; readers may compare it against trusted sources externally.
- Leap seconds. Unix time does not count leap seconds, and whether a writer's clock steps or smears across one depends on its NTP setup. DreamDB inherits whatever the writer's clock provided. If a stricter time domain is needed, a future spec MAY add a "tai" or "utc-no-leap" resolution mode.
- Calendar units. Bucket-duration suffixes m, h, d are constant SI multiples of seconds, not "calendar month" or "civil day with daylight saving." Calendar arithmetic is an application concern.
- Multi-Timeline alignment. Per 0001 §11, expressing "two Timelines describe the same physical event" requires a higher-level alignment artifact. Out of scope for v0.
13. Open Questions Surfaced by This Document
- OQ-17 (→ 0009 §3.2): Concrete CBOR tag value for time anchors. Resolved alongside OQ-11: single tag dreamdb.tag = 65521 for foreign-CBOR contexts; NOT used inside DreamDB schemas (where time anchors are plain CBOR uints).
- OQ-18 (→ future spec): Should v0.1 add coarser resolutions ("ms", "s")? Currently the answer is "no, ns is enough"; revisit if measurements show storage waste from large ns integers in slow-event modalities.
- OQ-19 (→ future spec): Should "before-Genesis" (negative) anchors be supported via a bias scheme? Currently out of scope for v0; revisit if real workloads need it.
- OQ-20 (→ 0007 §7.3): Index Page time-anchor delta encoding. Resolved: per-leaf-entry deltas (t_start - t_min, duration); full u64 anchors at page header (t_min, t_max).
- OQ-21 (→ 0007 §7.4): Byte-size delta encoding. Resolved as "no in v0": byte_size stored as-is. Encoding complexity not justified relative to time-anchor delta savings; v0.1 may revisit.
Next: 0004-spatial-indexing.md — the hardest doc. Defines the spatial-key derivation algorithm (LSH? PQ? Learned hashing?) that turns a vector into an N-bit string for the spatial-key segment, with quantified locality guarantees that make the 1B-scale benchmark feasible.