TypeScript SDK Reference
The @dreamlake/dreamdb package provides a TypeScript-first interface to the DreamDB protocol. It mirrors the Python SDK's design while using idiomatic TypeScript patterns (async iterators, typed generics, tree-shakeable exports).
Core concepts
| Concept | Description |
|---|---|
| Schema | Declares the fields (image, embedding, scalar, video, audio) a dataset contains. Chainable builder API. |
| Dataset | A versioned collection of multimodal records anchored to a shared timeline. |
| Space | A resolved view of a dataset's Manifest and Tracks, ready for reads. |
| Backend | An S3-compatible HTTP endpoint where Objects are stored (MinIO, AWS S3, R2, etc.). |
Creating a Schema
A Schema declares the shape of every record in a dataset. Field methods are chainable, so a schema can be declared in a single expression.
Schema field methods
| Method | Description |
|---|---|
| addImage(name, opts?) | Image field. opts.mime defaults to "jpeg". |
| addVideo(name, opts?) | Video field. opts.mime defaults to "mp4". |
| addAudio(name, opts?) | Audio field. opts.mime defaults to "wav". |
| addEmbedding(name, opts) | Vector embedding. Requires dim. Optional algorithm, lshBits, compressor, spatialIndex, rerank. |
| addScalarCategorical(name) | Categorical string scalar (e.g. labels, splits). |
| addScalarBool(name) | Boolean scalar. |
| addScalarInt(name) | 64-bit integer scalar. |
| addScalarFloat(name) | 64-bit float scalar. |
| addScalarString(name) | Free-form string scalar. |
| addScalarTimestamp(name) | Nanosecond-precision timestamp scalar. |
Every field accepts an optional required parameter (default true).
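To make the chaining pattern and the documented defaults concrete, here is a minimal, self-contained sketch of a builder in the same style. SchemaSketch and FieldSpec are hypothetical stand-ins written for this example, not the package's classes, and passing required inside an options object is an assumption; the reference above does not say how the parameter is supplied.

```typescript
// Illustrative stand-in for a chainable schema builder; NOT the SDK's implementation.
type FieldKind = "image" | "embedding" | "scalar";

interface FieldSpec {
  name: string;
  kind: FieldKind;
  required: boolean;
  mime?: string;
  dim?: number;
  scalarType?: string;
}

class SchemaSketch {
  readonly fields: FieldSpec[] = [];

  // Each method records one field and returns `this`, which is what enables chaining.
  addImage(name: string, opts: { mime?: string; required?: boolean } = {}): this {
    this.fields.push({
      name,
      kind: "image",
      mime: opts.mime ?? "jpeg", // documented default
      required: opts.required ?? true, // documented default
    });
    return this;
  }

  addEmbedding(name: string, opts: { dim: number; required?: boolean }): this {
    if (!Number.isInteger(opts.dim) || opts.dim <= 0) {
      throw new Error(`embedding "${name}" needs a positive integer dim`);
    }
    this.fields.push({ name, kind: "embedding", dim: opts.dim, required: opts.required ?? true });
    return this;
  }

  // Assumption: `required` is passed in an options object here.
  addScalarCategorical(name: string, opts: { required?: boolean } = {}): this {
    this.fields.push({
      name,
      kind: "scalar",
      scalarType: "categorical",
      required: opts.required ?? true,
    });
    return this;
  }
}

// Field names are made up for the example.
const schema = new SchemaSketch()
  .addImage("frame")
  .addEmbedding("clip", { dim: 512 })
  .addScalarCategorical("split", { required: false });
```

Returning this from every method keeps the builder type intact through the chain, so a subclass that adds further field kinds would still chain correctly.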
Opening and creating datasets
Appending data
Records are plain objects keyed by field name. Embedding values are number[] or Float32Array.
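Because an embedding may arrive as either number[] or Float32Array, code that prepares records often normalizes to one representation first. The helper below is a hypothetical sketch of that step; toFloat32 is not part of the SDK, and the length check assumes that record embeddings must match the dim declared in the schema.

```typescript
// Hypothetical helper: coerce an embedding value to Float32Array and check its length.
type EmbeddingValue = number[] | Float32Array;

function toFloat32(value: EmbeddingValue, dim: number): Float32Array {
  const vec = value instanceof Float32Array ? value : Float32Array.from(value);
  if (vec.length !== dim) {
    throw new Error(`expected ${dim} dimensions, got ${vec.length}`);
  }
  return vec;
}

// A record is a plain object keyed by field name (field names here are made up).
const record = { split: "train", clip: [0.1, 0.2, 0.3] };
const clip = toFloat32(record.clip, 3);
```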
Caller-specified time anchors
By default, each record receives a nanosecond timestamp as its anchor. To supply explicit anchors (e.g. when replaying a log file), include the reserved _anchor key in each record.
Within a single appendMany call, either every record must carry _anchor or none may -- mixing is rejected.
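The all-or-none rule can be checked on the client before a batch is sent. The function below is a sketch of such a pre-flight check under that rule; anchorsAreExplicit is a hypothetical name, and the numeric anchor values are placeholders, since the anchor's concrete type is not specified here.

```typescript
type RecordLike = Record<string, unknown>;

// Returns true if every record carries `_anchor`, false if none does,
// and throws if the batch mixes the two.
function anchorsAreExplicit(records: RecordLike[]): boolean {
  const withAnchor = records.filter((r) => "_anchor" in r).length;
  if (withAnchor !== 0 && withAnchor !== records.length) {
    throw new Error(`mixed batch: ${withAnchor} of ${records.length} records carry _anchor`);
  }
  return withAnchor > 0;
}

// Placeholder anchors and a made-up scalar field.
const replayed: RecordLike[] = [
  { _anchor: 1000, speed: 3.2 },
  { _anchor: 2000, speed: 3.4 },
];
const explicit = anchorsAreExplicit(replayed); // true: every record carries _anchor
```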
Vector queries
iterVector returns the top-K nearest neighbors to a query vector, optionally filtered by scalar predicates.
Parameters
| Parameter | Type | Description |
|---|---|---|
| field | string | Name of the embedding field in the schema. |
| query | number[] | Query vector (must match the field's dim). |
| topK | number | Total results to return across all batches. |
| batchSize | number | Rows per batch (default 64). |
| whereEq | Record<string, string \| number \| boolean> | Restrict results to records matching these scalar values. |
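The interaction between topK, batchSize, and whereEq is easiest to see in a brute-force model. The sketch below is not how the SDK searches (the schema options suggest an index such as LSH); it is an in-memory stand-in that filters by equality, ranks by squared Euclidean distance (an assumed metric), keeps topK hits in total, and yields them batchSize at a time. A synchronous generator is used for brevity; Row, Hit, and iterVectorSketch are hypothetical names.

```typescript
type Scalar = string | number | boolean;

interface Row {
  scalars: Record<string, Scalar>;
  vector: number[];
}

interface Hit {
  row: Row;
  distance: number;
}

// Squared Euclidean distance; the metric is an assumption made for this sketch.
function squaredL2(a: number[], b: number[]): number {
  let sum = 0;
  for (let i = 0; i < a.length; i++) {
    const d = a[i] - b[i];
    sum += d * d;
  }
  return sum;
}

function* iterVectorSketch(
  rows: Row[],
  params: { query: number[]; topK: number; batchSize?: number; whereEq?: Record<string, Scalar> },
): Generator<Hit[]> {
  const { query, topK, batchSize = 64, whereEq = {} } = params;
  const hits = rows
    // Keep only rows whose scalars equal every whereEq entry.
    .filter((row) => Object.entries(whereEq).every(([k, v]) => row.scalars[k] === v))
    .map((row) => {
      if (row.vector.length !== query.length) throw new Error("query dim mismatch");
      return { row, distance: squaredL2(query, row.vector) };
    })
    .sort((a, b) => a.distance - b.distance)
    .slice(0, topK); // topK bounds the total, not the per-batch count
  for (let i = 0; i < hits.length; i += batchSize) {
    yield hits.slice(i, i + batchSize);
  }
}

const rows: Row[] = [
  { scalars: { split: "train" }, vector: [0, 0] },
  { scalars: { split: "train" }, vector: [1, 0] },
  { scalars: { split: "val" }, vector: [0.1, 0] },
  { scalars: { split: "train" }, vector: [3, 0] },
];

// Three "train" rows survive the filter; with batchSize 2 they arrive as batches of 2 and 1.
const batches = Array.from(
  iterVectorSketch(rows, { query: [0, 0], topK: 3, batchSize: 2, whereEq: { split: "train" } }),
);
```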
Streaming iteration
For full-dataset scans (training, export, analytics), use iterStream to iterate all records with bounded memory.
Unlike eager methods that materialize the entire result set, iterStream yields one batch at a time. This is essential for datasets with millions of records.
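The bounded-memory behavior follows from the async-iterator pattern itself: the consumer pulls one batch, processes it, and only then requests the next. The sketch below shows the pattern with a hypothetical in-memory generator, iterStreamSketch, standing in for iterStream; the real method's parameters and record shape are not shown in this reference.

```typescript
// Stand-in source: produces `total` record ids lazily, one batch at a time.
async function* iterStreamSketch(total: number, batchSize: number): AsyncGenerator<number[]> {
  for (let start = 0; start < total; start += batchSize) {
    const end = Math.min(start + batchSize, total);
    // Only this batch is allocated; earlier batches can be garbage-collected.
    yield Array.from({ length: end - start }, (_, i) => start + i);
  }
}

// Consumes the stream with `for await`, holding one batch in memory at a time.
async function countRecords(
  total: number,
  batchSize: number,
): Promise<{ batches: number; records: number }> {
  let batches = 0;
  let records = 0;
  for await (const batch of iterStreamSketch(total, batchSize)) {
    batches += 1;
    records += batch.length;
  }
  return { batches, records };
}
```

Calling countRecords(10, 4) walks three batches (4, 4, and 2 records) without ever holding all ten records at once.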
Snapshots
Snapshots pin a named label to the current dataset state. They are immutable -- subsequent writes to the dataset do not affect the snapshot.
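The immutability guarantee can be pictured as two kinds of pointers: a main ref that moves on every write, and snapshot labels that never move once set. The toy model below illustrates that idea only; RefStoreSketch and its integer state ids are hypothetical, and rejecting a reused label is an assumption about how immutability might be enforced.

```typescript
// Toy model: "main" advances on every write; snapshot labels stay where they were pinned.
class RefStoreSketch {
  private main = 0; // id of the current dataset state (hypothetical integer id)
  private snapshots = new Map<string, number>();

  write(): number {
    this.main += 1; // every write produces a new state; older states are untouched
    return this.main;
  }

  snapshot(label: string): number {
    // Assumption: an existing label cannot be re-pointed.
    if (this.snapshots.has(label)) throw new Error(`snapshot "${label}" already exists`);
    this.snapshots.set(label, this.main);
    return this.main;
  }

  resolve(ref: string): number {
    if (ref === "main") return this.main;
    const id = this.snapshots.get(ref);
    if (id === undefined) throw new Error(`unknown ref "${ref}"`);
    return id;
  }
}

const refs = new RefStoreSketch();
refs.write(); // state 1
refs.snapshot("v1"); // pins "v1" to state 1
refs.write(); // state 2; "v1" still resolves to state 1
```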
Branches and merging
Branches allow parallel writes that are later merged back into the main ref.
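Whether a merge can simply advance the ref, rather than reconcile two histories, comes down to ancestry: the branch head must descend from the current head of main. The sketch below checks that condition on a hypothetical parent map; ParentMap, canFastForward, and pickStrategy are illustrative names, and falling back to a reconciling merge when the check fails is one possible policy, not documented SDK behavior.

```typescript
// Hypothetical manifest graph: each manifest id maps to the ids of its parents.
type ParentMap = Record<string, string[]>;

// True when `target` is reachable from `source` by following parents,
// i.e. `source` is a strict descendant of `target`.
function canFastForward(parents: ParentMap, target: string, source: string): boolean {
  if (source === target) return false; // strict: a ref is not its own descendant
  const stack = [...(parents[source] ?? [])];
  const seen = new Set<string>();
  while (stack.length > 0) {
    const id = stack.pop() as string;
    if (id === target) return true;
    if (seen.has(id)) continue;
    seen.add(id);
    stack.push(...(parents[id] ?? []));
  }
  return false;
}

function pickStrategy(
  parents: ParentMap,
  target: string,
  source: string,
): "fast-forward" | "union-tracks" {
  return canFastForward(parents, target, source) ? "fast-forward" : "union-tracks";
}

// History: a <- b <- c, with d branching directly from a.
const parents: ParentMap = { a: [], b: ["a"], c: ["b"], d: ["a"] };
const linear = pickStrategy(parents, "b", "c"); // "fast-forward": c descends from b
const diverged = pickStrategy(parents, "b", "d"); // "union-tracks": d does not descend from b
```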
Merge strategies
| Strategy | Description |
|---|---|
| "fast-forward" | Advance the ref pointer without a new Manifest. Only works when the source is a strict descendant. |
| "union-tracks" | Three-way fused merge with per-cell reconciliation. Writes a multi-parent Manifest. |
Deletion (tombstones)
DreamDB supports GDPR-style deletion. Tombstoned anchors are suppressed on all subsequent reads. Storage reclamation is a separate operator pass.
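The split between read-time suppression and later storage reclamation can be sketched as a filter over stored records. In the example below, Anchored and visibleRecords are hypothetical names and the numeric anchors are placeholders; the point is only that tombstoning changes what reads return while the stored data stays in place until a separate pass removes it.

```typescript
interface Anchored {
  anchor: number; // placeholder anchor type for this sketch
}

// Read path: hide every record whose anchor has been tombstoned.
function visibleRecords<T extends Anchored>(stored: T[], tombstones: Set<number>): T[] {
  return stored.filter((r) => !tombstones.has(r.anchor));
}

const stored = [
  { anchor: 1, label: "cat" },
  { anchor: 2, label: "dog" },
  { anchor: 3, label: "bird" },
];
const tombstones = new Set<number>([2]);

const readable = visibleRecords(stored, tombstones); // anchors 1 and 3
// `stored` still holds all three records until a reclamation pass runs.
```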
Inspection
Compaction
After many appends, bucket fragmentation can slow queries. compact merges fragments back to one bucket per cell.
Compaction is read-online (queries keep hitting the old Manifest until the final atomic ref update), idempotent, and safe to run in production.
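The three properties above (readers keep using the old Manifest, a single atomic switch, idempotence) fall out naturally if compaction builds a new Manifest on the side and swaps one pointer at the end. The sketch below models that with in-memory objects; Fragment, Manifest, and compactSketch are hypothetical shapes chosen for the example, not the on-disk format.

```typescript
interface Fragment {
  cell: string;
  rows: number[];
}

interface Manifest {
  id: number;
  buckets: Fragment[];
}

// Merge all fragments that share a cell into one bucket per cell.
// The input Manifest is never mutated, so readers holding it are unaffected.
function compactSketch(current: Manifest): Manifest {
  const byCell = new Map<string, number[]>();
  for (const frag of current.buckets) {
    const rows = byCell.get(frag.cell) ?? [];
    byCell.set(frag.cell, rows.concat(frag.rows));
  }
  const buckets = Array.from(byCell, ([cell, rows]) => ({ cell, rows }));
  // Already one bucket per cell: return the input as-is, which makes the pass idempotent.
  if (buckets.length === current.buckets.length) return current;
  return { id: current.id + 1, buckets };
}

let mainRef: Manifest = {
  id: 1,
  buckets: [
    { cell: "t0", rows: [1, 2] },
    { cell: "t0", rows: [3] },
    { cell: "t1", rows: [4] },
  ],
};
const beforeCompaction = mainRef;

// The ref assignment is the only visible change; it stands in for the atomic ref update.
mainRef = compactSketch(mainRef);
```

After the swap, mainRef holds two buckets (t0 with rows 1, 2, 3 and t1 with row 4), while beforeCompaction still describes the original three fragments. Running compactSketch on the result returns the same Manifest unchanged.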
Browser usage
In the browser, use DreamDBSpace to resolve a Space URI and read samples directly from an S3-compatible backend -- no application server required.
See the Browser Demo for a live example.