Spec 0018 — Multi-Tenant Operation

Status: Draft (Phase 4 design).
Depends on: spec/0001, spec/0005, spec/0006, spec/0008, spec/0012.

Motivation: A single-tenant DreamDB deployment works because the operator implicitly owns all storage budgets, query capacity, and access control. Any production SaaS hosting multiple customers, or any internal platform serving multiple teams, needs explicit primitives: per-tenant resource isolation (quota + rate limits), per-tenant identity (so a token from tenant A can't read tenant B), per-tenant cost accounting (so the billing layer knows who consumed what), and noisy-neighbor protection (so one tenant's pathological query can't starve every other tenant). Without these primitives, every multi-tenant deployment reinvents them inconsistently. spec/0018 pins the contract.

1. Purpose

The Space concept (spec/0001 §1) is the natural tenant boundary — one Space ≈ one tenant. The protocol already provides cryptographic isolation (different Genesis Objects ⇒ different Timeline IDs ⇒ no path collisions). What's missing is operational isolation: when tenant A's worker runs a 10B-vector scan inside their Space, the backend must not let it consume so much capacity that tenant B's lightweight query starves.

By the end of this document the following are concrete:

The Space-config quota fields: per-Space storage cap, queries/sec, concurrent streams, GET/PUT bandwidth.
The tenant_id field in capability tokens (extends spec/0012 §5.2).
The TenantUsageBatch ObjectKind: per-Space rolling usage statistics published by the backend; the cost-accounting source of truth.
The rate-limit response contract: when quota is exceeded, the backend returns HTTP 429 with Retry-After; SDKs handle this without conflating it with retryable transient failures.
The fair-share scheduling discipline: when multiple tenants share a single backend, query and ingest capacity MUST be distributed fairly (under contention, no single tenant captures more than 1 / N_active_tenants of capacity in steady state, i.e. 50% with two active tenants; see §5.2).
The cross-tenant federation contract: when federating, capability tokens MUST identify the SAME tenant on both ends (no implicit privilege escalation).

What stays defined elsewhere:

Per-Space cryptographic identity (Genesis Object) — spec/0001.
HTTP semantics — spec/0005.
Capability token base structure — spec/0012 §5.

What this document does NOT define:

Billing. Cost accounting produces TenantUsageBatch Objects; how operators convert them to invoices is operator-layer.
Authentication / identity provisioning. Who issues the Ed25519 keypair to a tenant is an operator concern (LDAP, OAuth, SSO).
Quota enforcement consistency model. Best-effort eventually-consistent quota tracking is the v0.X contract; strict transaction-bounded quota requires consensus and is out of scope.
Per-modality quotas. Storage budget is per-Space; per-modality sub-allocation is operator-layer.

2. The Space as the tenant boundary

A Space (spec/0001 §1) is one writer's universe — one Genesis Object, one or more Timelines, one Ref namespace. Operationally a Space is owned by exactly one tenant, identified by a stable tenant_id.

tenant_id is OPAQUE to the protocol — a UTF-8 string up to 256 bytes. Operators choose its format (UUID, email, account-id; example: acme-corp-prod). The protocol cares only that it's stable per Space and verifiable in capability tokens.

2.1 Space-config quota fields

The Manifest's Space-config sub-Object (spec/0002 §7.2.0) gains an OPTIONAL quotas field:

"space_config": {
  …,
  "quotas": {
    "tenant_id":          "<utf-8 string>",
    "max_storage_bytes":  <u64 | null>,                  ;; null ⇒ unbounded
    "max_queries_per_sec": <u32 | null>,
    "max_writes_per_sec":  <u32 | null>,
    "max_concurrent_streams": <u32 | null>,
    "max_get_bytes_per_sec": <u64 | null>,
    "max_put_bytes_per_sec": <u64 | null>,
    "max_concurrent_reencodes": <u32 | null>,            ;; spec/0017 jobs
  }
}

Absent quotas ⇒ unbounded (single-tenant deployment behavior).

The quotas live in the Space-config because they're part of the Space's identity from the backend's perspective. An operator changing a tenant's quota publishes a new Manifest with updated Space-config and updates refs/main via CAS — same machinery as any other registry change.
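The null-means-unbounded convention of the quotas field can be illustrated with a minimal Python sketch. The `Quotas` class, `parse_quotas`, and `within_limit` are hypothetical names introduced here, the Space-config is assumed to be already decoded into a plain dict, and ignoring unknown keys is an assumption rather than something the spec states.

```python
from dataclasses import dataclass, fields
from typing import Optional


@dataclass(frozen=True)
class Quotas:
    """Hypothetical in-memory view of the Space-config `quotas` field."""
    tenant_id: str
    max_storage_bytes: Optional[int] = None      # None mirrors null: unbounded
    max_queries_per_sec: Optional[int] = None
    max_writes_per_sec: Optional[int] = None
    max_concurrent_streams: Optional[int] = None
    max_get_bytes_per_sec: Optional[int] = None
    max_put_bytes_per_sec: Optional[int] = None
    max_concurrent_reencodes: Optional[int] = None


def parse_quotas(space_config: dict) -> Optional[Quotas]:
    """Return None when `quotas` is absent (single-tenant, unbounded)."""
    raw = space_config.get("quotas")
    if raw is None:
        return None
    if len(raw["tenant_id"].encode("utf-8")) > 256:
        raise ValueError("tenant_id exceeds 256 bytes")
    known = {f.name for f in fields(Quotas)}
    # Assumption: keys this sketch does not know about are ignored.
    return Quotas(**{k: v for k, v in raw.items() if k in known})


def within_limit(limit: Optional[int], requested: int) -> bool:
    """A null limit never rejects; otherwise compare against the cap."""
    return limit is None or requested <= limit
```

A backend would call `within_limit(q.max_storage_bytes, used + incoming)` before admitting a PUT; a limit of 0 (as used for offboarding in §6.2) rejects every non-empty request.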

2.2 Enforcement contract

When a request would exceed a quota:

Storage cap reached → PUT returns HTTP 507 Insufficient Storage. Backend includes a DreamDB-Quota: storage,used=<N>,limit=<M> header.
Rate limit reached → GET / PUT returns HTTP 429 Too Many Requests with Retry-After: <seconds>. Same DreamDB-Quota header naming the offending resource.

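SDKs MUST distinguish 429 (quota) from 5xx (transient backend failure). Note that 507 is a quota signal despite being a 5xx status: backoff-retry cannot clear it while the storage cap is still reached.

429 → surface as QuotaExceeded error; the caller decides whether to wait and retry.
5xx other than 507 → spec/0005 exponential-backoff retry path; the SDK handles transparently.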
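As a rough illustration of the SDK-side split, the sketch below classifies a response by status code. `classify`, `parse_quota_header`, and the `QuotaExceeded` fields are hypothetical; treating 507 as a quota error rather than a transient one, and the `used=`/`limit=` parameters on rate-limit responses, are assumptions extrapolated from the storage example above.

```python
from typing import Mapping, Optional


class QuotaExceeded(Exception):
    """Raised for quota responses so the caller decides whether to wait."""

    def __init__(self, status: int, resource: Optional[str], retry_after: Optional[int]):
        super().__init__(f"HTTP {status}: quota exceeded on {resource}")
        self.status = status
        self.resource = resource
        self.retry_after = retry_after


def parse_quota_header(value: str) -> dict:
    """Parse 'storage,used=10,limit=5' into a dict; first item is the resource."""
    parts = [p.strip() for p in value.split(",") if p.strip()]
    parsed: dict = {"resource": parts[0] if parts else None}
    for part in parts[1:]:
        key, _, raw = part.partition("=")
        parsed[key] = int(raw) if raw.isdigit() else raw
    return parsed


def classify(status: int, headers: Mapping[str, str]) -> str:
    """Return 'ok', 'retry', or 'error'; raise QuotaExceeded on 429/507."""
    h = {k.lower(): v for k, v in headers.items()}
    if status in (429, 507):
        quota = parse_quota_header(h.get("dreamdb-quota", ""))
        retry = h.get("retry-after", "")
        raise QuotaExceeded(status, quota["resource"],
                            int(retry) if retry.isdigit() else None)
    if 500 <= status <= 599:
        return "retry"   # hand off to the spec/0005 backoff path
    return "ok" if 200 <= status <= 299 else "error"
```

Raising instead of returning a value keeps quota errors out of any generic retry loop, which is the separation this section asks for.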

2.3 Quota measurement granularity

Token-bucket per (tenant_id, resource) is the recommended backend implementation. Bucket refill rate equals the quota; bucket depth allows brief bursts.

Refill rate: per the quota.
Bucket depth: typically 5–10× the per-second rate (absorbs a burst worth 5–10 seconds of quota).
No carryover beyond the bucket depth: tokens that would overflow a full bucket are discarded.

This is implementation-defined; the wire contract is just "exceed → 429 + Retry-After."
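A minimal token-bucket sketch in Python, assuming a monotonic clock and one bucket per (tenant_id, resource) key. `TokenBucket` and `bucket_for` are hypothetical names, and the 5-second default burst is only one point in the range suggested above.

```python
import math
import time
from typing import Dict, Optional, Tuple


class TokenBucket:
    def __init__(self, rate_per_sec: float, depth: float, now: Optional[float] = None):
        self.rate = rate_per_sec          # refill rate equals the quota
        self.depth = depth                # maximum burst size
        self.tokens = depth               # start full
        self.updated = time.monotonic() if now is None else now

    def try_acquire(self, cost: float = 1.0, now: Optional[float] = None) -> float:
        """Return 0.0 if admitted, else the seconds until `cost` tokens exist."""
        now = time.monotonic() if now is None else now
        elapsed = max(0.0, now - self.updated)
        # Refill, capped at depth: overflow is discarded, not carried over.
        self.tokens = min(self.depth, self.tokens + elapsed * self.rate)
        self.updated = now
        if self.tokens >= cost:
            self.tokens -= cost
            return 0.0
        if self.rate <= 0:
            return math.inf               # a zero quota never refills
        return (cost - self.tokens) / self.rate


def bucket_for(buckets: Dict[Tuple[str, str], TokenBucket], tenant_id: str,
               resource: str, rate: float, burst_seconds: float = 5.0,
               now: Optional[float] = None) -> TokenBucket:
    """Look up or create the bucket for one (tenant_id, resource) pair."""
    key = (tenant_id, resource)
    if key not in buckets:
        buckets[key] = TokenBucket(rate, rate * burst_seconds, now)
    return buckets[key]
```

A non-zero return value, rounded up to whole seconds, is a natural candidate for the Retry-After header on the 429 response.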

3. Tenant identity in capability tokens

spec/0012 §5.2 capability tokens already carry subject. spec/0018 extends them with a mandatory tenant_id field (parallel to subject; in practice the two usually hold the same value in a per-tenant deployment, but separating them allows multi-user-per-tenant scenarios).

;; Extended capability token
{
  "issuer":      <url>,
  "subject":     <bytes>,                 ;; opaque per-user identifier
  "tenant_id":   "<utf-8 string>",        ;; NEW; mandatory in multi-tenant deployments
  "scope":       "read" | "write" | "admin",
  "scope_path":  "<utf-8 string | null>", ;; OPTIONAL: restrict to a sub-tree
  "expires_at":  <u64>,
  "signature":   <bytes>,
}

3.1 Verification at the backend

Every authenticated request includes its capability token (header or query parameter; per spec/0012). The backend verifies:

1. Signature against the issuer's published pubkey (spec/0012 §5.2).
2. Not expired.
3. tenant_id matches the Space-config's tenant_id of the targeted Space.
4. scope_path, if set, covers the targeted address path.

Step 3 is the load-bearing isolation check. A request to a path under Space S whose Space-config tenant_id = "alice", presented with a token where tenant_id = "bob", MUST be rejected with HTTP 403 Forbidden.
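The four checks can be sketched as a single function. This is an illustration under assumptions, not the spec/0012 algorithm: signature verification is delegated to a caller-supplied `signature_ok` callable, `expires_at` and `now` are assumed to share one time unit, and scope_path coverage is assumed to mean a prefix match on whole path segments. Only the tenant mismatch has a status code fixed by this document (403).

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass(frozen=True)
class Token:
    issuer: str
    subject: bytes
    tenant_id: str
    scope: str
    scope_path: Optional[str]
    expires_at: int
    signature: bytes


def path_covered(scope_path: Optional[str], target: str) -> bool:
    """Assumed semantics: scope_path covers itself and everything beneath it."""
    if scope_path is None:
        return True
    prefix = scope_path.rstrip("/")
    return target == prefix or target.startswith(prefix + "/")


def authorize(token: Token, space_tenant_id: str, target_path: str, now: int,
              signature_ok: Callable[[Token], bool]) -> Optional[str]:
    """Return None when the request is allowed, else the name of the failed check."""
    if not signature_ok(token):
        return "bad_signature"
    if now >= token.expires_at:
        return "expired"
    if token.tenant_id != space_tenant_id:
        return "tenant_mismatch"   # the backend answers HTTP 403 Forbidden
    if not path_covered(token.scope_path, target_path):
        return "scope_path"
    return None
```

Matching on whole segments means a scope_path of `manifests/a` does not cover `manifests/ab`, which avoids an accidental widening of scope through shared string prefixes.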

3.2 Cross-tenant federation safety

When a federate verb (spec/0012 §4) crosses tenant boundaries, the source tenant's token MUST be accepted by the destination tenant's backend — which means the destination operator explicitly trusted the source operator's issuer pubkey at deployment time. The token's tenant_id field is preserved across hops:

A pulls from B: the token presented to B carries A's tenant_id. B verifies it has a federation agreement with A.
A pushes to B: the token presented to B carries A's tenant_id. B verifies the destination Space (in B) is the one A is authorized to write to.

There is no implicit privilege escalation. A capability token never gains scope across a federation hop.

4. The TenantUsageBatch Object

Cost accounting needs a protocol-level signal. spec/0018 defines a backend-emitted ObjectKind that summarizes per-Space resource consumption over a rolling window.

4.1 Address path

tenant-usage/<batch-hash>

(New top-level namespace, parallel to manifests/, refs/. The bucket is operator-defined — typically one bucket per backend, but a federation MAY have one per region.)

4.2 CBOR encoding

{
  "version":         1,
  "tenant_id":       "<utf-8 string>",
  "window_start":    <u64>,                    ;; Unix ns
  "window_end":      <u64>,                    ;; Unix ns
  "metrics": {
    "storage_bytes_avg":    <u64>,             ;; average over window
    "storage_bytes_peak":   <u64>,
    "get_count":            <u64>,
    "get_bytes":            <u64>,
    "put_count":            <u64>,
    "put_bytes":            <u64>,
    "list_count":           <u64>,
    "delete_count":         <u64>,
    "query_count":          <u64>,             ;; SDK-level Query/HybridQuery invocations
    "stream_seconds":       <u64>,             ;; aggregate concurrent-stream time
    "reencode_items":       <u64>,             ;; spec/0017 progress
  },
  "violations": [                              ;; OPTIONAL: 429 / 507 events
    { "code": "rate_limit_exceeded",
      "resource": "max_queries_per_sec",
      "count": <u32>,
      "first_at": <u64>,
      "last_at": <u64>
    },
    …
  ],
  "previous_batch":  <multihash | null>,       ;; chain for continuous billing series
}

4.3 Publication cadence

Backends emit TenantUsageBatch Objects on a schedule (recommended: every 1 hour or every 1 GB of activity, whichever comes first). The chain (previous_batch link) gives operators a content-addressed audit trail.
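The recommended cadence reduces to a two-condition trigger, sketched below. `should_emit` is a hypothetical helper; reading "1 GB" as 10^9 bytes and "activity" as GET plus PUT bytes since the last batch are both assumptions.

```python
HOUR_NS = 3_600 * 1_000_000_000   # window timestamps are Unix nanoseconds
GB = 1_000_000_000                # assumption: decimal gigabyte


def should_emit(window_start_ns: int, now_ns: int, activity_bytes: int,
                max_age_ns: int = HOUR_NS, max_bytes: int = GB) -> bool:
    """True once either the time or the byte threshold is reached."""
    return (now_ns - window_start_ns) >= max_age_ns or activity_bytes >= max_bytes
```

A busy tenant therefore produces more than 24 batches per day, which is worth keeping in mind when sizing retention.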

The most recent TenantUsageBatch per Space is exposed via a Ref:

tenant-usage-refs/<tenant_id>

The Ref points at the latest batch hash. Operators poll this Ref for billing extracts.

4.4 Trust model

TenantUsageBatch is produced by the backend operator. Tenants MAY verify their own data by sampling: each batch references its predecessor by hash, so an operator tampering with one batch would need to rewrite every later batch in the chain. For production billing trust, operators are expected to log batches to an independent audit system; cryptographic verifiability of every counter requires per-operation signatures (out of scope for v0.X).
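A tenant-side chain check might look like the following sketch. It is deliberately simplified: the real Objects are CBOR-encoded and addressed by multihash, whereas this stand-in hashes canonical JSON with SHA-256, and `batch_hash` and `verify_chain` are hypothetical names.

```python
import hashlib
import json
from typing import Dict, Optional


def batch_hash(batch: dict) -> str:
    """Stand-in content address: SHA-256 over canonical JSON (not CBOR/multihash)."""
    encoded = json.dumps(batch, sort_keys=True, separators=(",", ":")).encode("utf-8")
    return hashlib.sha256(encoded).hexdigest()


def verify_chain(store: Dict[str, dict], head: str) -> int:
    """Walk previous_batch links from `head`; return how many batches verified.

    Raises ValueError if a batch is missing or its content no longer
    matches the address it is stored under.
    """
    count = 0
    cursor: Optional[str] = head
    while cursor is not None:
        batch = store.get(cursor)
        if batch is None:
            raise ValueError(f"missing batch {cursor}")
        if batch_hash(batch) != cursor:
            raise ValueError(f"content does not match address {cursor}")
        count += 1
        cursor = batch["previous_batch"]
    return count
```

Starting from the hash held in tenant-usage-refs/<tenant_id>, a tenant that remembers an earlier head can detect any later rewrite, because the remembered hash would no longer appear in the walk.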

5. Fair-share scheduling

When a single backend hosts multiple tenants, the backend's scheduler MUST distribute capacity fairly. v0.X requires:

5.1 Per-tenant queues

The backend maintains a queue per tenant (or per (tenant, resource) pair). Requests are admitted from queues in a fair-share order — typically Weighted Fair Queueing or Deficit Round Robin.

5.2 Anti-monopoly invariant

In steady state, a tenant SHOULD NOT capture more than 1 / N_active_tenants of total capacity for any single resource, where N_active_tenants is the count of tenants with non-empty queues. This bounds noisy-neighbor impact.

(Exception: if other tenants are idle, a single active tenant MAY use up to its full quota. The "more than" bound applies only when contention exists.)
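Deficit Round Robin, one of the two disciplines named in §5.1, can be sketched in a few lines. The `DeficitRoundRobin` class and its method names are hypothetical, and every tenant gets the same quantum (unweighted), which is an assumption; with equal quanta each backlogged tenant is admitted the same cost per round, which yields the 1 / N_active_tenants share.

```python
from collections import deque
from typing import Any, Deque, Dict, List, Tuple


class DeficitRoundRobin:
    def __init__(self, quantum: int):
        self.quantum = quantum                       # cost credit added per visit
        self.queues: Dict[str, Deque[Tuple[Any, int]]] = {}
        self.deficit: Dict[str, int] = {}
        self.active: Deque[str] = deque()            # tenants with non-empty queues

    def enqueue(self, tenant_id: str, request: Any, cost: int) -> None:
        q = self.queues.setdefault(tenant_id, deque())
        if not q:                                    # tenant becomes active
            self.active.append(tenant_id)
            self.deficit[tenant_id] = 0
        q.append((request, cost))

    def next_round(self) -> List[Tuple[str, Any]]:
        """Visit each active tenant once; return the requests admitted this round."""
        admitted: List[Tuple[str, Any]] = []
        for _ in range(len(self.active)):
            tenant = self.active.popleft()
            q = self.queues[tenant]
            self.deficit[tenant] += self.quantum
            while q and q[0][1] <= self.deficit[tenant]:
                request, cost = q.popleft()
                self.deficit[tenant] -= cost
                admitted.append((tenant, request))
            if q:
                self.active.append(tenant)           # still backlogged
            else:
                self.deficit[tenant] = 0             # idle tenants bank no credit
        return admitted
```

When only one tenant is backlogged it is the sole entry in `active` and receives every round, matching the exception for idle neighbors.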

5.3 Pathological-query protection

A single DreamDB query that touches many Bucket Objects (e.g. a federated 100-shard scatter-gather) can consume disproportionate backend capacity. The backend MAY:

Limit fan-out concurrency per tenant (recommended default: 16 concurrent in-flight requests per tenant).
Throttle large-byte responses progressively as the tenant's bucket depletes.
Surface query-cost-estimate headers (DreamDB-Query-Cost: <estimated-units>) on responses, letting tenants self-throttle.

These are implementation-defined; the contract is just "no single query starves the rest."
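One way to cap fan-out concurrency per tenant is a semaphore keyed by tenant_id. The sketch below uses Python's asyncio; `FanOutLimiter` is a hypothetical name, and the `peak` counter exists only to make the bound observable.

```python
import asyncio
from collections import defaultdict
from typing import Awaitable, Callable, Dict, TypeVar

T = TypeVar("T")


class FanOutLimiter:
    def __init__(self, limit: int = 16):             # recommended default above
        self.limit = limit
        self._sems: Dict[str, asyncio.Semaphore] = {}
        self._in_flight: Dict[str, int] = defaultdict(int)
        self.peak: Dict[str, int] = defaultdict(int)

    def _sem(self, tenant_id: str) -> asyncio.Semaphore:
        if tenant_id not in self._sems:
            self._sems[tenant_id] = asyncio.Semaphore(self.limit)
        return self._sems[tenant_id]

    async def run(self, tenant_id: str, make_request: Callable[[], Awaitable[T]]) -> T:
        """Run one sub-request, waiting if the tenant already has `limit` in flight."""
        async with self._sem(tenant_id):
            self._in_flight[tenant_id] += 1
            self.peak[tenant_id] = max(self.peak[tenant_id], self._in_flight[tenant_id])
            try:
                return await make_request()
            finally:
                self._in_flight[tenant_id] -= 1
```

A 100-shard scatter-gather issued through `run` still completes, but never holds more than 16 backend requests open for that tenant at once; other tenants have their own semaphores and are unaffected.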

6. Tenant onboarding and offboarding

6.1 Onboarding

1. Operator generates Ed25519 keypair for the new tenant (or accepts tenant's pubkey).
2. Operator publishes a Manifest with Space-config carrying:
   - tenant_id: <new-tenant-id>
   - quotas: <starting quotas>
3. Operator issues a capability token with scope = "write", tenant_id matching.
4. Tenant SDK uses the token to start writing.

6.2 Offboarding

1. Operator publishes a Manifest with Space-config quotas all set to 0.
   → tenant can no longer write or query.
2. Optional: operator schedules a snapshot-rollup (spec/0008 §9.3) to consolidate
   the tenant's data into a final archival Manifest.
3. After a retention window (operator-defined; typically 30-90 days):
   - All tenant Refs are deleted.
   - GC reclaims all Objects that are no longer reachable from any Ref.
4. Final TenantUsageBatch is published for billing closure.

Offboarding is a clean operation in DreamDB because content-addressing makes the storage trivially GC-able once Refs are gone. No "delete user data" scan-and-purge across tables.

A user-level "delete my data" request requires the operator to:

1. Identify which Items in the tenant's Space contain the user's data.
2. Publish a Layer Track with a tombstone marking those Items.
3. After the tombstone propagates and any active sessions reach the new Manifest, schedule a snapshot-rollup that physically excludes the tombstoned Items from the new Manifest's index.
4. GC reclaims the excluded Items after the safety threshold.

This is operator-driven; the protocol provides the primitives. Per spec/0008, immutability + Layer composition gives the right semantics.

7. Conformance categories (per spec/0009 §8.6.4)

Category	Pass criterion	Coverage
`tenant.quota.storage-507.*`	Storage cap exceeded → HTTP 507 + correct `DreamDB-Quota` header	Multiple storage types
`tenant.quota.rate-429.*`	Rate limit exceeded → HTTP 429 + `Retry-After` header	Per-resource rate
`tenant.token.tenant-id-mismatch.*`	Token tenant_id ≠ Space tenant_id → HTTP 403	All scope levels
`tenant.token.cross-tenant-isolation.*`	Token from tenant A can never access tenant B's paths	Negative test
`tenant.usage-batch.publish-cadence.*`	Batches emitted on schedule; chain of `previous_batch` links unbroken	Multi-window scenario
`tenant.usage-batch.violations-recorded.*`	429/507 events surfaced in subsequent batch's `violations` array	Adversarial load
`tenant.fair-share.anti-monopoly.*`	One tenant cannot capture more than 1 / N_active_tenants of capacity (50% with two active tenants) when other tenants are active	Multi-tenant load test
`tenant.federation.cross-issuer.*`	Federation hop preserves tenant_id; no escalation	Cross-issuer scenarios
`tenant.offboarding.gc.*`	After quota=0 + retention window, all tenant Objects reclaimed	Standard GC + retention

8. Sizing and operational notes

8.1 Quota granularity

Default quotas for a "typical" tier:

Tier	Storage	Queries/sec	Writes/sec	Concurrent streams	GET bandwidth
Free	1 GiB	10	5	2	10 MB/s
Pro	100 GiB	100	50	16	100 MB/s
Enterprise	10 TiB	1000	500	128	1 GB/s
Custom	per-contract	…	…	…	…

These are operator suggestions, not protocol-mandated. The protocol specifies the wire contract; the values are deployment-policy.

8.2 TenantUsageBatch storage cost

For 1000 tenants, 1-hour cadence, 90-day retention:

24 × 90 = 2160 batches per tenant.
~2 KB per batch (comfortably covers the 11 metric integers of §4.2 with their keys, the tenant_id, and a few violations entries).
Total: 1000 × 2160 × 2 KB = ~4.3 GB.

Negligible at production scale.
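The estimate can be reproduced with a few lines of Python so operators can plug in their own numbers. `usage_batch_storage_bytes` is a hypothetical helper; 2 KB is taken as 2,000 bytes, and only the time-based cadence is modeled (byte-triggered batches would add to the total).

```python
def usage_batch_storage_bytes(tenants: int, cadence_hours: int,
                              retention_days: int, bytes_per_batch: int) -> int:
    """Total bytes retained for TenantUsageBatch Objects under a fixed cadence."""
    batches_per_tenant = (24 // cadence_hours) * retention_days
    return tenants * batches_per_tenant * bytes_per_batch


total = usage_batch_storage_bytes(tenants=1000, cadence_hours=1,
                                  retention_days=90, bytes_per_batch=2_000)
print(f"{total / 1e9:.2f} GB")  # 4.32 GB
```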

8.3 Latency overhead

Quota checks add ~1 ms per request (in-memory token-bucket check). Acceptable; comparable to per-request auth.

9. Out of scope

Cross-region tenant quotas. Each federation participant tracks its own per-tenant quota; a tenant exceeding global quota by spreading load across regions is an operator-policy concern.
Predictive quota. "Tenant will hit cap in 3 days" warnings — operator-layer analytics.
Resource bin-packing. Which backend a new tenant lands on — operator-layer placement.
Token revocation lists. Capability tokens expire by time; revocation requires short TTLs and re-issuance, not a revocation list. Operator decision.

10. Open questions

OQ-75 (→ this spec): Should TenantUsageBatch be signed by the backend's operator key? Currently unsigned (the content-hash chain is the integrity proof); a signature would let tenants verify that the operator hasn't rewritten history. Defer to first multi-tenant deployment.
OQ-76 (→ this spec): Multi-region quotas. If a tenant has 100 GB across 3 federated backends, is their effective quota 100 GB total or 100 GB per backend? Probably per-backend for operational simplicity; aggregation is a billing-layer concern.
OQ-77 (→ spec/0009): Conformance vectors for fair-share scheduling — synthetic load over N tenants asserting the anti-monopoly invariant. Block multi-tenant conformance on this.
OQ-78 (→ spec/0012): Capability token tenant_id becomes mandatory in multi-tenant deployments. spec/0012's existing token format should be amended to make tenant_id REQUIRED (not optional) when the issuer is a multi-tenant operator.

Next: spec/0019 — data-plane encryption (sketch). Regulated industries need encryption at rest with content-addressing-preserving properties. Last spec in the Phase-4 batch.