Two surfaces are documented here:

- The local spool format: what the SDK writes to `./.cirron/`. This is a public API, stable within a major SDK version, consumed by the Cirron ingestion worker and by any third-party tool.
- The platform wire schemas: what ends up in the Cirron database after ingestion. Useful when you're writing queries, building a custom consumer, or exporting to your own storage.
## Directory layout

```
./.cirron/
  spool/
    <created_ns>-<batch_id>.json   # one batch per file
  snapshots/
    <span_id>/
      weights.safetensors          # sampled / full mode only
      gradients.safetensors        # when gradients are non-None
```
- `<created_ns>`: wall-clock time the batch was sealed, in nanoseconds since the Unix epoch, zero-padded to 20 digits. Filenames sort lexicographically in chronological order; the flush thread relies on this for oldest-first eviction when the spool cap is exceeded.
- `<batch_id>`: 32-char lowercase hex (a UUID4 without dashes).
- Files are written via a `.json.tmp` → `os.replace()` handoff, so a reader that opens a `*.json` file always sees a complete batch (sketched below).
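A producer other than the SDK (say, a replay tool that re-seals batches) can reproduce the same handoff in a few lines. This is a minimal sketch, assuming POSIX rename semantics on the spool filesystem; the `seal_batch` helper is illustrative, not part of the SDK:

```python
import json
import os
import time
import uuid
from pathlib import Path

def seal_batch(spool: Path, batch: dict) -> Path:
    """Write a batch atomically: .json.tmp first, then os.replace()."""
    created_ns = time.time_ns()
    batch_id = uuid.uuid4().hex                 # 32-char lowercase hex
    name = f"{created_ns:020d}-{batch_id}"      # zero-padded so filenames sort chronologically
    tmp = spool / f"{name}.json.tmp"
    final = spool / f"{name}.json"
    tmp.write_text(json.dumps(batch))
    os.replace(tmp, final)                      # atomic on POSIX; readers never see a partial file
    return final
```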
## Batch JSON

```
{
  "schema_version": 1,
  "sdk_version": "0.x.y",
  "batch_id": "abcdef...",
  "created_ns": 1234567890000000000,
  "spans": [ ... ],
  "marks": [ ... ],
  "snapshots": [ ... ]
}
```
### spans[]

```
{
  "id": "hex32",
  "name": "epoch",
  "parent_id": "hex32 | null",
  "index": 0,
  "start_ns": 0,
  "end_ns": 0,
  "cpu_ns": null,
  "gpu_ns": null,
  "memory_peak_bytes": null,
  "thread_id": 140000000,
  "pid": 12345,
  "rank": 0,
  "attrs": { "key": "value" },
  "mark_ids": ["hex32", "..."]
}
```
`cpu_ns`, `gpu_ns`, and `memory_peak_bytes` default to null. `gpu_ns` is set by torch CUDA event pairs when profiling a CUDA forward/backward pass. `cpu_ns` and `memory_peak_bytes` are reserved and not populated today. `mark_ids` holds the IDs of every mark attached to this span.
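Since `parent_id` links spans into a tree, reconstructing the hierarchy from one batch is straightforward. A minimal sketch (the `walk` helper is illustrative; note that a span's parent may have been sealed in an earlier batch, so orphans are treated as roots):

```python
from collections import defaultdict

def walk(batch: dict) -> None:
    """Print one batch's span tree with wall-clock durations."""
    spans = {s["id"]: s for s in batch["spans"]}
    children = defaultdict(list)
    roots = []
    for s in batch["spans"]:
        parent = s["parent_id"]
        if parent is None or parent not in spans:
            roots.append(s)  # session root, or parent sealed in an earlier batch
        else:
            children[parent].append(s)

    def visit(span, depth=0):
        dur_ms = (span["end_ns"] - span["start_ns"]) / 1e6
        print(f'{"  " * depth}{span["name"]} {dur_ms:.2f} ms')
        for child in sorted(children[span["id"]], key=lambda c: c["start_ns"]):
            visit(child, depth + 1)

    for root in sorted(roots, key=lambda s: s["start_ns"]):
        visit(root)
```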
### marks[]

```
{
  "id": "hex32",
  "span_id": "hex32 | \"root\"",
  "name": "loss",
  "value_type": "float | int | string | bool",
  "value": 0.5,
  "attrs": { "step": 10 },
  "ts_ns": 0,
  "kind": "point | summary"
}
```
A mark attaches to the innermost open scope on the producing thread. When no scope is open, it attaches to the `cirron.session` root. Marks emitted before `ci.profile()` was called (or after `shutdown()`) use the legacy `"root"` sentinel instead of a real span ID.
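Consumers joining marks back to spans should handle that sentinel explicitly. A minimal sketch that collects a loss curve from one batch (the `loss_curve` helper is illustrative):

```python
def loss_curve(batch: dict) -> list[tuple[int, float]]:
    """Collect (ts_ns, value) pairs for 'loss' marks, skipping legacy root marks."""
    points = []
    for mark in batch["marks"]:
        if mark["span_id"] == "root":
            continue  # emitted outside any profiled scope; no real span to join on
        if mark["name"] == "loss" and mark["value_type"] == "float":
            points.append((mark["ts_ns"], mark["value"]))
    return sorted(points)
```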
### snapshots[]

```
{
  "id": "hex32",
  "span_id": "hex32",
  "tensor_name": "layer1.0.conv1.weight",
  "shape": [64, 3, 7, 7],
  "dtype": "float32",
  "mode": "stats",
  "stats": {
    "mean": 0.0,
    "std": 0.0,
    "min": 0.0,
    "max": 0.0,
    "norm": 0.0,
    "histogram": {
      "bins": ["... 17 floats ..."],
      "counts": ["... 16 ints ..."]
    }
  },
  "blob_uri": null,
  "ts_ns": 0,
  "attrs": {}
}
```
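The 17 values in `bins` against 16 `counts` read naturally as bin edges plus per-bin counts. A minimal sketch of unpacking them (this edge interpretation is an assumption implied by the 17/16 split):

```python
import numpy as np

def unpack_histogram(stats: dict) -> tuple[np.ndarray, np.ndarray]:
    """Return (edges, counts): 17 edges bounding 16 bins, per the assumed layout."""
    hist = stats["histogram"]
    edges = np.asarray(hist["bins"], dtype=np.float64)    # length 17
    counts = np.asarray(hist["counts"], dtype=np.int64)   # length 16
    assert len(edges) == len(counts) + 1
    return edges, counts
```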
`mode` values:

- `"stats"`: inline statistics only; `blob_uri` is null. The default.
- `"sampled"`: stats plus a safetensors blob on epoch boundaries where `random() < sample_rate`. Records that lose the roll stay `mode="stats"` with `blob_uri=null`.
- `"full"`: stats plus a blob every epoch. Debug-only; not recommended for 100M+ parameter models.
Sampled and full modes write one safetensors file per (span, kind): `./.cirron/snapshots/<span_id>/weights.safetensors` for weights and `gradients.safetensors` for gradients. Every record for that span shares the same `blob_uri`; `tensor_name` is used verbatim as the key inside the container, so consumers can call `container[record["tensor_name"]]` directly with no sanitization.

Gradient records use `tensor_name = "<param>.grad"` (e.g. `"layer1.0.conv1.weight.grad"`) and only appear when the gradient was non-None at capture time.
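The naming convention makes pairing a parameter with its gradient a plain string operation. A minimal sketch (the pairing helper is illustrative, assuming the records you pass in belong to one span):

```python
def pair_weight_and_grad(snapshots: list[dict]) -> dict[str, dict[str, dict]]:
    """Map parameter name -> {'weight': record, 'grad': record}."""
    pairs: dict[str, dict[str, dict]] = {}
    for snap in snapshots:
        name = snap["tensor_name"]
        if name.endswith(".grad"):
            pairs.setdefault(name[: -len(".grad")], {})["grad"] = snap
        else:
            pairs.setdefault(name, {})["weight"] = snap
    return pairs
```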
## Canonical scope shape

```
cirron.session
  epoch[n]
    step[n]
      data_load
      forward
      backward
      optimizer_step
```
Epoch spans are siblings under the session, never nested. When multiple framework hooks coexist (e.g. HF Trainer over a PyTorch DataLoader), only the highest-priority hook owns `epoch` and `step` (transformers > tensorflow > torch); the others yield on those names so no semantic scope is duplicated.
Operations executed before the training loop runs (warmup forwards, sanity checks, optimizer construction) have `parent_id == session_id`, not an epoch. No epoch exists yet; this is correct behavior, not a bug.

For inference, the top-level scope per call is `request` instead of `epoch`.
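A consumer can verify this shape cheaply. A minimal sketch that checks epoch spans are direct children of the session root (the checker itself is illustrative):

```python
def check_epochs_are_session_children(batch: dict) -> None:
    """Raise if any epoch span in this batch is nested under a non-session scope."""
    spans = {s["id"]: s for s in batch["spans"]}
    for s in batch["spans"]:
        if s["name"] != "epoch":
            continue
        parent = spans.get(s["parent_id"])
        # The parent may live in an earlier batch; only check what we can see.
        if parent is not None and parent["name"] != "cirron.session":
            raise ValueError(f"epoch span nested under {parent['name']!r}")
```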
## Reading the spool

```python
import json
from pathlib import Path

from safetensors import safe_open

# Filenames sort lexicographically, i.e. oldest batch first.
for batch_file in sorted(Path("./.cirron/spool").glob("*.json")):
    batch = json.loads(batch_file.read_text())
    for span in batch["spans"]:
        print(span["name"], span["end_ns"] - span["start_ns"])
    for snap in batch["snapshots"]:
        if snap["blob_uri"] is None:
            continue  # stats-only record, no blob to open
        path = snap["blob_uri"].removeprefix("file://")
        with safe_open(path, framework="pt") as f:
            # tensor_name is the verbatim key inside the container
            tensor = f.get_tensor(snap["tensor_name"])
```
## Forward compatibility

Readers must tolerate unknown top-level keys and unknown per-span / per-mark fields, so minor SDK bumps can add optional metadata. Removing or renaming existing fields, or changing their types, requires a `schema_version` bump and follows SemVer.
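In practice that means gating on `schema_version` and ignoring keys you don't know. A minimal sketch of a tolerant loader (the constant and helper are illustrative):

```python
SUPPORTED_SCHEMA = 1  # bump alongside the documented schema_version

def load_batch(raw: dict) -> dict:
    version = raw.get("schema_version")
    if version != SUPPORTED_SCHEMA:
        raise ValueError(f"unsupported schema_version: {version}")
    # Pick out only the fields we understand; unknown keys are ignored,
    # so a minor SDK bump that adds metadata won't break this reader.
    return {
        "batch_id": raw["batch_id"],
        "created_ns": raw["created_ns"],
        "spans": raw.get("spans", []),
        "marks": raw.get("marks", []),
        "snapshots": raw.get("snapshots", []),
    }
```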
## Wire format: POST /v1/traces

When the HTTP transport is active (external runs with an API key), the SDK batches spans / marks / snapshots into the same JSON shape documented above and posts it to `POST /v1/traces` on the Cirron platform API. The body wraps the batch like this:

```
POST /v1/traces
Authorization: Bearer <api_key>
Content-Type: application/json
Content-Encoding: gzip
X-Cirron-SDK-Version: 0.x.y

{
  "schema_version": 1,
  "sdk_version": "0.x.y",
  "batch_id": "abcdef...",
  "created_ns": 1234567890000000000,
  "spans": [ ... ],
  "marks": [ ... ],
  "snapshots": [ ... ]   # metadata only; blobs upload separately
}
```
Successful submissions return `202 Accepted` with the batch ID. Submission is idempotent by `batch_id` (24-hour server-side dedupe window), so retrying the same batch after a timeout is safe. Rate-limited responses return `429` with a `Retry-After` header, which the SDK respects via exponential backoff.
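A minimal client submission then looks roughly like this. This is a sketch using the `requests` library; the API host default and the retry policy are assumptions, not the SDK's actual transport:

```python
import gzip
import json
import time

import requests

def post_batch(batch: dict, api_key: str,
               base_url: str = "https://api.cirron.com") -> None:  # assumed host; adjust for self-hosted
    body = gzip.compress(json.dumps(batch).encode())
    headers = {
        "Authorization": f"Bearer {api_key}",
        "Content-Type": "application/json",
        "Content-Encoding": "gzip",
        "X-Cirron-SDK-Version": batch["sdk_version"],
    }
    for attempt in range(5):
        resp = requests.post(f"{base_url}/v1/traces", data=body, headers=headers)
        if resp.status_code == 202:
            return  # accepted; retrying the same batch_id later would be deduped anyway
        if resp.status_code == 429:
            # Honor Retry-After (assuming delta-seconds form), else back off exponentially.
            wait = float(resp.headers.get("Retry-After", 2 ** attempt))
            time.sleep(wait)
            continue
        resp.raise_for_status()
    raise RuntimeError("giving up after repeated 429s")
```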
For self-hosted installs, this is the full wire contract: a custom
ingestion worker that accepts the above payload is sufficient to
consume SDK traffic.
After ingestion, traces land in the tables below. Field names are camelCase (Prisma conventions); the SDK sends snake_case and the ingestion worker maps it.
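The mapping is mechanical. A minimal sketch of what an ingestion worker might apply (the helpers are illustrative, not the platform's actual code):

```python
def snake_to_camel(key: str) -> str:
    """start_ns -> startNs, memory_peak_bytes -> memoryPeakBytes."""
    head, *rest = key.split("_")
    return head + "".join(part.capitalize() for part in rest)

def map_record(record: dict) -> dict:
    """Rename a snake_case SDK record to camelCase column names."""
    return {snake_to_camel(k): v for k, v in record.items()}
```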
### TraceSpan

| Field | Type | Notes |
|---|---|---|
| id | string | cuid |
| traceId | string | Root scope ID for the process session |
| parentSpanId | string? | null for root |
| name | string | Scope name (epoch, step, forward, request, …) |
| index | int? | Scope index (epoch number, batch number) |
| attrs | json? | Arbitrary user attributes |
| startNs | bigint | Wall time, nanoseconds |
| endNs | bigint | Wall time, nanoseconds |
| cpuNs | bigint? | CPU time |
| gpuNs | bigint? | GPU time; null when CUDA unavailable |
| memoryPeakBytes | bigint? | RSS peak during span |
| threadId | bigint? | |
| rank | int | Distributed-training rank (default 0) |
| workspaceId | string | Resource link |
| pipelineId | string? | Resource link |
| runId | string | Resource link |
| deploymentId | string? | Resource link (inference) |
| modelId | string? | Resource link |

Indexes: `(workspaceId, runId, startNs)`, `(workspaceId, pipelineId, startNs)`, `(workspaceId, deploymentId, startNs)`, `(traceId, parentSpanId)`.
### TraceMark

| Field | Type | Notes |
|---|---|---|
| id | string | cuid |
| spanId | string | Owning span |
| name | string | Mark name (loss, grad_norm, …) |
| valueType | string | "float" \| "int" \| "string" \| "bool" |
| valueFloat | float? | Populated when valueType="float" |
| valueInt | bigint? | Populated when valueType="int" |
| valueString | string? | 256-byte cap |
| valueBool | bool? | |
| attrs | json? | |
| tsNs | bigint | Wall time |
| kind | string | "point" (default) \| "summary" |
### TraceSnapshot

| Field | Type | Notes |
|---|---|---|
| id | string | cuid |
| spanId | string | Owning span (typically an epoch) |
| tensorName | string | e.g. "layer1.0.conv1.weight" |
| shape | json | e.g. [768, 3072] |
| dtype | string | e.g. "float32" |
| mode | string | "stats" \| "sampled" \| "full" |
| stats | json? | {mean, std, min, max, norm, histogram} for stats-bearing records |
| blobUri | string? | S3 URI for sampled / full; null for pure stats records |
## Snapshot object-storage layout

```
s3://<bucket>/traces/<workspace_id>/<run_id>/<span_id>/<snapshot_id>.<ext>
```

Self-hosted deployments point at MinIO or on-prem S3-compatible storage using the same path scheme.
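A minimal sketch of building the object key from its parts (the helper is illustrative, and `<ext>` is left as a parameter since the documented path does not pin it down):

```python
def snapshot_key(workspace_id: str, run_id: str, span_id: str,
                 snapshot_id: str, ext: str) -> str:
    """Object key under the bucket, matching the documented path scheme."""
    return f"traces/{workspace_id}/{run_id}/{span_id}/{snapshot_id}.{ext}"
```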