Attach the profiler to the current process. Call once. Subsequent calls are no-ops and log a warning. Returns a Profiler handle that most callers discard.
Calling ci.shutdown() clears the singleton; ci.profile() after
that starts a fresh profiler. This is the supported way to reset
state between tests.
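As a minimal sketch of that reset pattern, the helper below scopes one profiler to a with-block and clears the singleton on exit. It takes the SDK module as an argument (called ci here, matching the call names on this page) so it does not assume an import path. fresh_profiler is a hypothetical helper, not part of the SDK.

```python
from contextlib import contextmanager


@contextmanager
def fresh_profiler(ci, **profile_kwargs):
    """Run a block with a new profiler, then clear the singleton.

    `ci` is the SDK module, or any object exposing profile() and shutdown().
    `fresh_profiler` is a hypothetical helper, not an SDK function.
    """
    handle = ci.profile(**profile_kwargs)
    try:
        yield handle
    finally:
        # shutdown() clears the singleton, so the next profile() starts fresh,
        # even when the body of the with-block raised.
        ci.shutdown()
```

In a test suite this could be wrapped in a fixture, so that each test enters the block with whatever keyword arguments it needs, for example output="none".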
Signature
Parameters
| Name | Type | Default | Purpose |
|---|---|---|---|
| config | dict? | None | Runtime feature flags read by your code via config.get(...) |
| frameworks | list[str]? | None | Skip autodetect and install hooks for the named frameworks |
| snapshots | str? | "stats" | "stats", "sampled", or "full" weight/gradient capture |
| sample_rate | float? | 0.01 | Fraction of epoch boundaries that serialize raw tensors |
| flush_interval | float? | 1.0 | Background flush thread wake interval in seconds |
| enabled | bool | True | Set False to build a no-op profiler (zero overhead, no hooks) |
| path | str? | None | Override the local spool directory (./.cirron/ by default) |
| output | str or list[str] | "spool" | Local sink fan-out: "spool", "log", "stdout", "none" |
None for snapshots / sample_rate / flush_interval means “use
the Cirron instance default”. Pass an explicit value to override
per-call.
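The None-means-default rule can be modelled as a small merge. The sketch below is an illustration of the rule, not the SDK's internals: resolve_overrides and INSTANCE_DEFAULTS are hypothetical names, and the default values are copied from the Default column above.

```python
# Values copied from the Default column of the parameter table.
INSTANCE_DEFAULTS = {"snapshots": "stats", "sample_rate": 0.01, "flush_interval": 1.0}


def resolve_overrides(snapshots=None, sample_rate=None, flush_interval=None):
    """Merge per-call values over instance defaults; None means 'keep the default'."""
    explicit = {
        "snapshots": snapshots,
        "sample_rate": sample_rate,
        "flush_interval": flush_interval,
    }
    resolved = dict(INSTANCE_DEFAULTS)
    # Compare against None rather than truthiness so an explicit 0.0 still overrides.
    resolved.update({key: value for key, value in explicit.items() if value is not None})
    return resolved
```

The comparison with None matters for sample_rate: an explicit 0.0 is a real override and should not fall back to the default.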
What it does
- Resolves config (explicit config= → platform global → SDK defaults).
- Reads platform context from CIRRON_RUN_ID, CIRRON_PIPELINE_ID, CIRRON_DEPLOYMENT_ID, CIRRON_WORKSPACE_ID.
- Selects a transport: kernel event stream (inside a Cirron pipeline or deployment), HTTP (with an API key), or file-only (neither).
- Autodetects installed frameworks unless frameworks= is explicit.
- Installs hooks for every detected framework. When multiple are present, the priority order transformers > tensorflow > torch determines which one owns the semantic epoch and step scopes via a shared HookContext.owned_scopes map; lower-priority hooks yield on those names and still produce their own lower-level scopes (torch still emits forward / backward / optimizer_step / data_load under the transformers-owned step).
- Starts the background flush thread.
- Registers atexit, SIGTERM, and SIGINT handlers for clean shutdown.
- Opens the cirron.session root scope with framework, device, cuda_count, and mixed_precision attributes.
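Three of the steps above (reading platform context, choosing a transport, and picking the owner of the epoch and step scopes) can be sketched in plain Python. This is a model of the documented behavior under stated assumptions, not the SDK's code: the function names and the "kernel" / "http" / "file" labels are hypothetical, and treating any of the run, pipeline, or deployment IDs as "inside the platform" is an assumption.

```python
import os

CONTEXT_VARS = (
    "CIRRON_RUN_ID",
    "CIRRON_PIPELINE_ID",
    "CIRRON_DEPLOYMENT_ID",
    "CIRRON_WORKSPACE_ID",
)
PRIORITY = ("transformers", "tensorflow", "torch")  # highest priority first


def read_platform_context(environ=None):
    """Collect the platform variables that are set and non-empty."""
    environ = os.environ if environ is None else environ
    return {name: environ[name] for name in CONTEXT_VARS if environ.get(name)}


def select_transport(context, api_key=None):
    """Pick a transport label: kernel stream, then HTTP, then file-only."""
    # Assumption: any of these three IDs means the process runs on the platform.
    on_platform = any(
        context.get(name)
        for name in ("CIRRON_RUN_ID", "CIRRON_PIPELINE_ID", "CIRRON_DEPLOYMENT_ID")
    )
    if on_platform:
        return "kernel"
    if api_key:
        return "http"
    return "file"


def scope_owner(detected):
    """Return the detected framework that owns `epoch` and `step`, or None."""
    for name in PRIORITY:
        if name in detected:
            return name
    return None
```

With both transformers and torch detected, scope_owner returns transformers, which matches the case where torch yields epoch and step but keeps emitting its lower-level scopes.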
Snapshot modes
| Mode | Cost per epoch boundary | What’s captured |
|---|---|---|
| "stats" | ≤ 50 ms (typical model) | {mean, std, min, max, norm, histogram[16]} per tensor |
| "sampled" | ≤ 200 ms on sampled steps | Stats + raw tensor values for random() < sample_rate epochs |
| "full" | unbounded; debug-only | Stats + raw tensor values every epoch |
"full" is not recommended for models over 100M parameters. At 7B+,
even "sampled" is expensive; lower sample_rate.
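To make the "stats" payload concrete, the sketch below computes the same six fields for a flat list of floats using only the standard library. The exact definitions are assumptions, since this page does not specify them: population standard deviation, L2 norm, and 16 equal-width bins between min and max. tensor_stats is a hypothetical name.

```python
import math
import statistics


def tensor_stats(values, bins=16):
    """Summary statistics for a flat, non-empty sequence of floats."""
    lo, hi = min(values), max(values)
    # Equal-width bins over [lo, hi]; a constant tensor falls entirely in bin 0.
    width = (hi - lo) / bins or 1.0
    histogram = [0] * bins
    for v in values:
        # Clamp so that v == hi lands in the last bin instead of overflowing.
        index = min(int((v - lo) / width), bins - 1)
        histogram[index] += 1
    return {
        "mean": statistics.fmean(values),
        "std": statistics.pstdev(values),  # assumption: population std
        "min": lo,
        "max": hi,
        "norm": math.sqrt(sum(v * v for v in values)),  # assumption: L2 norm
        "histogram": histogram,
    }
```

The result has a fixed size regardless of tensor shape, which is why this mode stays cheap compared with serializing raw values.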
Output sinks
The output= parameter selects which local sinks the flush thread
writes each batch to. It is independent of the platform transport:
when CIRRON_RUN_ID is set, batches still flow over the kernel event
stream regardless of output. Sinks control the local experience
(disk, logs, terminal), which is why output="none" is safe even
inside a Cirron pipeline.
| Value | Behavior |
|---|---|
| "spool" | (Default.) Write each batch as a JSON file under ./.cirron/spool/. Public spool format. |
| "log" | Emit one logging.INFO line per closed span on the cirron.trace logger. |
| "stdout" | Print one line per closed span to stdout. Same format as "log", no logging configuration. |
| "none" | No spool, no log, no print. Traces stay in the in-memory buffer for ci.trace() only. |
| list[str] | Multiple sinks fan-out simultaneously, e.g. ["spool", "log"]. |
An unrecognized sink name raises ValueError at ci.profile() time, before any
hook is installed.
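The accepted shapes of output= (one string or a list of strings) and the early ValueError can be sketched as a normalizer. This is an illustration under assumptions, not the SDK's validation code: normalize_output is a hypothetical name, and the sketch assumes the error is raised for sink names outside the four in the table.

```python
VALID_SINKS = ("spool", "log", "stdout", "none")


def normalize_output(output="spool"):
    """Return the requested sinks as a list, raising ValueError on unknown names."""
    # A bare string is one sink; anything else is treated as an iterable of names.
    sinks = [output] if isinstance(output, str) else list(output)
    unknown = [name for name in sinks if name not in VALID_SINKS]
    if unknown:
        raise ValueError(
            f"unknown output sink(s): {unknown!r}; expected one of {VALID_SINKS}"
        )
    return sinks
```

Validating before any other setup means a typo such as "stdio" fails immediately rather than after hooks are already in place.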
Returns
A Profiler handle exposing health, flush, trace, and shutdown.
Examples
Zero-touch
Explicit snapshot mode
sample_rate is the fraction of epoch boundaries that serialize raw
tensors. Higher values give more fidelity for debugging (e.g. you can
inspect the actual weight values at epoch 7 when loss spiked); lower
values keep storage and flush cost bounded. The default 0.01 (1 %)
is conservative. For small models or short runs, 0.05–0.1 is
reasonable; at 7 B+ parameters, stay at 0.01 or lower.
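To get a feel for what a given sample_rate means over a run, the sketch below simulates the per-epoch decision random() < sample_rate described in the snapshot modes table. sampled_epochs is a hypothetical helper, and the seed parameter exists only to make the simulation repeatable.

```python
import random


def sampled_epochs(num_epochs, sample_rate=0.01, seed=None):
    """Return the epoch indices at which raw tensors would be serialized."""
    rng = random.Random(seed)
    # Each epoch boundary is an independent draw against sample_rate.
    return [epoch for epoch in range(num_epochs) if rng.random() < sample_rate]
```

On average num_epochs * sample_rate epochs are selected. At the default 0.01, a 100-epoch run averages one raw capture and has about a 37% chance (0.99 to the power 100) of capturing none, which is one reason a higher rate is reasonable for short runs.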
Disable hooks selectively
Dev-only kill switch
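One way to wire a kill switch is to derive the enabled= argument from an application-level environment variable. The sketch below is an assumption about how a caller might do that, not an SDK feature: profiling_enabled and the variable name MYAPP_PROFILE are hypothetical and belong to the calling application.

```python
import os


def profiling_enabled(environ=None, var="MYAPP_PROFILE"):
    """Return False when the (hypothetical) app variable is set to an 'off' value."""
    environ = os.environ if environ is None else environ
    # Unset means enabled, so production behavior is unchanged by default.
    value = environ.get(var, "1").strip().lower()
    return value not in {"0", "false", "off", "no"}
```

The result would be passed as ci.profile(enabled=profiling_enabled()), so that setting MYAPP_PROFILE=0 on a developer machine builds the no-op profiler.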
Notebook-friendly inspection
See ci.trace for the full read-back surface.
Distributed training
Every rank calls ci.profile(). The SDK reads RANK, LOCAL_RANK,
and WORLD_SIZE from the environment and tags every span with its
rank. The platform merges views at query time.
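The rank tagging described above can be pictured with a few lines of Python. This is a sketch, not the SDK's implementation: rank_tags and tag_span are hypothetical names, spans are modelled as plain dicts, the tag keys are illustrative, and the fallbacks of 0, 0, and 1 for a single-process run are an assumption.

```python
import os


def rank_tags(environ=None):
    """Read the launcher-provided rank variables as integers."""
    environ = os.environ if environ is None else environ
    # Assumed fallbacks for single-process runs where no launcher set these.
    return {
        "rank": int(environ.get("RANK", 0)),
        "local_rank": int(environ.get("LOCAL_RANK", 0)),
        "world_size": int(environ.get("WORLD_SIZE", 1)),
    }


def tag_span(span, environ=None):
    """Return a copy of `span` (a plain dict here) with rank tags attached."""
    return {**span, **rank_tags(environ)}
```

Because every span carries its rank, per-rank traces can be written independently and combined later, which fits the merge-at-query-time model.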
Related
Profiling guide
Narrative walk-through of training instrumentation.
Lifecycle
flush, health, and shutdown.