Protocol Documentation
Implementation specifications for Argus Active Monitoring and the Atlas Protocol.
Quickstart Implementation
The Argus SDK is a zero-block async client designed to wrap your training loop with zero latency overhead. The following example establishes a authenticated handshake and initiates a monitored session.
from argussdk import Argus
# Initialize the Protocol Gate
argus = Argus(
api_key="PA-XXXX-XXXX",
run_id="model-70b-helltest",
mode="AUTO" # Automated fail-safe interventions
)
# Active Monitoring Loop
for step, (x, y) in enumerate(dataloader):
loss = model(x, y)
# Report telemetry with zero-latency prefetch
argus.step(
loss=loss.item(),
grad_norm=compute_grad_norm(model),
epoch=current_epoch
)The Security Pipeline
Every signal transmitted by the Argus SDK follows a multi-layer verification protocol before ingestion.
Stage 01: Node.js Gate
The SDK first hits our Node.js gateway for high-speed authentication and account status verification. This layer ensures active budget allocation and session legitimacy.
Stage 02: Python Data Acceptor
Once authorized, telemetry is passed to the internal Python module. This is where the Data Acceptor performs clinical sanitization, enforcing hard security boundaries and absolute metadata anonymization.
Telemetry Schema (v2.94)
The following telemetry keys are recognized by the Protocol Engine. All signals are validated for mathematical consistency at the parse level.
run_idstringUnique identifier for the training session.
grad_normfloatGlobal L2 norm of the gradient vector.
loss_deltafloatStep-over-step variation in objective function.
sample_marginsfloat[]Vector of prediction margins for anomaly detection.
sparsity_deltafloatLocal health signal for weight distribution shifts.
sentinel_handshakestringEncrypted intent confirmation for remote halts.
Certified Rolling Checkpoints
Argus implements a 10-slot circular buffer on-disk. Unlike traditional scheduled saves, Argus only triggers a SAVE_NOW signal when the engine detects an early-stage failure smell.
Automatic Fallback on API death.
10-Slot circular weight buffer.
Certified-stable recovery anchors.