Audit Logging and AI-SPM Integration
SCHEMABOUND provides a centralized, tamper-evident audit pipeline that captures every agent activity — user input, LLM responses, function arguments, SQL executed, execution time, and user identity — and exports it in formats suitable for enterprise SIEM platforms and specialized AI Security Posture Management (AI-SPM) tools.
Why This Matters
Agent-driven systems introduce risk that traditional application audit trails were not designed to capture: adversarial prompt injections, LLM jailbreak attempts, data exfiltration via template expressions, and unauthorized role escalation through natural language. The SCHEMABOUND audit pipeline treats these as first-class security signals alongside the conventional access-control and query-execution events.
Architecture
The audit system is composed of four interlocking parts:
Client Request
         │
         ▼
┌─────────────────────┐
│ InjectionGuard      │ ← scans input before gRPC executor sees it
│ InputScannerHook    │
└────────┬────────────┘
         │ PromptInjectionSignalRaised (if pattern matches)
         ▼
┌─────────────────────┐
│ EventBus            │ ← global ordered handler + exporter chain
│ (hash chain)        │
└────────┬────────────┘
         │ AuditEventEnvelope (sequence, hash, trace_id, …)
         ▼
┌─────────────────────────────────────────────────────┐
│ MultiTransportExporter                              │
│ ├─ stderr (OCSF NDJSON or raw JSON)                 │
│ ├─ HTTP webhook (SCHEMABOUND_AUDIT_WEBHOOK_URL)     │
│ └─ rotating file (SCHEMABOUND_AUDIT_FILE_PATH)      │
└─────────────────────────────────────────────────────┘
Tamper-Evident Hash Chain
Every exported audit envelope carries a SHA-256 hash chain that links each event to the one before it. This makes it possible for a downstream SIEM to detect gaps or mutations in the audit stream.
| Field | Description |
|---|---|
| sequence | Monotonically increasing counter across the process lifetime |
| prev_hash | SHA-256 of the previous envelope’s hash input |
| hash | SHA-256 of `"{sequence}\|{prev_hash}\|{event_json}\|{emitted_at}"` |
| emitted_at | ISO-8601 timestamp at point of dispatch |
| trace_id | W3C trace-context trace ID for distributed tracing correlation |
| span_id | W3C trace-context span ID |
A missing sequence number or a hash that does not chain correctly from the previous record is strong evidence of log tampering.
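A consumer-side check of the chain could look like the following minimal sketch. It makes several assumptions that the text above does not confirm: that hashes are compared as hex strings, that `prev_hash` equals the preceding record's `hash`, and that `sequence` increases by exactly one. The `ChainRecord` struct, `ChainError` enum, and `verify_chain` function are hypothetical names, and the SHA-256 digest is passed in as a closure (for example, one backed by the `sha2` crate) so the sketch stays dependency-free.

```rust
/// One exported envelope, reduced to the chain-relevant fields.
/// Field names follow the table above; the struct itself is illustrative.
#[derive(Debug, Clone)]
pub struct ChainRecord {
    pub sequence: u64,
    pub prev_hash: String,
    pub hash: String,
    pub event_json: String,
    pub emitted_at: String,
}

#[derive(Debug, PartialEq)]
pub enum ChainError {
    /// The sequence number did not increase by exactly one.
    SequenceGap { expected: u64, found: u64 },
    /// `prev_hash` does not equal the preceding record's `hash` (assumed linkage).
    BrokenLink { sequence: u64 },
    /// The recomputed hash differs from the stored `hash`.
    HashMismatch { sequence: u64 },
}

/// Verifies a slice of consecutive records.
/// `digest_hex` is expected to return the hex SHA-256 of its input.
pub fn verify_chain<F>(records: &[ChainRecord], digest_hex: F) -> Result<(), ChainError>
where
    F: Fn(&str) -> String,
{
    let mut prev: Option<&ChainRecord> = None;
    for rec in records {
        if let Some(p) = prev {
            if rec.sequence != p.sequence + 1 {
                return Err(ChainError::SequenceGap {
                    expected: p.sequence + 1,
                    found: rec.sequence,
                });
            }
            if rec.prev_hash != p.hash {
                return Err(ChainError::BrokenLink { sequence: rec.sequence });
            }
        }
        // Hash input layout taken from the field table above.
        let input = format!(
            "{}|{}|{}|{}",
            rec.sequence, rec.prev_hash, rec.event_json, rec.emitted_at
        );
        if digest_hex(&input) != rec.hash {
            return Err(ChainError::HashMismatch { sequence: rec.sequence });
        }
        prev = Some(rec);
    }
    Ok(())
}
```

The first record in the slice is checked only against its own hash, so a verifier that processes the stream in batches would need to carry the last record of one batch into the next.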
OCSF v1.1 Output
All exports default to OCSF (Open Cybersecurity Schema Framework) v1.1. OCSF is the interchange format used by major SIEM and AI-SPM vendors including Amazon Security Lake, Microsoft Sentinel, Splunk, and Wiz.
Class Mapping
| SCHEMABOUND Event | OCSF Class | Class UID |
|---|---|---|
| SessionRegistered | Authentication | 2001 |
| AccessDenied, PromptInjectionSignalRaised | Security Finding | 2004 |
| QueryExecuted, QueryValidation*, QueryExecutionError, RowsFiltered, ColumnsRedacted | Database Activity | 6003 |
| PlanCreated, PlanCompleted, PlanFailed, LlmToolCallAuditRecorded | API Activity | 6005 |
Severity Mapping
| OCSF Severity | SCHEMABOUND Trigger |
|---|---|
| Critical (5) | Plan failure with execution error |
| High (4) | PromptInjectionSignalRaised with severity high, access denied events |
| Medium (3) | PromptInjectionSignalRaised with severity medium |
| Informational (1) | All other events |
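The severity table above can be read as a single match expression. The sketch below is illustrative only: `AuditEvent` and `SignalSeverity` are hypothetical, reduced types rather than the actual SCHEMABOUND event definitions, and it assumes that a plan failure without an execution error falls under "all other events".

```rust
/// Severity reported by the injection scanner (hypothetical type).
#[derive(Debug, Clone, Copy, PartialEq)]
pub enum SignalSeverity {
    Medium,
    High,
}

/// A reduced, hypothetical view of the events that influence severity.
#[derive(Debug, Clone, Copy, PartialEq)]
pub enum AuditEvent {
    PlanFailed { has_execution_error: bool },
    AccessDenied,
    PromptInjectionSignalRaised(SignalSeverity),
    Other,
}

/// Returns the OCSF `severity_id` following the table above.
pub fn ocsf_severity_id(event: AuditEvent) -> u8 {
    match event {
        // Critical
        AuditEvent::PlanFailed { has_execution_error: true } => 5,
        // High
        AuditEvent::AccessDenied => 4,
        AuditEvent::PromptInjectionSignalRaised(SignalSeverity::High) => 4,
        // Medium
        AuditEvent::PromptInjectionSignalRaised(SignalSeverity::Medium) => 3,
        // Informational: everything else (assumed to include plan
        // failures that carry no execution error)
        _ => 1,
    }
}
```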
The unmapped OCSF field carries SCHEMABOUND-specific chain fields (roam_hash,
roam_prev_hash, roam_sequence) that have no direct OCSF equivalent but are required
for continuity verification. When trace correlation is present it is emitted under
metadata.trace_uid and metadata.span_uid.
Transport Configuration
All transports are configured via environment variables and can be combined.
Stdout / Stderr
SCHEMABOUND_AUDIT_STDOUT=ocsf # emit OCSF v1.1 NDJSON to stderr (default)
SCHEMABOUND_AUDIT_STDOUT=json # emit raw AuditEventEnvelope JSON to stderr
SCHEMABOUND_AUDIT_STDOUT=off # disable stderr output
HTTP Webhook
SCHEMABOUND_AUDIT_WEBHOOK_URL=https://siem.example.com/ingest
One OCSF record per HTTP POST with Content-Type: application/x-ndjson. The request
is fire-and-forget — failures are silently discarded to keep the request path unblocked.
Use an internal aggregation endpoint (Fluent Bit, Logstash, Vector) to buffer and retry
if delivery guarantees are required.
Rotating File
SCHEMABOUND_AUDIT_FILE_PATH=/var/log/roam/audit.ndjson
SCHEMABOUND_AUDIT_FILE_MAX_MB=100 # rotate at 100 MB (default)
Appends one OCSF NDJSON record per line. When the file reaches SCHEMABOUND_AUDIT_FILE_MAX_MB
it is renamed to <path>.1 and a new file is opened. One generation of rotation is
kept; integrate with a log shipper for longer retention.
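The single-generation rotation described above can be sketched with the standard library alone. This is not the SCHEMABOUND implementation: `append_with_rotation` is a hypothetical helper, and it assumes the size check happens before each append and that the limit is expressed in bytes.

```rust
use std::fs::{self, OpenOptions};
use std::io::{self, Write};
use std::path::{Path, PathBuf};

/// Appends one NDJSON line to `path`. If the file has already reached
/// `max_bytes`, it is first renamed to `<path>.1`, replacing any earlier
/// rotated file, so only one generation is kept.
pub fn append_with_rotation(path: &Path, line: &str, max_bytes: u64) -> io::Result<()> {
    let current_len = match fs::metadata(path) {
        Ok(meta) => meta.len(),
        Err(e) if e.kind() == io::ErrorKind::NotFound => 0,
        Err(e) => return Err(e),
    };

    if current_len > 0 && current_len >= max_bytes {
        // Build `<path>.1` by appending to the full file name rather than
        // replacing the extension.
        let mut rotated = path.as_os_str().to_owned();
        rotated.push(".1");
        fs::rename(path, PathBuf::from(rotated))?;
    }

    let mut file = OpenOptions::new().create(true).append(true).open(path)?;
    file.write_all(line.as_bytes())?;
    file.write_all(b"\n")
}
```

Because the previous `<path>.1` is overwritten on every rotation, a log shipper has to read the rotated file before the active file fills up again.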
OTLP / Distributed Tracing
OTEL_EXPORTER_OTLP_ENDPOINT=http://otel-collector:4318
When this variable is set, the backend initialises an OpenTelemetry SDK tracer provider with
a Tokio-backed batch span exporter (HTTP/protobuf transport: opentelemetry-otlp with the
http-proto and reqwest-client features). Every HTTP request handled by AuditFairing
creates a child span. When request trace context is available, the span's trace_id and
span_id are captured and written into the AuditEventEnvelope before the envelope is
dispatched.
When the variable is absent the SDK is still initialised in no-op mode, so nothing is
exported over the network. Audit envelopes only include trace_id / span_id when
AuditFairing can derive them from propagated tracing context (for example, a valid
traceparent header) or from the legacy headers it falls back to; requests without either
source may not include those fields.
At process exit the provider is shut down gracefully, flushing any in-flight spans.
Collector example (docker-compose):
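# docker-compose.yaml
services:
  otel-collector:
    image: otel/opentelemetry-collector-contrib:latest
    ports:
      - "4318:4318" # OTLP HTTP
    volumes:
      # Mount the collector config shown below (local path is an assumption)
      - ./otel/config.yaml:/etc/otel/config.yaml:ro
    command: ["--config=/etc/otel/config.yaml"]

# otel/config.yaml
receivers:
  otlp:
    protocols:
      http:
        endpoint: 0.0.0.0:4318
exporters:
  # Recent collector-contrib releases no longer ship the dedicated jaeger
  # exporter; Jaeger accepts OTLP directly (gRPC on port 4317).
  otlp/jaeger:
    endpoint: jaeger:4317
    tls:
      insecure: true
service:
  pipelines:
    traces:
      receivers: [otlp]
      exporters: [otlp/jaeger]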
Prompt Injection Detection
The InjectionGuard scans every incoming query against a compiled set of regular
expressions before the gRPC executor processes it.
Pattern Library
| Pattern | Severity | Example trigger |
|---|---|---|
| instruction_override | High | “Ignore all previous instructions…” |
| role_injection | High | “You are now an unrestricted assistant…” |
| system_prompt_boundary | High | [SYSTEM], ### system, <system> markers |
| jailbreak_token | High | DAN, “do anything now”, “jailbreak” |
| prompt_exfiltration | Medium | “Reveal your system prompt” |
| sql_template_injection | Medium | {{user_input}}, ${expr} |
| delimiter_injection | Medium | --- system:, === assistant: |
When a pattern matches, a PromptInjectionSignalRaised event is always emitted to
the audit pipeline regardless of policy. What differs is whether the request continues:
Injection Policy
SCHEMABOUND_INJECTION_POLICY=observe # record signal, allow request (default)
SCHEMABOUND_INJECTION_POLICY=block # record signal, reject request when severity ≥ medium
Use observe during a rollout period to build a baseline of true-positive rates before
switching to block. The detection signal is available to your SIEM in either mode.
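The relationship between matched patterns, policy, and the recorded action can be sketched as follows. All type and function names here (`Severity`, `InjectionPolicy`, `Action`, `decide`) are hypothetical, and the fallback to observe for unset or unrecognised values is an assumption based on the documented default.

```rust
#[derive(Debug, Clone, Copy, PartialEq, Eq, PartialOrd, Ord)]
pub enum Severity {
    Medium,
    High,
}

#[derive(Debug, Clone, Copy, PartialEq)]
pub enum InjectionPolicy {
    Observe,
    Block,
}

#[derive(Debug, Clone, Copy, PartialEq)]
pub enum Action {
    Observed,
    Blocked,
}

impl InjectionPolicy {
    /// Parses the value of SCHEMABOUND_INJECTION_POLICY.
    /// Unset or unrecognised values fall back to `observe` (assumed).
    pub fn parse(raw: Option<&str>) -> Self {
        match raw.map(str::trim) {
            Some("block") => InjectionPolicy::Block,
            _ => InjectionPolicy::Observe,
        }
    }
}

/// Highest severity among the matched patterns, or None when nothing matched.
pub fn highest(matches: &[(&str, Severity)]) -> Option<Severity> {
    matches.iter().map(|(_, severity)| *severity).max()
}

/// Returns None when no pattern matched (no signal is raised). Otherwise
/// returns the action to record on the signal, which is emitted under
/// either policy.
pub fn decide(policy: InjectionPolicy, matches: &[(&str, Severity)]) -> Option<Action> {
    let top = highest(matches)?;
    Some(match policy {
        InjectionPolicy::Block if top >= Severity::Medium => Action::Blocked,
        _ => Action::Observed,
    })
}
```

In a real process the policy would typically be read once at startup, for example with `InjectionPolicy::parse(std::env::var("SCHEMABOUND_INJECTION_POLICY").ok().as_deref())`.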
What Gets Logged
The emitted PromptInjectionSignalRaised event includes:
- excerpt: first 500 characters of the input (truncated to limit PII surface)
- input_hash: SHA-256 of the full input string for forensic correlation
- patterns: list of matched pattern names
- severity: highest matched severity
- action_taken: "observed" or "blocked"
- All standard QueryRuntimeContext metadata (user_id, org_id, session_id, …)
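Truncating to a fixed number of characters needs some care in Rust, because slicing a string at an arbitrary byte offset panics if the offset falls inside a multi-byte character. The sketch below shows one way to produce such an excerpt; the `excerpt` function is hypothetical, and whether SCHEMABOUND counts characters or bytes is an assumption.

```rust
/// Returns at most `max_chars` characters of `input` without splitting a
/// UTF-8 code point. The result borrows from `input`; nothing is copied.
pub fn excerpt(input: &str, max_chars: usize) -> &str {
    match input.char_indices().nth(max_chars) {
        // `byte_idx` is the start of the first character past the limit.
        Some((byte_idx, _)) => &input[..byte_idx],
        // The input has `max_chars` characters or fewer.
        None => input,
    }
}
```

Calling `excerpt(query, 500)` would yield the value for the excerpt field, while input_hash is computed over the full, untruncated string.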
Distributed Tracing Correlation
SCHEMABOUND accepts W3C trace-context headers and propagates them into every audit envelope. This allows audit records to be joined with OTLP spans in your observability platform.
OpenTelemetry (primary)
When OTEL_EXPORTER_OTLP_ENDPOINT is configured (or the no-op SDK is active) the
AuditFairing extracts any incoming W3C traceparent header, creates a child
tracing::Span, and reads the live trace_id / span_id from the active OTel context
via current_trace_context(). Both values are written into every AuditEventEnvelope.
Legacy headers (fallback)
For clients that cannot inject traceparent the following proprietary headers are
accepted as a fallback when no OTel context is available:
| Header | Purpose |
|---|---|
| x-schemabound-trace-id | Trace ID (used when traceparent is absent) |
| x-schemabound-span-id | Span ID (used when traceparent is absent) |
Both values appear in the AuditEventEnvelope (trace_id, span_id fields) and in
the unmapped section of every OCSF record.
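The header precedence described above can be illustrated with a small, dependency-free sketch. It is not the AuditFairing implementation: `TraceIds`, `parse_traceparent`, and `resolve_trace_ids` are hypothetical names, header names are assumed to be lowercased already, both legacy headers are assumed to be required together, and an invalid traceparent is assumed to fall through to the legacy headers. The real fairing creates a child span, so the span ID it records would differ from the incoming parent ID returned here.

```rust
use std::collections::HashMap;

#[derive(Debug, Clone, PartialEq)]
pub struct TraceIds {
    pub trace_id: String,
    pub span_id: String,
}

fn is_lower_hex(s: &str, len: usize) -> bool {
    s.len() == len && s.bytes().all(|b| matches!(b, b'0'..=b'9' | b'a'..=b'f'))
}

/// Parses a W3C `traceparent` value: version-traceid-parentid-flags.
pub fn parse_traceparent(value: &str) -> Option<TraceIds> {
    let mut parts = value.trim().split('-');
    let version = parts.next()?;
    let trace_id = parts.next()?;
    let parent_id = parts.next()?;
    let flags = parts.next()?;

    // Version 00 has exactly four fields; later versions may append more.
    if version == "00" && parts.next().is_some() {
        return None;
    }
    let valid = is_lower_hex(version, 2)
        && version != "ff"
        && is_lower_hex(trace_id, 32)
        && trace_id.bytes().any(|b| b != b'0') // all-zero trace ID is invalid
        && is_lower_hex(parent_id, 16)
        && parent_id.bytes().any(|b| b != b'0') // all-zero parent ID is invalid
        && is_lower_hex(flags, 2);
    if !valid {
        return None;
    }
    Some(TraceIds {
        trace_id: trace_id.to_string(),
        span_id: parent_id.to_string(),
    })
}

/// Prefers `traceparent`; falls back to the legacy headers otherwise.
pub fn resolve_trace_ids(headers: &HashMap<String, String>) -> Option<TraceIds> {
    if let Some(ids) = headers.get("traceparent").and_then(|v| parse_traceparent(v)) {
        return Some(ids);
    }
    let trace_id = headers.get("x-schemabound-trace-id")?;
    let span_id = headers.get("x-schemabound-span-id")?;
    Some(TraceIds {
        trace_id: trace_id.clone(),
        span_id: span_id.clone(),
    })
}
```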
HTTP Request Audit (AuditFairing)
In addition to gRPC-level events, the SCHEMABOUND backend attaches an AuditFairing to every
HTTP request. This records:
- request method and path
- response status code
- measured request duration in milliseconds
- user identity headers (x-schemabound-user-id, x-schemabound-organization-id)
- trace ID from x-schemabound-trace-id
These records are emitted as LlmToolCallAuditRecorded events (OCSF class 6005 — API
Activity) and flow through the same MultiTransportExporter as all other audit events.
Registering an Audit Exporter
Any process that embeds the SCHEMABOUND event bus can attach additional exporters:
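use async_trait::async_trait;
use schemabound::{get_event_bus, AuditEventEnvelope, AuditExporter};
use std::sync::Arc;

struct MySiemExporter;

#[async_trait]
impl AuditExporter for MySiemExporter {
    async fn export(&self, envelope: AuditEventEnvelope) {
        // Serialize `envelope` and forward it to your SIEM.
        let _ = envelope;
    }
}

// Call once during startup. The `?` assumes the registration error converts
// into Box<dyn Error>; adjust the return type to match your application.
fn register_exporters() -> Result<(), Box<dyn std::error::Error>> {
    get_event_bus().register_audit_exporter(Arc::new(MySiemExporter))?;
    Ok(())
}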
Exporters are called concurrently after every dispatched event. They do not appear in the synchronous handler chain and cannot block or short-circuit event processing.
SIEM Integration Notes
Amazon Security Lake (OCSF native)
SCHEMABOUND OCSF records are compatible with Security Lake’s custom source ingestion. Point
SCHEMABOUND_AUDIT_WEBHOOK_URL at a Firehose delivery stream configured for OCSF v1.1.
Splunk
Use the Splunk HEC endpoint with SCHEMABOUND_AUDIT_WEBHOOK_URL. The OCSF JSON structure
maps directly to Splunk’s _raw field with the CIM-compatible security_finding
sourcetype.
Microsoft Sentinel
Route the NDJSON file output with the Azure Monitor Agent using the OCSF table schema, or use the webhook transport targeting a Data Collection Endpoint.
AI-SPM Vendors (Wiz, Lacework, Orca)
These vendors consume OCSF class 2004 (Security Finding) records for AI-specific threat
modelling. The PromptInjectionSignalRaised events with severity, pattern names, and
input hash give AI-SPM tools the raw material they need to build attack timeline views
and posture scoring.
Security Considerations
- Input truncation: only the first 500 characters of a matched query are stored in the audit record. The full input is never persisted; only its SHA-256 hash is retained for forensic correlation.
- Fire-and-forget webhook: transport failures are silently discarded. Use a local sidecar aggregator (Fluent Bit, Vector) if delivery guarantees are a hard requirement.
- Hash chain integrity: chain verification is the consumer’s responsibility. SCHEMABOUND provides the chain fields; the SIEM or audit consumer should alert on gaps.
- Policy default is observe: SCHEMABOUND does not block traffic by default. Switching to block is an explicit operational decision that must be validated against false-positive rates in your environment before enabling in production.