Audit Logging and AI-SPM Integration
SCHEMABOUND provides a centralized, tamper-evident audit pipeline that captures every agent activity — user input, LLM responses, function arguments, SQL executed, execution time, and user identity — and exports it in formats suitable for enterprise SIEM platforms and specialized AI Security Posture Management (AI-SPM) tools.
Why This Matters
Agent-driven systems introduce risk that traditional application audit trails were not designed to capture: adversarial prompt injections, LLM jailbreak attempts, data exfiltration via template expressions, and unauthorized role escalation through natural language. The SCHEMABOUND audit pipeline treats these as first-class security signals alongside the conventional access-control and query-execution events.
Architecture
The audit system is composed of four interlocking parts:
Client Request
         │
         ▼
┌─────────────────────┐
│ InjectionGuard      │  ← scans input before gRPC executor sees it
│ InputScannerHook    │
└────────┬────────────┘
         │  PromptInjectionSignalRaised (if pattern matches)
         ▼
┌─────────────────────┐
│ EventBus            │  ← global ordered handler + exporter chain
│ (hash chain)        │
└────────┬────────────┘
         │  AuditEventEnvelope (sequence, hash, trace_id, …)
         ▼
┌─────────────────────────────────────────────────────┐
│ MultiTransportExporter                              │
│  ├─ stderr (OCSF NDJSON or raw JSON)                │
│  ├─ HTTP webhook (SCHEMABOUND_AUDIT_WEBHOOK_URL)    │
│  └─ rotating file (SCHEMABOUND_AUDIT_FILE_PATH)     │
└─────────────────────────────────────────────────────┘
Tamper-Evident Hash Chain
Every exported audit envelope carries SHA-256 chain fields that link it to the event before it. This lets a downstream SIEM detect gaps or mutations in the audit stream.
| Field | Description |
|---|---|
| `sequence` | Monotonically increasing counter across the process lifetime |
| `prev_hash` | The `hash` of the previous envelope (the SHA-256 of its hash input) |
| `hash` | SHA-256 of `"{sequence}\|{prev_hash}\|{event_json}\|{emitted_at}"` |
| `emitted_at` | ISO-8601 timestamp at point of dispatch |
| `trace_id` | W3C trace-context trace ID for distributed tracing correlation |
| `span_id` | W3C trace-context span ID |
A missing sequence number or a hash that does not chain correctly from the previous record is strong evidence of log tampering.
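The consumer-side check can be sketched as follows. This is a minimal illustration, not SCHEMABOUND's own verifier: the `Envelope` struct, `ChainError`, and `verify_chain` are hypothetical names, the sequence is assumed to advance by exactly one per record, and hashes are assumed to be hex strings. The Rust standard library has no SHA-256, so the digest function is passed in by the caller (for example, one built on the `sha2` crate).

```rust
/// Chain fields of one exported envelope, as listed in the table above.
/// `event_json` is assumed to be the exact serialized string that was
/// hashed; re-serializing it may change bytes and break verification.
#[derive(Debug, Clone)]
pub struct Envelope {
    pub sequence: u64,
    pub prev_hash: String,
    pub hash: String,
    pub event_json: String,
    pub emitted_at: String,
}

#[derive(Debug, PartialEq, Eq)]
pub enum ChainError {
    /// A sequence number was skipped or repeated.
    SequenceGap { expected: u64, found: u64 },
    /// `prev_hash` does not equal the previous record's `hash`.
    BrokenLink { sequence: u64 },
    /// The recomputed hash differs from the stored `hash`.
    HashMismatch { sequence: u64 },
}

/// The documented hash input: "{sequence}|{prev_hash}|{event_json}|{emitted_at}".
pub fn hash_input(e: &Envelope) -> String {
    format!("{}|{}|{}|{}", e.sequence, e.prev_hash, e.event_json, e.emitted_at)
}

/// Verifies a slice of consecutive envelopes. `sha256_hex` must return the
/// hex SHA-256 digest of its input in the same casing the exporter uses.
pub fn verify_chain<F>(envelopes: &[Envelope], sha256_hex: F) -> Result<(), ChainError>
where
    F: Fn(&str) -> String,
{
    let mut prev: Option<&Envelope> = None;
    for e in envelopes {
        if let Some(p) = prev {
            let expected = p.sequence + 1;
            if e.sequence != expected {
                return Err(ChainError::SequenceGap { expected, found: e.sequence });
            }
            if e.prev_hash != p.hash {
                return Err(ChainError::BrokenLink { sequence: e.sequence });
            }
        }
        if sha256_hex(&hash_input(e)) != e.hash {
            return Err(ChainError::HashMismatch { sequence: e.sequence });
        }
        prev = Some(e);
    }
    Ok(())
}
```

The first record in the slice is only checked against its own hash, because the value of `prev_hash` for the very first event of a process is not specified here.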
OCSF v1.1 Output
All exports default to OCSF (Open Cybersecurity Schema Framework) v1.1. OCSF is an interchange format supported by SIEM and AI-SPM platforms including Amazon Security Lake, Microsoft Sentinel, Splunk, and Wiz.
Class Mapping
| SCHEMABOUND Event | OCSF Class | Class UID |
|---|---|---|
| `SessionRegistered` | Authentication | 2001 |
| `AccessDenied`, `PromptInjectionSignalRaised` | Security Finding | 2004 |
| `QueryExecuted`, `QueryValidation*`, `QueryExecutionError`, `RowsFiltered`, `ColumnsRedacted` | Database Activity | 6003 |
| `PlanCreated`, `PlanCompleted`, `PlanFailed`, `LlmToolCallAuditRecorded` | API Activity | 6005 |
Severity Mapping
| OCSF Severity | SCHEMABOUND Trigger |
|---|---|
| Critical (5) | Plan failure with execution error |
| High (4) | `PromptInjectionSignalRaised` with severity `high`; `AccessDenied` events |
| Medium (3) | `PromptInjectionSignalRaised` with severity `medium` |
| Informational (1) | All other events |
The unmapped OCSF field carries SCHEMABOUND-specific chain fields (roam_hash,
roam_prev_hash, roam_sequence) that have no direct OCSF equivalent but are required
for continuity verification. When trace correlation is present it is emitted under
metadata.trace_uid and metadata.span_uid.
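For consumers that want to sanity-check incoming records, the two mapping tables can be written out as lookup functions. This is a sketch that copies the values from the tables as documented; the function names and `InjectionSeverity` are hypothetical, `QueryValidation*` is read as a name prefix, and "plan failure with execution error" is assumed to mean a `PlanFailed` event that carries an execution error. It is worth checking the class UIDs against the published OCSF class list for your target schema version before relying on them in a parser.

```rust
/// Severity attached to a `PromptInjectionSignalRaised` event.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum InjectionSeverity {
    Medium,
    High,
}

/// Class UID for a SCHEMABOUND event name, following the class mapping table.
/// Returns `None` for names the table does not list.
pub fn ocsf_class_uid(event: &str) -> Option<u32> {
    match event {
        "SessionRegistered" => Some(2001),
        "AccessDenied" | "PromptInjectionSignalRaised" => Some(2004),
        "QueryExecuted" | "QueryExecutionError" | "RowsFiltered" | "ColumnsRedacted" => Some(6003),
        // Assumption: the `*` in `QueryValidation*` is a prefix wildcard.
        name if name.starts_with("QueryValidation") => Some(6003),
        "PlanCreated" | "PlanCompleted" | "PlanFailed" | "LlmToolCallAuditRecorded" => Some(6005),
        _ => None,
    }
}

/// OCSF severity ID for an event, following the severity mapping table.
pub fn ocsf_severity_id(
    event: &str,
    injection: Option<InjectionSeverity>,
    has_execution_error: bool,
) -> u8 {
    match (event, injection) {
        ("PlanFailed", _) if has_execution_error => 5, // Critical
        ("AccessDenied", _) => 4,                      // High
        ("PromptInjectionSignalRaised", Some(InjectionSeverity::High)) => 4,
        ("PromptInjectionSignalRaised", Some(InjectionSeverity::Medium)) => 3,
        _ => 1, // Informational
    }
}
```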
Transport Configuration
All transports are configured via environment variables and can be combined.
Stdout / Stderr
SCHEMABOUND_AUDIT_STDOUT=ocsf # emit OCSF v1.1 NDJSON to stderr (default)
SCHEMABOUND_AUDIT_STDOUT=json # emit raw AuditEventEnvelope JSON to stderr
SCHEMABOUND_AUDIT_STDOUT=off # disable stderr output
HTTP Webhook
SCHEMABOUND_AUDIT_WEBHOOK_URL=https://siem.example.com/ingest
Each OCSF record is sent in its own HTTP POST with `Content-Type: application/x-ndjson`.
Delivery is fire-and-forget: failures are silently discarded so that the request path is
never blocked. If delivery guarantees are required, send to an internal aggregation
endpoint (Fluent Bit, Logstash, Vector) that buffers and retries.
Rotating File
SCHEMABOUND_AUDIT_FILE_PATH=/var/log/roam/audit.ndjson
SCHEMABOUND_AUDIT_FILE_MAX_MB=100 # rotate at 100 MB (default)
The file transport appends one OCSF record per line (NDJSON). When the file reaches
`SCHEMABOUND_AUDIT_FILE_MAX_MB` it is renamed to `<path>.1` and a new file is opened.
Only one rotated generation is kept; integrate with a log shipper for longer retention.
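To make the defaults and combinations concrete, the sketch below reads the environment variables documented in this section into a single value. `TransportConfig`, `StderrFormat`, and `load_config` are hypothetical names, and rejecting an unrecognized `SCHEMABOUND_AUDIT_STDOUT` value is an assumption; the real implementation may fall back to a default instead.

```rust
use std::path::PathBuf;

#[derive(Debug, PartialEq, Eq)]
pub enum StderrFormat {
    Ocsf,
    Json,
    Off,
}

#[derive(Debug, PartialEq, Eq)]
pub struct TransportConfig {
    pub stderr: StderrFormat,
    pub webhook_url: Option<String>,
    /// File path and rotation threshold in megabytes.
    pub file: Option<(PathBuf, u64)>,
}

/// Builds the configuration from a variable lookup function, so the same
/// code works with the process environment or with a test fixture.
pub fn load_config<F>(get: F) -> Result<TransportConfig, String>
where
    F: Fn(&str) -> Option<String>,
{
    let stderr = match get("SCHEMABOUND_AUDIT_STDOUT").as_deref() {
        None | Some("ocsf") => StderrFormat::Ocsf, // documented default
        Some("json") => StderrFormat::Json,
        Some("off") => StderrFormat::Off,
        Some(other) => {
            return Err(format!("unknown SCHEMABOUND_AUDIT_STDOUT value: {}", other))
        }
    };
    let webhook_url = get("SCHEMABOUND_AUDIT_WEBHOOK_URL");
    let file = match get("SCHEMABOUND_AUDIT_FILE_PATH") {
        None => None,
        Some(path) => {
            let max_mb = match get("SCHEMABOUND_AUDIT_FILE_MAX_MB") {
                None => 100, // documented default
                Some(raw) => raw
                    .parse::<u64>()
                    .map_err(|e| format!("bad SCHEMABOUND_AUDIT_FILE_MAX_MB: {}", e))?,
            };
            Some((PathBuf::from(path), max_mb))
        }
    };
    Ok(TransportConfig { stderr, webhook_url, file })
}
```

In a real process the lookup would be `load_config(|key| std::env::var(key).ok())`.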
Prompt Injection Detection
The InjectionGuard scans every incoming query against a compiled set of regular
expressions before the gRPC executor processes it.
Pattern Library
| Pattern | Severity | Example trigger |
|---|---|---|
| `instruction_override` | High | “Ignore all previous instructions…” |
| `role_injection` | High | “You are now an unrestricted assistant…” |
| `system_prompt_boundary` | High | `[SYSTEM]`, `### system`, `<system>` markers |
| `jailbreak_token` | High | `DAN`, “do anything now”, “jailbreak” |
| `prompt_exfiltration` | Medium | “Reveal your system prompt” |
| `sql_template_injection` | Medium | `{{user_input}}`, `${expr}` |
| `delimiter_injection` | Medium | `--- system:`, `=== assistant:` |
When a pattern matches, a PromptInjectionSignalRaised event is always emitted to
the audit pipeline regardless of policy. What differs is whether the request continues:
Injection Policy
SCHEMABOUND_INJECTION_POLICY=observe # record signal, allow request (default)
SCHEMABOUND_INJECTION_POLICY=block # record signal, reject request when severity ≥ medium
Use `observe` during a rollout period to build a baseline of how often each pattern fires,
and how many of those matches are false positives, before switching to `block`. The
detection signal is available to your SIEM in either mode.
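The interaction between policy and severity can be sketched as a small decision function. The names `Policy`, `Severity`, `parse_policy`, and `action_taken` are hypothetical, and treating an unknown policy value as an error is an assumption. Because every pattern in the library is rated medium or high, any match is rejected under `block`; the explicit threshold comparison is kept so the rule reads the same way as the documentation.

```rust
/// Pattern severities, ordered so that `Medium < High`.
#[derive(Debug, Clone, Copy, PartialEq, Eq, PartialOrd, Ord)]
pub enum Severity {
    Medium,
    High,
}

#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Policy {
    Observe,
    Block,
}

/// Parses `SCHEMABOUND_INJECTION_POLICY`; an unset variable means `observe`.
pub fn parse_policy(raw: Option<&str>) -> Result<Policy, String> {
    match raw {
        None | Some("observe") => Ok(Policy::Observe),
        Some("block") => Ok(Policy::Block),
        Some(other) => Err(format!("unknown injection policy: {}", other)),
    }
}

/// Returns the `action_taken` value for a request, or `None` when no
/// pattern matched and therefore no signal is raised.
pub fn action_taken(policy: Policy, matched: &[Severity]) -> Option<&'static str> {
    let highest = matched.iter().copied().max()?;
    let blocked = policy == Policy::Block && highest >= Severity::Medium;
    Some(if blocked { "blocked" } else { "observed" })
}
```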
What Gets Logged
The emitted PromptInjectionSignalRaised event includes:
- `excerpt`: first 500 characters of the input (truncated to limit PII surface)
- `input_hash`: SHA-256 of the full input string for forensic correlation
- `patterns`: list of matched pattern names
- `severity`: highest matched severity
- `action_taken`: `"observed"` or `"blocked"`
- All standard `QueryRuntimeContext` metadata (`user_id`, `org_id`, `session_id`, …)
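A rough picture of that payload, with the truncation step spelled out, might look like the following. The struct layout and constructor are illustrative only; in particular, counting "characters" as Unicode scalar values rather than bytes is an assumption, and the SHA-256 digest is supplied by the caller because the standard library does not provide one.

```rust
/// Maximum number of characters kept in `excerpt`.
const EXCERPT_CHARS: usize = 500;

/// Returns at most the first 500 characters of `input`, cutting on a
/// character boundary so multi-byte UTF-8 sequences are never split.
pub fn excerpt(input: &str) -> &str {
    match input.char_indices().nth(EXCERPT_CHARS) {
        Some((byte_idx, _)) => &input[..byte_idx],
        None => input,
    }
}

/// Hypothetical shape of the signal fields listed above
/// (the `QueryRuntimeContext` metadata is omitted).
#[derive(Debug)]
pub struct PromptInjectionSignal {
    pub excerpt: String,
    pub input_hash: String,
    pub patterns: Vec<String>,
    pub severity: String,
    pub action_taken: String,
}

impl PromptInjectionSignal {
    /// `input_hash` is the caller-computed SHA-256 of the full input.
    pub fn new(
        input: &str,
        input_hash: String,
        patterns: Vec<String>,
        severity: &str,
        blocked: bool,
    ) -> Self {
        let action = if blocked { "blocked" } else { "observed" };
        Self {
            excerpt: excerpt(input).to_owned(),
            input_hash,
            patterns,
            severity: severity.to_owned(),
            action_taken: action.to_owned(),
        }
    }
}
```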
Distributed Tracing Correlation
SCHEMABOUND accepts W3C trace-context identifiers in the request headers below and propagates them into every audit envelope. This allows audit records to be joined with OTLP spans in your observability platform.
| Header | Purpose |
|---|---|
| `x-schemabound-trace-id` | W3C trace ID for the distributed trace |
| `x-schemabound-span-id` | W3C span ID for the current operation |
Both values appear in the AuditEventEnvelope (trace_id, span_id fields) and in
the unmapped section of every OCSF record.
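A client or proxy that sets these headers may want to validate the identifiers first. The sketch below applies the W3C Trace Context format rules (a trace ID is 32 lowercase hex characters, a span ID is 16, and neither may be all zeros). Whether SCHEMABOUND itself rejects malformed values is not stated here, so the validation and the names `TraceContext` and `trace_context_from_headers` are assumptions.

```rust
/// True when `value` is exactly `len` lowercase hex digits and not all zeros.
fn is_lower_hex_id(value: &str, len: usize) -> bool {
    value.len() == len
        && value.bytes().all(|b| matches!(b, b'0'..=b'9' | b'a'..=b'f'))
        && value.bytes().any(|b| b != b'0')
}

#[derive(Debug, Default, PartialEq, Eq)]
pub struct TraceContext {
    pub trace_id: Option<String>,
    pub span_id: Option<String>,
}

/// Extracts the trace and span IDs from `(name, value)` header pairs.
/// Header names are compared case-insensitively; invalid values are ignored.
pub fn trace_context_from_headers(headers: &[(&str, &str)]) -> TraceContext {
    let mut ctx = TraceContext::default();
    for &(name, value) in headers {
        if name.eq_ignore_ascii_case("x-schemabound-trace-id") && is_lower_hex_id(value, 32) {
            ctx.trace_id = Some(value.to_string());
        } else if name.eq_ignore_ascii_case("x-schemabound-span-id") && is_lower_hex_id(value, 16) {
            ctx.span_id = Some(value.to_string());
        }
    }
    ctx
}
```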
HTTP Request Audit (AuditFairing)
In addition to gRPC-level events, the SCHEMABOUND backend attaches an AuditFairing to every
HTTP request. This records:
- request method and path
- response status code
- measured request duration in milliseconds
- user identity headers (`x-schemabound-user-id`, `x-schemabound-organization-id`)
- trace ID from `x-schemabound-trace-id`
These records are emitted as LlmToolCallAuditRecorded events (OCSF class 6005 — API
Activity) and flow through the same MultiTransportExporter as all other audit events.
Registering an Audit Exporter
Any process that embeds the SCHEMABOUND event bus can attach additional exporters:
use std::sync::Arc;

use async_trait::async_trait;
use schemabound::{get_event_bus, AuditEventEnvelope, AuditExporter};

struct MySiemExporter;

#[async_trait]
impl AuditExporter for MySiemExporter {
    async fn export(&self, envelope: AuditEventEnvelope) {
        // Serialize `envelope` and forward it to your SIEM.
    }
}

// `?` needs a fallible caller. The boxed error type is an assumption about
// what `register_audit_exporter` returns.
fn register_exporters() -> Result<(), Box<dyn std::error::Error>> {
    get_event_bus().register_audit_exporter(Arc::new(MySiemExporter))?;
    Ok(())
}
Exporters are called concurrently after every dispatched event. They do not appear in the synchronous handler chain and cannot block or short-circuit event processing.
SIEM Integration Notes
Amazon Security Lake (OCSF native)
SCHEMABOUND OCSF records are compatible with Security Lake’s custom source ingestion. Point
SCHEMABOUND_AUDIT_WEBHOOK_URL at a Firehose delivery stream configured for OCSF v1.1.
Splunk
Use the Splunk HEC endpoint with SCHEMABOUND_AUDIT_WEBHOOK_URL. The OCSF JSON structure
maps directly to Splunk’s _raw field with the CIM-compatible security_finding
sourcetype.
Microsoft Sentinel
Route the NDJSON file output with the Azure Monitor Agent using the OCSF table schema, or use the webhook transport targeting a Data Collection Endpoint.
AI-SPM Vendors (Wiz, Lacework, Orca)
These vendors consume OCSF class 2004 (Security Finding) records for AI-specific threat
modelling. The PromptInjectionSignalRaised events with severity, pattern names, and
input hash give AI-SPM tools the raw material they need to build attack timeline views
and posture scoring.
Security Considerations
- Input truncation: only the first 500 characters of a matched query are stored in the audit record. The full input is never persisted; only its SHA-256 hash is retained for forensic correlation.
- Fire-and-forget webhook: transport failures are silently discarded. Use a local sidecar aggregator (Fluent Bit, Vector) if delivery guarantees are a hard requirement.
- Hash chain integrity: chain verification is the consumer’s responsibility. SCHEMABOUND provides the chain fields; the SIEM or audit consumer should alert on gaps.
- Policy default is `observe`: SCHEMABOUND does not block traffic by default. Switching to `block` is an explicit operational decision that must be validated against false-positive rates in your environment before it is enabled in production.