Audit Logging and AI-SPM Integration

SCHEMABOUND provides a centralized, tamper-evident audit pipeline that captures every agent activity — user input, LLM responses, function arguments, SQL executed, execution time, and user identity — and exports it in formats suitable for enterprise SIEM platforms and specialized AI Security Posture Management (AI-SPM) tools.

Why This Matters

Agent-driven systems introduce risk that traditional application audit trails were not designed to capture: adversarial prompt injections, LLM jailbreak attempts, data exfiltration via template expressions, and unauthorized role escalation through natural language. The SCHEMABOUND audit pipeline treats these as first-class security signals alongside the conventional access-control and query-execution events.


Architecture

The audit system is composed of four interlocking parts:

 Client Request
       │
       ▼
 ┌─────────────────────┐
 │  InjectionGuard     │  ← scans input before gRPC executor sees it
 │  InputScannerHook   │
 └────────┬────────────┘
          │ PromptInjectionSignalRaised (if pattern matches)
          ▼
 ┌─────────────────────┐
 │  EventBus           │  ← global ordered handler + exporter chain
 │  (hash chain)       │
 └────────┬────────────┘
          │ AuditEventEnvelope (sequence, hash, trace_id, …)
          ▼
 ┌─────────────────────────────────────────────────────┐
 │  MultiTransportExporter                             │
 │  ├─ stderr (OCSF NDJSON or raw JSON)               │
 │  ├─ HTTP webhook  (SCHEMABOUND_AUDIT_WEBHOOK_URL)          │
 │  └─ rotating file (SCHEMABOUND_AUDIT_FILE_PATH)            │
 └─────────────────────────────────────────────────────┘

Tamper-Evident Hash Chain

Every exported audit envelope carries a SHA-256 hash chain that links each event to the one before it. This makes it possible for a downstream SIEM to detect gaps or mutations in the audit stream.

| Field | Description |
|---|---|
| sequence | Monotonically increasing counter across the process lifetime |
| prev_hash | hash value of the previous envelope (the SHA-256 of that envelope's hash input) |
| hash | SHA-256 of "{sequence}|{prev_hash}|{event_json}|{emitted_at}" |
| emitted_at | ISO-8601 timestamp at point of dispatch |
| trace_id | W3C trace-context trace ID for distributed tracing correlation |
| span_id | W3C trace-context span ID |

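A missing sequence number, or a hash that does not chain correctly from the previous record, means the stream was altered or records were lost in transit. Treat either condition as strong evidence of log tampering until it is explained.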
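The checks a consumer runs against the chain can be sketched as follows. This is a minimal illustration, not the SCHEMABOUND implementation: the ChainRecord struct, the ChainError enum, and the verify_chain function are hypothetical names, and the sketch assumes that sequence increases by exactly one per record and that event_json is available byte-for-byte as it was hashed. The digest is passed in as a function so the block stays free of external crates; in practice it would be a SHA-256 hex digest.

```rust
/// Chain-relevant fields of one exported envelope, named after the field table.
#[derive(Debug, Clone)]
pub struct ChainRecord {
    pub sequence: u64,
    pub prev_hash: String,
    pub hash: String,
    pub event_json: String,
    pub emitted_at: String,
}

#[derive(Debug, PartialEq, Eq)]
pub enum ChainError {
    /// The stored hash does not match the digest of the record's own hash input.
    HashMismatch { sequence: u64 },
    /// A sequence number was skipped or repeated.
    SequenceGap { expected: u64, found: u64 },
    /// prev_hash does not equal the hash of the preceding record.
    BrokenLink { sequence: u64 },
}

/// Rebuilds the documented hash input:
/// "{sequence}|{prev_hash}|{event_json}|{emitted_at}".
pub fn hash_input(r: &ChainRecord) -> String {
    format!("{}|{}|{}|{}", r.sequence, r.prev_hash, r.event_json, r.emitted_at)
}

/// Walks the records in order and returns the first inconsistency found.
/// `digest_hex` is assumed to return the hex digest of its input
/// (SHA-256 in a real consumer).
pub fn verify_chain<F>(records: &[ChainRecord], digest_hex: F) -> Result<(), ChainError>
where
    F: Fn(&str) -> String,
{
    let mut prev: Option<&ChainRecord> = None;
    for rec in records {
        if digest_hex(&hash_input(rec)) != rec.hash {
            return Err(ChainError::HashMismatch { sequence: rec.sequence });
        }
        if let Some(p) = prev {
            if rec.sequence != p.sequence + 1 {
                return Err(ChainError::SequenceGap {
                    expected: p.sequence + 1,
                    found: rec.sequence,
                });
            }
            if rec.prev_hash != p.hash {
                return Err(ChainError::BrokenLink { sequence: rec.sequence });
            }
        }
        prev = Some(rec);
    }
    Ok(())
}
```

A consumer that tails the stream continuously would keep the last verified record and feed each new batch through the same checks, alerting on the first error.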


OCSF v1.1 Output

All exports default to OCSF (Open Cybersecurity Schema Framework) v1.1. OCSF is the interchange format used by major SIEM and AI-SPM vendors including Amazon Security Lake, Microsoft Sentinel, Splunk, and Wiz.

Class Mapping

| SCHEMABOUND Event | OCSF Class | Class UID |
|---|---|---|
| SessionRegistered | Authentication | 2001 |
| AccessDenied, PromptInjectionSignalRaised | Security Finding | 2004 |
| QueryExecuted, QueryValidation*, QueryExecutionError, RowsFiltered, ColumnsRedacted | Database Activity | 6003 |
| PlanCreated, PlanCompleted, PlanFailed, LlmToolCallAuditRecorded | API Activity | 6005 |

Severity Mapping

| OCSF Severity | SCHEMABOUND Trigger |
|---|---|
| Critical (5) | Plan failure with execution error |
| High (4) | PromptInjectionSignalRaised with severity high, access denied events |
| Medium (3) | PromptInjectionSignalRaised with severity medium |
| Informational (1) | All other events |

The unmapped OCSF field carries SCHEMABOUND-specific chain fields (roam_hash, roam_prev_hash, roam_sequence) that have no direct OCSF equivalent but are required for continuity verification. When trace correlation is present it is emitted under metadata.trace_uid and metadata.span_uid.
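For consumers that want to reproduce or sanity-check the mapping, the class and severity tables can be expressed as two small lookup functions. This is only a sketch of the documented tables: ocsf_class_uid, ocsf_severity_id, and InjectionSeverity are hypothetical names, the QueryValidation* entry is read as a name prefix, and the has_execution_error flag is an assumed way of modelling "plan failure with execution error".

```rust
/// Severity levels that the pattern library assigns to injection signals.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum InjectionSeverity {
    Medium,
    High,
}

/// Returns the class UID from the class mapping table, or None for an
/// event name the table does not list.
pub fn ocsf_class_uid(event: &str) -> Option<u32> {
    match event {
        "SessionRegistered" => Some(2001),
        "AccessDenied" | "PromptInjectionSignalRaised" => Some(2004),
        "QueryExecuted" | "QueryExecutionError" | "RowsFiltered" | "ColumnsRedacted" => Some(6003),
        "PlanCreated" | "PlanCompleted" | "PlanFailed" | "LlmToolCallAuditRecorded" => Some(6005),
        // The table lists "QueryValidation*"; treated here as a prefix match.
        e if e.starts_with("QueryValidation") => Some(6003),
        _ => None,
    }
}

/// Returns the numeric OCSF severity from the severity mapping table.
pub fn ocsf_severity_id(
    event: &str,
    injection: Option<InjectionSeverity>,
    has_execution_error: bool,
) -> u8 {
    match (event, injection) {
        ("PlanFailed", _) if has_execution_error => 5,
        ("PromptInjectionSignalRaised", Some(InjectionSeverity::High)) => 4,
        ("AccessDenied", _) => 4,
        ("PromptInjectionSignalRaised", Some(InjectionSeverity::Medium)) => 3,
        // Everything else is Informational.
        _ => 1,
    }
}
```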


Transport Configuration

All transports are configured via environment variables and can be combined.

Stdout / Stderr

SCHEMABOUND_AUDIT_STDOUT=ocsf   # emit OCSF v1.1 NDJSON to stderr (default)
SCHEMABOUND_AUDIT_STDOUT=json   # emit raw AuditEventEnvelope JSON to stderr
SCHEMABOUND_AUDIT_STDOUT=off    # disable stderr output

HTTP Webhook

SCHEMABOUND_AUDIT_WEBHOOK_URL=https://siem.example.com/ingest

Each OCSF record is sent as its own HTTP POST with Content-Type: application/x-ndjson. Delivery is fire-and-forget: failures are silently discarded so that the request path is never blocked. If you need delivery guarantees, point the webhook at an internal aggregation endpoint (Fluent Bit, Logstash, Vector) that buffers and retries.

Rotating File

SCHEMABOUND_AUDIT_FILE_PATH=/var/log/roam/audit.ndjson
SCHEMABOUND_AUDIT_FILE_MAX_MB=100    # rotate at 100 MB (default)

The file transport appends one OCSF NDJSON record per line. When the file reaches SCHEMABOUND_AUDIT_FILE_MAX_MB, it is renamed to <path>.1 and a new file is opened. Only one rotated generation is kept; integrate with a log shipper for longer retention.
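The rotation rule described for the file transport can be sketched with the standard library alone. This is an illustration of the documented behaviour rather than the actual exporter: append_record and rotated_path are hypothetical names, the size limit is taken in bytes (how SCHEMABOUND converts the MB setting is not stated here), and the size check is assumed to happen before each write.

```rust
use std::fs::{self, OpenOptions};
use std::io::{self, Write};
use std::path::{Path, PathBuf};

/// Path of the single retained generation: "<path>.1".
fn rotated_path(path: &Path) -> PathBuf {
    let mut name = path.as_os_str().to_os_string();
    name.push(".1");
    PathBuf::from(name)
}

/// Appends one NDJSON record, rotating first if the current file has
/// already reached `max_bytes`.
pub fn append_record(path: &Path, record: &str, max_bytes: u64) -> io::Result<()> {
    // A missing file is not an error; it simply has nothing to rotate.
    let current_len = match fs::metadata(path) {
        Ok(meta) => Some(meta.len()),
        Err(e) if e.kind() == io::ErrorKind::NotFound => None,
        Err(e) => return Err(e),
    };
    if matches!(current_len, Some(len) if len >= max_bytes) {
        // fs::rename replaces an existing destination file, so only one
        // rotated generation survives.
        fs::rename(path, rotated_path(path))?;
    }
    let mut file = OpenOptions::new().create(true).append(true).open(path)?;
    file.write_all(record.as_bytes())?;
    file.write_all(b"\n")
}
```

Because the previous <path>.1 is overwritten on each rotation, a log shipper has to pick up the rotated file before the active file fills again.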


Prompt Injection Detection

The InjectionGuard scans every incoming query against a compiled set of regular expressions before the gRPC executor processes it.

Pattern Library

| Pattern | Severity | Example trigger |
|---|---|---|
| instruction_override | High | “Ignore all previous instructions…” |
| role_injection | High | “You are now an unrestricted assistant…” |
| system_prompt_boundary | High | [SYSTEM], ### system, <system> markers |
| jailbreak_token | High | DAN, “do anything now”, “jailbreak” |
| prompt_exfiltration | Medium | “Reveal your system prompt” |
| sql_template_injection | Medium | {{user_input}}, ${expr} |
| delimiter_injection | Medium | --- system:, === assistant: |

When a pattern matches, a PromptInjectionSignalRaised event is always emitted to the audit pipeline regardless of policy. What differs is whether the request continues:

Injection Policy

SCHEMABOUND_INJECTION_POLICY=observe   # record signal, allow request (default)
SCHEMABOUND_INJECTION_POLICY=block     # record signal, reject request when severity ≥ medium

Use observe during a rollout period to establish a baseline of true-positive and false-positive rates before switching to block. The detection signal reaches your SIEM in either mode.
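The effect of the two policy values can be summarised in a few lines. The sketch below is an assumption-laden illustration, not the InjectionGuard source: parse_policy, action_for, and the enum names are hypothetical, and rejecting unrecognised values is a design choice made for the example (the actual handling of invalid values is not documented here).

```rust
/// Severities used by the pattern library; derive(Ord) follows declaration
/// order, so Medium < High.
#[derive(Debug, Clone, Copy, PartialEq, Eq, PartialOrd, Ord)]
pub enum Severity {
    Medium,
    High,
}

#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum InjectionPolicy {
    Observe,
    Block,
}

#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum ActionTaken {
    Observed,
    Blocked,
}

impl ActionTaken {
    /// String form used in the audit event's action_taken field.
    pub fn as_str(self) -> &'static str {
        match self {
            ActionTaken::Observed => "observed",
            ActionTaken::Blocked => "blocked",
        }
    }
}

/// Parses the value of SCHEMABOUND_INJECTION_POLICY. An unset variable
/// falls back to the documented default, observe.
pub fn parse_policy(raw: Option<&str>) -> Result<InjectionPolicy, String> {
    match raw {
        None | Some("observe") => Ok(InjectionPolicy::Observe),
        Some("block") => Ok(InjectionPolicy::Block),
        Some(other) => Err(format!("unrecognised injection policy: {other}")),
    }
}

/// Decides what happens to a request whose input matched at `severity`.
/// The signal is recorded in both cases; only the outcome differs.
pub fn action_for(policy: InjectionPolicy, severity: Severity) -> ActionTaken {
    match policy {
        InjectionPolicy::Block if severity >= Severity::Medium => ActionTaken::Blocked,
        _ => ActionTaken::Observed,
    }
}
```

At startup the raw value could be read with `std::env::var("SCHEMABOUND_INJECTION_POLICY").ok().as_deref()` and passed to parse_policy. Since every pattern in the library is at least Medium, block mode rejects any matching request.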

What Gets Logged

The emitted PromptInjectionSignalRaised event includes:

  • excerpt: first 500 characters of the input (truncated to limit PII surface)
  • input_hash: SHA-256 of the full input string for forensic correlation
  • patterns: list of matched pattern names
  • severity: highest matched severity
  • action_taken: "observed" or "blocked"
  • All standard QueryRuntimeContext metadata (user_id, org_id, session_id, …)
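How the excerpt, patterns, and severity fields listed above might be derived is sketched below. The names excerpt, highest_severity, pattern_names, and PatternMatch are hypothetical, and "characters" is interpreted as Unicode scalar values, which is an assumption; the point of the example is that truncation has to land on a character boundary rather than a byte offset.

```rust
/// Severities used by the pattern library; derive(Ord) follows declaration
/// order, so Medium < High.
#[derive(Debug, Clone, Copy, PartialEq, Eq, PartialOrd, Ord)]
pub enum Severity {
    Medium,
    High,
}

/// One pattern that matched the input.
pub struct PatternMatch {
    pub name: &'static str,
    pub severity: Severity,
}

/// Maximum number of characters kept in the excerpt field.
pub const EXCERPT_CHARS: usize = 500;

/// Returns the first 500 characters of `input`, cut on a character
/// boundary so multi-byte UTF-8 sequences are never split.
pub fn excerpt(input: &str) -> &str {
    match input.char_indices().nth(EXCERPT_CHARS) {
        // Byte offset of the 501st character: everything before it is kept.
        Some((byte_idx, _)) => &input[..byte_idx],
        None => input,
    }
}

/// Highest severity among the matches, or None if nothing matched.
pub fn highest_severity(matches: &[PatternMatch]) -> Option<Severity> {
    matches.iter().map(|m| m.severity).max()
}

/// Names of the matched patterns, in match order.
pub fn pattern_names(matches: &[PatternMatch]) -> Vec<&'static str> {
    matches.iter().map(|m| m.name).collect()
}
```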

Distributed Tracing Correlation

SCHEMABOUND accepts W3C trace-context identifiers through the request headers below and propagates them into every audit envelope, so audit records can be joined with OTLP spans in your observability platform.

| Header | Purpose |
|---|---|
| x-schemabound-trace-id | W3C trace ID for the distributed trace |
| x-schemabound-span-id | W3C span ID for the current operation |

Both values appear in the AuditEventEnvelope (trace_id and span_id fields) and in every OCSF record as metadata.trace_uid and metadata.span_uid.
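Callers that set these headers may want to check the values first. The W3C Trace Context format defines a trace ID as 32 lowercase hexadecimal characters and a span ID as 16, neither of them all zeros. The sketch below applies those rules; is_valid_trace_id and is_valid_span_id are hypothetical helpers, and whether SCHEMABOUND itself rejects malformed values is not stated in this document.

```rust
/// Shared rule: exact length, lowercase hex only, and not all zeros.
fn is_valid_hex_id(value: &str, len: usize) -> bool {
    value.len() == len
        && value.bytes().all(|b| matches!(b, b'0'..=b'9' | b'a'..=b'f'))
        && value.bytes().any(|b| b != b'0')
}

/// Candidate value for x-schemabound-trace-id (16 bytes as 32 hex chars).
pub fn is_valid_trace_id(value: &str) -> bool {
    is_valid_hex_id(value, 32)
}

/// Candidate value for x-schemabound-span-id (8 bytes as 16 hex chars).
pub fn is_valid_span_id(value: &str) -> bool {
    is_valid_hex_id(value, 16)
}
```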


HTTP Request Audit (AuditFairing)

In addition to gRPC-level events, the SCHEMABOUND backend attaches an AuditFairing to every HTTP request. This records:

  • request method and path
  • response status code
  • measured request duration in milliseconds
  • user identity headers (x-schemabound-user-id, x-schemabound-organization-id)
  • trace ID from x-schemabound-trace-id

These records are emitted as LlmToolCallAuditRecorded events (OCSF class 6005 — API Activity) and flow through the same MultiTransportExporter as all other audit events.


Registering an Audit Exporter

Any process that embeds the SCHEMABOUND event bus can attach additional exporters:

use std::sync::Arc;

use async_trait::async_trait;
use schemabound::{get_event_bus, AuditEventEnvelope, AuditExporter};

struct MySiemExporter;

#[async_trait]
impl AuditExporter for MySiemExporter {
    async fn export(&self, envelope: AuditEventEnvelope) {
        // Serialize the envelope and forward it to your SIEM.
    }
}

// Registration is fallible, so call it from a function that can propagate
// the error with `?`. The boxed error type used here is an assumption.
fn main() -> Result<(), Box<dyn std::error::Error>> {
    get_event_bus().register_audit_exporter(Arc::new(MySiemExporter))?;
    Ok(())
}

Exporters are called concurrently after every dispatched event. They do not appear in the synchronous handler chain and cannot block or short-circuit event processing.


SIEM Integration Notes

Amazon Security Lake (OCSF native)

SCHEMABOUND OCSF records are compatible with Security Lake’s custom source ingestion. Point SCHEMABOUND_AUDIT_WEBHOOK_URL at a Firehose delivery stream configured for OCSF v1.1.

Splunk

Use the Splunk HEC endpoint with SCHEMABOUND_AUDIT_WEBHOOK_URL. The OCSF JSON structure maps directly to Splunk’s _raw field with the CIM-compatible security_finding sourcetype.

Microsoft Sentinel

Route the NDJSON file output with the Azure Monitor Agent using the OCSF table schema, or use the webhook transport targeting a Data Collection Endpoint.

AI-SPM Vendors (Wiz, Lacework, Orca)

These vendors consume OCSF class 2004 (Security Finding) records for AI-specific threat modelling. The PromptInjectionSignalRaised events with severity, pattern names, and input hash give AI-SPM tools the raw material they need to build attack timeline views and posture scoring.


Security Considerations

  • Input truncation: only the first 500 characters of a matched query are stored in the audit record. The full input is never persisted; only its SHA-256 hash is retained for forensic correlation.
  • Fire-and-forget webhook: transport failures are silently discarded. Use a local sidecar aggregator (Fluent Bit, Vector) if delivery guarantees are a hard requirement.
  • Hash chain integrity: chain verification is the consumer’s responsibility. SCHEMABOUND provides the chain fields; the SIEM or audit consumer should alert on gaps.
  • Policy default is observe: SCHEMABOUND does not block traffic by default. Switching to block is an explicit operational decision that must be validated against false-positive rates in your environment before enabling in production.