Introduction
OAM — Object Agent Mapping — is the framework for giving agents, services, and automation structured and policy-aware access to data.
Where an ORM (Object Relational Mapping) maps application objects to relational rows, OAM maps agent intent to data operations. The framework controls how an agent discovers, accesses, and works with data — and enforces those rules consistently across languages and deployment patterns.
SCHEMABOUND is the runtime that implements OAM. It provides identity-aware execution, policy enforcement, and agent-ready context across application and service boundaries.
Schema Modes
OAM defines three operating modes that determine how an agent interacts with data:
| Mode | Description | Access |
|---|---|---|
| Data-First | The agent discovers the database schema at runtime through introspection. No application model registration required. Best for exploring legacy or external databases. | Read-only |
| Code-First | Only tables explicitly registered by the application are accessible. The application controls validation and access rules. Best when the codebase owns the data. | Read-write |
| Hybrid | Registered models take precedence; unknown tables fall back to introspection. Provides coverage without sacrificing safety where code coverage ends. | Read-write for registered models; read-only via introspection |
Choose Data-First when agents need to explore data without committing to a code model. Choose Code-First when your application owns the data and must enforce validation rules. Use Hybrid when your codebase covers some tables but you still want introspection for the rest.
What This Book Covers
- Architecture explains how SCHEMABOUND fits into application, service, and event-driven systems.
- Runtime Context explains how request metadata and runtime augmentation travel with execution.
- Contributing explains how to propose changes to the public runtime, SDKs, and documentation.
- SDK Guides help you choose the best starting point for Python and .NET integrations.
Where SCHEMABOUND Fits
SCHEMABOUND is designed for teams that want to:
- add policy-aware execution to application and service workflows
- carry stable identity and organization context through runtime operations
- integrate agent-driven or automation-driven behavior without rewriting existing systems
- standardize public integration contracts across multiple languages
- capture a tamper-evident audit trail of every agent action for SIEM and AI-SPM ingestion
- detect and respond to prompt injection, jailbreak attempts, and adversarial inputs at the gateway
Operating Patterns
SCHEMABOUND typically appears in one of two patterns:
- Application-intercepted flows where SCHEMABOUND validates and enriches requests as they move through an API or service boundary.
- Event-driven flows where SCHEMABOUND observes or participates in runtime decisions driven by messages, RPC calls, or automation pipelines.
Quick Links
- schemabound-public for the public Rust core and shared runtime contract
- schemabound-python for Python integrations and automation workflows
- schemabound-dotnet for .NET services and typed enterprise integrations
Suggested Starting Path
- Start with Architecture Overview to understand the public runtime model.
- Read Runtime Context if you need request metadata and runtime-augmentation guidance.
- Choose your SDK: Python or .NET.
- Use API Reference when you are ready for package and protocol details.
Architecture Overview
SCHEMABOUND is designed as a public runtime layer that sits close to execution boundaries. It helps products and services carry identity, policy, and agent-aware context through application logic without forcing teams to redesign the rest of their stack.
The Object Agent Mapping Model
OAM gives the architecture a stable contract for how agents interact with data. Instead of giving an agent direct or unrestricted database access, OAM sits between the agent and the data layer and mediates what the agent can see and do based on the active schema mode.
Schema mode is selected at agent registration time and governs the entire session:
- In Data-First mode, the runtime introspects the live database so the agent can discover and query data without a pre-defined application model. Access is read-only.
- In Code-First mode, only explicitly registered application models are accessible. The application controls validation and data access rules, enabling safe read-write operations.
- In Hybrid mode, registered models take precedence for known tables and the runtime falls back to introspection for everything else. Access is read-only where code coverage ends.
All three modes carry identity and runtime context through the execution path so agent queries stay aligned with organizational policy regardless of which mode is active.
Public Runtime At A Glance
The public SCHEMABOUND surface is organized around three adoption layers:
- Core runtime for shared execution, reflection, and protocol behavior.
- Language SDKs for integrating SCHEMABOUND into Python and .NET applications.
- Shared protocol definitions for teams that need language-neutral contracts or generated bindings.
Public Building Blocks
Runtime Model
The runtime model gives SCHEMABOUND a consistent way to:
- interpret structured requests and tool-facing operations
- apply identity and organization context to execution
- attach runtime augmentation metadata before validation and execution
- emit audit-safe events and observable outcomes
SDK Layer
The Python and .NET SDKs package that runtime model into language-specific integration surfaces. They are the fastest path for teams that want to add SCHEMABOUND to existing products, services, and automation workflows.
Shared Contract
When teams need multi-language interoperability, generated clients, or direct protocol-level integration, the shared protobuf and gRPC contract provides the stable public boundary.
Where SCHEMABOUND Fits In The Stack
SCHEMABOUND usually appears in one of these roles:
- Request-path integration where a service or API validates and enriches execution context before work continues.
- Runtime coordination where a client or middleware layer passes identity, tool, and organization context into downstream execution.
- Event-driven integration where SCHEMABOUND participates in decisions triggered by messages, jobs, or automation pipelines.
Adoption Paths
Start With An SDK
Use an SDK when you want to move quickly inside an application stack. This is the best fit for:
- service teams integrating SCHEMABOUND into existing APIs
- platform teams standardizing runtime context across applications
- automation teams building product or operations workflows
Start With The Shared Contract
Use the protocol definitions when you want to:
- generate your own client bindings
- align multiple services around one public contract
- integrate from a language or platform that does not yet have a first-party SDK
Operating Modes
SCHEMABOUND supports both application-driven and event-driven execution patterns.
Application-Driven Mode
In application-driven flows, a user or upstream service initiates the request. SCHEMABOUND enriches or validates that request as it moves through an application or API boundary.
Event-Driven Mode
In event-driven flows, SCHEMABOUND participates when a message, RPC call, or background job creates a decision point that needs shared runtime context or policy-aware behavior.
Middleware Architecture
SCHEMABOUND middleware is the integration layer that lets products apply shared runtime context, identity-aware execution, and policy decisions close to the point where work actually happens.
Runtime Flow At The Boundary
At a high level, SCHEMABOUND middleware does three things for every participating request:
- establish who or what is acting
- attach the runtime context needed to make a safe decision
- pass only validated execution downstream
The diagram below shows the logical shape of that flow.
Request Flow Diagram
This diagram shows the abstract pipeline a request moves through before execution continues.
sequenceDiagram
participant Client as Client Application
participant API as Product API / Service Boundary
box "SCHEMABOUND Middleware Layer" #f9f9f9
participant Identity as Identity Context
participant Runtime as Runtime Interceptor
participant Policy as Policy Review
end
participant Data as Downstream Service / Data Layer
Note over Client, API: Request enters a SCHEMABOUND-enabled boundary
Client->>API: 1. Submit request
rect rgb(35, 35, 35)
Note over API, Identity: Layer 1: Identity and request context
API->>Identity: 2. Resolve identity and context
Identity->>Identity: Match trusted identity inputs
alt Identity Invalid
Identity-->>API: 401 Unauthorized
API-->>Client: Authentication error
else Identity Valid
Identity->>Runtime: 3. Attach runtime context
end
end
rect rgb(40, 35, 35)
Note over Runtime, Policy: Layer 2: Validation and policy review
Runtime->>Runtime: Interpret request intent
Runtime->>Policy: 4. Evaluate request
Policy->>Policy: Apply execution rules
alt Request Rejected
Policy-->>Runtime: Block execution
Runtime-->>API: 403 Forbidden
API-->>Client: Request rejected
else Request Approved
Policy-->>Runtime: 5. Continue
end
end
rect rgb(35, 40, 35)
Note over Runtime, Data: Layer 3: Downstream execution
Runtime->>Data: 6. Execute downstream work
Data-->>Runtime: 7. Return result
end
Runtime-->>API: 8. Format response
API-->>Client: 9. Final result
What This Layer Adds
- Identity-aware context so requests carry the organization, user, or tool information needed for safe execution.
- Interception at the right boundary so SCHEMABOUND can enrich or validate intent before it reaches core business logic.
- Policy-aware execution so only approved operations continue into downstream systems.
- Minimal disruption to existing systems so teams can integrate SCHEMABOUND without redesigning their application architecture.
Integration Models
SCHEMABOUND is designed to be protocol-agnostic and to sit as close as possible to the moment where intent becomes execution.
Model 1: Embedded Request Integration
This model fits products that already have an API layer and want SCHEMABOUND to participate in request handling without changing the rest of the application stack.
Common fit:
- existing APIs and service boundaries
- application teams adding runtime context and policy checks
- products that want SCHEMABOUND close to synchronous request handling
sequenceDiagram
participant User as End User (UI)
participant API as Product API
participant OAM as SCHEMABOUND Middleware
participant Runtime as SCHEMABOUND Runtime
participant DB as Data Layer
Note over User: 1. User action begins in the product
User->>API: 2. API Request
rect rgb(35, 35, 35)
Note over API, OAM: Embedded request boundary
API->>OAM: Intercept and enrich request
OAM->>OAM: Resolve identity and runtime context
par Runtime Coordination
OAM--)Runtime: Emit runtime event
and Request Validation
OAM->>OAM: Evaluate request policy
end
alt Valid
OAM->>API: Continue request
API->>DB: Execute application work
DB-->>API: Result
API-->>User: Response
else Rejected
OAM-->>User: 403 Rejected Request
end
end
Model 2: Proxy Or Sidecar Boundary
This model fits systems that do not expose a clean application middleware layer but still need a controlled integration boundary.
Common fit:
- legacy applications
- direct data-access clients
- environments that need interception outside the primary application codebase
sequenceDiagram
participant User as Existing Application
participant Proxy as SCHEMABOUND Proxy / Sidecar
participant Runtime as SCHEMABOUND Runtime
participant DB as Data Layer
participant Identity as Identity Source
Note over User: 1. Existing system issues a request
User->>Proxy: 2. Forward request to boundary layer
rect rgb(40, 35, 35)
Note over Proxy: Boundary interception
Proxy->>Identity: Resolve permissions and context
Proxy--)Runtime: Emit runtime event
Proxy->>Proxy: Evaluate request rules
alt Approved
Proxy->>DB: Execute downstream request
DB-->>Proxy: Rows
Proxy-->>User: Result
else Blocked
Proxy-->>User: Rejected Request
end
end
Model 3: Hybrid Runtime Placement
This model fits teams that need local execution boundaries but still want shared governance, visibility, or centralized coordination patterns.
Common fit:
- high-compliance deployments
- organizations with mixed hosted and self-managed infrastructure
- teams that need execution inside their own perimeter
Model 4: Fully Self-Hosted Runtime
This model fits teams that want full control of runtime placement and operational ownership.
Common fit:
- air-gapped or highly restricted environments
- self-managed platform teams
- development and experimentation workflows built entirely on the public stack
sequenceDiagram
participant Source as Product Event Source
participant OAM as Local SCHEMABOUND Runtime
participant Logic as Business Logic
participant DB as System of Record
Source->>OAM: Event / RPC Call
rect rgb(35, 35, 45)
Note over OAM: Local execution boundary
OAM->>OAM: Apply local validation and context
OAM->>Logic: Invoke Handler
end
Logic->>DB: Update State
DB-->>Logic: Success
Logic-->>OAM: Confirmation
OAM-->>Source: Result
Choosing The Right Boundary
Pick the model that keeps SCHEMABOUND closest to the boundary where your system already makes trust and execution decisions. For most teams, that means starting with an SDK or embedded middleware pattern and expanding only when deployment constraints require it.
Event Pipeline
The SCHEMABOUND event pipeline is implemented as a Chain of Responsibility over the global
EventBus. Every domain event — query execution, validation failures, session registration,
trigger fires — flows through a registered sequence of handlers before any subscribers see it.
Handlers are synchronous, ordered, and can short-circuit the chain by returning
HandleOutcome::Stop. The default pipeline always returns HandleOutcome::Continue so the
full chain runs for every event.
Handler Trait
#![allow(unused)]
fn main() {
use schemabound::interceptor::{Event, EventHandler, HandleOutcome};
pub struct MyHandler;
impl EventHandler for MyHandler {
fn handle(&self, event: &Event) -> HandleOutcome {
println!("[my-handler] {}", event.event_type());
HandleOutcome::Continue
}
}
}
Register with the global bus:
#![allow(unused)]
fn main() {
use schemabound::get_event_bus;
get_event_bus().register_handler(Box::new(MyHandler))?;
}
Handlers are invoked in registration order for every dispatched event.
Built-In Handlers (schemabound::handlers)
AuditLogHandler
Writes a one-line JSON audit entry to stderr for every event. This is always the
first handler registered in the default gRPC pipeline so that every domain event is
recorded before downstream processing.
#![allow(unused)]
fn main() {
use schemabound::{get_event_bus, AuditLogHandler};
get_event_bus().register_handler(Box::new(AuditLogHandler))?;
// → [audit] {"event_type":"QueryExecuted","db_identifier":"...","query":"..."}
}
QueryMetricsHandler
Tracks cumulative counts of QueryExecuted, QueryValidationFailed, and
QueryExecutionError events through lock-free atomic counters. Readable from any thread
at any point.
#![allow(unused)]
fn main() {
use schemabound::{get_event_bus, SharedHandler, handlers::QueryMetricsHandler};
use std::sync::Arc;
let metrics = Arc::new(QueryMetricsHandler::new());
get_event_bus().register_handler(Box::new(SharedHandler(Arc::clone(&metrics))))?;
// later — read from a health endpoint
let snap = metrics.snapshot();
println!("executed={} failures={} errors={}",
snap.queries_executed, snap.validation_failures, snap.execution_errors);
}
SessionActivityHandler
Counts SessionRegistered events so the gRPC layer can expose a live session counter
without a database round-trip.
#![allow(unused)]
fn main() {
use schemabound::{get_event_bus, SharedHandler, handlers::SessionActivityHandler};
use std::sync::Arc;
let activity = Arc::new(SessionActivityHandler::new());
get_event_bus().register_handler(Box::new(SharedHandler(Arc::clone(&activity))))?;
println!("active sessions: {}", activity.session_count());
}
SharedHandler<T>
A newtype wrapper that lets an Arc<T: EventHandler> be registered with the bus without
transferring ownership, so the same handle can be retained for metrics reads.
#![allow(unused)]
fn main() {
use schemabound::{get_event_bus, SharedHandler};
use std::sync::Arc;
// Arc retained for reading; a clone is registered with the bus (`my_handler` is an Arc<impl EventHandler>)
get_event_bus().register_handler(Box::new(SharedHandler(Arc::clone(&my_handler))))?;
}
Default Handler Chain
DefaultHandlerChain bundles QueryMetricsHandler and SessionActivityHandler into a
single struct with pre-made Arc handles — the recommended starting point for gRPC
services.
#![allow(unused)]
fn main() {
use schemabound::{AuditLogHandler, DefaultHandlerChain, SharedHandler, get_event_bus};
use std::sync::Arc;
let chain = DefaultHandlerChain::new();
let bus = get_event_bus();
bus.register_handler(Box::new(AuditLogHandler))?;
bus.register_handler(Box::new(SharedHandler(Arc::clone(&chain.query_metrics))))?;
bus.register_handler(Box::new(SharedHandler(Arc::clone(&chain.session_activity))))?;
// Retain chain for health probes
let queries = chain.query_metrics.snapshot().queries_executed;
let sessions = chain.session_activity.session_count();
}
Handler vs. Subscriber
| | Handler | Subscriber |
|---|---|---|
| API | register_handler | register_subscriber |
| Ordering | Explicit — registration order | Unordered |
| Short-circuit | HandleOutcome::Stop stops the chain | No chain to stop |
| Use cases | audit log, metrics, rate limiting | cache invalidation, notifications |
Use handlers when execution order or short-circuiting matters. Use subscribers for fire-and-forget side effects where order is irrelevant.
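The sketch below shows what a short-circuiting handler can look like. It is illustrative only: the handler name and the in-process counter are invented for this example, and production rate limiting belongs in `schemabound::rate_limit` rather than in an event handler.
#![allow(unused)]
fn main() {
use schemabound::get_event_bus;
use schemabound::interceptor::{Event, EventHandler, HandleOutcome};
use std::sync::atomic::{AtomicU64, Ordering};
// Hypothetical example: after `cap` events have been observed in this process,
// stop the chain so handlers registered later never see the current event.
pub struct EventCapHandler {
    seen: AtomicU64,
    cap: u64,
}
impl EventHandler for EventCapHandler {
    fn handle(&self, event: &Event) -> HandleOutcome {
        if self.seen.fetch_add(1, Ordering::Relaxed) >= self.cap {
            return HandleOutcome::Stop;
        }
        println!("[event-cap] {}", event.event_type());
        HandleOutcome::Continue
    }
}
get_event_bus().register_handler(Box::new(EventCapHandler {
    seen: AtomicU64::new(0),
    cap: 10_000,
}))?;
}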
Event Reference
All events are variants of the Event enum. The table below shows which events each
built-in handler processes:
| Event variant | AuditLogHandler | QueryMetricsHandler | SessionActivityHandler |
|---|---|---|---|
| `QueryExecuted` | ✓ | increments `queries_executed` | — |
| `QueryValidationFailed` | ✓ | increments `validation_failures` | — |
| `QueryExecutionError` | ✓ | increments `execution_errors` | — |
| `SessionRegistered` | ✓ | — | increments `sessions_registered` |
| `TriggerFired` | ✓ | — | — |
| `ModelChanged` | ✓ | — | — |
| `RuntimeAugmentationAuditRecorded` | ✓ | — | — |
| `LlmToolCallAuditRecorded` | ✓ | — | — |
| `PromptInjectionSignalRaised` | ✓ | — | — |
| (all others) | ✓ | — | — |
For full details on `LlmToolCallAuditRecorded` and `PromptInjectionSignalRaised`, including OCSF class mapping, transport configuration, and injection policy, see Audit Logging and AI-SPM Integration.
Audit Exporters
In addition to synchronous handlers, SCHEMABOUND provides an async exporter pipeline that runs
after every dispatch. Exporters receive an AuditEventEnvelope containing the full event,
a tamper-evident SHA-256 hash chain, and distributed trace correlation fields.
Register an exporter alongside handlers:
#![allow(unused)]
fn main() {
use schemabound::{get_event_bus, AuditExporter, AuditEventEnvelope};
use async_trait::async_trait;
use std::sync::Arc;
struct MyExporter;
#[async_trait]
impl AuditExporter for MyExporter {
async fn export(&self, envelope: AuditEventEnvelope) {
// forward to SIEM, file, or webhook
}
}
get_event_bus().register_audit_exporter(Arc::new(MyExporter))?;
}
The MultiTransportExporter in schemabound-backend is the reference implementation: it fans
events to stderr (OCSF NDJSON), an HTTP webhook, and a rotating log file, all driven
by environment variables.
Audit Logging and AI-SPM Integration
SCHEMABOUND provides a centralized, tamper-evident audit pipeline that captures every agent activity — user input, LLM responses, function arguments, SQL executed, execution time, and user identity — and exports it in formats suitable for enterprise SIEM platforms and specialized AI Security Posture Management (AI-SPM) tools.
Why This Matters
Agent-driven systems introduce risk that traditional application audit trails were not designed to capture: adversarial prompt injections, LLM jailbreak attempts, data exfiltration via template expressions, and unauthorized role escalation through natural language. The SCHEMABOUND audit pipeline treats these as first-class security signals alongside the conventional access-control and query-execution events.
Architecture
The audit system is composed of four interlocking parts:
Client Request
│
▼
┌─────────────────────┐
│ InjectionGuard │ ← scans input before gRPC executor sees it
│ InputScannerHook │
└────────┬────────────┘
│ PromptInjectionSignalRaised (if pattern matches)
▼
┌─────────────────────┐
│ EventBus │ ← global ordered handler + exporter chain
│ (hash chain) │
└────────┬────────────┘
│ AuditEventEnvelope (sequence, hash, trace_id, …)
▼
┌─────────────────────────────────────────────────────┐
│ MultiTransportExporter │
│ ├─ stderr (OCSF NDJSON or raw JSON) │
│ ├─ HTTP webhook (SCHEMABOUND_AUDIT_WEBHOOK_URL) │
│ └─ rotating file (SCHEMABOUND_AUDIT_FILE_PATH) │
└─────────────────────────────────────────────────────┘
Tamper-Evident Hash Chain
Every exported audit envelope carries a SHA-256 hash chain that links each event to the one before it. This makes it possible for a downstream SIEM to detect gaps or mutations in the audit stream.
| Field | Description |
|---|---|
| `sequence` | Monotonically increasing counter across the process lifetime |
| `prev_hash` | SHA-256 of the previous envelope’s hash input |
| `hash` | SHA-256 of `"{sequence}\|{prev_hash}\|{event_json}\|{emitted_at}"` |
| `emitted_at` | ISO-8601 timestamp at point of dispatch |
| `trace_id` | W3C trace-context trace ID for distributed tracing correlation |
| `span_id` | W3C trace-context span ID |
A missing sequence number or a hash that does not chain correctly from the previous record is strong evidence of log tampering.
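A downstream consumer can recompute each link from those fields. The sketch below assumes hex-encoded digests and uses the `sha2` and `hex` crates; adjust the encoding and field access to match how your SIEM stores the envelope.
#![allow(unused)]
fn main() {
use sha2::{Digest, Sha256};
// Recompute one link from the documented hash input
// "{sequence}|{prev_hash}|{event_json}|{emitted_at}" and compare it with the
// hash carried on the envelope. Hex encoding is an assumption of this sketch.
fn verify_link(
    sequence: u64,
    prev_hash: &str,
    event_json: &str,
    emitted_at: &str,
    expected_hash: &str,
) -> bool {
    let input = format!("{sequence}|{prev_hash}|{event_json}|{emitted_at}");
    hex::encode(Sha256::digest(input.as_bytes())) == expected_hash
}
}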
OCSF v1.1 Output
All exports default to OCSF (Open Cybersecurity Schema Framework) v1.1. OCSF is the interchange format used by major SIEM and AI-SPM vendors including Amazon Security Lake, Microsoft Sentinel, Splunk, and Wiz.
Class Mapping
| SCHEMABOUND Event | OCSF Class | Class UID |
|---|---|---|
| `SessionRegistered` | Authentication | 2001 |
| `AccessDenied`, `PromptInjectionSignalRaised` | Security Finding | 2004 |
| `QueryExecuted`, `QueryValidation*`, `QueryExecutionError`, `RowsFiltered`, `ColumnsRedacted` | Database Activity | 6003 |
| `PlanCreated`, `PlanCompleted`, `PlanFailed`, `LlmToolCallAuditRecorded` | API Activity | 6005 |
Severity Mapping
| OCSF Severity | SCHEMABOUND Trigger |
|---|---|
| Critical (5) | Plan failure with execution error |
| High (4) | PromptInjectionSignalRaised with severity high, access denied events |
| Medium (3) | PromptInjectionSignalRaised with severity medium |
| Informational (1) | All other events |
The `unmapped` OCSF field carries SCHEMABOUND-specific chain fields (`roam_hash`, `roam_prev_hash`, `roam_sequence`) that have no direct OCSF equivalent but are required for continuity verification. When trace correlation is present it is emitted under `metadata.trace_uid` and `metadata.span_uid`.
Transport Configuration
All transports are configured via environment variables and can be combined.
Stdout / Stderr
SCHEMABOUND_AUDIT_STDOUT=ocsf # emit OCSF v1.1 NDJSON to stderr (default)
SCHEMABOUND_AUDIT_STDOUT=json # emit raw AuditEventEnvelope JSON to stderr
SCHEMABOUND_AUDIT_STDOUT=off # disable stderr output
HTTP Webhook
SCHEMABOUND_AUDIT_WEBHOOK_URL=https://siem.example.com/ingest
One OCSF record per HTTP POST with Content-Type: application/x-ndjson. The request
is fire-and-forget — failures are silently discarded to keep the request path unblocked.
Use an internal aggregation endpoint (Fluent Bit, Logstash, Vector) to buffer and retry
if delivery guarantees are required.
Rotating File
SCHEMABOUND_AUDIT_FILE_PATH=/var/log/roam/audit.ndjson
SCHEMABOUND_AUDIT_FILE_MAX_MB=100 # rotate at 100 MB (default)
Appends one OCSF NDJSON record per line. When the file reaches SCHEMABOUND_AUDIT_FILE_MAX_MB
it is renamed to <path>.1 and a new file is opened. One generation of rotation is
kept; integrate with a log shipper for longer retention.
Prompt Injection Detection
The InjectionGuard scans every incoming query against a compiled set of regular
expressions before the gRPC executor processes it.
Pattern Library
| Pattern | Severity | Example trigger |
|---|---|---|
| `instruction_override` | High | “Ignore all previous instructions…” |
| `role_injection` | High | “You are now an unrestricted assistant…” |
| `system_prompt_boundary` | High | `[SYSTEM]`, `### system`, `<system>` markers |
| `jailbreak_token` | High | DAN, “do anything now”, “jailbreak” |
| `prompt_exfiltration` | Medium | “Reveal your system prompt” |
| `sql_template_injection` | Medium | `{{user_input}}`, `${expr}` |
| `delimiter_injection` | Medium | `--- system:`, `=== assistant:` |
When a pattern matches, a PromptInjectionSignalRaised event is always emitted to
the audit pipeline regardless of policy. What differs is whether the request continues:
Injection Policy
SCHEMABOUND_INJECTION_POLICY=observe # record signal, allow request (default)
SCHEMABOUND_INJECTION_POLICY=block # record signal, reject request when severity ≥ medium
Use observe during a rollout period to build a baseline of true-positive rates before
switching to block. The detection signal is available to your SIEM in either mode.
What Gets Logged
The emitted PromptInjectionSignalRaised event includes:
- `excerpt` — first 500 characters of the input (truncated to limit PII surface)
- `input_hash` — SHA-256 of the full input string for forensic correlation
- `patterns` — list of matched pattern names
- `severity` — highest matched severity
- `action_taken` — `"observed"` or `"blocked"`
- All standard `QueryRuntimeContext` metadata (`user_id`, `org_id`, `session_id`, …)
Distributed Tracing Correlation
SCHEMABOUND accepts W3C trace-context headers and propagates them into every audit envelope. This allows audit records to be joined with OTLP spans in your observability platform.
| Header | Purpose |
|---|---|
| `x-schemabound-trace-id` | W3C trace ID for the distributed trace |
| `x-schemabound-span-id` | W3C span ID for the current operation |
Both values appear in the AuditEventEnvelope (trace_id, span_id fields) and in
the unmapped section of every OCSF record.
HTTP Request Audit (AuditFairing)
In addition to gRPC-level events, the SCHEMABOUND backend attaches an AuditFairing to every
HTTP request. This records:
- request method and path
- response status code
- measured request duration in milliseconds
- user identity headers (`x-schemabound-user-id`, `x-schemabound-organization-id`)
- trace ID from `x-schemabound-trace-id`
These records are emitted as LlmToolCallAuditRecorded events (OCSF class 6005 — API
Activity) and flow through the same MultiTransportExporter as all other audit events.
Registering an Audit Exporter
Any process that embeds the SCHEMABOUND event bus can attach additional exporters:
#![allow(unused)]
fn main() {
use schemabound::{get_event_bus, AuditExporter, AuditEventEnvelope};
use async_trait::async_trait;
use std::sync::Arc;
struct MySiemExporter;
#[async_trait]
impl AuditExporter for MySiemExporter {
async fn export(&self, envelope: AuditEventEnvelope) {
// serialize and forward to your SIEM
}
}
get_event_bus().register_audit_exporter(Arc::new(MySiemExporter))?;
}
Exporters are called concurrently after every dispatched event. They do not appear in the synchronous handler chain and cannot block or short-circuit event processing.
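As a concrete illustration, the exporter below appends one JSON line per envelope to a local file. It is a sketch: it assumes `AuditEventEnvelope` implements `serde::Serialize` and uses `tokio::fs`, and the built-in rotating-file transport already covers this need in production.
#![allow(unused)]
fn main() {
use schemabound::{get_event_bus, AuditExporter, AuditEventEnvelope};
use async_trait::async_trait;
use std::sync::Arc;
use tokio::io::AsyncWriteExt;
struct NdjsonFileExporter {
    path: std::path::PathBuf,
}
#[async_trait]
impl AuditExporter for NdjsonFileExporter {
    async fn export(&self, envelope: AuditEventEnvelope) {
        // Best effort: exporters must never block or fail the request path.
        let Ok(line) = serde_json::to_string(&envelope) else { return };
        if let Ok(mut file) = tokio::fs::OpenOptions::new()
            .create(true)
            .append(true)
            .open(&self.path)
            .await
        {
            let _ = file.write_all(line.as_bytes()).await;
            let _ = file.write_all(b"\n").await;
        }
    }
}
get_event_bus().register_audit_exporter(Arc::new(NdjsonFileExporter {
    path: "audit-copy.ndjson".into(),
}))?;
}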
SIEM Integration Notes
Amazon Security Lake (OCSF native)
SCHEMABOUND OCSF records are compatible with Security Lake’s custom source ingestion. Point
SCHEMABOUND_AUDIT_WEBHOOK_URL at a Firehose delivery stream configured for OCSF v1.1.
Splunk
Use the Splunk HEC endpoint with SCHEMABOUND_AUDIT_WEBHOOK_URL. The OCSF JSON structure
maps directly to Splunk’s _raw field with the CIM-compatible security_finding
sourcetype.
Microsoft Sentinel
Route the NDJSON file output with the Azure Monitor Agent using the OCSF table schema, or use the webhook transport targeting a Data Collection Endpoint.
AI-SPM Vendors (Wiz, Lacework, Orca)
These vendors consume OCSF class 2004 (Security Finding) records for AI-specific threat
modelling. The PromptInjectionSignalRaised events with severity, pattern names, and
input hash give AI-SPM tools the raw material they need to build attack timeline views
and posture scoring.
Security Considerations
- Input truncation: only the first 500 characters of a matched query are stored in the audit record. The full input is never persisted; only its SHA-256 hash is retained for forensic correlation.
- Fire-and-forget webhook: transport failures are silently discarded. Use a local sidecar aggregator (Fluent Bit, Vector) if delivery guarantees are a hard requirement.
- Hash chain integrity: chain verification is the consumer’s responsibility. SCHEMABOUND provides the chain fields; the SIEM or audit consumer should alert on gaps.
- Policy default is `observe`: SCHEMABOUND does not block traffic by default. Switching to `block` is an explicit operational decision that must be validated against false-positive rates in your environment before enabling in production.
Runtime Context
Runtime context is how SCHEMABOUND keeps execution grounded in the real application state that surrounds a request. It gives clients and services a public way to attach stable metadata before validation and execution begin.
Why Runtime Context Matters
Runtime context helps SCHEMABOUND answer practical questions such as:
- which tool or product surface initiated this request
- which user or organization the request belongs to
- which domain tags or table scopes matter for this execution
- which runtime augmentation should be applied before downstream work continues
Without that context, a request may still be valid at the protocol level but incomplete from a product and governance perspective.
Runtime Augmentation
Runtime augmentation is the public mechanism for selecting additional execution context before a request is evaluated.
Clients can use:
- `x-schemabound-runtime-augmentation-id` when they already know the specific augmentation identifier
- `x-schemabound-runtime-augmentation-key` when they want to reference a stable application-facing key
Additional headers help SCHEMABOUND match the right augmentation and preserve the meaning of the request:
- `x-schemabound-tool-name`
- `x-schemabound-tool-intent`
- `x-schemabound-user-id`
- `x-schemabound-organization-id`
- `x-schemabound-domain-tags`
- `x-schemabound-table-names`
Distributed Tracing Headers
Two additional headers carry W3C trace-context identifiers for distributed tracing
correlation. When present, these values are propagated into every AuditEventEnvelope
and appear as metadata.trace_uid and metadata.span_uid in every OCSF audit record.
| Header | Purpose |
|---|---|
| `x-schemabound-trace-id` | W3C trace ID — identifies the distributed trace this request belongs to |
| `x-schemabound-span-id` | W3C span ID — identifies the current operation within the trace |
Send these headers when your upstream instrumentation (OpenTelemetry, Jaeger, or similar) already generates trace context. SCHEMABOUND will propagate them through execution events so audit records and OTLP spans can be joined in your observability platform.
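If your client talks to the gRPC surface through tonic, these headers are plain request metadata. The helper below is a sketch: the message type, channel setup, and every header value shown are placeholders.
#![allow(unused)]
fn main() {
use tonic::metadata::MetadataValue;
use tonic::Request;
// Attach identity, tool, and trace context to an outgoing request.
// `message` stands in for whatever request type your generated client expects.
fn with_runtime_context<T>(message: T) -> Request<T> {
    let mut request = Request::new(message);
    let md = request.metadata_mut();
    md.insert("x-schemabound-user-id", MetadataValue::from_static("alice"));
    md.insert("x-schemabound-organization-id", MetadataValue::from_static("acme"));
    md.insert("x-schemabound-tool-name", MetadataValue::from_static("report-builder"));
    md.insert("x-schemabound-trace-id", MetadataValue::from_static("4bf92f3577b34da6a3ce929d0e0e4736"));
    md.insert("x-schemabound-span-id", MetadataValue::from_static("00f067aa0ba902b7"));
    request
}
}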
What Clients Should Send
Send the smallest stable set of metadata that explains why the request exists and which business boundary it belongs to.
Good examples include:
- the name of the calling product surface or tool
- the user or service identity associated with the request
- the tenant or organization boundary
- domain tags that explain business meaning
- table or resource hints when the execution path depends on them
What SCHEMABOUND Emits
SCHEMABOUND emits resolved augmentation identity into normal query and runtime events so downstream systems can observe which public context was selected.
Sensitive rendered content is intentionally separated from generic event metadata and reserved for dedicated audit handling.
Integration Guidance
Use runtime context when you want SCHEMABOUND behavior to stay aligned with application intent rather than just raw transport details.
Typical uses include:
- attaching product identity in a multi-surface application
- carrying organization context through service-to-service calls
- selecting augmentation rules for automation or assistant workflows
- keeping audit and observability signals consistent across clients
Identity And BYOI
SCHEMABOUND follows a Bring Your Own Identity approach so teams can integrate with the identity systems they already trust instead of recreating users, roles, and organization structure from scratch.
What BYOI Looks Like In Practice
With BYOI, SCHEMABOUND aligns runtime behavior with your existing identity model by mapping external identity information into the public execution context.
That usually means carrying forward:
- organization or tenant boundaries
- user and service identity
- role or permission context
- capability or scope information that affects execution decisions
Why This Matters
Identity-aware execution helps teams:
- keep SCHEMABOUND aligned with existing access-control boundaries
- preserve organizational context across application and service calls
- reduce drift between product identity and runtime behavior
- support agent and automation workflows without inventing a parallel permission system
Common Identity Sources
SCHEMABOUND is well suited to identity models that originate from systems such as:
- enterprise directory providers
- source-control and collaboration platforms
- service-owned role and entitlement systems
- data-layer roles or scope definitions
The exact integration path can vary, but the goal stays the same: keep runtime decisions grounded in the identity model your organization already operates.
Identity In The Execution Path
Identity becomes most useful when it arrives with the request itself. In practice, that means SCHEMABOUND can use identity context to:
- interpret which organization or tenant owns the request
- understand which actor initiated the work
- choose the right runtime augmentation or policy path
- emit more meaningful, audit-safe runtime events
Integration Guidance
The best BYOI integrations keep identity signals stable, explicit, and close to the request boundary.
Start by identifying:
- which system is the source of truth for identity
- which parts of that identity must influence runtime decisions
- which fields need to travel through the public SCHEMABOUND headers or protocol surface
From there, use SCHEMABOUND to preserve that context consistently across clients, services, and execution paths.
Agent Memory
Agent Memory gives SCHEMABOUND sessions an isolated, auditable memory store. Each session accumulates its own history of observations, tool calls, and decisions — scoped so that one session never reads or contaminates another, and past context can always be retrieved exactly as it was recorded.
What It Provides
- Per-session isolation — memory written during one session is never visible to another
- Chronological history — entries are ordered and retrievable in the sequence they were recorded
- Reproducibility — prior agent observations, tool call results, and decisions remain accessible across the full lifetime of a session
- Prompt augmentation — session memory can be injected automatically into prompt hooks so agents carry prior context without the caller managing it manually
REST API
Sessions
GET /api/agent-sessions
Returns the list of all active sessions.
Response
{
"data": [
{
"id": "abc-123",
"created_at": "2026-04-19T10:00:00Z",
"last_seen_at": "2026-04-19T10:42:00Z"
}
]
}
Memory Entries
GET /api/agent-memory/:session_id
Returns all memory entries for a session in chronological order.
Response
{
"data": [
{
"entry_type": "tool_call",
"content": "{ \"tool\": \"query\", \"result\": \"...\" }",
"created_at": "2026-04-19T10:05:00Z"
}
]
}
POST /api/agent-memory/:session_id
Content-Type: application/json
{
"entry_type": "observation",
"content": "User confirmed the report looked correct."
}
Appends a new entry to the session’s memory store.
All responses follow the { data: ... } envelope used by the rest of the SCHEMABOUND API.
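The endpoints above can be exercised from any HTTP client. The sketch below uses `reqwest` (with its `json` feature) and `serde_json`; the base URL and session id are placeholders.
use serde_json::json;
#[tokio::main]
async fn main() -> Result<(), Box<dyn std::error::Error>> {
    let client = reqwest::Client::new();
    let base = "http://localhost:8000";
    // Append an observation to the session's memory store.
    client
        .post(format!("{base}/api/agent-memory/abc-123"))
        .json(&json!({
            "entry_type": "observation",
            "content": "User confirmed the report looked correct."
        }))
        .send()
        .await?
        .error_for_status()?;
    // Read the entries back in chronological order.
    let entries: serde_json::Value = client
        .get(format!("{base}/api/agent-memory/abc-123"))
        .send()
        .await?
        .json()
        .await?;
    println!("{entries}");
    Ok(())
}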
Prompt Hook Augmentation
When a prompt hook resolve request includes a session_id, SCHEMABOUND automatically fetches the session’s memory entries and makes them available as {{memory_context}} inside the hook template. The caller does not need to retrieve or serialize prior context manually.
POST /api/prompt-hooks/:id/resolve
Content-Type: application/json
{
"session_id": "abc-123",
"context": { "user_id": "alice", "organization_id": "acme" }
}
The resolved template receives {{memory_context}} alongside the standard context variables before rendering.
Frontend
The Agent Memory dashboard provides:
- Session list — all active sessions with timestamps
- Memory detail — selecting a session shows its entries in order, with entry type and content
Control Plane
The control plane gives SCHEMABOUND a structured way to track and execute multi-step agent workflows. Instead of issuing tool calls one at a time with no shared state, a client can submit a complete plan and drive execution step by step, receiving a feedback object after each step that carries tool output and any schema changes back to the LLM for the next invocation.
Why a Control Plane
A single tool call is stateless. The agent asks a question, SCHEMABOUND answers it, and the conversation moves on. Most production workflows are not that simple — they involve dependent queries, operations that discover schema at runtime, and decisions that build on earlier results.
Without a control plane:
- the LLM must track intermediate state itself, which is unreliable across long conversations
- schema changes that occur mid-workflow reach the LLM late or not at all
- there is no stable record of what the agent intended versus what actually executed
The control plane solves this by making the plan a first-class object that persists through the full execution lifecycle.
Concepts
Plan
A plan is a named, versioned collection of steps with explicit dependency relationships. Steps declare their dependencies with depends_on; the control plane validates the dependency graph before any step executes and rejects cycles.
A submitted plan is assigned a stable plan_id that clients use for all subsequent operations.
Step
Each step in a plan corresponds to one tool call. A step carries:
- a `tool_name` and `tool_intent` that govern policy evaluation
- a `query_template` that may reference prior step output via the `{{step.<id>.output}}` syntax
- `schema_table_hints` that tell the control plane which tables to snapshot for schema diffing
- a `depends_on` list naming steps that must complete before this step can execute
LLM Context Update
When a step finishes, the control plane returns an LlmContextUpdate alongside the step result. This is the explicit feedback object the SDK passes to the next LLM API call.
It carries:
- `tool_output_json` — the serialised result of the step, ready to include in the next message
- `schema_additions` — a list of `SchemaTableDelta` entries (NEW, MODIFIED, or REMOVED) for any tables named in `schema_table_hints` that changed during step execution
- `augmentation_hints` — human-readable strings derived from schema deltas, ready to append to the system prompt
The LLM always sees the current schema state before choosing its next action, which means tool definitions stay accurate even when schema evolves mid-workflow.
Template Substitution
Query templates support {{step.<id>.output}} placeholders. The control plane resolves these server-side before calling the query service — the LLM does not need to construct final SQL or query strings directly.
For example, a step template like:
SELECT * FROM orders WHERE customer_id = {{step.lookup_customer.output}}
becomes a fully resolved query once the lookup_customer step has completed and its output is available.
gRPC API
The control plane is exposed as a gRPC service defined in schemabound-proto.
service ControlPlaneService {
rpc SubmitPlan (SubmitPlanRequest) returns (SubmitPlanResponse);
rpc GetPlanStatus (GetPlanStatusRequest) returns (GetPlanStatusResponse);
rpc ExecuteStep (ExecuteStepRequest) returns (ExecuteStepResponse);
rpc CancelPlan (CancelPlanRequest) returns (CancelPlanResponse);
rpc StreamPlanEvents (StreamPlanEventsRequest) returns (stream PlanEvent);
}
Submit a Plan
message SubmitPlanRequest {
string session_id = 1;
string name = 2;
string description = 3;
repeated PlanStepDef steps = 4;
}
message PlanStepDef {
string step_id = 1;
string name = 2;
string tool_name = 3;
string intent = 4;
string query_template = 5;
repeated string depends_on = 6;
repeated string schema_table_hints = 7;
}
A successful response returns a plan_id. All subsequent calls reference this identifier.
Execute a Step
message ExecuteStepRequest {
string plan_id = 1;
string step_id = 2;
}
message ExecuteStepResponse {
bool success = 1;
StepStatus step_result = 2;
LlmContextUpdate llm_context_update = 3;
string error_message = 4;
}
The llm_context_update field is the value to pass back to the LLM before it chooses the next step.
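A driver loop built on the generated bindings can execute a step and hand the feedback object straight back to the LLM. The sketch below assumes the tonic-generated client is available as `schemabound_proto::control_plane_service_client::ControlPlaneServiceClient`; the actual module path depends on the proto package name, and the `LlmContextUpdate` field names follow the descriptions earlier in this chapter.
#![allow(unused)]
fn main() {
use schemabound_proto::control_plane_service_client::ControlPlaneServiceClient;
use schemabound_proto::ExecuteStepRequest;
async fn drive_step(
    client: &mut ControlPlaneServiceClient<tonic::transport::Channel>,
    plan_id: &str,
    step_id: &str,
) -> Result<(), Box<dyn std::error::Error>> {
    let response = client
        .execute_step(ExecuteStepRequest {
            plan_id: plan_id.to_string(),
            step_id: step_id.to_string(),
        })
        .await?
        .into_inner();
    if let Some(update) = response.llm_context_update {
        // Feed tool output and schema hints back to the LLM before the next call.
        println!("tool output: {}", update.tool_output_json);
        for hint in update.augmentation_hints {
            println!("append to system prompt: {hint}");
        }
    }
    Ok(())
}
}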
Stream Plan Events
message StreamPlanEventsRequest {
string plan_id = 1;
}
message PlanEvent {
string plan_id = 1;
string event_type = 2;
string payload_json = 3;
string timestamp = 4;
}
Event types include PlanCreated, PlanStepExecuted, PlanCompleted, and PlanFailed.
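A client can follow a plan as it runs by consuming the stream. The sketch below reuses the assumed generated bindings from the step-execution example above.
#![allow(unused)]
fn main() {
use schemabound_proto::control_plane_service_client::ControlPlaneServiceClient;
use schemabound_proto::StreamPlanEventsRequest;
async fn watch_plan(
    client: &mut ControlPlaneServiceClient<tonic::transport::Channel>,
    plan_id: &str,
) -> Result<(), Box<dyn std::error::Error>> {
    let mut stream = client
        .stream_plan_events(StreamPlanEventsRequest { plan_id: plan_id.to_string() })
        .await?
        .into_inner();
    // Each PlanEvent arrives as the plan progresses; stop once it finishes.
    while let Some(event) = stream.message().await? {
        println!("{} {}: {}", event.timestamp, event.event_type, event.payload_json);
        if event.event_type == "PlanCompleted" || event.event_type == "PlanFailed" {
            break;
        }
    }
    Ok(())
}
}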
REST API
The control plane is also accessible over HTTP.
Create a Plan
POST /api/plans
Content-Type: application/json
{
"session_id": "sess-abc123",
"name": "Customer order lookup",
"description": "Find customer then retrieve their orders",
"steps": [
{
"id": "lookup_customer",
"name": "Look up customer by email",
"tool_name": "query_customers",
"intent": "read_select",
"query_template": "SELECT id FROM customers WHERE email = 'alice@example.com'",
"depends_on": [],
"schema_table_hints": ["customers"]
},
{
"id": "get_orders",
"name": "Get orders for customer",
"tool_name": "query_orders",
"intent": "read_select",
"query_template": "SELECT * FROM orders WHERE customer_id = {{step.lookup_customer.output}}",
"depends_on": ["lookup_customer"],
"schema_table_hints": ["orders"]
}
]
}
Response
{
"data": {
"plan_id": "plan-8f1c2b3d",
"status": "pending"
}
}
Get Plan Status
GET /api/plans/:plan_id
Response
{
"data": {
"plan_id": "plan-8f1c2b3d",
"status": "running",
"steps": [
{ "step_id": "lookup_customer", "status": "completed" },
{ "step_id": "get_orders", "status": "pending" }
]
}
}
Execute a Step
POST /api/plans/:plan_id/steps/:step_id/execute
Response
{
"data": {
"step_result": {
"step_id": "lookup_customer",
"status": "completed",
"row_count": 1,
"executed_at": "2026-04-20T14:00:00Z"
},
"llm_context_update": {
"plan_id": "plan-8f1c2b3d",
"step_id": "lookup_customer",
"tool_output_json": "{\"id\": 42}",
"schema_additions": [],
"augmentation_hints": []
}
}
}
Cancel a Plan
DELETE /api/plans/:plan_id
Runtime Context Headers
Steps execute under the same gRPC metadata model as ordinary queries. Two additional headers carry control-plane identity:
| Header | Purpose |
|---|---|
| `x-schemabound-plan-id` | Identifies the active plan for audit and event correlation |
| `x-schemabound-step-index` | Position of the executing step within the plan |
These are emitted into query events alongside the standard session, user, and organization fields.
OSS Trait Contract
The WorkflowOrchestrator trait in schemabound-public defines the full public interface. Any implementation — including custom ones — must satisfy:
#![allow(unused)]
fn main() {
#[async_trait]
pub trait WorkflowOrchestrator: Send + Sync {
async fn create_plan(
&self,
session_id: &str,
definition: PlanDefinition,
) -> Result<PlanRecord, String>;
async fn get_plan(
&self,
plan_id: &str,
) -> Result<Option<PlanRecord>, String>;
async fn execute_step(
&self,
plan_id: &str,
step_id: &str,
ctx: &QueryRuntimeContext,
) -> Result<(StepResult, LlmContextUpdate), String>;
async fn cancel_plan(
&self,
plan_id: &str,
) -> Result<(), String>;
}
}
NoOpWorkflowOrchestrator is the default in OSS builds. It satisfies the trait boundary and returns an explicit error on any write operation, making the absence of a backing store visible rather than silent.
Event Integration
Control-plane events flow through the same global EventBus used by the rest of the runtime. Four new event variants are available to handlers:
| Event | When emitted |
|---|---|
| `PlanCreated` | Plan accepted and persisted |
| `PlanStepExecuted` | A step finished (success or failure) |
| `PlanCompleted` | All steps completed successfully |
| `PlanFailed` | A step failed or the plan was cancelled |
Existing AuditLogHandler, QueryMetricsHandler, and SessionActivityHandler receive these events the same way they receive query events — no changes to handler registration are needed.
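Because plan events carry the same `Event` shape, a handler can filter on `event_type()` exactly as it would for query events. The handler below is illustrative; the type name is invented for the example.
#![allow(unused)]
fn main() {
use schemabound::interceptor::{Event, EventHandler, HandleOutcome};
// Illustrative handler that logs only control-plane lifecycle events.
pub struct PlanLifecycleHandler;
impl EventHandler for PlanLifecycleHandler {
    fn handle(&self, event: &Event) -> HandleOutcome {
        let ty = event.event_type();
        if ty == "PlanCreated"
            || ty == "PlanStepExecuted"
            || ty == "PlanCompleted"
            || ty == "PlanFailed"
        {
            println!("[plan] {ty}");
        }
        HandleOutcome::Continue
    }
}
}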
Contribution Workflow
SCHEMABOUND accepts contributions across the public runtime, SDKs, and documentation. The goal of the workflow is simple: keep the public contract stable, keep changes reviewable, and make it clear where each kind of contribution belongs.
Choose The Right Repository
Start in the repository that owns the surface you want to improve.
- Use `schemabound-public` for shared runtime behavior, public contract changes, and core Rust functionality.
- Use `schemabound-python` for Python-specific helpers, bindings, packaging, and docs.
- Use `schemabound-dotnet` for .NET-specific helpers, bindings, packaging, and docs.
Contribution Flow
- Fork the repository that matches your change.
- Create a small, clearly named branch.
- Add or update tests with the change.
- Update documentation when the public workflow or contract changes.
- Open a pull request against `main`.
- Address review feedback and keep the branch current until merge.
Branch And PR Expectations
- Keep each change focused enough to review in one pass.
- Prefer small pull requests over large mixed-scope changes.
- If a unit test needs excessive mocking, simplify the production code before adding more scaffolding.
- If a test is called `integration`, it should talk to a real started runtime over the network.
Local Validation
Enable the repo-managed hooks after cloning when they are available:
make hooks-install
The local pre-commit path is intended to catch quality and test failures before you open a pull request.
External Contributor Setup
Step 1: Fork And Clone
Example for schemabound-public:
git clone https://github.com/<your-username>/schemabound-public.git
cd schemabound-public
git remote add upstream https://github.com/schemabound/schemabound-public.git
Step 2: Create A Branch
Use a descriptive branch name that matches the change you are making.
git checkout -b improve-runtime-context-docs
Step 3: Make The Change
Guidelines:
- write tests alongside behavior changes
- prefer repo Make targets where they exist
- keep unit tests deterministic and in-process
- keep integration tests runtime-backed and network-bound
- update docs when the public contract or contribution workflow changes
Step 4: Open A Pull Request
Push your branch and open a PR against upstream/main:
git push origin improve-runtime-context-docs
Include:
- a short description of the change
- the problem being solved
- how you validated the change
- any public contract impact or compatibility notes
Review Standards
SCHEMABOUND reviews focus on a few practical questions:
- does the change belong in this repository
- does the implementation stay within the intended public boundary
- do the tests match the behavior being claimed
- does the documentation still describe the public surface accurately
Common Paths
Fixing A Core Runtime Issue
Start with schemabound-public if the change affects shared runtime behavior, protocol shape, or public Rust functionality.
Improving A Python Integration
Start with schemabound-python if the change is specific to the Python developer experience, helper layer, or packaging surface.
Improving A .NET Integration
Start with schemabound-dotnet if the change is specific to the .NET developer experience, helper layer, or packaging surface.
Release Expectations
Pull requests validate changes. Releases publish them.
If you need publication timing or release behavior details, continue with Testing and Release Policy.
Testing and Release Policy
SCHEMABOUND treats testing and release discipline as part of the public contract. The goal is not just to ship code that works, but to ship public surfaces that are validated, predictable, and safe to adopt.
Test Layers
Unit Tests
Unit tests should stay fast, deterministic, and in-process.
Expectations:
- no live RPC or HTTP calls
- no containers
- minimal mocking only
If a unit test needs a large mock hierarchy, treat that as a signal to simplify the production code.
Integration Tests
Integration tests should validate real boundaries.
Expectations:
- real network requests
- a runtime that actually starts
- real dependencies where the boundary matters
- no shortcut local clients standing in for protocol behavior
End-To-End Tests
End-to-end tests validate a deployed environment from the outside. They are the right fit for rollout, wiring, secret, and networking concerns when that environment exists.
CI Expectations
Pull requests to main should pass the quality profile relevant to the repository being changed. That can include:
- linting and formatting
- unit and integration tests
- build validation
- documentation builds
- coverage or maintainability gates where they add real signal
- dependency and security checks
SDK-specific maintainability gates are intentionally selective. They should exist where a codebase contains enough handwritten logic to justify them, not just because a language binding exists.
Local Validation
When available, enable the repo-managed hooks locally:
make hooks-install
The local hook path is intended to catch common failures before you open a pull request. It complements CI, but does not replace it.
Release Discipline
Validation and publication are separate steps.
The expected release path is:
- merge reviewed code to `main`
- allow validation to complete on the merged revision
- create the appropriate release tag when publication is intended
Current public release patterns include:
- `public-v*` for public subtree publication
- `sdk-python-v*` for Python SDK publication
- `sdk-<language>-v*` as the general form for future SDK release workflows
Why This Separation Exists
Separating merge validation from publication keeps the public SCHEMABOUND surface more predictable for adopters.
It ensures that:
- every published artifact passed review first
- release timing is deliberate rather than accidental
- public packages and docs stay aligned with validated source
- teams can reason about adoption risk more clearly
API Reference
Use this page to choose the fastest way to integrate SCHEMABOUND into your product. Whether you are embedding SCHEMABOUND into an application, automating workflows, or standardizing service-to-service communication, the references below point to the public surfaces intended for real adoption.
Client SDKs
Choose the SDK that best matches your application stack.
Build Python applications and automation flows that integrate SCHEMABOUND with a lightweight, script-friendly client surface.
Integrate SCHEMABOUND into .NET services and enterprise applications with a familiar typed client experience.
Use the core Rust crate when you want maximum control, native performance, or direct access to the public runtime model.
Shared Contract
When you need a language-neutral integration surface, start with the protocol definitions.
Protobuf Definitions
Review the public gRPC contract, message shapes, and service definitions that keep multi-language integrations aligned.
HTTP REST API
The Rocket HTTP backend exposes a live, interactive API reference via Swagger UI. Use it to explore endpoints, inspect request/response schemas, and try calls directly from the browser.
Browse all route schemas, try requests live, and inspect request/response bodies.
Requires the backend to be running (make http-start).
OpenAPI JSON: http://localhost:8000/api/openapi.json
Suggested Starting Points
- Building application logic in Python: start with the Python SDK.
- Shipping a service or platform integration on .NET: start with the .NET SDK.
- Building custom runtimes or native integrations: start with the Rust client crate.
- Aligning multiple clients or generating your own bindings: start with the protobuf definitions.
- Exploring or testing HTTP endpoints interactively: start with the Swagger UI.
Rust SDK Guide (schemabound)
The schemabound crate is the core Rust runtime. It is the reference implementation of the OAM
framework and the foundation that all other SDKs and services build on.
Why Use The Rust SDK Directly
- You are building SCHEMABOUND backend services, proxies, or custom gRPC adapters.
- You need the full execution and introspection surface — not just a client.
- You want zero-overhead integration with Tokio async runtimes.
- You need to implement a custom `MirrorProvider` or `QueryRuntimeAugmentor`.
Installation
Add schemabound to your Cargo.toml:
[dependencies]
schemabound = "0.6"
tokio = { version = "1", features = ["full"] }
Crate Structure
| Module | Purpose |
|---|---|
| `schemabound::mirror` | SQLite schema introspection — tables, columns, indexes, triggers, UDTs, field mappings |
| `schemabound::executor` | `SchemaService` / `QueryService` traits + SQLite implementations |
| `schemabound::grpc_executor` | Tonic gRPC server wrapping the executor services |
| `schemabound::interceptor` | Typed event bus — `Event` enum, `EventBus`, `EventHandler`, `HandleOutcome` |
| `schemabound::handlers` | Built-in CoR handlers — `AuditLogHandler`, `QueryMetricsHandler`, `SessionActivityHandler`, `DefaultHandlerChain`, `SharedHandler` |
| `schemabound::tcp` | TCP JSON-RPC transport and per-client auth |
| `schemabound::policy_engine` | Tool-contract policy evaluation and subquery governance |
| `schemabound::runtime_context` | `QueryRuntimeContext` — the carrier for per-request augmentation metadata |
| `schemabound::rate_limit` | Connection and request rate limiter |
| `schemabound::mapper` | `Mapper` trait + `LocalMapper` (SQLite) + `TcpMapper` (remote) |
Quick Start
Start a gRPC Server
use schemabound::grpc_executor::GrpcExecutor;
#[tokio::main]
async fn main() -> Result<(), Box<dyn std::error::Error>> {
let executor = GrpcExecutor::new("path/to/database.db")?;
let handle = executor.start_server("0.0.0.0:50051").await?;
handle.await?;
Ok(())
}
Introspect A SQLite Database
#![allow(unused)]
fn main() {
use schemabound::mirror::introspect_sqlite_path;
let schema = introspect_sqlite_path("path/to/database.db")?;
for table in &schema.tables {
println!("Table: {}", table.name);
for col in &table.columns {
println!(" Column: {} ({}){}", col.name, col.sql_type,
if col.primary_key { " PK" } else { "" });
}
for trigger in &table.triggers {
println!(" Trigger: {} {} {}", trigger.name, trigger.timing, trigger.event);
}
for fm in &table.field_mappings {
println!(" FieldMapping: {} → {} ({})", fm.physical_name, fm.logical_name, fm.orm_convention);
}
}
println!("UDTs: {}", schema.user_defined_types.len());
}
Use The MirrorProvider Trait
MirrorProvider is an async trait for pluggable schema introspection:
#![allow(unused)]
fn main() {
use schemabound::{MirrorProvider, SqliteMirrorProvider};
let provider = SqliteMirrorProvider::new("database.db");
let schema = provider.introspect_schema().await?;
}
Implement the trait to integrate custom databases:
#![allow(unused)]
fn main() {
use schemabound::MirrorProvider;
use schemabound::mirror::SchemaModel;
struct MyCustomProvider;
#[async_trait::async_trait]
impl MirrorProvider for MyCustomProvider {
async fn introspect_schema(&self) -> Result<SchemaModel, String> {
// custom introspection logic
Ok(SchemaModel { tables: vec![], user_defined_types: vec![] })
}
}
}
Register A CoR Handler
Handlers are invoked in registration order for every dispatched event. Use the built-in
collection from schemabound::handlers or implement EventHandler yourself:
#![allow(unused)]
fn main() {
use schemabound::{get_event_bus, AuditLogHandler, DefaultHandlerChain, SharedHandler};
use std::sync::Arc;
// Recommended starting point: audit log first, then metrics/session counters
let chain = DefaultHandlerChain::new();
let bus = get_event_bus();
bus.register_handler(Box::new(AuditLogHandler))?;
bus.register_handler(Box::new(SharedHandler(Arc::clone(&chain.query_metrics))))?;
bus.register_handler(Box::new(SharedHandler(Arc::clone(&chain.session_activity))))?;
// Retain chain handles for health probes
let snap = chain.query_metrics.snapshot();
let sessions = chain.session_activity.session_count();
}
See the Event Pipeline architecture guide for the full handler reference and a comparison with subscribers.
Subscribe To Events (fire-and-forget)
#![allow(unused)]
fn main() {
use schemabound::interceptor::{get_event_bus, Event};
let bus = get_event_bus();
let id = bus.register_subscriber(Box::new(|event: &Event| {
println!("Event: {}", event.event_type());
}))?;
// Unsubscribe when done
bus.unregister_subscriber(id)?;
}
Apply A Policy To A Query
#![allow(unused)]
fn main() {
use schemabound::executor::{QueryServiceImpl, ValidateQueryRequest};
use schemabound::policy_engine::{
AuthorizationContext, AuthorizedSubqueryShape, PolicyContext, SubqueryPolicy, ToolContract, ToolIntent,
};
let mut service = QueryServiceImpl::new();
service.set_db_path("database.db")?;
let request = ValidateQueryRequest {
db_identifier: "db".into(),
query: "SELECT id FROM users WHERE org_id IN (SELECT id FROM organizations)".into(),
parameters: Default::default(),
};
let policy = PolicyContext {
tool: ToolContract {
name: "list-users".into(),
intent: ToolIntent::ReadSelect,
subquery_policy: SubqueryPolicy::AllowListed(vec![
AuthorizedSubqueryShape { table: "organizations".into() },
]),
},
authorization: AuthorizationContext {
allowed_intents: vec![ToolIntent::ReadSelect],
grants: vec!["tool:users.read".into()],
},
};
let response = service.validate_query_with_policy(request, policy).await?;
assert!(response.valid);
}
Schema Introspection Model
The mirror module returns a tree of Rust structs:
SchemaModel
├── tables: Vec<Table>
│ ├── columns: Vec<Column> name, sql_type, nullable, primary_key, default_value
│ ├── indexes: Vec<Index> name, columns
│ ├── unique_indexes: Vec<UniqueIndex>
│ ├── foreign_keys: Vec<ForeignKey>
│ ├── composite_foreign_keys: Vec<CompositeForeignKey>
│ ├── triggers: Vec<Trigger> name, event, timing, table_name, body
│ └── field_mappings: Vec<FieldMapping>
│ └── logical_name, physical_name, orm_convention (Hibernate | EntityFramework)
└── user_defined_types: Vec<UserDefinedType>
└── name, base_type, check_constraint, nullable, default_value
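For example, a minimal sketch that walks the relational parts of this tree and prints foreign keys and unique indexes (field names follow the listing above; the columns field is assumed to be a Vec<String>):
use schemabound::mirror::SchemaModel;
fn print_relations(schema: &SchemaModel) {
    for table in &schema.tables {
        // ForeignKey carries from_column, to_table, to_column
        for fk in &table.foreign_keys {
            println!("{}.{} -> {}.{}", table.name, fk.from_column, fk.to_table, fk.to_column);
        }
        // UniqueIndex carries name and columns
        for idx in &table.unique_indexes {
            println!("UNIQUE {} ({})", idx.name, idx.columns.join(", "));
        }
    }
}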
Field mapping convention detection is heuristic:
- camelCase column names → Hibernate
- PascalCase column names → EntityFramework (Entity Framework)
- _prefixed columns → EntityFramework (EF Core shadow property convention)
gRPC Proto Mapping
The table_to_table_def function converts a mirror::Table to a proto TableDef:
#![allow(unused)]
fn main() {
use schemabound::grpc_executor::table_to_table_def;
let proto_table = table_to_table_def(&my_mirror_table);
}
Both unique_indexes and indexes are merged into TableDef.indexes, distinguished by
IndexDef.is_unique.
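As a short sketch, unique constraints can be recovered from the merged list by filtering on that flag (assuming prost-style field access on the generated types):
use schemabound::grpc_executor::table_to_table_def;
let proto_table = table_to_table_def(&my_mirror_table);
// Entries flagged is_unique = true originated from unique_indexes
let unique: Vec<_> = proto_table.indexes.iter().filter(|idx| idx.is_unique).collect();
let regular: Vec<_> = proto_table.indexes.iter().filter(|idx| !idx.is_unique).collect();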
Contributing
See the Contribution Workflow for the full TDD contract — no production code may be written before a failing test exists.
Test Targets
# Unit tests only (fast, no infra)
make test-unit
# Full test suite (schemabound-public)
make test FILTER=schemabound-public
# Proto unit tests
make test-proto
Python SDK Guide
Use the schemabound-python package when you want to integrate SCHEMABOUND into Python applications, automation workflows, and service-side tooling without giving up a script-friendly developer experience.
Why Choose The Python SDK
- Build Python services and internal tools that need direct access to SCHEMABOUND capabilities.
- Add SCHEMABOUND-backed workflow automation to notebooks, jobs, and lightweight application code.
- Move quickly with a familiar Python surface while staying aligned with the public SCHEMABOUND contract.
Installation
pip install schemabound-python
What You Get
The Python SDK gives you:
- A Python-first client surface for integrating SCHEMABOUND into application and automation code.
- Typed bindings over the public runtime model so Python code stays aligned with the supported SCHEMABOUND contract.
- Utility helpers and examples that make it easier to adopt SCHEMABOUND in real product and workflow scenarios.
Quick Start
from roam.mirror import ReflectionEngine
from roam.executor import GrpcServer
# Initialize reflection
engine = ReflectionEngine()
# Start server
server = GrpcServer()
server.run()
This is the fastest path when you want to stand up a Python-based integration, validate connectivity, and start building application logic around SCHEMABOUND.
Runtime Augmentation
SCHEMABOUND runtime calls can carry runtime-augmentation selection metadata so your application can attach stable request context before validation or execution.
Use the public runtime headers when you need SCHEMABOUND behavior to reflect product context such as the calling tool, organization, or domain.
The key runtime headers are:
- x-schemabound-runtime-augmentation-id to reference a specific augmentation identifier
- x-schemabound-runtime-augmentation-key to reference a stable augmentation key
- x-schemabound-tool-name, x-schemabound-tool-intent, x-schemabound-user-id, x-schemabound-organization-id, x-schemabound-domain-tags, and x-schemabound-table-names to provide matching context
SCHEMABOUND emits resolved augmentation identity into normal query events, while sensitive rendered content remains reserved for dedicated audit handling.
Suggested Starting Points
- Building automation, internal tools, or orchestration logic in Python: start here.
- Prototyping a SCHEMABOUND integration before standardizing it across services: start here.
- Passing runtime context from application code into SCHEMABOUND execution paths: start with the runtime-augmentation headers above.
Contributing
For Rust Core Changes
If you need to change the shared public runtime or core SCHEMABOUND behavior:
- File an issue in schemabound-public
- Submit a PR to schemabound-public (see Contribution Workflow)
- Once merged and exported, the change flows to schemabound-python automatically
For Python Layer Improvements
To improve the Python experience, add helpers, or expand documentation:
- Fork schemabound-python
- Create a feature branch
- Make changes in roam/ (Python layer only)
- Submit PR to schemabound-python/main
- We’ll review and merge
Example:
# Add a utility function
# roam/utils/config.py
def load_config_from_file(path: str) -> dict:
"""Load SCHEMABOUND configuration from YAML file."""
...
API Reference
See the full API docs for the Python package surface, method signatures, and integration details.
.NET SDK Guide
Use the schemabound-dotnet package when you want to integrate SCHEMABOUND into .NET services, enterprise applications, and platform components with a typed client experience that fits naturally into the broader .NET ecosystem.
Why Choose The .NET SDK
- Integrate SCHEMABOUND into ASP.NET, worker services, and internal platform components.
- Build strongly typed service-to-service integrations on a familiar .NET foundation.
- Adopt SCHEMABOUND in production applications without dropping to lower-level protocol work unless you need it.
Installation
dotnet add package Schemabound.Dotnet
Or via NuGet:
Install-Package Schemabound.Dotnet
What You Get
The .NET SDK gives you:
- A typed .NET integration surface for embedding SCHEMABOUND into application and platform code.
- Interop over the public runtime model so .NET services stay aligned with the supported SCHEMABOUND contract.
- Utilities and examples that help teams wire SCHEMABOUND into real service and enterprise deployment patterns.
Quick Start
using Schemabound;
var engine = new ReflectionEngine();
var server = new GrpcServer();
await server.RunAsync();
This is the fastest way to stand up a .NET integration, validate the client path, and start wiring SCHEMABOUND into a service or application workflow.
Runtime Augmentation
SCHEMABOUND runtime calls can carry runtime-augmentation selection metadata so your .NET application can attach stable business and request context before validation or execution.
Use the public runtime headers when you need SCHEMABOUND behavior to reflect application identity, tool intent, tenant boundaries, or domain-specific routing context.
The key runtime headers are:
- x-schemabound-runtime-augmentation-id to reference a specific augmentation identifier
- x-schemabound-runtime-augmentation-key to reference a stable augmentation key
- x-schemabound-tool-name, x-schemabound-tool-intent, x-schemabound-user-id, x-schemabound-organization-id, x-schemabound-domain-tags, and x-schemabound-table-names to provide matching context
SCHEMABOUND emits resolved augmentation identity into normal query events, while sensitive rendered content remains reserved for dedicated audit handling.
Suggested Starting Points
- Shipping a typed platform or product integration on .NET: start here.
- Embedding SCHEMABOUND into an ASP.NET or worker-service architecture: start here.
- Passing application context into SCHEMABOUND execution paths: start with the runtime-augmentation headers above.
Contributing
For Rust Core Changes
If you need to change the shared public runtime or core SCHEMABOUND behavior:
- File an issue in schemabound-public
- Submit a PR to schemabound-public (see Contribution Workflow)
- Once merged and exported, the change flows to schemabound-dotnet automatically
For .NET Layer Improvements
To improve the .NET developer experience, add helpers, or expand documentation:
- Fork schemabound-dotnet
- Create a feature branch
- Make changes in Schemabound/ (.NET layer only)
- Submit PR to schemabound-dotnet/main
- We’ll review and merge
Example:
// Add a utility class
// Schemabound/Config/ConfigLoader.cs
public static class ConfigLoader
{
public static RoamConfig LoadFromFile(string path)
{
// Load configuration
}
}
API Reference
See the full API docs for the .NET package surface, method signatures, and integration details.
LlmSchema Derive Macro
The LlmSchema derive macro — provided by the schemabound-schema crate — generates a JSON Schema
description of any Rust struct whose fields are annotated with standard Serde attributes.
This schema is used at runtime to build context-aware prompts and to expose entity structure
to LLM tool-calls, gRPC clients, and the Python SDK.
How it works
#[derive(LlmSchema)] delegates to schemars::schema_for!() under
the hood and exposes a single method:
#![allow(unused)]
fn main() {
pub fn llm_schema() -> schemars::schema::RootSchema
}
The returned RootSchema is fully JSON-serialisable and can be embedded directly in prompts,
returned over gRPC, or published to the OpenAPI spec.
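For instance, a sketch that splices the schema into a prompt (MyEntity stands in for any struct deriving LlmSchema):
let schema = MyEntity::llm_schema();
// RootSchema serializes with serde_json like any other value
let prompt = format!(
    "Answer using only this entity shape:\n{}",
    serde_json::to_string_pretty(&schema).expect("RootSchema serializes to JSON")
);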
SeaORM integration (Rust)
SeaORM entity models are the primary target for LlmSchema. Add both LlmSchema and
JsonSchema to your DeriveEntityModel derive list:
#![allow(unused)]
fn main() {
use roam_schema::LlmSchema;
use schemars::JsonSchema;
use sea_orm::entity::prelude::*; // DeriveEntityModel, Uuid
use serde::{Deserialize, Serialize};
#[derive(Clone, Debug, PartialEq, DeriveEntityModel, Serialize, Deserialize, LlmSchema, JsonSchema)]
#[sea_orm(table_name = "organizations")]
pub struct Model {
#[sea_orm(primary_key, auto_increment = false)]
pub id: Uuid,
pub name: String,
pub slug: String,
pub description: String,
pub owner_id: String,
}
}
With this in place, the schema is available at runtime without reflection:
#![allow(unused)]
fn main() {
let schema = organization::Model::llm_schema();
let schema_json = serde_json::to_string_pretty(&schema).unwrap();
}
Registering with a prompt hook
Provide a PromptHookSchemaContext when resolving a prompt so the hook can match on the database, tables, and domain tags the schema describes:
#![allow(unused)]
fn main() {
let schema_json = serde_json::to_string(&organization::Model::llm_schema()).unwrap();
let request = PromptHookResolveRequest {
schema_context: PromptHookSchemaContext {
database_id: Some("prod-db".to_string()),
table_names: vec!["organizations".to_string()],
domain_tags: vec!["identity".to_string()],
},
..Default::default()
};
}
The matching rules in your prompt hook YAML can reference table_names and domain_tags:
schema:
table_names: ["organizations"]
domain_tags: ["identity"]
SQLAlchemy integration (Python SDK)
The Python SDK exposes entity schemas through the SchemaboundClient.get_schema() gRPC call.
You do not need to replicate SeaORM models in Python — the schema travels over the wire.
Installation
pip install schemabound-sdk
# or with uv
uv add schemabound-sdk
Fetching a schema
from schemabound_sdk import SchemaboundClient
client = SchemaboundClient(host="localhost", port=50051)
# Retrieve the JSON Schema for a specific entity table
schema = client.get_schema(table_name="organizations")
print(schema)
# → {"$schema": "http://json-schema.org/draft-07/schema#", "title": "Model", ...}
Using the schema with SQLAlchemy
The returned dict is a standard JSON Schema object. Use it to validate rows fetched with SQLAlchemy, or embed it as prompt context:
import json
import jsonschema
from sqlalchemy import create_engine, text
from schemabound_sdk import SchemaboundClient
client = SchemaboundClient(host="localhost", port=50051)
schema = client.get_schema(table_name="organizations")
# Fetch a row with SQLAlchemy and validate it against the retrieved schema
engine = create_engine("sqlite:///path/to/database.db")
with engine.connect() as conn:
    row_dict = dict(conn.execute(text("SELECT * FROM organizations LIMIT 1")).mappings().one())
jsonschema.validate(instance=row_dict, schema=schema)
# Or embed the schema directly in an LLM prompt
prompt_context = json.dumps(schema, indent=2)
Injecting schema into a prompt hook
from schemabound_sdk import SchemaboundClient, PromptHookResolveRequest, SchemaContext
client = SchemaboundClient(host="localhost", port=50051)
request = PromptHookResolveRequest(
schema_context=SchemaContext(
database_id="prod-db",
table_names=["organizations"],
domain_tags=["identity"],
)
)
resolution = client.resolve_prompt_hook(request)
print(resolution.rendered_prompt)
Field-level annotations
All standard Serde rename / skip attributes are respected by schemars:
#![allow(unused)]
fn main() {
use roam_schema::LlmSchema;
use schemars::JsonSchema;
use serde::{Deserialize, Serialize};
use uuid::Uuid;
#[derive(LlmSchema, JsonSchema, Serialize, Deserialize)]
pub struct Model {
pub id: Uuid,
/// Human-readable display name
pub name: String,
#[serde(skip_serializing_if = "Option::is_none")]
pub description: Option<String>,
#[serde(rename = "ownerId")]
pub owner_id: String,
}
}
The generated schema will include the description from Rust doc-comments, honour rename,
and mark description as non-required because it is Option<…>.
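An abridged, illustrative fragment of that output (the exact layout depends on the schemars version in use):
{
  "title": "Model",
  "type": "object",
  "required": ["id", "name", "ownerId"],
  "properties": {
    "id": { "type": "string", "format": "uuid" },
    "name": { "type": "string", "description": "Human-readable display name" },
    "description": { "type": ["string", "null"] },
    "ownerId": { "type": "string" }
  }
}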
Supported crates
| Crate | JsonSchema support |
|---|---|
| serde_json::Value | built-in |
| uuid::Uuid | via schemars uuid feature |
| chrono::DateTime | via schemars chrono feature |
| std::collections::BTreeMap | built-in |
Enable optional features in Cargo.toml:
schemars = { version = "0.8", features = ["derive", "uuid", "chrono"] }
API reference
The live HTTP API exposes all route schemas via Swagger UI. When the backend is running:
- Swagger UI: http://localhost:8000/api/swagger
- OpenAPI JSON: http://localhost:8000/api/openapi.json
Metadata Introspection
SCHEMABOUND’s metadata introspection surface — exposed through the schemabound::mirror module — gives agents
a deep, structured view of the underlying database schema. Beyond basic table-and-column enumeration,
it surfaces triggers, user-defined types, and ORM field-mapping heuristics so that
agents can reason about data contracts and change semantics without manual annotation.
What Is Introspected
| Primitive | Struct | Key Fields |
|---|---|---|
| Table | mirror::Table | name, columns, indexes, unique_indexes, foreign_keys, triggers, field_mappings |
| Column | mirror::Column | name, sql_type, nullable, primary_key, default_value, enum_values |
| Non-unique index | mirror::Index | name, columns |
| Unique index | mirror::UniqueIndex | name, columns |
| Foreign key | mirror::ForeignKey | from_column, to_table, to_column, on_delete, on_update |
| Composite FK | mirror::CompositeForeignKey | from_columns, to_table, to_columns, … |
| Trigger | mirror::Trigger | name, event, timing, table_name, body |
| User-defined type | mirror::UserDefinedType | name, base_type, check_constraint, nullable, default_value |
| Field mapping | mirror::FieldMapping | logical_name, physical_name, orm_convention, notes |
The full model is returned as a SchemaModel:
#![allow(unused)]
fn main() {
pub struct SchemaModel {
pub tables: Vec<Table>,
pub user_defined_types: Vec<UserDefinedType>,
}
}
Trigger Discovery
Triggers represent procedural data-change logic embedded in the database. Surfacing them
allows agents to warn about side-effects, model cascading writes, and fire TriggerFired events.
#![allow(unused)]
fn main() {
use schemabound::mirror::introspect_sqlite_path;
let schema = introspect_sqlite_path("path/to/database.db")?;
for table in &schema.tables {
for trigger in &table.triggers {
println!(
"[{}] TRIGGER {} {} {} ON {}",
table.name, trigger.name, trigger.timing, trigger.event, trigger.table_name
);
// timing: BEFORE | AFTER | INSTEAD OF
// event: INSERT | UPDATE | DELETE | UPDATE OF <col>
}
}
}
Trigger Struct
#![allow(unused)]
fn main() {
pub struct Trigger {
pub name: String,
/// INSERT | UPDATE | DELETE | UPDATE OF <col>
pub event: String,
/// BEFORE | AFTER | INSTEAD OF
pub timing: String,
pub table_name: String,
pub body: String, // the raw CREATE TRIGGER body
}
}
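For instance, an illustrative SQLite trigger like the following would be surfaced with timing AFTER, event UPDATE, table_name orders, and the raw body text:
CREATE TRIGGER orders_touch_updated_at
AFTER UPDATE ON orders
BEGIN
  UPDATE orders SET updated_at = CURRENT_TIMESTAMP WHERE id = NEW.id;
END;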
TriggerFired Event (EventBus)
When the SCHEMABOUND runtime detects that a mutation query will fire a trigger, it emits a
TriggerFired variant on the EventBus:
#![allow(unused)]
fn main() {
use schemabound::interceptor::{get_event_bus, Event};
get_event_bus().register_subscriber(Box::new(|event: &Event| {
if let Event::TriggerFired { trigger_name, table_name, .. } = event {
log::warn!("Trigger {} fired on {}", trigger_name, table_name);
}
}))?;
}
User-Defined Types
SQLite encodes custom types as columns with CHECK constraints. The introspector discovers
these and promotes them to first-class UserDefinedType entries.
#![allow(unused)]
fn main() {
pub struct UserDefinedType {
pub name: String,
pub base_type: String, // TEXT, INTEGER, REAL, …
pub check_constraint: Option<String>,
pub nullable: bool,
pub default_value: Option<String>,
}
}
Example — a table with an enum-like column:
CREATE TABLE orders (
status TEXT NOT NULL CHECK(status IN ('pending','shipped','cancelled'))
);
The introspector produces:
{
"name": "status",
"base_type": "TEXT",
"check_constraint": "status IN ('pending','shipped','cancelled')",
"nullable": false,
"default_value": null
}
This is also surfaced per-column via Column.enum_values.
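A small sketch that lists the enum-like columns of a table (assuming enum_values is a Vec<String> that stays empty when no CHECK(... IN ...) constraint was found):
use schemabound::mirror::Table;
fn enum_like_columns(table: &Table) {
    for col in &table.columns {
        // Only columns whose CHECK constraint enumerates allowed values carry enum_values
        if !col.enum_values.is_empty() {
            println!("{}.{} allows: {:?}", table.name, col.name, col.enum_values);
        }
    }
}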
ORM Field-Mapping Heuristics
Physical column names often differ from the logical names that application layers — Hibernate,
Entity Framework, ActiveRecord — use. SCHEMABOUND detects the convention from naming patterns and
generates a FieldMapping per column that appears to be ORM-managed.
#![allow(unused)]
fn main() {
pub struct FieldMapping {
pub logical_name: String,
pub physical_name: String,
/// "hibernate" | "ef" | "ef_shadow"
pub orm_convention: String,
pub notes: Option<String>,
}
}
Convention Detection Rules
| Pattern | Convention | Example |
|---|---|---|
| camelCase | hibernate | orderId → order_id |
| PascalCase | ef | OrderId → order_id |
| _prefix | ef_shadow | _tenantId → TenantId |
These heuristics enable agents to translate between LLM-generated column names and physical database column names without user annotations.
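For example, a minimal sketch of resolving an LLM-supplied logical column name to its physical counterpart (the helper name is hypothetical):
use schemabound::mirror::Table;
// Resolve a logical (ORM-facing) column name to the physical column name
fn to_physical_name<'a>(table: &'a Table, logical: &'a str) -> &'a str {
    table
        .field_mappings
        .iter()
        .find(|fm| fm.logical_name == logical)
        .map(|fm| fm.physical_name.as_str())
        .unwrap_or(logical) // fall back to the given name when no mapping was detected
}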
The MirrorProvider Trait
The MirrorProvider async trait lets you plug in any database backend:
#![allow(unused)]
fn main() {
#[async_trait::async_trait]
pub trait MirrorProvider: Send + Sync {
async fn introspect_schema(&self) -> Result<SchemaModel, String>;
}
}
The built-in implementation is SqliteMirrorProvider:
#![allow(unused)]
fn main() {
use schemabound::{MirrorProvider, SqliteMirrorProvider};
let provider = SqliteMirrorProvider::new("path/to/database.db");
let schema = provider.introspect_schema().await?;
}
To extend SCHEMABOUND to a new database engine, implement MirrorProvider and register the provider
with the GrpcExecutor builder:
#![allow(unused)]
fn main() {
let executor = GrpcExecutor::builder()
.mirror(MyPostgresMirrorProvider::new(&connection_string))
.build()?;
}
Proto Surface
The introspected metadata is exposed over gRPC via GetSchemaResponse and GetTableResponse
(see SchemaService proto):
message GetSchemaResponse {
string schema_id = 1;
string database_type = 2;
string generated_at = 3;
repeated TableDef tables = 4;
repeated UserDefinedTypeDef user_defined_types = 5;
}
message GetTableResponse {
string generated_at = 1;
TableDef table = 2;
}
message TableDef {
string name = 1;
repeated ColumnDef columns = 2;
repeated IndexDef indexes = 3; // is_unique distinguishes unique from regular
repeated TriggerDef triggers = 4;
repeated FieldMappingDef field_mappings = 5;
}
message TriggerDef {
string name = 1;
string timing = 2;
string event = 3;
string body = 4;
}
message UserDefinedTypeDef {
string name = 1;
string base_type = 2;
repeated string variants = 3;
}
message FieldMappingDef {
string column_name = 1;
string logical_name = 2;
string convention = 3;
}
The Rust conversion function is table_to_table_def:
#![allow(unused)]
fn main() {
use schemabound::grpc_executor::table_to_table_def;
let proto_def = table_to_table_def(&schema.tables[0]);
}
Future Work
- MSSQL introspector — MssqlMirrorProvider for SQL Server schemas, complex UDTs, CLR triggers
- PostgreSQL introspector — domain types, row-level security policies, event triggers
- Computed column detection — flag virtual/generated columns explicitly
- Partition metadata — surface range/list/hash partitioning from supported engines
Demo
Family Tree — AI-Narrated Demo (Developer)
The same end-to-end recording narrated for a software developer evaluating the SCHEMABOUND SDK. Highlights SchemaboundDeclarativeBase, _call_gemini + gRPC agent memory, and TestClient for unit testing.
Regenerate:
GEMINI_API_KEY=<key> make narrate DEMO=family-tree PERSONA=developer from the repository root.
| Scene | What to notice |
|---|---|
| Login | Entry point to the SCHEMABOUND Enterprise control plane |
| LDAP import | Directory data flows into Dolt — becomes the LLM’s context source |
| Empty hierarchy | Starting state before any agent action |
| LLM agent import | _call_gemini injects LDAP context → Gemini returns function_call parts → Person.from_agent_tool_call(**kwargs) writes to PostgreSQL; gRPC records each tool call |
| Agent memory | Every tool call stored as a structured, queryable event |
| Schema registration | SchemaboundDeclarativeBase.to_roam_schema() auto-generates the function-calling schema — zero manual schema code |
Family Tree — AI-Narrated Demo (Executive)
The same recording narrated for a business executive evaluating SCHEMABOUND. Focuses on cost avoidance, always-current documentation, and SCHEMABOUND’s data-first / code-first / hybrid adoption model — no full rewrite required.
Regenerate:
GEMINI_API_KEY=<key> make narrate DEMO=family-tree PERSONA=executive from the repository root.
| Scene | Business outcome |
|---|---|
| Login | Unified control plane — one place for identity, data, and AI governance |
| LDAP import | Existing directory assets re-used instantly; no data migration project |
| Empty hierarchy | Baseline state demonstrating clean-slate adoption |
| LLM agent populates hierarchy | AI-driven data work completed in seconds, not sprint cycles |
| Agent memory | Leadership-visible evidence of AI activity — no developer report required |
| Schema registration | Marketing and API documentation self-updating; collateral never goes stale |
Family Tree — AI-Narrated Demo (DevSecOps)
The same recording narrated for a DevSecOps engineer or security auditor. Covers session tracking, structured audit logging via gRPC, RBAC enforcement at the data layer, and the security observability that SCHEMABOUND Enterprise provides out of the box.
Regenerate:
GEMINI_API_KEY=<key> make narrate DEMO=family-tree PERSONA=devsecops from the repository root.
| Scene | Security / audit focus |
|---|---|
| Login | Identity source is LDAP-backed; session ID assigned at login |
| LDAP import | Directory data validated and ingested before any agent can act on it — least-privilege at the data layer |
| Empty hierarchy | Auditable baseline state captured before agent execution |
| LLM agent populates hierarchy | Every function_call → tool call recorded in agent memory via gRPC in the same transaction as the DB write — no dual-write gap |
| Agent memory | Structured, queryable audit log: session ID, tool name, arguments, timestamp — SIEM-ready |
| Schema registration | Schema locked to the registered model; agent cannot call tools outside the declared interface |