Many teams have now put an LLM in front of a database. In the demo, the conversational agent answers questions beautifully. Then it reaches production, a user rephrases a request, and the agent happily writes SQL that crosses a tenant boundary or reads a table it was never meant to see. The result is a data leak, with the privacy and security exposure that comes with it.
The usual first line of treatment is to tighten the system prompt, add stronger guardrails, and redact PII, but none of these makes a data leak reliably preventable.
In this session, I will show how to build a conversational agent on top of an existing, shared relational database while giving the agent a narrower, deterministic slice of it, without copying data into a separate store that would drift out of date and is not always real-time. I use an e-commerce dataset to demonstrate. The agent is constrained on two axes:
- Row scope: it works within one tenant at a time.
- Functional scope: it can only touch an allow-listed set of tables and operations.
- The hard requirement: every “no” must be guaranteed by the database and the tool layer, not by the model’s wording.
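To make the two axes concrete, here is a minimal sketch of a per-request scope object enforced in the tool layer. The class name `AgentScope`, the tenant id, and the table names (`orders`, `products`, `payments`) are illustrative assumptions, not part of the talk's actual code.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class AgentScope:
    """Hypothetical per-request scope: one tenant plus an allow-list."""

    tenant_id: int  # row scope: exactly one tenant per request
    allowed: frozenset  # functional scope: permitted (table, operation) pairs

    def permits(self, table: str, operation: str) -> bool:
        # Default-deny: only an exact allow-list match passes.
        return (table.lower(), operation.upper()) in self.allowed


# Assumed example values for an e-commerce schema.
scope = AgentScope(
    tenant_id=42,
    allowed=frozenset({("orders", "SELECT"), ("products", "SELECT")}),
)
```

The check is a plain set lookup, so the answer does not depend on how the request was phrased to the model.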
I’ll show the architecture as defence in depth across independent layers:
- PostgreSQL role grants for table-level access.
- Row-level security with FORCE and default-deny for tenant isolation.
- AST-based SQL validation (parsing, not regex) to reject queries before execution, plus a tool contract that refuses any call missing its security context.
- Post-execution checks plus a deterministic abstinence path so the model doesn’t hallucinate when authorised data simply isn’t there.
- The connection that introspects the schema is privilege-separated, so that credential never touches the user-facing path.
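Two of the layers above, the tool contract and the abstinence path, can be sketched in a few lines. This is an illustration under stated assumptions: `run_tool`, `SecurityContext`, `MissingSecurityContext`, the `ABSTAIN` message, and the in-memory `fake_execute` stand-in for a database call are all hypothetical names, and each row is assumed to carry a `tenant_id` column.

```python
from dataclasses import dataclass
from typing import Callable, Optional, Sequence


class MissingSecurityContext(Exception):
    """Raised when a tool call arrives without a security context."""


@dataclass(frozen=True)
class SecurityContext:
    tenant_id: int


ABSTAIN = "No data is available for this request within your account."


def run_tool(
    sql: str,
    ctx: Optional[SecurityContext],
    execute: Callable[[str, int], Sequence[dict]],
) -> dict:
    # Tool contract: refuse before touching the database if context is missing.
    if ctx is None:
        raise MissingSecurityContext("tool call carries no security context")

    rows = execute(sql, ctx.tenant_id)

    # Post-execution check: every returned row must belong to the caller's tenant.
    if any(row.get("tenant_id") != ctx.tenant_id for row in rows):
        raise PermissionError("result contained rows outside the tenant scope")

    # Deterministic abstinence: an empty result maps to a fixed message,
    # so the model is never asked to improvise an answer.
    if not rows:
        return {"status": "abstain", "message": ABSTAIN}
    return {"status": "ok", "rows": list(rows)}


def fake_execute(sql: str, tenant_id: int) -> list:
    # In-memory stand-in for a tenant-filtered query (assumption for the sketch).
    data = [{"tenant_id": 1, "order_id": 10}, {"tenant_id": 2, "order_id": 11}]
    return [row for row in data if row["tenant_id"] == tenant_id]
```

In a real deployment the `execute` callable would run against PostgreSQL under the restricted role, so this check is a second line of defence rather than the only one.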
Then I use a prompt-injection attack and a cross-tenant request to demonstrate how these layers hold the data isolation in place.
I’ll also be candid about the limit of this approach: PostgreSQL can deny a role access to a table, but it does not hide the table’s existence from that role. The guarantee is deterministic non-access, not metadata secrecy.
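The non-access guarantee rests on database statements along the following lines. This sketch only renders the SQL as strings; the role name `agent_ro`, the policy name `tenant_isolation`, the `tenant_id` column, and the custom setting `app.tenant_id` are assumptions chosen for illustration.

```python
from typing import List


def isolation_ddl(table: str, role: str = "agent_ro") -> List[str]:
    """Render grant and row-level-security statements for one table."""
    # Reject anything that is not a plain identifier, since the names are
    # interpolated directly into SQL text.
    for name in (table, role):
        if not name.isidentifier():
            raise ValueError(f"not a plain identifier: {name!r}")

    return [
        # Table-level access: the role can read this table and nothing else
        # unless granted separately.
        f"GRANT SELECT ON {table} TO {role};",
        # With RLS enabled and no matching policy, no rows are visible.
        f"ALTER TABLE {table} ENABLE ROW LEVEL SECURITY;",
        # FORCE applies the policies to the table owner as well.
        f"ALTER TABLE {table} FORCE ROW LEVEL SECURITY;",
        # An unset or empty app.tenant_id becomes NULL, so the comparison
        # is never true and the query returns no rows.
        f"CREATE POLICY tenant_isolation ON {table} FOR SELECT TO {role} "
        "USING (tenant_id = NULLIF(current_setting('app.tenant_id', true), '')::int);",
    ]


statements = isolation_ddl("orders")
```

The tool layer would then set the tenant per transaction, for example with `SET LOCAL app.tenant_id = '42'`, before running the agent's query. Superusers and roles with `BYPASSRLS` are not subject to these policies, which is one reason the agent's role should have neither.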
Why this is relevant
- As the call notes, the gap between a demo agent and a production one is not a model problem; it’s reliability, observability, and enforcing limits that survive an adversary. This is a governed, OSS-only pattern for letting an agent query real production data safely. It is demonstrated on a neutral multi-tenant e-commerce schema and applies to any sensitive-data system.
Who this is for
- Backend, data, and AI/platform engineers and architects who are putting an LLM in front of an internal or production database, and who need strong data boundaries that are robust and deterministic within a non-deterministic environment.
Key takeaways
- Insight into building a safe agent that prevents accidental data leaks caused by LLM hallucinations.
- Working patterns for data isolation via PostgreSQL within non-deterministic agentic AI systems.
- The audience will be able to replicate all or some of the 5 layers I demonstrate.
SLIDE
https://docs.google.com/presentation/d/1-CVmqnq0yGmLCwCGhKU9j6t_xRF6CC6n/edit?usp=sharing&ouid=117709760681210575135&rtpof=true&sd=true