An Agent That Builds Agents: AI-Powered Recipe Generation for Local-First ETL

31 Jul 2026 (Fri), 08:45 AM – 06:00 PM IST

NIMHANS Convention Centre, Bengaluru

This submission has been added to the schedule

Submitted Jun 24, 2026

I am submitting for: Track 2 - Building & implementing AI tools & agents in production
Type of session: 30 mins talk

Session Description

Most organisations have run Spark or Glue jobs on datasets that would comfortably fit in a single machine’s memory.
For about five years now, the real bottleneck has not been infrastructure; it has been tooling. OrcaSheets is a desktop application that lets analysts build full ETL workflows locally, with no cluster provisioning, no cloud egress, and no YAML configs. The remaining challenge was adoption: our internal DSL for transformation pipelines could not be exposed directly to users, and the “unlearning curve” for people used to cluster-based tooling had to be smooth. So we built an agent that generates agents.
The Recipe Engine takes natural language descriptions and produces validated, repeatable pipelines: not just SQL, but also complex algorithms such as k-means clustering and time-series forecasting, all running in a Rust binary with an extremely small footprint. For 90% of real-world workloads, OrcaSheets is a legitimate alternative to cloud ETL.

This talk challenges the default assumption that data engineering requires distributed systems. We’ll show production benchmarks comparing local execution against equivalent Glue/Spark jobs and demonstrate the recipe generation system end-to-end: from a messy 2GB CSV upload, through AI-generated transformations, to a clean, versioned output with clustering insights. We’ll dive into the meta-problem of building an agent that understands your internal DSL well enough to generate other agents (recipes): the prompt engineering, validation loops, and guardrails that make this reliable in production. The audience will leave with a concrete framework for deciding when local-first is the right architecture, and an understanding of how an AI-powered recipe engine eliminates the boilerplate that makes ETL expensive.
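To make the idea of a validation loop concrete, here is a minimal sketch of a generate, validate, and retry cycle. It is not the OrcaSheets Recipe Engine or its DSL: the JSON step format, the `ALLOWED_OPS` table, and the `fake_model` stand-in for an LLM call are all assumptions made for illustration.

```python
import json
from typing import Callable, List, Optional

# Hypothetical step schema: operation name -> required parameter names.
# These names are illustrative only and are NOT the OrcaSheets DSL.
ALLOWED_OPS = {
    "load_csv": {"path"},
    "drop_nulls": {"columns"},
    "kmeans": {"columns", "k"},
}


def validate_recipe(text: str) -> List[str]:
    """Return validation errors; an empty list means the recipe is valid."""
    try:
        recipe = json.loads(text)
    except json.JSONDecodeError as exc:
        return [f"not valid JSON: {exc.msg}"]
    if not isinstance(recipe, list) or not recipe:
        return ["recipe must be a non-empty list of steps"]
    errors = []
    for i, step in enumerate(recipe):
        if not isinstance(step, dict):
            errors.append(f"step {i}: must be an object")
            continue
        op = step.get("op")
        # Guardrail: only operations on the allowlist may appear.
        if not isinstance(op, str) or op not in ALLOWED_OPS:
            errors.append(f"step {i}: unknown op {op!r}")
            continue
        params = step.get("params", {})
        if not isinstance(params, dict):
            errors.append(f"step {i}: params must be an object")
            continue
        missing = ALLOWED_OPS[op] - set(params)
        if missing:
            errors.append(f"step {i}: missing params {sorted(missing)}")
    return errors


def generate_recipe(
    request: str,
    generate: Callable[[str, List[str]], str],
    max_attempts: int = 3,
) -> Optional[list]:
    """Ask the generator for a recipe, feeding validation errors back on retry."""
    feedback: List[str] = []
    for _ in range(max_attempts):
        candidate = generate(request, feedback)
        feedback = validate_recipe(candidate)
        if not feedback:
            return json.loads(candidate)
    return None  # Give up rather than run an unvalidated recipe.


def fake_model(request: str, feedback: List[str]) -> str:
    """Stand-in for an LLM call: wrong on the first try, fixed after feedback."""
    if not feedback:
        return '[{"op": "kmeans", "params": {"columns": ["spend"]}}]'
    return (
        '[{"op": "load_csv", "params": {"path": "sales.csv"}},'
        ' {"op": "kmeans", "params": {"columns": ["spend"], "k": 3}}]'
    )


recipe = generate_recipe("cluster customers by spend", fake_model)
```

In this sketch the first candidate is rejected because the `kmeans` step lacks `k`; the error list is passed back to the generator, and the second candidate passes. Returning `None` after `max_attempts` is one possible guardrail: nothing runs unless it has been validated.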

Takeaways

How to build an “agent that creates agents”: the architecture of a recipe generation engine that translates natural language into validated, repeatable transformation pipelines using an internal DSL, including the prompt engineering and validation patterns that make it production-reliable.
A practical framework for when local-first analytics replaces cloud ETL (Spark/Glue), with real cost comparisons showing 10-50x savings and how AI-generated recipes reduce pipeline development time from days to minutes while keeping data fully on-premises.

Target Audience

Data engineers exploring alternatives to cloud ETL for mid-size datasets; ML engineers interested in local-first analytical tooling; platform engineers building internal DSLs or agent-powered developer tools; anyone building AI agents that generate structured, validated outputs rather than freeform text.

Bio

Mayur, CTO at OrcaSheets, Co-founder at DataOrc. I build practical, scale-proof systems, having moved from QA through backend and frontend work to co-founding one of the fastest analytics engines today.

At OrcaSheets, I’m making analytics that run locally on your PC, handling 100M+ rows in seconds and turning raw data into insights, dashboards, and decisions.

Before this, I co-founded DataOrc, a data engineering consultancy that’s delivered for 75+ enterprises, helping them scale data pipelines, cut cloud bills by up to 90%, and stay production-grade at petabyte scale.
LinkedIn: https://www.linkedin.com/in/mayur-jadhav-1a3a4652/
Twitter: https://x.com/mayurJ13

Draft Slides

https://drive.google.com/file/d/1nefNm1c_c_9LjFfEaLwl7bKVWVY87aRa/view?usp=sharing

Hosted by

The Fifth Elephant

Jumpstart better data engineering and AI futures

The Fifth Elephant 2026 Annual Conference
