Speak at The Fifth Elephant 2026 Annual Conference
Share your work with the community
Fri, Jul 31, 2026, 09:00 AM – 06:00 PM IST
Mayur Jadhav
@mjadhav13
Submitted Jun 24, 2026
Most organisations have run Spark or Glue jobs for datasets that comfortably fit in a single machine’s memory.
For about five years now, the real bottleneck has not been infrastructure; it has been tooling. OrcaSheets is a desktop application that lets analysts build full ETL workflows locally, with no cluster provisioning, no cloud egress, and no YAML configs. The remaining challenge was adoption: our internal DSL for transformation pipelines could not be exposed directly to users, and the unlearning curve had to be smooth. So we built an agent that generates agents.
The Recipe Engine takes natural-language descriptions and produces validated, repeatable pipelines. These cover not just SQL but also more complex algorithms such as k-means clustering and time-series forecasting, all running on a Rust binary with an extremely small footprint. For 90% of real-world workloads, OrcaSheets is a legitimate alternative to cloud ETL.
This talk challenges the default assumption that data engineering requires distributed systems. We’ll show production benchmarks comparing local execution against equivalent Glue/Spark jobs and demonstrate the recipe generation system end-to-end: from a messy 2 GB CSV upload through AI-generated transformations to a clean, versioned output with clustering insights.
We’ll dive into the meta-problem of building an agent that understands your internal DSL well enough to generate other agents (recipes): the prompt engineering, validation loops, and guardrails that make this reliable in production. The audience will leave with a concrete framework for deciding when local-first is the right architecture and how an AI-powered recipe engine eliminates the boilerplate that makes ETL expensive.
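As a rough illustration of what a validation loop of this kind can look like, here is a minimal Python sketch. It is not the OrcaSheets Recipe Engine: the mini-DSL (`ALLOWED_OPS`), the `validate` and `generate_recipe` functions, and the `stub_generator` that stands in for a model call are all hypothetical names invented for this example. The idea shown is only the general pattern: generate a candidate recipe, check it against the DSL's rules, and feed any errors back into the next attempt.

```python
from typing import Callable

# Hypothetical mini-DSL (an assumption for this sketch): each step is a dict
# with an "op" name plus the parameters that op requires.
ALLOWED_OPS = {
    "load_csv": {"path"},
    "drop_nulls": {"columns"},
    "kmeans": {"columns", "k"},
}


def validate(recipe: list[dict]) -> list[str]:
    """Return human-readable errors; an empty list means the recipe is valid."""
    errors = []
    for i, step in enumerate(recipe):
        op = step.get("op")
        if op not in ALLOWED_OPS:
            errors.append(f"step {i}: unknown op {op!r}")
            continue
        missing = ALLOWED_OPS[op] - set(step)
        if missing:
            errors.append(f"step {i}: {op} is missing {sorted(missing)}")
    return errors


def generate_recipe(
    prompt: str,
    generate: Callable[[str, list[str]], list[dict]],
    max_attempts: int = 3,
) -> list[dict]:
    """Ask the generator for a recipe, re-prompting with validator feedback."""
    feedback: list[str] = []
    for _ in range(max_attempts):
        recipe = generate(prompt, feedback)
        feedback = validate(recipe)
        if not feedback:
            return recipe
    raise ValueError(f"no valid recipe after {max_attempts} attempts: {feedback}")


def stub_generator(prompt: str, feedback: list[str]) -> list[dict]:
    """Stand-in for a model call: fixes its output once it sees feedback."""
    steps = [{"op": "load_csv", "path": "sales.csv"}]
    if feedback:
        steps.append({"op": "kmeans", "columns": ["revenue"], "k": 3})
    else:
        steps.append({"op": "kmeans", "columns": ["revenue"]})  # missing "k"
    return steps


if __name__ == "__main__":
    print(generate_recipe("cluster customers by revenue", stub_generator))
```

In this sketch the guardrail is the validator itself: nothing reaches execution unless every step uses a known operation with all of its required parameters, and the retry budget bounds how long the generator is allowed to keep trying.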
Data engineers exploring alternatives to cloud ETL for mid-size datasets; ML engineers interested in local-first analytical tooling; platform engineers building internal DSLs or agent-powered developer tools; anyone building AI agents that generate structured, validated outputs rather than freeform text.
Mayur, Founder at OrcaSheets, DataOrc. I build practical, scale-proof systems, starting from QA, through backend/frontend, to co-founding one of the fastest analytics engines today.
At OrcaSheets, I’m making analytics that run locally on your PC, handling 100M+ rows in seconds and turning raw data into insights, dashboards, and decisions.
Before this, I co-founded DataOrc, a data engineering consultancy that’s delivered for 75+ enterprises, helping them scale data pipelines, cut cloud bills by up to 90%, and stay production-grade at petabyte scale.