Every data engineer has shipped a pipeline where a column name was misspelled, a type was silently coerced, or a nullable field failed in production.
In Python and Spark, these are runtime surprises.
In Rust, they can be compile-time errors.
This hands-on workshop shows how Rust’s type system can encode data contracts (schemas, column types, valid states, and query structure) so the compiler catches bugs before your pipeline runs.
Participants will build a type-safe mini analytics pipeline from scratch using patterns from production systems like DataFusion and Polars. Each concept is introduced through short, incremental exercises, ending with a system where:
If it compiles, the data is structurally correct.
By the end of this workshop, participants will be able to:
- Use the newtype pattern to prevent mixing domain types (e.g., CustomerId vs OrderId)
- Apply phantom types to enforce data lifecycle stages (raw → validated → transformed)
- Design trait-based transform pipelines where incompatible schemas fail at compile time
- Build typestate-driven query APIs that prevent invalid execution paths
- Understand how systems like DataFusion and Polars use these patterns internally
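As a rough illustration of the first two objectives, here is a minimal sketch of newtypes and phantom types working together. Apart from `CustomerId` and `OrderId`, every name in it (`Batch`, `Raw`, `Validated`, `validate`, `total`, `customer_label`) is hypothetical and chosen for this example; none of it is taken from DataFusion, Polars, or the workshop material.

```rust
use std::marker::PhantomData;

// Newtypes: both wrap a u64, but the compiler treats them as distinct types,
// so an OrderId cannot be passed where a CustomerId is expected.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub struct CustomerId(pub u64);

#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub struct OrderId(pub u64);

// Lifecycle stages as zero-sized marker types (hypothetical names).
#[derive(Debug)]
pub struct Raw;
#[derive(Debug)]
pub struct Validated;

// A batch of order amounts, tagged at the type level with its stage.
#[derive(Debug)]
pub struct Batch<Stage> {
    customer: CustomerId,
    amounts: Vec<f64>,
    _stage: PhantomData<Stage>,
}

impl Batch<Raw> {
    pub fn new(customer: CustomerId, amounts: Vec<f64>) -> Self {
        Batch { customer, amounts, _stage: PhantomData }
    }

    // The only way to obtain a Batch<Validated> is to pass this check.
    pub fn validate(self) -> Result<Batch<Validated>, String> {
        if self.amounts.iter().any(|a| !a.is_finite() || *a < 0.0) {
            return Err(format!("invalid amount for {:?}", self.customer));
        }
        Ok(Batch {
            customer: self.customer,
            amounts: self.amounts,
            _stage: PhantomData,
        })
    }
}

impl Batch<Validated> {
    // Aggregation is defined only for validated batches.
    pub fn total(&self) -> f64 {
        self.amounts.iter().sum()
    }
}

// Accepts only a CustomerId; `customer_label(OrderId(7))` does not compile.
pub fn customer_label(id: CustomerId) -> String {
    format!("customer-{}", id.0)
}
```

With these definitions, calling `total()` on a `Batch<Raw>` is rejected by the compiler because the method does not exist for that stage, which is the kind of guarantee the raw → validated → transformed lifecycle relies on.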
A working type-safe analytics pipeline, built incrementally using:
- Newtypes → Prevent invalid domain mixing
- Phantom types → Encode data lifecycle stages
- Trait-based transforms → Enforce schema compatibility
- Typestate + Builder patterns → Guarantee valid query construction
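The last two building blocks can be sketched in a similar way. This is an illustrative outline under stated assumptions, not the pipeline built in the workshop: the `Transform` trait, the `Chain` combinator, the row structs, and the `QueryBuilder` states are all hypothetical names invented for this example.

```rust
// Row schemas as plain structs (hypothetical).
pub struct RawOrder {
    pub customer: u64,
    pub amount_cents: i64,
}

pub struct OrderRevenue {
    pub customer: u64,
    pub revenue: f64,
}

// A transform declares its input and output schema as associated types.
pub trait Transform {
    type In;
    type Out;
    fn apply(&self, row: Self::In) -> Self::Out;
}

pub struct ToRevenue;
impl Transform for ToRevenue {
    type In = RawOrder;
    type Out = OrderRevenue;
    fn apply(&self, row: RawOrder) -> OrderRevenue {
        OrderRevenue {
            customer: row.customer,
            revenue: row.amount_cents as f64 / 100.0,
        }
    }
}

pub struct ApplyDiscount {
    pub rate: f64,
}
impl Transform for ApplyDiscount {
    type In = OrderRevenue;
    type Out = OrderRevenue;
    fn apply(&self, row: OrderRevenue) -> OrderRevenue {
        OrderRevenue {
            customer: row.customer,
            revenue: row.revenue * (1.0 - self.rate),
        }
    }
}

// Chaining type-checks only when the first output schema equals the second
// input schema; Chain(ApplyDiscount { .. }, ToRevenue) does not compile.
pub struct Chain<A, B>(pub A, pub B);
impl<A, B> Transform for Chain<A, B>
where
    A: Transform,
    B: Transform<In = A::Out>,
{
    type In = A::In;
    type Out = B::Out;
    fn apply(&self, row: A::In) -> B::Out {
        self.1.apply(self.0.apply(row))
    }
}

// Typestate builder: `build` exists only once a table has been chosen.
pub struct NoTable;
pub struct HasTable(String);

pub struct QueryBuilder<State> {
    state: State,
    columns: Vec<String>,
}

impl QueryBuilder<NoTable> {
    pub fn new() -> Self {
        QueryBuilder { state: NoTable, columns: Vec::new() }
    }

    // Consumes the builder and moves it into the HasTable state.
    pub fn from_table(self, table: &str) -> QueryBuilder<HasTable> {
        QueryBuilder { state: HasTable(table.to_string()), columns: self.columns }
    }
}

impl QueryBuilder<HasTable> {
    pub fn select(mut self, column: &str) -> Self {
        self.columns.push(column.to_string());
        self
    }

    pub fn build(self) -> String {
        let cols = if self.columns.is_empty() {
            "*".to_string()
        } else {
            self.columns.join(", ")
        };
        format!("SELECT {} FROM {}", cols, self.state.0)
    }
}
```

A call such as `QueryBuilder::new().build()` fails to compile because `build` is not defined in the `NoTable` state, so an invalid execution path is ruled out before the program runs. Storing the table name inside the state type, rather than in an `Option` field, is one design choice among several; it avoids any runtime check for a missing table.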
- Rust developers building data-intensive systems
- Data engineers exploring Rust
- Engineers frustrated with runtime schema/type errors in pipelines
Navdeep Agarwal and Mayur Jadhav are co-founders of OrcaSheets, where they build a very fast, local-first analytics engine.
This is a hybrid workshop. You can attend:
- In-person in Bengaluru, or
- Remotely (live online)
To participate, please purchase a workshop ticket for 17 April.
Note:
- Conference tickets do not include workshop access
- A valid workshop ticket is required for both in-person and remote participation
- Seats (both in-person and remote) are limited