Gaurav Sharma

@bewithgaurav

Rust Inside, Python Outside: Building a Blazing-Fast Driver SDK for Bulk Data Ingestion

Submitted Mar 14, 2026

Abstract

Most database drivers pay a hidden tax on every row: Python value to driver object, driver object to protocol format, protocol format to packet, packet to network. Each conversion adds latency and memory pressure that compounds under bulk workloads. In this session, I show how the Microsoft Azure SQL Drivers team built a Rust-native, direct-to-wire BulkCopy path that eliminates this multi-hop penalty. Python submits row data through a thin PyO3 boundary, and Rust converts each value directly to TDS wire format and writes it straight into the packet writer - no intermediate domain objects, no batch-level buffering, no second conversion pass.
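To make the direct-to-wire idea concrete, here is a minimal sketch of encoding row values straight into a packet buffer with no intermediate objects. The `Value` enum, the function names, and the byte layout are all assumptions for illustration: the layout is a simplified length-prefixed little-endian format loosely inspired by TDS, not the encoder actually used in mssql-rs.

```rust
use std::convert::TryFrom;

/// Hypothetical row value type (an assumption; the driver's real types
/// are not described in this abstract).
pub enum Value<'a> {
    Null,
    Int(i32),
    Text(&'a str),
}

/// Appends one value to the packet buffer in a simplified layout:
/// - Null: a single 0 byte
/// - Int:  1-byte length (4) followed by the little-endian bytes
/// - Text: 2-byte little-endian byte length followed by UTF-16LE data
pub fn write_value(buf: &mut Vec<u8>, v: &Value<'_>) -> Result<(), String> {
    match v {
        Value::Null => buf.push(0),
        Value::Int(i) => {
            buf.push(4);
            buf.extend_from_slice(&i.to_le_bytes());
        }
        Value::Text(s) => {
            // Reject strings whose encoded size does not fit the length prefix.
            let byte_len = s.encode_utf16().count() * 2;
            let len = u16::try_from(byte_len)
                .map_err(|_| format!("text too long: {} bytes", byte_len))?;
            buf.extend_from_slice(&len.to_le_bytes());
            // Encode each UTF-16 code unit directly into the buffer.
            for unit in s.encode_utf16() {
                buf.extend_from_slice(&unit.to_le_bytes());
            }
        }
    }
    Ok(())
}

/// Writes a whole row by appending each value in order; nothing is
/// collected into a per-row or per-batch structure first.
pub fn write_row(buf: &mut Vec<u8>, row: &[Value<'_>]) -> Result<(), String> {
    for v in row {
        write_value(buf, v)?;
    }
    Ok(())
}
```

The point of the sketch is the shape of the path: each value goes from its source representation to bytes in the outgoing buffer in one step, which is the property the talk attributes to the BulkCopy design.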

In a 4M-row CSV ingestion (batch size 1000), this design reduced total load time from roughly 55 hours (baseline pyodbc executemany) to about 44 minutes (mssql-python bulk_copy) - ref.
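For a sense of scale, the figures above can be turned into a speedup factor and approximate throughput. This is only arithmetic on the rounded numbers quoted in the abstract ("roughly 55 hours", "about 44 minutes"), so the results are approximate; the constant and function names are made up for this sketch.

```rust
/// Row count and timings as quoted in the abstract (rounded figures).
pub const ROWS: f64 = 4_000_000.0;
pub const BASELINE_SECS: f64 = 55.0 * 3600.0; // roughly 55 hours
pub const BULK_COPY_SECS: f64 = 44.0 * 60.0; // about 44 minutes

/// How many times faster the new path is than the baseline.
pub fn speedup(baseline_secs: f64, new_secs: f64) -> f64 {
    baseline_secs / new_secs
}

/// Average rows ingested per second over the whole load.
pub fn rows_per_sec(rows: f64, secs: f64) -> f64 {
    rows / secs
}
```

With these inputs, `speedup(BASELINE_SECS, BULK_COPY_SECS)` comes out at about 75x, and the average throughput moves from about 20 rows per second to about 1,515 rows per second.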

The second half of the talk covers the async runtime design behind this path. We wired Tokio into the ingestion pipeline so that row serialization, packet construction, and network I/O are all async operations, with the Python GIL released during network writes. I will walk through how this same architecture gives us a clean path to native async Python wrappers as well. I will also briefly show how fuzz testing on the Rust protocol layer caught crash-class parser bugs early - integer overflows and unwrap-on-None panics that were converted into explicit error paths before release.
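As an illustration of the fuzz-testing point above, the sketch below shows the general pattern of replacing overflow-prone arithmetic and `unwrap` calls with explicit error paths. The function, the error type, and the length-prefixed layout are hypothetical and are not taken from the mssql-rs parser.

```rust
/// Hypothetical parse errors (names are assumptions for this sketch).
#[derive(Debug, PartialEq)]
pub enum ParseError {
    /// The buffer ended before the bytes we needed.
    Truncated { needed: usize, available: usize },
    /// An offset or length computation would overflow `usize`.
    LengthOverflow,
}

/// Reads a 2-byte little-endian length prefix at `offset`, then that many
/// payload bytes. Returns the payload and the offset just past it.
pub fn read_len_prefixed(buf: &[u8], offset: usize) -> Result<(&[u8], usize), ParseError> {
    // checked_add instead of `offset + 2`, which could overflow on hostile input.
    let header_end = offset.checked_add(2).ok_or(ParseError::LengthOverflow)?;
    // `get` returns None on out-of-range slices; map that to an error
    // instead of indexing or unwrapping.
    let header = buf.get(offset..header_end).ok_or(ParseError::Truncated {
        needed: header_end,
        available: buf.len(),
    })?;
    let len = u16::from_le_bytes([header[0], header[1]]) as usize;
    let end = header_end.checked_add(len).ok_or(ParseError::LengthOverflow)?;
    let payload = buf.get(header_end..end).ok_or(ParseError::Truncated {
        needed: end,
        available: buf.len(),
    })?;
    Ok((payload, end))
}
```

A fuzzer feeding arbitrary bytes and offsets into a function shaped like this should only ever see `Ok` or a `ParseError`, never a panic, which is the kind of behavior the abstract describes reaching before release.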

Key Takeaways

  1. Eliminating conversion stages matters more than optimizing individual conversions: a zero-hop path from Python values to TDS wire format delivered dramatic throughput gains.
  2. Designing the async runtime boundary correctly (Tokio + GIL discipline) makes the system both fast today and extensible to async Python wrappers tomorrow.

Audience

Rust systems developers, engineers curious about how database drivers work under the hood, and teams building high-throughput interop layers between Rust and higher-level languages.

Bio

I am a Software Engineer on the Microsoft Azure SQL Drivers team working on the open-source mssql-python (MS Python Driver for Azure SQL) and mssql-rs (Rust implementation of the TDS protocol).
My work focuses on high-performance database connectivity, language interop, and building efficient data paths between Python applications and Azure SQL. I have previously presented at the Azure SQL UG conference and at numerous Python meetups, and I am an active participant at The Fifth Elephant and PyCon.

References

Everything discussed in this talk is open source and powered by community contributions and feedback. The two projects are mssql-python and mssql-rs.

We welcome issues, feature requests, and contributions on both repositories.

Hosted by

A community of Rust language contributors and end-users from Bangalore. We are present on the following Telegram channels: https://t.me/RustIndia and https://t.me/fpncr. LinkedIn: https://www.linkedin.com/company/rust-india/