Rows, columns, and consequences

Speak at Rootconf’s Special Edition on Databases

Nitesh Vijay

What Happens When Your Embedding Model Changes and You Have 50 Million Vectors in Production

Submitted Mar 15, 2026

You upgrade your embedding model. The new one is better, obviously. Your recall improves on benchmarks, your vectors are more expressive, and everything looks great in staging. Then you deploy it to production, where 50 million vectors were indexed with the old model. Suddenly nothing works. Queries embedded with the new model return garbage, because they are compared against stored vectors from the old model, and the two models place vectors in completely different geometric spaces. Sound familiar?
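To make the "different geometric spaces" point concrete, here is a toy sketch in Python. The two "models" are hypothetical stand-ins, not real embedding models: the new one is assumed to preserve every neighborhood of the old one but with its axes rotated by 90 degrees. Each model is perfectly consistent with itself, yet a query embedded with one and searched against an index built with the other lands on the wrong document.

```python
import math

def cosine(a, b):
    """Cosine similarity between two 2-D vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (math.hypot(*a) * math.hypot(*b))

# Hypothetical "old model": a hand-written lookup standing in for a real model.
OLD = {"cat": (1.0, 0.0), "kitten": (0.9, 0.1), "car": (0.0, 1.0)}

def embed_old(text):
    return OLD[text]

def embed_new(text):
    # Hypothetical "new model": same neighborhoods, axes rotated 90 degrees.
    x, y = OLD[text]
    return (-y, x)

def nearest(query_vec, index):
    """Return the id of the stored vector most similar to the query."""
    return max(index, key=lambda doc: cosine(query_vec, index[doc]))

old_index = {doc: embed_old(doc) for doc in OLD}
new_index = {doc: embed_new(doc) for doc in OLD}

query = embed_new("cat")
matched = nearest(query, new_index)      # query and index from the same model
mismatched = nearest(query, old_index)   # new-model query, old-model index
print(matched, mismatched)
```

Real models differ in far messier ways than a rotation (different dimensions, different training data), but the failure mode is the same: similarity scores across models are not meaningful.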

This talk is about the ugly reality of vector schema evolution in production databases. I work on Azure Cosmos DB at Microsoft, and I’ve watched this exact scenario play out with customers running AI workloads at scale. The problem isn’t just re-embedding your data. It’s doing it without downtime, without doubling your storage costs during migration, and without silently returning bad results while the re-indexing runs.

Takeaways

  1. A practical playbook for migrating vector indexes when your embedding model changes: dual-write patterns, shadow indexes, progressive re-indexing, and how to validate that your new index actually works before cutting over.
  2. The tradeoffs between “re-index everything at once” vs. “lazy re-index on read” vs. “run both models in parallel” and when each strategy makes sense depending on your data size, latency budget, and tolerance for stale results.
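
As a rough illustration of how the pieces in the first takeaway fit together, here is a minimal in-memory sketch in Python. It is not how Azure Cosmos DB or any particular database implements this; `MigratingStore`, the toy bag-of-words `embed_old` / `embed_new` functions, and the `golden` query set are all hypothetical names made up for the example. Reads keep hitting the old index while writes go to both, a background job backfills the shadow index in batches, and cutover happens only if the shadow index is complete and passes a validation check.

```python
import math

def cosine(a, b):
    """Cosine similarity; 0.0 if either vector is all zeros."""
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return sum(x * y for x, y in zip(a, b)) / norm if norm else 0.0

# Hypothetical stand-ins for real models: word counts over fixed vocabularies.
def embed_old(text):
    words = text.split()
    return tuple(float(words.count(w)) for w in ("cat", "dog", "car"))

def embed_new(text):
    words = text.split()
    return tuple(float(words.count(w)) for w in ("car", "cat", "dog", "truck"))

class MigratingStore:
    def __init__(self, embed_old, embed_new, existing_docs):
        self.embed_old, self.embed_new = embed_old, embed_new
        self.docs = dict(existing_docs)  # doc_id -> text (source of truth)
        # Live index: old-model vectors, serves reads until cutover.
        self.live = {d: embed_old(t) for d, t in self.docs.items()}
        self.shadow = {}                 # new-model vectors, built on the side
        self.cut_over = False

    def upsert(self, doc_id, text):
        # Dual-write: every write lands in both indexes during the migration.
        self.docs[doc_id] = text
        self.live[doc_id] = self.embed_old(text)
        self.shadow[doc_id] = self.embed_new(text)

    def reindex_batch(self, batch_size):
        # Progressive re-indexing: backfill a bounded batch, report what is left.
        pending = [d for d in self.docs if d not in self.shadow][:batch_size]
        for d in pending:
            self.shadow[d] = self.embed_new(self.docs[d])
        return len(self.docs) - len(self.shadow)

    def search(self, query, k=1):
        # Query and index must always come from the same model.
        if self.cut_over:
            index, qv = self.shadow, self.embed_new(query)
        else:
            index, qv = self.live, self.embed_old(query)
        return sorted(index, key=lambda d: cosine(qv, index[d]), reverse=True)[:k]

    def try_cutover(self, golden, min_recall=0.9):
        # Validate before switching: shadow must be complete and must return
        # the expected top hit for enough of the golden queries.
        if len(self.shadow) < len(self.docs):
            return False
        hits = 0
        for query, expected in golden.items():
            qv = self.embed_new(query)
            top = max(self.shadow, key=lambda d: cosine(qv, self.shadow[d]))
            hits += top == expected
        self.cut_over = hits / len(golden) >= min_recall
        return self.cut_over

store = MigratingStore(embed_old, embed_new, {"d1": "cat dog", "d2": "car truck"})
store.upsert("d3", "dog dog")          # arrives mid-migration, dual-written
golden = {"truck": "d2", "cat": "d1"}  # hypothetical query -> expected top hit
while store.reindex_batch(batch_size=1) > 0:
    pass                               # in production: throttle between batches
print(store.try_cutover(golden), store.search("truck"))
```

The sketch deliberately keeps both indexes alive for the whole migration, which is the storage cost the talk abstract warns about; a lazy re-index-on-read variant would trade that cost for a longer period of mixed results.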

Who is this for?

Database engineers, backend developers, and ML engineers who run vector search in production and haven’t yet dealt with an embedding model migration (but will). Also useful for anyone who is designing a vector search system and wants to avoid painting themselves into a corner on day one.

About me

Nitesh Vijay, Senior Software Engineer at Microsoft working on Azure Cosmos DB. I spend my days thinking about how to store, index, and retrieve vectors at scale inside a globally distributed database. Previously worked on distributed systems and cloud infrastructure. BITS Pilani alum.

