Abhishek Khardenavis

Solving data sparsity with a data flywheel built on world foundation models and simulator environment

Submitted Sep 26, 2025

The increasing adoption of vision models in fields like Autonomous Driving and Oil & Gas is exciting, but a significant hurdle remains: the scarcity of readily available, domain-specific vision data. Creating datasets that accurately reflect the complexities of these industries and the specific scenarios they encounter is challenging and often requires substantial time and resources. This limitation can slow down development and impact the performance of AI solutions designed for these critical applications.

To address this challenge, we propose combining World Foundation Models with advanced simulation environments. Together they enable a robust “data flywheel”: a continuous loop in which simulations generate data for industry-specific scenarios, and generative world foundation models then refine and multiply that data. This approach has the potential to dramatically accelerate the development and deployment of vision AI solutions by providing a sustainable and scalable source of high-quality training data, all within a controlled and cost-effective environment.
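The loop described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the pipeline presented in the talk: `simulate`, `wfm_variations`, the `quality` score, and the "least-covered scenario first" selection rule are all hypothetical stand-ins for a real simulator, a real world foundation model, and a real data-quality check.

```python
import random
from dataclasses import dataclass


@dataclass(frozen=True)
class Sample:
    scenario: str   # e.g. "fog" or "night"
    source: str     # "sim" for simulator output, "wfm" for generated variations
    quality: float  # placeholder quality score in [0, 1]


def simulate(scenario, n, rng):
    """Hypothetical stand-in for a simulator run: returns n seed samples."""
    return [Sample(scenario, "sim", rng.uniform(0.7, 1.0)) for _ in range(n)]


def wfm_variations(sample, k, rng):
    """Hypothetical stand-in for a world foundation model: k variations of one sample."""
    return [Sample(sample.scenario, "wfm", rng.uniform(0.0, 1.0)) for _ in range(k)]


def flywheel(scenarios, rounds=3, per_scenario=4, variations=3,
             min_quality=0.6, seed=0):
    """Run a toy data flywheel and return the accumulated dataset."""
    rng = random.Random(seed)
    dataset = []
    for _ in range(rounds):
        # Assumed selection rule: target the scenario with the least data so far.
        counts = {s: sum(1 for d in dataset if d.scenario == s) for s in scenarios}
        target = min(scenarios, key=lambda s: counts[s])

        # 1. The simulator produces seed data for the sparse scenario.
        seeds = simulate(target, per_scenario, rng)
        # 2. The generative model multiplies each seed into variations.
        generated = [v for s in seeds for v in wfm_variations(s, variations, rng)]
        # 3. Only samples that pass the quality gate enter the dataset.
        dataset.extend(x for x in seeds + generated if x.quality >= min_quality)
    return dataset


if __name__ == "__main__":
    data = flywheel(["fog", "night", "flare_stack"])
    for scenario in ("fog", "night", "flare_stack"):
        print(scenario, sum(1 for d in data if d.scenario == scenario))
```

In a real system the selection rule would more likely be driven by where the target vision model performs worst, and the retained data would also feed back into fine-tuning the foundation model; both steps are omitted here for brevity.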

Key Takeaways:

  • Scaling up data enrichment processes using generative models and simulator environments.
  • Approaches for solving data sparsity in domain-specific scenarios.
  • Architecting a self-improving process where generated data enhances both the target AI models and the foundation model itself.

Target Audience

  • AI practitioners and enthusiasts looking to solve data sufficiency issues
  • Data strategy leads and stakeholders working on domain-specific data provisioning strategies
  • AI practitioners interested in the intersection of vision and world foundation models

About Speaker:

Abhishek Khardenavis is a subject matter expert in the Autonomous Driving practice at KPIT. His responsibilities include architecting AI pipelines, with a focus on world foundation models (WFMs) and end-to-end (E2E) AI-based autonomy stacks. Previously, he worked on domain-specific AI problems at Ecozen, Foghorn and Johnson Controls, in domains such as Solar Energy, Oil & Gas, HVAC and Smart Buildings.

Hosted by

Jump starting better data engineering and AI futures