The Fifth Elephant 2017

On data engineering and application of ML in diverse domains


Gabbar: Machine learning to guard OpenStreetMap

Submitted by Bhargav Kowshik (@bkowshik) on Sunday, 30 April 2017

Preview video

Section: Full talk for data engineering track
Technical level: Intermediate



OpenStreetMap is the largest free and open map of the world! An average of 2 million features are touched by volunteers around the world every single day. Amazing, isn’t it? The global scale and the local diversity bring a host of challenges for maintaining high-quality data on OpenStreetMap.

The Mapbox data team works closely with communities of mappers to validate and protect OpenStreetMap data. In this talk, I will do a deep dive into the diversity of mapping on OpenStreetMap, the intricate and challenging data quality problems, the lessons learned from building open tools to aid mappers, and how Gabbar, a machine-learning-based infrastructure, can guard OpenStreetMap from invalid and suspicious edits.

In this talk, I hope to share how open and collaborative projects like OpenStreetMap and Wikipedia are benefiting from open and collaborative machine learning, and the opportunities for us as volunteers to build cool and important technology in the open and use the power of AI for a better world for all of us. The intended audience is people interested in and/or practicing machine learning to solve data problems, as well as people interested in and/or contributing to the tech behind open projects like OpenStreetMap and Wikipedia.


Edits in a few minutes on OpenStreetMap


1. OpenStreetMap (OSM)
  • OSM is the largest free and open map of the world, the Wikipedia of maps.
  • On a typical day, 2 million features are created, half a million are modified and a quarter of a million are deleted.
2. Validation
  • The OSM community, the heartbeat of OpenStreetMap.
  • Interesting problems and inherent challenges.
3. Tools
4. Gabbar
  • Guarding OSM from invalid or suspicious edits.
  • Machine-learning-based infrastructure, collaboratively built in the open.
  • Development workflow with Python data science tools.
  • Learnings, current model performance and impact.
5. Future
  • Using AI to help make OSM the best map of the world!
  • Using open collaborative machine learning for open collaborative projects.
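
To make outline item 4 a little more concrete, here is a minimal sketch of what scoring an edit for review could look like. It is not Gabbar's actual model or data schema: the `ChangesetSummary` fields, the three features, the hand-picked weights and the threshold are all assumptions made up for illustration. In a real workflow the weights would be learned from labelled edits with a data science library rather than written by hand.

```python
import math
from dataclasses import dataclass


@dataclass
class ChangesetSummary:
    """Hypothetical per-changeset counts; not Gabbar's actual schema."""
    created: int
    modified: int
    deleted: int


def extract_features(cs: ChangesetSummary) -> list[float]:
    """Turn raw counts into [deleted share, modified share, log of size]."""
    total = cs.created + cs.modified + cs.deleted
    if total == 0:
        return [0.0, 0.0, 0.0]
    return [cs.deleted / total, cs.modified / total, math.log1p(total)]


# Illustrative, hand-picked values (assumptions); a trained model
# would learn these from edits that reviewers have already labelled.
WEIGHTS = [4.0, 0.5, 0.3]
BIAS = -3.0


def suspicion_score(cs: ChangesetSummary) -> float:
    """Logistic score in (0, 1); higher means more worth a human look."""
    z = BIAS + sum(w * x for w, x in zip(WEIGHTS, extract_features(cs)))
    return 1.0 / (1.0 + math.exp(-z))


def flag_for_review(cs: ChangesetSummary, threshold: float = 0.5) -> bool:
    """Flag a changeset when its score reaches the (assumed) threshold."""
    return suspicion_score(cs) >= threshold
```

With these assumed weights, a small changeset that mostly adds features scores low, while one that only deletes several hundred features scores high and would be queued for a human reviewer.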

Speaker bio

Hey, I am Bhargav Kowshik, a Software Engineer at Mapbox, Bengaluru. I build tools to scale data operations at Mapbox. I am passionate about people and communities, open data and technology, creativity and side projects. Previously, as the first engineer at Nextdrop, I helped build a platform to track water availability and consumption. You can contact me at:





  • Zainab Bawa (@zainabbawa) Reviewer 2 years ago

    Is Gabbar open source?

  • Zainab Bawa (@zainabbawa) Reviewer 2 years ago

    Share a two-min preview video explaining what this talk is about and why participants should attend in order to complete the review.

    • Bhargav Kowshik (@bkowshik) Proposer 2 years ago

      Sure. I will shoot a two-minute preview video over the weekend and share it by 28th May, 2017 (Sun).

  • Bhargav Kowshik (@bkowshik) Proposer 2 years ago (edited 2 years ago)

    Just uploaded my preview video on YouTube at:
