The Fifth Elephant 2017

On data engineering and application of ML in diverse domains

Bhargav Kowshik

@bkowshik

Gabbar: Machine learning to guard OpenStreetMap

Submitted Apr 30, 2017

OpenStreetMap is the largest free and open map of the world! An average of 2 million features are touched by volunteers around the world every single day. Amazing, isn’t it? The global scale and the local diversity bring a host of challenges for maintaining high-quality data on OpenStreetMap.

The Mapbox data team works closely with communities of mappers to validate and protect OpenStreetMap data. In this talk, I will take a deep dive into the diversity of mapping on OpenStreetMap, the intricate and challenging data quality problems, learnings from building open tools to aid mappers, and how Gabbar, a machine-learning-based infrastructure, can guard OpenStreetMap from invalid and suspicious edits.

In this talk, I hope to share how open and collaborative projects like OpenStreetMap and Wikipedia are benefitting from open and collaborative machine learning, and the opportunities for us as volunteers to build cool and important technology in the open and use the power of AI for a better world for all of us. The intended audience is people interested in and/or practicing machine learning to solve data problems, as well as people interested in and/or contributing to the tech for open projects like OpenStreetMap and Wikipedia.

Outline

Edits in a few minutes on OpenStreetMap

1. OpenStreetMap (OSM)
  • OSM is the largest free and open map of the world, the Wikipedia of maps.
  • On a typical day, 2 million features are created, half a million are modified and a quarter of a million are deleted.
2. Validation
  • The OSM community, the heartbeat of OpenStreetMap.
  • Interesting problems and inherent challenges.
3. Tools
4. Gabbar
  • Guarding OSM from invalid or suspicious edits.
  • Machine-learning-based infrastructure, collaboratively built in the open.
  • Development workflow with Python data science tools.
  • Learnings, current model performance and impact.
5. Future
  • Using AI to help make OSM the best map of the world!
  • Using open collaborative machine learning for open collaborative projects.

Speaker bio

Hey, I am Bhargav Kowshik, a Software Engineer at Mapbox, Bengaluru. I build tools to scale data operations at Mapbox. I am passionate about people and communities, open data and technology, creativity and side projects. Previously, as the first engineer at Nextdrop, I helped build a platform to track water availability and consumption. You can contact me at:

Slides

https://bkowshik.github.io/fifth-elephant-2017/

