The Fifth Elephant 2017

On data engineering and application of ML in diverse domains

Gabbar: Machine learning to guard OpenStreetMap

Submitted by Bhargav Kowshik (@bkowshik) on Sunday, 30 April 2017

Technical level

Intermediate

Section

Full talk for data engineering track

Status

Confirmed & Scheduled

Total votes:  +5

Abstract

OpenStreetMap is the largest free and open map of the world! On average, volunteers around the world touch 2 million features every single day. Amazing, isn’t it? The global scale and the local diversity bring a host of challenges for maintaining high-quality data on OpenStreetMap.

The Mapbox data team works closely with communities of mappers to validate and protect OpenStreetMap data. In this talk, I will do a deep dive into the diversity of mapping on OpenStreetMap, the intricate and challenging data quality problems, lessons learned from building open tools to aid mappers, and how Gabbar, a machine-learning-based infrastructure, can guard OpenStreetMap from invalid and suspicious edits.

In this talk, I hope to share how open and collaborative projects like OpenStreetMap and Wikipedia are benefitting from open and collaborative machine learning, and the opportunities for us as volunteers to build cool and important technology in the open and to use the power of AI for a better world for all of us. The intended audience is people interested in or practicing machine learning to solve data problems, as well as people interested in or contributing to the tech behind open projects like OpenStreetMap and Wikipedia.

Outline

Edits in a few minutes on OpenStreetMap

1. OpenStreetMap (OSM)
  • OSM is the largest free and open map of the world, the Wikipedia of maps.
  • On a typical day, 2 million features are created, half a million are modified and a quarter of a million are deleted.
2. Validation
  • The OSM community, the heartbeat of OpenStreetMap.
  • Interesting problems and inherent challenges.
3. Tools
4. Gabbar
  • Guarding OSM from invalid or suspicious edits.
  • Machine-learning-based infrastructure, collaboratively built in the open.
  • Development workflow with Python data science tools.
  • Learnings, current model performance and impact.
5. Future
  • Using AI to help make OSM the best map of the world!
  • Using open collaborative machine learning for open collaborative projects.

Speaker bio

Hey, I am Bhargav Kowshik, a Software Engineer at Mapbox, Bengaluru. I build tools to scale data operations at Mapbox. I am passionate about people and communities, open data and technology, creativity and side projects. Previously, as the first engineer at Nextdrop, I helped build a platform to track water availability and consumption. You can contact me at:

Links

Slides

https://bkowshik.github.io/fifth-elephant-2017/

Preview video

https://youtu.be/KtvyFp4hSWo

Comments

  • 1
    Zainab Bawa (@zainabbawa) Reviewer a year ago

    Is Gabbar open source?

    • 1
      Bhargav Kowshik (@bkowshik) Proposer a year ago

      Yes, Zainab. Gabbar is an open source project on GitHub at: https://github.com/mapbox/gabbar

  • 1
    Zainab Bawa (@zainabbawa) Reviewer a year ago

    Share a two-min preview video explaining what this talk is about and why participants should attend in order to complete the review.

    • 1
      Bhargav Kowshik (@bkowshik) Proposer a year ago

      Sure. I will shoot a two-minute preview video over the weekend and share it by 28th May, 2017 (Sun).

  • 1
    Bhargav Kowshik (@bkowshik) Proposer a year ago (edited a year ago)

    Just uploaded my preview video on YouTube at: https://youtu.be/KtvyFp4hSWo
