Gabbar: Machine learning to guard OpenStreetMap
OpenStreetMap is the largest free and open map of the world! An average of 2 million features are touched by volunteers around the world every single day. Amazing, isn’t it? The global scale and the local diversity bring a host of challenges for maintaining high-quality data on OpenStreetMap.
The Mapbox data team works closely with communities of mappers to validate and protect OpenStreetMap data. In this talk, I will do a deep dive into the diversity of mapping on OpenStreetMap, the intricate and challenging data quality problems, lessons learned from building open tools to aid mappers, and how Gabbar, a machine-learning-based infrastructure, can guard OpenStreetMap from invalid and suspicious edits.
In this talk, I hope to share how open and collaborative projects like OpenStreetMap and Wikipedia are benefitting from open and collaborative machine learning, and the opportunities for us as volunteers to build cool and important technology in the open and use the power of AI for a better world for all of us. The intended audience is people interested in or practicing machine learning to solve data problems, as well as people interested in or contributing to the tech behind open projects like OpenStreetMap and Wikipedia.
Edits in a few minutes on OpenStreetMap
1. OpenStreetMap (OSM)
- OSM is the largest free and open map of the world, the Wikipedia of maps.
- On a typical day, 2 million features are created, half a million modified and a quarter of a million deleted.
- The OSM community, the heartbeat of OpenStreetMap.
- Interesting problems and inherent challenges.
- OpenStreetMap changeset analyzer: https://osmcha.mapbox.com/
- Rule-based validation with osm-compare: https://github.com/mapbox/osm-compare
- Guarding OSM from invalid or suspicious edits.
- Machine-learning-based infrastructure, collaboratively built in the open.
- Development workflow with Python data science tools.
- Lessons learned, current model performance and impact.
- Using AI to help make OSM the best map of the world!
- Using open collaborative machine learning for open collaborative projects.
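To make the idea of rule-based validation from the outline concrete, here is a minimal Python sketch. Everything in it is an assumption made for illustration: the function names, the dictionary layout for feature versions, and the rule itself are hypothetical. It is not the osm-compare API and not how Gabbar works.

```python
def compare_name_removed(old_feature, new_feature):
    """Hypothetical rule: flag an edit that drops the `name` tag."""
    # Creations (no old version) and deletions (no new version)
    # are out of scope for this particular rule.
    if old_feature is None or new_feature is None:
        return None
    old_tags = old_feature.get("tags", {})
    new_tags = new_feature.get("tags", {})
    if "name" in old_tags and "name" not in new_tags:
        return {"rule": "name_removed", "old_name": old_tags["name"]}
    return None


def flag_edits(edits, rules):
    """Run every rule over (old, new) feature pairs and collect the hits."""
    flagged = []
    for old_feature, new_feature in edits:
        for rule in rules:
            result = rule(old_feature, new_feature)
            if result is not None:
                flagged.append(result)
    return flagged


# Illustrative data only; the ids and tags are made up.
edits = [
    # The name tag is removed in the new version: flagged.
    ({"id": 1, "tags": {"name": "MG Road", "highway": "primary"}},
     {"id": 1, "tags": {"highway": "primary"}}),
    # Only the highway class changes: not flagged by this rule.
    ({"id": 2, "tags": {"name": "Cubbon Park", "highway": "residential"}},
     {"id": 2, "tags": {"name": "Cubbon Park", "highway": "tertiary"}}),
]
flagged = flag_edits(edits, [compare_name_removed])
```

Each rule looks only at the old and new version of a single feature, so rules stay small and can be added independently. A machine learning model could take the same pair of versions as input and learn which differences matter instead of having them hand-written.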
Hey, I am Bhargav Kowshik, a Software Engineer at Mapbox, Bengaluru. I build tools to scale data operations at Mapbox. I am passionate about people and communities, open data and technology, creativity and side projects. Previously, as the first engineer at Nextdrop, I helped build a platform to track water availability and consumption.
- An open database of inconsistent edits observed on OSM: http://www.openstreetmap.org/user/manoharuss/diary/40118
- Preparing accurate history and caching changesets: https://www.openstreetmap.org/user/geohacker/diary/40846
- Common errors and unexplained edits observed: https://www.openstreetmap.org/user/nammala/diary/40338
- Gabbar development and workflow: https://github.com/mapbox/gabbar/