Anthill Inside 2019

A conference on AI and Deep Learning



Keshav Joshi


Open Source Tools and Archive for Tackling Misinformation on ChatApps in India

Submitted Nov 7, 2019

Tattle is a civic tech project in India that is creating an archive of content circulated on WhatsApp and other chat apps, and building open source tools to navigate this archive. Such an archive is useful for research on information networks as well as for increasing the efficiency and reach of fact checking efforts. One of Tattle’s goals is opening the archive, even if in a limited scope, to the general public.
We will describe some of the challenges of collecting data on encrypted platforms, and our approach to different kinds of search operations (duplicate, approximate, semantic) on multilingual and multimedia content. We will conclude with some of the ethical considerations involved in this work.
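To make the simplest of these operations concrete, here is a minimal sketch of exact duplicate detection. It is not Tattle's implementation: the function names and the toy items are hypothetical, and it assumes each archived item is available as raw bytes. Items whose bytes hash to the same SHA-256 digest are grouped together.

```python
import hashlib
from collections import defaultdict


def content_digest(data: bytes) -> str:
    """Return a hex SHA-256 digest of the raw bytes of one archived item."""
    return hashlib.sha256(data).hexdigest()


def group_exact_duplicates(items: dict[str, bytes]) -> list[list[str]]:
    """Group item ids whose content is byte-for-byte identical.

    `items` maps a (hypothetical) item id to the item's raw bytes.
    Only groups with more than one member are returned.
    """
    groups: dict[str, list[str]] = defaultdict(list)
    for item_id, data in items.items():
        groups[content_digest(data)].append(item_id)
    return [sorted(ids) for ids in groups.values() if len(ids) > 1]


if __name__ == "__main__":
    # Toy data standing in for media files; real items would be image or video bytes.
    archive = {
        "msg_1": b"forwarded image bytes",
        "msg_2": b"forwarded image bytes",
        "msg_3": b"a different image",
    }
    print(group_exact_duplicates(archive))
```

A cryptographic hash only matches identical bytes, so a re-compressed or cropped copy of the same image would not be grouped here. That limitation is what motivates the approximate and semantic search operations.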


  • Motivation and Goals of the Project
    • How it aims to affect the misinformation challenge in India
  • Data Collection
    • Ways of collecting media from Chat Apps
    • Collecting media from allied sources (fact checking websites)
  • Data Processing (Tools to navigate the archive)
    • Duplicate Detection
    • Approximate Search
    • Semantic Search
    • Use of embeddings over hashing
  • Ethical Considerations in this work
    • Consent frameworks for data collection
    • Managing access and use
    • Managing violent and pornographic content
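The "Use of embeddings over hashing" item in the outline above can be illustrated with a small sketch. This is an assumption about the general technique, not a description of Tattle's stack: the item ids and three-dimensional vectors are made up, and in practice the vectors would come from a trained image or text model. The idea is that near-identical or related items get nearby vectors, so a similarity ranking can find them even when their bytes differ.

```python
import math


def cosine_similarity(a: list[float], b: list[float]) -> float:
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def nearest(query: list[float], index: dict[str, list[float]], k: int = 3):
    """Return up to k (item_id, score) pairs, most similar first.

    `index` maps a (hypothetical) item id to its embedding vector.
    Ties are broken by item id so the ordering is deterministic.
    """
    scored = [(cosine_similarity(query, vec), item_id) for item_id, vec in index.items()]
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    return [(item_id, score) for score, item_id in scored[:k]]


if __name__ == "__main__":
    # Hand-written toy embeddings; a real system would compute these with a model.
    index = {
        "img_original": [1.0, 0.0, 0.2],
        "img_cropped": [0.9, 0.1, 0.2],
        "img_unrelated": [0.0, 1.0, 0.0],
    }
    for item_id, score in nearest([1.0, 0.0, 0.2], index, k=2):
        print(item_id, round(score, 3))
```

A linear scan like this is only practical for small collections. A larger archive would typically use an approximate nearest-neighbour index, which is outside the scope of this sketch.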


Requirements for attendees: an interest in misinformation!

Speaker bio

Keshav Joshi is a data scientist at Tattle, working to build an archive of misinformation and to develop the project's data science stack. Keshav has several years of experience as a data scientist, researcher, and lecturer, and holds two master's degrees, in Physics and CS, from Georgia Tech.



