Mining information from social media for fake news detection

Tools, techniques and limitations

This event was the outcome of two previous discussions:

  1. Pratik Sinha’s presentation at The Fifth Elephant 2019 on how fact-checking tasks can be automated, since the same fake videos and images are repeatedly distributed with different narratives. Automation involves creating software stacks that can identify old content recirculated on social media, in the form of text, images and videos, by matching it against a known database of such content. Once such a stack is available, mobile apps can be built to help people fact-check the content they receive on social media and chat apps.
  2. Sandeep Khurana’s presentation on Thus Critique’s forum on the role of social media networks in the production of discourses.
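
As a rough illustration of the matching step described in the first presentation, here is a minimal sketch that checks an incoming text message against a small list of previously fact-checked claims. It is not the stack Alt News built: the claim list, the function names and the 0.8 similarity threshold are all assumptions made for this example, and it handles only English text. Images and videos would need a different kind of fingerprinting.

```python
import re
from difflib import SequenceMatcher

# Hypothetical in-memory "database" of previously fact-checked claims.
KNOWN_CLAIMS = [
    "Video shows flooding in the city after yesterday's storm",
    "Photo shows a politician accepting cash at a rally",
]


def normalize(text):
    """Lowercase, replace punctuation with spaces and collapse whitespace."""
    text = re.sub(r"[^\w\s]", " ", text.lower())
    return " ".join(text.split())


def find_recirculated(message, known_claims=KNOWN_CLAIMS, threshold=0.8):
    """Return (claim, score) for the closest known claim, or None.

    The threshold is an assumed value, not a tuned one.
    """
    norm = normalize(message)
    best_claim, best_score = None, 0.0
    for claim in known_claims:
        # ratio() is a similarity score between 0.0 and 1.0.
        score = SequenceMatcher(None, norm, normalize(claim)).ratio()
        if score > best_score:
            best_claim, best_score = claim, score
    if best_score >= threshold:
        return best_claim, best_score
    return None
```

Calling find_recirculated on a forwarded message returns the closest known claim and its similarity score when the message looks like a re-share, and None otherwise. A linear scan like this only suits a small list; a larger database would need an index, and a human fact checker would still review each match.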

The key points debated during this session on 18 May were:

  1. Using high-power cloud-based computing solutions (as part of AI and ML) drives up infrastructure costs, whereas fact-checking organisations operate on small budgets and need simple tech solutions. Simple solutions would make fact-checking faster, more efficient and more cost-effective for these organisations.
  2. Old videos are repurposed as new ones and then used for fake news. Tagging old videos on platforms, so that fake news of this kind can be stopped, remains an unsolved problem.
  3. AI is useful in detecting aggregate patterns, which give clues to what is hate speech but not necessarily to what is fake news. Fake news is trickier in nature. Given this, are AI/ML techniques reliable for detecting fake news?
  4. At the same time, algorithms used by large tech companies are opaque. This leads to a lack of transparency and collaboration between fact checkers on the one hand, and social media platforms and widely used chat applications on the other.
  5. Fake news has ever-evolving jargon, vocabulary and language, along with repurposed videos and images. What worked yesterday in detecting fake news will not work today. The training datasets themselves keep evolving with these complexities. Human intervention is reliable in detecting fake news. Solutions have to account for the involvement of humans and simple technologies.
  6. Scalable solutions for tackling repetitive fake news in India have to account for:
    • Multiple languages in India
    • Covering different platforms
    • Uneven literacy indexes
    • Uneven internet access

In summary, this event demonstrated that the needle for developing simple tech solutions hasn’t moved much. Alt News continues to build on the tech solution that it developed in 2019 - based on the three phases of fact-checking (collating misinformation, prioritising what to fact check and report, and distribution of fact-checked stories). Developers and technologists can contribute to this open source solution.

Participants’ questions from this session and the ongoing discussion continue at https://hasgeek.com/fifthelephant/mining-information-from-social-media-for-fake-news-detection/comments

Speakers in this session were Anna Isaac of The News Minute, Pratik Sinha of Alt News, Denny George of Tattle Civic Technologies, Venkatesh H R of BOOM FactCheck and Sandeep Khurana, a data scientist and research scholar. Prithwiraj Mukherjee, Assistant Professor of Marketing at IIM-Bangalore, moderated the session.

This summary was compiled by participant Manu Raveendran and The Fifth Elephant community organizing team.

The Fifth Elephant is a platform for practitioners working with data (from engineering to applications of data science for different use cases) to showcase their work and to collaborate.

For further inquiries, contact 7676332020 or write to fifthelephant.editorial@hasgeek.com

Venue

Online