Jul 2013
Thu 11, 09:30 AM – 04:30 PM IST
Fri 12, 10:15 AM – 05:30 PM IST
Sat 13, 10:15 AM – 05:30 PM IST
Pramod N Haritsa
Stack Overflow: imagine a world without that kind of user collaboration and moderation for maintaining and retrieving information from discussion forums. In this talk we’ll see how one can make sense of content in Q&A and discussion forums using a palette of text processing techniques.
The talk aims to showcase how one can use text processing (NLP or otherwise) and statistics to arrive at insights.
To judge how credible a user’s answer is, how difficult a thread is, or what a discussion thread is about, we currently rely on a lot of collaboration from the users and moderators who maintain such forums. At times, the information exchange happens through more unstructured and informal media such as mailing lists and groups.
The talk aims to answer parts of the following questions.
How can we associate a discussion thread with the key themes?
How can we associate a person with the themes he or she has engaged with?
How can we measure the domain difficulty level of a discussion thread?
How can we assign Stack Overflow-style user ratings just by looking at the content of a user’s responses?
How can we collate this information for future queries, recommendations, etc.?
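As an illustration of the first question, one common way to attach key themes to a thread is TF-IDF term weighting. The sketch below is only an assumption about how such a step might look, not the method used in the talk or in iAdler; the sample threads, the stopword list, and the names `tokenize` and `key_themes` are all hypothetical.

```python
import math
import re
from collections import Counter

# Hypothetical toy threads; real input would be forum or mailing-list text.
THREADS = {
    "thread-1": "How do I tune garbage collection in the JVM? The JVM heap keeps growing.",
    "thread-2": "Which index speeds up this SQL query? The query scans the whole table.",
    "thread-3": "How do I profile heap usage in the JVM when a SQL driver leaks memory?",
}

# Tiny illustrative stopword list (an assumption, not a standard one).
STOPWORDS = {"a", "do", "how", "i", "in", "the", "this", "up", "when", "which"}


def tokenize(text):
    """Lowercase the text, keep alphabetic words, and drop stopwords."""
    return [w for w in re.findall(r"[a-z]+", text.lower()) if w not in STOPWORDS]


def key_themes(threads, top_n=3):
    """Return the top_n TF-IDF-weighted terms for each thread."""
    tokens = {tid: tokenize(text) for tid, text in threads.items()}

    # Document frequency: in how many threads does each term occur?
    doc_freq = Counter()
    for words in tokens.values():
        doc_freq.update(set(words))

    n_docs = len(threads)
    themes = {}
    for tid, words in tokens.items():
        counts = Counter(words)
        # Term frequency times inverse document frequency.
        scores = {
            w: (c / len(words)) * math.log(n_docs / doc_freq[w])
            for w, c in counts.items()
        }
        # Sort by descending score, breaking ties alphabetically.
        ranked = sorted(scores, key=lambda w: (-scores[w], w))
        themes[tid] = ranked[:top_n]
    return themes


if __name__ == "__main__":
    for tid, terms in key_themes(THREADS).items():
        print(tid, terms)
```

Terms that occur in every thread get a weight of zero, so the ranking favours words that distinguish one thread from the others. The same per-thread weights could be aggregated per author as a rough starting point for the second question.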
Challenges in analysing unstructured (speech-form) text.
Simple, yet effective statistical techniques to derive insights.
Simple visualisation.
iAdler - a prototype application for extracting insights from mailing lists.
The talk aims to highlight the case for including text in any insight derivation, rather than relying only on collaborative/user information (with a section on how numbers can be misleading at times).
Pramod is currently working as an Application Developer in the Analytics Initiative at Thoughtworks Inc. He spent a major part of his academic career working as a research assistant under Dr Srinivasa K G and has experience applying machine learning techniques to various computer science problems, including network data, speech processing, text processing and NLP. His research interests include text and data mining, machine learning, distributed systems and game theory.