Topological Data Analysis Theory and Practice
Submitted by Milan Joshi (@mlnjsh) on Thursday, 17 August 2017
Section: Full talk
Technical level: Intermediate
Abstract
We are living in the age of big data, and it is too big to ignore. It is therefore important that we find ways to explore, summarize, and answer questions with this data. The problem, however, is not just that the data is big but that it is complicated: it is loaded with surprising patterns and unusual structures, and is often too complicated for standard methods to be useful. In this talk I will discuss a collection of tools known collectively as topological data analysis (TDA). TDA is a relatively new branch of mathematics. It is an approach that extracts shapes (patterns) in data and obtains insights from datasets using techniques from topology, itself a very old branch of pure mathematics. I will discuss persistent homology, barcodes, and persistence landscapes. Finally, I will discuss the scope and future of the field, along with a few applications and some of the tools and software in which TDA can be done.
Outline
What is topology?
What is topological data analysis?
What is persistent homology?
What is the Mapper algorithm?
How it helps solve complex problems
How it differs from traditional machine learning
How to do TDA in R
Some applications
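As a rough illustration of the persistent homology and barcode items above, here is a minimal sketch that computes only the 0-dimensional barcode (connected components) of a small point cloud. It is written in plain Python for illustration rather than in R, and it is not taken from the talk: the function name `h0_barcode` and the sample `cloud` are assumptions made up for this example. Every point starts as its own component at scale 0, and each time growing the distance threshold merges two components, one bar ends.

```python
import math
from itertools import combinations


def h0_barcode(points):
    """Return the 0-dimensional persistence barcode of a point cloud.

    Each bar is a (birth, death) pair. All components are born at scale 0;
    a bar dies at the edge length that merges its component into another.
    One component never dies and gets death = math.inf.
    """
    n = len(points)
    if n == 0:
        return []

    parent = list(range(n))

    def find(i):
        # Representative of i's component, with path compression.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    # All pairwise edges sorted by length: the order in which a
    # Vietoris-Rips filtration would add them.
    edges = sorted(
        (math.dist(points[i], points[j]), i, j)
        for i, j in combinations(range(n), 2)
    )

    bars = []
    for length, i, j in edges:
        ri, rj = find(i), find(j)
        if ri != rj:
            # Two components merge at this scale, so one bar ends here.
            parent[ri] = rj
            bars.append((0.0, length))

    # The single surviving component persists forever.
    bars.append((0.0, math.inf))
    return bars


# Hypothetical sample: two tight groups of points far apart from each other.
cloud = [(0.0, 0.0), (0.0, 1.0), (1.0, 0.0), (10.0, 0.0), (10.0, 1.0)]
bars = h0_barcode(cloud)
for birth, death in bars:
    print(f"[{birth}, {death})")
```

In this sample the short bars correspond to points merging within each group, and the one long finite bar reflects the gap between the two groups. Reading "long bars as structure, short bars as noise" is the usual informal interpretation of a barcode. Higher-dimensional features such as loops need a full persistent homology implementation, which this sketch does not attempt.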
Requirements
A laptop with R installed
Speaker bio
Comments


Sumod Mohan (@sumod)
Milan, it would be really great if you could connect it with real-world applications and talk about them. I have heard a few examples from Ayasdi that seemed interesting (give examples of where geometry might fail you and topological invariance might be the available or better-suited property). The more examples you can provide, the merrier. From my experience as a speaker, the audience in general has a rusty math background (fewer than 5% might have sound linear algebra, though 90% might have taken a course at some point in their career), so you might need to build up a little of that as well, or you might lose the audience even before you reach persistence.

Sandhya Ramesh (@sandhyaramesh) Reviewer
Hi Milan, I’ve dropped you an email about the proposal. Could you check and respond?

Milan Joshi (@mlnjsh) Proposer
Could you please send me an email? I did not find any.

Sandhya Ramesh (@sandhyaramesh) Reviewer
Hi Milan, I’ve emailed you on the Gmail address you’ve mentioned on this proposal.


Milan Joshi (@mlnjsh) Proposer
I searched but could not find it. Will you please resend it?

Sandhya Ramesh (@sandhyaramesh) Reviewer
Sent it again. Do send me an alternate email if you still haven’t received it.


Milan Joshi (@mlnjsh) Proposer
I am so happy to talk at Pune; unfortunately, I will be at IIT Kharagpur during 2026 Nov.
Kindly let me know if there is any such event in December or thereafter.
I would love to speak …

Milan Joshi (@mlnjsh) Proposer
yes …

Hari C M (@haricm) Reviewer
Milan, you have to share slides and a preview video for us to evaluate this talk.

Milan Joshi (@mlnjsh) Proposer
Yes, sure. Rather, it would be great to start with an example from the real world and then start the analysis. Doing by example is what I believe in.

MUSKAN (@muskanphd)
Hi Milan, TDA is used in one of the research papers related to my field (event detection from Twitter). I hope it will be helpful. Is it related to the mapping concept in big data?

Milan Joshi (@mlnjsh) Proposer
That's great that you use TDA. It will be really helpful for your topic: TDA generally deals with finding complex patterns in data that cannot be found using traditional state-of-the-art machine learning algorithms. This paper will give you a good idea of what TDA can do:
https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3084136/

chiderasa swaroopini (@chidekrasa)
Would like to attend this event. Do you do it in Python as well?

Milan Joshi (@mlnjsh) Proposer
yes

Milan, thanks for submitting this proposal. It looks interesting. I have a couple of questions:
1. Who is the target audience for this talk?
2. What kind of background knowledge should the audience already have before they attend your talk?
3. Is the talk primarily a description of topological data analysis methods? If yes, what is the takeaway for the audience? Do you expect the audience to walk away with enough understanding to explore the methods in greater detail? If so, please make this explicit. If not, help us understand the takeaway.
4. Finally, have you presented this talk earlier? If yes, do you have slides you can share? Also, please share links to videos of talks you have delivered in the past for the editorial team to get a fair sense of your speaking skills.
4. Takeaway: After an introduction to topological data analysis, we turn to clustering. As we know, clustering is an ill-defined problem. Most of the time, people using clustering algorithms find the results difficult to interpret and are underwhelmed by how much they can contribute to the grouping process. It may be unsatisfying if interpreting the clusters feels like a guessing game, if there are seemingly duplicate groups, or even if the groups are really obvious. Similarly, it’s frustrating when people want to contribute their expertise but can’t. They may also want to reinforce the model’s results when it does something well, but it’s not necessarily easy to tell the system to do more of a particular thing.
How can we overcome the drawbacks that accompany unsupervised methods? Put a human in the loop! Make using the algorithm a positive and fruitful experience by leveraging what people can do confidently while avoiding things that are hard. For example, users can likely explain what features are relevant (this is what they know and care about), but they may have a difficult time describing how many groups should exist in the data. Let them influence the algorithm on these kinds of terms, perhaps by providing labels for the grouping process via exemplar selection as well as propagating labels through a question–answer feedback loop from machine to human and back. I’m sure every data scientist has imagined the day when they can more colloquially interact with an algorithm to get better results, even if the majority of today’s feedback only involves cursing that falls on deaf ears.
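One way to picture the exemplar and question–answer idea is the small sketch below. It is only an assumption about how such a loop could look, not the method used in the talk and not a TDA technique: points take the label of the nearest human-chosen exemplar, and the machine asks the human about the point whose label is least certain. The names `assign_to_exemplars` and `most_ambiguous`, the sample data, and the simulated human answer are all hypothetical.

```python
import math


def assign_to_exemplars(points, exemplars):
    """Label each point with the label of its nearest exemplar.

    `exemplars` is a list of (label, point) pairs chosen by a human.
    Returns one (label, margin) pair per point, where margin is the gap
    between the distance to the nearest exemplar and the distance to the
    nearest exemplar carrying a different label. A small margin marks a
    point the machine is unsure about.
    """
    results = []
    for p in points:
        ranked = sorted((math.dist(p, e), label) for label, e in exemplars)
        best_dist, best_label = ranked[0]
        rival = min(
            (d for d, label in ranked if label != best_label),
            default=math.inf,
        )
        results.append((best_label, rival - best_dist))
    return results


def most_ambiguous(results):
    """Index of the point the machine should ask the human about next."""
    return min(range(len(results)), key=lambda i: results[i][1])


# Hypothetical one-dimensional data, written as 2-D points on the x-axis.
points = [(0.0, 0.0), (1.0, 0.0), (4.0, 0.0), (9.0, 0.0), (10.0, 0.0)]
exemplars = [("low", (0.0, 0.0)), ("high", (10.0, 0.0))]

# Machine to human: label everything, then pick the least certain point.
first_pass = assign_to_exemplars(points, exemplars)
ask = most_ambiguous(first_pass)

# Human to machine: the answer (simulated here as "high") becomes a new
# exemplar, and the labels are recomputed.
exemplars.append(("high", points[ask]))
second_pass = assign_to_exemplars(points, exemplars)

print([label for label, _ in first_pass])
print([label for label, _ in second_pass])
```

The design choice is that the human never has to say how many groups exist. They only supply or confirm examples, which is the kind of input the paragraph above argues people can give confidently.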