The Fifth Elephant 2013

An Event on Big Data and Cloud Computing

Similar entity detection in large data

Submitted by Arthi Venkataraman (@arthi) on Tuesday, 26 March 2013


Technical level

Intermediate

Section

Analytics and Visualization

Status

Confirmed

Total votes:  +44

Objective

  1. Understand similar entity recognition and its industrial applicability

  2. Techniques that can be used: supervised and unsupervised

  3. Algorithms: clustering (Mini Batch k-means and BIRCH), classification using logistic regression, continuous learning, and boosting techniques to combine multiple learners

  4. Implementation challenges and possible approaches to overcome them
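
As a rough illustration of the clustering objective, here is a minimal pure-Python sketch of mini-batch k-means. It is not the implementation used in the talk. The function names, parameters, and toy feature vectors are all assumptions made for illustration, and it presumes each record has already been turned into a numeric feature vector.

```python
import random


def squared_distance(p, q):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((a - b) ** 2 for a, b in zip(p, q))


def nearest(point, centers):
    """Index of the center closest to the point."""
    return min(range(len(centers)), key=lambda j: squared_distance(point, centers[j]))


def mini_batch_kmeans(points, k, batch_size=10, iterations=200, seed=0):
    """Cluster points by updating centers from small random batches."""
    rng = random.Random(seed)
    # Start from k distinct data points chosen at random.
    centers = [list(p) for p in rng.sample(points, k)]
    counts = [0] * k  # number of points each center has absorbed so far
    for _ in range(iterations):
        batch = rng.sample(points, min(batch_size, len(points)))
        # Assign the whole batch against the current centers, then update.
        assignments = [nearest(p, centers) for p in batch]
        for p, j in zip(batch, assignments):
            counts[j] += 1
            eta = 1.0 / counts[j]  # per-center learning rate shrinks over time
            centers[j] = [(1 - eta) * c + eta * x for c, x in zip(centers[j], p)]
    return centers


# Toy feature vectors (hypothetical): two loose groups of records.
toy_points = [(0.1 * i, 0.0) for i in range(10)] + [(20.0 + 0.1 * i, 5.0) for i in range(10)]
toy_centers = mini_batch_kmeans(toy_points, k=2, batch_size=5, iterations=100)
```

Each iteration touches only `batch_size` points, so the full data set never has to be scanned in a single pass. That is the property that makes this family of algorithms attractive for large data.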

Description

One of the fundamental issues across industries is that many similar entities are registered under different names. For example, different groups within an insurance company may offer different policies to the same customers, and in the systems these policies are registered under different customer IDs. This leads to multiple issues, including the inability to cross-sell or up-sell and the inability to identify fraudulent claim patterns. The same is true of banks, where the same customer could be making different loan requests under different names. This presentation is based on our experiences with similar entity detection in Big Data. It will cover:
1. What is similar entity detection
2. Where is the need for this
3. Techniques for similar entity detection and their applicability
4. Supervised, unsupervised and continuous learning modes
5. Use of semantic techniques
6. Implementation challenges: handling large data, handling a large number of comparisons, how to relate similar entities
7. Sample results of our experiments
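
To make the "large number of comparisons" and "how to relate similar entities" challenges concrete, here is a minimal sketch of one possible approach. It uses blocking to limit which pairs are compared, and union-find to merge matched pairs into groups. It is an illustration only, not the method from the talk. The record fields, the blocking key, and the 0.85 threshold are hypothetical, and string similarity comes from Python's standard `difflib`.

```python
from collections import defaultdict
from difflib import SequenceMatcher
from itertools import combinations


def normalize(name):
    """Lowercase and drop punctuation and extra spaces so trivial variants match."""
    cleaned = "".join(ch if ch.isalnum() else " " for ch in name.lower())
    return " ".join(cleaned.split())


def blocking_key(record):
    """Hypothetical blocking key: first letter of the last name token plus birth year."""
    last_token = normalize(record["name"]).split()[-1]
    return last_token[0] + str(record["birth_year"])


def find(parent, i):
    """Union-find lookup with path halving."""
    while parent[i] != i:
        parent[i] = parent[parent[i]]
        i = parent[i]
    return i


def group_similar(records, threshold=0.85):
    """Return sorted lists of record ids judged to refer to the same entity."""
    blocks = defaultdict(list)
    for idx, rec in enumerate(records):
        blocks[blocking_key(rec)].append(idx)

    parent = list(range(len(records)))
    for members in blocks.values():
        # Only pairs inside a block are compared, not every pair in the data set.
        for a, b in combinations(members, 2):
            score = SequenceMatcher(
                None, normalize(records[a]["name"]), normalize(records[b]["name"])
            ).ratio()
            if score >= threshold:
                parent[find(parent, a)] = find(parent, b)  # merge the two groups

    groups = defaultdict(list)
    for idx, rec in enumerate(records):
        groups[find(parent, idx)].append(rec["id"])
    return sorted(sorted(g) for g in groups.values())


# Made-up customer records for illustration.
records = [
    {"id": "P-001", "name": "Ramesh Kumar", "birth_year": 1975},
    {"id": "P-002", "name": "Ramesh  Kumaar", "birth_year": 1975},
    {"id": "P-003", "name": "Anita Desai", "birth_year": 1982},
    {"id": "P-004", "name": "Rakesh Kapoor", "birth_year": 1975},
]
similar_groups = group_similar(records)
```

The trade-off is that a pair split across two blocks is never compared. The blocking key therefore has to be coarse enough to keep true matches together, yet fine enough to keep blocks small.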

The above is the outline of what I intend to cover. There will be enough time for questions and answers. However, if you would like something more to be covered, do post a comment and I will see how it can be incorporated.

Requirements

It would be useful to have a basic idea of machine learning techniques, but it is not compulsory, as the talk will be in simple language.

Speaker bio

• Arthi Venkataraman has more than 16.5 years of experience in the design, development and testing of projects in different domains.
• She is currently a Senior Architect in the Chief Technology Office of Wipro Technologies.
• Her current role involves solution development for different business problems spanning the areas of Big Data, Machine Learning and Semantic Technologies.
• She has a B.E. degree in Computer Science from University Visvesvaraya College of Engineering, Bangalore, and an MBA (PGDSM) from IIM, Bangalore. She is also a PMP.
• She has previously presented papers and spoken at other international conferences.

This presentation is based on Arthi's experience in the area of similar entity identification.

Links

Slides

http://www.slideshare.net/arthiv1/building-similarentityrecognizerv1?utm_source=ss&utm_medium=upload&utm_campaign=quick-view

Comments

  • 0
    Srinivasan Venkataraman (@venkat23451) 5 years ago

    In automation, this article will be of much use.
