The Fifth Elephant 2015

A conference on data, machine learning, and distributed and parallel computing

Anomaly Detection Using Apache Spark

Submitted by Kiran Veigas (@kiranveigas) (proposing) on Monday, 1 June 2015

This is a proposal requesting someone to speak on this topic. If you’d like to speak, leave a comment.

Technical level

Advanced

Section

Crisp Talk

Status

Submitted

Total votes:  +6

Objective

Walk through how we used Spark's scalable KMeans algorithm to detect anomalies for our Cyber Analytics platform.

Description

Apache Spark has established itself as a next-generation big data processing tool and has become a favourite of data scientists and data engineers. Its machine learning component provides well-tested, scalable algorithms.

It can run 10-100x faster than traditional MapReduce, and it provides high-level APIs that make development easier. Since Spark exposes APIs in Java, Scala, Python and R (coming soon), data scientists can use their favourite language to build data products.

In this session we will walk through how we used Spark's scalable KMeans algorithm to detect anomalies for our Cyber Analytics platform. The session will give a taste of Scala (Spark's native language), RDDs, and the use of K-means clustering. It will also show how to improve the clustering in a session with Spark. Finally, we demonstrate how to use the K-means model in real time to detect anomalies.
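
The proposal does not say how the K-means model is turned into an anomaly detector. One common approach, assumed here and not confirmed by the proposal, is to flag any point whose distance to its nearest cluster centre exceeds a threshold learned from normal data. The sketch below illustrates that idea in plain Python (3.8+) and deliberately does not use Spark or its KMeans API. The function names, the toy data and the slack factor of 2.0 are all hypothetical.

```python
import math
import random


def nearest(point, centroids):
    """Return (index, distance) of the centroid closest to point."""
    distances = [math.dist(point, c) for c in centroids]
    idx = min(range(len(centroids)), key=distances.__getitem__)
    return idx, distances[idx]


def kmeans(points, k, iterations=20, seed=0):
    """Plain Lloyd's algorithm; returns the k centroids as tuples."""
    rng = random.Random(seed)
    centroids = rng.sample(points, k)  # pick k distinct input points to start
    for _ in range(iterations):
        clusters = [[] for _ in range(k)]
        for p in points:
            idx, _ = nearest(p, centroids)
            clusters[idx].append(p)
        # New centroid = mean of the cluster's members; an empty cluster
        # keeps its previous centroid.
        centroids = [
            tuple(sum(axis) / len(members) for axis in zip(*members))
            if members else old
            for members, old in zip(clusters, centroids)
        ]
    return centroids


def is_anomaly(point, centroids, threshold):
    """Flag a point that lies farther than threshold from every centroid."""
    _, distance = nearest(point, centroids)
    return distance > threshold


# Toy "normal" data: two tight groups (hypothetical values).
points = [(0, 0), (0, 1), (1, 0), (1, 1),
          (10, 10), (10, 11), (11, 10), (11, 11)]

centroids = kmeans(points, k=2)

# Assumed thresholding rule: twice the largest distance seen in training.
threshold = 2.0 * max(nearest(p, centroids)[1] for p in points)

print(is_anomaly((5.0, 5.0), centroids, threshold))    # far from both groups
print(is_anomaly((10.2, 10.4), centroids, threshold))  # inside the second group
```

In a streaming setting the same check would run per incoming record against centroids trained offline. How the threshold is chosen (a fixed multiple, a percentile of training distances, or something per cluster) is a design decision the proposal leaves open.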

Speaker bio

Vishnu Subramanian works as a solution architect at Happiest Minds, with years of experience building distributed systems using Hadoop, Spark, ElasticSearch, Cassandra and machine learning. He is a Databricks-certified Spark developer with experience building data products. His interests are IoT, data science and big data security.

Comments

  • 1
    Kiran Veigas (@kiranveigas) Proposer 3 years ago

    Vishnu Subramanian from Happiest Minds will speak on this topic