Tools for AI & Machine Learning at Scale
As applications become more business-critical, application teams receive monitoring data for them as a continuous stream. At that volume it becomes difficult to monitor the applications manually and to build dashboards and reports around them.
It is becoming increasingly clear that the only way to address this is to have the right set of tools in place, so that teams can monitor their applications automatically using machine learning techniques.
We at AppDynamics have built a product that addresses this requirement using AI and machine learning. Our algorithms continuously monitor business-critical applications, find anomalies and their root causes, and give users insights that would otherwise have taken days or weeks to find.
This talk revolves around the various open source tools used in implementing this solution.
Our ML/AI platform learns the normal behavior of an application’s data and finds anomalies instantly. We also leverage our understanding of the application’s architecture and the correlation between different metrics. Once anomalies are detected, we automatically correlate the anomalies and events for the fastest root cause analysis.
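To make the idea of "learning normal behavior" concrete, here is a minimal sketch of baseline-driven anomaly detection using a rolling mean and standard deviation. It is an illustration only and not the AppDynamics algorithm: the class name `BaselineAnomalyDetector`, the window size, the threshold, and the sample latency values are all assumptions made up for this example.

```python
from collections import deque
from statistics import mean, stdev


class BaselineAnomalyDetector:
    """Hypothetical detector: flags values far from a rolling baseline."""

    def __init__(self, window=30, threshold=3.0):
        self.window = deque(maxlen=window)  # recent observations treated as normal
        self.threshold = threshold          # allowed deviation, in standard deviations

    def observe(self, value):
        """Return True if `value` is anomalous relative to the learned baseline."""
        is_anomaly = False
        if len(self.window) >= 2:  # stdev needs at least two samples
            mu = mean(self.window)
            sigma = stdev(self.window)
            if sigma > 0 and abs(value - mu) > self.threshold * sigma:
                is_anomaly = True
        # Learn only from normal values so a spike does not shift the baseline.
        if not is_anomaly:
            self.window.append(value)
        return is_anomaly


# Made-up response times in milliseconds, with one obvious spike.
detector = BaselineAnomalyDetector(window=20, threshold=3.0)
latencies_ms = [100, 102, 98, 101, 99, 103, 97, 100, 102, 500, 101]
flags = [detector.observe(v) for v in latencies_ms]
```

In this sketch only the 500 ms sample is flagged. A production system would likely also need to handle seasonality, correlate metrics across the application topology, and run over a streaming pipeline rather than an in-memory list.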
Collecting, ingesting, storing, and processing billions of events per second in real time to monitor applications is not a simple process. Doing this seamlessly requires identifying the right set of tools as well as the infrastructure and cost requirements for running them.
Some of the open source tools we use to achieve this functionality are:
- Apache HBase
- Kafka / Kafka Streams
- Neo4j
- Redis
- Confluent Avro Schema Registry
- Kubernetes
In this session we will discuss why we chose this set of tools, given some of the challenges of machine learning at scale.
Saurabh is a Principal Software Engineer at AppDynamics, where he works on building solutions around real-time streaming data pipelines and various machine learning algorithms for automated anomaly detection and root cause analysis of problems.
His interests lie in combining data science with software engineering to solve real-time, business-critical problems.