Leveraging Machine Learning to Reduce Mean Time to Identify and Resolve Issues when handling Systems/Applications at Scale
Section: Full talk Technical level: Intermediate
“Our fascination with the use of computing power to augment human decision-making has likely outgrown even the tremendous advances made in algorithmic approaches,” said Christian Beedgen, Co-founder and CTO, Sumo Logic. “In reality, the successful use of AI and related techniques is still limited to areas around image recognition and natural language understanding, where input/output scenarios can be reasonably constructed, and that will not change drastically in 2019. The idea that any business can ‘turn on AI’ to become successful or more successful is preposterous, no matter how much data is being collected. But the collection of data to support humans and algorithms continues and raises important ethical questions and is something we need to pay close attention to over the next few years. Data is human and therefore is just as messy as humans. Data does not create objectivity. It is well established that data and algorithms perpetuate existing biases and automated decisions are, at best, difficult to explain and justify. Appealing such decisions is even harder when we fall into the trap of thinking data and algorithms combine to create objective truth. With greater decision-making power comes much greater responsibility, and humans will increasingly be held accountable for the impact of decisions their business makes.”
The ability to scale to thousands of servers and services at the click of a button has resulted in an explosion of machine-generated data, now measured in terabytes and petabytes. More importantly, business models have shifted toward running digital platforms, in contrast to old-economy businesses. This has caused a paradigm shift in the way IT and software systems are managed and maintained, because the most critical part of this business model is delivering a world-class user experience on these digital platforms.
When managing and maintaining these systems with a sharp focus on customer experience, it becomes critical to identify and resolve issues in the shortest time possible, or better still, to spot possible issues proactively. This is where applying machine learning to machine-generated logs helps in running IT ecosystems at scale. It also involves adopting a Dev-Sec-Ops culture within the organization.
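As a rough illustration of what analyzing machine-generated logs can look like, the sketch below groups log lines into templates by masking their variable parts, then surfaces the templates that occur rarely. This is not the speaker's method: it is a simple frequency-based stand-in for the kind of pattern analysis described above, and the masking rules, the function names `signature` and `rare_signatures`, and the 5% threshold are all assumptions made for the example.

```python
import re
from collections import Counter

# Hypothetical masking rules: replace variable parts of a log line
# with placeholders so that similar lines collapse into one template.
_MASKS = [
    (re.compile(r"\b\d{1,3}(?:\.\d{1,3}){3}\b"), "<IP>"),
    (re.compile(r"\b0x[0-9a-fA-F]+\b"), "<HEX>"),
    (re.compile(r"\d+"), "<NUM>"),
]


def signature(line: str) -> str:
    """Reduce a log line to a template by masking variable tokens."""
    for pattern, placeholder in _MASKS:
        line = pattern.sub(placeholder, line)
    return line.strip()


def rare_signatures(lines, max_share=0.05):
    """Return templates whose share of all lines is at most max_share."""
    counts = Counter(signature(line) for line in lines)
    total = sum(counts.values())
    return {sig: n for sig, n in counts.items() if n / total <= max_share}
```

With thousands of routine request lines and a handful of unusual errors, `rare_signatures` would return only the unusual templates, which gives an engineer a much shorter list to read first when trying to identify an issue.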
Here is the outline:
- Evolution of business models and how digital platforms are most valued, in comparison to traditional business models
- How scale has been made possible in a Dev-Sec-Ops world, running a digital platform delivering amazing customer experience
- What it takes to be successful: (1) amazing customer experience, (2) security, and (3) innovation
- Running digital platforms at scale
- How to leverage Machine Learning to reduce mean time to identify and resolve issues
- Adopting a Dev-Sec-Ops culture within the organization when you want to run digital ecosystems at scale
Paul is a highly experienced management and technical professional with expertise in Operations, Enterprise Products, and Product Management. With around 17 years of cumulative experience, Paul specializes in Management & Strategy, Market/Channel Development, Analytics & Business Intelligence, and Entrepreneurship.
Prometheus: Solving monitoring for EVERYONE
Prometheus, now a graduated CNCF project, is the de facto leader in the monitoring and metrics space, used by thousands of companies around the world. With the 2.0 launch nearly a year behind us, we are now focused on making Prometheus boring, i.e., more stable, more usable and even MOAR user friendly!