Eye on Infrastructure

27 Mar 2020 (Fri), 08:55 AM – 09:00 PM IST


Accepting submissions till 27 Mar 2020, 09:00 AM

TERI auditorium, Bangalore

## This event is postponed. Watch this page for updates.

Netconf emerged from conversations that started at the Rootconf 2019 edition and continued through the subsequent Rootconf Pune, Hyderabad and Delhi editions.

Netconf is a platform for network engineers, ISPs, government representatives from telecom and IT departments, civil society groups, policy makers, providers of networking solutions, tech and law groups and activists, among others, to discuss the technical, economic and social aspects of running networks and infrastructure, and of access. See https://hasgeek.com/rootconf/netconf-2020/proposals#call-for-proposal for more details about topics.

The first edition of Netconf is an unconference, where participants can propose topics, suggest speakers and session formats. The event’s agenda will be set by participants. There will be room for open sessions for participants to propose topics/ideas/sessions on the morning of the event itself.

Event details:
Date: Friday, 27 March 2020
Venue: TERI auditorium, Domlur, Bangalore
Time: 9 AM to 6:30 PM, followed by the post-conference programme
Schedule: https://hasgeek.com/rootconf/netconf-2020/schedule
Post-conference programme: reading of Michael W. Lucas’s works on tech-fiction followed by dinner.

For inquiries about tickets and sponsorships, write to sales@hasgeek.com or call 7676332020.


To speak at Netconf, submit a talk here: https://hasgeek.com/rootconf/netconf-2020/proposals#call-for-proposal

Entry for children at Netconf: Children of all ages are welcome at Netconf. Entry for children is free. If you are bringing a child or children under age 1, mention it when filling in your ticket details. This will help us arrange care facilities.
Participants are welcome to propose sessions and activities for engaging children at Netconf.


Hosted by

Rootconf

Rootconf is a community-funded platform for activities and discussions on topics including Site Reliability Engineering (SRE); infrastructure costs, including cloud costs, and their optimization; and security, including cloud security.


Eye on Infrastructure

Submitted Mar 10, 2020

Format of the session: Full talk (40 mins)

Abstract:

In this talk, I will share my point of view on network monitoring.

As we all know, network monitoring is critically important in our estate and in the cloud.

I will explain in technical detail how we moved from standard monitoring to advanced automated monitoring. We have developed event correlation engines that can perform most of the troubleshooting and repetitive tasks that a network engineer used to perform. I will demonstrate how we can achieve a better Mean Time To Detect (MTTD) for failures.

Network monitoring is implemented with Telegraf using SNMP, Amazon Elastic Container Service, advanced Python scripts and AWS native services such as Lambda. Network elements are monitored using Wavefront, and alerts are sent to customized Slack channels along with runbooks.
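As a rough illustration of the alerting step, the sketch below formats an alert into a Slack incoming-webhook message that carries a runbook link. It is not the speaker's implementation: the alert fields (`name`, `device`, `severity`), the `RUNBOOKS` mapping, the example URLs and the `SLACK_WEBHOOK_URL` environment variable are all assumptions made for this example.

```python
import json
import os
import urllib.request

# Hypothetical mapping from alert name to runbook URL (invented for this sketch).
RUNBOOKS = {
    "interface_down": "https://wiki.example.com/runbooks/interface-down",
    "high_cpu": "https://wiki.example.com/runbooks/high-cpu",
}


def build_slack_payload(alert: dict) -> dict:
    """Turn an alert dict into a Slack incoming-webhook payload with a runbook link."""
    runbook = RUNBOOKS.get(alert["name"], "no runbook registered")
    text = (
        f"{alert['name']} on {alert['device']} "
        f"(severity: {alert['severity']})\n"
        f"Runbook: {runbook}"
    )
    return {"text": text}


def post_to_slack(webhook_url: str, payload: dict) -> int:
    """POST the payload as JSON to the webhook and return the HTTP status code."""
    request = urllib.request.Request(
        webhook_url,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(request, timeout=5) as response:
        return response.status


def lambda_handler(event, context):
    """Lambda-style entry point; assumes the alert arrives under event['alert']."""
    payload = build_slack_payload(event["alert"])
    status = post_to_slack(os.environ["SLACK_WEBHOOK_URL"], payload)
    return {"statusCode": status}
```

Keeping the message formatting separate from the HTTP call makes the formatting easy to check locally without a real Slack channel.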

Advanced event correlation engine for network troubleshooting:

The event correlation engine is an automated troubleshooting engine that improves MTTD and MTTR. When an alert triggers an event, a workflow logs in to the network environment, validates the failures and correlates the dependencies automatically. Network engineers now have all the analysis for an alert within seconds and do not necessarily have to log in to devices to understand the reason for a specific alert. The workflow also reports the potential impact and the action to take for a specific alert. It is built on AWS native services such as API Gateway, Step Functions, Lambda and databases.
The conclusions are sent to the network team for remediation.
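One simple way to picture dependency correlation is to group alerts by the topmost alerting device in a dependency chain. The sketch below does only that, under stated assumptions: the `UPSTREAM` topology, the device names and the output fields are hypothetical, the topology is assumed to be a cycle-free tree, and the actual engine described in the talk may work quite differently.

```python
# Hypothetical topology: each device maps to the upstream device it depends on.
UPSTREAM = {
    "access-sw-01": "dist-sw-01",
    "access-sw-02": "dist-sw-01",
    "dist-sw-01": "core-rtr-01",
    "core-rtr-01": None,
}


def find_root(device: str, alerting: set) -> str:
    """Walk upstream while the parent is also alerting; return the topmost alerting device."""
    root = device
    parent = UPSTREAM.get(root)
    while parent is not None and parent in alerting:
        root = parent
        parent = UPSTREAM.get(root)
    return root


def correlate(alerts: list) -> list:
    """Group alerting devices under their probable root cause."""
    alerting = {alert["device"] for alert in alerts}
    groups = {}
    for device in sorted(alerting):
        root = find_root(device, alerting)
        impacted = groups.setdefault(root, [])
        if device != root:
            impacted.append(device)
    return [
        {
            "probable_cause": root,
            "impacted": impacted,
            "action": f"investigate {root} first",
        }
        for root, impacted in sorted(groups.items())
    ]


alerts = [
    {"device": "access-sw-01"},
    {"device": "access-sw-02"},
    {"device": "dist-sw-01"},
]
for finding in correlate(alerts):
    print(finding)
```

With the sample alerts, the two access switches are reported as impacted by `dist-sw-01`, so an engineer sees one finding instead of three separate alerts.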

As a next step, I propose incorporating auto-remediation, which can be reviewed and discussed with the audience.

Target Audience: Networking folks who are interested in advanced network monitoring and automated workflows using Python and cloud services

Key Takeaways: How advanced monitoring and network troubleshooting automation, offered as part of a service, can improve Mean Time To Detect and Mean Time To Restore.
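For readers less familiar with the two metrics, the sketch below computes them from incident timestamps. The incident records are invented for illustration, and measuring both metrics from the moment of failure is an assumption; some teams measure time to restore from detection instead.

```python
from datetime import datetime
from statistics import mean

# Invented incident records: when the failure began, was detected, and was restored.
INCIDENTS = [
    {
        "failed_at": datetime(2020, 3, 1, 10, 0),
        "detected_at": datetime(2020, 3, 1, 10, 12),
        "restored_at": datetime(2020, 3, 1, 11, 0),
    },
    {
        "failed_at": datetime(2020, 3, 2, 14, 0),
        "detected_at": datetime(2020, 3, 2, 14, 4),
        "restored_at": datetime(2020, 3, 2, 14, 30),
    },
]


def mean_minutes(incidents: list, start_key: str, end_key: str) -> float:
    """Average elapsed minutes between two timestamps across all incidents."""
    durations = [
        (incident[end_key] - incident[start_key]).total_seconds() / 60
        for incident in incidents
    ]
    return mean(durations)


mttd = mean_minutes(INCIDENTS, "failed_at", "detected_at")
mttr = mean_minutes(INCIDENTS, "failed_at", "restored_at")
print(f"MTTD: {mttd} min, MTTR: {mttr} min")  # MTTD: 8.0 min, MTTR: 45.0 min
```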

Outline

Introduction

Monitoring Network devices using advanced AWS native services

Automated alert notifier via Slack channels with runbooks

Event Correlation Engine

Advanced network troubleshooting automation workflow using Python 3 and advanced AWS services

Auto incident assignment and auto-remediation: are they possible?

Questions and Answers

Requirements

AWS cloud learning account

Speaker bio

Swaminathan S works as a staff network security engineer at Intuit Technologies and has 15+ years of overall experience in networking. We run datacenter and cloud environments as a hybrid infrastructure supporting business applications, and have implemented multiple monitoring and automation projects.
