Arvind Aravamudhan

@arvindaravamudhan

Democratizing ML at Freshworks

Submitted Apr 15, 2019

The data journey usually begins with raw data, advances to data analytics, and then matures into data science. The key to reaching data science maturity is organizing and storing data for large-scale crunching. With ML/AI being one of the key growth drivers for Freshworks, in this presentation I will walk through how we solved the data organization and access problem for ML/AI use cases by building our own data lake.

Outline

1. Purpose - Support multiproduct, time, cost, Security & Compliance
2. Driving Value from Data and adoption at Freshworks
3. Patterns of Data Flow
4. Ingestion, Self Service Portal and ML pipelines - version 1 & 2 of pipelines
5. Security - Kerberos, OAuth and Sentry
6. Path Ahead
   a) CDC (Change Data Capture) - Faster Data
   b) Spot Instances - Cheaper (Cost)

Speaker bio

Arvind heads the Data Engineering team at Freshworks, which, among other things, helps democratize ML and works closely with analysts to help generate insights. The team is responsible for building and managing the Data Lake and its pipelines. He has been working on moving some of the company’s core datasets from once-a-day batch ETL to near-real-time processing. Previously, Arvind was an Architect at Banca Sella, where he helped build and architect fintech solutions and focused on identifying the technology, tools, and methodologies most appropriate to the banking domain and to highly transactional distributed systems.
He has also published blog posts explaining how he went about building the data lake at Freshworks.

https://www.freshworks.com/company/freshworks-data-lake-blog/

https://www.freshworks.com/company/freshworks-data-lake-part2-blog/

Slides

https://docs.google.com/presentation/d/1cDyxSwz03MBKeeacnPXxa01D0vekp7SAAwWzasL6MEE/
