Search Infrastructure @ Slack using Lambda Architecture
Submitted by Ananth Durai (@vananth22) on Thursday, 27 April 2017
Full talk for data engineering track
Slack is a collaboration tool for teams. We’re on a mission to make your working life simpler, more pleasant, and more productive. Search is a core feature of Slack; the name itself is an acronym for “Searchable Log of All Conversation and Knowledge”.
At Slack, we frequently experiment with machine learning models to improve the search experience, so the ability to rebuild search indexes is critical to our search infrastructure.
This talk centres on lambda architecture: its common pitfalls and best practices, an overview of Slack's search infrastructure, and our experience building Solr offline indexing at scale.
The talk will cover:
1. Lambda Architecture
2. Common pitfalls of Lambda Architecture
3. Design patterns to handle common pitfalls of Lambda Architecture
4. Apache Solr offline indexing
5. Apache Solr search infrastructure
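As a rough illustration of the first topic, the sketch below shows the three layers usually associated with a lambda architecture: a batch layer that rebuilds a full index from historical data, a speed layer that indexes new records as they arrive, and a serving layer that merges both views at query time. It is a toy in-memory inverted index, not Slack's implementation; the record shape and every name in it (build_batch_index, SpeedLayer, search) are hypothetical.

```python
# Toy lambda architecture over an in-memory inverted index.
# All names and the (message_id, text) record shape are illustrative assumptions.


def build_batch_index(messages):
    """Batch layer: rebuild the full term -> message-id index from scratch."""
    index = {}
    for msg_id, text in messages:
        for term in text.lower().split():
            index.setdefault(term, set()).add(msg_id)
    return index


class SpeedLayer:
    """Speed layer: index messages that arrived after the last batch run."""

    def __init__(self):
        self.index = {}

    def add(self, msg_id, text):
        for term in text.lower().split():
            self.index.setdefault(term, set()).add(msg_id)


def search(term, batch_index, speed_layer):
    """Serving layer: merge the batch view with the real-time view."""
    term = term.lower()
    return batch_index.get(term, set()) | speed_layer.index.get(term, set())


# Historical data goes through the batch layer.
historical = [(1, "deploy the search service"), (2, "lunch plans")]
batch_index = build_batch_index(historical)

# A message that arrived after the batch run goes through the speed layer.
speed = SpeedLayer()
speed.add(3, "search index rebuilt")

result = search("search", batch_index, speed)
```

In this sketch, a query sees both the historical message and the one indexed in real time. When the next batch run completes, the speed layer's state can be discarded, because the rebuilt batch index then covers those messages.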
Attendees should have a basic understanding of MapReduce and stream processing.
Ananth Packkildurai is a Senior Data Engineer at Slack Technologies in San Francisco and has over thirteen years of experience building systems at scale. Prior to Slack, Ananth worked as a senior data engineer at Bazaarvoice Inc., where he built a large-scale analytics platform for consumer reviews. He works closely with Hadoop, Apache Crunch, Kafka, Apache Solr, Apache Spark, Druid, and other big data platform tools.