Building analytics applications with Streaming Expressions in Apache Solr
Apache Solr, an open source search engine project, has come a long way since its inception. It now drives applications that combine near-real-time data with rich relevance, and offers users high availability, auto-scaling and an effective failover strategy on cloud infrastructure.
Effective real-time analysis and visualization of collected and correlated data to gain insights is a pressing need for businesses. Streaming Expressions, introduced in Apache Solr 6.0, provide a powerful stream processing language for SolrCloud. They are a suite of functions that can be combined to perform many different parallel computing tasks, such as aggregations, parallel relational algebra, batch processing, distributed graph traversal and related MapReduce operations and use cases.
At Lucidworks, a San Francisco, California-based enterprise search technology company, we solve complex problems and implement use cases around search and analytics for multiple clients on huge datasets. This session focuses on the challenges faced in building near-real-time analytics applications on large datasets. We introduce Streaming Expressions in Apache Solr and discuss the concept and the key components it is built upon. The session then discusses how Streaming Expressions not only meet these expectations but also open the door to numerous possibilities, producing effective, valuable and meaningful analytical data with an ever-growing library of functions.
- Challenges building analytics applications with real-time data
- Introduction to and overview of Streaming Expressions
- Sources, Decorators and Evaluators
- Short, optimised solutions for simple to complex use cases
- Statistical programming with a use case
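As a rough illustration of how sources and decorators compose, the sketch below builds a streaming expression as text and prepares it for Solr's `/stream` request handler, which accepts the expression in the `expr` parameter. The collection name `orders`, the fields `region` and `amount`, and the host `localhost:8983` are hypothetical, and the helpers `call` and `param` exist only for this example. It assumes the fields have docValues enabled so that the `/export` handler can sort the full result set, which `rollup` needs because it groups an already sorted stream.

```python
from urllib.parse import urlencode


def param(key, value):
    """Render a named parameter such as q="*:*" (illustrative helper)."""
    return f'{key}="{value}"'


def call(name, *args):
    """Render a function call such as search(...) (illustrative helper)."""
    return f"{name}({', '.join(args)})"


# Stream source: read tuples from the hypothetical "orders" collection,
# sorted by region so the decorator can group them.
source = call(
    "search",
    "orders",
    param("q", "*:*"),
    param("fl", "region,amount"),
    param("sort", "region asc"),
    param("qt", "/export"),
)

# Stream decorator: wrap the source and aggregate per region.
expr = call("rollup", source, param("over", "region"), "sum(amount)", "count(*)")

# Assumed local endpoint; the expression travels URL-encoded in "expr".
stream_url = "http://localhost:8983/solr/orders/stream?" + urlencode({"expr": expr})

print(expr)
print(stream_url)
```

Because every function takes a stream and returns a stream, further decorators can wrap `expr` in the same way without changing the source.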
Amrit Sarkar is a Search Engineer and Consultant at Lucidworks Inc, a California-based enterprise search technology company, with 3+ years of experience in the search domain, big data, e-commerce and product.
He has been an active Apache Solr contributor for over a year.