Rootconf 2019

On infrastructure security, DevOps and distributed systems.

Introducing Hive-Kafka Integration for Real-Time Kafka SQL Queries

Submitted by Amit Nijhawan (@anijhawa) on Feb 25, 2019

Section: Crisp talk of 20 mins duration
Technical level: Intermediate
Status: Rejected

Abstract

I will explain the SQL access pattern for Kafka and how to get it to work with the new Hive-Kafka integration.
Stream processing engines and libraries such as Kafka Streams provide a programmatic stream processing access pattern to Kafka. Application developers love this access pattern, but BI developers have quite different analytics requirements, focused on ad hoc analytics, data exploration, and trend discovery. The BI persona's requirements for Kafka access include:

Treat Kafka topics/streams as tables.
Support for ANSI SQL.
Support complex joins (different join keys, multi-way join, join predicate to non-table keys, non-equi joins, multiple joins in the same query).
UDF support for extensibility.
JDBC/ODBC support.
Creating views for column masking.
Rich ACL support including column level security.

To address these requirements, the HDP 3.1 release adds a new Hive storage handler for Kafka that lets users view Kafka topics as Hive tables. With it, BI developers can take full advantage of Hive's analytical capabilities, including complex joins, aggregations, window functions, UDFs, and predicate pushdown filtering.
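As a rough illustration of what "viewing a Kafka topic as a Hive table" can look like, the sketch below renders two HiveQL statements as strings: a table definition backed by a topic, and an ad hoc query over recent records. The storage handler class name, the `kafka.topic` and `kafka.bootstrap.servers` table properties, and the `__timestamp` metadata column follow the Apache Hive Kafka storage handler documentation as I understand it; the table name, topic, broker address, and columns are hypothetical and not taken from the talk.

```python
def kafka_table_ddl(table, topic, bootstrap_servers, columns):
    """Render a HiveQL statement that maps a Kafka topic to an external Hive table.

    `columns` is a list of (name, hive_type) pairs describing the message payload.
    """
    cols = ",\n".join(f"  `{name}` {hive_type}" for name, hive_type in columns)
    return (
        f"CREATE EXTERNAL TABLE {table} (\n{cols}\n)\n"
        "STORED BY 'org.apache.hadoop.hive.kafka.KafkaStorageHandler'\n"
        "TBLPROPERTIES (\n"
        f'  "kafka.topic" = "{topic}",\n'
        f'  "kafka.bootstrap.servers" = "{bootstrap_servers}"\n'
        ");"
    )


def recent_count_query(table, minutes):
    """Render a query counting records from the last `minutes` minutes.

    Assumes `__timestamp` holds the Kafka record timestamp in milliseconds.
    """
    return (
        f"SELECT COUNT(*) FROM {table}\n"
        "WHERE `__timestamp` > 1000 * to_unix_timestamp("
        f"CURRENT_TIMESTAMP - interval '{minutes}' MINUTES);"
    )


# Hypothetical table, topic, broker, and payload columns, for illustration only.
DDL = kafka_table_ddl(
    table="page_views",
    topic="page-views",
    bootstrap_servers="localhost:9092",
    columns=[("user_id", "string"), ("page", "string"), ("latency_ms", "int")],
)
QUERY = recent_count_query("page_views", minutes=10)

if __name__ == "__main__":
    print(DDL)
    print(QUERY)
```

The generated statements would be submitted through any Hive client (for example over JDBC/ODBC, as listed in the requirements above). Filtering on the record timestamp is the kind of predicate the storage handler is meant to push down to Kafka rather than scanning the whole topic.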

Speaker bio

Amit Nijhawan - I am a Senior Technical Engineer at Red Hat. I work on Java, middleware, and container technologies.

Slides

https://www.slideshare.net/amitnijhawan/integration-for-realtime-kafka-sql
