The Fifth Elephant 2024

India’s most prestigious Big Data, Machine learning & Data Science conference

Abhijeet Kumar

@abhijeet3922

Unlock Data with NL2SQL: Building Low-code Data Assistant for Business using Code LLMs

Submitted May 15, 2024

Generative AI (LLMs) for code has become a popular and powerful tool for developers, driven by the rise of enterprise solutions such as GitHub Copilot, Amazon CodeWhisperer and Google Duet AI. There are numerous open-source code-assist models that generate code in hundreds of programming languages, including Python, Java and SQL. These models take English comments or instructions and write code for developers.

Apart from improving developer productivity, these models can also be used to build impactful Python-based NL-2-SQL solutions in the data space. Such a solution:

  • Allows non-technical business users to answer day-to-day queries in natural language.
  • Speeds up data analysis for insights and report generation.
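To make the NL-2-SQL idea concrete, here is a minimal sketch of the flow: build a prompt from the table schema and the user's question, ask a code LLM for a query, check that the query is read-only, and run it. This is an illustration under stated assumptions, not the speaker's implementation. `fake_code_llm` is a hypothetical stand-in that returns a canned query in place of a real model call, and the prompt layout, the `sales` table and the `is_read_only` guard are all invented for the example.

```python
import sqlite3


def build_prompt(schema: str, question: str) -> str:
    """Assemble a prompt that asks a code LLM for a single SQL query."""
    return (
        "### Database schema\n"
        f"{schema}\n"
        "### Question\n"
        f"{question}\n"
        "### SQL\n"
    )


def fake_code_llm(prompt: str) -> str:
    """Hypothetical stand-in for a code LLM; a real system would call a model here."""
    return (
        "SELECT region, SUM(amount) AS total "
        "FROM sales GROUP BY region ORDER BY region"
    )


def is_read_only(sql: str) -> bool:
    """Rough guard (an assumption, not a full parser): allow one SELECT statement only."""
    stripped = sql.strip().rstrip(";")
    return stripped.lower().startswith("select") and ";" not in stripped


def answer(conn, schema, question, generate=fake_code_llm):
    """Generate SQL for the question, refuse anything that is not a SELECT, then run it."""
    sql = generate(build_prompt(schema, question))
    if not is_read_only(sql):
        raise ValueError("refusing to run non-SELECT SQL")
    return sql, conn.execute(sql).fetchall()


SCHEMA = "CREATE TABLE sales (region TEXT, amount REAL)"
conn = sqlite3.connect(":memory:")
conn.execute(SCHEMA)
conn.executemany(
    "INSERT INTO sales VALUES (?, ?)",
    [("north", 10.0), ("south", 5.0), ("north", 2.5)],
)

sql, rows = answer(conn, SCHEMA, "What are total sales per region?")
print(sql)
print(rows)
```

Passing the model call in as the `generate` argument keeps the sketch independent of any particular model or vendor; swapping in a prompt-based or fine-tuned model would only change that one function.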

Problem: Why does it matter?

  • Many business users and financial analysts perform ad hoc analysis to answer business queries on the fly. They often struggle to retrieve data from data lakes and other required systems, which means more effort for analyst teams to produce the desired output.

  • Data governance teams in enterprises validate data migration processes with many test cases to ensure data quality. Developing these test cases takes extensive effort in writing the validation queries.
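As an illustration of the kind of validation query the second point refers to, the sketch below compares a row count and a column sum between a source and a target table after a migration. It is one simple check under assumed names (`orders_src`, `orders_tgt`, `amount`), not a description of any particular team's test suite; this is the sort of query an NL-2-SQL assistant could be asked to draft.

```python
import sqlite3


def table_profile(conn, table, numeric_column):
    """Return (row count, column sum) for one table."""
    # Names are interpolated directly into the SQL, so they must come
    # from trusted configuration, never from end-user input.
    query = f"SELECT COUNT(*), COALESCE(SUM({numeric_column}), 0) FROM {table}"
    return conn.execute(query).fetchone()


def migration_matches(conn, source, target, numeric_column):
    """True when source and target agree on row count and column sum."""
    return table_profile(conn, source, numeric_column) == table_profile(
        conn, target, numeric_column
    )


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders_src (id INTEGER, amount REAL)")
conn.execute("CREATE TABLE orders_tgt (id INTEGER, amount REAL)")
sample = [(1, 10.0), (2, 20.5)]
conn.executemany("INSERT INTO orders_src VALUES (?, ?)", sample)
conn.executemany("INSERT INTO orders_tgt VALUES (?, ?)", sample)

print(migration_matches(conn, "orders_src", "orders_tgt", "amount"))
```

Matching counts and sums do not prove the two tables are identical; they are a cheap first check that catches dropped or duplicated rows.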

Sridhar Ramaswamy, CEO of Snowflake Inc. (a cloud-based data company), said in a press briefing, “Our dream here is, within a year, to have an API that our customers can use so that business users can directly talk to data.” Natural language querying enables various stakeholders, including executives, employees, customers, prospects, and partners, to pose questions about data in natural language and receive relevant responses.

Intended Audience

This talk is intended for data engineers, data scientists and researchers who work closely with enterprise data. It may also interest business analysts and business consumers who query and analyze data on a day-to-day basis.

The topic may interest data leaders as an initiative to democratize data for enterprises.

The talk is relevant to any professional working on Gen AI use cases with Python.

Benefit to Participants

Participants will get an in-depth look at the design and details of a solution that brings data closer to the business.
The talk also discusses some of the practical challenges and ways to overcome them.

Scope

  1. Introduction to the problem, use case and application (NL-2-SQL)
  2. State of code LLMs
  3. Generalist (prompt-based) vs specialist (fine-tuned) models
  4. Building the data assist capability: implementation diagram
  5. Practical challenges and strategies to overcome them
  6. Demo of the tool

Key Takeaways

  1. A potential approach to democratizing data for a business or enterprise.
  2. The current state of open-source code LLMs and how they benchmark on tasks.
  3. Implementation details of a no-code/low-code solution for data assistants.

