This BoF session discusses building a low-code data assistant for business users with code LLMs. Generative AI (LLMs) for code has become a popular and powerful tool for developers, with the rise of enterprise solutions such as GitHub Copilot, Amazon CodeWhisperer and Google Duet AI, along with numerous open-source code-assist models that generate code in hundreds of programming languages, including Python, Java and SQL. These models take English comments or instructions and write code for developers.
Apart from improving developer productivity, these models can also be used to build impactful NL2SQL solutions in the data space.
- Allows business (non-technical) users to answer day-to-day queries in natural language.
- Speeds up data analysis for insights and report generation.
Sridhar Ramaswamy, CEO of Snowflake Inc. (a cloud-based data company), said in a press briefing, “Our dream here is, within a year, to have an API that our customers can use so that business users can directly talk to data.” Natural language querying enables various stakeholders (including executives, employees, customers, prospects, and partners) to pose questions about data in natural language and receive relevant responses.
- Data engineers, data scientists or researchers who work closely with enterprise data.
- Business analysts or business consumers who need to query and analyze data on a day-to-day basis.
- Data leaders driving initiatives to democratize data in their enterprises.
The talk is relevant to any professional working on LLM and Gen AI use cases.
- Many business (non-technical) users struggle to fetch data from data lakes and systems, which means more effort for data teams who have to produce it on a regular basis. How can we automate this?
- Many business and financial analysts perform ad hoc analysis on downloaded Excel and CSV data to answer business queries on the fly. How can we automate it?
- Data governance teams in enterprises validate the data migration process with many test cases to ensure data quality. How can we reduce the extensive effort of writing data queries?
- Data governance teams need to create metadata, in the form of sample SQL queries, for the data catalog.
- The talk helps data scientists and data teams understand how to implement such a solution with the right tools and techniques.
- Inroads to Enterprise: Introduction to an enterprise problem, potential use cases and applications of NL2SQL.
  - Ad hoc data queries: automation
  - Validation in data migration
  - Enriching the data catalog
- State of Code LLMs:
  - Closed vs. open models
  - Generalist vs. specialist models
- Practical challenges: Potential challenges in scaling and strategies to the rescue.
  - Multi-table scenarios: how to find the right tables in Snowflake?
  - Few-shot prompting with retrieved examples
  - Metadata pruning
  - Hallucination correction (SQLGlot)
  - Instructions in the prompt
  - Fine-tuning
- Building Data Assistants: Implementation components and process flow.
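
To make the outline more concrete, here is a minimal sketch of how metadata pruning, few-shot example retrieval and prompt instructions could fit together. It is an illustration under stated assumptions, not the implementation presented in the talk: the `SCHEMA` and `EXAMPLES` data, the function names and the keyword-overlap scoring are all hypothetical, and a real system would more likely use embedding-based retrieval over the warehouse metadata.

```python
import re

# Hypothetical table metadata: table name -> {column name: description}.
SCHEMA = {
    "orders": {"order_id": "unique order id", "customer_id": "buyer id",
               "amount": "order value in USD", "order_date": "date placed"},
    "customers": {"customer_id": "unique customer id", "region": "sales region",
                  "signup_date": "date the customer signed up"},
    "web_logs": {"session_id": "browser session", "url": "page visited",
                 "ts": "event timestamp"},
}

# Hypothetical library of (question, SQL) pairs used as few-shot examples.
EXAMPLES = [
    ("Total order amount per region",
     "SELECT c.region, SUM(o.amount) FROM orders o "
     "JOIN customers c ON o.customer_id = c.customer_id GROUP BY c.region;"),
    ("Number of page visits per url",
     "SELECT url, COUNT(*) FROM web_logs GROUP BY url;"),
]

STOPWORDS = {"what", "is", "the", "by", "of", "per", "in", "a", "for"}


def tokens(text):
    """Lowercase alphabetic tokens, minus a small stopword list."""
    return set(re.findall(r"[a-z]+", text.lower())) - STOPWORDS


def overlap(question, text):
    """Number of tokens shared by the question and a piece of text."""
    return len(tokens(question) & tokens(text))


def prune_schema(question, schema, top_k=2):
    """Metadata pruning: keep only the tables most related to the question."""
    def score(table):
        doc = table + " " + " ".join(f"{c} {d}" for c, d in schema[table].items())
        return overlap(question, doc)
    ranked = sorted(schema, key=score, reverse=True)
    return [t for t in ranked[:top_k] if score(t) > 0]


def retrieve_examples(question, examples, top_n=1):
    """Few-shot retrieval: pick the stored questions closest to the new one."""
    return sorted(examples, key=lambda ex: overlap(question, ex[0]), reverse=True)[:top_n]


def build_prompt(question, schema, examples):
    """Assemble instructions, pruned metadata and examples into one prompt."""
    lines = ["-- Write a single SQL SELECT statement.",
             "-- Use only the tables and columns listed below."]
    for table in prune_schema(question, schema):
        lines.append(f"-- Table {table}({', '.join(schema[table])})")
    for ex_question, ex_sql in retrieve_examples(question, examples):
        lines.append(f"-- Question: {ex_question}")
        lines.append(ex_sql)
    lines.append(f"-- Question: {question}")
    lines.append("SELECT")
    return "\n".join(lines)


if __name__ == "__main__":
    print(build_prompt("What is the total order amount by region?", SCHEMA, EXAMPLES))
```

The prompt is written as SQL comments ending in `SELECT` because code LLMs are typically good at completing code from comments; the resulting string would be sent to whichever closed or open model is chosen.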
- A potential idea for democratizing data across a business or enterprise.
- Practical details on building a low-code/no-code data assistant that brings data closer to the business.
- The current state of code-generating Large Language Models.
- In-depth insights into the effort and challenges involved in creating an NL2SQL solution.
- Techniques and tools to implement such a solution and overcome the challenges.
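
As one illustration of overcoming the hallucination challenge, generated SQL can be checked against the schema before it reaches the user. The outline names SQLGlot for this step; the sketch below deliberately swaps in Python's standard-library `sqlite3` module so that it runs without extra dependencies. The `DDL` string and the `check_sql` and `repair_prompt` helpers are hypothetical, and SQLite's dialect differs from Snowflake's, so this only approximates a real check.

```python
import sqlite3

# Hypothetical DDL standing in for the warehouse schema.
DDL = """
CREATE TABLE orders (order_id INTEGER, customer_id INTEGER, amount REAL, order_date TEXT);
CREATE TABLE customers (customer_id INTEGER, region TEXT, signup_date TEXT);
"""


def check_sql(sql, ddl=DDL):
    """Return (True, None) if the query compiles against the schema,
    otherwise (False, error message)."""
    conn = sqlite3.connect(":memory:")
    try:
        conn.executescript(ddl)
        # EXPLAIN compiles the statement without running it, so syntax errors
        # and unknown tables or columns surface as exceptions here.
        conn.execute("EXPLAIN " + sql)
        return True, None
    except (sqlite3.Error, sqlite3.Warning) as exc:
        return False, str(exc)
    finally:
        conn.close()


def repair_prompt(question, bad_sql, error):
    """Build a follow-up prompt that feeds the error back to the model."""
    return (
        f"-- Question: {question}\n"
        f"-- Previous query: {bad_sql}\n"
        f"-- It failed with: {error}\n"
        "-- Corrected query:\n"
        "SELECT"
    )


if __name__ == "__main__":
    # 'revenue' is not a column of orders, so this is flagged as invalid.
    ok, err = check_sql("SELECT region, SUM(revenue) FROM orders GROUP BY region")
    print(ok, err)
```

A failed check does not have to end the interaction: the error text can be passed to `repair_prompt` and sent back to the model for another attempt, ideally with a cap on the number of retries.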
![Mind Map](https://github.com/abhijeet3922/NL2SQL/blob/main/assets/BoF-NL2SQL-No-Low-Code-Assistant.png?raw=True)