The Fifth Elephant 2024 Annual Conference (12th & 13th July)

Maximising the Potential of Data — Discussions around data science, machine learning & AI

Chaitanya

Thejesh GN

@thej Author

Need for new licenses in this age of Generative AI

Submitted Jun 1, 2024

Introduction

In this rapidly evolving digital era, data is the fuel powering the growth of artificial intelligence. As we stand on the brink of technological revolutions, it becomes crucial to understand not just how data drives AI, but also the ethical and legal frameworks that must evolve with it. We should look at licensing as a tool to level the playing field: today, incumbents reap most of the benefits of AI because of their access to data and compute.

In this talk, we will walk through our journey of trying to find the right license for the crowdsourced data collected for the Telugu ASR system and for the model built using that data.

Target Audience

The primary audience for this talk includes Data Scientists and AI Researchers, Legal Professionals with a Focus on Technology, Open Source and Community Contributors, and Policy Makers and Regulators.
Academics and Students in Technology and Law Fields, Tech Entrepreneurs, and Start-up Founders can also benefit from this talk.

Outline

  • Introduction to the Project

    • Overview of the Telugu ASR system
    • Importance of crowdsourced data in ASR technologies
  • Challenges with Licensing Crowdsourced Data

    • Legal complexities of using crowdsourced data
    • Ethical considerations in data collection and usage
  • Requirements for an Effective License

    • Compliance with data protection regulations (e.g., GDPR, CCPA)
    • Flexibility to accommodate contributions from a diverse crowd
    • Clarity on data usage rights and restrictions
  • Journey to Finding the Right License

    • Evaluation of existing licenses (e.g., Creative Commons, MIT, proprietary licenses)
    • Customizing license elements to suit the specific needs of AI datasets, covering new terms such as fine-tuning and model weights
    • Engagement with legal experts and the community
  • Next Steps and Future Directions

    • Open house for consultations
    • Finalizing the license and release
  • Q&A

    • Open floor for questions and further discussion

Impact

Introducing a specialized license for crowdsourced data, akin to the impact of the GPL for open-source software, could fundamentally transform how data is utilized in technological innovations. It would promote a collaborative environment where data can be freely shared and enhanced, while ensuring compliance with ethical standards and data protection laws. Such a license would encourage broader participation and innovation, reduce legal barriers, and ensure the sustainability of data resources. It might also help level the playing field by making sure the benefits of AI don't accrue only to the mega-corporations. By clarifying usage rights and responsibilities, this new licensing framework could set industry standards for data handling, leading to more responsible and impactful technological advancements across various sectors.

Mindmap
# Introduction to the Project
## Introduction to the Project
### Overview of the Telugu ASR system
#### Development Goals
#### Technological Framework
#### Performance Metrics
### Importance of crowdsourced data in ASR technologies
#### Data Volume and Diversity
#### Quality Improvement
#### Community Engagement
## Challenges with Licensing Crowdsourced Data
### Legal complexities of using crowdsourced data
#### IPR
#### Data Ownership
### Ethical considerations in data collection and usage
#### Informed Consent
#### Bias
## Requirements for an Effective License
### Compliance with data protection regulations (e.g., GDPR, CCPA)
### Flexibility to accommodate contributions from a diverse crowd
### Clarity on data usage rights and restrictions
## Journey to Finding the Right License
### Evaluation of existing licenses
### Customizing license elements to suit the specific needs of AI datasets, covering new terms such as fine-tuning and model weights
#### Balancing Flexibility and Control
#### Stakeholder Input
#### New Terms and Conditions
### Engagement with legal experts and the community
## Next Steps and Future Directions
### Open house for consultations
#### Community Involvement
#### Iterative Refinement
### Finalizing the license and release
## Q&A
### Open floor for questions and further discussion

