The Fifth Elephant Open Source AI Hackathon 2024

GenAI makers and creators contest and showcase

Akshobhya

@akshobhya_j

Bharat Shetty B

@ctangent Editor

Resources for Open-Source AI Hackathon

Submitted Jan 29, 2024

This submission contains a list of knowledge resources shared by the curator of The Fifth Elephant Open-Source AI hackathon, Bharat Shetty Barkur.

You will find interesting links here to spark creativity in your projects and aid in your idea development. If you have any interesting links to share, do add them as comments in this submission.

This submission will be updated daily until the hack day on 3 February. Do check back occasionally for updated resources.

  1. For those who are looking to leverage Microsoft’s Semantic Kernel along with LLMs: https://towardsdatascience.com/a-pythonistas-intro-to-semantic-kernel-af5a1a39564d

  2. https://github.com/pandeyanuradha/Chatbot-for-mental-health Folks, keep checking sites like Kaggle and GitHub for interesting datasets that can help you in use-cases like this one. This project uses a small Kaggle dataset (only 98 FAQs), but a RAG + LLM approach can also be tried out here.

Make sure that your hackathon descriptions spell out nuances like this. It will help you break the main project down into small, actionable items in the long run.
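To make the RAG + LLM suggestion above a bit more concrete, here is a minimal sketch of the retrieval half, using only the Python standard library. It is not how the linked project works. The FAQ entries are invented placeholders (not rows from the Kaggle dataset), the function names `retrieve` and `build_prompt` are hypothetical, and plain bag-of-words cosine similarity stands in for the embedding search a real setup would more likely use.

```python
import math
import re
from collections import Counter

# Placeholder FAQ entries, invented for illustration only; they are NOT
# taken from the Kaggle dataset used by the linked project.
FAQS = [
    ("What is anxiety?", "Anxiety is a feeling of worry about what might happen."),
    ("How can I sleep better?", "Keep a regular bedtime and avoid screens before sleep."),
    ("Where can I find a therapist?", "Ask your doctor for a referral or call a local helpline."),
]


def tokenize(text):
    """Lowercase the text and split it into alphabetic word tokens."""
    return re.findall(r"[a-z]+", text.lower())


def cosine(a, b):
    """Cosine similarity between two bag-of-words Counters."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def retrieve(query, faqs, k=1):
    """Return the k (score, question, answer) tuples that best match the query."""
    q_vec = Counter(tokenize(query))
    scored = [
        (cosine(q_vec, Counter(tokenize(question))), question, answer)
        for question, answer in faqs
    ]
    scored.sort(key=lambda item: item[0], reverse=True)
    return scored[:k]


def build_prompt(query, faqs, k=1):
    """Assemble a prompt that grounds the LLM in the retrieved FAQ entries."""
    context = "\n".join(f"Q: {q}\nA: {a}" for _, q, a in retrieve(query, faqs, k))
    return f"Answer using only this context:\n{context}\n\nUser question: {query}"
```

The string returned by `build_prompt` would then be sent to whichever open-source LLM you pick. With only 98 FAQs, even a simple retriever like this keeps the model's answers tied to the dataset.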

  3. I had worked on a course project in college on generating molecules for drug discovery, and had referred to this paper then: https://arxiv.org/pdf/2001.08184.pdf Table 3 in the paper lists some references for datasets. Your idea may be slightly different, but the datasets may be worth a look.

  4. https://cloud.google.com/blog/topics/healthcare-life-sciences/building-a-clinical-intelligence-engine-using-medlm

For those who are looking for healthcare datasets, the MIMIC dataset is an example of an open EHR dataset. This blog can give ideas that can be executed in an open manner.

  5. Saraswati has shared some datasets that might help with the hackathon:
    https://cloud.google.com/healthcare-api/docs/resources/public-datasets
    https://www.kaggle.com/datasets/uciml/breast-cancer-wisconsin-data
    https://data.gov.in/keywords/Diagnostic

  6. https://arxiv.org/pdf/2303.18223.pdf Check this out for a survey of LLMs.

  7. Also, check out this dataset that I found: https://huggingface.co/datasets/knowrohit07/know_medical_dialogues It is for anyone who wants to fine-tune a model and then try some nifty use-cases on top of it.

  8. @Saraswati Chandra has suggested these ideas for the hackathon.
    Assuming an Indian context, here are some ideas for health-related projects:
    ML
    Currently good at intent and pattern recognition
    Focus area: improving technicians' efficiency OR resource deployment (public health)
    Potential use cases
    a. Nurse hiring and deployment for villages (a public health problem)
    b. ASHA workers' route map plan
    c. Infectious disease (malaria) potential harm map
    d. Breast cancer screening
    LLM
    Currently good at user interaction
    Focus area: customer care journey OR customer information (public health)
    Potential use cases
    • Vaccine information and reminders
    • Maternal care journey interaction
    • Post-cancer care
    • Pre- and post-clinic-visit interaction management

  9. https://github.com/langroid/Awesome-LLM?tab=readme-ov-file#open-llm Folks, since many of you were asking which LLMs are open source, take a look here for some of them.

  10. https://arvindsaraf.medium.com/technology-for-impact-9b1c2c2c2934
    https://arvindsaraf.medium.com/regulating-ai-1aa732d8f82e
    Between them, one can pull up some ideas.

  11. A quality set of articles and notebooks on LLMs: https://github.com/ghimiresunil/LLM-PowerHouse-A-Curated-Guide-for-Large-Language-Models-with-Custom-Training-and-Inferencing Please go through these resources; some of them will be helpful for sure.

  12. An example of the quick hacks that folks are doing with LLMs and prompt engineering: https://arxiv.org/pdf/2401.14447.pdf

  13. https://github.com/poloclub/wordflow/ Check this out to see how nicely they have laid out the ideas, code, roadmap, etc.

  14. https://lilacai-lilac.hf.space/datasets#lilac/OpenHermes-2.5&rowId="0000215f-9b07-46da-a8fa-b23aa28f1ba3" Since datasets are useful for open-source LLM pre-training and fine-tuning projects, this is a good example of an app for understanding the structure of these datasets.

Hybrid access (members only)

Hosted by

The Fifth Elephant hackathons

Supported by

Venue host

Welcome to the events page for events hosted at The Terrace @ Hasura.
