1. What topics will be discussed?
1. Experiences running systems at scale in production
2. Fire stories
2. Who should attend this meetup?
People who are interested in learning about distributed systems and who enjoy post-mortems and RCAs.
3. Who should speak at this meetup?
1. Have you been the primary responder to a fire in production? We’d love to hear your story.
2. Are you running systems at scale? We would be thrilled to hear your experience reports of what went well.
5. Format of the meetup
Two talks, 40 minutes each.
Date (tentative): 30th May, 4-6 PM.
How to join the meetup: This event is free to attend. A Zoom link will be shared closer to the meetup date.
Folding@Home on Kubernetes in the Cloud
Folding@Home is a distributed computing project that uses idle compute time to run protein simulations. Recently, the project has prioritized Covid-19 research (https://foldingathome.org/covid19/). In this talk, I go through how to deploy Folding@Home client workers on your Kubernetes clusters so that you can donate idle compute time across your cluster.
- What is Folding@Home: A brief overview of the project and its importance, along with the underlying distributed systems concepts.
- Covid-19 and Protein Simulation Workloads and their Challenges: How protein simulation workloads work and why they are relevant to Covid-19.
- Spinning up Folding@Home workers on your nodes via Deployments and DaemonSets: Diving into setting up workers on your clusters, pitfalls and edge cases to avoid, and ensuring workers are used only during idle time.
- Summary and Future Work: Maintaining Helm charts, following the Folding@Home project.
- A running Kubernetes cluster (preferably cloud-based rather than minikube) with kubectl access privileges to create DaemonSets and Deployments
- A stable internet connection
- (Optional) A cluster with access to GPU nodes/cloud account with GPU node access. If you want to run workers on GPU nodes, make sure that your GPU node service limits are non-zero.
Ishaan is a Master’s student in the EECS department at UC Berkeley, researching robustness in Deep Neural Networks. He previously helped build a fast-growing ticketing startup in India, where he led a team in scaling the backend infrastructure to serve some of the largest ticketed events in the world. He’s currently applying AI in Healthcare at the Stanford ML Group and creating the data science infrastructure at UC Berkeley’s Discovery Research Program.