Dhruv Nigam
Dota 2 with Large Scale Deep Reinforcement Learning
On April 13th, 2019, OpenAI Five became the first AI system to defeat the world champions at an esports game. The game of Dota 2 presents novel challenges for AI systems such as long time horizons, imperfect information, and complex, continuous state-action spaces, all challenges that will become increasingly central to more capable AI systems.
Reinforcement learning is a branch of machine learning whose potential has largely not materialized in practice. It has proven instrumental in games, where the environment is well defined and can be simulated cheaply. The real world rarely offers these benefits. However, reinforcement learning from human feedback (RLHF) was a key component in making ChatGPT a success.
It is important to understand RL fundamentals in order to create systems that can continually improve from online feedback. Through this paper, I will cover:
- the philosophy of RL
- the challenge and importance of reward engineering
- the role of distributed training and Ray in making RL viable