Reinforcement Learning on a playable version of Flappy Bird.
Training the AI agent from scratch — watch it progressively learn to navigate the pipes:
    pip install -e .

On Linux, you also need tkinter as a backend for interactive matplotlib support:

    sudo apt-get install python3-tk

To also install the tools for recording demos:

    pip install -e ".[tools]"

Launch the game with:

    rl-flappy-bird

A window will open. The score and commands are displayed on the right side of the window.
To let the AI agent learn from scratch:
    rl-flappy-bird --agent ai

To load a pretrained agent:

    rl-flappy-bird --agent ai --load_save

Press S during the simulation to save the agent's current state.

To record a demo:

    python tools/record_demo.py

The state is composed of the bird's horizontal and vertical distances to the next pipe opening.
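
Since the agent solves a tabular Markov Decision Process, these two continuous distances have to be mapped onto a finite grid of states. The snippet below is a minimal sketch of one way to do that, assuming a uniform bin size in pixels. The function name `discretize_state` and the parameter `bin_size` are hypothetical and are not taken from the repository.

```python
import math


def discretize_state(dx, dy, bin_size=10.0):
    """Map continuous distances to integer grid indices.

    dx, dy: horizontal and vertical distances to the next pipe opening.
    bin_size: width of one grid cell (hypothetical parameter).
    """
    # floor() keeps negative distances (bird above or below the opening)
    # in distinct cells instead of folding them around zero.
    return (math.floor(dx / bin_size), math.floor(dy / bin_size))


# Example: 123 px ahead of the opening and 37.5 px away vertically (negative side).
example_state = discretize_state(123.0, -37.5)
print(example_state)  # (12, -4)
```

A coarser grid (larger `bin_size`) gives fewer states and faster learning, at the cost of less precise control.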
The agent explores its environment with an epsilon-greedy scheme that becomes increasingly greedy over time. After each simulation, it:
- Updates its approximation of the underlying Markov Decision Process from observed transitions.
- Solves for the optimal value function via Value Iteration.
The best action in a given state is the one that maximizes the expected value.
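
The loop described above can be sketched as follows. This is an illustrative, simplified version and not the repository's implementation: the class `TabularModel`, the functions `value_iteration` and `choose_action`, the default value of 0 for unseen state-action pairs, and the toy states and rewards are all assumptions made for the example.

```python
import random
from collections import defaultdict


class TabularModel:
    """Empirical MDP estimate built from observed transitions (hypothetical helper)."""

    def __init__(self):
        # counts[(s, a)][s_next] = number of times this transition was observed
        self.counts = defaultdict(lambda: defaultdict(int))
        # reward_sum[(s, a)] = total reward collected after taking a in s
        self.reward_sum = defaultdict(float)

    def update(self, s, a, r, s_next):
        self.counts[(s, a)][s_next] += 1
        self.reward_sum[(s, a)] += r

    def q_value(self, s, a, values, gamma):
        """Expected return of taking action a in state s under the estimated model."""
        nexts = self.counts.get((s, a))
        if not nexts:
            return 0.0  # unseen pair: neutral default (an assumption of this sketch)
        n = sum(nexts.values())
        expected_next = sum(c * values.get(s2, 0.0) for s2, c in nexts.items()) / n
        return self.reward_sum[(s, a)] / n + gamma * expected_next


def value_iteration(model, states, actions, gamma=0.95, tol=1e-6, max_iters=1000):
    """Repeat Bellman optimality backups until the values stop changing."""
    values = {s: 0.0 for s in states}
    for _ in range(max_iters):
        delta = 0.0
        for s in states:
            best = max(model.q_value(s, a, values, gamma) for a in actions)
            delta = max(delta, abs(best - values[s]))
            values[s] = best
        if delta < tol:
            break
    return values


def choose_action(model, s, actions, values, gamma, epsilon):
    """Epsilon-greedy: random action with probability epsilon, else the best one."""
    if random.random() < epsilon:
        return random.choice(actions)
    return max(actions, key=lambda a: model.q_value(s, a, values, gamma))


# Toy example: flapping keeps the bird alive (+1), idling kills it (-10).
actions = ["flap", "idle"]
model = TabularModel()
model.update("alive", "flap", 1.0, "alive")
model.update("alive", "idle", -10.0, "dead")

values = value_iteration(model, ["alive", "dead"], actions, gamma=0.5)
best_action = choose_action(model, "alive", actions, values, gamma=0.5, epsilon=0.0)
print(round(values["alive"], 3), best_action)
```

In the toy example the value of `"alive"` converges to 1 / (1 - 0.5) = 2 and the greedy action is `"flap"`. Lowering `epsilon` from one simulation to the next is what makes the scheme increasingly greedy.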
Sprites (bird, pipes, background) can be swapped by:
- placing new JPG files in the sprites/ directory;
- updating the sprite paths in rl_flappy_bird/args.py.
Other simulation parameters can also be tuned in args.py:
- environment dimensions;
- bird dynamics (gravity, jump velocity);
- RL hyperparameters (discount factor, state discretization).

