This project is a self-imposed challenge to learn robotics, reinforcement learning (RL), and NVIDIA Omniverse by developing a simulated robot that learns to walk and then transferring that knowledge to a real Bittle robot.
This challenge was inspired by Umar Jamil's 100 days of CUDA challenge. Although this isn't CUDA, nor focused solely on RL, the mindset of consistent learning and experimentation is key, and I will be tracking my progress in the 100 days challenge Discord.
Nothing. I'm a computer scientist with no prior robotics experience. I started with Introduction to Robotics by John J. Craig, planning to build up from there, and have since switched to Probabilistic Robotics (Intelligent Robotics and Autonomous Agents series) by Sebastian Thrun, which takes more of a computer-science approach to robotics.
I am well-versed in computer vision and know some basic RL algorithms like Q-learning and MCTS, but I plan to start with the fundamental papers and build up from there. Thank you, David Abel, for the paper recommendations!
Check out the Daily Progress/ folder for daily updates. My goal is to maintain consistent progress, even on slower days.
I'm a full-time researcher and part-time student, so my progress will be steady but may vary until I complete my degree and wrap up my current research.
- Make the robot walk – Develop and train a reinforcement learning policy to teach a simulated robot how to walk.
- Transfer learning to a real robot – Adapt the trained policy to a physical robot with real-world constraints. I will be using a Bittle robot dog.
- Enhance autonomy – Implement reasoning capabilities to allow the robot to navigate and interact with the world.
- Optimize learning efficiency – Explore different RL methods and physics simulations for faster, more robust learning.
- Integrate multimodal inputs – Utilize additional sensory inputs (e.g., vision, IMU, force sensors) to improve decision-making.
| Day | Notes & Summaries |
|---|---|
| Day 1 | Intro to Robotics: Basic robotic definitions. Full Notes |
| Day 2 | Intro to Robotics: Basic robotic definitions and common notations. Full Notes |
| Day 3 | Intro to Robotics: Compound Transformations. Full Notes |
| Day 4 | Intro to Robotics: Z-Y-X Euler angles & different ways to represent rotation matrices. Full Notes |
| Day 5 | PAPER: Continuous Control with Deep Reinforcement Learning. Intro to Robotics: Rotation matrices continued, notation, computational constraints. Full Notes |
| Day 6 | PAPER: Proximal Policy Optimization Algorithms - Part 1. Intro to Robotics: Relating frames to each other, pt. 1. Full Notes |
| Day 7 | PAPER: Proximal Policy Optimization Algorithms - Part 2. Code Implementation: PPO, part 1 -> Actor & config. Intro to Robotics: Relating frames to each other, pt. 2. Full Notes |
| Day 8 | PAPER: Human-level control through deep reinforcement learning. Code Implementation: PPO, part 2 -> Critic, clipped loss, KL divergence coefficient, action value. Full Notes |
| Day 9 | PAPER: Trust Region Policy Optimization (TRPO) - Part 1. Code Implementation: PPO, part 3 -> Trajectory generation. Full Notes |
| Day 10 | PAPER: Trust Region Policy Optimization (TRPO) - Part 2. Intro to Robotics: Examples of mapping between kinematic descriptions. Full Notes |
| Day 11 | PAPER: Trust Region Policy Optimization (TRPO) - Part 3. TEXTBOOK UPDATE: Switching to Probabilistic Robotics (Intelligent Robotics and Autonomous Agents series). Code Implementation: PPO, part 4 -> Loss updates. Full Notes |
| Day 12 | PAPER: Soft Actor-Critic: Off-Policy Maximum Entropy Deep Reinforcement Learning with a Stochastic Actor - Part 1. Code Implementation: PPO, part 5 -> Started Omniverse development. Full Notes |
| Day 13 | Code Implementation: PPO, part 5 -> Cartpole simulation up; still need to configure the manager environment. Full Notes |
| Day 14 | PAPER: Soft Actor-Critic: Off-Policy Maximum Entropy Deep Reinforcement Learning with a Stochastic Actor - Part 2. Code Implementation: PPO, part 6 -> Started minibatching and support for multiple actors. Full Notes |
| Day 15 | PAPER: The Difficulty of Passive Learning in Deep Reinforcement Learning - Part 1. BOOK: Probabilistic Robotics -> New book! Introduction and definitions. Full Notes |
| Day 16 | PAPER: The Difficulty of Passive Learning in Deep Reinforcement Learning - Part 2. BOOK: Probabilistic Robotics -> Probability. Full Notes |
| Day 17 | PAPER: The Difficulty of Passive Learning in Deep Reinforcement Learning - Part 3. BOOK: Probabilistic Robotics -> Robot-environment interaction definitions. Full Notes |
| Day 18 | PAPER: The Difficulty of Passive Learning in Deep Reinforcement Learning - Part 4. BOOK: Probabilistic Robotics -> Bayes filtering. Full Notes |
| Day 19 | PAPER: Deep Reinforcement Learning with Double Q-Learning - Part 1. BOOK: Probabilistic Robotics -> Markov assumption, Gaussian filters. Full Notes |
| Day 20 | Notetaking update: content notes & reflective notes. PAPER: Deep Reinforcement Learning with Double Q-Learning - Part 2. BOOK: Probabilistic Robotics -> Kalman filters, beliefs as linear Gaussian distributions. Full Notes |
| Day 21 | PAPER: Deep Reinforcement Learning with Double Q-Learning - Part 3. BOOK: Probabilistic Robotics -> EKFs, linearization with Taylor expansion. Full Notes |
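For a taste of what the PPO implementation above involves, here is a minimal sketch of the clipped surrogate loss from the PPO paper (Days 6-11). It is illustrative only; the tensor names and the `eps` default are my own placeholders, not the actual code in this repo:

```python
import torch

def ppo_clipped_loss(new_log_probs, old_log_probs, advantages, eps=0.2):
    """Clipped surrogate objective from the PPO paper (illustrative sketch)."""
    # Probability ratio r_t(theta) = pi_theta(a|s) / pi_theta_old(a|s),
    # computed in log space for numerical stability
    ratio = torch.exp(new_log_probs - old_log_probs)
    # Unclipped and clipped surrogate terms
    unclipped = ratio * advantages
    clipped = torch.clamp(ratio, 1.0 - eps, 1.0 + eps) * advantages
    # PPO maximizes the elementwise minimum; negate it for gradient descent
    return -torch.min(unclipped, clipped).mean()
```

Likewise, the Kalman filter material from Days 19-21 boils down to a two-step predict/update loop. Below is a 1-D version under the linear-Gaussian assumptions used in Probabilistic Robotics; the parameter names are placeholders for illustration:

```python
def kalman_step(mu, var, u, z, a=1.0, b=1.0, c=1.0, q=0.1, r=0.1):
    """One predict/update cycle of a 1-D Kalman filter (illustrative sketch).

    Motion model:      x' = a*x + b*u + N(0, q)
    Measurement model: z  = c*x + N(0, r)
    """
    # Predict: push the belief (mu, var) through the linear motion model
    mu_bar = a * mu + b * u
    var_bar = a * a * var + q
    # Update: fuse the prediction with measurement z via the Kalman gain
    k = var_bar * c / (c * c * var_bar + r)
    mu_new = mu_bar + k * (z - c * mu_bar)
    var_new = (1.0 - k * c) * var_bar
    return mu_new, var_new
```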
📂 robotics-learning-challenge
├── 📜 README.md        # Project overview and documentation
├── 📂 progress         # Logs and updates on milestones
│   ├── 📝 day1.md
│   ├── 📝 day2.md
│   ├── 📝 day3.md
│   └── ...
├── 📂 code             # Scripts, simulations, and training code
│   ├── 🏗️ simulation    # Omniverse-based simulations
│   ├── 🤖 real-robot    # Deployment & transfer learning
│   └── 🧠 models        # RL models and training scripts
├── 📂 logs             # Training and debugging logs
├── 📂 docs             # Additional documentation
└── 📂 experiments      # Experimental setups and results
Python >= 3.10
- Continuous Control with Deep Reinforcement Learning
- Proximal Policy Optimization Algorithms
- Human-level control through deep reinforcement learning
- ConceptGraphs: Open-Vocabulary 3D Scene Graphs for Perception and Planning
- sentdex series
- Trust Region Policy Optimization
- Soft Actor-Critic: Off-Policy Maximum Entropy Deep Reinforcement Learning with a Stochastic Actor
- Introduction to Robotics by John J. Craig
- The Difficulty of Passive Learning in Deep Reinforcement Learning
- Deep Reinforcement Learning with Double Q-Learning
- DEMO^3: Multi-Stage Manipulation with Demonstration-Augmented Reward, Policy, and World Model Learning
- Dream to Control: Learning Behaviors by Latent Imagination
- Mastering the game of Go with deep neural networks and tree search
- Mastering Diverse Domains through World Models
- Mastering the game of Go without human knowledge
- A Distributional Perspective on Reinforcement Learning
- Learning to Walk in Minutes Using Massively Parallel Deep Reinforcement Learning
- Sim2Real Transfer in Robotics
- NVIDIA Isaac Sim tutorial
- Probabilistic Robotics (Intelligent Robotics and Autonomous Agents series)
- Visual and LiDAR-based SLAM with ROS using Bittle and Raspberry Pi
- Berkeley Humanoid Training Code
- Transferring Robot Learning Policies From Simulation to Reality
- Introduction to Robotic Simulations in Isaac Sim (Not available yet)
- Huggingface RL course
- Robotics 101
- Robot dog simulation -> NVIDIA tech. blog
- Fine-tuning RL policies for stable and energy-efficient walking (a reward-shaping sketch follows this list).
- Sim2Real transfer to move the learned motions onto a real-world robot.
- Open-world reasoning, integrating work like ConceptGraphs.
- GTC Insights & Updates – Implementing new ideas and techniques learned from NVIDIA GTC in March.
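As a rough idea of what "stable and energy-efficient walking" might look like as a reward, here is a hypothetical reward-shaping sketch. The weights and observation names are made up for illustration and are not from this repo:

```python
import numpy as np

def walking_reward(forward_vel, target_vel, joint_torques, joint_vels,
                   w_vel=1.0, w_energy=0.001):
    """Hypothetical reward for stable, energy-efficient walking.

    Rewards tracking a target forward velocity and penalizes mechanical
    power (|torque * joint velocity|) as a proxy for energy use.
    """
    # Velocity-tracking term: peaks at 1 when the base hits the target speed
    vel_reward = float(np.exp(-np.square(forward_vel - target_vel)))
    # Energy penalty: total absolute mechanical power across all joints
    energy_penalty = float(np.sum(np.abs(joint_torques * joint_vels)))
    return w_vel * vel_reward - w_energy * energy_penalty
```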
Stay tuned for updates, especially after GTC!