Train with stable-baselines3
You can train the environments with any OpenAI/gym compatible library. In this documentation we explain how to use one of them: stable-baselines3 (SB3).
To install SB3, follow the instructions in its documentation: Install stable-baselines3.
Alternatively, you can install panda-gym and SB3 directly with a single command:
pip install panda-gym[extra]
If you use a zsh terminal, the syntax is:
pip install 'panda-gym[extra]'
Now that SB3 is installed, you can run the following code to train an agent. You can use any algorithm compatible with the Box action space (see stable-baselines3/RL Algorithms). In the following example, a DDPG agent is trained to solve the Reach task.
import gym
import panda_gym
from stable_baselines3 import DDPG

env = gym.make("PandaReach-v2")
model = DDPG(policy="MultiInputPolicy", env=env)
model.learn(total_timesteps=30000)
This is the canonical code for training with SB3. For information on setting hyperparameters, verbosity, saving the model, and so on, please read the SB3 documentation.
Bonus: Train with RL Baselines3 Zoo
RL Baselines3 Zoo is the training framework associated with SB3.
It provides scripts for training and evaluating agents, tuning hyperparameters, plotting results and recording videos. It also contains already-optimized hyperparameters, including for some panda-gym environments.
The current version of RL Baselines3 Zoo provides hyperparameters for version 1 of
panda-gym, but not for version 2. Before training with RL Baselines3 Zoo, you will have to set your own hyperparameters by editing
hyperparameters/<ALGO>.yml. For more information, please read the README of RL Baselines3 Zoo.
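For illustration, an entry in `hyperparameters/<ALGO>.yml` follows the RL Baselines3 Zoo convention of one top-level key per environment ID. The values below are placeholders, not tuned hyperparameters; replace them with your own.

```yaml
# Sketch of an entry in hyperparameters/tqc.yml — values are illustrative only
PandaPickAndPlace-v2:
  n_timesteps: !!float 1e6
  policy: 'MultiInputPolicy'
  buffer_size: 1000000
  batch_size: 256
  gamma: 0.95
  learning_rate: !!float 1e-3
```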
To use it, follow its installation instructions, then run the following command:
python train.py --algo <ALGO> --env <ENV>
For example, to train an agent with TQC on PandaPickAndPlace-v2:
python train.py --algo tqc --env PandaPickAndPlace-v2
To visualize the trained agent, follow the instructions in the SB3 documentation. It is necessary to add --env-kwargs render:True when running the enjoy script.
python enjoy.py --algo <ALGO> --env <ENV> --folder <TRAIN_AGENT_FOLDER> --env-kwargs render:True