Installing NVIDIA Isaac Gym, NVIDIA's GPU-based robot simulation environment for reinforcement learning training (Part 2, continued)

Continuing from the previous post:

Installing NVIDIA Isaac Gym, NVIDIA's GPU-based robot simulation environment for reinforcement learning training



This post gives example commands for running the PyTorch PPO training examples that ship with NVIDIA Isaac Gym.





Below are several examples of training Isaac Gym environments with the reinforcement learning code under the rlgpu directory.



The examples below use the file /home/devil/isaacgym/python/rlgpu/train.py (the train.py script under the rlgpu directory).



To see the command-line arguments of the reinforcement learning code NVIDIA provides, run the script with the help flag:

python train.py -h


RL Policy

optional arguments:
  -h, --help            show this help message and exit
  --sim_device SIM_DEVICE
                        Physics Device in PyTorch-like syntax
  --pipeline PIPELINE   Tensor API pipeline (cpu/gpu)
  --graphics_device_id GRAPHICS_DEVICE_ID
                        Graphics Device ID
  --flex                Use FleX for physics
  --physx               Use PhysX for physics
  --num_threads NUM_THREADS
                        Number of cores used by PhysX
  --subscenes SUBSCENES
                        Number of PhysX subscenes to simulate in parallel
  --slices SLICES       Number of client threads that process env slices
  --test                Run trained policy, no training
  --play                Run trained policy, the same as test, can be used only
                        by rl_games RL library
  --resume RESUME       Resume training or start testing from a checkpoint
  --checkpoint CHECKPOINT
                        Path to the saved weights, only for rl_games RL
                        library
  --headless            Force display off at all times
  --horovod             Use horovod for multi-gpu training, have effect only
                        with rl_games RL library
  --task TASK           Can be BallBalance, Cartpole, CartpoleYUp, Ant,
                        Humanoid, Anymal, FrankaCabinet, Quadcopter,
                        ShadowHand, Ingenuity
  --task_type TASK_TYPE
                        Choose Python or C++
  --rl_device RL_DEVICE
                        Choose CPU or GPU device for inferencing policy
                        network
  --logdir LOGDIR
  --experiment EXPERIMENT
                        Experiment name. If used with --metadata flag an
                        additional information about physics engine, sim
                        device, pipeline and domain randomization will be
                        added to the name
  --metadata            Requires --experiment flag, adds physics engine, sim
                        device, pipeline info and if domain randomization is
                        used to the experiment name provided by user
  --cfg_train CFG_TRAIN
  --cfg_env CFG_ENV
  --num_envs NUM_ENVS   Number of environments to create - override config
                        file
  --episode_length EPISODE_LENGTH
                        Episode length, by default is read from yaml config
  --seed SEED           Random seed
  --max_iterations MAX_ITERATIONS
                        Set a maximum number of training iterations
  --steps_num STEPS_NUM
                        Set number of simulation steps per 1 PPO iteration.
                        Supported only by rl_games. If not -1 overrides the
                        config settings.
  --minibatch_size MINIBATCH_SIZE
                        Set batch size for PPO optimization step. Supported
                        only by rl_games. If not -1 overrides the config
                        settings.
  --randomize           Apply physics domain randomization
  --torch_deterministic
                        Apply additional PyTorch settings for more
                        deterministic behaviour
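
For reference, help text in this format comes from Python's standard argparse module. A minimal sketch of how a few of these flags could be defined (an illustration only, not Isaac Gym's actual source; the default values shown are assumptions):

import argparse

# Illustrative parser; flag names match the help output above,
# but defaults are assumptions, not Isaac Gym's real defaults.
parser = argparse.ArgumentParser(description="RL Policy")
parser.add_argument("--sim_device", type=str, default="cuda:0",
                    help="Physics Device in PyTorch-like syntax")
parser.add_argument("--rl_device", type=str, default="cuda:0",
                    help="Device for inferencing the policy network")
parser.add_argument("--physx", action="store_true",
                    help="Use PhysX for physics")
parser.add_argument("--num_threads", type=int, default=0,
                    help="Number of cores used by PhysX")
parser.add_argument("--headless", action="store_true",
                    help="Force display off at all times")
parser.add_argument("--task", type=str, default="Cartpole",
                    help="Task name, e.g. ShadowHand")

args = parser.parse_args()
print(args.task, args.sim_device, args.rl_device)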






Example run commands:

1. Simulation on CPU, training on CPU

Run the simulation environment on the CPU while the PPO deep reinforcement learning algorithm also trains on the CPU:

python train.py --task=ShadowHand --headless --sim_device=cpu --rl_device=cpu --physx --num_threads=24


2. Simulation on CPU, training on GPU

python train.py --task=ShadowHand --headless --sim_device=cpu --rl_device=cuda:0 --physx --num_threads=24
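
Before pointing --rl_device at a GPU, it is worth confirming that PyTorch can actually see a CUDA device. A quick check (assumes PyTorch is installed):

import torch

# Prints True only if a usable CUDA device is visible to PyTorch.
print("CUDA available:", torch.cuda.is_available())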



3. Simulation on GPU, training on CPU

python train.py --task=ShadowHand --headless --sim_device=cuda:0 --rl_device=cpu --physx --num_threads=24



4. Simulation on GPU, training on GPU

For example, simulating on GPU 0 and training on GPU 1:

python train.py --task=ShadowHand --headless --sim_device=cuda:0 --rl_device=cuda:1 --physx --num_threads=24

Or simulating on GPU 1 and training on GPU 0:

python train.py --task=ShadowHand --headless --sim_device=cuda:1 --rl_device=cuda:0 --physx --num_threads=24
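
To find out which index (cuda:0, cuda:1, ...) maps to which physical card, you can list the devices PyTorch sees. A small sketch:

import torch

# Enumerate visible CUDA devices and print each index with its name,
# so the right index can be passed to --sim_device / --rl_device.
for i in range(torch.cuda.device_count()):
    print(f"cuda:{i} ->", torch.cuda.get_device_name(i))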


