PyTorch implementation of Advantage Actor Critic (A2C), Proximal Policy Optimization (PPO), Scalable trust-region method for deep reinforcement learning using Kronecker-factored approximation (ACKTR) and Generative Adversarial Imitation Learning (GAIL).
reinforcement-learning
deep-learning
deep-reinforcement-learning
pytorch
atari
hessian
second-order
continuous-control
actor-critic
ale
mujoco
proximal-policy-optimization
ppo
advantage-actor-critic
a2c
acktr
natural-gradients
roboschool
kfac
kronecker-factored-approximation
Updated Mar 3, 2020 - Python
I've been using rllib for a while and quite like the Trainable and Tune (hyperparameter search) abstractions.
Does coach contain something similar?
I really like coach's documentation. Can someone provide a comparison between the two and explain when each one is the better choice?
Thanks.