Simple Reinforcement learning tutorials
Updated May 29, 2020 - Python
I was surprised to see this loss function, because cross-entropy is generally used when the target is a distribution (i.e. it sums to 1). That is not the case for the advantage estimate. However, I worked out the math, and it does appear to be doing the right thing, which is neat!
I think this trick deserves a comment in the code.
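As a sketch of why the trick works (my own illustration, not code from the repo): scaling the cross-entropy between the taken action (a one-hot "label") and the policy by the advantage gives a gradient of exactly `A * (probs - one_hot(action))`, the REINFORCE policy-gradient term, so the target never needs to sum to 1. The names below (`logits`, `advantage`) are assumed for the example.

```python
import numpy as np

def softmax(z):
    z = z - z.max()          # shift for numerical stability
    e = np.exp(z)
    return e / e.sum()

logits = np.array([1.0, 2.0, 0.5])   # unnormalized action scores (assumed values)
action = 1                            # action actually taken
advantage = 2.3                       # advantage estimate: any real number, not a distribution

probs = softmax(logits)
cross_entropy = -np.log(probs[action])   # ordinary CE against the one-hot action
loss = advantage * cross_entropy         # advantage-weighted surrogate loss

# Analytic gradient of the loss w.r.t. the logits:
# advantage * (probs - one_hot), i.e. the policy-gradient estimator.
one_hot = np.eye(len(logits))[action]
grad = advantage * (probs - one_hot)
```

Because `probs - one_hot` sums to zero, the advantage only rescales the update direction; it never has to behave like a probability target.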