gym
Here are 698 public repositories matching this topic...
🐛 Bug
The documentation of the DQN agent (https://stable-baselines3.readthedocs.io/en/master/modules/dqn.html) specifies that the log_interval parameter is "The number of timesteps before logging". However, when it is set to 1 (or any other value), logging does not happen at that pace; it happens every log_interval episodes (not timesteps). In the example below, this means a log entry every 200 timesteps.
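A minimal sketch of the mismatch, assuming fixed-length episodes of 200 steps as in the reported case. It does not call stable-baselines3; the function name and its parameters are hypothetical and only simulate an interval that counts finished episodes rather than timesteps.

```python
def logged_timesteps(total_timesteps, episode_length, log_interval):
    """Return the timesteps at which a log entry is emitted when
    log_interval counts finished episodes rather than timesteps."""
    logged = []
    episodes_done = 0
    for t in range(1, total_timesteps + 1):
        if t % episode_length == 0:  # an episode ends at this timestep
            episodes_done += 1
            if episodes_done % log_interval == 0:
                logged.append(t)
    return logged


# With log_interval=1, a log entry appears every 200 timesteps, not every timestep.
print(logged_timesteps(total_timesteps=1000, episode_length=200, log_interval=1))
# → [200, 400, 600, 800, 1000]
```

If the interval counted timesteps as documented, log_interval=1 would produce 1000 entries here instead of 5.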
Use case
Get better results for exercises and especially ingredients with full text search
Proposal
If using Postgres, we should use its full text search capabilities so that we get better results and smooth out typos (search in exercises.api.views and nutrition.api.views). A short check of the connection engine should make it easy to use the current filter if that's not the case. W
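The engine check described above could be sketched roughly as follows, assuming a Django project. The helper name search_lookup is hypothetical, and the vendor string is passed in so the sketch runs without Django; in a view it would presumably come from django.db.connection.vendor. The fallback uses icontains as a stand-in for "the current filter", which is an assumption.

```python
def search_lookup(vendor, field, term):
    """Build Django-style filter kwargs for a text search.

    vendor: database vendor string, e.g. django.db.connection.vendor.
    field:  model field to search (hypothetical, e.g. "name").
    """
    if vendor == "postgresql":
        # `__search` is the full text lookup provided by django.contrib.postgres.
        return {f"{field}__search": term}
    # Assumed fallback for other databases: plain substring match.
    return {f"{field}__icontains": term}


# Hypothetical usage inside a view:
#   Ingredient.objects.filter(**search_lookup(connection.vendor, "name", term))
print(search_lookup("postgresql", "name", "chicken"))
print(search_lookup("sqlite", "name", "chicken"))
```

Keeping the choice in one helper means both search views can share it, and other databases keep their existing behaviour.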
Per this comment in #12
There seem to be some vulnerabilities in our code that might fail easily. I suggest adding more unit tests for the following:
- Custom agents (there's only VPG and PPO on CartPole-v0 as of now. We should preferably add more to cover discrete-offpolicy, continuous-offpolicy and continuous-onpolicy)
- Evaluation for the Bandits and Classical agents
- Testing of convergence of agents as proposed i
The following applies to DDPG and TD3, and possibly other models. The following libraries were installed in a virtual environment:
numpy==1.16.4
stable-baselines==2.10.0
gym==0.14.0
tensorflow==1.14.0
Episode rewards do not seem to be updated in model.learn() before callback.on_step(). Depending on which callback.locals variable is used, this means that: