reinforcement-learning
Here are 4,736 public repositories matching this topic...
Description
I am wondering when "Assessing the Factual Accuracy of Generated Text" in https://github.com/tensorflow/tensor2tensor/tree/master/tensor2tensor/data_generators/wikifact will be publicly available, since it has already been 6 months. @bengoodrich
In the Ray Design Agents documentation, the Ray Perception parameter Ray Layer Mask is not mentioned. I am a bit confused about what it does and whether it interacts with the Detectable Tags parameter.

Neither the readme nor readthedocs have install instructions.
I couldn't find it on PyPI or Anaconda, and there doesn't appear to be a pyproject.toml, setup.cfg, setup.py, or conda recipe.
Moreover, the t
Updated Jun 9, 2020 - Jupyter Notebook
Updated May 24, 2020
Vcpkg is a C++ dependency management system that makes it very easy to install a library and consume it as a dependency. We should support this for VW so that the library is as easy to consume as possible.
Instructions for creating a new package can be found here: https://github.com/microsoft/vcpkg/blob/master/docs/examples/packaging-github-repos.md
Hello,
I'm starting to work on building a StarCraft II AI using pysc2.
I followed the tutorial https://itnext.io/build-a-zerg-bot-with-pysc2-2-0-295375d2f58e.
It's clear, but it doesn't answer some important questions.
Is more detailed information available somewhere?
What I want to do is split my army into several groups and send them to different locations.
The same goes for the drones.
The
Reading the documentation, it is not entirely clear to me what the differences are between the joint control modes supported in pybullet.
I will try to recap them here; @erwincoumans, please feel free to jump in to add more details and fill the holes in my understanding.
Let's assume that in all cases (excluding torque), the reference error is the following:
error = kp * (pos_des - pos) + kd *
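The formula above is cut off after `kd *`. As a rough illustration only, the sketch below assumes the common PD form in which the second term is a velocity error, `kd * (vel_des - vel)`; that term, the function name `pd_reference`, and all numeric values are assumptions, not something the snippet or the pybullet documentation confirms.

```python
def pd_reference(pos_des, pos, vel_des, vel, kp, kd):
    """Return kp * position error + kd * velocity error.

    The velocity-error term is an assumption; the source formula is
    truncated after `kd *`.
    """
    return kp * (pos_des - pos) + kd * (vel_des - vel)


# Position-style target: hold 1.0 rad with zero desired velocity.
u_position = pd_reference(pos_des=1.0, pos=0.5, vel_des=0.0, vel=0.5, kp=4.0, kd=1.0)

# Velocity-style target: with kp = 0 only the velocity error contributes.
u_velocity = pd_reference(pos_des=0.0, pos=0.5, vel_des=1.0, vel=0.5, kp=0.0, kd=2.0)
```

Under this assumed form, setting `kp` or `kd` to zero is one way to see how a position-only or velocity-only mode could fall out of the same expression.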
I understand that these two Python files show two different methods of constructing a model. The original n_epoch is 500, which works perfectly for both files. But if I change n_epoch to 20, only tutorial_mnist_mlp_static.py achieves a high test accuracy (~0.97). The other file, tutorial_mnist_mlp_static_2.py, only reaches 0.47.
The models built from these two files look the same to me (the s
Updated Jun 5, 2020 - Python
Updated May 23, 2020
Updated May 29, 2020 - Python
Updated Jun 4, 2020 - Python
I tried some RNN regression learning based on the code in the "PyTorch-Tutorial/tutorial-contents/403_RNN_regressor.py" file, which did not work for me at all.
According to an accepted answer on Stack Overflow (https://stackoverflow.com/questions/52857213/recurrent-network-rnn-wont-learn-a-very-simple-function-plots-shown-in-the-q?noredirect=1#comment92916825_52857213), it turns out that the li
Updated May 5, 2020 - Python
Description
Trax is a library for deep learning that focuses on sequence models and reinforcement learning. It combines performance with code clarity and maintained documentation and tests.
...
Sorry to bother, I'll be brief. I don't think the "maintained documentation" part of the statement is true (yet?). I like the work and I respect every project that goes deep down on neural network
Updated Dec 14, 2019 - Jupyter Notebook
Updated Mar 18, 2020 - JavaScript
Jupyter containers hosted by Coursera cause a lot of trouble. Perhaps more than they are worth.
- They are very limited in terms of lifetime and CPU. The docs say 90 minutes / 0.5-2 CPUs. That's definitely insufficient to train Breakout, for example.
- Updating them is inconvenient. We don't h
Lately I have been running into too many SageMaker issues. Is there any unambiguous documentation on SageMaker instances? I could glean the following from different sources:
- SageMaker instances, SageMaker being a managed service, have nothing to do with EC2 instances.
- Unlike the EC2 console, the SageMaker console has no option to view or increase limits. One has to go directly to the support page a
The OpenAI Gym installation instructions are missing a reference to the "Build Tools for Visual Studio 2019", available from the following site.
https://visualstudio.microsoft.com/downloads/
I also found this by reading the following article.
https://towardsdatascience.com/how-to-install-openai-gym-in-a-windows-environment-338969e24d30
Even though this is an issue in OpenAI Gym, a note in this RE
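In a nutshell
Proposes a model pre-trained on both natural language and program code. As in translation, the natural language and the program code are separated by a separator token during training. In addition to the masking objective of BERT (#959), it uses the replaced-token detection of ELECTRA (#1539) as an objective function. Effectiveness is confirmed on natural-language code search and on missing-word inference (multiple choice).
Paper link
https://arxiv.org/abs/2002.08155
Authors / Affiliations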
Zhangyin Feng, Daya Guo, Duyu Tang, Nan Duan, Xiaocheng Feng, Ming Gong, Linjun Shou, Bing Qin, Ting Liu, Daxin Jiang, Ming Zhou
- Microsoft Research Asia
- Ha
In the updateEdgeStats function, reward is updated by edge.reward += reward, which is consistent with the formula in the paper "Mastering the game of Go without human knowledge".
But many other popular unofficial implementations add v to update the edge reward when the current node belongs to the current player, b
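To make the two conventions concrete, here is a minimal sketch of both backup rules. The `Edge` class, its `player` field, and the function names are hypothetical. The snippet above is cut off before it says what those implementations do for the other player, so the sign flip in `backup_signed` is an assumption about the usual two-player convention, not a quote from any specific repository.

```python
from dataclasses import dataclass


@dataclass
class Edge:
    player: int          # player to move at the edge's parent node (hypothetical field)
    visits: int = 0
    reward: float = 0.0  # accumulated value, as in edge.reward += reward


def backup_plain(path, reward):
    """Add the same reward to every edge on the path (edge.reward += reward)."""
    for edge in path:
        edge.visits += 1
        edge.reward += reward


def backup_signed(path, value, leaf_player):
    """Add +value on edges owned by leaf_player and -value otherwise (assumed rule)."""
    for edge in path:
        edge.visits += 1
        edge.reward += value if edge.player == leaf_player else -value


plain_path = [Edge(player=0), Edge(player=1), Edge(player=0)]
backup_plain(plain_path, 0.5)

signed_path = [Edge(player=0), Edge(player=1), Edge(player=0)]
backup_signed(signed_path, 1.0, leaf_player=0)
```

With the plain rule, any perspective change has to be handled elsewhere, for instance when the value is computed or when edges are selected. With the signed rule, each edge stores the value from its own player's point of view.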
Documentation
Updated Apr 21, 2020 - Python
Updated Apr 21, 2020 - Jupyter Notebook
How can Watcher / WatcherClient be used over a TCP/IP network?
Watcher seems to be a ZMQ server and WatcherClient a ZMQ client, but there is no API or interface to configure the server IP address.
Do I need to implement a class that inherits from WatcherClient?
Can you describe what modifications are needed to replace dynamic_rnn with tf.keras.RNN in the many-to-one example, since dynamic_rnn is now deprecated?
Updated Jun 8, 2020 - Python