gpt
Here are 120 public repositories matching this topic...
The Split class accepts a SplitDelimiterBehavior, which is really useful. Punctuation, however, always uses SplitDelimiterBehavior::Isolated (while Whitespace, on the other hand, behaves like SplitDelimiterBehavior::Removed).
impl PreTokenizer for Punctuation {
    fn pre_tokenize(&self, pretokenized: &mut PreTokenizedString) -> Result<()> {
        pretokenized.split(|_, s| s.split(is_punc, SplitDelimiterBehavior::Isolated))
    }
}
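To make the distinction concrete, here is a minimal pure-Python sketch of what the two behaviors mentioned above do when splitting on punctuation. The function name and the behavior strings are hypothetical; this only illustrates the semantics of the Rust enum variants, not the library's implementation.

```python
import re

PUNCT = r"([.,!?;:])"  # capturing group so re.split keeps the delimiters

def split_on_punctuation(text, behavior):
    """Illustrate SplitDelimiterBehavior semantics.

    behavior="isolated": each punctuation mark becomes its own piece.
    behavior="removed":  punctuation marks are dropped from the output.
    """
    pieces = [p for p in re.split(PUNCT, text) if p]
    if behavior == "removed":
        pieces = [p for p in pieces if not re.fullmatch(PUNCT, p)]
    return pieces

print(split_on_punctuation("Hello, world!", "isolated"))  # ['Hello', ',', ' world', '!']
print(split_on_punctuation("Hello, world!", "removed"))   # ['Hello', ' world']
```

Exposing the behavior as a parameter, as the issue suggests, would let Punctuation offer the same flexibility that Split already has.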
I'm playing around with this wonderful code, but I'm running into a curious issue when I try to train the model with my own data.
I replicated the personachat_self_original.json file structure and added my own data. I deleted the dataset_cache_OpenAIGPTTokenizer file, but when I try to train I get this error:
INFO:train.py:Pad inputs and convert to Tensor
Traceback (most recent call last):
huggingface/transformers#12276 introduced a new --log_level feature, which now allows users to set their desired log level via CLI or TrainingArguments. run_translation.py was used as a "model" for the other examples. Now we need to replicate this in all other Trainer-based examples under examples/pytorch/; the 3 changes are
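The core idea can be sketched with the standard library alone: parse a --log_level flag and map it onto Python's logging levels. This is an illustration of the mechanism, not transformers' actual implementation (which lives in TrainingArguments); the function and logger names here are hypothetical.

```python
import argparse
import logging

def get_logger(argv):
    # Hypothetical sketch: wire a --log_level CLI flag into the stdlib
    # logging module, the way the examples expose it to users.
    parser = argparse.ArgumentParser()
    parser.add_argument(
        "--log_level",
        default="info",
        choices=["debug", "info", "warning", "error", "critical"],
    )
    args = parser.parse_args(argv)
    logger = logging.getLogger("example")
    # Level names in the logging module are uppercase constants.
    logger.setLevel(getattr(logging, args.log_level.upper()))
    return logger

logger = get_logger(["--log_level", "debug"])
print(logger.level)  # 10, i.e. logging.DEBUG
```

Porting the change to the other Trainer-based scripts then amounts to repeating the same small wiring in each one.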