natural-language-processing
Natural language processing (NLP) is a field of computer science that studies the interaction between computers and human language. In the 1950s, Alan Turing published an article proposing a measure of machine intelligence, now called the Turing test. Modern techniques such as deep learning have produced state-of-the-art results in language modeling, parsing, and many other natural-language tasks.
Here are 7,241 public repositories matching this topic...
Change tensor.data to tensor.detach(), as discussed in pytorch/pytorch#6990 (comment): tensor.detach() is more robust than tensor.data.
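For illustration, a minimal example of the difference (the tensors here are made up for the demo): both tensor.data and tensor.detach() return a view sharing the same storage, but only .detach() participates in autograd's version tracking, so an unsafe in-place edit is caught at backward() time instead of silently corrupting gradients.

```python
import torch

a = torch.tensor([1.0, 2.0], requires_grad=True)
b = a.sigmoid()      # sigmoid's backward reuses its output b

c = b.detach()       # shares storage with b; versions are still tracked
c.zero_()            # in-place edit of the shared storage

# b.sum().backward() # with .detach(): raises a RuntimeError (version
#                    # mismatch) instead of computing wrong gradients.
# With c = b.data, backward() would run and silently produce incorrect
# gradients for a.
```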
A user reported a documentation issue on the mailing list: https://groups.google.com/g/gensim/c/8nobtm9tu-g.
The report shows two problems:
- Something changed with `wmdistance` between 3.8 and 4.0 that is not properly captured in the Migration notes.
- The [WMD tutorial](https://radimrehurek.com/gensim_4
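For anyone comparing versions, a minimal sketch of calling WMD in gensim 4.x; the toy corpus is made up, and wmdistance additionally requires an optional EMD backend (pyemd or POT, depending on the gensim version):

```python
from gensim.models import Word2Vec

# Toy corpus: wmdistance operates on lists of tokens.
sentences = [["obama", "speaks", "media", "illinois"],
             ["president", "greets", "press", "chicago"]]
model = Word2Vec(sentences, vector_size=50, min_count=1, seed=0)

# In gensim 4.x, WMD lives on the KeyedVectors object (model.wv).
print(model.wv.wmdistance(sentences[0], sentences[1]))
```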
When setting train_parameters to False, we often also want to disable dropout/batch norm, in other words, to run the pretrained model in eval mode.
We've made a small modification to PretrainedTransformerEmbedder that lets the user specify whether the token embedder should be forced into eval mode during the training phase; a sketch of the idea follows.
Do you think this feature might be handy? Should I open a PR?
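Here is a minimal sketch of the idea in plain PyTorch; the wrapper class and its name are hypothetical, not AllenNLP's actual API:

```python
import torch.nn as nn

class FrozenEvalWrapper(nn.Module):
    """Hypothetical wrapper: freezes a pretrained encoder and keeps it in
    eval mode (dropout/batch norm disabled) even while the parent trains."""

    def __init__(self, encoder: nn.Module):
        super().__init__()
        self.encoder = encoder
        for p in self.encoder.parameters():
            p.requires_grad = False  # analogous to train_parameters=False

    def train(self, mode: bool = True):
        super().train(mode)
        self.encoder.eval()  # force eval mode regardless of the parent's
        return self

    def forward(self, *args, **kwargs):
        return self.encoder(*args, **kwargs)
```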
The current PyTorch implementation ignores the argument split_f in the function train_batch_ch13, as shown below.
```python
def train_batch_ch13(net, X, y, loss, trainer, devices):
    if isinstance(X, list):
        # Required for BERT fine-tuning (to be covered later)
        X = [x.to(devices[0]) for x in X]
    else:
        X = X.to(devices[0])
    ...
```

Todo: Define the argument `
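For context, split_f-style callables in multi-GPU training loops typically shard a batch across devices; the exact contract in the book is not shown above, so the following is only an assumed sketch:

```python
import torch

def split_batch(X, y, devices):
    """Generic sketch of a split_f-style helper: shard a batch of inputs
    and labels evenly across the given devices (contract assumed)."""
    assert X.shape[0] == y.shape[0]
    n = X.shape[0] // len(devices)
    return ([X[i * n:(i + 1) * n].to(d) for i, d in enumerate(devices)],
            [y[i * n:(i + 1) * n].to(d) for i, d in enumerate(devices)])
```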
Hello spoooopyyy hackers
This is a Hacktoberfest-only issue!
This is also data-sciency!
The Problem
Our English dictionary contains words that aren't English, and does not contain common English words.
Examples of non-common words in the dictionary (a frequency-based filter is sketched after the list):

```
"hlithskjalf",
"hlorrithi",
"hlqn",
"hm",
"hny",
"ho",
"hoactzin",
"hoactzine
```
Hi, I am interested in using the DeBERTa model that was recently implemented here and incorporating it into FARM so that it can also be used in open-domain QA settings through Haystack.
Just wondering why there's only a Slow Tokenizer implemented for DeBERTa, and whether there are plans to create a Fast Tokenizer as well.
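In the meantime, the slow tokenizer path works; a minimal sketch with the transformers Auto classes (microsoft/deberta-base is the public Hub checkpoint ID):

```python
from transformers import AutoModel, AutoTokenizer

# use_fast=False selects the slow (pure-Python) tokenizer, currently the
# only implementation available for DeBERTa.
tokenizer = AutoTokenizer.from_pretrained("microsoft/deberta-base",
                                          use_fast=False)
model = AutoModel.from_pretrained("microsoft/deberta-base")

inputs = tokenizer("DeBERTa improves BERT with disentangled attention.",
                   return_tensors="pt")
outputs = model(**inputs)
print(outputs.last_hidden_state.shape)  # (batch, seq_len, hidden_size)
```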