Natural language processing
Natural language processing (NLP) is a field of computer science concerned with the interactions between computers and human language. In the 1950s, Alan Turing published an article that proposed a measure of machine intelligence, now called the Turing test. More recently, techniques such as deep learning have produced state-of-the-art results in many natural-language tasks, such as language modeling and parsing.
A user reported a documentation issue on the mailing list: https://groups.google.com/g/gensim/c/8nobtm9tu-g.
The report shows two problems:
- Something changed in `wmdistance` between 3.8 and 4.0 that is not properly captured in the migration notes.
- The [WMD tutorial](https://radimrehurek.com/gensim_4
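For context, a minimal sketch of how `wmdistance` is typically called (the vector file name is a placeholder; the exact 3.8 to 4.0 behavioural change is what the report asks to have documented):

```python
from gensim.models import KeyedVectors

# Load pretrained word vectors (hypothetical local file).
kv = KeyedVectors.load_word2vec_format("vectors.bin", binary=True)

# Word Mover's Distance between two tokenized documents; this call
# needs an optimal-transport backend (pyemd or POT, depending on the
# gensim version) and is the one whose behaviour reportedly changed.
sent_a = "obama speaks to the media".split()
sent_b = "the president greets the press".split()
print(kv.wmdistance(sent_a, sent_b))
```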
When setting `train_parameters` to `False`, we may often also want to disable dropout/batchnorm, in other words, to run the pretrained model in eval mode.
We've made a small modification to `PretrainedTransformerEmbedder` that allows specifying whether the token embedder should be forced into eval mode during the training phase.
Do you think this feature might be handy? Should I open a PR?
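A minimal sketch of the idea in plain PyTorch (the `FrozenEvalEmbedder` wrapper and its `train_parameters` flag are hypothetical, not the actual `PretrainedTransformerEmbedder` API): overriding `train()` keeps the wrapped transformer in eval mode even while the enclosing model trains.

```python
import torch
from torch import nn

class FrozenEvalEmbedder(nn.Module):
    """Hypothetical wrapper: freeze a pretrained transformer's parameters
    and keep it in eval mode (dropout/batchnorm off) during training."""

    def __init__(self, transformer: nn.Module, train_parameters: bool = False):
        super().__init__()
        self.transformer = transformer
        if not train_parameters:
            for p in self.transformer.parameters():
                p.requires_grad = False

    def train(self, mode: bool = True):
        # Propagate the mode to submodules as usual ...
        super().train(mode)
        # ... then force the wrapped transformer back to eval mode, so
        # dropout and batchnorm behave deterministically even in training.
        self.transformer.eval()
        return self

    def forward(self, *args, **kwargs):
        return self.transformer(*args, **kwargs)
```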
Hi, I would like to propose a better implementation for `test_indices`:
We can remove the unneeded `np.array` casting.

New (cleaner):
```python
test_indices = list(set(range(len(texts))) - set(train_indices))
```

Old:
```python
test_indices = np.array(list(set(range(len(texts))) - set(train_indices)))
```
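A minimal sketch of the surrounding split logic, assuming `texts` is a list of documents and `train_indices` was sampled earlier (both names taken from the snippet above; the sample data is made up):

```python
import random

texts = ["doc a", "doc b", "doc c", "doc d", "doc e"]
train_indices = random.sample(range(len(texts)), k=3)

# Set difference yields the held-out indices; a plain list is enough
# for indexing, so the extra np.array cast buys nothing here.
test_indices = list(set(range(len(texts))) - set(train_indices))
test_texts = [texts[i] for i in test_indices]
print(test_texts)
```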
Hi, I am interested in using the DeBERTa model that was recently implemented here and incorporating it into FARM so that it can also be used in open-domain QA settings through Haystack.
Just wondering why there's only a Slow Tokenizer implemented for DeBERTa, and whether there are plans to create the Fast Tokenizer as well.
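For reference, a minimal sketch of loading the slow DeBERTa tokenizer with `transformers` (the `microsoft/deberta-base` checkpoint name is an assumption; the point is only that `use_fast` must stay off while no fast tokenizer exists):

```python
from transformers import AutoTokenizer

# use_fast=False selects the slow (pure-Python) tokenizer, which is the
# only DeBERTa tokenizer available at the time of this question.
tokenizer = AutoTokenizer.from_pretrained("microsoft/deberta-base", use_fast=False)
print(tokenizer.tokenize("Open-domain QA with DeBERTa"))
```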