language-model
Here are 517 public repositories matching this topic...
Updated Dec 1, 2019
PositionalEmbedding
The position embedding in BERT is not the same as the one in the original Transformer. Why not use the form used in BERT?
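For context, the difference the question refers to can be sketched as follows. The original Transformer paper uses a fixed sinusoidal encoding, while BERT learns its position embeddings as an ordinary weight table. This is a minimal pure-Python illustration, not the code of the repository in question; the function names, the seed, and the 0.02 initialisation scale are assumptions made for the example.

```python
import math
import random


def sinusoidal_positions(max_len, d_model):
    """Fixed (non-trainable) encoding in the style of the original Transformer."""
    table = []
    for pos in range(max_len):
        row = []
        for i in range(d_model):
            # Dimensions 2k and 2k+1 share one frequency: 1 / 10000^(2k / d_model).
            angle = pos / (10000 ** ((i // 2 * 2) / d_model))
            # Even dimensions use sine, odd dimensions use cosine.
            row.append(math.sin(angle) if i % 2 == 0 else math.cos(angle))
        table.append(row)
    return table


def learned_positions(max_len, d_model, seed=0):
    """BERT-style table: small random values that training would then update.

    The 0.02 scale is an illustrative choice for this sketch.
    """
    rng = random.Random(seed)
    return [[rng.gauss(0.0, 0.02) for _ in range(d_model)] for _ in range(max_len)]


if __name__ == "__main__":
    print(sinusoidal_positions(2, 4))
    print(learned_positions(2, 4))
```

The sinusoidal table needs no training and is defined for any position, whereas the learned table only covers positions up to the `max_len` it was created with.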
spaCy has customizable word-level tokenizers with rules for multiple languages. I think porting that to Rust would be a nice addition to this package. Having customizable, uniform word-level tokenization across platforms (client web, server) and languages would be beneficial. Currently, I don't know of any clean way to write bindings for spaCy's Cython code, or whether it's even possible.
Spacy Tokenizer Code
https:
Rust documentation
Updated Jun 18, 2020 - Python
Updated Oct 7, 2019 - Python
Updated Jun 6, 2020 - Python
Hi,
First, thanks for releasing this; it has been quite helpful.
Would be great if the README page mentioned in software requirements the dependency on pytorch-qrnn (for QRNN-based models). Currently, following the instructions and running one of the standard QRNN models will just throw a ModuleNotFoundError with no instructions. Would be great if there was a prior mention and/or a try/catch w
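The guarded import asked for here could look roughly like the sketch below. It is a generic helper, not code from the repository: `require_module`, the hint text, and the module name used in the comment are assumptions for illustration.

```python
import importlib


def require_module(module_name, install_hint):
    """Import a module, or fail with a message that says how to fix it."""
    try:
        return importlib.import_module(module_name)
    except ModuleNotFoundError as exc:
        # Re-raise the same error type, but with actionable instructions.
        raise ModuleNotFoundError(
            f"{module_name!r} is required for this model type. {install_hint}"
        ) from exc


# Hypothetical usage for QRNN-based models (module name assumed):
# torchqrnn = require_module(
#     "torchqrnn", "Install pytorch-qrnn as described in the README."
# )
```

Calling the helper only when a QRNN model is requested keeps the dependency optional for users of the other model types.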
Updated Jun 7, 2020
On the home page of the website, https://nlp.johnsnowlabs.com/, I read "Full Python, Scala, and Java support".
Unfortunately, I have been trying to use Spark NLP in Java for 3 days now without any success.
- I cannot find the Java API documentation (JavaDoc) for the framework.
- Not even a single example in Java is available.
- I do not know Scala, and I do not know how to convert things like:
val testData = spark.createDataFrame(
Updated May 4, 2020 - Python
It would be great to have instructions on how to train a language model from scratch - not just loading the paper's model.
Updated Jun 10, 2020 - Python
Updated Jun 11, 2020
Updated Jan 1, 2019 - Python
Hi,
When we try to tokenize the following sentence:
If we use spaCy:
import spacy
a = spacy.load('en_core_web_lg')
doc = a("I like the link http://www.idph.iowa.gov/ohds/oral-health-center/coordinator")
list(doc)
We get:
[I, like, the, link, http://www.idph.iowa.gov, /, ohds, /, oral, -, health, -, center, /, coordinator]
But if we use the spaCy transformer tokenizer:
Updated Jun 20, 2019 - Python
Updated Dec 18, 2017 - Python
Updated Jan 10, 2020 - Python
Updated May 14, 2020 - C++
Updated Jun 15, 2020 - TeX
I think the filenames referred to on lines 4-9 of models.sh should be kaldi-generic-en-tdnn_f-r20190609*, which is downloaded on line 3.
Updated Jun 17, 2020 - Go
Updated Nov 15, 2018 - Jupyter Notebook
Updated Jan 9, 2020 - Python
Updated Jun 2, 2020 - Python
Updated May 29, 2020
Updated Feb 28, 2020 - Jupyter Notebook
  File "main.py", line 40, in <module>
    tf.app.run()
  File "/home/luban/anaconda2/lib/python2.7/site-packages/tensorflow/python/platform/app.py", line 48, in run
    _sys.exit(main(_sys.argv[:1] + flags_passthrough))
  File "main.py", line 30, in main
    train(args)
  File "/nfs/private/proj/chatbot/lib/train.py", line 32, in train
    model = seq2seq_model_utils.create_model(sess, arg
Many models have identical implementations of prune_heads. It would be nice to store that implementation as a method on PretrainedModel and reduce the redundancy.
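The refactoring suggested here could be sketched along these lines: the shared loop lives once on a base class, and each model only says where its attention modules are. All classes and the `_attention_for_layer` hook below are hypothetical toy stand-ins, not the library's actual classes or API.

```python
class ToyAttention:
    """Stand-in for an attention module; only tracks which heads remain."""

    def __init__(self, num_heads):
        self.heads = set(range(num_heads))

    def prune_heads(self, heads):
        self.heads -= set(heads)


class ToyPretrainedModel:
    """Hypothetical base class that owns the shared pruning loop."""

    def _attention_for_layer(self, layer_index):
        # Each concrete model overrides this one small hook.
        raise NotImplementedError

    def prune_heads(self, heads_to_prune):
        # heads_to_prune maps a layer index to the head indices to remove.
        for layer_index, heads in heads_to_prune.items():
            self._attention_for_layer(layer_index).prune_heads(heads)


class ToyBert(ToyPretrainedModel):
    def __init__(self, num_layers=2, num_heads=4):
        self.layers = [ToyAttention(num_heads) for _ in range(num_layers)]

    def _attention_for_layer(self, layer_index):
        return self.layers[layer_index]


if __name__ == "__main__":
    model = ToyBert()
    model.prune_heads({0: [1, 3]})
    print(sorted(model.layers[0].heads))
```

With this shape, a new model class inherits `prune_heads` unchanged and implements only the lookup hook.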