transformers
Here are 187 public repositories matching this topic...
Looks like spaCy 2.1 --> 2.2 has changed the way lemmatizer objects are built. See the Stack Overflow answer for details.
I can update the library to account for this migration. I have a fork that I can create a pull request from. Let me know.
Steps to reproduce the behavior:
Run
"fr
-
Updated
Jun 12, 2020 - Jupyter Notebook
Problem
Some of our transformers & estimators are not thoroughly tested or not tested at all.
Solution
Use OpTransformerSpec and OpEstimatorSpec base test specs to provide tests for all existing transformers & estimators.
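The base-spec idea can be sketched in Python (OpTransformerSpec / OpEstimatorSpec are Scala test specs; the classes below are a hypothetical analogue, not the project's API): a shared mixin encodes the contract every transformer must satisfy, and each concrete test class only supplies an instance and sample input.

```python
import unittest


class TransformerSpec:
    """Hypothetical base spec: mix into a TestCase and set `transformer`
    and `sample_input` to inherit the shared contract tests."""

    transformer = None
    sample_input = None

    def test_transform_preserves_length(self):
        # One output row per input row.
        out = self.transformer.transform(self.sample_input)
        self.assertEqual(len(out), len(self.sample_input))

    def test_transform_is_deterministic(self):
        # Same input, same output, on repeated calls.
        a = self.transformer.transform(self.sample_input)
        b = self.transformer.transform(self.sample_input)
        self.assertEqual(a, b)


class Uppercaser:
    """Toy transformer used only to exercise the spec."""

    def transform(self, rows):
        return [r.upper() for r in rows]


class UppercaserSpec(TransformerSpec, unittest.TestCase):
    transformer = Uppercaser()
    sample_input = ["foo", "bar"]
```

Running this with `python -m unittest` exercises both shared tests against `Uppercaser`; every transformer gets the same baseline coverage for the cost of a two-line subclass.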
On the home page of the website https://nlp.johnsnowlabs.com/ I read "Full Python, Scala, and Java support".
Unfortunately, I have now spent 3 days trying to use Spark NLP from Java without any success:
- I cannot find a Java API reference (Javadoc) for the framework.
- Not even a single example in Java is available.
- I do not know Scala, so I do not know how to convert things like:
val testData = spark.createDataFrame(
As Simple Transformers grows, the single page README documentation has gotten quite bloated and difficult to use. Because of this, I've decided that it's time (if not a little late already) to move the documentation to a more user-friendly Github Pages hosted website at the link below.
https://thilinarajapakse.github.io/simpletransformers/
As of now, only the text classification section is
As the documentation at http://mleap-docs.combust.ml/ says, if I want to load a model, I need to place it in the Docker-mounted directory.
But I think that is not convenient. There should be some way to upload the model file in the PUT request itself. Just like the T
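As a sketch of what such an upload might look like from the client side (the endpoint path and server behavior here are assumptions for illustration, not MLeap's actual API), an HTTP PUT carrying the serialized model bytes can be built with the standard library:

```python
import urllib.request


def build_model_upload_request(url, model_bytes):
    """Build (but do not send) a PUT request carrying the model bytes.

    `url` is a hypothetical endpoint; the real serving API may differ.
    """
    return urllib.request.Request(
        url,
        data=model_bytes,
        method="PUT",
        headers={"Content-Type": "application/octet-stream"},
    )


req = build_model_upload_request(
    "http://localhost:65327/models/my-model", b"\x00model-bytes"
)
# urllib.request.urlopen(req) would send it once a server is listening.
```

Sending the model in the request body would remove the need for a shared mounted directory between the client and the serving container.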
GPU Memory Benchmark
I did a few training runs of a simple Reformer module with different parameters and logged the GPU memory usage.
These values can of course vary depending on your machine and setup, but I thought they might be useful as a visual guide:
dim = 512, seq_len = 256, depth = 1, heads = 1, batch_size = 1: 452 MB
dim = 512, seq_len = 256, depth = 1, heads = 1, batch_size = 8: 992 MB
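From the two logged runs one can back out a rough split between fixed memory (weights, optimizer state, overhead) and memory that scales with batch size. This is only a back-of-the-envelope estimate, not an exact accounting of the CUDA allocator:

```python
# Logged measurements from the two runs above (MB).
mem_b1, mem_b8 = 452, 992

# The 7 extra samples in the batch-size-8 run account for the difference.
per_sample = (mem_b8 - mem_b1) / (8 - 1)

# Whatever does not scale with batch size.
fixed = mem_b1 - per_sample

print(f"~{per_sample:.0f} MB per extra sample, ~{fixed:.0f} MB fixed")
# → ~77 MB per extra sample, ~375 MB fixed
```

Estimates like this make it easy to predict whether a larger batch will fit before launching a run.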
Hey! Thanks for the work on this.
I'm wondering how we can use this with Mocha. tsconfig-paths has its own tsconfig-paths/register to make this work:
https://github.com/dividab/tsconfig-paths#with-mocha-and-ts-node
Basically, with Mocha we have to run `mocha -r ts-node/register`, but that wouldn't have the compiler flag.
It would be worthwhile to have the ability to do it, which looks like
The prediction should include a hyperlink to the answer. Clicking the answer in the UI will open the PDF page where the answer/paragraph
I have trained on almost 80 thousand examples across 2000 labels; validation accuracy is almost 92%, but at test time every example's probability is below 0.01.
I have also tried predicting on the training examples.
The methodology outlined in export.md is incredibly out of date. TensorFlow has official Docker binaries now as well
Describe the bug
When using the LMFineTuner and specifying the learning_rate_finder_configs, an error is thrown when passing these configs to finetuner.find_learning_rate(), as suggested in the documentation and in the [Colab example](https://colab.research.google.com/github/Novetta/adaptnlp/blob/master/tutor

spaCy has customizable word-level tokenizers with rules for multiple languages. I think porting that to Rust would add nicely to this package. Having customizable, uniform word-level tokenization across platforms (client web, server) and languages would be beneficial. Currently I don't know of any clean way to write bindings for spaCy's Cython code, or whether it's even possible.
Spacy Tokenizer Code
https:
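As a rough illustration of what spaCy-style rule-based tokenization involves (a simplified Python sketch of the general algorithm — the rule sets and exception table below are invented, not spaCy's or this package's actual rules): split on whitespace, then repeatedly peel off prefix/suffix punctuation, checking a special-case table at each step.

```python
import re

# Simplified, hypothetical rule sets; spaCy's are per-language and
# far more extensive (prefixes, suffixes, infixes, URL matchers, ...).
PREFIXES = re.compile(r"^[\(\[\"']")
SUFFIXES = re.compile(r"[\)\]\"'\.,!?]$")
EXCEPTIONS = {"don't": ["do", "n't"], "U.S.": ["U.S."]}


def tokenize(text):
    tokens = []
    for chunk in text.split():
        suffixes = []
        while chunk:
            # Special cases win over the character rules.
            if chunk in EXCEPTIONS:
                tokens.extend(EXCEPTIONS[chunk])
                chunk = ""
                break
            m = PREFIXES.search(chunk)
            if m:  # strip one leading punctuation character
                tokens.append(m.group())
                chunk = chunk[m.end():]
                continue
            m = SUFFIXES.search(chunk)
            if m:  # strip one trailing punctuation character
                suffixes.insert(0, m.group())
                chunk = chunk[:m.start()]
                continue
            tokens.append(chunk)  # no rule applies: emit as-is
            chunk = ""
        tokens.extend(suffixes)
    return tokens


print(tokenize("(Hello, don't stop.)"))
# → ['(', 'Hello', ',', 'do', "n't", 'stop', '.', ')']
```

Because the rules are just data (regexes plus an exception table), a port would mainly need a language-agnostic engine like the loop above, with the per-language rule sets swappable — which is what would make a uniform cross-platform tokenizer feasible.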