natural-language-understanding

https://github.com/huggingface/transformers/blob/546dc24e0883e5e9f5eb06ec8060e3e6ccc5f6d7/src/transformers/models/gpt2/modeling_gpt2.py#L698

Assertions can't be relied upon for control flow because they can be disabled, as per the following:

$ python --help
usage: python [option] ... [-c cmd | -m mod | file | -] [arg] ...
...
-O     : remove assert and __debug__-dependent statem
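
The effect of -O can be illustrated with a minimal sketch. The function names below are hypothetical and are not taken from transformers; the sketch only contrasts a check that is stripped in optimized mode with one that always runs.

```python
def check_with_assert(batch_size):
    # Hypothetical helper: this check is removed entirely under `python -O`,
    # so invalid input would pass through silently in optimized mode.
    assert batch_size > 0, "batch_size must be positive"
    return batch_size


def check_with_exception(batch_size):
    # Hypothetical helper: an explicit raise runs regardless of interpreter flags.
    if batch_size <= 0:
        raise ValueError(f"batch_size must be positive, got {batch_size}")
    return batch_size
```

Run normally, both functions reject a non-positive value. Run with python -O, only check_with_exception still does.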

Description

When tokenizers.create is called with the model and vocab file for a custom corpus, the code raises an error and cannot generate the BERT vocab file.

Error Message

ValueError: Mismatch vocabulary! All special tokens specified must be control tokens in the sentencepiece vocabulary.

To Reproduce

from gluonnlp.data import tokenizers
tokenizers.create('spm', model_p
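
One way to make this kind of error more actionable is to name the offending tokens. The following is a minimal sketch, assuming the special tokens and the control tokens are available as plain collections of strings. validate_special_tokens is a hypothetical helper and is not part of gluonnlp or sentencepiece.

```python
def validate_special_tokens(special_tokens, control_tokens):
    """Hypothetical check: every special token must also be a control token."""
    # Collect the special tokens that the vocabulary does not mark as control tokens.
    missing = sorted(set(special_tokens) - set(control_tokens))
    if missing:
        raise ValueError(
            "Mismatch vocabulary! These special tokens are not control tokens "
            f"in the sentencepiece vocabulary: {missing}"
        )
```

Listing the missing tokens tells the user which argument to fix, instead of only reporting that a mismatch exists.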

Here are 539 public repositories matching this topic...

huggingface / transformers

Raise exceptions instead of using assertions for control flow

[Performance] Tracking open Issues and PRs (pytorch transformers)

Getting time offsets of beginning and end of each word in Wav2Vec2

google-research / bert

hanxiao / bert-as-service

ludwig-ai / ludwig

microsoft / nlp-recipes

huggingface / tokenizers

dmlc / gluon-nlp

[Error Message] Improve error message in SentencepieceTokenizer when arguments are not expected.

Use official MXNet batchify to implement the batchify functions

NMT Inference: Chunk overlength sequences and translate in sequence

opencog / opencog

google / sling

namisan / mt-dnn

explosion / spacy-transformers

KartikChugh / Otto

chatopera / insuranceqa-corpus-zh

declare-lab / conv-emotion

MITESHPUTHRANNEU / Speech-Emotion-Analyzer

microsoft / DeBERTa

turtlesoupy / this-word-does-not-exist

huggingface / autonlp

practical-nlp / practical-nlp-code

Decalogue / chat

suragnair / seqGAN

BotLibre / BotLibre

Picovoice / rhino

jayparks / tf-seq2seq

soulbliss / NLP-conference-compendium

graphbrain / graphbrain

chatopera / clause

JohnSnowLabs / nlu

gkiril / oie-resources

Droidtown / ArticutAPI
