#

transformers

Here are 479 public repositories matching this topic...

tokenizers
david-waterworth commented Feb 27, 2021

The Split class accepts a SplitDelimiterBehavior, which is really useful. The Punctuation pre-tokenizer, however, always uses SplitDelimiterBehavior::Isolated (and Whitespace, on the other hand, behaves like SplitDelimiterBehavior::Removed).

impl PreTokenizer for Punctuation {
    fn pre_tokenize(&self, pretokenized: &mut PreTokenizedString) -> Result<()> {
        pretokenized.split(|_, s| s.spl
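
To illustrate the difference between the two behaviors named in the comment above, here is a minimal standalone sketch. It does not use the tokenizers crate: the `DelimiterBehavior` enum and the `split_on` function are hypothetical stand-ins written for this example, and they only approximate what "Isolated" (keep each delimiter as its own piece) and "Removed" (drop the delimiter) mean.

```rust
// Hypothetical, simplified stand-in for the two behaviors discussed above.
// These are NOT the tokenizers crate's own types.
#[derive(Clone, Copy, Debug, PartialEq)]
enum DelimiterBehavior {
    // Drop every delimiter character from the output.
    Removed,
    // Emit every delimiter character as its own piece.
    Isolated,
}

// Split `text` at each character for which `is_delim` returns true,
// handling the delimiter according to `behavior`.
fn split_on<F>(text: &str, is_delim: F, behavior: DelimiterBehavior) -> Vec<String>
where
    F: Fn(char) -> bool,
{
    let mut pieces = Vec::new();
    let mut current = String::new();
    for ch in text.chars() {
        if is_delim(ch) {
            // Flush the piece collected so far, skipping empty pieces.
            if !current.is_empty() {
                pieces.push(std::mem::take(&mut current));
            }
            if behavior == DelimiterBehavior::Isolated {
                pieces.push(ch.to_string());
            }
        } else {
            current.push(ch);
        }
    }
    if !current.is_empty() {
        pieces.push(current);
    }
    pieces
}
```

Under these assumptions, splitting "Hello, world!" on ASCII punctuation with `Isolated` keeps "," and "!" as separate pieces, while splitting on whitespace with `Removed` discards the spaces entirely. Making the behavior a parameter, as the comment suggests for Punctuation, would let one splitter cover both cases.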
pytorch-original-transformer

My implementation of the original transformer model (Vaswani et al.). I've additionally included the playground.py file for visualizing otherwise seemingly hard concepts. Currently includes IWSLT pretrained models.

  • Updated Dec 27, 2020
  • Jupyter Notebook
