tokenizers.bpe (Q83103)
From MaRDI portal
Byte Pair Encoding Text Tokenization
| Language | Label | Description | Also known as |
|---|---|---|---|
| default for all languages | No label defined | | |
| English | tokenizers.bpe | Byte Pair Encoding Text Tokenization | |
Statements
15 September 2023
Unsupervised text tokenizer focused on computational efficiency. Wraps the 'YouTokenToMe' library <https://github.com/VKCOM/YouTokenToMe>, a fast implementation of Byte Pair Encoding (BPE) <https://aclanthology.org/P16-1162/>.
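For orientation, the Byte Pair Encoding idea referenced above can be sketched in a few lines of Python. This is a toy illustration of the general merge-learning procedure, not the YouTokenToMe implementation and not the API of tokenizers.bpe. The function names (`learn_bpe`, `merge_pair`, `encode`), the `</w>` end-of-word marker, and the tie-breaking rule are assumptions made for the example.

```python
from collections import Counter


def merge_pair(symbols, pair):
    """Replace every adjacent occurrence of `pair` in `symbols` with one merged symbol."""
    out = []
    i = 0
    while i < len(symbols):
        if i + 1 < len(symbols) and (symbols[i], symbols[i + 1]) == pair:
            out.append(symbols[i] + symbols[i + 1])
            i += 2
        else:
            out.append(symbols[i])
            i += 1
    return tuple(out)


def learn_bpe(words, num_merges):
    """Learn up to `num_merges` merge rules from an iterable of words (toy version)."""
    # Each word starts as a sequence of characters plus an end-of-word marker.
    vocab = Counter(tuple(w) + ("</w>",) for w in words)
    merges = []
    for _ in range(num_merges):
        # Count adjacent symbol pairs, weighted by how often each word occurs.
        pairs = Counter()
        for symbols, freq in vocab.items():
            for a, b in zip(symbols, symbols[1:]):
                pairs[(a, b)] += freq
        if not pairs:
            break
        # Pick the most frequent pair; ties are broken by the pair itself
        # (an arbitrary but deterministic choice made for this sketch).
        best = max(pairs, key=lambda p: (pairs[p], p))
        merges.append(best)
        new_vocab = Counter()
        for symbols, freq in vocab.items():
            new_vocab[merge_pair(symbols, best)] += freq
        vocab = new_vocab
    return merges


def encode(word, merges):
    """Split a word into subword symbols by applying the merges in learned order."""
    symbols = tuple(word) + ("</w>",)
    for pair in merges:
        symbols = merge_pair(symbols, pair)
    return list(symbols)
```

A call such as `encode("lowest", learn_bpe(corpus_words, 10))` would return a list of subword strings whose concatenation is the input word followed by the marker. Production implementations use much more efficient data structures than this repeated full recount.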