
tokenizers

From MaRDI portal



swMATH: 16425
CRAN: tokenizers
MaRDI QID: Q28295
FDO: Q28295

Fast, Consistent Tokenization of Natural Language Text

Lincoln Mullen

Last update: 22 December 2022

Copyright license: MIT license, File License

Software version identifier: 0.3.0

Official website: https://cran.r-project.org/web/packages/tokenizers/index.html

Source code repository: https://github.com/cran/tokenizers




Cited In (19)

  • lda
  • topicmodels
  • Medlda
  • tidytext
  • spacyr
  • DOLDA: a regularized supervised topic model for high-dimensional multi-class regression
  • textrecipes
  • Twitmo
  • TTLocVis
  • deeplr
  • DramaAnalysis
  • rslp
  • wactor
  • textfeatures
  • tidypmc
  • pdfsearch
  • proustr
  • covfefe
  • WhatsR


This page was built for software: tokenizers

Retrieved from "https://portal.mardi4nfdi.de/w/index.php?title=Tokenizers&oldid=56279382"
This page was last edited on 13 March 2026, at 06:53. Warning: Page may not contain recent updates.