Mathematical Research Data Initiative

tokenizers

From MaRDI portal
Software:28295



swMATH: 16425
CRAN: tokenizers
MaRDI QID: Q28295
FDO: Q28295

Fast, Consistent Tokenization of Natural Language Text

Lincoln Mullen

Last update: 22 December 2022

Copyright license: MIT license, File License

Software version identifier: 0.3.0

Source code repository: https://github.com/cran/tokenizers




Cited In (13)

  • tidytext
  • DOLDA: a regularized supervised topic model for high-dimensional multi-class regression
  • textrecipes
  • deeplr
  • DramaAnalysis
  • rslp
  • wactor
  • textfeatures
  • tidypmc
  • pdfsearch
  • proustr
  • covfefe
  • WhatsR


This page was built for software: tokenizers

Retrieved from "https://portal.mardi4nfdi.de/w/index.php?title=Software:28295&oldid=29466592"
This page was last edited on 5 March 2024, at 20:26. Warning: Page may not contain recent updates.