Tokenizers break text into pieces that machine learning models can use more readily. Many tokenizers share common preparation steps. This package provides those shared steps, along with a simple tokenizer.
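As a rough orientation, here is a minimal usage sketch. The function names prepare_text() and prepare_and_tokenize() are assumptions about the exported API; consult the reference manual listed below for the authoritative functions and signatures.

    # Minimal sketch, assuming the package exports prepare_text() and
    # prepare_and_tokenize(); see the reference manual for the actual API.
    library(piecemaker)

    text <- c(
      "This is a sentence, with punctuation!",
      "And a second   one with   irregular spacing."
    )

    # Shared preparation steps (assumed to include cleaning up whitespace
    # and spacing out punctuation).
    cleaned <- prepare_text(text)

    # Simple tokenizer: prepare the text, then split it on whitespace
    # (assumed behavior of prepare_and_tokenize()).
    tokens <- prepare_and_tokenize(text)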
Version: 1.0.1
Depends: R (≥ 2.10)
Imports: rlang (≥ 0.4.2), stringi, stringr
Suggests: testthat (≥ 3.0.0)
Published: 2022-03-03
Author: Jon Harmon [aut, cre], Jonathan Bratt [aut], Bedford Freeman & Worth Pub Grp LLC DBA Macmillan Learning [cph]
Maintainer: Jon Harmon <jonthegeek at gmail.com>
BugReports: https://github.com/macmillancontentscience/piecemaker/issues
License: Apache License (≥ 2)
URL: https://github.com/macmillancontentscience/piecemaker
NeedsCompilation: no
Materials: README NEWS
CRAN checks: piecemaker results
Reference manual: piecemaker.pdf
Package source: piecemaker_1.0.1.tar.gz
Windows binaries: r-devel: piecemaker_1.0.1.zip, r-release: piecemaker_1.0.1.zip, r-oldrel: piecemaker_1.0.1.zip
macOS binaries: r-release (arm64): piecemaker_1.0.1.tgz, r-oldrel (arm64): piecemaker_1.0.1.tgz, r-release (x86_64): piecemaker_1.0.1.tgz, r-oldrel (x86_64): piecemaker_1.0.1.tgz
Old sources: piecemaker archive
Reverse imports: morphemepiece, wordpiece
Please use the canonical form https://CRAN.R-project.org/package=piecemaker to link to this page.