Converting text to numerical features requires purpose-built procedures, which are implemented here as steps that follow the conventions of the 'recipes' package. These steps allow for tokenization, filtering, counting (tf and tf-idf) and feature hashing.
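As a rough illustration of how such steps chain together, the sketch below tokenizes a text column, keeps only the most frequent tokens, and converts them to tf-idf features. The toy data and the names `reviews`, `text` and `label` are assumptions made for this example and do not come from the package documentation. The step functions used (`step_tokenize()`, `step_tokenfilter()`, `step_tfidf()`) are exported by textrecipes.

```r
library(recipes)
library(textrecipes)

# Hypothetical toy data: `text` and `label` are illustrative names only.
reviews <- tibble::tibble(
  text = c(
    "The food was great and the service was friendly",
    "The service was slow and the food was cold",
    "Great coffee and friendly staff"
  ),
  label = factor(c("pos", "neg", "pos"))
)

# Start a recipe that treats `text` as the only predictor.
rec <- recipe(label ~ text, data = reviews)

# Tokenization: split each document into word tokens.
rec <- step_tokenize(rec, text)

# Filtering: keep only the 5 most frequent tokens across the training data.
rec <- step_tokenfilter(rec, text, max_tokens = 5)

# Counting: turn the remaining tokens into tf-idf columns.
rec <- step_tfidf(rec, text)

# Estimate the steps on the training data, then return the processed data.
features <- bake(prep(rec), new_data = NULL)
```

Swapping `step_tfidf()` for `step_tf()` would give plain term counts instead. Feature hashing is available through `step_texthash()`, which relies on the suggested 'text2vec' package and is therefore left out of this sketch.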
Version: | 1.0.0 |
Depends: | R (≥ 3.4), recipes (≥ 1.0.0) |
Imports: | lifecycle, dplyr, generics (≥ 0.1.0), magrittr, Matrix, purrr, Rcpp, rlang, SnowballC, tibble, tokenizers, vctrs, glue |
LinkingTo: | Rcpp |
Suggests: | covr, janitor, knitr, modeldata, rmarkdown, sentencepiece, spacyr, stopwords, stringi, testthat (≥ 3.0.0), text2vec, textfeatures (≥ 0.3.3), tokenizers.bpe, udpipe, wordpiece |
Published: | 2022-07-02 |
Author: | Emil Hvitfeldt [aut, cre] |
Maintainer: | Emil Hvitfeldt <emilhhvitfeldt at gmail.com> |
BugReports: | https://github.com/tidymodels/textrecipes/issues |
License: | MIT + file LICENSE |
URL: | https://github.com/tidymodels/textrecipes, https://textrecipes.tidymodels.org/ |
NeedsCompilation: | yes |
SystemRequirements: | GNU make, C++11 |
Materials: | README NEWS |
CRAN checks: | textrecipes results |
Reference manual: | textrecipes.pdf |
Vignettes: | Working with n-grams; Cookbook - Using more complex recipes involving text; Under the hood - tokenlist |
Package source: | textrecipes_1.0.0.tar.gz |
Windows binaries: | r-devel: textrecipes_1.0.0.zip, r-release: textrecipes_1.0.0.zip, r-oldrel: textrecipes_1.0.0.zip |
macOS binaries: | r-release (arm64): textrecipes_1.0.0.tgz, r-oldrel (arm64): textrecipes_1.0.0.tgz, r-release (x86_64): textrecipes_1.0.0.tgz, r-oldrel (x86_64): textrecipes_1.0.0.tgz |
Old sources: | textrecipes archive |
Please use the canonical form https://CRAN.R-project.org/package=textrecipes to link to this page.