- Logistic regression is the default for classification in `textTrain()`.
- `model_max_length` in `textEmbed()`.
- `textModels()` shows downloaded models.
- `textModelsRemove()` deletes specified models.
- Fixed `textSimilarityTest()` when an uneven number of cases is tested.
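A minimal sketch of how these model-management additions might be used together; the texts, the model name, and the `model_max_length` value are illustrative assumptions rather than documented defaults:

```r
library(text)

# Create word embeddings, capping the tokenized input length via
# model_max_length (the model name and length here are just examples).
embeddings <- textEmbed(
  c("I feel calm and in harmony", "This week has been stressful"),
  model = "bert-base-uncased",
  model_max_length = 128
)

# Show which language models have been downloaded locally.
textModels()

# Delete a specific downloaded model that is no longer needed.
textModelsRemove("bert-base-uncased")
```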
- `textDistance()` function with distance measures.
- `textSimilarity()` in `textSimilarityTest()`, `textProjection()` and `textCentrality()` for plotting.
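A sketch of calling the similarity and distance helpers on two compatible word embeddings; the bundled example object `word_embeddings_4` and its element names used below are assumptions about the package's example data:

```r
library(text)

# Two word embeddings of equal dimensions to compare.
x <- word_embeddings_4$harmonywords
y <- word_embeddings_4$satisfactionwords

# Row-wise semantic similarity between the two embeddings.
similarity_scores <- textSimilarity(x, y)

# The corresponding distance measure.
distance_scores <- textDistance(x, y)
```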
- `textTrainRegression()` concatenates word embeddings when provided with a list of several word embeddings (see the sketch below).
- `word_embeddings_4$singlewords_we`.
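A sketch of training on a list of word embeddings, which `textTrainRegression()` concatenates before fitting; the embedding elements, the outcome variable, and the `$results` element are assumptions based on the package's bundled example data:

```r
library(text)

# Passing a list of word embeddings makes textTrainRegression() concatenate
# them into one predictor set before training the regression model.
trained <- textTrainRegression(
  x = list(
    word_embeddings_4$harmonywords,
    word_embeddings_4$satisfactionwords
  ),
  y = Language_based_assessment_data_8$hilstotal
)

# Cross-validated performance (assuming it is stored under $results).
trained$results
```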
- In `textCentrality()`, words to be plotted are selected with `word_data1_all$extremes_all_x >= 1` (rather than `== 1`).
- `textSimilarityMatrix()` computes semantic similarity among all combinations in a given word embedding (see the sketch below).
- `textDescriptives()` gets options to remove NA and compute total scores.
- In `textrpp_initiate()`, tokenization is made with NLTK from Python.
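A sketch combining the two descriptive helpers above; the data objects are assumed bundled examples, and the arguments controlling NA removal and total scores are not shown because their names are not given here:

```r
library(text)

# Semantic similarity among all pairwise combinations of rows within a
# single word embedding.
similarity_matrix <- textSimilarityMatrix(word_embeddings_4$harmonywords)

# Descriptive statistics for the raw text responses.
descriptives <- textDescriptives(Language_based_assessment_data_8$harmonywords)
```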
- `textWordPredictions()` (which has a trial period/is not fully developed and might be removed in future versions); p-values are not yet implemented.
- `textPlot()` for objects from both `textProjection()` and `textWordPredictions()`.
- `textrpp_initiate()` runs automatically in `library(text)` when the default environment exists.
- `textSimilarityTest()`.
- Changed from `stringr` to `stringi` (and removed `tokenizer`) as imported package.
- `textrpp_install()` installs a conda environment with the Python packages required by text.
- `textrpp_install_virtualenv()` installs a virtual environment with the Python packages required by text.
- `textrpp_initialize()` initializes the installed environment.
- `textrpp_uninstall()` uninstalls the conda environment.
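A sketch of the one-time Python setup with these helpers, assuming the default arguments are acceptable:

```r
library(text)

# Install a conda environment containing the Python packages text needs,
# then initialize it for this session.
textrpp_install()
textrpp_initialize()

# Alternative: use a virtual environment instead of conda.
# textrpp_install_virtualenv()

# Remove the conda environment again when it is no longer needed.
# textrpp_uninstall()
```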
- `textEmbed()` and `textEmbedLayersOutput()` support the use of a GPU via the `device` setting.
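A sketch of requesting GPU computation through the `device` setting; the value `"gpu"` and the input data are assumptions:

```r
library(text)

# Compute embeddings on the GPU; "gpu" is an assumed accepted value for the
# device setting (fall back to "cpu" if no GPU is available).
embeddings_gpu <- textEmbed(
  Language_based_assessment_data_8$harmonywords,
  device = "gpu"
)
```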
- `remove_words` makes it possible to remove specific words from `textProjectionPlot()` (see the sketch below).
- In `textProjection()` and `textProjectionPlot()` it is now possible to add points of the aggregated word embeddings in the plot.
- In `textProjection()` it is now possible to manually add words to the plot in order to explore them in the word embedding space.
- In `textProjection()` it is possible to add color to or remove words that are more frequent on the opposite “side” of their dot product projection.
- In `textProjection()` with `split == quartile`, the comparison distribution is now based on the quartile data (rather than the data for the mean).
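A sketch of projecting and plotting words with the options noted above; `split` and `remove_words` come from the notes, while the positional argument order of `textProjection()` (words, word embeddings, single-word embeddings, numeric variable), the data objects, and the `$final_plot` element are assumptions:

```r
library(text)

# Project single words onto the dimension defined by the harmony ratings,
# comparing against the quartile-based distribution.
projection <- textProjection(
  Language_based_assessment_data_8$harmonywords,   # raw words
  word_embeddings_4$harmonywords,                  # their word embeddings
  word_embeddings_4$singlewords_we,                # single-word embeddings
  Language_based_assessment_data_8$hilstotal,      # numeric variable (x)
  split = "quartile"
)

# Plot the projection while dropping a few specific words from the figure;
# the remove_words values are arbitrary examples.
projection_plot <- textProjectionPlot(
  projection,
  remove_words = c("the", "and")
)
projection_plot$final_plot   # assumed element holding the ggplot object
```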
- `textEmbed()` with `decontexts = TRUE`.
- `textSimilarityTest()` no longer gives an error when using `method = unpaired` with an unequal number of participants in each group.
- `textPredictTest()` function to significance test correlations of different models.

# 0.9.11

This version is now on CRAN.

### New Features
- Adding option to deselect the `step_centre` and `step_scale` in training.
- Cross-validation method in `textTrainRegression()` and `textTrainRandomForest()` has two options: `cv_folds` and `validation_split` (see the sketch below). (0.9.02)
- Better handling of `NA` in `step_naomit` in training.
- DistilBert model works. (0.9.03)
- `textProjectionPlot()` plots words extreme in more than just one feature (i.e., words are now plotted that satisfy, for example, both `plot_n_word_extreme` and `plot_n_word_frequency`). (0.9.01)
- `textTrainRegression()` and `textTrainRandomForest()` also have a function that selects the maximum evaluation measure results (before, only the minimum was selected all the time, which, e.g., was correct for rmse but not for r). (0.9.02)
- `id_nr` in training and predict by using workflows. (0.9.02)
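A sketch of choosing the cross-validation strategy during training; `cv_method` is an assumed argument name for switching between the two options listed above, and the data objects are assumed bundled examples:

```r
library(text)

# Train a regression model using k-fold cross-validation; the alternative
# is cv_method = "validation_split". (cv_method is an assumed argument name.)
model_folds <- textTrainRegression(
  x = word_embeddings_4$harmonywords,
  y = Language_based_assessment_data_8$hilstotal,
  cv_method = "cv_folds"
)

# Cross-validated evaluation results (assumed to be stored under $results).
model_folds$results
```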