healthcareai now depends on dplyr 1.0.0 and tibble 3.0.0. You will need these versions or later. Various internal changes were made for compatibility with these packages’ latest breaking changes.
The allow_parallel parameter was removed to prevent nested parallelism.
healthcareai now depends on recipes 0.1.4 and caret 6.0.81. You will need these versions or later. Various internal changes were made for compatibility with these packages’ latest breaking changes.
bagimpute in prep_data now accepts bag_trees to specify the number of trees. This was updated to be compatible with recipes 0.1.4.
healthcareai library versions are now saved to model objects.
explore. Make counterfactual predictions across the most-important features in a model to see how those features influence predicted outcomes.
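
As a rough illustration of what a counterfactual prediction is, the sketch below uses only base R rather than healthcareai's explore. It fits a linear model to the built-in mtcars data (a stand-in dataset chosen for this example), varies one feature across several quantiles while holding the other at its median, and compares the resulting predictions. The model, the features, and the quantiles are all assumptions made for illustration.

```r
# Stand-in model: predict fuel economy from weight and horsepower.
fit <- lm(mpg ~ wt + hp, data = mtcars)

# Vary wt across a few quantiles while holding hp fixed at its median.
wt_grid <- quantile(mtcars$wt, probs = c(0.05, 0.25, 0.5, 0.75, 0.95))
counterfactuals <- data.frame(
  wt = as.numeric(wt_grid),
  hp = median(mtcars$hp)
)

# Predictions for these hypothetical observations show how wt alone
# moves the predicted outcome.
counterfactuals$predicted_mpg <- predict(fit, newdata = counterfactuals)
print(counterfactuals)
```

Each row differs only in wt, so any change in the prediction can be attributed to that feature under this model.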
plot method to visualize a model’s logic.
pip. Carefully specify variables and alternative values that exert causal influence on outcomes; then get recommended actions for a given patient, with the expected outcomes given those actions.
outcome_groups argument to predict.
risk_groups argument to predict.
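
To make the idea of risk groups concrete, here is a minimal base R sketch that bins predicted probabilities into labeled groups. It does not call healthcareai; the probabilities, cutoffs, and labels are hypothetical, and the package may define its groups differently (for example, by the share of patients in each group rather than by fixed cutoffs).

```r
# Hypothetical predicted probabilities for six patients.
predicted_prob <- c(0.03, 0.12, 0.35, 0.58, 0.81, 0.97)

# Assumed cutoffs: [0, 0.2] low, (0.2, 0.6] moderate, (0.6, 1] high.
risk_group <- cut(
  predicted_prob,
  breaks = c(0, 0.2, 0.6, 1),
  labels = c("low", "moderate", "high"),
  include.lowest = TRUE
)

print(data.frame(predicted_prob, risk_group))
```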
plot support for outcome- and risk-group predictions.
get_thresholds.
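
The following base R sketch shows, under simplifying assumptions, what evaluating performance at several classification thresholds involves. It is not healthcareai's get_thresholds; the outcome vector, predicted probabilities, and thresholds are made up for illustration, and only two metrics are computed.

```r
# Hypothetical true outcomes (1 = event) and predicted probabilities.
actual <- c(0, 0, 1, 0, 1, 1, 0, 1)
predicted_prob <- c(0.1, 0.3, 0.35, 0.4, 0.6, 0.7, 0.8, 0.9)
thresholds <- c(0.25, 0.5, 0.75)

# For each threshold, classify and compute sensitivity and specificity.
metrics <- do.call(rbind, lapply(thresholds, function(th) {
  predicted <- as.integer(predicted_prob >= th)
  data.frame(
    threshold = th,
    sensitivity = sum(predicted == 1 & actual == 1) / sum(actual == 1),
    specificity = sum(predicted == 0 & actual == 0) / sum(actual == 0)
  )
}))

print(metrics)
```

Raising the threshold trades sensitivity for specificity, which is why it helps to view several metrics side by side before choosing one.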
plot method to compare performance across metrics at various thresholds.
split_train_test can keep multiple observations of an individual in the same split via the grouping_col argument.
NA with make_na.
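
For readers unfamiliar with the problem make_na addresses, the base R sketch below replaces placeholder strings with NA across a data frame. It is an illustration of the general idea and not the package's implementation; the data frame and the list of placeholder strings (missing_strings) are hypothetical.

```r
# Hypothetical data where missing values were recorded as strings.
d <- data.frame(
  age = c("34", "unknown", "51"),
  city = c("Boise", "", "N/A"),
  stringsAsFactors = FALSE
)

# Assumed set of strings that should be treated as missing.
missing_strings <- c("", "unknown", "N/A")

# Replace any matching value in every column with NA.
d[] <- lapply(d, function(col) {
  col[col %in% missing_strings] <- NA
  col
})

print(colSums(is.na(d)))
```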
If missingness finds any such strings, it issues a warning with code that can be used to do the replacement.
rename_with_counts.
summary.missingness method for wide datasets with missingness in many columns.
In prep_data, trigonometric transformations make circular features out of dates and times for more informative features in less-wide data frames.
plot.model_class and summary.model_class.
machine_learn.
missingness is faster.
add_best_levels works in deployment even if none of the columns to be created are present in the deployment observations.
prep_data can handle logical features.
outcome doesn’t need to be re-declared in model training if it was specified in data prep.
caret-trained models into a model_list.
add_best_levels and get_best_levels.
interpret and plot.interpret to extract glmnet estimates.
variable_importance returns random forest or xgboost importances, whichever model performs better.
predict can now write an extensive log file. When that option is activated, as in production, predict is a safe function that always completes: if there is an error, it returns a zero-row data frame that is otherwise the same as what would have been returned (provided prep_data or machine_learn was used).
remove_near_zero_variance argument of prep_data.
separate_drgs returns NA for complication when the DRG is missing.
model_list objects.
methods is attached on attaching the package so that scripts operate the same in Rscript, R GUI, and RStudio.
ggplot2, broom, and recipes.
A whole new architecture featuring a simpler API, more rigor under the hood, and attractive plots.
ranger and caret
methods to maintain functionality across environments
getProcessVariablesDf
RandomForestDeployment and LassoDeployment for usage details
skip_on_not_appveyor will skip a unit test unless it’s being run on Appveyor.
skip_if_no_mssql isn’t needed as a test utility anymore.
XGBoostDevelopment and XGBoostDeployment.
KmeansClustering.
findVariation will return groups with the highest variation of a chosen target measure within a data set.
variationAcrossGroups will plot a boxplot of variation between groups for a chosen target measure.
SupervisedModelDevelopment now saves the model after training.
SupervisedModelDeployment no longer trains models. It only loads the model saved in SupervisedModelDevelopment. Predictions are made for all data.
imputeColumn was replaced with imputeDF.
DBI backend. We support reading and writing to MSSQL and SQLite databases.
testWindowCol is no longer a parameter in SupervisedModelDeployment or used in the algorithms.
writeToDB is no longer a parameter in SupervisedModelDeployment or used in the algorithms.
destSchemaTable is no longer a parameter in SupervisedModelDeployment or used in the algorithms.
getPredictions() in development (lasso, random forest, linear mixed model)