This is the R package implementation of the GPBoost library. See https://github.com/fabsig/GPBoost for more information on the modeling background and the software implementation.
Here is a short example:
# Combine tree-boosting and grouped random effects model
library(gpboost)
data(GPBoost_data, package = "gpboost")
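# Define the random effects model
gp_model <- GPModel(group_data = group_data)
# Train the boosting model together with the random effects model
bst <- gpboost(data = X, label = y, gp_model = gp_model,
               nrounds = 10, objective = "regression_l2")
# Show the estimated random effects model
summary(gp_model)
# Predict for new data and print the predicted response mean
pred <- predict(bst, data = X_test, group_data_pred = group_data_test)
pred$response_mean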
The gpboost package is available on CRAN and can be installed as follows:
install.packages("gpboost", repos = "https://cran.r-project.org")
Installing the package from CRAN is the easiest option. However, the package can also be built from source as described in the following. In short, the main installation steps are:
git clone --recursive https://github.com/fabsig/GPBoost
cd GPBoost
Rscript build_r.R
Below is a more complete installation guide.
You need to install git and CMake first. Note that 32-bit R/Rtools is not supported for custom installation.
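Before cloning and building, it can help to confirm that the required tools are actually on the PATH. The following is a minimal sketch for a POSIX shell (for example Git Bash on Windows); `check_tools` is a hypothetical helper name and is not part of GPBoost.

```shell
# Hypothetical helper (not part of GPBoost): report which tools are on PATH.
# Returns a non-zero status if at least one tool is missing.
check_tools() {
  missing=0
  for tool in "$@"; do
    if command -v "$tool" >/dev/null 2>&1; then
      echo "found: $tool"
    else
      echo "missing: $tool"
      missing=1
    fi
  done
  return "$missing"
}

# git and CMake are the stated prerequisites; Rscript is needed to run build_r.R.
check_tools git cmake Rscript || echo "Install the missing tools before running build_r.R"
```

If a tool is reported as missing even though it is installed, its directory most likely still needs to be added to PATH.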
NOTE: Windows users may need to run with administrator rights (either R or the command prompt, depending on the way you are installing this package).
Installing a 64-bit version of Rtools is mandatory.
After installing Rtools and CMake, be sure the following paths are added to the environment variable PATH. These may have been automatically added when installing other software.
Rtools:
Rtools 3.x, example: C:\Rtools\mingw_64\bin
Rtools 4.x, example (NOTE: two paths are required): C:\rtools40\mingw64\bin and C:\rtools40\usr\bin
If you install with install.packages(), these paths can be added locally in R as follows prior to installation:
Sys.setenv(PATH=paste0(Sys.getenv("PATH"),";C:\\Rtools\\mingw_64\\bin\\;C:\\rtools40\\usr\\bin\\"))
CMake, example: C:\Program Files\CMake\bin
R, example: C:\Program Files\R\R-3.6.1\bin
The default compiler on Windows is Visual Studio (or VS Build Tools), with an automatic fallback to MinGW64 (i.e., it is enough to have only Rtools and CMake). To force the use of MinGW64, you can add the --use-mingw (for R 3.x) or --use-msys2 (for R 4.x) flag (see below).
You can perform the installation with either Apple Clang or gcc. To use gcc, set the CXX and CC environment variables so that the build uses gcc and g++. If you install these from Homebrew, your versions of g++ and gcc are most likely in /usr/local/bin, as shown below.
# replace 8 with the version of gcc installed on your machine
export CXX=/usr/local/bin/g++-8 CC=/usr/local/bin/gcc-8
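Because the version suffix in the compiler paths differs between machines, a quick check that CXX and CC point at real executables can catch a wrong path before the build starts. This is only a sketch: `check_compiler` is a hypothetical helper, and the paths are the example values from above, not values that are guaranteed to exist on your machine.

```shell
# Example paths from above; replace 8 with the version of gcc installed on your machine.
export CXX=/usr/local/bin/g++-8 CC=/usr/local/bin/gcc-8

# Hypothetical helper (not part of GPBoost): succeed only if the path is an executable file.
check_compiler() {
  if [ -x "$1" ]; then
    echo "ok: $1"
  else
    echo "not executable: $1"
    return 1
  fi
}

for compiler in "$CXX" "$CC"; do
  check_compiler "$compiler" || echo "adjust the path above before building"
done
```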
Build and install the R package with the following commands:
git clone --recursive https://github.com/fabsig/GPBoost
cd GPBoost
Rscript build_r.R
The build_r.R script builds the package in a temporary directory called gpboost_r. It will destroy and recreate that directory each time you run the script. That script supports the following command-line options:
--skip-install: Build the package tarball, but do not install it.
--use-gpu: Build a GPU-enabled version of the library.
--use-mingw: Force the use of MinGW toolchain, regardless of R version.
--use-msys2: Force the use of MSYS2 toolchain, regardless of R version.
Note: for the build with Visual Studio/VS Build Tools in Windows, you should use the Windows CMD or PowerShell.
There is currently no continuous integration service set up that automatically runs the unit tests. However, any contribution needs to pass all unit tests in the R-package/tests/testthat directory. These tests can be run using the run_tests_coverage_R_package.R file. In any case, make sure that you run the full set of tests by setting the following environment variable
Sys.setenv(GPBOOST_ALL_TESTS = "GPBOOST_ALL_TESTS")
before running the tests in the R-package/tests/testthat directory.