An R package for sparse regression modelling with grouped predictors
(including overlapping groups). grpsel uses the group subset selection
penalty, usually leading to excellent selection and
prediction. Optionally, the group subset penalty can be combined with a
group lasso or ridge penalty for added shrinkage. Linear and logistic
regression are currently supported. See this paper for more
information.
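As a rough sketch of what the combined penalty looks like, the estimator can be written in the following general form. The notation is an assumption made for illustration and is not taken from this README: L is the loss (square or logistic), beta_k is the coefficient vector of group k, g is the number of groups, lambda is the group subset parameter, and gamma is the optional shrinkage parameter. The package may additionally apply group-specific weights, which are omitted here.

```latex
\min_{\beta} \; L(\beta)
  + \lambda \sum_{k=1}^{g} \mathbf{1}\left( \lVert \beta_k \rVert_2 \neq 0 \right)
  + \gamma \sum_{k=1}^{g} \lVert \beta_k \rVert_2^{q},
  \qquad q \in \{1, 2\}
```

Under this notation, q = 1 corresponds to the group lasso penalty, q = 2 to the ridge penalty, and gamma = 0 to the pure group subset penalty.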
To install the latest stable version from CRAN, run the following code:
install.packages('grpsel')
To install the latest development version from GitHub, run the following code:
devtools::install_github('ryan-thompson/grpsel')
The grpsel() function fits a group subset regression model for a
sequence of tuning parameters. The cv.grpsel() function provides a
convenient way to automatically cross-validate these parameters.
library(grpsel)
# Generate some grouped data
set.seed(123)
n <- 100 # Number of observations
p <- 10 # Number of predictors
g <- 5 # Number of groups
group <- rep(1:g, each = p / g) # Group structure
beta <- numeric(p)
beta[which(group %in% 1:2)] <- 1 # First two groups are nonzero
x <- matrix(rnorm(n * p), n, p)
y <- x %*% beta + rnorm(n)
# Fit the group subset selection regularisation path
fit <- grpsel(x, y, group)
coef(fit, lambda = 0.05)
## [,1]
## [1,] 0.1363219
## [2,] 1.0738565
## [3,] 0.9734311
## [4,] 0.8432186
## [5,] 1.1940502
## [6,] 0.0000000
## [7,] 0.0000000
## [8,] 0.0000000
## [9,] 0.0000000
## [10,] 0.0000000
## [11,] 0.0000000
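The coefficient vector above has 11 entries for 10 predictors, so the first entry is presumably the intercept. A minimal base R sketch, under that assumption, of how the nonzero coefficients map back to the selected groups (the values are copied from the output above, and the names beta_hat, slopes and selected are illustrative):

```r
# Coefficients copied from the output above: intercept (assumed) first, then 10 predictors
beta_hat <- c(0.1363219, 1.0738565, 0.9734311, 0.8432186, 1.1940502, rep(0, 6))
group <- rep(1:5, each = 2) # Same group structure as the simulated data

slopes <- beta_hat[-1] # Drop the assumed intercept
selected <- unique(group[slopes != 0]) # Groups with at least one nonzero coefficient
selected # Groups 1 and 2
```

This matches the simulation design, where only the first two groups have nonzero coefficients.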
# Cross-validate the group subset selection regularisation path
fit <- cv.grpsel(x, y, group)
coef(fit)
## [,1]
## [1,] 0.1363219
## [2,] 1.0738565
## [3,] 0.9734311
## [4,] 0.8432186
## [5,] 1.1940502
## [6,] 0.0000000
## [7,] 0.0000000
## [8,] 0.0000000
## [9,] 0.0000000
## [10,] 0.0000000
## [11,] 0.0000000
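The options mentioned at the top (added shrinkage, logistic regression, overlapping groups) might be used along the following lines. This is a sketch rather than a definitive reference: the argument values penalty = 'grSubset+grLasso' and loss = 'logistic', and the list form of group for overlapping groups, are assumptions based on the package documentation and should be checked against the reference manual. The names y_bin, group_overlap and the fit_* objects are illustrative.

```r
library(grpsel)

# Regenerate grouped data as in the example above
set.seed(123)
n <- 100
p <- 10
group <- rep(1:5, each = 2)
beta <- numeric(p)
beta[group %in% 1:2] <- 1
x <- matrix(rnorm(n * p), n, p)
y <- x %*% beta + rnorm(n)

# Group subset penalty combined with a group lasso penalty (assumed argument value)
fit_lasso <- grpsel(x, y, group, penalty = 'grSubset+grLasso')

# Logistic regression with a binary response (assumed argument value)
y_bin <- rbinom(n, 1, plogis(x %*% beta))
fit_logit <- grpsel(x, y_bin, group, loss = 'logistic')

# Overlapping groups given as a list of predictor indices (assumed interface);
# predictor 3 belongs to both the first and the second group
group_overlap <- list(1:3, 3:6, 7:10)
fit_overlap <- grpsel(x, y, group_overlap)
```

Each fitted object can then be passed to coef() in the same way as above.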
See the package vignette or reference manual.