Use cross-validation to help select the optimal number of variable groups and the value of gamma.

cv.OHPL(X.cal, y.cal, maxcomp, gamma = seq(0.1, 0.9, 0.1), X.test,
y.test, cv.folds = 5L, G = 30L, type = c("max", "median"),
scale = TRUE, pls.method = "simpls")

## Arguments

| Argument | Description |
|---|---|
| X.cal | Predictor matrix (training) |
| y.cal | Response matrix with one column (training) |
| maxcomp | Maximum number of components for PLS |
| gamma | A vector of the gamma sequence between (0, 1). |
| X.test | Predictor matrix (test) |
| y.test | Response matrix with one column (test) |
| cv.folds | Number of cross-validation folds |
| G | Maximum number of variable groups |
| type | Find the maximum absolute correlation ("max") or find the median of absolute correlation ("median"). Default is "max". |
| scale | Should the predictor matrix be scaled? Default is TRUE. |
| pls.method | Method for fitting the PLS model. Default is "simpls". See the details section in plsr for all possible options. |
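To make the role of cv.folds more concrete, here is a minimal base R sketch of one common way to split samples into folds at random. The helper `make_folds` is hypothetical and is not part of the package; the way cv.OHPL actually assigns folds internally may differ.

```r
# Hypothetical helper (not part of OHPL): randomly assign n samples to k folds
# of near-equal size. This only illustrates the general idea behind cv.folds.
make_folds <- function(n, k) {
  # Repeat fold labels 1..k until there are n of them, then shuffle
  sample(rep(seq_len(k), length.out = n))
}

set.seed(42)
folds <- make_folds(n = 100, k = 5)

# With n = 100 and k = 5, each fold holds exactly 20 samples
table(folds)

# Indices held out in the first fold; the rest would be used for fitting
held.out <- which(folds == 1)
```

Each fold serves once as the held-out set while the model is fitted on the remaining folds, so every candidate setting is scored on data it was not fitted to.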

## Value

A list containing the optimal model, RMSEP, Q2, and other evaluation metrics, along with the optimal number of groups to use in group lasso.
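For readers unfamiliar with the two named metrics, the sketch below computes them from their usual textbook definitions: RMSEP as the root mean squared error of prediction, and Q2 as one minus the ratio of the residual sum of squares to the total sum of squares. The function names `rmsep` and `q2` and the toy vectors are assumptions for illustration; the exact formulas used inside cv.OHPL are not stated here and may differ in detail.

```r
# Illustrative definitions only (assumed, not taken from the OHPL source)

# Root mean squared error of prediction
rmsep <- function(y, y.hat) sqrt(mean((y - y.hat)^2))

# Q2: 1 - (residual sum of squares) / (total sum of squares)
q2 <- function(y, y.hat) 1 - sum((y - y.hat)^2) / sum((y - mean(y))^2)

# Toy observed and predicted values
y.obs  <- c(10, 12, 14, 16)
y.pred <- c(11, 12, 13, 17)

rmsep(y.obs, y.pred)  # sqrt(0.75), about 0.866
q2(y.obs, y.pred)     # 0.85
```

Lower RMSEP and higher Q2 both indicate better predictive performance.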

## Examples

data("wheat")

X <- wheat$x
y <- wheat$protein
n <- nrow(wheat$x)

# split the samples 70/30 into calibration and test sets
set.seed(1001)
samp.idx <- sample(1L:n, round(n * 0.7))
X.cal <- X[samp.idx, ]
y.cal <- y[samp.idx]
X.test <- X[-samp.idx, ]
y.test <- y[-samp.idx]

# this could run a while
# NOT RUN {
cv.fit <- cv.OHPL(
  X.cal, y.cal,
  maxcomp = 6, gamma = seq(0.1, 0.9, 0.1),
  X.test, y.test,
  cv.folds = 5, G = 30, type = "max"
)

# the optimal G and gamma
cv.fit$opt.G
cv.fit$opt.gamma
# }