Title: | Predicting Categorical and Continuous Outcomes Using One in Ten Rule |
---|---|
Description: | Predicts categorical or continuous outcomes while concentrating on a number of key points. These are Cross-validation, Accuracy, Regression and Rule of Ten or "one in ten rule" (CARRoT), and, in addition to it R-squared statistics, prior knowledge on the dataset etc. It performs the cross-validation specified number of times by partitioning the input into training and test set and fitting linear/multinomial/binary regression models to the training set. All regression models satisfying chosen constraints are fitted and the ones with the best predictive power are given as an output. Best predictive power is understood as highest accuracy in case of binary/multinomial outcomes, smallest absolute and relative errors in case of continuous outcomes. For binary case there is also an option of finding a regression model which gives the highest AUROC (Area Under Receiver Operating Curve) value. The option of parallel toolbox is also available. Methods are described in Peduzzi et al. (1996) <doi:10.1016/S0895-4356(96)00236-3> , Rhemtulla et al. (2012) <doi:10.1037/a0029315>, Riley et al. (2018) <doi:10.1002/sim.7993>, Riley et al. (2019) <doi:10.1002/sim.7992>. |
Authors: | Alina Bazarova [aut, cre], Marko Raseta [aut] |
Maintainer: | Alina Bazarova <[email protected]> |
License: | GPL-2 |
Version: | 3.0.2 |
Built: | 2025-02-26 03:14:23 UTC |
Source: | https://github.com/cran/CARRoT |
Function enables efficient computation of area under receiver operating curve (AUC). Source: https://stat.ethz.ch/pipermail/r-help/2005-September/079872.html
AUC(probs, class)
AUC(probs, class)
probs |
probabilities |
class |
outcomes |
A value for AUC
AUC(runif(100,0,1),rbinom(100,1,0.3))
AUC(runif(100,0,1),rbinom(100,1,0.3))
Function which averages out the predictive power over all cross-validations
av_out(preds,crv,k)
av_out(preds,crv,k)
preds |
An M x |
crv |
number of cross-validations |
k |
size of the test set for which the predictions are made |
Returns an M x N matrix of average predictive powers where M is maximum feasible number of variables included in a regression, N is the maximum feasible number of regressions of the fixed size; the row index indicates the number of variables included in a regression
#creating a matrix of predictive powers preds<-cbind(matrix(runif(40,1,4),ncol=10),matrix(runif(40,1.5,4),ncol=10)) preds<-cbind(preds,matrix(runif(40,1,3.5),ncol=10)) #running the function av_out(preds,3,5)
#creating a matrix of predictive powers preds<-cbind(matrix(runif(40,1,4),ncol=10),matrix(runif(40,1.5,4),ncol=10)) preds<-cbind(preds,matrix(runif(40,1,3.5),ncol=10)) #running the function av_out(preds,3,5)
Function for combining outputs in a list
comb(...)
comb(...)
... |
an argument of |
Function mapply
#array of numbers to be separated in a list a<-1:4 #running the function comb(a)
#array of numbers to be separated in a list a<-1:4 #running the function comb(a)
Function which computes the maximum number of regressions with fixed number of variables based on the rule of thumb
compute_max_length(vari_col,k,c,we,minx,maxx,st)
compute_max_length(vari_col,k,c,we,minx,maxx,st)
vari_col |
number of predictors |
k |
maximum weight of the predictors |
c |
array of all indices of the predictors |
we |
array of weights of the predictors. Continuous or categorical numerical variable with more then 5 categories has weight 1, otherwise it has weight |
minx |
minimum number of predictors, 1 by default |
maxx |
maximum number of predictors, total number of variables by default |
st |
a subset of predictors to be always included into a predictive model |
Integer correponding to maximum number of regressions of the same size
Peduzzi P, Concato J, Kemper E, Holford TR, Feinstein AR (1996). “A simulation study of the number of events per variable in logistic regression analysis.” Journal of Clinical Epidemiology, 49(12), 1373-1379. ISSN 0895-4356, doi:10.1016/S0895-4356(96)00236-3, http://dx.doi.org/10.1016/S0895-4356(96)00236-3.
Rhemtulla M, Brosseau-Liard PÉ, Savalei V (2012). “When can categorical variables be treated as continuous?: A comparison of robust continuous and categorical SEM estimation methods under suboptimal conditions.” Psychological Methods, 17(3), 354-373. doi:10.1037/a0029315.
Function uses combn
compute_max_length(4,40,1:4,c(1,1,2,1))
compute_max_length(4,40,1:4,c(1,1,2,1))
Function which computes maximal weight (multiplied by the corresponding EPV rule) of a regression according to the rule of thumb applied to the outcome variable. Weight of a regression equals the sum of weights of its predictors.
compute_max_weight(outi,mode)
compute_max_weight(outi,mode)
outi |
set of outcomes |
mode |
indicates the mode: 'linear' (linear regression), 'binary' (logistic regression), 'multin' (multinomial regression) |
For continuous outcomes it equals sample size divided by 10, for multinomial it equals the size of the smallest category divided by 10
returns an integer value of maximum allowed weight multiplied by 10
Peduzzi P, Concato J, Kemper E, Holford TR, Feinstein AR (1996). “A simulation study of the number of events per variable in logistic regression analysis.” Journal of Clinical Epidemiology, 49(12), 1373-1379. ISSN 0895-4356, doi:10.1016/S0895-4356(96)00236-3, http://dx.doi.org/10.1016/S0895-4356(96)00236-3.
#continuous outcomes compute_max_weight(runif(100,0,1),'linear') #binary outcomes compute_max_weight(rbinom(100,1,0.4),'binary')
#continuous outcomes compute_max_weight(runif(100,0,1),'linear') #binary outcomes compute_max_weight(rbinom(100,1,0.4),'binary')
Function which computes the weight of each predictor according to the rules of thumb and outputs it into corresponding array
compute_weights(vari_col, vari)
compute_weights(vari_col, vari)
vari_col |
number of predictors |
vari |
set of predictors |
Continuous or categorical numerical variable with more then 5 categories has weight 1, otherwise it has weight n-1
where n
is the number of categories
Returns an array of weights of the size vari_col
Peduzzi P, Concato J, Kemper E, Holford TR, Feinstein AR (1996). “A simulation study of the number of events per variable in logistic regression analysis.” Journal of Clinical Epidemiology, 49(12), 1373-1379. ISSN 0895-4356, doi:10.1016/S0895-4356(96)00236-3, http://dx.doi.org/10.1016/S0895-4356(96)00236-3.
Rhemtulla M, Brosseau-Liard PÉ, Savalei V (2012). “When can categorical variables be treated as continuous?: A comparison of robust continuous and categorical SEM estimation methods under suboptimal conditions.” Psychological Methods, 17(3), 354-373. doi:10.1037/a0029315.
#creating data-set with for variables a<-matrix(NA,nrow=100,ncol=4) #binary variable a[,1]=rbinom(100,1,0.3) #continuous variable a[,2]=runif(100,0,1) #categorical numeric with les than 5 categories a[,3]=t(rmultinom(100,1,c(0.2,0.3,0.5)))%*%c(1,2,3) #categorical numeric with 5 categories a[,4]=t(rmultinom(100,1,c(0.2,0.3,0.3,0.1,0.1)))%*%c(1,2,3,4,5) #running the function compute_weights(4,a)
#creating data-set with for variables a<-matrix(NA,nrow=100,ncol=4) #binary variable a[,1]=rbinom(100,1,0.3) #continuous variable a[,2]=runif(100,0,1) #categorical numeric with les than 5 categories a[,3]=t(rmultinom(100,1,c(0.2,0.3,0.5)))%*%c(1,2,3) #categorical numeric with 5 categories a[,4]=t(rmultinom(100,1,c(0.2,0.3,0.3,0.1,0.1)))%*%c(1,2,3,4,5) #running the function compute_weights(4,a)
Function running a single cross-validation by partitioning the data into training and test set
cross_val( vari, outi, c, rule, part, l, we, vari_col, preds, mode, cmode, predm, cutoff, objfun, minx = 1, maxx = NULL, nr = NULL, maxw = NULL, st = NULL, corr = 1, Rsq = F, marg = 0, n_tr, preds_tr )
cross_val( vari, outi, c, rule, part, l, we, vari_col, preds, mode, cmode, predm, cutoff, objfun, minx = 1, maxx = NULL, nr = NULL, maxw = NULL, st = NULL, corr = 1, Rsq = F, marg = 0, n_tr, preds_tr )
vari |
set of predictors |
outi |
array of outcomes |
c |
set of all indices of the predictors |
rule |
an Events per Variable (EPV) rule, defaults to 10 |
part |
indicates partition of the original data-set into training and test set in a proportion |
l |
number of observations |
we |
weights of the predictors |
vari_col |
overall number of predictors |
preds |
array to write predictions for the test split into, intially empty |
mode |
|
cmode |
|
predm |
|
cutoff |
cut-off value for logistic regression |
objfun |
|
minx |
minimum number of predictors to be included in a regression, defaults to 1 |
maxx |
maximum number of predictors to be included in a regression, defaults to maximum feasible number according to one in ten rule |
nr |
a subset of the data-set, such that |
maxw |
maximum weight of predictors to be included in a regression, defaults to maximum weight according to one in ten rule |
st |
a subset of predictors to be always included into a predictive model,defaults to empty set |
corr |
maximum correlation between a pair of predictors in a model |
Rsq |
whether R-squared statistics constrained is introduced |
marg |
margin of error for R-squared statistics constraint |
n_tr |
size of the training set |
preds_tr |
array to write predictions for the training split into, intially empty |
regr |
An M x N matrix of sums of the absolute errors for each element of the test set for each feasible regression. M is maximum feasible number of variables included in a regression, N is the maximum feasible number of regressions of the fixed size; the row index indicates the number of variables included in a regression. Therefore each row corresponds to results obtained from running regressions with the same number of variables and columns correspond to different subsets of predictors used. |
regrr |
An M x N matrix of sums of the relative errors for each element of the test set (only for |
nvar |
Maximum feasible number of variables in the regression |
emp |
An accuracy of always predicting the more likely outcome as suggested by the training set (only for |
In regr
and regrr
NA
values are possible since for some numbers of variables there are fewer feasible regressions than for the others.
Uses compute_max_weight
, sum_weights_sub
, make_numeric_sets
, get_predictions_lin
, get_predictions
, get_probabilities
, AUC
, combn
#creating variables vari<-matrix(c(1:100,seq(1,300,3)),ncol=2) #creating outcomes out<-rbinom(100,1,0.3) #creating array for predictions pr<-array(NA,c(2,2)) pr_tr<-array(NA,c(2,2)) #passing set of the inexes of the predictors c<-c(1:2) #passing the weights of the predictors we<-c(1,1) #setting the mode m<-'binary' #running the function cross_val(vari,out,c,10,10,100,we,2,pr,m,'det','exact',0.5,'acc',nr=c(1,4),n_tr=90,preds_tr=pr_tr)
#creating variables vari<-matrix(c(1:100,seq(1,300,3)),ncol=2) #creating outcomes out<-rbinom(100,1,0.3) #creating array for predictions pr<-array(NA,c(2,2)) pr_tr<-array(NA,c(2,2)) #passing set of the inexes of the predictors c<-c(1:2) #passing the weights of the predictors we<-c(1,1) #setting the mode m<-'binary' #running the function cross_val(vari,out,c,10,10,100,we,2,pr,m,'det','exact',0.5,'acc',nr=c(1,4),n_tr=90,preds_tr=pr_tr)
Function transforms a set of predictors into a set of predictors, their squares, pairwise interactions, cubes and three-way interactions
cub(A, n = 1000)
cub(A, n = 1000)
A |
set of predictors |
n |
first |
Returns the predictors including their squares, pairwise interactions, cubes and three-way interactions
cub(cbind(1:100,rnorm(100),runif(100),rnorm(100,0,2)))
cub(cbind(1:100,rnorm(100),runif(100),rnorm(100,0,2)))
Function transforms an index of an array of two- or three-way interactions into two or three indices corresponding to the interacting variables
find_int(ind,N)
find_int(ind,N)
ind |
index to transform |
N |
number of interacting variables |
Returns two or three indices corredsponding to a combination of variables written under the given index
find_int(28,9)
find_int(28,9)
Reorders the columns of matrix a
according to the ordered elements of array s
find_sub(a,s,j,c,st)
find_sub(a,s,j,c,st)
a |
A |
s |
array of numbers of the size N |
j |
number of rows in |
c |
array of all indices of the predictors |
st |
a subset of predictors to be always included into a predictive model |
Returns a submatrix of matrix a
which consits of columns determined by the input array s
#all two-element subsets of 1:3 a<-combn(3,2) s<-c(3,2,3) find_sub(a,s,2,1:3)
#all two-element subsets of 1:3 a<-combn(3,2) s<-c(3,2,3) find_sub(a,s,2,1:3)
Function which identifies regressions with the highest predictive power
get_indices(predsp,nvar,c,we,st,minx)
get_indices(predsp,nvar,c,we,st,minx)
predsp |
An M x N matrix of averaged out predictive power values. M is maximum feasible number of variables included in a regression, N is the maximum feasible number of regressions of the fixed size; the row index indicates the number of variables included in a regression. |
nvar |
array of maximal number of variables for each cross-validation |
c |
array of all indices of the prediction variables |
we |
array of all weights of the prediction variables |
st |
a subset of predictors to be always included into a predictive model |
minx |
minimum number of predictors, defaults to 1 |
A list of arrays which contain indices of the predictors corresponfing to the best regressions
Uses sum_weights_sub
, find_sub
, combn
#creating a set of averaged out predictive powers predsp<-matrix(NA,ncol=3,nrow=3) predsp[1,]=runif(3,0.7,0.8) predsp[2,]=runif(3,0.65,0.85) predsp[3,1]=runif(1,0.4,0.5) #running the function get_indices(predsp,c(3,3,3),1:3,c(1,1,1))
#creating a set of averaged out predictive powers predsp<-matrix(NA,ncol=3,nrow=3) predsp[1,]=runif(3,0.7,0.8) predsp[2,]=runif(3,0.65,0.85) predsp[3,1]=runif(1,0.4,0.5) #running the function get_indices(predsp,c(3,3,3),1:3,c(1,1,1))
Function which makes a prediction for multinomial/logistic regression based on the given cut-off value and probabilities.
get_predictions(p,k,cutoff,cmode,mode)
get_predictions(p,k,cutoff,cmode,mode)
p |
probabilities of the outcomes for the test set given either by an array (logistic regression) or by a matrix (multinomial regression) |
k |
size of the test set |
cutoff |
cut-off value of the probability |
cmode |
|
mode |
|
Outputs the array of the predictions of the size of p
.
#binary mode get_predictions(runif(20,0.4,0.6),20,0.5,'det','binary') #creating a data-set for multinomial mode p1<-runif(20,0.4,0.6) p2<-runif(20,0.1,0.2) p3<-1-p1-p2 #running the function get_predictions(matrix(c(p1,p2,p3),ncol=3),20,0.5,'det','multin')
#binary mode get_predictions(runif(20,0.4,0.6),20,0.5,'det','binary') #creating a data-set for multinomial mode p1<-runif(20,0.4,0.6) p2<-runif(20,0.1,0.2) p3<-1-p1-p2 #running the function get_predictions(matrix(c(p1,p2,p3),ncol=3),20,0.5,'det','multin')
Function which runs a linear regression on a training set, computes predictions for the test set
get_predictions_lin(trset,testset,outc,k,n_tr,p,Rsq,Rsq_v,marg)
get_predictions_lin(trset,testset,outc,k,n_tr,p,Rsq,Rsq_v,marg)
trset |
values of predictors on the training set |
testset |
values of predictors on the test set |
outc |
values of predictors on the training set |
k |
length of the test set |
n_tr |
size of the training set |
p |
weight of the model |
Rsq |
whether the R-squared statistics constraint is introduced |
Rsq_v |
value of R-squared statistics on the training spli of the data |
marg |
margin of error for R-squared statistics constraint |
An array of continous variables of the length equal to the size of a testset
Function uses function lsfit
and coef
trset<-matrix(c(rnorm(90,2,4),runif(90,0,0.5),rbinom(90,1,0.5)),ncol=3) testset<-matrix(c(rnorm(10,2,4),runif(10,0,0.5),rbinom(10,1,0.5)),ncol=3) get_predictions_lin(trset,testset,runif(90,0,1),10)
trset<-matrix(c(rnorm(90,2,4),runif(90,0,0.5),rbinom(90,1,0.5)),ncol=3) testset<-matrix(c(rnorm(10,2,4),runif(10,0,0.5),rbinom(10,1,0.5)),ncol=3) get_predictions_lin(trset,testset,runif(90,0,1),10)
Function which computes probabilities of outcomes on the test set by applying regression parameters inferred by a run on the training set. Works for logistic or multinomial regression
get_probabilities(trset,testset,outc,mode,Rsq,p,n_tr)
get_probabilities(trset,testset,outc,mode,Rsq,p,n_tr)
trset |
values of predictors on the training set |
testset |
values of predictors on the test set |
outc |
values of outcomes on the training set |
mode |
|
Rsq |
whether R-squared statistics constrained is introduced |
p |
weight of the model |
n_tr |
size of the training set |
In binary mode this function computes the probabilities of the event '0'. In multinomial mode computes the probabilities of the events '0','1',...,'N-1'.
Probabilities of the outcomes. In 'binary'
mode returns an array of the size of the number of observations in a testset. In 'multin'
returns an M x N matrix where M is the size of the number of observations in a testset
and N is the number of unique outcomes minus 1.
Function uses multinom
and coef
trset<-matrix(c(rbinom(70,1,0.5),runif(70,0.1)),ncol=2) testset<-matrix(c(rbinom(10,1,0.5),runif(10,0.1)),ncol=2) get_probabilities(trset,testset,rbinom(70,1,0.6),'binary')
trset<-matrix(c(rbinom(70,1,0.5),runif(70,0.1)),ncol=2) testset<-matrix(c(rbinom(10,1,0.5),runif(10,0.1)),ncol=2) get_probabilities(trset,testset,rbinom(70,1,0.6),'binary')
Function which turns a single categorical (non-numeric) variable into a numeric one (or several) by introducing dummy '0'/'1' variables.
make_numeric(vari, outcome, ra,mode)
make_numeric(vari, outcome, ra,mode)
vari |
array of values to be transformed |
outcome |
TRUE/FALSE indicates whether the variable |
ra |
indices of the input array |
mode |
|
This function is essentially a standard way to turn categorical non-numeric variables into numeric ones in order to run a regression
Returned value is an M x N matrix where M is the length of the input array of indices ra
and N is length(vari)-1
.
#creating a non-numeric set a<-t(rmultinom(100,1,c(0.2,0.3,0.5)))%*%c(1,2,3) a[a==1]='red' a[a==2]='green' a[a==3]='blue' #running the function make_numeric(a,FALSE,sample(1:100,50),"linear") make_numeric(a,TRUE,sample(1:100,50))
#creating a non-numeric set a<-t(rmultinom(100,1,c(0.2,0.3,0.5)))%*%c(1,2,3) a[a==1]='red' a[a==2]='green' a[a==3]='blue' #running the function make_numeric(a,FALSE,sample(1:100,50),"linear") make_numeric(a,TRUE,sample(1:100,50))
Function which turns a set of predictors containing non-numeric variables into a fully numeric set
make_numeric_sets(a,ai,k,vari,ra,l,mode)
make_numeric_sets(a,ai,k,vari,ra,l,mode)
a |
An M x N matrix, containing all possible subsets (N overall) of the size M of predictors' indices; therefore each column of |
ai |
array of indices of the array |
k |
index of the array |
vari |
set of all predictors |
ra |
array of sample indices of |
l |
size of the sample |
mode |
|
Function transforms the whole set of predictors into a numeric set by consecutively calling function make_numeric
for each predictor
Returns a list containing two objects: tr
and test
tr |
training set transformed into a numeric one |
test |
test set transformed into a numeric one |
#creating a categorical numeric variable a<-t(rmultinom(100,1,c(0.2,0.3,0.5)))%*%c(1,2,3) #creating an analogous non-numeric variable c<-array(NA,100) c[a==1]='red' c[a==2]='green' c[a==3]='blue' #creating a data-set b<-data.frame(matrix(c(a,rbinom(100,1,0.3),runif(100,0,1)),ncol=3)) #making the first column of the data-set non-numeric b[,1]=data.frame(c) #running the function make_numeric_sets(combn(3,2),1:3,1,b,sample(1:100,60),100,"binary")
#creating a categorical numeric variable a<-t(rmultinom(100,1,c(0.2,0.3,0.5)))%*%c(1,2,3) #creating an analogous non-numeric variable c<-array(NA,100) c[a==1]='red' c[a==2]='green' c[a==3]='blue' #creating a data-set b<-data.frame(matrix(c(a,rbinom(100,1,0.3),runif(100,0,1)),ncol=3)) #making the first column of the data-set non-numeric b[,1]=data.frame(c) #running the function make_numeric_sets(combn(3,2),1:3,1,b,sample(1:100,60),100,"binary")
Function transforms a set of predictors into a set of predictors, their squares and pairwise interactions
quadr(A, n = 1000)
quadr(A, n = 1000)
A |
set of predictors |
n |
first |
Returns the predictors including their squares and pairwise interactions
quadr(cbind(1:100,rnorm(100),runif(100),rnorm(100,0,2)))
quadr(cbind(1:100,rnorm(100),runif(100),rnorm(100,0,2)))
One of the two main functions of the package. Identifies the predictors included into regressions with the highest average predictive power
regr_ind( vari, outi, crv, cutoff = NULL, part = 10, mode, cmode = "det", predm = "exact", objfun = "acc", parallel = FALSE, cores, minx = 1, maxx = NULL, nr = NULL, maxw = NULL, st = NULL, rule = 10, corr = 1, Rsq = F, marg = 0 )
regr_ind( vari, outi, crv, cutoff = NULL, part = 10, mode, cmode = "det", predm = "exact", objfun = "acc", parallel = FALSE, cores, minx = 1, maxx = NULL, nr = NULL, maxw = NULL, st = NULL, rule = 10, corr = 1, Rsq = F, marg = 0 )
vari |
set of predictors |
outi |
array of outcomes |
crv |
number of cross-validations |
cutoff |
cut-off value for mode |
part |
for each cross-validation partitions the dataset into training and test set in a proportion |
mode |
|
cmode |
|
predm |
|
objfun |
|
parallel |
TRUE if using parallel toolbox, FALSE if not. Defaults to FALSE |
cores |
number of cores to use in case of parallel=TRUE |
minx |
minimum number of predictors to be included in a regression, defaults to 1 |
maxx |
maximum number of predictors to be included in a regression, defaults to maximum feasible number according to one in ten rule |
nr |
a subset of the data-set, such that |
maxw |
maximum weight of predictors to be included in a regression, defaults to maximum weight according to one in ten rule |
st |
a subset of predictors to be always included into a predictive model,defaults to empty set |
rule |
an Events per Variable (EPV) rule, defaults to 10' |
corr |
maximum correlation between a pair of predictors in a model |
Rsq |
whether the R-squared statistics constraint is introduced |
marg |
margin of error for R-squared statistics constraint |
Prints the best predictive power provided by a regression, predictive accuracy of the empirical prediction (value of emp
computed by cross_val
for logistic and linear regression). Returns indices of the predictors included into regressions with the highest predictive power written in a list. For mode='linear'
outputs a list of two lists. First list corresponds to the smallest absolute error, second corresponds to the smallest relative error
Uses compute_weights
, make_numeric
, compute_max_weight
, compute_weights
, compute_max_length
, cross_val
,av_out
, get_indices
#creating variables for linear regression mode variables_lin<-matrix(c(rnorm(56,0,1),rnorm(56,1,2)),ncol=2) #creating outcomes for linear regression mode outcomes_lin<-rnorm(56,2,1) #running the function regr_ind(variables_lin,outcomes_lin,100,mode='linear',parallel=TRUE,cores=2) #creating variables for binary mode vari<-matrix(c(1:100,seq(1,300,3)),ncol=2) #creating outcomes for binary mode out<-rbinom(100,1,0.3) #running the function regr_ind(vari,out,20,cutoff=0.5,part=10,mode='binary',parallel=TRUE,cores=2,nr=c(1,10,20),maxx=1)
#creating variables for linear regression mode variables_lin<-matrix(c(rnorm(56,0,1),rnorm(56,1,2)),ncol=2) #creating outcomes for linear regression mode outcomes_lin<-rnorm(56,2,1) #running the function regr_ind(variables_lin,outcomes_lin,100,mode='linear',parallel=TRUE,cores=2) #creating variables for binary mode vari<-matrix(c(1:100,seq(1,300,3)),ncol=2) #creating outcomes for binary mode out<-rbinom(100,1,0.3) #running the function regr_ind(vari,out,20,cutoff=0.5,part=10,mode='binary',parallel=TRUE,cores=2,nr=c(1,10,20),maxx=1)
Function which computes the sum of predictors' weights for each subset containing a fixed number of predictors
sum_weights_sub(a,m,we,st)
sum_weights_sub(a,m,we,st)
a |
an |
m |
number of elements in each subset of indices |
we |
array of weights of the predictors |
st |
a subset of predictors to be always included into a predictive model |
Returns an array of weights for predictors defined by each colun of the matrix a
#all two-element subsets of the set 1:3 a<-combn(3,2) sum_weights_sub(a,2,c(1,2,1))
#all two-element subsets of the set 1:3 a<-combn(3,2) sum_weights_sub(a,2,c(1,2,1))