A Short Introduction to the FSinR Package

2. Feature Selection Process

As mentioned above, the feature selection process aims to obtain an optimal subset of features. To do this, a search algorithm is combined with a filter/wrapper method. The search algorithm guides the process in the features space according to the results returned by the filter/wrapper methods of the evaluated subsets. The optimal subset of features obtained in this process is very useful to generate simpler models, which require less computational resources and which present a better final performance.

The feature selection process is done with the featureSelection function. This is the main function of the package and its parameters are:

data: a matrix or data.frame with the dataset
class: the name of the dependent variable
searcher: the search algorithm
evaluator: the evaluation method (filter or wrapper method)

The result of this function mainly returns the best subset of features found and the measure value of that subset. In addition, it returns another group of variables such as the execution time, etc.

2.1. Search algorithms

The search functions present in the package are the following:

Sequential forward selection
Sequential floating forward selection
Sequential backward selection
Sequential floating backward selection
Breadth first search
Deep first search
Genetic algorithm
Whale optimization algorithm
Ant colony optimization
Simulated annealing
Tabu search
Hill-Climbing
Las Vegas wrapper

The package contains a function, searchAlgorithm, which allows you to select the search algorithm to be used in the feature selection process. The function consists of the following parameters:

searcher: the name of the search algorithm
params: a list of specific parameters for each algorithm

The result of the call to this function is another function to be used in the main function as a search algorithm.

2.2. Filter methods

The filter methods implemented in the package are:

Chi-squared
Cramer V
F-score
Relief
Rough Sets consistency
Binary consistency
Inconsistent Examples consistency
Inconsistent Examples Pairs consistency
Determination Coefficient
Mutual information
Gain ratio
Symmetrical uncertain
Gini index
Jd
MDLC
ReliefFeatureSetMeasure

The filterEvaluator function allows you to select a filter method from the above. The function has the parameters:

filter: the name of the filter method
params: a list of specific parameters for each method

The result of the function call is a function that is used as a filter evaluation method in the main function, featureSelection.

2.3. Wrapper methods

The FSinR package allows the possibility of using the 238 models available in the caret package as wrapper methods. The complete list of caret models can be found here. In addition, the caret package offers the possibility of establishing a group of options to personalize the models (eg. resampling techniques, evaluation measurement, grid parameters, …) using the trainControl and train functions. In FSinR, the wrapperEvaluator function is used to set all these parameters and use them to generate the wrapper model using as background the functions of caret. The wrapperEvaluator function has as parameters:

learner: model name of those available in caret
resamplingParams: list of parameters for trainControl function
fittingParams: list of parameters for train function (x, y, method and trainControl not neccesary)

The result of this function is another function that is used as a wrapper evaluation measure in the main function.

2.4. Feature Selection process example

To demonstrate in a simple manner how package works, the iris dataset will be used in this example.

It is important to note that the FSinR package does not divide data into training and test data. Instead, it applies the feature selection process to the entire dataset passed to it as a parameter. Therefore, in a modeling process the user should divide the dataset prior to performing the feature selection process. Missing data must also be processed prior to the use of the package. But since the main purpose of this vignette is to illustrate the use of the package and not the entire modeling process, in this case the entire unpartitioned dataset will be used.

The iris dataset consists in 150 instances of 4 variables (Sepal.Length, Sepal.Width, Petal.Length, Petal.Width) that determine the type of iris plant. The target variable, Species, has 3 possible classes (setosa, versicolor, virginica).

library(caret)
library(FSinR)

data(iris)

2.4.1. Search + Wrapper

Next we will illustrate an example of feature selection composed of a search algorithm and a wrapper method.

First, the wrapper method must be generated. To do this, you have to call the wrapperEvaluator function and determine which model you want to use as a wrapper method. Since the iris problem is a classification problem, in the example we use a knn model.

evaluator <- wrapperEvaluator("knn")

The next step is to generate the search algorithm. This is done by calling the searchAlgorithm function and specifying the algorithm you want to use. In this case we are going to use as a search algorithm a sequential search, sequentialForwardSelection.

searcher <- searchAlgorithm('sequentialForwardSelection')

Once we have generated the search algorithm and the wrapper method we call the main function that performs the feature selection process.

results <- featureSelection(iris, 'Species', searcher, evaluator)

The results show the best subset of features found and its evaluation.

results$bestFeatures
#>      Sepal.Length Sepal.Width Petal.Length Petal.Width
#> [1,]            0           0            0           1
results$bestValue
#> [1] 0.9575654

The above example illustrates a simple use of the package, but this can also be more complex if you intend to establish certain parameters. Regarding the wrapper method, it is possible to tune the model parameters. In the first instance, you can establish the resampling parameters, which are the same arguments that are passed to the trainControl function of the caret package. Secondly, you can establish the fitting parameters, which are the same as for the caret train function.

resamplingParams <- list(method = "cv", number = 10)
fittingParams <- list(preProc = c("center", "scale"), metric="Accuracy", tuneGrid = expand.grid(k = c(1:20)))

evaluator <- wrapperEvaluator("knn", resamplingParams, fittingParams)

For more details, the way in which caret train and tune a model can be seen here. Note that the FSinR package is able to detect automatically depending on the metric whether the objective of the problem is to maximize or minimize.

As for the search algorithm, it is also possible to establish a certain number of parameters. These parameters are specific to each search algorithm. If in this case we now use a tabu search, the number of algorithm iterations, the size of the taboo list, as well as an intensification phase and a diversification phase, among others can be established.

searcher <- searchAlgorithm('tabu', list(tamTabuList = 4, iter = 5, intensification=2, iterIntensification=5, diversification=1, iterDiversification=5, verbose=FALSE) )

Most FSinR package search algorithms include a parameter called verbose, which if set to TRUE shows the development and information of the iterations of the algorithms per console. It is important for tabu search to note that the number of neighbors that are considered and evaluated in each iteration of the tabu search algorithm, numNeigh, is set by default to as many as there are. Then it should be noted that a high value of this parameter considerably increases the calculation time.

Finally, the feature selection process is run again.

results <- featureSelection(iris, 'Species', searcher, evaluator)

2.4.2. Search + Filter

This example is very similar to the previous one. The main difference is to generate a filter method as an evaluator instead of a wrapper method. This is done with the function filterEvaluator.

evaluator <- filterEvaluator('MDLC')

The search algorithm is then generated:

searcher <- searchAlgorithm('sequentialForwardSelection')

And finally the feature selection process is performed, passing to the featureSelection function the filter method and the search algorithm generated.

results <- featureSelection(iris, 'Species', searcher, evaluator)

Again, the best subset obtained and the value obtained by the evaluation measure are shown as results.

results$bestFeatures
#>      Sepal.Length Sepal.Width Petal.Length Petal.Width
#> [1,]            0           1            0           0
results$bestValue
#> [1] 98.96749

As in the previous example, in this case you can establish both the parameters of the search algorithms and the filter methods if the functions have them.

2.4.3. Individual evaluation of features

Advanced use of the package allows individual evaluation of a set of features using the filter/wrapper methods. That is, the filter/wrapper methods can be used directly without the need to include them in a search algorithm. This is not a defined high-level functionality within the package, since an individual evaluation is not a feature selection process itself. But it is a functionality that can be realized and has to be taken into consideration.

To do this, an evaluator must be generated using a filter method or a wrapper method.

filter_evaluator <- filterEvaluator("IEConsistency")

wrapper_evaluator <- wrapperEvaluator("lvq")

And then obtain a measure from the evaluator by passing as a parameter the dataset, the name of the dependent variable and the set of features to be evaluated.

resultFilter <- filter_evaluator(iris, 'Species', c("Sepal.Length", "Sepal.Width",  "Petal.Length", "Petal.Width"))
#> Warning in filter_evaluator(iris, "Species", c("Sepal.Length", "Sepal.Width", :
#> The data seems not to be discrete, as it should be
resultFilter
#> [1] 1

resultWrapper <- wrapper_evaluator(iris, 'Species', c("Petal.Length", "Petal.Width"))
resultWrapper
#> [1] 0.9548126

3. Direct Feature Selection Process

In this case, the feature selection process consists of a direct search algorithm (cut-off method) combined with an evaluator based on filter or wrapper methods.

The direct feature selection process is performed using the directFeatureSelection function. This function contains the same parameters as the featureSelection function, with the difference that the searcher parameter is replaced by the directSearcher parameter. This parameter contains a direct search algorithm.

The function mainly returns the selected subset of features, and the value of the individual evaluation of each of these features.

3.1. Direct search algorithms

The direct search functions present in the package are the following:

Select k best
Select percentile
Select threshold
Select threshold range
Select difference
Select slope

The package contains the directSearchAlgorithm function that allows you to select the direct search algorithm to be used. The parameters are the same as the searchAlgorithm function, except that the searcher parameter is replaced by the directSearcher parameter. This parameter contains the name of the chosen cut-off method.

The result of the call to this function is another function to be used in the main function as a direct search algorithm.

3.2. Direct Feature Selection process example

In the previous example we showed the functionality of the package on a classification example. In this example we are going to show a regression example. For this we use the mtcars dataset, composed of 32 instances of 10 variables. The regression problem consists in forecasting the value of the mpg variable.

Compared to the process in the previous example, to perform a direct feature selection process the main change is that in this case a direct search algorithm has to be generated. In addition, the function for performing the feature selection process also changes to directFeatureSelection. While the generation of the filter or wrapper methods remains the same.

library(caret)
library(FSinR)

data(mtcars)


evaluator <- filterEvaluator('determinationCoefficient')

directSearcher <- directSearchAlgorithm('selectKBest', list(k=3))

results <- directFeatureSelection(mtcars, 'mpg', directSearcher, evaluator)
results$bestFeatures
#>      cyl disp hp drat wt qsec vs am gear carb
#> [1,]   1    1  0    0  1    0  0  0    0    0
results$featuresSelected
#> [1] "wt"   "cyl"  "disp"
results$valuePerFeature
#> [1] 0.7528328 0.7261800 0.7183433

4. Hybrid Feature Selection Process

The hybrid feature selection process is composed by a hybrid search algorithm that combines the measures obtained from 2 evaluators. This process is performed with the hybridFeatureSelection function. The parameters of this function are as follows:

data: a matrix or data.frame with the dataset
class: the name of the dependent variable
hybridSearcher: the hybrid search algorithm
evaluator_1: the first evaluation method (filter or wrapper method)
evaluator_2: the second evaluation method (filter or wrapper method)

The function returns again the best subset of features found and their measure among other values.

4.1. Hybrid search algorithms

The hybrid search function present in the package is the following:

Linear Consistency-Constrained (LCC)

The package contains the hybridSearchAlgorithm function that allows you to select this hybrid search function. The parameters are the same as the searchAlgorithm and directSearchAlgorithm function, except that the searcher (directSearcher) parameter is replaced by the hybridSearcher parameter. This parameter contains the name of the chosen hybrid method.

The result of the call to this function is another function to be used in the main function as a hybrid search algorithm.

4.2. Hybrid Feature Selection process example

In this example we continue with the previous regression example.

The hybrid feature selection process is slightly different from the above. This is because in addition to generating a hybrid search algorithm, two evaluators must be generated. The package only implements a hybrid search algorithm, LCC. This algorithm requires a first evaluator to evaluate the features individually, and a second evaluator to evaluate the features conjunctively.

The package implements different filter/wrapper methods focused on individual feature evaluation or set evaluation. Set evaluation methods can also evaluate features individually. But individual evaluation methods (Cramer, Chi squared, F-Score and Relief) cannot evaluate sets of features. Therefore, to perform a hybrid feature selection process, a hybrid search algorithm has to be generated together with an individual feature evaluator (individual or set evaluation method) and a set evaluator (set evaluation method)

library(caret)
library(FSinR)

data(mtcars)


evaluator_1 <- filterEvaluator('determinationCoefficient')
evaluator_2 <- filterEvaluator('ReliefFeatureSetMeasure')

hybridSearcher <- hybridSearchAlgorithm('LCC')

results <- hybridFeatureSelection(mtcars, 'mpg', hybridSearcher, evaluator_1, evaluator_2)
results$bestFeatures
#>      cyl disp hp drat wt qsec vs am gear carb
#> [1,]   1    1  1    1  1    1  1  1    1    1
results$bestValue
#> [1] 0.0171875