The Feature Selection component removes irrelevant or redundant features. Fewer, better features typically lead to faster training and more generalizable models.

Configuration

| Option | Description | Default |
| --- | --- | --- |
| Method | Selection algorithm (see table below) | |
| Target Column | The column being predicted (required for supervised methods) | |
| K | Number of top features to keep (supervised methods) | 10 |
| Threshold | Variance or correlation threshold (unsupervised methods) | 0.0 / 0.9 |
| Estimator (RFE) | Model used internally to rank features: Random Forest, Logistic Regression, Linear Regression | Random Forest |
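
To make the K, Target Column, and Estimator options concrete, here is a minimal sketch of how an RFE-style selection could be written with scikit-learn. The helper name `rfe_select` and the assumption that the component behaves like scikit-learn's `RFE` are illustrative only; the component's actual internals are not documented here.

```python
import numpy as np
import pandas as pd
from sklearn.ensemble import RandomForestClassifier
from sklearn.feature_selection import RFE


def rfe_select(df: pd.DataFrame, target_column: str, k: int = 10,
               estimator=None) -> pd.DataFrame:
    """Hypothetical helper: keep the k features ranked highest by RFE."""
    X = df.drop(columns=[target_column])
    y = df[target_column]
    if estimator is None:
        # Mirrors the documented default estimator (Random Forest).
        estimator = RandomForestClassifier(n_estimators=50, random_state=0)
    # Never ask for more features than the input actually has.
    k = min(k, X.shape[1])
    selector = RFE(estimator, n_features_to_select=k).fit(X, y)
    # get_support() is a boolean mask over the columns of X.
    return X.loc[:, selector.get_support()]


# Toy data: one informative column and two pure-noise columns.
rng = np.random.default_rng(0)
labels = rng.integers(0, 2, size=200)
toy = pd.DataFrame({
    "signal": labels + rng.normal(0, 0.1, size=200),
    "noise_a": rng.normal(size=200),
    "noise_b": rng.normal(size=200),
    "label": labels,
})
selected = rfe_select(toy, target_column="label", k=2)
```

Any scikit-learn estimator that exposes `coef_` or `feature_importances_` can be passed as `estimator`, which is how Logistic Regression or Linear Regression would slot in under this assumption.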

Methods

| Method | Type | How it ranks features |
| --- | --- | --- |
| Variance Threshold | Unsupervised | Drops features whose variance is below the threshold |
| Correlation Threshold | Unsupervised | Drops one of each pair of features correlated above the threshold |
| Select K Best (Chi2) | Supervised | Ranks features by chi-squared statistic (non-negative values only) |
| Select K Best (F-score) | Supervised | Ranks features by ANOVA F-score |
| Select K Best (Mutual Info) | Supervised | Ranks features by mutual information with the target |
| RFE | Supervised | Recursively removes the least important features using an estimator |
| Lasso (L1) | Supervised | Drops features whose Lasso coefficient is zero |
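
The two unsupervised methods can be approximated in a few lines of pandas. This is a sketch under stated assumptions, not the component's implementation: the function names are hypothetical, the variance filter keeps only columns with variance strictly greater than the threshold (as scikit-learn's `VarianceThreshold` does, so the default 0.0 removes constant columns), and the correlation filter uses absolute Pearson correlation and drops the later column of each correlated pair.

```python
import numpy as np
import pandas as pd


def drop_low_variance(df: pd.DataFrame, threshold: float = 0.0) -> pd.DataFrame:
    # Population variance (ddof=0) per column.
    variances = df.var(ddof=0)
    # Assumption: keep strictly greater, so threshold 0.0 drops constants.
    return df.loc[:, variances > threshold]


def drop_correlated(df: pd.DataFrame, threshold: float = 0.9) -> pd.DataFrame:
    corr = df.corr().abs()
    # Keep only the upper triangle so each pair is inspected once.
    upper = corr.where(np.triu(np.ones(corr.shape, dtype=bool), k=1))
    # A column is dropped if it correlates too strongly with an earlier one.
    to_drop = [col for col in upper.columns if (upper[col] > threshold).any()]
    return df.drop(columns=to_drop)


frame = pd.DataFrame({
    "a": [1, 2, 3, 4, 5],
    "b": [2, 4, 6, 8, 10],   # exact multiple of "a"
    "c": [5, 1, 4, 2, 3],    # weakly correlated with "a"
    "const": [1, 1, 1, 1, 1],
})
reduced = drop_correlated(drop_low_variance(frame))
```

Running the variance filter first avoids undefined correlations for constant columns. Which member of a correlated pair the component actually discards is not stated above, so the "drop the later column" rule is a design choice of this sketch.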

Input / Output

| | Type |
| --- | --- |
| Input | DataFrame |
| Output | DataFrame (selected features only) |
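
The DataFrame-in, DataFrame-out contract can be illustrated with a sketch of Select K Best (F-score), assuming it corresponds to scikit-learn's `SelectKBest` with `f_classif`. The helper `select_k_best_f` is hypothetical, and returning the features without the target column is an assumption; the table above only says the output holds the selected features.

```python
import numpy as np
import pandas as pd
from sklearn.feature_selection import SelectKBest, f_classif


def select_k_best_f(df: pd.DataFrame, target_column: str,
                    k: int = 10) -> pd.DataFrame:
    """Hypothetical helper: DataFrame in, DataFrame of top-k features out."""
    X = df.drop(columns=[target_column])
    y = df[target_column]
    # Cap k at the number of available feature columns.
    selector = SelectKBest(score_func=f_classif, k=min(k, X.shape[1]))
    selector.fit(X, y)
    # Index with the boolean mask so column names and row index survive.
    return X.loc[:, selector.get_support()]


rng = np.random.default_rng(0)
target = rng.integers(0, 2, size=200)
data = pd.DataFrame({
    "signal": target + rng.normal(0, 0.1, size=200),
    "noise_a": rng.normal(size=200),
    "noise_b": rng.normal(size=200),
    "label": target,
})
output = select_k_best_f(data, target_column="label", k=1)
```

Selecting columns with the mask, rather than calling `transform`, keeps the original column names and index on the returned DataFrame.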