The Feature Selection component removes irrelevant or redundant features. Fewer, better features typically lead to faster training and more generalizable models.
Configuration
| Option | Description | Default |
|---|---|---|
| Method | Selection algorithm (see table below) | — |
| Target Column | The column being predicted (required for supervised methods) | — |
| K | Number of top features to keep (supervised methods) | 10 |
| Threshold | Variance or correlation threshold (unsupervised methods) | 0.0 (variance) / 0.9 (correlation) |
| Estimator (RFE) | Model used internally to rank features: Random Forest, Logistic Regression, Linear Regression | Random Forest |
Methods
| Method | Type | How it ranks features |
|---|---|---|
| Variance Threshold | Unsupervised | Drops features whose variance is below the threshold |
| Correlation Threshold | Unsupervised | Drops one of each pair of features correlated above the threshold |
| Select K Best (Chi2) | Supervised | Ranks features by chi-squared statistic (requires non-negative feature values) |
| Select K Best (F-score) | Supervised | Ranks features by ANOVA F-score |
| Select K Best (Mutual Info) | Supervised | Ranks features by mutual information with the target |
| RFE | Supervised | Recursively removes the least important features using an estimator |
| Lasso (L1) | Supervised | Drops features whose Lasso coefficient is zero |
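
The two unsupervised methods can be pictured with a small pandas sketch. This is an illustration under stated assumptions, not the component's actual implementation: the function names `variance_threshold` and `correlation_threshold` are hypothetical, variance is computed as population variance, a feature is kept only when its variance is strictly greater than the threshold (so the default of 0.0 drops constant columns), and for each highly correlated pair the later column is the one dropped.

```python
import numpy as np
import pandas as pd


def variance_threshold(df: pd.DataFrame, threshold: float = 0.0) -> pd.DataFrame:
    """Keep numeric columns whose variance exceeds the threshold (assumed rule)."""
    variances = df.var(ddof=0)  # population variance per column
    return df.loc[:, variances > threshold]


def correlation_threshold(df: pd.DataFrame, threshold: float = 0.9) -> pd.DataFrame:
    """Drop the later column of each pair whose |Pearson r| exceeds the threshold."""
    corr = df.corr().abs()
    # Keep only the strict upper triangle so each pair is inspected once.
    mask = np.triu(np.ones(corr.shape, dtype=bool), k=1)
    upper = corr.where(mask)
    to_drop = [col for col in upper.columns if (upper[col] > threshold).any()]
    return df.drop(columns=to_drop)


# Hypothetical example data: "c" is constant, "b" is almost exactly 2 * "a".
demo = pd.DataFrame(
    {
        "a": [1.0, 2.0, 3.0, 4.0],
        "b": [2.0, 4.0, 6.0, 8.1],
        "c": [1.0, 1.0, 1.0, 1.0],
        "d": [4.0, 1.0, 3.0, 2.0],
    }
)
reduced = correlation_threshold(variance_threshold(demo))
```

Running the variance step first removes constant columns, which would otherwise produce undefined correlations in the second step.
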
| | Type |
|---|---|
| Input | DataFrame |
| Output | DataFrame (selected features only) |
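
The DataFrame-in, DataFrame-out behavior of a supervised method can be sketched with scikit-learn's `SelectKBest` and the ANOVA F-score (`f_classif`). This is only an approximation of what the component might do: the helper name `select_k_best` is hypothetical, the target is assumed to be a classification label, and returning the target column alongside the selected features is an assumption that the page does not confirm.

```python
import pandas as pd
from sklearn.feature_selection import SelectKBest, f_classif


def select_k_best(df: pd.DataFrame, target_column: str, k: int = 10) -> pd.DataFrame:
    """Return the k highest-scoring feature columns plus the target (assumed)."""
    X = df.drop(columns=[target_column])
    y = df[target_column]
    k = min(k, X.shape[1])  # never ask for more features than exist
    selector = SelectKBest(score_func=f_classif, k=k).fit(X, y)
    kept = X.columns[selector.get_support()]  # boolean mask -> column names
    return df[list(kept) + [target_column]]


# Hypothetical example data: only "signal" separates the two classes.
data = pd.DataFrame(
    {
        "signal": [0.1, 0.2, 0.0, 0.1, 1.0, 0.9, 1.1, 1.0],
        "noise_1": [0.5, 0.1, 0.9, 0.3, 0.4, 0.2, 0.8, 0.6],
        "noise_2": [1.0, 2.0, 3.0, 4.0, 1.0, 2.0, 3.0, 4.0],
        "label": [0, 0, 0, 0, 1, 1, 1, 1],
    }
)
selected = select_k_best(data, target_column="label", k=1)
```

Selecting by column name rather than by position keeps the output readable for downstream components, since the original feature names survive the selection.
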