# Data Transformation component

> Apply mathematical transformations like log, square root, Box-Cox, and Yeo-Johnson to numerical columns to reduce skew and stabilize variance.

The **Data Transformation** component applies a mathematical function to each value in the selected numerical columns. It is typically used to correct skewed distributions before training.

## Configuration

| Option        | Description                                                                                            |
| ------------- | ------------------------------------------------------------------------------------------------------ |
| **Method**    | The transformation to apply (see table below).                                                         |
| **Columns**   | Columns to transform. Supports `All Numerical Features`.                                               |
| **Threshold** | Used only by the `Binarize` method. Values above this threshold become 1, values at or below become 0. |
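The threshold comparison is strict: a value exactly equal to the threshold maps to 0. The sketch below illustrates that rule together with one plausible reading of `All Numerical Features`. It is not the component's implementation; a plain dict of column lists stands in for the DataFrame, and the helper names `numeric_columns` and `binarize` are hypothetical.

```python
from numbers import Number


def numeric_columns(table):
    """Return names of columns whose values are all numbers (bools excluded).

    Assumption: this is what "All Numerical Features" selects.
    """
    return [
        name
        for name, values in table.items()
        if all(isinstance(v, Number) and not isinstance(v, bool) for v in values)
    ]


def binarize(table, columns=None, threshold=0.0):
    """Map values above `threshold` to 1 and all others to 0.

    `columns=None` means every numerical column. Other columns pass through unchanged.
    """
    if columns is None:
        columns = numeric_columns(table)
    out = {name: list(values) for name, values in table.items()}
    for name in columns:
        # Strict comparison: a value equal to the threshold becomes 0.
        out[name] = [1 if v > threshold else 0 for v in out[name]]
    return out


table = {"age": [15, 30, 45], "score": [0.2, 0.5, 0.9], "name": ["a", "b", "c"]}
flags = binarize(table, columns=["score"], threshold=0.5)
print(flags["score"])  # [0, 0, 1]
```

Note that `0.5` maps to 0 here because it is at the threshold, not above it.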

### Methods

| Method      | Formula                                              | Use case                                             |
| ----------- | ---------------------------------------------------- | ---------------------------------------------------- |
| Log         | `log(x)`                                             | Right-skewed distributions (values must be positive) |
| Log1p       | `log(1 + x)`                                         | Right-skewed distributions that include zero         |
| Square Root | `√x`                                                 | Moderate right skew (values must be non-negative)    |
| Cube Root   | `∛x`                                                 | Reducing skew when values can be negative            |
| Square      | `x²`                                                 | Left skew; amplifies differences among large values  |
| Cube        | `x³`                                                 | Amplify differences more aggressively                |
| Exponential | `eˣ`                                                 | Left-skewed distributions                            |
| Box-Cox     | Power transform — finds optimal lambda automatically | General normalization (positive values only)         |
| Yeo-Johnson | Like Box-Cox but handles zero and negative values    | General normalization                                |
| Binarize    | `1 if x > threshold else 0`                          | Converting continuous features to binary flags       |
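To make the formulas in the table concrete, here is a minimal sketch using only Python's standard library. It is an illustration under stated assumptions, not the component's code: `log` is assumed to be the natural logarithm, the names `METHODS`, `transform`, `box_cox`, and `yeo_johnson` are hypothetical, and the two power transforms take lambda as an explicit argument, whereas the component finds it automatically.

```python
import math


def box_cox(x, lam):
    """Box-Cox transform of one value for a given lambda (x must be > 0)."""
    if x <= 0:
        raise ValueError("Box-Cox requires strictly positive values")
    return math.log(x) if lam == 0 else (x ** lam - 1) / lam


def yeo_johnson(x, lam):
    """Yeo-Johnson transform of one value for a given lambda (any real x)."""
    if x >= 0:
        return math.log1p(x) if lam == 0 else ((x + 1) ** lam - 1) / lam
    # Negative branch mirrors the positive one with exponent 2 - lambda.
    return -math.log1p(-x) if lam == 2 else -((1 - x) ** (2 - lam) - 1) / (2 - lam)


# Element-wise methods from the table (assumed mapping, for illustration).
METHODS = {
    "Log": math.log,          # raises ValueError for x <= 0
    "Log1p": math.log1p,      # defined for x > -1
    "Square Root": math.sqrt, # raises ValueError for x < 0
    "Cube Root": lambda x: math.copysign(abs(x) ** (1 / 3), x),  # keeps the sign
    "Square": lambda x: x ** 2,
    "Cube": lambda x: x ** 3,
    "Exponential": math.exp,
}


def transform(values, method):
    """Apply one element-wise method to every value in a column."""
    fn = METHODS[method]
    return [fn(v) for v in values]


right_skewed = [0, 1, 10, 100, 1000]
print(transform(right_skewed, "Log1p"))
```

With lambda set to 1, `yeo_johnson` returns its input unchanged, and `box_cox` with lambda 0 reduces to the natural log, which is a quick way to sanity-check either function.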

## Input / Output

|        | Type      |
| ------ | --------- |
| Input  | DataFrame |
| Output | DataFrame |
