# Train-Test Split component

> Split your dataset into training and test sets with configurable ratios, random seeds, and stratification to evaluate model generalization.

The **Train-Test Split** component separates your DataFrame into features (`X`) and a target (`y`), then splits both into training and test portions, plus an optional validation portion. Its output handles connect directly to model and evaluation components.

## Configuration

| Option              | Description                                                      | Default     |
| ------------------- | ---------------------------------------------------------------- | ----------- |
| **Target Column**   | The column the model should learn to predict                     | —           |
| **Split Mode**      | How to divide the data (see below)                               | Train/Test  |
| **Test Size**       | Fraction of data reserved for the test set                       | `0.2` (20%) |
| **Validation Size** | Fraction reserved for validation (Train/Val/Test mode only)      | `0.1` (10%) |
| **Random State**    | Seed for reproducibility                                         | `42`        |
| **Stratify**        | Preserve class proportions in each split (classification only)   | Off         |
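
As a rough illustration of what **Target Column**, **Test Size**, and **Random State** control, the sketch below splits a small list of rows using only Python's standard library. It is not the component's implementation: the function name `train_test_split_rows`, the sample columns, and the shuffle-then-slice strategy are all assumptions made for the example.

```python
import random


def train_test_split_rows(rows, target_column, test_size=0.2, random_state=42):
    """Split a list of dict rows into X/y train and test parts (illustrative only)."""
    # Separate the features (X) from the target (y).
    X = [{k: v for k, v in row.items() if k != target_column} for row in rows]
    y = [row[target_column] for row in rows]

    # Shuffle row indices with a fixed seed so the split is reproducible.
    indices = list(range(len(rows)))
    random.Random(random_state).shuffle(indices)

    # Reserve the first `test_size` fraction of the shuffled rows for testing.
    n_test = round(len(rows) * test_size)
    test_idx, train_idx = indices[:n_test], indices[n_test:]

    X_train = [X[i] for i in train_idx]
    y_train = [y[i] for i in train_idx]
    X_test = [X[i] for i in test_idx]
    y_test = [y[i] for i in test_idx]
    return X_train, y_train, X_test, y_test


# Hypothetical sample data: 10 rows with a binary target column "churned".
rows = [{"age": 20 + i, "income": 1000 * i, "churned": i % 2} for i in range(10)]
X_train, y_train, X_test, y_test = train_test_split_rows(rows, target_column="churned")
print(len(X_train), len(X_test))  # 8 2
```

Calling the function again with the same seed returns the same rows in each set, which is the purpose of a fixed **Random State**.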

### Split modes

| Mode                  | Output handles                                             |
| --------------------- | ---------------------------------------------------------- |
| No Split (Full Data)  | `X`, `y`                                                   |
| Train/Test            | `X Train`, `Y Train`, `X Test`, `Y Test`                   |
| Train/Validation/Test | `X Train`, `Y Train`, `X Val`, `Y Val`, `X Test`, `Y Test` |
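
The three-way mode can be pictured as slicing one shuffled list of row indices into three parts. The sketch below is a hedged illustration, not the component's code: it assumes **Test Size** and **Validation Size** are both fractions of the full dataset (this page does not say whether the validation fraction is taken from the full data or from the remainder), and `three_way_split` is a hypothetical helper name.

```python
import random


def three_way_split(n_rows, test_size=0.2, val_size=0.1, random_state=42):
    """Return (train, val, test) index lists for n_rows rows (illustrative only)."""
    # Shuffle once with a fixed seed so all three sets are reproducible.
    indices = list(range(n_rows))
    random.Random(random_state).shuffle(indices)

    # Assumption: both sizes are fractions of the full dataset.
    n_test = round(n_rows * test_size)
    n_val = round(n_rows * val_size)

    test_idx = indices[:n_test]
    val_idx = indices[n_test:n_test + n_val]
    train_idx = indices[n_test + n_val:]
    return train_idx, val_idx, test_idx


train_idx, val_idx, test_idx = three_way_split(100)
print(len(train_idx), len(val_idx), len(test_idx))  # 70 10 20
```

Because the three slices come from the same shuffled list, no row appears in more than one set.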

## Input / Output

|        | Type                                  |
| ------ | ------------------------------------- |
| Input  | DataFrame                             |
| Output | Split Data (separate handles per set) |

<Tip>
  Enable **Stratify** for imbalanced classification datasets to ensure every split has a representative distribution of each class.
</Tip>
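
To see why stratification matters on imbalanced data, the sketch below splits each class separately so that both sets keep the original 90/10 class ratio. It is a minimal standard-library illustration under stated assumptions: `stratified_split` is a hypothetical helper, and the per-class shuffle-and-slice approach is one simple way to stratify, not necessarily how the component does it.

```python
import random
from collections import Counter, defaultdict


def stratified_split(labels, test_size=0.2, random_state=42):
    """Return (train, test) index lists that preserve class proportions."""
    rng = random.Random(random_state)

    # Group row indices by class label.
    by_class = defaultdict(list)
    for i, label in enumerate(labels):
        by_class[label].append(i)

    # Take the same test fraction from every class.
    train_idx, test_idx = [], []
    for idx in by_class.values():
        rng.shuffle(idx)
        n_test = round(len(idx) * test_size)
        test_idx.extend(idx[:n_test])
        train_idx.extend(idx[n_test:])
    return train_idx, test_idx


# Hypothetical imbalanced target: 90 rows of class 0, 10 rows of class 1.
labels = [0] * 90 + [1] * 10
train_idx, test_idx = stratified_split(labels)
print(Counter(labels[i] for i in train_idx))  # Counter({0: 72, 1: 8})
print(Counter(labels[i] for i in test_idx))   # Counter({0: 18, 1: 2})
```

Without stratification, a purely random 20-row test set drawn from this data could contain very few rows of class 1, or none at all, which would make test metrics for that class unreliable.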
