TransactAI

Training Pipeline

Configure taxonomy, preprocess data, and train the Logistic Regression model.

Taxonomy Source

source_of_truth.json

This JSON file defines the category structure. The pipeline uses it to generate synthetic labels for the initial supervised training set.
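As a rough illustration of how a taxonomy file could drive synthetic label generation, here is a minimal sketch. The schema (category name mapped to a list of keywords), the keyword values, the `POS ... #ref` text template, and the helper name `generate_synthetic_samples` are all assumptions made for this example; the actual layout of source_of_truth.json is not shown on this page.

```python
import json
import random

# Hypothetical taxonomy layout: category -> example keywords.
# The real source_of_truth.json schema may differ.
TAXONOMY_JSON = """
{
  "Groceries": ["supermarket", "bakery"],
  "Transport": ["taxi", "fuel station"]
}
"""


def generate_synthetic_samples(taxonomy, per_class, seed=0):
    """Build (text, label) pairs by templating keywords from each category."""
    rng = random.Random(seed)  # fixed seed so runs are reproducible
    samples = []
    for category, keywords in taxonomy.items():
        for _ in range(per_class):
            keyword = rng.choice(keywords)
            ref = rng.randint(1000, 9999)  # fake reference number as noise
            samples.append((f"POS {keyword.upper()} #{ref}", category))
    return samples


taxonomy = json.loads(TAXONOMY_JSON)
samples = generate_synthetic_samples(taxonomy, per_class=3)
```

Each category contributes the same number of samples here, which keeps the classes balanced; a real generator would likely use richer templates.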

Dataset Distribution

Training samples per class

Pipeline Steps

1. Preprocessing: Regex Cleaning & Normalization

2. Feature Extraction: TF-IDF Vectorizer (n-gram 1,2)

3. Classification: Logistic Regression (L2 penalty)
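The three steps above can be sketched with scikit-learn as follows. This is an illustrative sketch, not the pipeline's actual code: the use of scikit-learn, the specific regex rules in `clean_text`, the toy training texts, and the hyperparameter values are assumptions. Only the n-gram range (1, 2) and the L2 penalty come from the step descriptions.

```python
import re

from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import Pipeline


def clean_text(text):
    """Step 1: lowercase, drop digits and punctuation, collapse whitespace."""
    text = text.lower()
    text = re.sub(r"\d+", " ", text)       # remove reference numbers
    text = re.sub(r"[^a-z\s]", " ", text)  # keep letters and whitespace only
    return re.sub(r"\s+", " ", text).strip()


pipeline = Pipeline([
    # Step 2: TF-IDF over unigrams and bigrams, using the cleaner above.
    ("tfidf", TfidfVectorizer(preprocessor=clean_text, ngram_range=(1, 2))),
    # Step 3: Logistic Regression; scikit-learn applies an L2 penalty by default.
    ("clf", LogisticRegression(C=1.0, max_iter=1000)),
])

# Toy training data, invented for this example.
texts = [
    "POS SUPERMARKET #1021",
    "SUPERMARKET ONLINE ORDER 88",
    "TAXI RIDE 5531",
    "POS TAXI #204",
]
labels = ["Groceries", "Groceries", "Transport", "Transport"]

pipeline.fit(texts, labels)
prediction = pipeline.predict(["SUPERMARKET PURCHASE 77"])[0]
```

Passing the cleaner as the vectorizer's `preprocessor` keeps cleaning and feature extraction in one object, so the same normalization is applied at training and prediction time.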
