Models Guide

Endgame provides 100+ estimators organized into families. All models follow the scikit-learn interface: fit, predict, and predict_proba (classifiers) or transform (transformers). Every estimator is pipeline-compatible and accepts sample_weight where applicable.
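
As a rough illustration of that contract, the sketch below implements a toy classifier in plain Python. The class name PriorClassifier is hypothetical and is not part of Endgame; it only shows the fit / predict / predict_proba shape and one way sample_weight can enter fit.

```python
from collections import defaultdict


class PriorClassifier:
    """Toy estimator (hypothetical, not an Endgame class) that predicts class priors."""

    def fit(self, X, y, sample_weight=None):
        # Default to uniform weights when none are given.
        weights = sample_weight if sample_weight is not None else [1.0] * len(y)
        totals = defaultdict(float)
        for label, w in zip(y, weights):
            totals[label] += w
        self.classes_ = sorted(totals)
        norm = sum(totals.values())
        self.priors_ = [totals[c] / norm for c in self.classes_]
        return self  # returning self allows chaining, as scikit-learn estimators do

    def predict_proba(self, X):
        # Same prior distribution for every row; columns follow self.classes_.
        return [list(self.priors_) for _ in X]

    def predict(self, X):
        best = self.classes_[self.priors_.index(max(self.priors_))]
        return [best for _ in X]


clf = PriorClassifier().fit([[0], [1], [2]], ["a", "b", "b"], sample_weight=[3.0, 1.0, 1.0])
```

Here the single heavily weighted "a" sample outweighs the two "b" samples, so the weighted prior favors "a".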

Model Family Overview

| Family | Key Classes | Best For |
| --- | --- | --- |
| GBDTs | LGBMWrapper, XGBWrapper, CatBoostWrapper | General tabular, competitions |
| Deep Tabular | FTTransformerClassifier, SAINTClassifier, NODEClassifier, TabPFNClassifier, NAMClassifier, GANDALFClassifier, TabularResNetClassifier | Large datasets, categorical embeddings |
| Custom Trees | RotationForestClassifier, C50Classifier, ObliqueRandomForestClassifier, QuantileRegressorForest, EvolutionaryTreeClassifier | Structured data, diverse ensembles |
| Rules | RuleFitClassifier, FURIAClassifier | Interpretable rule extraction |
| Bayesian | TANClassifier, KDBClassifier, ESKDBClassifier | Probabilistic, small data |
| Kernel | GPClassifier, SVMClassifier | Small to medium datasets |
| Interpretable | EBMClassifier, MARSClassifier, SymbolicRegressor | Regulatory compliance, auditability |
| Neural | ELMClassifier, EmbeddingMLPClassifier, TabNetClassifier | Custom architectures, entity embeddings |
| Probabilistic | NGBoostClassifier, BARTClassifier | Uncertainty quantification |
| Baselines | NaiveBayesClassifier, LDAClassifier, QDAClassifier, RDAClassifier, KNNClassifier, LinearClassifier | Benchmarking, ensemble diversity |


Preset System

The preset parameter loads competition-winning hyperparameter configurations. Four preset values are accepted by all GBDT wrappers:

  • 'endgame' — competition-tuned defaults (low learning rate, many trees, early stopping). This is the default.

  • 'fast' — higher learning rate, fewer trees. Useful for rapid iteration.

  • 'overfit' — aggressively deep trees, no regularization. Use only for ensembling experiments.

  • 'custom' — no preset applied; pass all hyperparameters explicitly.

from endgame.models import LGBMWrapper

# Competition-ready defaults
model = LGBMWrapper(preset='endgame')
model.fit(X_train, y_train, eval_set=[(X_val, y_val)])

# Fast iteration during feature engineering
quick_model = LGBMWrapper(preset='fast')
quick_model.fit(X_train, y_train)

# Override specific parameters within a preset
model = LGBMWrapper(preset='endgame', num_leaves=63, min_child_samples=50)
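
One way to picture the override behavior is as a dictionary merge in which explicit keyword arguments win over preset values. The sketch below is an assumption about the mechanism, not Endgame's actual implementation: PRESETS and resolve_params are hypothetical names, and apart from the 'endgame' learning_rate and n_estimators quoted in this guide, the values are placeholders.

```python
# Hypothetical preset table: only the 'endgame' learning_rate and n_estimators
# come from this guide; every other value is a placeholder for illustration.
PRESETS = {
    "endgame": {"learning_rate": 0.01, "n_estimators": 10000},
    "fast": {"learning_rate": 0.1, "n_estimators": 500},
    "custom": {},
}


def resolve_params(preset="endgame", **overrides):
    """Start from the preset's values, then let explicit keyword arguments win."""
    if preset not in PRESETS:
        raise ValueError(f"unknown preset: {preset!r}")
    params = dict(PRESETS[preset])  # copy so the preset table is never mutated
    params.update(overrides)
    return params


params = resolve_params("endgame", num_leaves=63, min_child_samples=50)
```

With preset='custom' the starting dictionary is empty, so only the explicitly passed hyperparameters remain.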

The 'endgame' preset sets learning_rate=0.01, n_estimators=10000, and relies on early stopping to find the optimal number of rounds. Always pass a validation set when using this preset.
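
The reason a validation set matters can be seen in a minimal sketch of patience-based early stopping. The function best_round and the loss values are illustrative assumptions; the underlying GBDT libraries implement their own variants of this rule.

```python
def best_round(val_losses, patience=3):
    """Return the 1-based round with the lowest validation loss,
    stopping once `patience` consecutive rounds fail to improve on it."""
    best_loss = float("inf")
    best_idx = 0
    for i, loss in enumerate(val_losses, start=1):
        if loss < best_loss:
            best_loss, best_idx = loss, i
        elif i - best_idx >= patience:
            break  # no improvement for `patience` rounds: stop boosting
    return best_idx


# Validation loss falls, then rises as the model starts to overfit.
losses = [0.69, 0.60, 0.55, 0.53, 0.54, 0.56, 0.58, 0.50]
stopped_at = best_round(losses, patience=3)
```

Without validation losses there is nothing to monitor, and a run configured with n_estimators=10000 would simply train every round.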


GBDTs

Gradient boosted decision trees are the default choice for tabular competitions. All three wrappers share the same interface via GBDTWrapper.

from endgame.models import LGBMWrapper, XGBWrapper, CatBoostWrapper

# LightGBM — fastest training, best default performance
lgbm = LGBMWrapper(preset='endgame')
lgbm.fit(X_train, y_train, eval_set=[(X_val, y_val)])
proba = lgbm.predict_proba(X_test)

# XGBoost — strong GPU support, wider ecosystem integration
xgb = XGBWrapper(preset='endgame', use_gpu=True)
xgb.fit(X_train, y_train, eval_set=[(X_val, y_val)])  # 'endgame' relies on early stopping

# CatBoost — native categorical feature handling, often best out of the box
catboost = CatBoostWrapper(preset='endgame', categorical_features=['city', 'product'])
catboost.fit(X_train, y_train, eval_set=[(X_val, y_val)])  # 'endgame' relies on early stopping

Feature importances are available via model.feature_importances_ after fitting.


Deep Tabular Models

Deep learning models for tabular data. These require PyTorch and are imported from endgame.models.tabular. They tend to shine on datasets with many categorical features or when pre-trained representations are available.

FT-Transformer

Feature Tokenizer + Transformer. Strong general-purpose deep tabular model.

from endgame.models.tabular import FTTransformerClassifier

ft = FTTransformerClassifier(
    d_token=192,
    n_blocks=3,
    attention_dropout=0.2,
    n_epochs=100,
    batch_size=512,
)
ft.fit(X_train, y_train)
proba = ft.predict_proba(X_test)

SAINT

Self-Attention and Intersample Attention Transformer. Captures both feature-level and sample-level interactions.

from endgame.models.tabular import SAINTClassifier

saint = SAINTClassifier(depth=6, heads=8, n_epochs=50)
saint.fit(X_train, y_train)

NODE

Neural Oblivious Decision Ensembles. Differentiable tree structure — fast and competitive with GBDTs on structured data.

from endgame.models.tabular import NODEClassifier

node = NODEClassifier(num_trees=2048, tree_depth=6, n_epochs=50)
node.fit(X_train, y_train)
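
For intuition, an oblivious tree applies the same feature and threshold to every node at a given depth, so a leaf is addressed by a short binary code. The sketch below shows the hard (non-differentiable) version with made-up splits and leaf values; NODE replaces these hard comparisons with soft, differentiable ones, which this example does not attempt to reproduce.

```python
def oblivious_leaf_index(x, splits):
    """Leaf index of row `x` in an oblivious tree.

    `splits` holds one (feature_index, threshold) pair per depth level;
    every node at that level shares the same test, so the leaf index is
    just the comparison results read as a binary number.
    """
    index = 0
    for feature, threshold in splits:
        bit = 1 if x[feature] > threshold else 0
        index = (index << 1) | bit
    return index


splits = [(0, 0.5), (2, 10.0)]          # depth-2 tree -> 4 leaves
leaf_values = [0.1, 0.4, 0.6, 0.9]      # hypothetical leaf outputs
row = [0.7, 3.0, 4.0]
prediction = leaf_values[oblivious_leaf_index(row, splits)]
```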

NAM

Neural Additive Models. Each feature is modeled by an independent neural network, enabling per-feature shape functions with neural expressiveness.

from endgame.models.tabular import NAMClassifier

nam = NAMClassifier(hidden_units=[64, 64], n_epochs=100)
nam.fit(X_train, y_train)
# Access per-feature shape functions
contributions = nam.feature_contributions(X_test)
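
The additive structure is what makes the model inspectable: the logit is a bias plus one term per feature. A minimal sketch, assuming a binary classifier and using simple closed-form functions as stand-ins for the trained per-feature networks (nam_predict_proba is a hypothetical helper, not an Endgame function):

```python
import math


def nam_predict_proba(x, shape_functions, bias=0.0):
    """Additive binary classifier: logit = bias + sum_j f_j(x_j)."""
    contributions = [f(value) for f, value in zip(shape_functions, x)]
    logit = bias + sum(contributions)
    probability = 1.0 / (1.0 + math.exp(-logit))
    return probability, contributions


# Stand-ins for trained per-feature networks (hypothetical shapes).
shape_functions = [
    lambda age: 0.02 * (age - 40),      # roughly linear effect
    lambda income: math.tanh(income),   # saturating effect
]
prob, contribs = nam_predict_proba([50, 0.0], shape_functions, bias=-0.2)
```

Because each contribution depends on a single feature, plotting f_j over the range of feature j gives that feature's shape function.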

GANDALF

Gated Adaptive Network for Deep Automated Learning of Features. Requires the pytorch-tabular package and should be imported directly.

from endgame.models.tabular.gandalf import GANDALFClassifier

gandalf = GANDALFClassifier(gflu_stages=6, n_epochs=100)
gandalf.fit(X_train, y_train)

TabularResNet

Residual network architecture adapted for tabular data. Straightforward and reliable with normalization and skip connections.

from endgame.models.tabular import TabularResNetClassifier

resnet = TabularResNetClassifier(
    hidden_dim=256,
    n_layers=4,
    dropout=0.1,
    n_epochs=100,
)
resnet.fit(X_train, y_train)

Custom Trees

Rotation Forest

Applies PCA rotations to random feature subsets before building decision trees. Increases diversity substantially over standard random forests.

from endgame.models import RotationForestClassifier

rf = RotationForestClassifier(n_estimators=100, n_features_per_subset=3)
rf.fit(X_train, y_train)

C5.0

The classic C5.0 decision tree algorithm. Includes rule extraction, pruning, and boosting.

from endgame.models import C50Classifier

c50 = C50Classifier(n_trials=10, pruning=True)
c50.fit(X_train, y_train)
rules = c50.get_rules()  # Human-readable rule set

Oblique Random Forest

Uses linear combinations of features at each split, rather than axis-aligned splits. Captures diagonal decision boundaries.

from endgame.models import ObliqueRandomForestClassifier

orf = ObliqueRandomForestClassifier(n_estimators=100, max_depth=10)
orf.fit(X_train, y_train)
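
The difference between the two split types fits in a few lines. This is a generic sketch of the split tests themselves, with hand-picked weights rather than anything learned by ObliqueRandomForestClassifier.

```python
def axis_aligned_split(x, feature, threshold):
    """Standard tree test: compare a single feature with a threshold."""
    return x[feature] > threshold


def oblique_split(x, weights, threshold):
    """Oblique test: compare a weighted sum of features with a threshold."""
    projection = sum(w * v for w, v in zip(weights, x))
    return projection > threshold


# The boundary x0 + x1 = 1 is diagonal: one oblique test captures it,
# whereas axis-aligned trees need a staircase of splits to approximate it.
points = [[0.2, 0.3], [0.9, 0.4], [0.1, 0.95], [0.5, 0.4]]
sides = [oblique_split(p, weights=[1.0, 1.0], threshold=1.0) for p in points]
```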

Quantile Regressor Forest

Provides prediction intervals via quantile regression. Each leaf stores the full empirical distribution of training targets.

from endgame.models import QuantileRegressorForest

qrf = QuantileRegressorForest(n_estimators=200)
qrf.fit(X_train, y_train)
lower, median, upper = qrf.predict_quantiles(X_test, quantiles=[0.1, 0.5, 0.9])
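
To make the leaf-distribution idea concrete, the sketch below pools the training targets stored in the leaves that a query row lands in and reads quantiles off the pooled sample. It is a simplification under stated assumptions: the helper names are hypothetical, the leaf contents are made up, and published quantile regression forests weight observations rather than pooling them equally.

```python
def empirical_quantile(values, q):
    """Linearly interpolated quantile of a non-empty list, 0 <= q <= 1."""
    ordered = sorted(values)
    position = q * (len(ordered) - 1)
    lower = int(position)
    upper = min(lower + 1, len(ordered) - 1)
    fraction = position - lower
    return ordered[lower] + fraction * (ordered[upper] - ordered[lower])


def predict_quantiles(leaf_targets_per_tree, quantiles):
    """Pool the training targets from the leaf each tree routed the query to,
    then read the requested quantiles off the pooled sample."""
    pooled = [y for leaf in leaf_targets_per_tree for y in leaf]
    return [empirical_quantile(pooled, q) for q in quantiles]


# Targets stored in the leaf that each of three trees assigned to one query row.
leaves = [[10.0, 12.0, 14.0], [11.0, 13.0], [9.0, 12.0, 15.0, 18.0]]
lower, median, upper = predict_quantiles(leaves, [0.1, 0.5, 0.9])
```

The interval from lower to upper is an 80% prediction interval for this row under the pooled empirical distribution.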

Evolutionary Tree

Optimizes tree structure via evolutionary algorithms rather than greedy splitting. Because whole trees are evaluated at once, the search can find split combinations that greedy induction misses, at the cost of longer training time.

from endgame.models.trees.evtree import EvolutionaryTreeClassifier

evt = EvolutionaryTreeClassifier(population_size=100, n_generations=50)
evt.fit(X_train, y_train)

Rule-Based Models

RuleFit

Extracts decision rules from an ensemble of trees, then fits a sparse linear model over those rules. The result is a human-readable list of weighted conditions.

from endgame.models import RuleFitClassifier

rulefit = RuleFitClassifier(tree_size=4, max_rules=2000)
rulefit.fit(X_train, y_train)
rules_df = rulefit.get_rules()
print(rules_df[rules_df['importance'] > 0.01])
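
Conceptually, each rule becomes a binary feature and the final score is a sparse weighted sum of the rules that fire. The sketch below illustrates that scoring step only; the rule encoding, helper names, and coefficients are hypothetical and do not reflect the format returned by get_rules().

```python
def rule_fires(x, rule):
    """A rule is a list of (feature, op, threshold) conditions joined by AND."""
    for feature, op, threshold in rule:
        value = x[feature]
        if op == "<=" and not value <= threshold:
            return False
        if op == ">" and not value > threshold:
            return False
    return True


def rulefit_score(x, rules, coefficients, intercept=0.0):
    """Linear model over binary rule features: intercept + sum_k coef_k * r_k(x)."""
    return intercept + sum(
        coef for rule, coef in zip(rules, coefficients) if rule_fires(x, rule)
    )


# Hypothetical rules, as if read off paths of shallow trees.
rules = [
    [("age", ">", 30), ("income", "<=", 50_000)],
    [("age", "<=", 30)],
    [("income", ">", 80_000)],
]
coefficients = [0.8, -0.5, 0.0]  # the sparse fit zeroes out unhelpful rules
score = rulefit_score({"age": 45, "income": 42_000}, rules, coefficients, intercept=-0.1)
```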

FURIA

Fuzzy Unordered Rule Induction Algorithm. Produces fuzzy rule sets that handle overlapping class regions gracefully.

from endgame.models import FURIAClassifier

furia = FURIAClassifier(n_rules=20)
furia.fit(X_train, y_train)
rule_list = furia.rules_  # List of FuzzyRule objects
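
A fuzzy rule replaces hard interval bounds with soft ones, so an instance near a boundary satisfies the rule to a partial degree. The sketch below uses trapezoidal membership functions combined by a product, which is our reading of the FURIA formulation; the helper names and the example rule are hypothetical and unrelated to the FuzzyRule objects in rules_.

```python
import math


def trapezoid_membership(value, a, b, c, d):
    """Degree in [0, 1] to which `value` satisfies a fuzzy interval.

    Membership is 1 on the core [b, c] and falls linearly to 0 at the
    support bounds a and d (requires a <= b <= c <= d).
    """
    if b <= value <= c:
        return 1.0
    if a < value < b:
        return (value - a) / (b - a)
    if c < value < d:
        return (d - value) / (d - c)
    return 0.0


def rule_degree(x, antecedents):
    """Degree to which a rule fires: product of its conditions' memberships."""
    return math.prod(trapezoid_membership(x[f], *bounds) for f, bounds in antecedents)


# Hypothetical rule: temperature is "warm" AND humidity is "moderate".
antecedents = [("temp", (15, 20, 25, 30)), ("humidity", (30, 40, 60, 70))]
degree = rule_degree({"temp": 27.5, "humidity": 50}, antecedents)
```

A temperature of 27.5 sits halfway down the falling edge of "warm", so the rule fires with degree 0.5 instead of being rejected outright.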

Bayesian Network Classifiers

Bayesian classifiers are well-suited for small datasets where probabilistic structure is meaningful and calibrated probabilities are important.

TAN (Tree Augmented Naive Bayes)

Extends Naive Bayes by allowing each feature to have one additional parent (a single dependency tree over features).

from endgame.models import TANClassifier

tan = TANClassifier()
tan.fit(X_train, y_train)
proba = tan.predict_proba(X_test)

KDB (k-Dependence Bayesian)

Generalizes TAN by allowing each feature to depend on up to k other features. Higher k captures more complex dependencies at the cost of data requirements.

from endgame.models import KDBClassifier

kdb = KDBClassifier(k=2)
kdb.fit(X_train, y_train)

ESKDB (Ensemble Smoothed KDB)

Ensemble of KDB classifiers with Laplace smoothing and random structure perturbation for improved accuracy.

from endgame.models import ESKDBClassifier

eskdb = ESKDBClassifier(k=2, n_estimators=50)
eskdb.fit(X_train, y_train)

Kernel Methods

Gaussian Process Classifier

Provides well-calibrated probabilistic predictions with uncertainty estimates. Exact GP scales as O(n^3), so use on datasets below ~5,000 samples.

from endgame.models import GPClassifier
from sklearn.gaussian_process.kernels import Matern

gp = GPClassifier(kernel=Matern(nu=2.5), n_restarts_optimizer=5)
gp.fit(X_train, y_train)
proba = gp.predict_proba(X_test)  # Well-calibrated probabilities

SVM Classifier

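Support Vector Machine with kernel selection. Competitive on datasets up to roughly 50,000 samples; beyond that, kernel SVM training becomes slow because its cost grows at least quadratically with the number of samples.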

from endgame.models import SVMClassifier

svm = SVMClassifier(kernel='rbf', C=10.0, probability=True)
svm.fit(X_train, y_train)

Interpretable Models

These models are suitable for regulated industries where predictions must be auditable or explained to non-technical stakeholders.

EBM (Explainable Boosting Machine)

EBMs are generalized additive models trained with gradient boosting. They achieve near-GBDT accuracy while remaining fully interpretable via shape functions for each feature and pairwise interaction.

from endgame.models import EBMClassifier

ebm = EBMClassifier(interactions=15, max_bins=256)
ebm.fit(X_train, y_train)

# Inspect global explanation
ebm.explain_global()

# Local explanation for a single prediction
ebm.explain_local(X_test[:5])

EBMs are available for both classification (EBMClassifier) and regression (EBMRegressor).

MARS (Multivariate Adaptive Regression Splines)

Fits piecewise linear splines with automatic knot selection. Produces explicit mathematical expressions for each prediction.

from endgame.models import MARSClassifier

mars = MARSClassifier(max_degree=2, max_terms=20)
mars.fit(X_train, y_train)
print(mars.summary())  # Equation with hinge functions
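
The hinge functions mentioned in the comment are simple to evaluate by hand. The sketch below computes the raw additive output of a small MARS-style model with made-up knots and coefficients; for a classifier that raw output would still need to be mapped to a class or probability, a step not shown here.

```python
def hinge(x, knot, direction=+1):
    """Hinge basis function: max(0, x - knot) or, mirrored, max(0, knot - x)."""
    return max(0.0, direction * (x - knot))


def mars_predict(x, intercept, terms):
    """Evaluate intercept + sum of coef * product of hinges.

    Each term is (coef, [(feature_index, knot, direction), ...]); a term with
    two hinges is a degree-2 interaction, matching max_degree=2.
    """
    total = intercept
    for coef, hinges in terms:
        product = 1.0
        for feature, knot, direction in hinges:
            product *= hinge(x[feature], knot, direction)
        total += coef * product
    return total


# Hypothetical fitted model: 1.0 + 2.0*max(0, x0-3) - 0.5*max(0, 3-x0)*max(0, x1-1)
terms = [(2.0, [(0, 3.0, +1)]), (-0.5, [(0, 3.0, -1), (1, 1.0, +1)])]
y_hat = mars_predict([5.0, 4.0], intercept=1.0, terms=terms)
```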

Symbolic Regression

Discovers explicit mathematical formulas via genetic programming. Best for scientific applications where the functional form matters.

from endgame.models import SymbolicRegressor

sr = SymbolicRegressor(
    population_size=1000,
    generations=20,
    function_set=['add', 'mul', 'sqrt', 'log'],
)
sr.fit(X_train, y_train)
print(sr.best_program_)  # e.g., "0.42 * x1 + sqrt(x2) - 1.7"

Neural Models

ELM (Extreme Learning Machine)

Single hidden-layer network where input weights are randomly assigned and only the output layer is trained. Extremely fast, useful as a cheap ensemble member.

from endgame.models import ELMClassifier

elm = ELMClassifier(n_hidden=1000, activation='relu')
elm.fit(X_train, y_train)

Embedding MLP

MLP with learned entity embeddings for categorical features. Effective when categorical cardinality is high (cities, products, user IDs).

from endgame.models.neural import EmbeddingMLPClassifier

mlp = EmbeddingMLPClassifier(
    cat_features=['city', 'product'],
    hidden_layers=[256, 128, 64],
    dropout=0.3,
    n_epochs=100,
)
mlp.fit(X_train, y_train)

TabNet

Attention-based neural network using sequential attention to select features at each decision step. Provides built-in feature importance.

from endgame.models.neural import TabNetClassifier

tabnet = TabNetClassifier(n_steps=5, gamma=1.5, n_epochs=100)
tabnet.fit(X_train, y_train)
importances = tabnet.feature_importances_

Probabilistic Models

NGBoost

Natural Gradient Boosting outputs full probability distributions rather than point estimates. Use when calibrated uncertainty is required.

from endgame.models import NGBoostClassifier

ngb = NGBoostClassifier(n_estimators=500, learning_rate=0.01)
ngb.fit(X_train, y_train)

# Returns probability distributions, not just point estimates
distributions = ngb.pred_dist(X_test)
proba = ngb.predict_proba(X_test)

BART (Bayesian Additive Regression Trees)

Fully Bayesian nonparametric model providing posterior distributions over predictions. Requires pymc and pymc-bart.

from endgame.models import BARTClassifier

bart = BARTClassifier(m=50, n_samples=1000, tune=500)
bart.fit(X_train, y_train)
proba = bart.predict_proba(X_test)
credible_intervals = bart.predict_interval(X_test, hdi_prob=0.94)

Foundation Models

TabPFN

TabPFN is a prior-fitted network pre-trained on millions of synthetic tabular datasets. It performs in-context learning: the training set is passed to the network as context, so no gradient-based training is run on your data.

from endgame.models.tabular import TabPFNClassifier

# No training loop — model uses the dataset as context directly
tabpfn = TabPFNClassifier(n_ensemble_configurations=32)
tabpfn.fit(X_train, y_train)  # Stores context, no gradient updates
proba = tabpfn.predict_proba(X_test)

TabPFN works best on datasets with fewer than 10,000 samples and fewer than 100 features. For larger datasets, use TabPFNv2 or TabPFN25:

from endgame.models.tabular import TabPFNv2Classifier, TabPFN25Classifier

# v2 — extended context window, improved accuracy
tabpfn_v2 = TabPFNv2Classifier()
tabpfn_v2.fit(X_train, y_train)

Because TabPFN has large optional dependencies, import it directly from the submodule rather than from endgame.models.


Baseline Models

Lightweight models useful for ensemble diversity and benchmarking.

from endgame.models import (
    NaiveBayesClassifier,
    LDAClassifier,
    QDAClassifier,
    RDAClassifier,
    KNNClassifier,
    LinearClassifier,
)

# Linear discriminant analysis — fast, good baseline for linearly separable data
lda = LDAClassifier(solver='svd')
lda.fit(X_train, y_train)

# Regularized discriminant analysis — blend of LDA and QDA
rda = RDAClassifier(alpha=0.5)
rda.fit(X_train, y_train)

# KNN: strong baseline; fit only stores the training data, the work happens at predict time
knn = KNNClassifier(n_neighbors=15, weights='distance')
knn.fit(X_train, y_train)
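
To show what weights='distance' means in practice, here is a minimal sketch of a distance-weighted nearest-neighbor vote in plain Python. The function knn_predict_proba is a hypothetical stand-in, not KNNClassifier's implementation, and it uses Euclidean distance with inverse-distance weights as an assumption.

```python
import math
from collections import defaultdict


def knn_predict_proba(X_train, y_train, query, n_neighbors=3):
    """Distance-weighted k-NN vote; all work happens at prediction time."""
    distances = sorted(
        (math.dist(row, query), label) for row, label in zip(X_train, y_train)
    )
    votes = defaultdict(float)
    for distance, label in distances[:n_neighbors]:
        votes[label] += 1.0 / (distance + 1e-12)  # closer neighbors count more
    total = sum(votes.values())
    return {label: weight / total for label, weight in votes.items()}


X_train = [[0.0, 0.0], [0.0, 1.0], [3.0, 3.0], [4.0, 4.0]]
y_train = ["a", "a", "b", "b"]
proba = knn_predict_proba(X_train, y_train, query=[0.0, 0.5], n_neighbors=3)
```

The one distant "b" neighbor still gets a vote, but its weight is small compared with the two nearby "a" points.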

Model Selection Guidance

Use the following heuristics as a starting point:

| Situation | Recommended approach |
| --- | --- |
| Small dataset (< 1,000 samples) | TANClassifier, ESKDBClassifier, TabPFNClassifier, GPClassifier |
| Medium dataset (1K–100K samples) | LGBMWrapper or XGBWrapper with preset='endgame' |
| Large dataset (> 100K samples) | LGBMWrapper, FTTransformerClassifier, TabularResNetClassifier |
| High-cardinality categoricals | CatBoostWrapper, EmbeddingMLPClassifier, SAINTClassifier |
| Interpretability required | EBMClassifier, RuleFitClassifier, MARSClassifier |
| Regulatory compliance | EBMClassifier, SymbolicRegressor, C50Classifier (with get_rules()) |
| Calibrated uncertainty | NGBoostClassifier, BARTClassifier, GPClassifier |
| Little or no time budget for training | TabPFNClassifier (in-context learning), ELMClassifier |
| Ensembling diversity | Mix families: GBDT + rotation forest + ELM + KNN |
| Time series classification | See eg.timeseries (MiniRocketClassifier, HydraClassifier) |

A practical workflow for competitions:

  1. Start with LGBMWrapper(preset='endgame') as your baseline.

  2. Run eg.benchmark or eg.quick.compare() to survey model families.

  3. Build a diverse set of out-of-fold predictions from multiple families.

  4. Use eg.ensemble.HillClimbingEnsemble or eg.ensemble.StackingEnsemble to combine them.

  5. Calibrate probabilities with eg.calibration if log-loss is the metric.
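
Steps 3 and 4 can be sketched as greedy forward selection with replacement over out-of-fold predictions. This is a generic illustration of hill-climbing ensembling under our own assumptions, not the implementation behind eg.ensemble.HillClimbingEnsemble; the function names, model names, and probabilities are made up.

```python
import math


def log_loss(y_true, probs, eps=1e-15):
    """Mean binary log-loss for positive-class probabilities."""
    total = 0.0
    for y, p in zip(y_true, probs):
        p = min(max(p, eps), 1 - eps)
        total -= y * math.log(p) + (1 - y) * math.log(1 - p)
    return total / len(y_true)


def hill_climb(oof_preds, y_true, max_steps=10):
    """Greedy selection with replacement: at each step add the model whose
    inclusion most lowers the log-loss of the averaged prediction."""
    n = len(y_true)
    selected, running_sum, best = [], [0.0] * n, float("inf")
    for _ in range(max_steps):
        step_best_name, step_best_loss = None, best
        for name, preds in oof_preds.items():
            blend = [(s + p) / (len(selected) + 1) for s, p in zip(running_sum, preds)]
            loss = log_loss(y_true, blend)
            if loss < step_best_loss:
                step_best_name, step_best_loss = name, loss
        if step_best_name is None:
            break  # no candidate improves the blend
        selected.append(step_best_name)
        running_sum = [s + p for s, p in zip(running_sum, oof_preds[step_best_name])]
        best = step_best_loss
    return selected, best


y = [1, 0, 1, 0]
oof = {  # hypothetical out-of-fold positive-class probabilities
    "gbdt": [0.9, 0.2, 0.6, 0.4],
    "knn": [0.6, 0.4, 0.9, 0.1],
    "elm": [0.5, 0.5, 0.5, 0.5],
}
selected, loss = hill_climb(oof, y)
weights = {name: selected.count(name) / len(selected) for name in set(selected)}
```

Selecting a model more than once is how the procedure assigns it a larger weight, and a model that never lowers the blended loss is left out entirely.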


See Also