Models
- class endgame.models.GBDTWrapper(backend='lightgbm', task='auto', preset='endgame', use_gpu='auto', categorical_features=None, early_stopping_rounds=100, random_state=None, verbose=False, **kwargs)
Bases: EndgameEstimator
Unified interface for XGBoost, LightGBM, and CatBoost.
Provides a consistent API across gradient boosting frameworks, with competition-tuned default parameters.
- Parameters:
backend (str, default='lightgbm') – Boosting library: ‘xgboost’, ‘lightgbm’, ‘catboost’.
task (str, default='auto') – Task type: ‘auto’, ‘classification’, ‘regression’.
preset (str, default='endgame') – Hyperparameter preset: ‘endgame’, ‘fast’, ‘overfit’, ‘custom’.
use_gpu (bool or str, default='auto') – Enable GPU: True, False, or ‘auto’ (auto-detect).
categorical_features (List[str], optional) – Columns to treat as categorical.
early_stopping_rounds (int, default=100) – Early stopping patience.
random_state (int, optional) – Random seed.
verbose (bool, default=False) – Enable verbose output.
**kwargs – Override preset parameters.
- model_
Fitted underlying model.
- Type:
estimator
Examples
>>> from endgame.models import GBDTWrapper
>>> model = GBDTWrapper(backend='lightgbm', preset='endgame')
>>> model.fit(X_train, y_train, eval_set=[(X_val, y_val)])
>>> predictions = model.predict(X_test)
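The `**kwargs` parameter is documented as overriding preset parameters. A minimal sketch of how such a merge could work, assuming presets are plain dictionaries. The `PRESETS` table, its values, and `resolve_params` are hypothetical illustrations and are not taken from endgame:

```python
# Hypothetical preset table; the real endgame presets are not listed on this page.
PRESETS = {
    "endgame": {"learning_rate": 0.03, "n_estimators": 5000},
    "fast": {"learning_rate": 0.1, "n_estimators": 500},
    "custom": {},
}


def resolve_params(preset="endgame", **overrides):
    """Start from the preset's parameters, then let explicit kwargs win."""
    if preset not in PRESETS:
        raise ValueError(f"unknown preset: {preset!r}")
    params = dict(PRESETS[preset])  # copy so the preset table is not mutated
    params.update(overrides)
    return params


params = resolve_params("fast", learning_rate=0.05)
print(params)
```

Copying the preset before updating keeps one call's overrides from leaking into later calls.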
- fit(X, y, eval_set=None, sample_weight=None, **fit_params)
Fit the model.
- Parameters:
X (array-like) – Training features.
y (array-like) – Target values.
eval_set (List[Tuple], optional) – Validation set(s) for early stopping.
sample_weight (array-like, optional) – Sample weights.
**fit_params – Additional fit parameters.
- Returns:
self
- predict(X)
Predict target values.
- Parameters:
X (array-like) – Features to predict.
- Returns:
ndarray – Predictions.
- predict_proba(X)
Predict class probabilities.
- Parameters:
X (array-like) – Features to predict.
- Returns:
ndarray – Class probabilities.
- score(X, y, sample_weight=None)
Return the score on the given data.
For classification, returns accuracy. For regression, returns R² score.
- Parameters:
X (array-like) – Test features.
y (array-like) – True labels or target values.
sample_weight (array-like, optional) – Sample weights.
- Returns:
float – Score.
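For reference, the two metrics named for `score` can be written out directly. This is a plain-Python sketch of the unweighted definitions (it ignores `sample_weight` and assumes a non-constant regression target); it is not the wrapper's own code:

```python
def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true labels."""
    correct = sum(1 for t, p in zip(y_true, y_pred) if t == p)
    return correct / len(y_true)


def r2_score(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


acc = accuracy([0, 1, 1, 0], [0, 1, 0, 0])      # 3 of 4 correct
r2 = r2_score([1.0, 2.0, 3.0], [1.0, 2.0, 2.0])  # SS_res = 1, SS_tot = 2
print(acc, r2)
```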
- set_fit_request(*, eval_set='$UNCHANGED$', sample_weight='$UNCHANGED$')
Configure whether metadata should be requested to be passed to the fit method.
Note that this method is only relevant when this estimator is used as a sub-estimator within a meta-estimator and metadata routing is enabled with enable_metadata_routing=True (see sklearn.set_config()). Please check the User Guide on how the routing mechanism works.
The options for each parameter are:
- True: metadata is requested, and passed to fit if provided. The request is ignored if metadata is not provided.
- False: metadata is not requested and the meta-estimator will not pass it to fit.
- None: metadata is not requested, and the meta-estimator will raise an error if the user provides it.
- str: metadata should be passed to the meta-estimator with this given alias instead of the original name.
The default (sklearn.utils.metadata_routing.UNCHANGED) retains the existing request. This allows you to change the request for some parameters and not others.
Added in version 1.3.
- Parameters:
eval_set (str, True, False, or None, default=sklearn.utils.metadata_routing.UNCHANGED) – Metadata routing for the eval_set parameter in fit.
sample_weight (str, True, False, or None, default=sklearn.utils.metadata_routing.UNCHANGED) – Metadata routing for the sample_weight parameter in fit.
- Returns:
self (object) – The updated object.
- set_score_request(*, sample_weight='$UNCHANGED$')
Configure whether metadata should be requested to be passed to the score method.
Note that this method is only relevant when this estimator is used as a sub-estimator within a meta-estimator and metadata routing is enabled with enable_metadata_routing=True (see sklearn.set_config()). Please check the User Guide on how the routing mechanism works.
The options for each parameter are:
- True: metadata is requested, and passed to score if provided. The request is ignored if metadata is not provided.
- False: metadata is not requested and the meta-estimator will not pass it to score.
- None: metadata is not requested, and the meta-estimator will raise an error if the user provides it.
- str: metadata should be passed to the meta-estimator with this given alias instead of the original name.
The default (sklearn.utils.metadata_routing.UNCHANGED) retains the existing request. This allows you to change the request for some parameters and not others.
Added in version 1.3.
- Parameters:
sample_weight (str, True, False, or None, default=sklearn.utils.metadata_routing.UNCHANGED) – Metadata routing for the sample_weight parameter in score.
- Returns:
self (object) – The updated object.
- class endgame.models.LGBMWrapper(preset='endgame', task='auto', use_goss=False, use_gpu='auto', categorical_features=None, early_stopping_rounds=100, random_state=None, verbose=False, **kwargs)
Bases: GBDTWrapper
LightGBM-specific wrapper with additional features.
- Parameters:
preset (str, default='endgame') – Hyperparameter preset.
task (str, default='auto') – Task type: ‘auto’, ‘classification’, ‘regression’.
use_goss (bool, default=False) – Use Gradient-based One-Side Sampling.
**kwargs – Additional parameters.
early_stopping_rounds (int)
random_state (int | None)
verbose (bool)
Examples
>>> from endgame.models import LGBMWrapper
>>> model = LGBMWrapper(preset='endgame')
>>> model.fit(X_train, y_train, eval_set=[(X_val, y_val)])
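The `use_goss` flag refers to Gradient-based One-Side Sampling. As a rough illustration of the idea (not LightGBM's or endgame's implementation), the sketch below keeps the rows with the largest absolute gradients, randomly samples some of the rest, and up-weights the sampled rows. The names `top_rate` and `other_rate` mirror LightGBM's parameter names; the function `goss_sample` itself is hypothetical:

```python
import random


def goss_sample(gradients, top_rate=0.2, other_rate=0.1, seed=0):
    """Return (indices, weights) selected by a GOSS-style rule."""
    n = len(gradients)
    # Rank rows by absolute gradient, largest first.
    order = sorted(range(n), key=lambda i: abs(gradients[i]), reverse=True)
    n_top = round(top_rate * n)
    n_other = round(other_rate * n)
    top = order[:n_top]
    rest = order[n_top:]
    sampled = random.Random(seed).sample(rest, n_other)
    # Up-weight the sampled small-gradient rows so their total contribution
    # stays roughly what it was before subsampling.
    amplify = (1.0 - top_rate) / other_rate
    weights = [1.0] * n_top + [amplify] * n_other
    return top + sampled, weights


grads = [0.9, -0.05, 0.02, -0.8, 0.1, 0.01, -0.03, 0.04, 0.6, -0.02]
idx, w = goss_sample(grads, top_rate=0.2, other_rate=0.3)
print(idx, w)
```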
- set_fit_request(*, eval_set='$UNCHANGED$', sample_weight='$UNCHANGED$')
Configure whether metadata should be requested to be passed to the fit method.
Note that this method is only relevant when this estimator is used as a sub-estimator within a meta-estimator and metadata routing is enabled with enable_metadata_routing=True (see sklearn.set_config()). Please check the User Guide on how the routing mechanism works.
The options for each parameter are:
- True: metadata is requested, and passed to fit if provided. The request is ignored if metadata is not provided.
- False: metadata is not requested and the meta-estimator will not pass it to fit.
- None: metadata is not requested, and the meta-estimator will raise an error if the user provides it.
- str: metadata should be passed to the meta-estimator with this given alias instead of the original name.
The default (sklearn.utils.metadata_routing.UNCHANGED) retains the existing request. This allows you to change the request for some parameters and not others.
Added in version 1.3.
- Parameters:
eval_set (str, True, False, or None, default=sklearn.utils.metadata_routing.UNCHANGED) – Metadata routing for the eval_set parameter in fit.
sample_weight (str, True, False, or None, default=sklearn.utils.metadata_routing.UNCHANGED) – Metadata routing for the sample_weight parameter in fit.
- Returns:
self (object) – The updated object.
- set_score_request(*, sample_weight='$UNCHANGED$')
Configure whether metadata should be requested to be passed to the score method.
Note that this method is only relevant when this estimator is used as a sub-estimator within a meta-estimator and metadata routing is enabled with enable_metadata_routing=True (see sklearn.set_config()). Please check the User Guide on how the routing mechanism works.
The options for each parameter are:
- True: metadata is requested, and passed to score if provided. The request is ignored if metadata is not provided.
- False: metadata is not requested and the meta-estimator will not pass it to score.
- None: metadata is not requested, and the meta-estimator will raise an error if the user provides it.
- str: metadata should be passed to the meta-estimator with this given alias instead of the original name.
The default (sklearn.utils.metadata_routing.UNCHANGED) retains the existing request. This allows you to change the request for some parameters and not others.
Added in version 1.3.
- Parameters:
sample_weight (str, True, False, or None, default=sklearn.utils.metadata_routing.UNCHANGED) – Metadata routing for the sample_weight parameter in score.
- Returns:
self (object) – The updated object.
- class endgame.models.XGBWrapper(preset='endgame', task='auto', use_dart=False, use_gpu='auto', categorical_features=None, early_stopping_rounds=100, random_state=None, verbose=False, **kwargs)
Bases: GBDTWrapper
XGBoost-specific wrapper with additional features.
- Parameters:
Examples
>>> from endgame.models import XGBWrapper
>>> model = XGBWrapper(preset='endgame')
>>> model.fit(X_train, y_train, eval_set=[(X_val, y_val)])
- set_fit_request(*, eval_set='$UNCHANGED$', sample_weight='$UNCHANGED$')
Configure whether metadata should be requested to be passed to the fit method.
Note that this method is only relevant when this estimator is used as a sub-estimator within a meta-estimator and metadata routing is enabled with enable_metadata_routing=True (see sklearn.set_config()). Please check the User Guide on how the routing mechanism works.
The options for each parameter are:
- True: metadata is requested, and passed to fit if provided. The request is ignored if metadata is not provided.
- False: metadata is not requested and the meta-estimator will not pass it to fit.
- None: metadata is not requested, and the meta-estimator will raise an error if the user provides it.
- str: metadata should be passed to the meta-estimator with this given alias instead of the original name.
The default (sklearn.utils.metadata_routing.UNCHANGED) retains the existing request. This allows you to change the request for some parameters and not others.
Added in version 1.3.
- Parameters:
eval_set (str, True, False, or None, default=sklearn.utils.metadata_routing.UNCHANGED) – Metadata routing for the eval_set parameter in fit.
sample_weight (str, True, False, or None, default=sklearn.utils.metadata_routing.UNCHANGED) – Metadata routing for the sample_weight parameter in fit.
- Returns:
self (object) – The updated object.
- set_score_request(*, sample_weight='$UNCHANGED$')
Configure whether metadata should be requested to be passed to the score method.
Note that this method is only relevant when this estimator is used as a sub-estimator within a meta-estimator and metadata routing is enabled with enable_metadata_routing=True (see sklearn.set_config()). Please check the User Guide on how the routing mechanism works.
The options for each parameter are:
- True: metadata is requested, and passed to score if provided. The request is ignored if metadata is not provided.
- False: metadata is not requested and the meta-estimator will not pass it to score.
- None: metadata is not requested, and the meta-estimator will raise an error if the user provides it.
- str: metadata should be passed to the meta-estimator with this given alias instead of the original name.
The default (sklearn.utils.metadata_routing.UNCHANGED) retains the existing request. This allows you to change the request for some parameters and not others.
Added in version 1.3.
- Parameters:
sample_weight (str, True, False, or None, default=sklearn.utils.metadata_routing.UNCHANGED) – Metadata routing for the sample_weight parameter in score.
- Returns:
self (object) – The updated object.
- class endgame.models.CatBoostWrapper(preset='endgame', task='auto', auto_class_weights=None, use_gpu='auto', categorical_features=None, early_stopping_rounds=100, random_state=None, verbose=False, **kwargs)
Bases: GBDTWrapper
CatBoost-specific wrapper with native categorical handling.
- Parameters:
preset (str, default='endgame') – Hyperparameter preset.
task (str, default='auto') – Task type: ‘auto’, ‘classification’, ‘regression’.
auto_class_weights (str, optional) – Auto class weighting: ‘Balanced’, ‘SqrtBalanced’.
**kwargs – Additional parameters.
early_stopping_rounds (int)
random_state (int | None)
verbose (bool)
Examples
>>> from endgame.models import CatBoostWrapper
>>> model = CatBoostWrapper(preset='endgame', categorical_features=['category'])
>>> model.fit(X_train, y_train, eval_set=[(X_val, y_val)])
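To give a feel for what the two `auto_class_weights` modes do, here is a small sketch of the weighting rules as CatBoost documents them for unweighted data: 'Balanced' scales each class by the largest class count divided by its own count, and 'SqrtBalanced' takes the square root of that ratio. The helper `class_weights` is hypothetical and is not part of endgame or CatBoost:

```python
from collections import Counter
from math import sqrt


def class_weights(labels, mode="Balanced"):
    """Per-class weights from label counts (all rows weighted equally)."""
    counts = Counter(labels)
    largest = max(counts.values())
    if mode == "Balanced":
        return {c: largest / n for c, n in counts.items()}
    if mode == "SqrtBalanced":
        return {c: sqrt(largest / n) for c, n in counts.items()}
    raise ValueError(f"unknown mode: {mode!r}")


labels = [0] * 8 + [1] * 2
balanced = class_weights(labels, "Balanced")
sqrt_balanced = class_weights(labels, "SqrtBalanced")
print(balanced, sqrt_balanced)
```

With an 8-to-2 class split, the minority class is weighted 4x under 'Balanced' and 2x under 'SqrtBalanced', so the latter is the milder correction.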
- set_fit_request(*, eval_set='$UNCHANGED$', sample_weight='$UNCHANGED$')
Configure whether metadata should be requested to be passed to the fit method.
Note that this method is only relevant when this estimator is used as a sub-estimator within a meta-estimator and metadata routing is enabled with enable_metadata_routing=True (see sklearn.set_config()). Please check the User Guide on how the routing mechanism works.
The options for each parameter are:
- True: metadata is requested, and passed to fit if provided. The request is ignored if metadata is not provided.
- False: metadata is not requested and the meta-estimator will not pass it to fit.
- None: metadata is not requested, and the meta-estimator will raise an error if the user provides it.
- str: metadata should be passed to the meta-estimator with this given alias instead of the original name.
The default (sklearn.utils.metadata_routing.UNCHANGED) retains the existing request. This allows you to change the request for some parameters and not others.
Added in version 1.3.
- Parameters:
eval_set (str, True, False, or None, default=sklearn.utils.metadata_routing.UNCHANGED) – Metadata routing for the eval_set parameter in fit.
sample_weight (str, True, False, or None, default=sklearn.utils.metadata_routing.UNCHANGED) – Metadata routing for the sample_weight parameter in fit.
- Returns:
self (object) – The updated object.
- set_score_request(*, sample_weight='$UNCHANGED$')
Configure whether metadata should be requested to be passed to the score method.
Note that this method is only relevant when this estimator is used as a sub-estimator within a meta-estimator and metadata routing is enabled with enable_metadata_routing=True (see sklearn.set_config()). Please check the User Guide on how the routing mechanism works.
The options for each parameter are:
- True: metadata is requested, and passed to score if provided. The request is ignored if metadata is not provided.
- False: metadata is not requested and the meta-estimator will not pass it to score.
- None: metadata is not requested, and the meta-estimator will raise an error if the user provides it.
- str: metadata should be passed to the meta-estimator with this given alias instead of the original name.
The default (sklearn.utils.metadata_routing.UNCHANGED) retains the existing request. This allows you to change the request for some parameters and not others.
Added in version 1.3.
- Parameters:
sample_weight (str, True, False, or None, default=sklearn.utils.metadata_routing.UNCHANGED) – Metadata routing for the sample_weight parameter in score.
- Returns:
self (object) – The updated object.
- class endgame.models.RotationForestClassifier(n_estimators=10, n_subsets=3, max_features=0.5, base_estimator=None, bootstrap=True, random_state=None, n_jobs=1, verbose=False)
Bases: ClassifierMixin, BaseRotationForest
Rotation Forest for classification.
- Parameters:
n_estimators (int, default=10) – Number of trees.
n_subsets (int, default=3) – Number of feature subsets per tree.
max_features (float, default=0.5) – Fraction of features per subset.
base_estimator (estimator, optional) – Base tree. Default: DecisionTreeClassifier.
bootstrap (bool, default=True) – Bootstrap samples.
random_state (int, optional) – Random seed.
n_jobs (int)
verbose (bool)
Examples
>>> from endgame.models import RotationForestClassifier
>>> clf = RotationForestClassifier(n_estimators=20)
>>> clf.fit(X_train, y_train)
>>> predictions = clf.predict(X_test)
- predict(X)
Predict class labels.
- Parameters:
X (array-like of shape (n_samples, n_features)) – Samples to predict.
- Returns:
ndarray of shape (n_samples,) – Predicted class labels.
- predict_proba(X)
Predict class probabilities.
- Parameters:
X (array-like of shape (n_samples, n_features)) – Samples to predict.
- Returns:
ndarray of shape (n_samples, n_classes) – Class probabilities.
- set_score_request(*, sample_weight='$UNCHANGED$')
Configure whether metadata should be requested to be passed to the score method.
Note that this method is only relevant when this estimator is used as a sub-estimator within a meta-estimator and metadata routing is enabled with enable_metadata_routing=True (see sklearn.set_config()). Please check the User Guide on how the routing mechanism works.
The options for each parameter are:
- True: metadata is requested, and passed to score if provided. The request is ignored if metadata is not provided.
- False: metadata is not requested and the meta-estimator will not pass it to score.
- None: metadata is not requested, and the meta-estimator will raise an error if the user provides it.
- str: metadata should be passed to the meta-estimator with this given alias instead of the original name.
The default (sklearn.utils.metadata_routing.UNCHANGED) retains the existing request. This allows you to change the request for some parameters and not others.
Added in version 1.3.
- Parameters:
sample_weight (str, True, False, or None, default=sklearn.utils.metadata_routing.UNCHANGED) – Metadata routing for the sample_weight parameter in score.
- Returns:
self (object) – The updated object.
- class endgame.models.RotationForestRegressor(n_estimators=10, n_subsets=3, max_features=0.5, base_estimator=None, bootstrap=True, random_state=None, n_jobs=1, verbose=False)
Bases: BaseRotationForest, RegressorMixin
Rotation Forest for regression.
- Parameters:
n_estimators (int, default=10) – Number of trees.
n_subsets (int, default=3) – Number of feature subsets per tree.
max_features (float, default=0.5) – Fraction of features per subset.
base_estimator (estimator, optional) – Base tree. Default: DecisionTreeRegressor.
bootstrap (bool, default=True) – Bootstrap samples.
random_state (int, optional) – Random seed.
n_jobs (int)
verbose (bool)
Examples
>>> from endgame.models import RotationForestRegressor
>>> reg = RotationForestRegressor(n_estimators=20)
>>> reg.fit(X_train, y_train)
>>> predictions = reg.predict(X_test)
- predict(X)
Predict target values.
- Parameters:
X (array-like of shape (n_samples, n_features)) – Samples to predict.
- Returns:
ndarray of shape (n_samples,) – Predicted values.
- set_score_request(*, sample_weight='$UNCHANGED$')
Configure whether metadata should be requested to be passed to the score method.
Note that this method is only relevant when this estimator is used as a sub-estimator within a meta-estimator and metadata routing is enabled with enable_metadata_routing=True (see sklearn.set_config()). Please check the User Guide on how the routing mechanism works.
The options for each parameter are:
- True: metadata is requested, and passed to score if provided. The request is ignored if metadata is not provided.
- False: metadata is not requested and the meta-estimator will not pass it to score.
- None: metadata is not requested, and the meta-estimator will raise an error if the user provides it.
- str: metadata should be passed to the meta-estimator with this given alias instead of the original name.
The default (sklearn.utils.metadata_routing.UNCHANGED) retains the existing request. This allows you to change the request for some parameters and not others.
Added in version 1.3.
- Parameters:
sample_weight (str, True, False, or None, default=sklearn.utils.metadata_routing.UNCHANGED) – Metadata routing for the sample_weight parameter in score.
- Returns:
self (object) – The updated object.
- class endgame.models.C50Classifier(min_cases=2, cf=0.25, use_subset=True, global_pruning=True, use_rust=False, random_state=None)
Bases: ClassifierMixin, BaseEstimator
C5.0 Decision Tree Classifier.
A high-performance implementation of the C5.0 decision tree algorithm with support for continuous and categorical features, missing values, and sophisticated pruning.
- Parameters:
min_cases (int, default=2) – Minimum number of cases in a branch.
cf (float, default=0.25) – Confidence factor for pruning. Lower values = more pruning.
use_subset (bool, default=True) – Use subset splits for categorical attributes.
global_pruning (bool, default=True) – Apply global pruning in addition to local pruning.
use_rust (bool, default=False) – Use Rust backend if available. Disabled by default due to a classification routing bug in the current Rust extension.
random_state (int or None, default=None) – Random state for reproducibility.
- tree_
The fitted decision tree.
- Type:
TreeNode
- classes_
Unique class labels.
- Type:
ndarray
- feature_importances_
Feature importances based on split gains.
- Type:
ndarray
Examples
>>> from endgame.models.trees import C50Classifier
>>> clf = C50Classifier()
>>> clf.fit(X_train, y_train)
>>> predictions = clf.predict(X_test)
- fit(X, y, sample_weight=None, categorical_features=None)
Fit the C5.0 classifier.
- Parameters:
- Returns:
self (C50Classifier) – Fitted classifier.
- get_structure(feature_names=None)
Get a human-readable representation of the decision tree structure.
- set_fit_request(*, categorical_features='$UNCHANGED$', sample_weight='$UNCHANGED$')
Configure whether metadata should be requested to be passed to the fit method.
Note that this method is only relevant when this estimator is used as a sub-estimator within a meta-estimator and metadata routing is enabled with enable_metadata_routing=True (see sklearn.set_config()). Please check the User Guide on how the routing mechanism works.
The options for each parameter are:
- True: metadata is requested, and passed to fit if provided. The request is ignored if metadata is not provided.
- False: metadata is not requested and the meta-estimator will not pass it to fit.
- None: metadata is not requested, and the meta-estimator will raise an error if the user provides it.
- str: metadata should be passed to the meta-estimator with this given alias instead of the original name.
The default (sklearn.utils.metadata_routing.UNCHANGED) retains the existing request. This allows you to change the request for some parameters and not others.
Added in version 1.3.
- Parameters:
categorical_features (str, True, False, or None, default=sklearn.utils.metadata_routing.UNCHANGED) – Metadata routing for the categorical_features parameter in fit.
sample_weight (str, True, False, or None, default=sklearn.utils.metadata_routing.UNCHANGED) – Metadata routing for the sample_weight parameter in fit.
- Returns:
self (object) – The updated object.
- set_score_request(*, sample_weight='$UNCHANGED$')
Configure whether metadata should be requested to be passed to the score method.
Note that this method is only relevant when this estimator is used as a sub-estimator within a meta-estimator and metadata routing is enabled with enable_metadata_routing=True (see sklearn.set_config()). Please check the User Guide on how the routing mechanism works.
The options for each parameter are:
- True: metadata is requested, and passed to score if provided. The request is ignored if metadata is not provided.
- False: metadata is not requested and the meta-estimator will not pass it to score.
- None: metadata is not requested, and the meta-estimator will raise an error if the user provides it.
- str: metadata should be passed to the meta-estimator with this given alias instead of the original name.
The default (sklearn.utils.metadata_routing.UNCHANGED) retains the existing request. This allows you to change the request for some parameters and not others.
Added in version 1.3.
- Parameters:
sample_weight (str, True, False, or None, default=sklearn.utils.metadata_routing.UNCHANGED) – Metadata routing for the sample_weight parameter in score.
- Returns:
self (object) – The updated object.
- class endgame.models.C50Ensemble(n_trials=10, min_cases=2, cf=0.25, use_subset=True, use_rust=False, random_state=None)
Bases: ClassifierMixin, BaseEstimator
Boosted C5.0 Ensemble Classifier.
Uses AdaBoost-style boosting to combine multiple C5.0 trees.
- Parameters:
n_trials (int, default=10) – Number of boosting iterations.
min_cases (int, default=2) – Minimum cases per branch.
cf (float, default=0.25) – Confidence factor for pruning.
use_subset (bool, default=True) – Use subset splits for categorical attributes.
use_rust (bool, default=False) – Use Rust backend if available. Disabled by default due to a classification routing bug in the current Rust extension.
random_state (int or None, default=None) – Random state for reproducibility.
- estimators_
The fitted trees.
- estimator_weights_
Weights for each tree in voting.
- Type:
ndarray
- classes_
Unique class labels.
- Type:
ndarray
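The description mentions AdaBoost-style boosting with per-tree voting weights. The sketch below shows one round of the textbook binary AdaBoost update, which produces an estimator weight and re-normalized sample weights. It illustrates the general technique only; the exact scheme used by C50Ensemble may differ, and `boosting_round` is a hypothetical helper:

```python
from math import exp, log


def boosting_round(weights, y_true, y_pred):
    """One AdaBoost-style update: returns (estimator_weight, new_sample_weights)."""
    total = sum(weights)
    # Weighted error rate of the current tree.
    err = sum(w for w, t, p in zip(weights, y_true, y_pred) if t != p) / total
    if not 0.0 < err < 0.5:
        raise ValueError("weighted error must lie strictly between 0 and 0.5")
    alpha = 0.5 * log((1.0 - err) / err)  # voting weight of this tree
    # Misclassified rows are up-weighted, correct rows are down-weighted.
    updated = [
        w * exp(alpha if t != p else -alpha)
        for w, t, p in zip(weights, y_true, y_pred)
    ]
    norm = sum(updated)
    return alpha, [w / norm for w in updated]


alpha, new_w = boosting_round([0.25] * 4, [1, 1, 0, 0], [1, 0, 0, 0])
print(alpha, new_w)
```

After the update, the single misclassified row carries half of the total weight, which is what pushes the next tree to focus on it.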
- set_fit_request(*, categorical_features='$UNCHANGED$')
Configure whether metadata should be requested to be passed to the fit method.
Note that this method is only relevant when this estimator is used as a sub-estimator within a meta-estimator and metadata routing is enabled with enable_metadata_routing=True (see sklearn.set_config()). Please check the User Guide on how the routing mechanism works.
The options for each parameter are:
- True: metadata is requested, and passed to fit if provided. The request is ignored if metadata is not provided.
- False: metadata is not requested and the meta-estimator will not pass it to fit.
- None: metadata is not requested, and the meta-estimator will raise an error if the user provides it.
- str: metadata should be passed to the meta-estimator with this given alias instead of the original name.
The default (sklearn.utils.metadata_routing.UNCHANGED) retains the existing request. This allows you to change the request for some parameters and not others.
Added in version 1.3.
- Parameters:
categorical_features (str, True, False, or None, default=sklearn.utils.metadata_routing.UNCHANGED) – Metadata routing for the categorical_features parameter in fit.
- Returns:
self (object) – The updated object.
- set_score_request(*, sample_weight='$UNCHANGED$')
Configure whether metadata should be requested to be passed to the score method.
Note that this method is only relevant when this estimator is used as a sub-estimator within a meta-estimator and metadata routing is enabled with enable_metadata_routing=True (see sklearn.set_config()). Please check the User Guide on how the routing mechanism works.
The options for each parameter are:
- True: metadata is requested, and passed to score if provided. The request is ignored if metadata is not provided.
- False: metadata is not requested and the meta-estimator will not pass it to score.
- None: metadata is not requested, and the meta-estimator will raise an error if the user provides it.
- str: metadata should be passed to the meta-estimator with this given alias instead of the original name.
The default (sklearn.utils.metadata_routing.UNCHANGED) retains the existing request. This allows you to change the request for some parameters and not others.
Added in version 1.3.
- Parameters:
sample_weight (str, True, False, or None, default=sklearn.utils.metadata_routing.UNCHANGED) – Metadata routing for the sample_weight parameter in score.
- Returns:
self (object) – The updated object.
- class endgame.models.CubistRegressor(committees=1, neighbors=0, min_cases=10, max_rules=0, sample=1.0, extrapolation=0.05, unbiased=False, use_rust=True, random_state=None)
Bases: BaseEstimator, RegressorMixin
Cubist Regression Model.
A high-performance implementation of the Cubist algorithm that combines decision trees with linear regression models. The resulting model consists of a set of rules, where each rule has conditions and a linear model for prediction.
- Parameters:
committees (int, default=1) – Number of committee members (trees) to build. Using multiple committees creates a boosted ensemble where each subsequent model focuses on the residuals from previous models.
neighbors (int, default=0) – Number of nearest neighbors to use for instance-based correction. Set to 0 to disable instance-based correction. When enabled, predictions are adjusted based on the residuals of nearby training instances.
min_cases (int, default=10) – Minimum number of cases in a node before splitting is considered.
max_rules (int, default=0) – Maximum number of rules to generate (0 = unlimited).
sample (float, default=1.0) – Fraction of training data to use in each committee member.
extrapolation (float, default=0.05) – Amount of extrapolation allowed beyond training range (as fraction).
unbiased (bool, default=False) – If True, use unbiased splitting criterion.
use_rust (bool, default=True) – Use the Rust backend if available for better performance.
random_state (int or None, default=None) – Random state for reproducibility.
- feature_importances_
Feature importances based on usage in splits.
- Type:
ndarray of shape (n_features,)
Examples
>>> from endgame.models.trees import CubistRegressor
>>> import numpy as np
>>> X = np.random.randn(100, 5)
>>> y = X[:, 0] * 2 + X[:, 1] * 3 + np.random.randn(100) * 0.1
>>> reg = CubistRegressor(committees=5, neighbors=5)
>>> reg.fit(X, y)
>>> predictions = reg.predict(X[:10])
Notes
Cubist was developed by Ross Quinlan as a commercial product. This implementation is based on the algorithm described in the open-source C code released under GPL.
The algorithm works as follows:
1. Build a regression tree by recursively splitting data to minimize variance in the target variable.
2. At each leaf node, fit a linear model using the cases at that node.
3. Extract rules from the tree paths.
4. Prune rules to remove redundant conditions.
5. Optionally build multiple trees (committees) using boosting.
6. Optionally apply instance-based correction using k-NN.
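To make the "rules with conditions and a linear model" structure concrete, here is a toy sketch of how a fitted rule set could produce a prediction: each rule that covers a sample contributes its linear model's output, and the contributions are averaged. The dictionary layout, `rule_matches`, `linear_model`, and `predict` are all hypothetical and do not reflect CubistRegressor's internal representation:

```python
def rule_matches(rule, x):
    """A rule fires when every (feature, op, threshold) condition holds."""
    for feature, op, threshold in rule["conditions"]:
        value = x[feature]
        if op == "<=" and not value <= threshold:
            return False
        if op == ">" and not value > threshold:
            return False
    return True


def linear_model(rule, x):
    """Evaluate the rule's linear model: intercept + sum(coef * feature)."""
    return rule["intercept"] + sum(c * x[i] for i, c in rule["coefs"].items())


def predict(rules, x, default):
    """Average the linear-model outputs of all rules that cover x."""
    preds = [linear_model(r, x) for r in rules if rule_matches(r, x)]
    return sum(preds) / len(preds) if preds else default


rules = [
    {"conditions": [(0, "<=", 1.0)], "intercept": 0.5, "coefs": {0: 2.0}},
    {"conditions": [(0, ">", 1.0)], "intercept": 3.0, "coefs": {0: 1.0, 1: 0.5}},
]
low = predict(rules, [0.5, 4.0], default=0.0)   # first rule: 0.5 + 2.0 * 0.5
high = predict(rules, [2.0, 4.0], default=0.0)  # second rule: 3.0 + 2.0 + 2.0
print(low, high)
```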
- fit(X, y, sample_weight=None)
Fit the Cubist regression model.
- Parameters:
X (array-like of shape (n_samples, n_features)) – Training data.
y (array-like of shape (n_samples,)) – Target values.
sample_weight (array-like of shape (n_samples,), optional) – Sample weights. Currently not fully supported.
- Returns:
self (CubistRegressor) – Fitted regressor.
- property feature_importances_: ndarray
Feature importances based on split usage.
- set_fit_request(*, sample_weight='$UNCHANGED$')
Configure whether metadata should be requested to be passed to the fit method.
Note that this method is only relevant when this estimator is used as a sub-estimator within a meta-estimator and metadata routing is enabled with enable_metadata_routing=True (see sklearn.set_config()). Please check the User Guide on how the routing mechanism works.
The options for each parameter are:
- True: metadata is requested, and passed to fit if provided. The request is ignored if metadata is not provided.
- False: metadata is not requested and the meta-estimator will not pass it to fit.
- None: metadata is not requested, and the meta-estimator will raise an error if the user provides it.
- str: metadata should be passed to the meta-estimator with this given alias instead of the original name.
The default (sklearn.utils.metadata_routing.UNCHANGED) retains the existing request. This allows you to change the request for some parameters and not others.
Added in version 1.3.
- Parameters:
sample_weight (str, True, False, or None, default=sklearn.utils.metadata_routing.UNCHANGED) – Metadata routing for the sample_weight parameter in fit.
- Returns:
self (object) – The updated object.
- set_score_request(*, sample_weight='$UNCHANGED$')
Configure whether metadata should be requested to be passed to the score method.
Note that this method is only relevant when this estimator is used as a sub-estimator within a meta-estimator and metadata routing is enabled with enable_metadata_routing=True (see sklearn.set_config()). Please check the User Guide on how the routing mechanism works.
The options for each parameter are:
- True: metadata is requested, and passed to score if provided. The request is ignored if metadata is not provided.
- False: metadata is not requested and the meta-estimator will not pass it to score.
- None: metadata is not requested, and the meta-estimator will raise an error if the user provides it.
- str: metadata should be passed to the meta-estimator with this given alias instead of the original name.
The default (sklearn.utils.metadata_routing.UNCHANGED) retains the existing request. This allows you to change the request for some parameters and not others.
Added in version 1.3.
- Parameters:
sample_weight (str, True, False, or None, default=sklearn.utils.metadata_routing.UNCHANGED) – Metadata routing for the sample_weight parameter in score.
- Returns:
self (object) – The updated object.
- class endgame.models.ObliqueRandomForestRegressor(n_estimators=100, oblique_method='ridge', criterion='squared_error', max_depth=None, min_samples_split=2, min_samples_leaf=1, max_features='sqrt', max_leaf_nodes=None, min_impurity_decrease=0.0, bootstrap=True, oob_score=False, n_jobs=None, random_state=None, verbose=0, warm_start=False, feature_combinations=2, ridge_alpha=1.0)[source]¶
Bases:
BaseEstimator, RegressorMixin
Oblique Random Forest for regression.
Same as ObliqueRandomForestClassifier but for continuous targets. Uses variance reduction (MSE) as the splitting criterion.
- Parameters:
n_estimators (int, default=100) – Number of trees in the forest.
oblique_method (str, default='ridge') – Method for finding oblique split directions:
  - ‘ridge’: Ridge regression (recommended)
  - ‘pca’: Principal Component Analysis
  - ‘random’: Random projections (fastest)
  - ‘householder’: Householder reflections
  Note: ‘lda’ and ‘svm’ fall back to ‘ridge’ for regression.
criterion (str, default='squared_error') – Splitting criterion:
  - ‘squared_error’: Mean squared error (variance reduction)
  - ‘absolute_error’: Mean absolute error
max_depth (int, default=None) – Maximum depth of each tree.
min_samples_split (int or float, default=2) – Minimum samples required to split a node.
min_samples_leaf (int or float, default=1) – Minimum samples required at a leaf.
max_features (int, float, str, or None, default='sqrt') – Features to consider per split.
max_leaf_nodes (int, default=None) – Maximum leaf nodes per tree.
min_impurity_decrease (float, default=0.0) – Minimum impurity decrease for split.
bootstrap (bool, default=True) – Whether to use bootstrap sampling.
oob_score (bool, default=False) – Whether to compute out-of-bag R² score.
n_jobs (int, default=None) – Number of parallel jobs.
random_state (int, RandomState, or None, default=None) – Random seed.
verbose (int, default=0) – Verbosity level.
warm_start (bool, default=False) – If True, reuse previous fit and add more trees.
feature_combinations (int, default=2) – Features per random combination.
ridge_alpha (float, default=1.0) – Ridge regularization strength.
- estimators_¶
The fitted tree estimators.
- Type:
- feature_importances_¶
Impurity-based feature importances.
- Type:
ndarray of shape (n_features_in_,)
Examples
>>> from endgame.models.trees import ObliqueRandomForestRegressor
>>> from sklearn.datasets import make_regression
>>> X, y = make_regression(n_samples=1000, n_features=10, random_state=42)
>>> reg = ObliqueRandomForestRegressor(n_estimators=100, random_state=42)
>>> reg.fit(X, y)
>>> print(reg.score(X, y))
- fit(X, y, sample_weight=None)[source]¶
Build an oblique random forest from the training data.
- Parameters:
X (array-like of shape (n_samples, n_features)) – Training data.
y (array-like of shape (n_samples,)) – Target values.
sample_weight (array-like of shape (n_samples,), default=None) – Sample weights.
- Return type:
- Returns:
self (object) – Fitted estimator.
- predict(X)[source]¶
Predict target values for samples in X.
- Parameters:
X (array-like of shape (n_samples, n_features)) – Samples to predict.
- Return type:
- Returns:
y_pred (ndarray of shape (n_samples,)) – Predicted values.
- apply(X)[source]¶
Apply trees to X, return leaf indices.
- Parameters:
X (array-like of shape (n_samples, n_features)) – Input samples.
- Return type:
- Returns:
X_leaves (ndarray of shape (n_samples, n_estimators)) – Leaf indices for each sample in each tree.
- set_fit_request(*, sample_weight='$UNCHANGED$')¶
Configure whether metadata should be requested to be passed to the fit method.
Note that this method is only relevant when this estimator is used as a sub-estimator within a meta-estimator and metadata routing is enabled with enable_metadata_routing=True (see sklearn.set_config()). Please check the User Guide on how the routing mechanism works.
The options for each parameter are:
True: metadata is requested, and passed to fit if provided. The request is ignored if metadata is not provided.
False: metadata is not requested and the meta-estimator will not pass it to fit.
None: metadata is not requested, and the meta-estimator will raise an error if the user provides it.
str: metadata should be passed to the meta-estimator with this given alias instead of the original name.
The default (sklearn.utils.metadata_routing.UNCHANGED) retains the existing request. This allows you to change the request for some parameters and not others.
Added in version 1.3.
- Parameters:
sample_weight (str, True, False, or None, default=sklearn.utils.metadata_routing.UNCHANGED) – Metadata routing for the sample_weight parameter in fit.
- Returns:
self (object) – The updated object.
- Return type:
- set_score_request(*, sample_weight='$UNCHANGED$')¶
Configure whether metadata should be requested to be passed to the score method.
Note that this method is only relevant when this estimator is used as a sub-estimator within a meta-estimator and metadata routing is enabled with enable_metadata_routing=True (see sklearn.set_config()). Please check the User Guide on how the routing mechanism works.
The options for each parameter are:
True: metadata is requested, and passed to score if provided. The request is ignored if metadata is not provided.
False: metadata is not requested and the meta-estimator will not pass it to score.
None: metadata is not requested, and the meta-estimator will raise an error if the user provides it.
str: metadata should be passed to the meta-estimator with this given alias instead of the original name.
The default (sklearn.utils.metadata_routing.UNCHANGED) retains the existing request. This allows you to change the request for some parameters and not others.
Added in version 1.3.
- Parameters:
sample_weight (str, True, False, or None, default=sklearn.utils.metadata_routing.UNCHANGED) – Metadata routing for the sample_weight parameter in score.
- Returns:
self (object) – The updated object.
- Return type:
- class endgame.models.ObliqueDecisionTreeClassifier(oblique_method='ridge', criterion='gini', max_depth=None, min_samples_split=2, min_samples_leaf=1, max_features=None, min_impurity_decrease=0.0, random_state=None, ridge_alpha=1.0, feature_combinations=2)[source]¶
Bases:
ClassifierMixin, BaseEstimator
A single oblique decision tree for classification.
This is the base estimator used by ObliqueRandomForestClassifier. Uses linear combinations of features for splits, enabling better capture of diagonal decision boundaries.
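To illustrate why a split on a linear combination of features can help, the following minimal sketch compares one axis-aligned threshold with one oblique threshold on a toy diagonal boundary. The helper names (`axis_split`, `oblique_split`, `accuracy`) and the hand-picked weights are illustrative assumptions and not part of the endgame API; the library itself chooses the split direction according to `oblique_method`.

```python
# Toy data: the label is 1 when x0 + x1 > 1 (a diagonal boundary).
points = [(0.1, 0.2), (0.9, 0.3), (0.3, 0.9), (0.8, 0.8), (0.2, 0.6), (0.6, 0.7)]
labels = [1 if x0 + x1 > 1.0 else 0 for x0, x1 in points]


def axis_split(point, feature, threshold):
    """Axis-aligned test: compare a single feature with a threshold."""
    return 1 if point[feature] > threshold else 0


def oblique_split(point, weights, threshold):
    """Oblique test: compare a weighted sum of features with a threshold."""
    projection = sum(w * x for w, x in zip(weights, point))
    return 1 if projection > threshold else 0


def accuracy(predictions, targets):
    """Fraction of predictions that match the targets."""
    return sum(p == t for p, t in zip(predictions, targets)) / len(targets)


# One axis-aligned split on feature 0 versus one oblique split on x0 + x1.
axis_preds = [axis_split(p, feature=0, threshold=0.5) for p in points]
oblique_preds = [oblique_split(p, weights=(1.0, 1.0), threshold=1.0) for p in points]

axis_acc = accuracy(axis_preds, labels)
oblique_acc = accuracy(oblique_preds, labels)
```

On this toy data the single oblique test separates the classes exactly, while this particular axis-aligned threshold misclassifies one point; an axis-aligned tree would need additional splits to approximate the diagonal.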
- Parameters:
oblique_method (str, default='ridge') – Method for finding oblique splits:
  - ‘ridge’: Ridge regression on class labels (recommended)
  - ‘pca’: Principal Component Analysis
  - ‘lda’: Linear Discriminant Analysis
  - ‘random’: Random projections (fastest)
  - ‘svm’: Linear SVM hyperplane
  - ‘householder’: Householder reflections
criterion (str, default='gini') – Splitting criterion: ‘gini’ or ‘entropy’.
max_depth (int, default=None) – Maximum tree depth. None means unlimited.
min_samples_split (int or float, default=2) – Minimum samples required to split a node. If float, fraction of total samples.
min_samples_leaf (int or float, default=1) – Minimum samples required at a leaf. If float, fraction of total samples.
max_features (int, float, str, or None, default=None) – Features to consider per split:
  - int: Use exactly max_features
  - float: Use max_features * n_features (fraction)
  - ‘sqrt’: Use sqrt(n_features)
  - ‘log2’: Use log2(n_features)
  - None: Use all features
min_impurity_decrease (float, default=0.0) – Minimum impurity decrease required for split.
random_state (int, RandomState, or None, default=None) – Random seed.
ridge_alpha (float, default=1.0) – Ridge regularization for ‘ridge’ method.
feature_combinations (int, default=2) – Features per random combination (for ‘random’ method).
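The `max_features` options listed above can be read as a rule that maps a setting to a feature count. The sketch below is one plausible reading, not the library's implementation: the function name `resolve_max_features`, the truncation toward zero, and the clamping to at least one feature are all assumptions.

```python
import math


def resolve_max_features(max_features, n_features):
    """Translate a max_features setting into a feature count (illustrative)."""
    if max_features is None:
        return n_features
    if isinstance(max_features, str):
        if max_features == "sqrt":
            count = int(math.sqrt(n_features))
        elif max_features == "log2":
            count = int(math.log2(n_features))
        else:
            raise ValueError(f"Unknown max_features: {max_features!r}")
    elif isinstance(max_features, float):
        # A float is treated as a fraction of the available features.
        count = int(max_features * n_features)
    else:
        count = int(max_features)
    # Assumption: clamp the result to the valid range [1, n_features].
    return max(1, min(count, n_features))


n_all = resolve_max_features(None, 16)
n_sqrt = resolve_max_features("sqrt", 16)
n_half = resolve_max_features(0.5, 16)
```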
- tree_¶
The root node of the fitted tree.
- Type:
ObliqueTreeNode
- feature_importances_¶
Impurity-based feature importances.
- Type:
ndarray of shape (n_features_in_,)
- fit(X, y, sample_weight=None)[source]¶
Build the oblique decision tree.
- Parameters:
X (array-like of shape (n_samples, n_features)) – Training data.
y (array-like of shape (n_samples,)) – Target class labels.
sample_weight (array-like of shape (n_samples,), default=None) – Sample weights.
- Return type:
- Returns:
self (object) – Fitted estimator.
- predict(X)[source]¶
Predict class labels.
- Parameters:
X (array-like of shape (n_samples, n_features)) – Samples to predict.
- Return type:
- Returns:
y_pred (ndarray of shape (n_samples,)) – Predicted class labels.
- predict_proba(X)[source]¶
Predict class probabilities.
- Parameters:
X (array-like of shape (n_samples, n_features)) – Samples to predict.
- Return type:
- Returns:
proba (ndarray of shape (n_samples, n_classes)) – Class probabilities.
- apply(X)[source]¶
Return leaf indices for samples.
- Parameters:
X (array-like of shape (n_samples, n_features)) – Input samples.
- Return type:
- Returns:
X_leaves (ndarray of shape (n_samples,)) – Leaf node id for each sample.
- decision_path(X)[source]¶
Return decision path through the tree.
- Parameters:
X (array-like of shape (n_samples, n_features)) – Input samples.
- Return type:
- Returns:
indicator (ndarray of shape (n_samples, n_nodes)) – Dense matrix where element [i, j] = 1 if sample i passes through node j.
- set_fit_request(*, sample_weight='$UNCHANGED$')¶
Configure whether metadata should be requested to be passed to the fit method.
Note that this method is only relevant when this estimator is used as a sub-estimator within a meta-estimator and metadata routing is enabled with enable_metadata_routing=True (see sklearn.set_config()). Please check the User Guide on how the routing mechanism works.
The options for each parameter are:
True: metadata is requested, and passed to fit if provided. The request is ignored if metadata is not provided.
False: metadata is not requested and the meta-estimator will not pass it to fit.
None: metadata is not requested, and the meta-estimator will raise an error if the user provides it.
str: metadata should be passed to the meta-estimator with this given alias instead of the original name.
The default (sklearn.utils.metadata_routing.UNCHANGED) retains the existing request. This allows you to change the request for some parameters and not others.
Added in version 1.3.
- Parameters:
sample_weight (str, True, False, or None, default=sklearn.utils.metadata_routing.UNCHANGED) – Metadata routing for the sample_weight parameter in fit.
- Returns:
self (object) – The updated object.
- Return type:
- set_score_request(*, sample_weight='$UNCHANGED$')¶
Configure whether metadata should be requested to be passed to the score method.
Note that this method is only relevant when this estimator is used as a sub-estimator within a meta-estimator and metadata routing is enabled with enable_metadata_routing=True (see sklearn.set_config()). Please check the User Guide on how the routing mechanism works.
The options for each parameter are:
True: metadata is requested, and passed to score if provided. The request is ignored if metadata is not provided.
False: metadata is not requested and the meta-estimator will not pass it to score.
None: metadata is not requested, and the meta-estimator will raise an error if the user provides it.
str: metadata should be passed to the meta-estimator with this given alias instead of the original name.
The default (sklearn.utils.metadata_routing.UNCHANGED) retains the existing request. This allows you to change the request for some parameters and not others.
Added in version 1.3.
- Parameters:
sample_weight (str, True, False, or None, default=sklearn.utils.metadata_routing.UNCHANGED) – Metadata routing for the sample_weight parameter in score.
- Returns:
self (object) – The updated object.
- Return type:
- class endgame.models.ObliqueDecisionTreeRegressor(oblique_method='ridge', criterion='squared_error', max_depth=None, min_samples_split=2, min_samples_leaf=1, max_features=None, min_impurity_decrease=0.0, random_state=None, ridge_alpha=1.0, feature_combinations=2)[source]¶
Bases:
BaseEstimator, RegressorMixin
A single oblique decision tree for regression.
This is the base estimator used by ObliqueRandomForestRegressor. Uses linear combinations of features for splits, enabling better capture of diagonal decision boundaries.
- Parameters:
oblique_method (str, default='ridge') – Method for finding oblique splits:
  - ‘ridge’: Ridge regression (recommended)
  - ‘pca’: Principal Component Analysis
  - ‘random’: Random projections (fastest)
  - ‘householder’: Householder reflections
  Note: ‘lda’ and ‘svm’ are not available for regression.
criterion (str, default='squared_error') – Splitting criterion: ‘squared_error’ or ‘absolute_error’.
max_depth (int, default=None) – Maximum tree depth. None means unlimited.
min_samples_split (int or float, default=2) – Minimum samples required to split a node.
min_samples_leaf (int or float, default=1) – Minimum samples required at a leaf.
max_features (int, float, str, or None, default=None) – Features to consider per split.
min_impurity_decrease (float, default=0.0) – Minimum impurity decrease required for split.
random_state (int, RandomState, or None, default=None) – Random seed.
ridge_alpha (float, default=1.0) – Ridge regularization for ‘ridge’ method.
feature_combinations (int, default=2) – Features per random combination (for ‘random’ method).
- tree_¶
The root node of the fitted tree.
- Type:
ObliqueTreeNode
- feature_importances_¶
Impurity-based feature importances.
- Type:
ndarray of shape (n_features_in_,)
- fit(X, y, sample_weight=None)[source]¶
Build the oblique decision tree.
- Parameters:
X (array-like of shape (n_samples, n_features)) – Training data.
y (array-like of shape (n_samples,)) – Target values.
sample_weight (array-like of shape (n_samples,), default=None) – Sample weights.
- Return type:
- Returns:
self (object) – Fitted estimator.
- predict(X)[source]¶
Predict target values.
- Parameters:
X (array-like of shape (n_samples, n_features)) – Samples to predict.
- Return type:
- Returns:
y_pred (ndarray of shape (n_samples,)) – Predicted values.
- set_fit_request(*, sample_weight='$UNCHANGED$')¶
Configure whether metadata should be requested to be passed to the fit method.
Note that this method is only relevant when this estimator is used as a sub-estimator within a meta-estimator and metadata routing is enabled with enable_metadata_routing=True (see sklearn.set_config()). Please check the User Guide on how the routing mechanism works.
The options for each parameter are:
True: metadata is requested, and passed to fit if provided. The request is ignored if metadata is not provided.
False: metadata is not requested and the meta-estimator will not pass it to fit.
None: metadata is not requested, and the meta-estimator will raise an error if the user provides it.
str: metadata should be passed to the meta-estimator with this given alias instead of the original name.
The default (sklearn.utils.metadata_routing.UNCHANGED) retains the existing request. This allows you to change the request for some parameters and not others.
Added in version 1.3.
- Parameters:
sample_weight (str, True, False, or None, default=sklearn.utils.metadata_routing.UNCHANGED) – Metadata routing for the sample_weight parameter in fit.
- Returns:
self (object) – The updated object.
- Return type:
- set_score_request(*, sample_weight='$UNCHANGED$')¶
Configure whether metadata should be requested to be passed to the score method.
Note that this method is only relevant when this estimator is used as a sub-estimator within a meta-estimator and metadata routing is enabled with enable_metadata_routing=True (see sklearn.set_config()). Please check the User Guide on how the routing mechanism works.
The options for each parameter are:
True: metadata is requested, and passed to score if provided. The request is ignored if metadata is not provided.
False: metadata is not requested and the meta-estimator will not pass it to score.
None: metadata is not requested, and the meta-estimator will raise an error if the user provides it.
str: metadata should be passed to the meta-estimator with this given alias instead of the original name.
The default (sklearn.utils.metadata_routing.UNCHANGED) retains the existing request. This allows you to change the request for some parameters and not others.
Added in version 1.3.
- Parameters:
sample_weight (str, True, False, or None, default=sklearn.utils.metadata_routing.UNCHANGED) – Metadata routing for the sample_weight parameter in score.
- Returns:
self (object) – The updated object.
- Return type:
- class endgame.models.QuantileRegressorForest(n_estimators=100, quantiles=0.5, criterion='squared_error', max_depth=None, min_samples_split=2, min_samples_leaf=1, max_features=1.0, max_leaf_nodes=None, min_impurity_decrease=0.0, bootstrap=True, max_samples=None, oob_score=False, n_jobs=None, random_state=None, verbose=0, warm_start=False)[source]¶
Bases:
BaseEstimator, RegressorMixin
Random Forest for conditional quantile estimation.
Quantile Regression Forests (QRF) estimate the full conditional distribution P(Y|X), allowing prediction of any quantile, not just the mean. This is essential for:
- Prediction intervals with coverage guarantees
- Uncertainty quantification in regression
- Asymmetric loss functions (e.g., inventory optimization)
The forest works by tracking which training samples end up in each leaf of each tree. At prediction time, for a test point x, we collect all training samples from the leaves that x falls into across all trees, then compute empirical quantiles from this collection.
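The pooling step described above can be sketched in plain Python. This is an illustration under stated assumptions rather than the library's implementation: `empirical_quantile` and `leaf_targets_per_tree` are hypothetical names, the leaf contents are made-up numbers, and linear interpolation between order statistics is only one of several possible quantile conventions.

```python
def empirical_quantile(values, q):
    """Quantile with linear interpolation between order statistics."""
    ordered = sorted(values)
    position = q * (len(ordered) - 1)
    low = int(position)
    high = min(low + 1, len(ordered) - 1)
    fraction = position - low
    return ordered[low] + fraction * (ordered[high] - ordered[low])


# Hypothetical fitted forest: for each tree, the training targets stored
# in the leaf that the test point x falls into.
leaf_targets_per_tree = [
    [2.0, 3.0, 4.0],  # tree 0
    [3.0, 5.0],       # tree 1
    [1.0, 3.0, 6.0],  # tree 2
]

# Pool the targets from every tree, then read quantiles off the pooled sample.
pooled = [y for leaf in leaf_targets_per_tree for y in leaf]
median = empirical_quantile(pooled, 0.5)
lower = empirical_quantile(pooled, 0.1)
upper = empirical_quantile(pooled, 0.9)
```

Because the pooled sample is kept, any quantile can be read from it after fitting, which is why changing the requested quantiles does not require retraining.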
- Parameters:
n_estimators (int, default=100) – Number of trees in the forest.
quantiles (float or array-like of floats, default=0.5) – Quantile(s) to predict in [0, 1]:
  - Single float: predict that quantile
  - Array: predict multiple quantiles simultaneously
  The default is 0.5 (the median), which is more robust to outliers than the mean. A common choice for prediction intervals is [0.1, 0.5, 0.9].
criterion (str, default='squared_error') – Splitting criterion for trees:
  - ‘squared_error’: Mean squared error (standard)
  - ‘absolute_error’: Mean absolute error
  - ‘friedman_mse’: Improved MSE for gradient boosting
  - ‘poisson’: Poisson deviance
max_depth (int, default=None) – Maximum depth of each tree. None means unlimited depth (nodes expand until all leaves are pure or contain fewer than min_samples_split samples).
min_samples_split (int or float, default=2) – Minimum samples required to split an internal node. If float, fraction of n_samples.
min_samples_leaf (int or float, default=1) – Minimum samples required at a leaf node. If float, fraction of n_samples.
max_features (int, float, str, or None, default=1.0) – Number of features to consider for each split:
  - int: Use exactly max_features
  - float: Use max_features * n_features
  - ‘sqrt’: Use sqrt(n_features)
  - ‘log2’: Use log2(n_features)
  - None or 1.0: Use all features
  Note: For QRF, using all features is often preferred to get better leaf distributions.
max_leaf_nodes (int, default=None) – Maximum number of leaf nodes per tree.
min_impurity_decrease (float, default=0.0) – Minimum impurity decrease required for a split.
bootstrap (bool, default=True) – Whether to use bootstrap sampling for each tree.
max_samples (int or float, default=None) – Number of samples to draw for each tree (with replacement):
  - None: Draw n_samples samples
  - int: Draw max_samples samples
  - float: Draw max_samples * n_samples samples
oob_score (bool, default=False) – Whether to compute out-of-bag score. Note: OOB for QRF uses median prediction for scoring.
n_jobs (int, default=None) – Number of parallel jobs for fitting trees. None means 1, -1 means all processors.
random_state (int, RandomState, or None, default=None) – Random seed for reproducibility.
verbose (int, default=0) – Verbosity level for fitting progress.
warm_start (bool, default=False) – If True, reuse previous fit and add more trees.
- feature_importances_¶
Impurity-based feature importances.
- Type:
ndarray of shape (n_features_in_,)
Examples
>>> from endgame.models.trees import QuantileRegressorForest
>>> from sklearn.datasets import make_regression
>>> X, y = make_regression(n_samples=1000, n_features=10, random_state=42)
>>>
>>> # Predict median (more robust than mean)
>>> qrf = QuantileRegressorForest(n_estimators=100, quantiles=0.5, random_state=42)
>>> qrf.fit(X, y)
>>> y_median = qrf.predict(X[:5])
>>>
>>> # Prediction intervals
>>> qrf = QuantileRegressorForest(n_estimators=100, quantiles=[0.1, 0.5, 0.9])
>>> qrf.fit(X, y)
>>> intervals = qrf.predict(X[:5])  # Shape: (5, 3)
>>> lower, median, upper = intervals[:, 0], intervals[:, 1], intervals[:, 2]
>>>
>>> # Change quantiles after fitting (no retraining needed!)
>>> qrf.quantiles = [0.25, 0.75]
>>> iqr_bounds = qrf.predict(X[:5])  # Shape: (5, 2)
Notes
QRF is particularly useful for:
Prediction Intervals: Unlike a standard RF, which only gives point predictions, QRF can produce prediction intervals, e.g., by predicting the [0.05, 0.95] quantiles for a nominal 90% coverage.
Heteroscedastic Data: When variance of Y varies with X, QRF naturally captures this through different interval widths.
Conformal Prediction: QRF quantiles can be calibrated using conformal methods for guaranteed coverage.
Asymmetric Loss: For problems where over/under-prediction have different costs (inventory, load forecasting), predict the appropriate quantile that minimizes expected loss.
Memory Usage: QRF stores all training y values at leaves, which uses more memory than standard RF. For very large datasets, consider using subsampling via max_samples parameter.
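For the asymmetric-loss case mentioned in the notes above, the quantile to predict follows from the two per-unit costs: under a linear cost of `cost_under` per unit of under-prediction and `cost_over` per unit of over-prediction, expected cost is minimized at the quantile `cost_under / (cost_under + cost_over)`. The sketch below assumes that linear cost model; the function name `optimal_quantile` and the example costs are hypothetical.

```python
def optimal_quantile(cost_under, cost_over):
    """Quantile that minimizes expected cost under linear asymmetric costs.

    cost_under: cost per unit when the prediction is below the true value.
    cost_over: cost per unit when the prediction is above the true value.
    """
    if cost_under <= 0 or cost_over <= 0:
        raise ValueError("costs must be positive")
    return cost_under / (cost_under + cost_over)


# Hypothetical inventory example: a lost sale costs 3 per unit,
# an unsold unit costs 1 per unit.
target_quantile = optimal_quantile(cost_under=3.0, cost_over=1.0)
```

The resulting value could then be passed as `quantiles` at construction or to `predict_quantiles`.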
References
Meinshausen, N. (2006). “Quantile Regression Forests.” Journal of Machine Learning Research, 7, 983-999.
- fit(X, y, sample_weight=None)[source]¶
Build a quantile regression forest from training data.
- Parameters:
X (array-like of shape (n_samples, n_features)) – Training data.
y (array-like of shape (n_samples,)) – Target values.
sample_weight (array-like of shape (n_samples,), default=None) – Sample weights for fitting.
- Return type:
- Returns:
self (object) – Fitted estimator.
- predict(X)[source]¶
Predict quantile(s) for samples in X.
- Parameters:
X (array-like of shape (n_samples, n_features)) – Samples to predict.
- Return type:
- Returns:
y_pred (ndarray) – Predicted quantile values.
  - If single quantile: shape (n_samples,)
  - If multiple quantiles: shape (n_samples, n_quantiles)
- predict_quantiles(X, quantiles)[source]¶
Predict specific quantiles without changing the estimator.
This allows predicting different quantiles from what was specified at construction, without modifying the estimator’s state.
- Parameters:
X (array-like of shape (n_samples, n_features)) – Samples to predict.
quantiles (float or array-like of floats) – Quantile(s) to predict in [0, 1].
- Return type:
- Returns:
y_pred (ndarray) – Predicted quantile values.
  - If single quantile: shape (n_samples,)
  - If multiple quantiles: shape (n_samples, n_quantiles)
- predict_interval(X, coverage=0.9)[source]¶
Predict symmetric prediction interval.
- Parameters:
X (array-like of shape (n_samples, n_features)) – Samples to predict.
coverage (float, default=0.9) – Desired coverage probability in (0, 1). E.g., 0.9 gives [5th, 95th] percentile interval.
- Return type:
- Returns:
lower (ndarray of shape (n_samples,)) – Lower bound of prediction interval.
upper (ndarray of shape (n_samples,)) – Upper bound of prediction interval.
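The mapping from `coverage` to a pair of quantile levels can be sketched as follows. It assumes the interval places equal probability mass in each tail, which is consistent with the [5th, 95th] percentile example for `coverage=0.9`; the helper name `interval_quantiles` is hypothetical.

```python
def interval_quantiles(coverage):
    """Return the (lower, upper) quantile levels of an equal-tailed interval."""
    if not 0.0 < coverage < 1.0:
        raise ValueError("coverage must be in (0, 1)")
    tail = (1.0 - coverage) / 2.0  # probability mass left in each tail
    return tail, 1.0 - tail


lower_q, upper_q = interval_quantiles(0.9)
```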
- predict_mean(X)[source]¶
Predict conditional mean (like standard Random Forest).
This collects all y values from relevant leaves and returns their mean, equivalent to standard RF prediction.
- Parameters:
X (array-like of shape (n_samples, n_features)) – Samples to predict.
- Return type:
- Returns:
y_pred (ndarray of shape (n_samples,)) – Predicted mean values.
- predict_std(X)[source]¶
Predict conditional standard deviation (uncertainty).
- Parameters:
X (array-like of shape (n_samples, n_features)) – Samples to predict.
- Return type:
- Returns:
y_std (ndarray of shape (n_samples,)) – Predicted standard deviation for each sample.
- apply(X)[source]¶
Apply trees to X, return leaf indices.
- Parameters:
X (array-like of shape (n_samples, n_features)) – Input samples.
- Return type:
- Returns:
X_leaves (ndarray of shape (n_samples, n_estimators)) – Leaf indices for each sample in each tree.
- set_fit_request(*, sample_weight='$UNCHANGED$')¶
Configure whether metadata should be requested to be passed to the fit method.
Note that this method is only relevant when this estimator is used as a sub-estimator within a meta-estimator and metadata routing is enabled with enable_metadata_routing=True (see sklearn.set_config()). Please check the User Guide on how the routing mechanism works.
The options for each parameter are:
True: metadata is requested, and passed to fit if provided. The request is ignored if metadata is not provided.
False: metadata is not requested and the meta-estimator will not pass it to fit.
None: metadata is not requested, and the meta-estimator will raise an error if the user provides it.
str: metadata should be passed to the meta-estimator with this given alias instead of the original name.
The default (sklearn.utils.metadata_routing.UNCHANGED) retains the existing request. This allows you to change the request for some parameters and not others.
Added in version 1.3.
- Parameters:
sample_weight (str, True, False, or None, default=sklearn.utils.metadata_routing.UNCHANGED) – Metadata routing for the sample_weight parameter in fit.
- Returns:
self (object) – The updated object.
- Return type:
- set_score_request(*, sample_weight='$UNCHANGED$')¶
Configure whether metadata should be requested to be passed to the score method.
Note that this method is only relevant when this estimator is used as a sub-estimator within a meta-estimator and metadata routing is enabled with enable_metadata_routing=True (see sklearn.set_config()). Please check the User Guide on how the routing mechanism works.
The options for each parameter are:
True: metadata is requested, and passed to score if provided. The request is ignored if metadata is not provided.
False: metadata is not requested and the meta-estimator will not pass it to score.
None: metadata is not requested, and the meta-estimator will raise an error if the user provides it.
str: metadata should be passed to the meta-estimator with this given alias instead of the original name.
The default (sklearn.utils.metadata_routing.UNCHANGED) retains the existing request. This allows you to change the request for some parameters and not others.
Added in version 1.3.
- Parameters:
sample_weight (str, True, False, or None, default=sklearn.utils.metadata_routing.UNCHANGED) – Metadata routing for the sample_weight parameter in score.
- Returns:
self (object) – The updated object.
- Return type:
- endgame.models.pinball_loss(y_true, y_pred, quantile)[source]¶
Compute pinball (quantile) loss.
The pinball loss is the proper scoring rule for quantile regression: L(y, q) = (1-alpha) * max(q-y, 0) + alpha * max(y-q, 0)
where alpha is the quantile.
- Parameters:
- Return type:
- Returns:
float – Mean pinball loss.
Examples
>>> import numpy as np
>>> from endgame.models import pinball_loss
>>> y_true = np.array([1, 2, 3, 4, 5])
>>> y_pred = np.array([1.5, 2.0, 2.5, 4.0, 4.5])
>>> pinball_loss(y_true, y_pred, 0.5)  # Median loss
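As a cross-check of the formula given for this function, here is a dependency-free sketch that averages the per-sample pinball loss with alpha set to the quantile. The name `pinball_loss_sketch` is hypothetical and the code is not the library's implementation; it only follows the stated formula.

```python
def pinball_loss_sketch(y_true, y_pred, quantile):
    """Mean pinball loss: (1 - alpha) * max(q - y, 0) + alpha * max(y - q, 0)."""
    total = 0.0
    for y, q in zip(y_true, y_pred):
        over = max(q - y, 0.0)   # prediction above the target
        under = max(y - q, 0.0)  # prediction below the target
        total += (1.0 - quantile) * over + quantile * under
    return total / len(y_true)


y_true = [1, 2, 3, 4, 5]
y_pred = [1.5, 2.0, 2.5, 4.0, 4.5]
median_loss = pinball_loss_sketch(y_true, y_pred, 0.5)  # symmetric penalty
high_loss = pinball_loss_sketch(y_true, y_pred, 0.9)    # under-prediction costs more
```

With `quantile=0.5` the loss is half the mean absolute error; raising the quantile to 0.9 penalizes the two under-predictions more heavily than the single over-prediction.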
- endgame.models.interval_coverage(y_true, lower, upper)[source]¶
Compute empirical coverage of prediction intervals.
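Empirical coverage is the fraction of targets that fall inside their prediction intervals. The sketch below shows that computation in plain Python; the name `interval_coverage_sketch` is hypothetical, and treating both bounds as inclusive is an assumption, since the source does not state how boundary cases are handled.

```python
def interval_coverage_sketch(y_true, lower, upper):
    """Fraction of targets inside [lower, upper], with inclusive bounds."""
    inside = sum(lo <= y <= hi for y, lo, hi in zip(y_true, lower, upper))
    return inside / len(y_true)


y_true = [1.0, 2.0, 3.0, 4.0]
lower = [0.5, 2.5, 2.0, 3.0]
upper = [1.5, 3.5, 4.0, 3.5]
coverage = interval_coverage_sketch(y_true, lower, upper)
```

Comparing this value with the nominal coverage (for example 0.9) on held-out data indicates whether the intervals are too narrow or too wide.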
- class endgame.models.EvolutionaryTreeClassifier(population_size=100, n_generations=100, max_depth=8, min_samples_leaf=5, alpha=1.0, mutation_prob=0.8, crossover_prob=0.2, patience=20, warm_start=True, n_jobs=1, random_state=None, verbose=False)[source]¶
Bases:
_EvolutionaryTreeBase, ClassifierMixin
Evolutionary Tree Classifier - Globally optimal trees via genetic algorithms.
Unlike greedy methods (CART, C4.5) that make locally optimal splits, evolutionary trees use genetic algorithms to search for globally optimal tree structures. This can discover patterns that greedy methods miss.
- Parameters:
population_size (int, default=100) – Number of trees in the population. Larger populations explore more of the search space but are slower.
n_generations (int, default=100) – Maximum number of evolutionary generations.
max_depth (int, default=8) – Maximum depth of trees in the population.
min_samples_leaf (int, default=5) – Minimum samples required in a leaf node.
alpha (float, default=1.0) – Complexity penalty coefficient. Higher values favor simpler trees. Controls the BIC-type tradeoff: loss + alpha * complexity.
mutation_prob (float, default=0.8) – Probability of applying mutation to offspring.
crossover_prob (float, default=0.2) – Probability of using crossover vs just mutation.
patience (int, default=20) – Generations without improvement before early stopping.
warm_start (bool, default=True) – If True, seed population with a greedy tree for faster convergence.
n_jobs (int, default=1) – Number of parallel jobs for fitness evaluation. -1 means using all processors.
random_state (int, optional) – Random seed for reproducibility.
verbose (bool, default=False) – If True, print progress every 10 generations.
- classes_¶
Unique class labels.
- Type:
ndarray
- tree_¶
The best tree found during evolution.
- Type:
TreeNode
Examples
>>> from endgame.models.trees.evtree import EvolutionaryTreeClassifier
>>> clf = EvolutionaryTreeClassifier(
...     population_size=50, n_generations=50, random_state=42
... )
>>> clf.fit(X_train, y_train)
>>> y_pred = clf.predict(X_test)
Notes
Evolutionary trees are slower than greedy methods but can find better structures for complex problems. They’re particularly valuable for:
Ensemble diversity: Different inductive bias from greedy trees
Interpretability: Often finds simpler trees with similar accuracy
Avoiding local optima: Global search escapes greedy suboptimality
Performance tips:
- Use warm_start=True (default) to seed with a greedy tree
- Increase patience for harder problems
- Use n_jobs=-1 for parallel fitness evaluation on large populations
- Reduce population_size for faster (but potentially worse) results
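The `alpha` parameter is described as controlling a tradeoff of the form loss + alpha * complexity. The sketch below shows how such a penalized score changes which candidate tree is preferred. It is illustrative only: measuring complexity as the number of leaves, the names `fitness` and `best_tree`, and the candidate values are all assumptions rather than details taken from the library.

```python
def fitness(loss, n_leaves, alpha=1.0):
    """Penalized score (lower is better): loss + alpha * complexity.

    Complexity is taken to be the number of leaves, an assumption of this sketch.
    """
    return loss + alpha * n_leaves


# Hypothetical candidates: (name, training loss, number of leaves).
population = [("deep", 10.0, 12), ("medium", 14.0, 6), ("stump", 30.0, 2)]


def best_tree(candidates, alpha):
    """Return the name of the candidate with the lowest penalized score."""
    return min(candidates, key=lambda c: fitness(c[1], c[2], alpha))[0]


best_low_alpha = best_tree(population, alpha=0.1)   # weak penalty
best_default = best_tree(population, alpha=1.0)     # default penalty
best_high_alpha = best_tree(population, alpha=5.0)  # strong penalty
```

As the penalty grows, the preferred candidate moves from the deepest tree to the smallest one, which matches the statement that higher `alpha` values favor simpler trees.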
References
Grubinger et al., “evtree: Evolutionary Learning of Globally Optimal Classification and Regression Trees in R” (2014)
- fit(X, y, **fit_params)[source]¶
Fit the evolutionary tree classifier.
- Parameters:
X (array-like of shape (n_samples, n_features)) – Training features.
y (array-like of shape (n_samples,)) – Target class labels.
- Return type:
- Returns:
self
- predict(X)[source]¶
Predict class labels.
- Parameters:
X (array-like of shape (n_samples, n_features)) – Samples to predict.
- Return type:
- Returns:
y_pred (ndarray of shape (n_samples,)) – Predicted class labels.
- predict_proba(X)[source]¶
Predict class probabilities.
- Parameters:
X (array-like of shape (n_samples, n_features)) – Samples to predict.
- Return type:
- Returns:
proba (ndarray of shape (n_samples, n_classes)) – Class probabilities.
- set_score_request(*, sample_weight='$UNCHANGED$')¶
Configure whether metadata should be requested to be passed to the score method.
Note that this method is only relevant when this estimator is used as a sub-estimator within a meta-estimator and metadata routing is enabled with enable_metadata_routing=True (see sklearn.set_config()). Please check the User Guide on how the routing mechanism works.
The options for each parameter are:
True: metadata is requested, and passed to score if provided. The request is ignored if metadata is not provided.
False: metadata is not requested and the meta-estimator will not pass it to score.
None: metadata is not requested, and the meta-estimator will raise an error if the user provides it.
str: metadata should be passed to the meta-estimator with this given alias instead of the original name.
The default (sklearn.utils.metadata_routing.UNCHANGED) retains the existing request. This allows you to change the request for some parameters and not others.
Added in version 1.3.
- Parameters:
sample_weight (str, True, False, or None, default=sklearn.utils.metadata_routing.UNCHANGED) – Metadata routing for the sample_weight parameter in score.
- Returns:
self (object) – The updated object.
- Return type:
- class endgame.models.EvolutionaryTreeRegressor(population_size=100, n_generations=100, max_depth=8, min_samples_leaf=5, alpha=1.0, mutation_prob=0.8, crossover_prob=0.2, patience=20, warm_start=True, n_jobs=1, random_state=None, verbose=False)[source]¶
Bases:
_EvolutionaryTreeBase, RegressorMixin
Evolutionary Tree Regressor - Globally optimal trees via genetic algorithms.
- Parameters:
population_size (int, default=100) – Number of trees in the population.
n_generations (int, default=100) – Maximum number of evolutionary generations.
max_depth (int, default=8) – Maximum depth of trees.
min_samples_leaf (int, default=5) – Minimum samples required in a leaf node.
alpha (float, default=1.0) – Complexity penalty coefficient.
mutation_prob (float, default=0.8) – Probability of applying mutation.
crossover_prob (float, default=0.2) – Probability of using crossover.
patience (int, default=20) – Generations without improvement before early stopping.
warm_start (bool, default=True) – Seed population with a greedy tree.
n_jobs (int, default=1) – Number of parallel jobs (-1 for all processors).
random_state (int, optional) – Random seed for reproducibility.
verbose (bool, default=False) – Print progress during training.
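To make the role of alpha concrete, the sketch below scores a candidate tree as a loss term plus a size penalty, so that larger trees must earn their extra leaves. The exact fitness used by endgame is not shown in this reference; the BIC-style form here (loss n·log(MSE), penalty alpha·n_leaves·log(n)) is an assumption in the spirit of Grubinger et al., and `penalized_fitness` is a hypothetical name.

```python
import math


def penalized_fitness(y_true, y_pred, n_leaves, alpha=1.0):
    """Lower is better: fit term plus a complexity charge per leaf (assumed form)."""
    n = len(y_true)
    mse = sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / n
    # Guard against log(0) when the fit is perfect.
    loss = n * math.log(max(mse, 1e-12))
    complexity = alpha * n_leaves * math.log(n)
    return loss + complexity


y = [1.0, 2.0, 3.0, 4.0]
pred = [1.1, 1.9, 3.2, 3.8]
small_tree = penalized_fitness(y, pred, n_leaves=2)
large_tree = penalized_fitness(y, pred, n_leaves=8)
```

With identical predictions, the eight-leaf tree scores worse than the two-leaf tree, and raising alpha widens that gap.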
- tree_¶
The best tree found.
- Type:
TreeNode
Examples
>>> from endgame.models.trees.evtree import EvolutionaryTreeRegressor
>>> reg = EvolutionaryTreeRegressor(population_size=50, random_state=42)
>>> reg.fit(X_train, y_train)
>>> y_pred = reg.predict(X_test)
- fit(X, y, **fit_params)[source]¶
Fit the evolutionary tree regressor.
- Parameters:
X (array-like of shape (n_samples, n_features)) – Training features.
y (array-like of shape (n_samples,)) – Target values.
- Return type:
- Returns:
self
- predict(X)[source]¶
Predict target values.
- Parameters:
X (array-like of shape (n_samples, n_features)) – Samples to predict.
- Return type:
- Returns:
y_pred (ndarray of shape (n_samples,)) – Predicted values.
- set_score_request(*, sample_weight='$UNCHANGED$')¶
Configure whether metadata should be requested to be passed to the score method.
Note that this method is only relevant when this estimator is used as a sub-estimator within a meta-estimator and metadata routing is enabled with enable_metadata_routing=True (see sklearn.set_config()). Please check the User Guide on how the routing mechanism works.
The options for each parameter are:
True: metadata is requested, and passed to score if provided. The request is ignored if metadata is not provided.
False: metadata is not requested and the meta-estimator will not pass it to score.
None: metadata is not requested, and the meta-estimator will raise an error if the user provides it.
str: metadata should be passed to the meta-estimator with this given alias instead of the original name.
The default (sklearn.utils.metadata_routing.UNCHANGED) retains the existing request. This allows you to change the request for some parameters and not others.
Added in version 1.3.
- Parameters:
sample_weight (str, True, False, or None, default=sklearn.utils.metadata_routing.UNCHANGED) – Metadata routing for sample_weight parameter in score.
self (EvolutionaryTreeRegressor)
- Returns:
self (object) – The updated object.
- Return type:
- class endgame.models.EBMClassifier(feature_names=None, feature_types=None, max_bins=512, max_interaction_bins=32, interactions=10, exclude=None, validation_size=0.15, outer_bags=8, inner_bags=0, learning_rate=0.02, greedy_ratio=10.0, cyclic_progress=False, smoothing_rounds=50, interaction_smoothing_rounds=50, max_rounds=25000, early_stopping_rounds=50, early_stopping_tolerance=1e-05, min_samples_leaf=4, min_hessian=0.0001, reg_alpha=0.0, reg_lambda=0.0, max_delta_step=0.0, gain_scale=5.0, min_cat_samples=10, cat_smooth=10.0, missing='separate', max_leaves=2, monotone_constraints=None, n_jobs=-2, random_state=42)[source]¶
Bases:
ClassifierMixin, EBMBase
Explainable Boosting Machine for Classification.
An interpretable classifier that combines the accuracy of gradient boosting with the transparency of Generalized Additive Models (GAMs).
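The GAM structure is what makes the model transparent: a prediction is an intercept plus one independent contribution per feature, and each contribution can be read from a lookup table over that feature's bins. The sketch below illustrates this idea for a binary classifier. It is not the endgame or interpret implementation; `additive_score`, the bin edges, and the per-bin scores are made up for illustration.

```python
import bisect
import math


def additive_score(x, intercept, bin_edges, bin_scores):
    """Sum one lookup-table contribution per feature (hypothetical helper)."""
    contributions = []
    for value, edges, scores in zip(x, bin_edges, bin_scores):
        # bisect_right maps a raw value to the index of the bin it falls in.
        contributions.append(scores[bisect.bisect_right(edges, value)])
    return intercept + sum(contributions), contributions


def sigmoid(z):
    """Logistic link turning an additive score into a probability."""
    return 1.0 / (1.0 + math.exp(-z))


# Two features, each with two cut points and therefore three bins.
bin_edges = [[0.0, 1.0], [10.0, 20.0]]
bin_scores = [[-0.5, 0.0, 0.8], [0.3, 0.0, -0.4]]

score, contributions = additive_score([1.5, 5.0], -0.2, bin_edges, bin_scores)
probability = sigmoid(score)
```

Because each entry of `contributions` belongs to exactly one feature, the explanation of a single prediction is just this list.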
- Parameters:
feature_names (list of str, optional) – Names for features. If None, uses default naming.
feature_types (list of str, optional) – Types for features (“continuous”, “nominal”, “ordinal”).
max_bins (int, default=512) – Maximum number of bins for continuous features.
max_interaction_bins (int, default=32) – Maximum bins for interaction terms.
interactions (int, float, str, or list, default=10) – Number or specification of interaction terms to detect. Can be an integer (number of interactions), float (fraction), string like “3x” (multiple of features), or explicit list.
exclude (list, optional) – Features or interactions to exclude.
validation_size (float, default=0.15) – Fraction of data to use for validation during training.
outer_bags (int, default=8) – Number of outer bags for ensembling.
inner_bags (int, default=0) – Number of inner bags (0 means no inner bagging).
learning_rate (float, default=0.02) – Learning rate for boosting.
greedy_ratio (float, default=10.0) – Ratio controlling greedy vs cyclic feature selection.
cyclic_progress (bool, default=False) – If True, use cyclic progress; if False, use greedy.
smoothing_rounds (int, default=50) – Number of smoothing rounds for main effects.
interaction_smoothing_rounds (int, default=50) – Number of smoothing rounds for interactions.
max_rounds (int, default=25000) – Maximum number of boosting rounds.
early_stopping_rounds (int, default=50) – Stop if no improvement after this many rounds.
early_stopping_tolerance (float, default=1e-5) – Tolerance for early stopping.
min_samples_leaf (int, default=4) – Minimum samples in a leaf.
min_hessian (float, default=0.0001) – Minimum hessian in a leaf.
reg_alpha (float, default=0.0) – L1 regularization.
reg_lambda (float, default=0.0) – L2 regularization.
max_delta_step (float, default=0.0) – Maximum delta step (0 means no limit).
gain_scale (float, default=5.0) – Scale factor for gain computation.
min_cat_samples (int, default=10) – Minimum samples for categorical bins.
cat_smooth (float, default=10.0) – Smoothing for categorical features.
missing (str, default="separate") – How to handle missing values (“separate”, “min”, “max”).
max_leaves (int, default=2) – Maximum leaves per tree (2 = stumps).
monotone_constraints (list, optional) – Monotonicity constraints per feature (-1, 0, 1).
n_jobs (int, default=-2) – Number of jobs for parallel processing.
random_state (int, default=42) – Random state for reproducibility.
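The interactions parameter accepts several spellings. The sketch below shows one way such a specification could be resolved into a number of interaction terms. It is an illustration only: `resolve_interaction_count` is a hypothetical helper, and reading a float as a fraction of the feature count is an assumption, since the parameter description says only "fraction".

```python
def resolve_interaction_count(interactions, n_features):
    """Hypothetical helper: turn an `interactions` spec into a term count."""
    if isinstance(interactions, list):
        # An explicit list names the interaction terms directly.
        return len(interactions)
    if isinstance(interactions, str):
        # Strings such as "3x" mean a multiple of the number of features.
        if not interactions.endswith("x"):
            raise ValueError(f"unsupported interactions spec: {interactions!r}")
        return round(float(interactions[:-1]) * n_features)
    if isinstance(interactions, float):
        # Assumption: a float is a fraction of the number of features.
        return round(interactions * n_features)
    # A plain integer is the number of interactions to detect.
    return int(interactions)


as_int = resolve_interaction_count(10, n_features=4)
as_multiple = resolve_interaction_count("3x", n_features=4)
as_fraction = resolve_interaction_count(0.5, n_features=10)
as_list = resolve_interaction_count([(0, 1), (1, 2)], n_features=4)
```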
- classes_¶
Unique class labels.
- Type:
ndarray
- intercept_¶
Model intercept.
- Type:
ndarray
Examples
>>> from endgame.models import EBMClassifier
>>> from sklearn.datasets import load_iris
>>> X, y = load_iris(return_X_y=True)
>>> clf = EBMClassifier(interactions=5)
>>> clf.fit(X, y)
>>> clf.score(X, y)
0.98
>>> global_exp = clf.explain_global()
>>> local_exp = clf.explain_local(X[:5])
- fit(X, y, sample_weight=None)[source]¶
Fit the EBM classifier.
- Parameters:
X (array-like of shape (n_samples, n_features)) – Training data.
y (array-like of shape (n_samples,)) – Target labels.
sample_weight (array-like of shape (n_samples,), optional) – Sample weights.
- Return type:
- Returns:
self (EBMClassifier) – Fitted classifier.
- set_fit_request(*, sample_weight='$UNCHANGED$')¶
Configure whether metadata should be requested to be passed to the fit method.
Note that this method is only relevant when this estimator is used as a sub-estimator within a meta-estimator and metadata routing is enabled with enable_metadata_routing=True (see sklearn.set_config()). Please check the User Guide on how the routing mechanism works.
The options for each parameter are:
True: metadata is requested, and passed to fit if provided. The request is ignored if metadata is not provided.
False: metadata is not requested and the meta-estimator will not pass it to fit.
None: metadata is not requested, and the meta-estimator will raise an error if the user provides it.
str: metadata should be passed to the meta-estimator with this given alias instead of the original name.
The default (sklearn.utils.metadata_routing.UNCHANGED) retains the existing request. This allows you to change the request for some parameters and not others.
Added in version 1.3.
- Parameters:
sample_weight (str, True, False, or None, default=sklearn.utils.metadata_routing.UNCHANGED) – Metadata routing for sample_weight parameter in fit.
self (EBMClassifier)
- Returns:
self (object) – The updated object.
- Return type:
- set_score_request(*, sample_weight='$UNCHANGED$')¶
Configure whether metadata should be requested to be passed to the score method.
Note that this method is only relevant when this estimator is used as a sub-estimator within a meta-estimator and metadata routing is enabled with enable_metadata_routing=True (see sklearn.set_config()). Please check the User Guide on how the routing mechanism works.
The options for each parameter are:
True: metadata is requested, and passed to score if provided. The request is ignored if metadata is not provided.
False: metadata is not requested and the meta-estimator will not pass it to score.
None: metadata is not requested, and the meta-estimator will raise an error if the user provides it.
str: metadata should be passed to the meta-estimator with this given alias instead of the original name.
The default (sklearn.utils.metadata_routing.UNCHANGED) retains the existing request. This allows you to change the request for some parameters and not others.
Added in version 1.3.
- Parameters:
sample_weight (str, True, False, or None, default=sklearn.utils.metadata_routing.UNCHANGED) – Metadata routing for sample_weight parameter in score.
self (EBMClassifier)
- Returns:
self (object) – The updated object.
- Return type:
- class endgame.models.EBMRegressor(feature_names=None, feature_types=None, max_bins=1024, max_interaction_bins=64, interactions=10, exclude=None, validation_size=0.15, outer_bags=8, inner_bags=0, learning_rate=0.02, greedy_ratio=10.0, cyclic_progress=False, smoothing_rounds=50, interaction_smoothing_rounds=50, max_rounds=25000, early_stopping_rounds=50, early_stopping_tolerance=1e-05, min_samples_leaf=4, min_hessian=0.0001, reg_alpha=0.0, reg_lambda=0.0, max_delta_step=0.0, gain_scale=5.0, min_cat_samples=10, cat_smooth=10.0, missing='separate', max_leaves=2, monotone_constraints=None, n_jobs=-2, random_state=42)[source]¶
Bases:
EBMBase, RegressorMixin
Explainable Boosting Machine for Regression.
An interpretable regressor that combines the accuracy of gradient boosting with the transparency of Generalized Additive Models (GAMs).
- Parameters:
feature_names (list of str, optional) – Names for features. If None, uses default naming.
feature_types (list of str, optional) – Types for features (“continuous”, “nominal”, “ordinal”).
max_bins (int, default=1024) – Maximum number of bins for continuous features.
max_interaction_bins (int, default=64) – Maximum bins for interaction terms.
interactions (int, float, str, or list, default=10) – Number or specification of interaction terms to detect.
exclude (list, optional) – Features or interactions to exclude.
validation_size (float, default=0.15) – Fraction of data to use for validation during training.
outer_bags (int, default=8) – Number of outer bags for ensembling.
inner_bags (int, default=0) – Number of inner bags.
learning_rate (float, default=0.02) – Learning rate for boosting.
greedy_ratio (float, default=10.0) – Ratio controlling greedy vs cyclic feature selection.
cyclic_progress (bool, default=False) – If True, use cyclic progress.
smoothing_rounds (int, default=50) – Number of smoothing rounds for main effects.
interaction_smoothing_rounds (int, default=50) – Number of smoothing rounds for interactions.
max_rounds (int, default=25000) – Maximum number of boosting rounds.
early_stopping_rounds (int, default=50) – Stop if no improvement after this many rounds.
early_stopping_tolerance (float, default=1e-5) – Tolerance for early stopping.
min_samples_leaf (int, default=4) – Minimum samples in a leaf.
min_hessian (float, default=0.0001) – Minimum hessian in a leaf.
reg_alpha (float, default=0.0) – L1 regularization.
reg_lambda (float, default=0.0) – L2 regularization.
max_delta_step (float, default=0.0) – Maximum delta step.
gain_scale (float, default=5.0) – Scale factor for gain computation.
min_cat_samples (int, default=10) – Minimum samples for categorical bins.
cat_smooth (float, default=10.0) – Smoothing for categorical features.
missing (str, default="separate") – How to handle missing values.
max_leaves (int, default=2) – Maximum leaves per tree.
monotone_constraints (list, optional) – Monotonicity constraints per feature.
n_jobs (int, default=-2) – Number of jobs for parallel processing.
random_state (int, default=42) – Random state for reproducibility.
Examples
>>> from endgame.models import EBMRegressor
>>> from sklearn.datasets import load_diabetes
>>> X, y = load_diabetes(return_X_y=True)
>>> reg = EBMRegressor(interactions=10)
>>> reg.fit(X, y)
>>> reg.score(X, y)
0.72
>>> importance = reg.get_feature_importances()
- fit(X, y, sample_weight=None)[source]¶
Fit the EBM regressor.
- Parameters:
X (array-like of shape (n_samples, n_features)) – Training data.
y (array-like of shape (n_samples,)) – Target values.
sample_weight (array-like of shape (n_samples,), optional) – Sample weights.
- Return type:
- Returns:
self (EBMRegressor) – Fitted regressor.
- set_fit_request(*, sample_weight='$UNCHANGED$')¶
Configure whether metadata should be requested to be passed to the fit method.
Note that this method is only relevant when this estimator is used as a sub-estimator within a meta-estimator and metadata routing is enabled with enable_metadata_routing=True (see sklearn.set_config()). Please check the User Guide on how the routing mechanism works.
The options for each parameter are:
True: metadata is requested, and passed to fit if provided. The request is ignored if metadata is not provided.
False: metadata is not requested and the meta-estimator will not pass it to fit.
None: metadata is not requested, and the meta-estimator will raise an error if the user provides it.
str: metadata should be passed to the meta-estimator with this given alias instead of the original name.
The default (sklearn.utils.metadata_routing.UNCHANGED) retains the existing request. This allows you to change the request for some parameters and not others.
Added in version 1.3.
- Parameters:
sample_weight (str, True, False, or None, default=sklearn.utils.metadata_routing.UNCHANGED) – Metadata routing for sample_weight parameter in fit.
self (EBMRegressor)
- Returns:
self (object) – The updated object.
- Return type:
- set_score_request(*, sample_weight='$UNCHANGED$')¶
Configure whether metadata should be requested to be passed to the score method.
Note that this method is only relevant when this estimator is used as a sub-estimator within a meta-estimator and metadata routing is enabled with enable_metadata_routing=True (see sklearn.set_config()). Please check the User Guide on how the routing mechanism works.
The options for each parameter are:
True: metadata is requested, and passed to score if provided. The request is ignored if metadata is not provided.
False: metadata is not requested and the meta-estimator will not pass it to score.
None: metadata is not requested, and the meta-estimator will raise an error if the user provides it.
str: metadata should be passed to the meta-estimator with this given alias instead of the original name.
The default (sklearn.utils.metadata_routing.UNCHANGED) retains the existing request. This allows you to change the request for some parameters and not others.
Added in version 1.3.
- Parameters:
sample_weight (str, True, False, or None, default=sklearn.utils.metadata_routing.UNCHANGED) – Metadata routing for sample_weight parameter in score.
self (EBMRegressor)
- Returns:
self (object) – The updated object.
- Return type:
- endgame.models.show_explanation(explanation, share_graphs=False)[source]¶
Display an EBM explanation in a dashboard.
This is a convenience function that wraps interpret’s show() function.
- Parameters:
explanation (EBMExplanation) – Explanation from explain_global() or explain_local().
share_graphs (bool, default=False) – If True, link axes across graphs.
- Returns:
None – Opens an interactive dashboard.
- class endgame.models.MARSRegressor(max_terms=None, max_degree=1, penalty=3.0, thresh=0.001, min_span=None, endspan=None, fast_k=20, feature_names=None, allow_linear=True)[source]¶
Bases:
BaseEstimator, RegressorMixin
Multivariate Adaptive Regression Splines for regression.
MARS builds a piecewise linear model by discovering knots (thresholds) where the relationship between features and target changes. The model is an additive combination of hinge functions: max(0, x - knot) and max(0, knot - x).
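The two hinge functions and their additive combination can be written out directly. The sketch below is a minimal illustration with made-up coefficients and knots; `mars_predict` and its term layout are hypothetical and not part of the endgame API.

```python
def hinge_pos(x, knot):
    """max(0, x - knot): zero left of the knot, linear to the right."""
    return max(0.0, x - knot)


def hinge_neg(x, knot):
    """max(0, knot - x): linear left of the knot, zero to the right."""
    return max(0.0, knot - x)


def mars_predict(row, intercept, terms):
    """Evaluate an additive hinge model.

    terms: list of (coefficient, feature_index, knot, sign), where sign > 0
    selects hinge_pos and any other value selects hinge_neg.
    """
    total = intercept
    for coef, j, knot, sign in terms:
        basis = hinge_pos(row[j], knot) if sign > 0 else hinge_neg(row[j], knot)
        total += coef * basis
    return total


# y = 1 + 2 * max(0, x0 - 0.5) - 1 * max(0, 1.0 - x1)
terms = [(2.0, 0, 0.5, +1), (-1.0, 1, 1.0, -1)]
prediction = mars_predict([1.5, 0.0], intercept=1.0, terms=terms)
```

Each term is active on only one side of its knot, which is why the fitted function is piecewise linear.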
- Parameters:
max_terms (int, default=None) – Maximum number of basis functions (including intercept). If None, defaults to min(100, max(20, 2 * n_features)) + 1.
max_degree (int, default=1) – Maximum degree of interactions.
- 1 = additive model (no interactions)
- 2 = pairwise interactions allowed
- 3 = three-way interactions allowed
penalty (float, default=3.0) – Generalized Cross-Validation (GCV) penalty per knot. Higher values produce simpler models. Typical range: 2-4.
thresh (float, default=0.001) – Forward pass stopping threshold. Stops adding terms when R^2 improvement falls below this value.
min_span (int, default=None) – Minimum number of observations between knots. If None, automatically calculated based on data size.
endspan (int, default=None) – Minimum observations before first knot and after last knot. If None, automatically calculated based on data size.
fast_k (int, default=20) – In the forward pass, only consider the best fast_k parent terms when searching for new basis functions. Set to 0 to consider all parents (slower but potentially better). This is “Fast MARS.”
feature_names (list of str, default=None) – Names for features (used in summary output).
allow_linear (bool, default=True) – If True, allows linear terms (no hinge) for features that appear to have purely linear relationships.
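Two of the parameters above reduce to short formulas. The sketch below computes the default max_terms exactly as described, and a GCV score in which penalty charges each knot extra effective parameters. The GCV form used here (effective parameters = n_terms + penalty * (n_terms - 1) / 2) is the one commonly given for MARS and is an assumption about this implementation; both function names are hypothetical.

```python
def default_max_terms(n_features):
    """Default from the max_terms description: min(100, max(20, 2 * n_features)) + 1."""
    return min(100, max(20, 2 * n_features)) + 1


def gcv(rss, n_samples, n_terms, penalty=3.0):
    """Generalized cross-validation score (assumed form); lower is better."""
    # Each term beyond the intercept is charged penalty / 2 extra parameters.
    effective = n_terms + penalty * (n_terms - 1) / 2.0
    if effective >= n_samples:
        return float("inf")  # model too complex for the data size
    return (rss / n_samples) / (1.0 - effective / n_samples) ** 2


small = default_max_terms(3)
medium = default_max_terms(30)
large = default_max_terms(200)
score = gcv(rss=10.0, n_samples=100, n_terms=5)
```

Raising penalty inflates the score of models with many terms, so the backward pruning pass keeps fewer basis functions.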
- coef_¶
Coefficients for each basis function.
- Type:
ndarray of shape (n_basis_functions,)
- feature_names_in_¶
Names of features seen during fit.
- Type:
ndarray of shape (n_features_in_,)
Examples
>>> from endgame.models import MARSRegressor
>>> import numpy as np
>>> X = np.random.randn(100, 3)
>>> y = np.maximum(0, X[:, 0] - 0.5) + 2 * X[:, 1] + np.random.randn(100) * 0.1
>>> model = MARSRegressor(max_degree=1)
>>> model.fit(X, y)
MARSRegressor(max_degree=1)
>>> print(model.summary())
>>> predictions = model.predict(X)
References
Friedman, J. (1991). Multivariate adaptive regression splines. The Annals of Statistics, 19(1), 1-67.
Friedman, J. (1993). Fast MARS. Stanford University Technical Report 110.
Milborrow, S. Earth package vignette (R implementation reference).
- fit(X, y, sample_weight=None)[source]¶
Fit the MARS model.
- Parameters:
X (array-like of shape (n_samples, n_features)) – Training data.
y (array-like of shape (n_samples,)) – Target values.
sample_weight (array-like of shape (n_samples,), default=None) – Individual weights for each sample.
- Return type:
- Returns:
self (object) – Fitted estimator.
- summary()[source]¶
Return a human-readable summary of the model.
Returns a string showing:
- Model equation with all basis functions
- R^2 and GCV statistics
- Variable importance
- Return type:
- Returns:
summary (str) – Formatted model summary.
- compute_variable_importance()[source]¶
Compute variable importance based on GCV decrease.
For each variable, compute how much GCV would increase if all basis functions involving that variable were removed.
- set_fit_request(*, sample_weight='$UNCHANGED$')¶
Configure whether metadata should be requested to be passed to the fit method.
Note that this method is only relevant when this estimator is used as a sub-estimator within a meta-estimator and metadata routing is enabled with enable_metadata_routing=True (see sklearn.set_config()). Please check the User Guide on how the routing mechanism works.
The options for each parameter are:
True: metadata is requested, and passed to fit if provided. The request is ignored if metadata is not provided.
False: metadata is not requested and the meta-estimator will not pass it to fit.
None: metadata is not requested, and the meta-estimator will raise an error if the user provides it.
str: metadata should be passed to the meta-estimator with this given alias instead of the original name.
The default (sklearn.utils.metadata_routing.UNCHANGED) retains the existing request. This allows you to change the request for some parameters and not others.
Added in version 1.3.
- Parameters:
sample_weight (str, True, False, or None, default=sklearn.utils.metadata_routing.UNCHANGED) – Metadata routing for sample_weight parameter in fit.
self (MARSRegressor)
- Returns:
self (object) – The updated object.
- Return type:
- set_score_request(*, sample_weight='$UNCHANGED$')¶
Configure whether metadata should be requested to be passed to the score method.
Note that this method is only relevant when this estimator is used as a sub-estimator within a meta-estimator and metadata routing is enabled with enable_metadata_routing=True (see sklearn.set_config()). Please check the User Guide on how the routing mechanism works.
The options for each parameter are:
True: metadata is requested, and passed to score if provided. The request is ignored if metadata is not provided.
False: metadata is not requested and the meta-estimator will not pass it to score.
None: metadata is not requested, and the meta-estimator will raise an error if the user provides it.
str: metadata should be passed to the meta-estimator with this given alias instead of the original name.
The default (sklearn.utils.metadata_routing.UNCHANGED) retains the existing request. This allows you to change the request for some parameters and not others.
Added in version 1.3.
- Parameters:
sample_weight (str, True, False, or None, default=sklearn.utils.metadata_routing.UNCHANGED) – Metadata routing for sample_weight parameter in score.
self (MARSRegressor)
- Returns:
self (object) – The updated object.
- Return type:
- class endgame.models.MARSClassifier(max_terms=None, max_degree=1, penalty=3.0, thresh=0.001, min_span=None, endspan=None, fast_k=20, feature_names=None, allow_linear=True, method='logistic', logistic_C=1.0)[source]¶
Bases:
ClassifierMixin, BaseEstimator
MARS for classification via logistic regression on basis functions.
Fits a MARS model to discover basis functions, then uses logistic regression on those basis functions for classification.
- Parameters:
max_terms (int, default=None) – Maximum number of basis functions (including intercept). If None, defaults to min(100, max(20, 2 * n_features)) + 1.
max_degree (int, default=1) – Maximum degree of interactions.
- 1 = additive model (no interactions)
- 2 = pairwise interactions allowed
- 3 = three-way interactions allowed
penalty (float, default=3.0) – Generalized Cross-Validation (GCV) penalty per knot. Higher values produce simpler models.
thresh (float, default=0.001) – Forward pass stopping threshold.
min_span (int, default=None) – Minimum number of observations between knots.
endspan (int, default=None) – Minimum observations before first knot and after last knot.
fast_k (int, default=20) – Fast MARS parameter (see MARSRegressor).
feature_names (list of str, default=None) – Names for features.
allow_linear (bool, default=True) – If True, allows linear terms.
method (str, default='logistic') – Classification method:
- ‘logistic’: Logistic regression on MARS basis functions
- ‘threshold’: Threshold regression predictions at 0.5
logistic_C (float, default=1.0) – Regularization parameter for logistic regression. Only used when method=’logistic’.
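The ‘threshold’ method is the simpler of the two: a regression model is fit to 0/1 targets and its continuous output is cut at 0.5. The sketch below shows that decision rule on raw regression outputs. It is illustrative only; `threshold_classify` and `clipped_scores` are hypothetical names, and whether the boundary value 0.5 itself maps to the positive class is an assumption.

```python
def threshold_classify(regression_outputs, classes=(0, 1), cutoff=0.5):
    """Map continuous regression outputs to class labels at a fixed cutoff."""
    # Assumption: a value exactly at the cutoff goes to the positive class.
    return [classes[1] if v >= cutoff else classes[0] for v in regression_outputs]


def clipped_scores(regression_outputs):
    """Clip raw outputs into [0, 1] so they can be read as rough scores."""
    return [min(1.0, max(0.0, v)) for v in regression_outputs]


raw = [-0.2, 0.4, 0.5, 1.3]
labels = threshold_classify(raw)
scores = clipped_scores(raw)
```

Raw regression outputs can fall outside [0, 1], which is one reason the logistic method is the default when calibrated probabilities matter.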
- mars_regressor_¶
Underlying MARS model for basis function discovery.
- Type:
- logistic_¶
Fitted logistic regression model (when method=’logistic’).
- Type:
LogisticRegression
Examples
>>> from endgame.models import MARSClassifier
>>> import numpy as np
>>> X = np.random.randn(100, 3)
>>> y = (X[:, 0] + X[:, 1] > 0).astype(int)
>>> model = MARSClassifier(max_degree=1)
>>> model.fit(X, y)
MARSClassifier(max_degree=1)
>>> predictions = model.predict(X)
>>> probas = model.predict_proba(X)
- fit(X, y, sample_weight=None)[source]¶
Fit the MARS classifier.
- Parameters:
X (array-like of shape (n_samples, n_features)) – Training data.
y (array-like of shape (n_samples,)) – Target class labels.
sample_weight (array-like of shape (n_samples,), default=None) – Individual weights for each sample.
- Return type:
- Returns:
self (object) – Fitted estimator.
- summary()[source]¶
Return a human-readable summary of the model.
- Return type:
- Returns:
summary (str) – Formatted model summary.
- property basis_functions_¶
Return basis functions from underlying MARS regressor.
- set_fit_request(*, sample_weight='$UNCHANGED$')¶
Configure whether metadata should be requested to be passed to the fit method.
Note that this method is only relevant when this estimator is used as a sub-estimator within a meta-estimator and metadata routing is enabled with enable_metadata_routing=True (see sklearn.set_config()). Please check the User Guide on how the routing mechanism works.
The options for each parameter are:
True: metadata is requested, and passed to fit if provided. The request is ignored if metadata is not provided.
False: metadata is not requested and the meta-estimator will not pass it to fit.
None: metadata is not requested, and the meta-estimator will raise an error if the user provides it.
str: metadata should be passed to the meta-estimator with this given alias instead of the original name.
The default (sklearn.utils.metadata_routing.UNCHANGED) retains the existing request. This allows you to change the request for some parameters and not others.
Added in version 1.3.
- Parameters:
sample_weight (str, True, False, or None, default=sklearn.utils.metadata_routing.UNCHANGED) – Metadata routing for sample_weight parameter in fit.
self (MARSClassifier)
- Returns:
self (object) – The updated object.
- Return type:
- set_score_request(*, sample_weight='$UNCHANGED$')¶
Configure whether metadata should be requested to be passed to the score method.
Note that this method is only relevant when this estimator is used as a sub-estimator within a meta-estimator and metadata routing is enabled with enable_metadata_routing=True (see sklearn.set_config()). Please check the User Guide on how the routing mechanism works.
The options for each parameter are:
True: metadata is requested, and passed to score if provided. The request is ignored if metadata is not provided.
False: metadata is not requested and the meta-estimator will not pass it to score.
None: metadata is not requested, and the meta-estimator will raise an error if the user provides it.
str: metadata should be passed to the meta-estimator with this given alias instead of the original name.
The default (sklearn.utils.metadata_routing.UNCHANGED) retains the existing request. This allows you to change the request for some parameters and not others.
Added in version 1.3.
- Parameters:
sample_weight (str, True, False, or None, default=sklearn.utils.metadata_routing.UNCHANGED) – Metadata routing for sample_weight parameter in score.
self (MARSClassifier)
- Returns:
self (object) – The updated object.
- Return type:
- class endgame.models.RuleFitRegressor(tree_generator=None, n_estimators=50, tree_max_depth=3, max_rules=300, min_support=0.01, max_support=0.99, alpha=None, cv=3, include_linear=True, standardize_linear=True, winsorize_linear=0.025, random_state=None, n_jobs=None)[source]¶
Bases:
BaseEstimator, RegressorMixin
RuleFit: Rule-based regression combining tree ensembles with Lasso.
RuleFit generates interpretable models by extracting rules from a tree ensemble and fitting a sparse linear model on the original features plus binary rule features. The result is a human-readable model that shows exactly which rules and features drive predictions.
- Parameters:
tree_generator (estimator or None, default=None) – The tree ensemble used to generate rules. If None, uses GradientBoostingRegressor with default parameters. Must have an estimators_ attribute (the fitted trees) after fitting. Common choices:
- GradientBoostingRegressor/Classifier
- RandomForestRegressor/Classifier
- ExtraTreesRegressor/Classifier
n_estimators (int, default=50) – Number of trees in the ensemble (if tree_generator is None). Ignored if tree_generator is provided.
tree_max_depth (int, default=3) – Maximum depth of trees (if tree_generator is None). Shallow trees (2-4) produce simpler, more interpretable rules. Ignored if tree_generator is provided.
max_rules (int or None, default=300) – Maximum number of rules to extract. If None, extracts all rules. Rules are selected by coverage (fraction of samples satisfying rule).
min_support (float, default=0.01) – Minimum fraction of samples that must satisfy a rule for it to be included. Rules with lower support are discarded.
max_support (float, default=0.99) – Maximum fraction of samples satisfying a rule. Rules with higher support are too general and discarded.
alpha (float or None, default=None) – Lasso regularization strength. If None, selected via cross-validation. Higher values produce sparser (more interpretable) models.
cv (int, default=3) – Number of cross-validation folds for alpha selection. Only used if alpha is None.
include_linear (bool, default=True) – Whether to include original features (linear terms) in the model. If False, model uses only rule features.
standardize_linear (bool, default=True) – Whether to standardize linear features before fitting Lasso. Recommended for fair penalization across features.
winsorize_linear (float or None, default=0.025) – Winsorization quantile for linear features. If not None, clips extreme values at this quantile to reduce outlier influence.
random_state (int, RandomState, or None, default=None) – Random seed for reproducibility.
n_jobs (int, default=None) – Number of parallel jobs for cross-validation.
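The support bounds and the winsorization quantile both act before the Lasso fit. The sketch below shows what each does on plain Python lists. It is an illustration, not the endgame implementation: `filter_by_support` and `winsorize` are hypothetical helpers, and the quantile is taken with a simple lower-rank rule rather than interpolation.

```python
def filter_by_support(rule_columns, min_support=0.01, max_support=0.99):
    """Keep rules whose fraction of satisfied samples lies within the bounds."""
    kept = {}
    for name, column in rule_columns.items():
        support = sum(column) / len(column)
        if min_support <= support <= max_support:
            kept[name] = support
    return kept


def winsorize(values, quantile=0.025):
    """Clip values to the [quantile, 1 - quantile] range of the data."""
    ordered = sorted(values)
    n = len(ordered)
    # Simple lower-rank quantiles (no interpolation), for illustration.
    lo = ordered[int(quantile * (n - 1))]
    hi = ordered[int((1.0 - quantile) * (n - 1))]
    return [min(hi, max(lo, v)) for v in values]


rule_columns = {
    "always": [1, 1, 1, 1],   # too general: support 1.0
    "never": [0, 0, 0, 0],    # too rare: support 0.0
    "half": [1, 1, 0, 0],     # kept: support 0.5
}
kept = filter_by_support(rule_columns)
clipped = winsorize(list(range(101)))
```

Rules satisfied by almost every sample or almost none carry little information, so dropping them shrinks the Lasso design matrix without losing much.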
- rule_ensemble_¶
All extracted rules (before Lasso selection).
- Type:
RuleEnsemble
- coef_¶
Coefficients for all features (linear + rules).
- Type:
ndarray
- linear_coef_¶
Coefficients for original linear features.
- Type:
ndarray of shape (n_features_in_,)
- rule_coef_¶
Coefficients for rule features.
- Type:
ndarray of shape (n_rules,)
- feature_names_in_¶
Feature names seen during fit.
- Type:
ndarray of shape (n_features_in_,)
- tree_generator_¶
Fitted tree ensemble used for rule extraction.
- Type:
estimator
- feature_importances_¶
Importance of each original feature (sum of absolute coefficients for linear term and rules involving that feature).
- Type:
ndarray
Examples
>>> from endgame.models import RuleFitRegressor
>>> from sklearn.datasets import make_regression
>>> X, y = make_regression(n_samples=500, n_features=10, random_state=42)
>>> model = RuleFitRegressor(tree_max_depth=3, random_state=42)
>>> model.fit(X, y)
>>> print(model.get_rules())  # Print selected rules
>>> predictions = model.predict(X)
Notes
For best interpretability:
- Use shallow trees (tree_max_depth=2 or 3) for simpler rules
- Use a higher alpha (more regularization) for sparser models
- Provide meaningful feature_names for readable rule output
References
Friedman, J. H., & Popescu, B. E. (2008). “Predictive learning via rule ensembles.” The Annals of Applied Statistics, 2(3), 916-954.
- fit(X, y, feature_names=None, sample_weight=None)[source]¶
Fit the RuleFit model.
- Parameters:
X (array-like of shape (n_samples, n_features)) – Training data.
y (array-like of shape (n_samples,)) – Target values.
feature_names (list of str, default=None) – Names for features. If None, uses [‘x0’, ‘x1’, …].
sample_weight (array-like of shape (n_samples,), default=None) – Sample weights for fitting.
- Returns:
self (object) – Fitted estimator.
- predict(X)[source]¶
Predict using the fitted RuleFit model.
- Parameters:
X (array-like of shape (n_samples, n_features)) – Samples to predict.
- Returns:
y_pred (ndarray of shape (n_samples,)) – Predicted values.
- transform(X)[source]¶
Transform X into rule features.
- Parameters:
X (array-like of shape (n_samples, n_features)) – Input data.
- Returns:
X_rules (ndarray of shape (n_samples, n_rules)) – Binary matrix of rule satisfactions.
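The binary rule matrix returned by transform can be pictured with the following sketch. The rule encoding (lists of `(feature_index, op, threshold)` tuples) and the helper `rule_matrix` are hypothetical and only illustrate the idea; they are not the library's internal representation.

```python
import operator

OPS = {"<=": operator.le, ">": operator.gt}

# Hypothetical rules: each rule is a conjunction of (feature_index, op, threshold).
rules = [
    [(0, "<=", 2.5)],
    [(0, ">", 2.5), (1, "<=", 0.0)],
]


def rule_matrix(X, rules):
    """Entry [i][j] is 1 if sample i satisfies every condition of rule j, else 0."""
    return [
        [int(all(OPS[op](row[f], t) for f, op, t in rule)) for rule in rules]
        for row in X
    ]


X = [[1.0, -1.0], [3.0, -1.0], [3.0, 2.0]]
X_rules = rule_matrix(X, rules)

# Support of each rule: fraction of samples that satisfy it.
support = [sum(col) / len(X) for col in zip(*X_rules)]
```

The column means of this matrix are the rule supports that min_support and max_support filter on.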
- get_rules(exclude_zero_coef=True, sort_by='importance')[source]¶
Get the extracted rules with their coefficients.
- Parameters:
exclude_zero_coef (bool, default=True) – If True, only return rules with non-zero coefficients.
sort_by (str, default='importance') – How to sort rules:
- ‘importance’: By absolute coefficient value * support (descending)
- ‘support’: By rule support/coverage (descending)
- ‘coefficient’: By raw coefficient value (descending)
- ‘length’: By number of conditions (ascending)
- Return type:
list[dict]
- Returns:
rules (list of dict) – Each dict contains:
- ‘rule’: str, human-readable rule
- ‘coefficient’: float, Lasso coefficient
- ‘support’: float, fraction of samples satisfying rule
- ‘importance’: float, |coefficient| * support
- ‘conditions’: list of Condition objects
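As a small worked example of the importance measure (|coefficient| * support) and the default sort order, consider the sketch below. The rule dicts and the helper `rank_rules` are made up for illustration and only mimic the documented keys.

```python
# Hypothetical rule records using the documented keys.
rules = [
    {"rule": "x0 <= 2.5", "coefficient": -0.8, "support": 0.50},
    {"rule": "x1 > 1.0 and x2 <= 0.3", "coefficient": 1.5, "support": 0.10},
    {"rule": "x3 > 7.0", "coefficient": 0.0, "support": 0.40},
]


def rank_rules(rules, exclude_zero_coef=True):
    """Drop zero-coefficient rules (optionally) and sort by |coefficient| * support."""
    kept = [r for r in rules if not (exclude_zero_coef and r["coefficient"] == 0.0)]
    scored = [dict(r, importance=abs(r["coefficient"]) * r["support"]) for r in kept]
    return sorted(scored, key=lambda r: r["importance"], reverse=True)


ranked = rank_rules(rules)
```

Note that a rule with a large coefficient but small support (the second rule here) can rank below a rule with a modest coefficient that applies to many samples.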
- summary()[source]¶
Return a human-readable summary of the model.
- Return type:
str
- Returns:
summary (str) – Formatted model summary including:
- Model statistics
- Top rules by importance
- Linear feature coefficients
- visualize_rule(rule_idx)[source]¶
Visualize a specific rule’s effect.
- Parameters:
rule_idx (int) – Index of the rule to visualize.
- Returns:
fig (matplotlib Figure) – Visualization of rule effect.
- set_fit_request(*, feature_names='$UNCHANGED$', sample_weight='$UNCHANGED$')¶
Configure whether metadata should be requested to be passed to the fit method.
Note that this method is only relevant when this estimator is used as a sub-estimator within a meta-estimator and metadata routing is enabled with enable_metadata_routing=True (see sklearn.set_config()). Please check the User Guide on how the routing mechanism works.
The options for each parameter are:
- True: metadata is requested, and passed to fit if provided. The request is ignored if metadata is not provided.
- False: metadata is not requested and the meta-estimator will not pass it to fit.
- None: metadata is not requested, and the meta-estimator will raise an error if the user provides it.
- str: metadata should be passed to the meta-estimator with this given alias instead of the original name.
The default (sklearn.utils.metadata_routing.UNCHANGED) retains the existing request. This allows you to change the request for some parameters and not others.
Added in version 1.3.
- Parameters:
feature_names (str, True, False, or None, default=sklearn.utils.metadata_routing.UNCHANGED) – Metadata routing for feature_names parameter in fit.
sample_weight (str, True, False, or None, default=sklearn.utils.metadata_routing.UNCHANGED) – Metadata routing for sample_weight parameter in fit.
self (RuleFitRegressor)
- Returns:
self (object) – The updated object.
- Return type:
- set_score_request(*, sample_weight='$UNCHANGED$')¶
Configure whether metadata should be requested to be passed to the score method.
Note that this method is only relevant when this estimator is used as a sub-estimator within a meta-estimator and metadata routing is enabled with enable_metadata_routing=True (see sklearn.set_config()). Please check the User Guide on how the routing mechanism works.
The options for each parameter are:
- True: metadata is requested, and passed to score if provided. The request is ignored if metadata is not provided.
- False: metadata is not requested and the meta-estimator will not pass it to score.
- None: metadata is not requested, and the meta-estimator will raise an error if the user provides it.
- str: metadata should be passed to the meta-estimator with this given alias instead of the original name.
The default (sklearn.utils.metadata_routing.UNCHANGED) retains the existing request. This allows you to change the request for some parameters and not others.
Added in version 1.3.
- Parameters:
sample_weight (str, True, False, or None, default=sklearn.utils.metadata_routing.UNCHANGED) – Metadata routing for sample_weight parameter in score.
self (RuleFitRegressor)
- Returns:
self (object) – The updated object.
- Return type:
- class endgame.models.RuleFitClassifier(tree_generator=None, n_estimators=50, tree_max_depth=3, max_rules=300, min_support=0.01, max_support=0.99, alpha=None, cv=3, include_linear=True, standardize_linear=True, winsorize_linear=0.025, random_state=None, n_jobs=None, class_weight=None)[source]¶
Bases:
ClassifierMixin, BaseEstimator
RuleFit for classification.
For binary classification, uses logistic regression on rule features. For multiclass, uses one-vs-rest strategy.
- Parameters:
tree_generator (estimator or None, default=None) – The tree ensemble used to generate rules. If None, uses GradientBoostingClassifier with default parameters.
n_estimators (int, default=50) – Number of trees in the ensemble (if tree_generator is None).
tree_max_depth (int, default=3) – Maximum depth of trees (if tree_generator is None).
max_rules (int or None, default=300) – Maximum number of rules to extract.
min_support (float, default=0.01) – Minimum fraction of samples that must satisfy a rule.
max_support (float, default=0.99) – Maximum fraction of samples satisfying a rule.
alpha (float or None, default=None) – Regularization strength (1/C for logistic regression). If None, selected via cross-validation.
cv (int, default=3) – Number of cross-validation folds for alpha selection. Only used if alpha is None.
include_linear (bool, default=True) – Whether to include original features (linear terms).
standardize_linear (bool, default=True) – Whether to standardize linear features.
winsorize_linear (float or None, default=0.025) – Winsorization quantile for linear features.
random_state (int, RandomState, or None, default=None) – Random seed for reproducibility.
n_jobs (int, default=None) – Number of parallel jobs.
class_weight (dict, 'balanced', or None, default=None) – Weights for classes in the logistic regression step.
- (Plus all attributes from RuleFitRegressor)
- fit(X, y, feature_names=None, sample_weight=None)[source]¶
Fit the RuleFit classifier.
- Parameters:
- Returns:
self (object) – Fitted estimator.
- predict(X)[source]¶
Predict class labels.
- Parameters:
X (array-like of shape (n_samples, n_features)) – Samples to predict.
- Returns:
y_pred (ndarray of shape (n_samples,)) – Predicted class labels.
- predict_proba(X)[source]¶
Predict class probabilities.
- Parameters:
X (array-like of shape (n_samples, n_features)) – Samples to predict.
- Returns:
proba (ndarray of shape (n_samples, n_classes)) – Class probabilities.
- predict_log_proba(X)[source]¶
Predict class log-probabilities.
- Parameters:
X (array-like of shape (n_samples, n_features)) – Samples to predict.
- Returns:
log_proba (ndarray of shape (n_samples, n_classes)) – Class log-probabilities.
- transform(X)[source]¶
Transform X into rule features.
- Parameters:
X (array-like of shape (n_samples, n_features)) – Input data.
- Returns:
X_rules (ndarray of shape (n_samples, n_rules)) – Binary matrix of rule satisfactions.
- get_rules(exclude_zero_coef=True, sort_by='importance')[source]¶
Get the extracted rules with their coefficients.
- Parameters:
- Return type:
list[dict]
- Returns:
rules (list of dict) – Each dict contains rule information.
- get_equation(precision=4)[source]¶
Get the model as a human-readable equation.
For binary classification only.
- set_fit_request(*, feature_names='$UNCHANGED$', sample_weight='$UNCHANGED$')¶
Configure whether metadata should be requested to be passed to the fit method.
Note that this method is only relevant when this estimator is used as a sub-estimator within a meta-estimator and metadata routing is enabled with enable_metadata_routing=True (see sklearn.set_config()). Please check the User Guide on how the routing mechanism works.
The options for each parameter are:
- True: metadata is requested, and passed to fit if provided. The request is ignored if metadata is not provided.
- False: metadata is not requested and the meta-estimator will not pass it to fit.
- None: metadata is not requested, and the meta-estimator will raise an error if the user provides it.
- str: metadata should be passed to the meta-estimator with this given alias instead of the original name.
The default (sklearn.utils.metadata_routing.UNCHANGED) retains the existing request. This allows you to change the request for some parameters and not others.
Added in version 1.3.
- Parameters:
feature_names (str, True, False, or None, default=sklearn.utils.metadata_routing.UNCHANGED) – Metadata routing for feature_names parameter in fit.
sample_weight (str, True, False, or None, default=sklearn.utils.metadata_routing.UNCHANGED) – Metadata routing for sample_weight parameter in fit.
self (RuleFitClassifier)
- Returns:
self (object) – The updated object.
- Return type:
- set_score_request(*, sample_weight='$UNCHANGED$')¶
Configure whether metadata should be requested to be passed to the score method.
Note that this method is only relevant when this estimator is used as a sub-estimator within a meta-estimator and metadata routing is enabled with enable_metadata_routing=True (see sklearn.set_config()). Please check the User Guide on how the routing mechanism works.
The options for each parameter are:
- True: metadata is requested, and passed to score if provided. The request is ignored if metadata is not provided.
- False: metadata is not requested and the meta-estimator will not pass it to score.
- None: metadata is not requested, and the meta-estimator will raise an error if the user provides it.
- str: metadata should be passed to the meta-estimator with this given alias instead of the original name.
The default (sklearn.utils.metadata_routing.UNCHANGED) retains the existing request. This allows you to change the request for some parameters and not others.
Added in version 1.3.
- Parameters:
sample_weight (str, True, False, or None, default=sklearn.utils.metadata_routing.UNCHANGED) – Metadata routing for sample_weight parameter in score.
self (RuleFitClassifier)
- Returns:
self (object) – The updated object.
- Return type:
- class endgame.models.FURIAClassifier(max_rules=50, min_support=2, max_conditions=10, fuzzify=True, rule_stretching=True, random_state=None)[source]¶
Bases:
ClassifierMixin, BaseEstimator
Fuzzy Unordered Rule Induction Algorithm (FURIA) classifier.
FURIA learns a set of fuzzy rules for classification. It extends RIPPER with fuzzy boundaries that allow soft decision regions.
- Parameters:
max_rules (int, default=50) – Maximum number of rules to learn.
min_support (int, default=2) – Minimum number of positive examples a rule must cover.
max_conditions (int, default=10) – Maximum number of conditions per rule.
fuzzify (bool, default=True) – Whether to fuzzify rules after learning.
rule_stretching (bool, default=True) – Whether to use rule stretching for uncovered instances.
random_state (int or None, default=None) – Random seed for reproducibility.
- classes_¶
Unique class labels.
- Type:
ndarray
Examples
>>> from endgame.models.rules import FURIAClassifier
>>> clf = FURIAClassifier(max_rules=30)
>>> clf.fit(X_train, y_train)
>>> predictions = clf.predict(X_test)
>>> print(clf.get_rules_str())
- fit(X, y, feature_names=None)[source]¶
Fit the FURIA classifier.
- Parameters:
X (array-like of shape (n_samples, n_features)) – Training data.
y (array-like of shape (n_samples,)) – Target labels.
feature_names (list of str, optional) – Names for the features.
- Return type:
FURIAClassifier
- Returns:
self (FURIAClassifier) – Fitted classifier.
- predict_proba(X)[source]¶
Predict class probabilities.
Uses the fuzzy firing strengths to compute class probabilities.
- Parameters:
X (array-like of shape (n_samples, n_features)) – Samples to predict.
- Return type:
ndarray
- Returns:
proba (ndarray of shape (n_samples, n_classes)) – Class probabilities.
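One plausible way to turn fuzzy firing strengths into class probabilities is sketched below: each rule votes for its consequent class with weight times firing strength, and the per-class totals are normalised. This aggregation, the uniform fallback for uncovered samples, and the helper name `class_probabilities` are assumptions for illustration; the actual implementation (for example, how rule stretching is applied) may differ.

```python
def class_probabilities(firing, weights, consequents, n_classes):
    """Aggregate rule votes for one sample and normalise them to sum to 1."""
    scores = [0.0] * n_classes
    for mu, w, c in zip(firing, weights, consequents):
        scores[c] += mu * w  # firing strength scaled by rule weight
    total = sum(scores)
    if total == 0.0:
        # Assumed fallback: no rule fires, so return a uniform distribution.
        return [1.0 / n_classes] * n_classes
    return [s / total for s in scores]


# Three rules: one for class 0, two for class 1 (values are made up).
proba = class_probabilities(
    firing=[1.0, 0.5, 0.0],
    weights=[0.9, 0.6, 0.8],
    consequents=[0, 1, 1],
    n_classes=2,
)
```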
- predict(X)[source]¶
Predict class labels.
- Parameters:
X (array-like of shape (n_samples, n_features)) – Samples to predict.
- Return type:
ndarray
- Returns:
y_pred (ndarray of shape (n_samples,)) – Predicted class labels.
- get_rules_str()[source]¶
Get a human-readable string representation of the rules.
- Return type:
str
- Returns:
rules_str (str) – String representation of all rules.
- get_rule_importance()[source]¶
Get feature importance based on rule usage.
- Return type:
ndarray
- Returns:
importance (ndarray of shape (n_features,)) – Feature importance scores.
- set_fit_request(*, feature_names='$UNCHANGED$')¶
Configure whether metadata should be requested to be passed to the fit method.
Note that this method is only relevant when this estimator is used as a sub-estimator within a meta-estimator and metadata routing is enabled with enable_metadata_routing=True (see sklearn.set_config()). Please check the User Guide on how the routing mechanism works.
The options for each parameter are:
- True: metadata is requested, and passed to fit if provided. The request is ignored if metadata is not provided.
- False: metadata is not requested and the meta-estimator will not pass it to fit.
- None: metadata is not requested, and the meta-estimator will raise an error if the user provides it.
- str: metadata should be passed to the meta-estimator with this given alias instead of the original name.
The default (sklearn.utils.metadata_routing.UNCHANGED) retains the existing request. This allows you to change the request for some parameters and not others.
Added in version 1.3.
- Parameters:
feature_names (str, True, False, or None, default=sklearn.utils.metadata_routing.UNCHANGED) – Metadata routing for feature_names parameter in fit.
self (FURIAClassifier)
- Returns:
self (object) – The updated object.
- Return type:
- set_score_request(*, sample_weight='$UNCHANGED$')¶
Configure whether metadata should be requested to be passed to the score method.
Note that this method is only relevant when this estimator is used as a sub-estimator within a meta-estimator and metadata routing is enabled with enable_metadata_routing=True (see sklearn.set_config()). Please check the User Guide on how the routing mechanism works.
The options for each parameter are:
- True: metadata is requested, and passed to score if provided. The request is ignored if metadata is not provided.
- False: metadata is not requested and the meta-estimator will not pass it to score.
- None: metadata is not requested, and the meta-estimator will raise an error if the user provides it.
- str: metadata should be passed to the meta-estimator with this given alias instead of the original name.
The default (sklearn.utils.metadata_routing.UNCHANGED) retains the existing request. This allows you to change the request for some parameters and not others.
Added in version 1.3.
- Parameters:
sample_weight (str, True, False, or None, default=sklearn.utils.metadata_routing.UNCHANGED) – Metadata routing for sample_weight parameter in score.
self (FURIAClassifier)
- Returns:
self (object) – The updated object.
- Return type:
- class endgame.models.FuzzyRule(conditions=<factory>, consequent=0, weight=1.0, support=0)[source]¶
Bases:
object
A fuzzy rule consisting of multiple fuzzy conditions.
The rule’s firing strength is the minimum membership across all conditions (fuzzy AND via t-norm).
- Parameters:
conditions (list of FuzzyCondition) – The fuzzy conditions that make up this rule.
consequent (int) – The class label this rule predicts.
weight (float) – Rule weight (confidence/accuracy on training data).
support (int) – Number of training instances covered by this rule.
- conditions: list[FuzzyCondition]¶
- firing_strength(X)[source]¶
Compute firing strength (membership degree) for samples.
Uses minimum t-norm for fuzzy AND.
- Parameters:
X (ndarray of shape (n_samples, n_features)) – Input data.
- Return type:
ndarray
- Returns:
strength (ndarray of shape (n_samples,)) – Firing strength in [0, 1] for each sample.
- covers(X, threshold=0.0)[source]¶
Check which samples are covered by this rule.
- Parameters:
X (ndarray of shape (n_samples, n_features)) – Input data.
threshold (float) – Minimum firing strength to be considered covered.
- Return type:
ndarray
- Returns:
covered (ndarray of shape (n_samples,) dtype=bool) – True where sample is covered by rule.
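The minimum t-norm and the coverage check can be illustrated with plain Python. Here each sample is represented by the list of membership degrees of its conditions, which are assumed to be already computed; the strict comparison against the threshold is also an assumption (with the default threshold of 0.0, a non-strict comparison would cover every sample).

```python
def firing_strength(memberships):
    """Minimum t-norm: a rule fires only as strongly as its weakest condition."""
    return min(memberships)


def covers(memberships, threshold=0.0):
    """Assumed convention: covered when firing strength strictly exceeds threshold."""
    return firing_strength(memberships) > threshold


# Membership degrees of two conditions for three samples (made-up values).
samples = [[1.0, 0.4], [0.7, 0.0], [1.0, 1.0]]
strengths = [firing_strength(m) for m in samples]
covered = [covers(m) for m in samples]
```

A single condition with membership 0 is enough to switch the whole rule off for that sample, which is the defining property of a fuzzy AND.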
- class endgame.models.FuzzyCondition(feature_idx, feature_name, lower_bound=None, upper_bound=None, lower_support=None, upper_support=None)[source]¶
Bases:
object
A fuzzy condition with trapezoidal membership function.
The membership function is defined by four points (a, b, c, d):
- membership = 0 for x <= a or x >= d
- membership = 1 for b <= x <= c
- linear interpolation for a < x < b and c < x < d
For crisp conditions, a=b and c=d.
- Parameters:
feature_idx (int) – Index of the feature.
feature_name (str) – Name of the feature.
lower_bound (float or None) – Lower bound (a, b) of trapezoidal function. None means -inf.
upper_bound (float or None) – Upper bound (c, d) of trapezoidal function. None means +inf.
lower_support (float) – Support point a (where membership starts rising).
upper_support (float) – Support point d (where membership ends).
- membership(X)[source]¶
Compute fuzzy membership degree for samples.
- Parameters:
X (ndarray of shape (n_samples, n_features)) – Input data.
- Return type:
ndarray
- Returns:
membership (ndarray of shape (n_samples,)) – Membership degree in [0, 1] for each sample.
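The trapezoidal membership function defined by the four points (a, b, c, d) can be written out as a scalar function. This is a minimal sketch of the definition given for this class, not the library's vectorised implementation; the function name `trapezoid` is illustrative.

```python
def trapezoid(x, a, b, c, d):
    """Trapezoidal membership with support [a, d] and core [b, c]."""
    if b <= x <= c:
        return 1.0  # inside the core
    if x <= a or x >= d:
        return 0.0  # outside the support
    if x < b:
        return (x - a) / (b - a)  # rising edge, a < x < b
    return (d - x) / (d - c)      # falling edge, c < x < d


rising = trapezoid(0.5, a=0.0, b=1.0, c=2.0, d=4.0)
falling = trapezoid(3.0, a=0.0, b=1.0, c=2.0, d=4.0)

# Crisp condition: a == b and c == d, so membership is either 0 or 1.
crisp_inside = trapezoid(1.0, a=1.0, b=1.0, c=2.0, d=2.0)
crisp_outside = trapezoid(0.99, a=1.0, b=1.0, c=2.0, d=2.0)
```

Checking the core first means the crisp case never reaches the interpolation branches, so no division by zero occurs when a equals b or c equals d.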
- class endgame.models.TANClassifier(smoothing=1.0, root_selection='max_mi', missing_values='error', max_cardinality=100, auto_discretize=True, discretizer_strategy='mdlp', discretizer_max_bins=10, n_jobs=1, random_state=None, verbose=False)[source]¶
Bases:
BaseBayesianClassifier
Tree Augmented Naive Bayes classifier.
TAN extends Naive Bayes by allowing features to have one additional parent from other features, forming a tree structure. This captures pairwise feature dependencies while remaining computationally tractable.
- Parameters:
smoothing (float, default=1.0) – Laplace smoothing parameter (alpha). Use 0 for MLE, 1 for add-one smoothing.
root_selection ({'max_mi', 'random', int}, default='max_mi') – How to select the root of the tree structure:
- ‘max_mi’: Feature with highest MI with target
- ‘random’: Random selection (for ensembling)
- int: Specific feature index
missing_values ({'error', 'marginalize'}, default='error') – Strategy for missing values during predict.
auto_discretize (bool, default=True) – If True, automatically discretize continuous features.
discretizer_strategy (str, default='mdlp') – Discretization strategy: ‘mdlp’, ‘equal_width’, ‘equal_freq’, ‘kmeans’.
discretizer_max_bins (int, default=10) – Maximum bins per feature when auto-discretizing.
n_jobs (int, default=1) – Parallelization for MI computation. -1 uses all cores.
random_state (int, optional) – Random seed for reproducibility.
verbose (bool, default=False) – Enable verbose output.
max_cardinality (int)
- structure_¶
Learned TAN structure after fit().
- Type:
nx.DiGraph
- cpts_¶
Conditional probability tables. cpts_[i] has shape that depends on parent configuration.
- class_prior_¶
Prior class probabilities P(Y).
- Type:
np.ndarray
- feature_importances_¶
Mutual information I(X_i; Y) normalized.
- Type:
np.ndarray
Examples
>>> from endgame.models.bayesian import TANClassifier
>>> clf = TANClassifier(smoothing=1.0)
>>> clf.fit(X_train, y_train)
>>> clf.predict_proba(X_test)
- predict_proba(X)[source]¶
Predict class probabilities using TAN structure.
For each class c: P(Y=c|X) ∝ P(Y=c) * ∏_i P(X_i | Pa(X_i), Y=c)
- Parameters:
X (array-like of shape (n_samples, n_features)) – Samples to predict.
- Return type:
ndarray
- Returns:
ndarray of shape (n_samples, n_classes) – Class probabilities.
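The TAN factorisation P(Y=c|X) ∝ P(Y=c) * ∏_i P(X_i | Pa(X_i), Y=c) can be checked by hand on a toy model. In the sketch below there are two binary features, X0 is the root (its only parent is the class) and X1 has X0 as its feature parent. All probability tables are made-up numbers, and the dictionary layout is an assumption for illustration rather than the layout of cpts_.

```python
# P(Y = c)
prior = {0: 0.6, 1: 0.4}
# P(X0 = v | Y = c), indexed [c][v]
p_x0 = {0: [0.7, 0.3], 1: [0.2, 0.8]}
# P(X1 = v | X0 = u, Y = c), indexed [c][u][v]
p_x1 = {
    0: [[0.9, 0.1], [0.5, 0.5]],
    1: [[0.4, 0.6], [0.1, 0.9]],
}


def tan_posterior(x0, x1):
    """Multiply prior and conditional terms per class, then normalise."""
    joint = {c: prior[c] * p_x0[c][x0] * p_x1[c][x0][x1] for c in prior}
    total = sum(joint.values())
    return {c: joint[c] / total for c in joint}


posterior = tan_posterior(x0=1, x1=1)
```

For this input the unnormalised scores are 0.6 * 0.3 * 0.5 = 0.09 for class 0 and 0.4 * 0.8 * 0.9 = 0.288 for class 1, so class 1 receives most of the probability mass after normalisation.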
- class endgame.models.EBMCClassifier(score='bdeu', equivalent_sample_size=10.0, max_parents=3, max_features=None, convergence_threshold=0.0001, max_iter=100, use_equivalence_transform=True, smoothing=1.0, max_cardinality=100, auto_discretize=True, discretizer_strategy='mdlp', discretizer_max_bins=10, random_state=None, verbose=False)[source]¶
Bases:
BaseBayesianClassifier
Efficient Bayesian Multivariate Classifier with automatic feature selection.
EBMC learns a Bayesian Network structure and then prunes features to those in the Markov Blanket of the target. This provides built-in feature selection while maintaining interpretability.
- Parameters:
score ({'bdeu', 'bic', 'k2'}, default='bdeu') – Scoring function for structure learning.
equivalent_sample_size (float, default=10.0) – ESS for BDeu score. Lower = more aggressive pruning.
max_parents (int, default=3) – Maximum parents per node (controls complexity).
max_features (int | None, default=None) – Maximum features to select. None = no limit.
convergence_threshold (float, default=1e-4) – Stop when score improvement falls below this.
max_iter (int, default=100) – Maximum iterations for structure search.
use_equivalence_transform (bool, default=True) – Whether to apply statistical equivalence transformation.
smoothing (float, default=1.0) – Laplace smoothing for CPT estimation.
random_state (int, optional) – Random seed.
verbose (bool, default=False) – Enable verbose output.
max_cardinality (int)
auto_discretize (bool)
discretizer_strategy (str)
discretizer_max_bins (int)
- structure_¶
Learned DAG structure.
- Type:
nx.DiGraph
Examples
>>> from endgame.models.bayesian import EBMCClassifier
>>> clf = EBMCClassifier(max_parents=2)
>>> clf.fit(X_train, y_train)
>>> print(f"Selected {len(clf.selected_features_)} features")
>>> clf.predict(X_test)
- class endgame.models.ESKDBClassifier(n_estimators=50, k=2, smoothing='hdp', diversity_method='sao', aggregation='averaging', n_jobs=-1, max_cardinality=100, auto_discretize=True, discretizer_strategy='mdlp', discretizer_max_bins=10, random_state=None, verbose=False)[source]¶
Bases:
BaseBayesianClassifier
Ensemble of Selective K-Dependence Bayes classifiers.
ESKDB is a state-of-the-art BNC ensemble that achieves diversity through Stochastic Attribute Ordering (SAO) and/or bootstrapping.
- Parameters:
n_estimators (int, default=50) – Number of KDB models in ensemble.
k (int, default=2) – Maximum number of parent features per node (K-dependence).
smoothing ({'laplace', 'hdp'}, default='hdp') – Smoothing method:
- ‘laplace’: Standard add-alpha smoothing
- ‘hdp’: Hierarchical Dirichlet Process (adapts to sparsity)
diversity_method ({'sao', 'bootstrap', 'both'}, default='sao') – How to generate ensemble diversity:
- ‘sao’: Stochastic Attribute Ordering
- ‘bootstrap’: Sample with replacement
- ‘both’: Combine both methods
aggregation ({'averaging', 'voting', 'stacking'}, default='averaging') – How to combine predictions.
n_jobs (int, default=-1) – Parallelization. -1 uses all cores.
random_state (int, optional) – Random seed.
verbose (bool, default=False) – Enable verbose output.
max_cardinality (int)
auto_discretize (bool)
discretizer_strategy (str)
discretizer_max_bins (int)
- estimators_¶
Fitted KDB models.
- Type:
- feature_importances_¶
Average feature importance across estimators.
- Type:
np.ndarray
Examples
>>> from endgame.models.bayesian import ESKDBClassifier
>>> clf = ESKDBClassifier(n_estimators=50, k=2)
>>> clf.fit(X_train, y_train)
>>> clf.predict_proba(X_test)
- fit(X, y, **fit_params)[source]¶
Fit ensemble of KDB classifiers.
- Parameters:
X (array-like of shape (n_samples, n_features)) – Training data. Can be continuous (will be auto-discretized if auto_discretize=True) or discrete/integer-valued.
y (array-like of shape (n_samples,)) – Target values.
- Return type:
- Returns:
self
- predict_proba(X)[source]¶
Predict class probabilities by aggregating estimators.
- Parameters:
X (array-like of shape (n_samples, n_features)) – Samples to predict.
- Return type:
ndarray
- Returns:
ndarray of shape (n_samples, n_classes) – Class probabilities.
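The difference between the 'averaging' and 'voting' aggregation options can be seen on a single sample. The sketch below is a generic illustration of those two schemes with made-up member outputs; the helper names are hypothetical, and 'stacking' is not covered because it requires a fitted meta-model.

```python
def average_proba(member_probas):
    """Averaging: mean of the members' probability vectors."""
    n = len(member_probas)
    return [sum(col) / n for col in zip(*member_probas)]


def vote_proba(member_probas):
    """Voting: each member casts one vote for its most probable class."""
    n_classes = len(member_probas[0])
    votes = [0] * n_classes
    for p in member_probas:
        votes[p.index(max(p))] += 1
    return [v / len(member_probas) for v in votes]


# Predicted probabilities of three ensemble members for one sample.
member_probas = [[0.9, 0.1], [0.4, 0.6], [0.45, 0.55]]
avg = average_proba(member_probas)
vote = vote_proba(member_probas)
```

Here averaging favours class 0 because one member is very confident, while voting favours class 1 because two of three members lean that way, so the choice of aggregation can change the predicted label.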
- class endgame.models.KDBClassifier(k=2, smoothing='laplace', smoothing_alpha=1.0, attribute_order=None, max_cardinality=100, auto_discretize=True, discretizer_strategy='mdlp', discretizer_max_bins=10, random_state=None, verbose=False)[source]¶
Bases:
BaseBayesianClassifier
K-Dependence Bayes Classifier.
KDB allows each feature to have at most K feature parents plus the class as a parent. This generalizes:
- K=0: Naive Bayes
- K=1: One-dependence estimator (a single TAN-like model; AODE, by contrast, averages many such models)
- K>=n_features: Unrestricted (but computationally expensive)
- Parameters:
k (int, default=2) – Maximum number of feature parents per node.
smoothing ({'laplace', 'hdp'}, default='laplace') – Smoothing method:
- ‘laplace’: Standard add-alpha smoothing
- ‘hdp’: Hierarchical Dirichlet Process
smoothing_alpha (float, default=1.0) – Smoothing parameter (for Laplace).
attribute_order (list[int] | None, default=None) – Custom attribute ordering. If None, uses MI ranking.
random_state (int, optional) – Random seed.
verbose (bool, default=False) – Enable verbose output.
max_cardinality (int)
auto_discretize (bool)
discretizer_strategy (str)
discretizer_max_bins (int)
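How the K limit shapes the network can be sketched as follows. The sketch assumes the classic KDB procedure: features are ordered by mutual information with the class, and each feature then takes as parents up to k earlier features with the highest conditional mutual information. The use of conditional MI for parent selection, the helper `kdb_parents`, and all numbers are assumptions for illustration; this section only states that the default ordering uses MI ranking.

```python
def kdb_parents(mi_with_class, cond_mi, k):
    """Return the feature order and, per feature, its (at most k) feature parents."""
    n = len(mi_with_class)
    # Rank features by mutual information with the class, highest first.
    order = sorted(range(n), key=lambda i: mi_with_class[i], reverse=True)
    parents = {}
    for pos, feat in enumerate(order):
        earlier = order[:pos]  # only already-placed features may be parents
        ranked = sorted(earlier, key=lambda j: cond_mi[feat][j], reverse=True)
        parents[feat] = ranked[:k]
    return order, parents


# Made-up MI values for three features.
mi_with_class = [0.30, 0.10, 0.20]
cond_mi = [
    [0.00, 0.05, 0.12],
    [0.05, 0.00, 0.08],
    [0.12, 0.08, 0.00],
]
order, parents = kdb_parents(mi_with_class, cond_mi, k=1)
_, nb_parents = kdb_parents(mi_with_class, cond_mi, k=0)
```

With k=0 no feature has a feature parent, which reduces the structure to Naive Bayes; the class variable is an implicit parent of every feature in both cases.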
- class endgame.models.AutoSLE(solvers=None, partition_method='spectral', max_cluster_size=50, edge_threshold=0.5, n_jobs=-1, random_state=None, verbose=False)[source]¶
Bases:
object
Scalable structure learning for massive variable sets.
AutoSLE works by:
1. Partitioning variables into manageable clusters
2. Running multiple structure learning algorithms on each cluster
3. Combining results via edge voting
4. Learning inter-cluster edges
5. Fusing into a global DAG
- Parameters:
solvers (list[str], default=['pc', 'fges', 'hc']) – Base solvers to ensemble:
- ‘pc’: PC-Stable (constraint-based, via pgmpy if available)
- ‘fges’: Fast Greedy Equivalence Search (via causal-learn if available)
- ‘hc’: Hill Climbing with restarts (built-in)
- ‘ges’: Greedy Equivalence Search
partition_method ({'spectral', 'correlation', 'random'}, default='spectral') – How to partition variables into clusters.
max_cluster_size (int, default=50) – Maximum variables per cluster.
edge_threshold (float, default=0.5) – Minimum fraction of solvers that must agree on an edge.
n_jobs (int, default=-1) – Parallelization for cluster solving. -1 uses all cores.
random_state (int, optional) – Random seed for reproducibility.
verbose (bool, default=False) – Enable verbose output.
- structure_¶
Learned global DAG.
- Type:
nx.DiGraph
- cluster_assignments_¶
Which cluster each variable was assigned to.
- Type:
np.ndarray
Examples
>>> from endgame.models.bayesian.structure import AutoSLE
>>> sle = AutoSLE(max_cluster_size=30)
>>> structure = sle.learn(data, variable_names=['x1', 'x2', ...])
- learn(data, variable_names=None, cardinalities=None)[source]¶
Learn structure from data.
- Parameters:
- Return type:
DiGraph
- Returns:
nx.DiGraph – Learned directed acyclic graph.
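The edge-voting step controlled by edge_threshold can be sketched in a few lines. The function `vote_edges` and the example edge lists are hypothetical; the sketch only shows the thresholding idea (keep an edge when at least the given fraction of solvers proposed it) and does not attempt to enforce acyclicity or handle inter-cluster edges.

```python
from collections import Counter


def vote_edges(solver_edges, edge_threshold=0.5):
    """Keep directed edges proposed by at least edge_threshold of the solvers."""
    # set(...) ensures a solver cannot vote twice for the same edge.
    counts = Counter(edge for edges in solver_edges for edge in set(edges))
    n_solvers = len(solver_edges)
    return sorted(e for e, c in counts.items() if c / n_solvers >= edge_threshold)


# Directed edges (parent, child) proposed by three solvers for one cluster.
solver_edges = [
    [("a", "b"), ("b", "c")],
    [("a", "b"), ("c", "b")],
    [("a", "b"), ("b", "c"), ("a", "c")],
]
kept = vote_edges(solver_edges, edge_threshold=0.5)
unanimous = vote_edges(solver_edges, edge_threshold=1.0)
```

Raising edge_threshold trades recall for precision: at 1.0 only edges on which every solver agrees survive.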
- class endgame.models.NGBoostRegressor(preset='endgame', distribution='normal', score='crps', n_estimators=None, learning_rate=None, minibatch_frac=None, col_sample=None, base_learner=None, natural_gradient=True, early_stopping_rounds=None, random_state=None, verbose=False, **kwargs)[source]¶
Bases:
EndgameEstimator, RegressorMixin
NGBoost Regressor for probabilistic regression.
Produces full probability distributions for predictions, enabling uncertainty quantification and scoring with proper scoring rules.
- Parameters:
preset (str, default='endgame') – Hyperparameter preset: ‘endgame’, ‘fast’, ‘accurate’, ‘competition’.
distribution (str, default='normal') – Output distribution: ‘normal’, ‘lognormal’, ‘exponential’, ‘laplace’, ‘t’, ‘cauchy’, ‘poisson’.
score (str, default='crps') – Scoring rule: ‘crps’ (Continuous Ranked Probability Score), ‘mle’/’nll’ (Maximum Likelihood / Negative Log Likelihood).
n_estimators (int, optional) – Number of boosting iterations. Overrides preset.
learning_rate (float, optional) – Learning rate. Overrides preset.
minibatch_frac (float, optional) – Fraction of data to use in each iteration.
col_sample (float, optional) – Fraction of features to use in each iteration.
base_learner (estimator, optional) – Base learner for boosting. Default is DecisionTreeRegressor(max_depth=3).
natural_gradient (bool, default=True) – Use natural gradient (recommended).
early_stopping_rounds (int, optional) – Early stopping patience. If None, no early stopping.
random_state (int, optional) – Random seed.
verbose (bool, default=False) – Enable verbose output.
**kwargs – Additional parameters passed to NGBRegressor.
- model_¶
Fitted NGBoost model.
- Type:
NGBRegressor
- feature_importances_¶
Feature importances from the base learners.
- Type:
ndarray
Examples
>>> from endgame.models import NGBoostRegressor
>>> model = NGBoostRegressor(distribution='normal', score='crps')
>>> model.fit(X_train, y_train)
>>>
>>> # Point predictions
>>> y_pred = model.predict(X_test)
>>>
>>> # Full distribution predictions
>>> y_dist = model.pred_dist(X_test)
>>> mean = y_dist.mean()
>>> std = y_dist.std()
>>>
>>> # Prediction intervals
>>> lower, upper = model.predict_interval(X_test, alpha=0.1)  # 90% CI
>>>
>>> # Negative log-likelihood
>>> nll = -y_dist.logpdf(y_test).mean()
References
Duan et al., 2020. “NGBoost: Natural Gradient Boosting for Probabilistic Prediction.” https://arxiv.org/abs/1910.03225
- fit(X, y, X_val=None, y_val=None, sample_weight=None, val_sample_weight=None)[source]¶
Fit the NGBoost regressor.
- Parameters:
X (array-like of shape (n_samples, n_features)) – Training data.
y (array-like of shape (n_samples,)) – Target values.
X_val (array-like, optional) – Validation features for early stopping.
y_val (array-like, optional) – Validation targets for early stopping.
sample_weight (array-like, optional) – Training sample weights.
val_sample_weight (array-like, optional) – Validation sample weights.
- Return type:
- Returns:
self
- predict(X)[source]¶
Predict the mean of the distribution.
- Parameters:
X (array-like of shape (n_samples, n_features)) – Samples to predict.
- Return type:
ndarray
- Returns:
y_pred (ndarray of shape (n_samples,)) – Predicted means.
- pred_dist(X)[source]¶
Predict the full distribution.
- Parameters:
X (array-like of shape (n_samples, n_features)) – Samples to predict.
- Returns:
dist (ngboost distribution) – Predicted distributions with methods:
- mean(): Expected value
- std(): Standard deviation
- var(): Variance
- logpdf(y): Log probability density
- pdf(y): Probability density
- cdf(y): Cumulative distribution function
- ppf(q): Percent point function (inverse CDF)
- sample(n): Draw n samples
- predict_interval(X, alpha=0.1)[source]¶
Predict prediction intervals.
- Parameters:
X (array-like of shape (n_samples, n_features)) – Samples to predict.
alpha (float, default=0.1) – Significance level. Returns (1-alpha) prediction interval. E.g., alpha=0.1 returns 90% prediction interval.
- Return type:
tuple[ndarray, ndarray]
- Returns:
lower (ndarray of shape (n_samples,)) – Lower bound of prediction interval.
upper (ndarray of shape (n_samples,)) – Upper bound of prediction interval.
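For a Normal predictive distribution, a (1-alpha) interval can be obtained from the inverse CDF. The sketch below assumes a central (equal-tailed) interval and uses the standard library's NormalDist in place of an NGBoost distribution object; both choices are assumptions for illustration, and `normal_interval` is a hypothetical helper.

```python
from statistics import NormalDist


def normal_interval(mean, std, alpha=0.1):
    """Central (1 - alpha) interval of a Normal(mean, std) distribution."""
    dist = NormalDist(mu=mean, sigma=std)
    lower = dist.inv_cdf(alpha / 2)      # alpha/2 quantile
    upper = dist.inv_cdf(1 - alpha / 2)  # 1 - alpha/2 quantile
    return lower, upper


# alpha=0.1 gives a 90% interval, roughly mean +/- 1.645 * std.
lower, upper = normal_interval(mean=10.0, std=2.0, alpha=0.1)
```

Smaller alpha values widen the interval; alpha=0.05 would give the familiar mean +/- 1.96 * std bounds.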
- predict_std(X)[source]¶
Predict the standard deviation (uncertainty).
- Parameters:
X (array-like of shape (n_samples, n_features)) – Samples to predict.
- Return type:
ndarray
- Returns:
std (ndarray of shape (n_samples,)) – Predicted standard deviations.
- score(X, y, sample_weight=None)[source]¶
Return the negative log-likelihood on the given data.
- Parameters:
X (array-like of shape (n_samples, n_features)) – Test samples.
y (array-like of shape (n_samples,)) – True target values.
sample_weight (array-like, optional) – Sample weights (not used, for API compatibility).
- Returns:
score (float) – Mean negative log-likelihood (lower is better).
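Because score returns a mean negative log-likelihood rather than R², lower values indicate a better fit. The sketch below shows how such a score could be computed for Gaussian predictions from per-sample means and standard deviations. The Gaussian form and the function name mean_gaussian_nll are assumptions for illustration; the distribution actually used depends on how the model is configured.

```python
import math


def mean_gaussian_nll(y_true, means, stds):
    """Mean negative log-likelihood of y_true under Normal(means, stds).

    Hypothetical helper; assumes one Gaussian per sample.
    """
    total = 0.0
    for y, mu, sigma in zip(y_true, means, stds):
        # -log N(y | mu, sigma^2)
        total += 0.5 * math.log(2 * math.pi * sigma ** 2)
        total += (y - mu) ** 2 / (2 * sigma ** 2)
    return total / len(y_true)


y_true = [1.0, 2.0, 3.0]
means = [1.1, 1.9, 3.2]

# Same means, different claimed uncertainty.
confident = mean_gaussian_nll(y_true, means, stds=[0.2, 0.2, 0.2])
vague = mean_gaussian_nll(y_true, means, stds=[5.0, 5.0, 5.0])
```

With accurate means, the tighter standard deviations score lower (better) than the needlessly wide ones, which is what makes the metric sensitive to the quality of the predicted uncertainty and not only the point prediction.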
- set_fit_request(*, X_val='$UNCHANGED$', sample_weight='$UNCHANGED$', val_sample_weight='$UNCHANGED$', y_val='$UNCHANGED$')¶
Configure whether metadata should be requested to be passed to the fit method.
Note that this method is only relevant when this estimator is used as a sub-estimator within a meta-estimator and metadata routing is enabled with enable_metadata_routing=True (see sklearn.set_config()). Please check the User Guide on how the routing mechanism works.
The options for each parameter are:
True: metadata is requested, and passed to fit if provided. The request is ignored if metadata is not provided.
False: metadata is not requested and the meta-estimator will not pass it to fit.
None: metadata is not requested, and the meta-estimator will raise an error if the user provides it.
str: metadata should be passed to the meta-estimator with this given alias instead of the original name.
The default (sklearn.utils.metadata_routing.UNCHANGED) retains the existing request. This allows you to change the request for some parameters and not others.
Added in version 1.3.
- Parameters:
X_val (str, True, False, or None, default=sklearn.utils.metadata_routing.UNCHANGED) – Metadata routing for X_val parameter in fit.
sample_weight (str, True, False, or None, default=sklearn.utils.metadata_routing.UNCHANGED) – Metadata routing for sample_weight parameter in fit.
val_sample_weight (str, True, False, or None, default=sklearn.utils.metadata_routing.UNCHANGED) – Metadata routing for val_sample_weight parameter in fit.
y_val (str, True, False, or None, default=sklearn.utils.metadata_routing.UNCHANGED) – Metadata routing for y_val parameter in fit.
self (NGBoostRegressor)
- Returns:
self (object) – The updated object.
- set_score_request(*, sample_weight='$UNCHANGED$')¶
Configure whether metadata should be requested to be passed to the score method.
Note that this method is only relevant when this estimator is used as a sub-estimator within a meta-estimator and metadata routing is enabled with enable_metadata_routing=True (see sklearn.set_config()). Please check the User Guide on how the routing mechanism works.
The options for each parameter are:
True: metadata is requested, and passed to score if provided. The request is ignored if metadata is not provided.
False: metadata is not requested and the meta-estimator will not pass it to score.
None: metadata is not requested, and the meta-estimator will raise an error if the user provides it.
str: metadata should be passed to the meta-estimator with this given alias instead of the original name.
The default (sklearn.utils.metadata_routing.UNCHANGED) retains the existing request. This allows you to change the request for some parameters and not others.
Added in version 1.3.
- Parameters:
sample_weight (str, True, False, or None, default=sklearn.utils.metadata_routing.UNCHANGED) – Metadata routing for sample_weight parameter in score.
self (NGBoostRegressor)
- Returns:
self (object) – The updated object.
- class endgame.models.NGBoostClassifier(preset='endgame', n_estimators=None, learning_rate=None, minibatch_frac=None, col_sample=None, base_learner=None, natural_gradient=True, early_stopping_rounds=None, random_state=None, verbose=False, **kwargs)[source]¶
Bases: ClassifierMixin, EndgameEstimator
NGBoost Classifier for probabilistic classification.
Produces calibrated probability distributions over classes, with proper uncertainty quantification.
- Parameters:
preset (str, default='endgame') – Hyperparameter preset: ‘endgame’, ‘fast’, ‘accurate’, ‘competition’.
n_estimators (int, optional) – Number of boosting iterations. Overrides preset.
learning_rate (float, optional) – Learning rate. Overrides preset.
minibatch_frac (float, optional) – Fraction of data to use in each iteration.
col_sample (float, optional) – Fraction of features to use in each iteration.
base_learner (estimator, optional) – Base learner for boosting. Default is DecisionTreeRegressor(max_depth=3).
natural_gradient (bool, default=True) – Use natural gradient (recommended).
early_stopping_rounds (int, optional) – Early stopping patience. If None, no early stopping.
random_state (int, optional) – Random seed.
verbose (bool, default=False) – Enable verbose output.
**kwargs – Additional parameters passed to NGBClassifier.
- model_¶
Fitted NGBoost model.
- Type:
NGBClassifier
- classes_¶
Unique class labels.
- Type:
ndarray
- feature_importances_¶
Feature importances from the base learners.
- Type:
ndarray
Examples
>>> from endgame.models import NGBoostClassifier
>>> model = NGBoostClassifier(preset='endgame')
>>> model.fit(X_train, y_train)
>>>
>>> # Class predictions
>>> y_pred = model.predict(X_test)
>>>
>>> # Probability predictions
>>> y_proba = model.predict_proba(X_test)
>>>
>>> # Distribution predictions
>>> y_dist = model.pred_dist(X_test)
>>>
>>> # Log-loss
>>> from sklearn.metrics import log_loss
>>> loss = log_loss(y_test, y_proba)
References
Duan et al., 2020. “NGBoost: Natural Gradient Boosting for Probabilistic Prediction.” https://arxiv.org/abs/1910.03225
- fit(X, y, X_val=None, y_val=None, sample_weight=None, val_sample_weight=None)[source]¶
Fit the NGBoost classifier.
- Parameters:
X (array-like of shape (n_samples, n_features)) – Training data.
y (array-like of shape (n_samples,)) – Target labels.
X_val (array-like, optional) – Validation features for early stopping.
y_val (array-like, optional) – Validation labels for early stopping.
sample_weight (array-like, optional) – Training sample weights.
val_sample_weight (array-like, optional) – Validation sample weights.
- Returns:
self
- predict(X)[source]¶
Predict class labels.
- Parameters:
X (array-like of shape (n_samples, n_features)) – Samples to predict.
- Returns:
y_pred (ndarray of shape (n_samples,)) – Predicted class labels.
- predict_proba(X)[source]¶
Predict class probabilities.
- Parameters:
X (array-like of shape (n_samples, n_features)) – Samples to predict.
- Returns:
proba (ndarray of shape (n_samples, n_classes)) – Class probabilities.
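The probabilities returned here are what a log-loss metric consumes. As a rough illustration, the sketch below computes mean log-loss by hand from a probability matrix. It assumes the columns follow the order of classes_ (the scikit-learn convention, not confirmed on this page for this estimator), and mean_log_loss is a hypothetical helper name.

```python
import math


def mean_log_loss(y_true, proba, classes, eps=1e-15):
    """Mean negative log-probability assigned to the true class.

    Assumes proba[i][j] is the probability of classes[j] for sample i.
    """
    column = {label: j for j, label in enumerate(classes)}
    total = 0.0
    for label, row in zip(y_true, proba):
        # Clip to avoid log(0) for overconfident wrong predictions.
        p = min(max(row[column[label]], eps), 1 - eps)
        total -= math.log(p)
    return total / len(y_true)


classes = ["cat", "dog"]
proba = [[0.9, 0.1], [0.2, 0.8], [0.6, 0.4]]
y_true = ["cat", "dog", "dog"]

loss = mean_log_loss(y_true, proba, classes)
```

The third sample contributes most of the loss, because the true class "dog" only received probability 0.4.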
- pred_dist(X)[source]¶
Predict the full distribution over classes.
- Parameters:
X (array-like of shape (n_samples, n_features)) – Samples to predict.
- Returns:
dist (ngboost distribution) – Predicted distributions.
- score(X, y, sample_weight=None)[source]¶
Return accuracy on the given data.
- Parameters:
X (array-like of shape (n_samples, n_features)) – Test samples.
y (array-like of shape (n_samples,)) – True labels.
sample_weight (array-like, optional) – Sample weights.
- Returns:
score (float) – Accuracy score.
- set_fit_request(*, X_val='$UNCHANGED$', sample_weight='$UNCHANGED$', val_sample_weight='$UNCHANGED$', y_val='$UNCHANGED$')¶
Configure whether metadata should be requested to be passed to the fit method.
Note that this method is only relevant when this estimator is used as a sub-estimator within a meta-estimator and metadata routing is enabled with enable_metadata_routing=True (see sklearn.set_config()). Please check the User Guide on how the routing mechanism works.
The options for each parameter are:
True: metadata is requested, and passed to fit if provided. The request is ignored if metadata is not provided.
False: metadata is not requested and the meta-estimator will not pass it to fit.
None: metadata is not requested, and the meta-estimator will raise an error if the user provides it.
str: metadata should be passed to the meta-estimator with this given alias instead of the original name.
The default (sklearn.utils.metadata_routing.UNCHANGED) retains the existing request. This allows you to change the request for some parameters and not others.
Added in version 1.3.
- Parameters:
X_val (str, True, False, or None, default=sklearn.utils.metadata_routing.UNCHANGED) – Metadata routing for X_val parameter in fit.
sample_weight (str, True, False, or None, default=sklearn.utils.metadata_routing.UNCHANGED) – Metadata routing for sample_weight parameter in fit.
val_sample_weight (str, True, False, or None, default=sklearn.utils.metadata_routing.UNCHANGED) – Metadata routing for val_sample_weight parameter in fit.
y_val (str, True, False, or None, default=sklearn.utils.metadata_routing.UNCHANGED) – Metadata routing for y_val parameter in fit.
self (NGBoostClassifier)
- Returns:
self (object) – The updated object.
- set_score_request(*, sample_weight='$UNCHANGED$')¶
Configure whether metadata should be requested to be passed to the score method.
Note that this method is only relevant when this estimator is used as a sub-estimator within a meta-estimator and metadata routing is enabled with enable_metadata_routing=True (see sklearn.set_config()). Please check the User Guide on how the routing mechanism works.
The options for each parameter are:
True: metadata is requested, and passed to score if provided. The request is ignored if metadata is not provided.
False: metadata is not requested and the meta-estimator will not pass it to score.
None: metadata is not requested, and the meta-estimator will raise an error if the user provides it.
str: metadata should be passed to the meta-estimator with this given alias instead of the original name.
The default (sklearn.utils.metadata_routing.UNCHANGED) retains the existing request. This allows you to change the request for some parameters and not others.
Added in version 1.3.
- Parameters:
sample_weight (str, True, False, or None, default=sklearn.utils.metadata_routing.UNCHANGED) – Metadata routing for sample_weight parameter in score.
self (NGBoostClassifier)
- Returns:
self (object) – The updated object.
- class endgame.models.MLPClassifier(hidden_dims=None, dropout=0.3, batch_norm=True, activation='relu', learning_rate=0.001, weight_decay=1e-05, n_epochs=100, batch_size=256, early_stopping=10, class_weight=None, scheduler='cosine', device='auto', random_state=None, verbose=False)[source]¶
Bases: ClassifierMixin, _BaseMLPEstimator
Multi-Layer Perceptron classifier.
PyTorch-based MLP with modern techniques for tabular classification.
- Parameters:
hidden_dims (List[int], default=[256, 128]) – Hidden layer dimensions.
dropout (float, default=0.3) – Dropout rate for regularization.
batch_norm (bool, default=True) – Whether to use batch normalization.
activation (str, default='relu') – Activation function.
learning_rate (float, default=1e-3) – Initial learning rate.
weight_decay (float, default=1e-5) – L2 regularization strength.
n_epochs (int, default=100) – Maximum number of training epochs.
batch_size (int, default=256) – Training batch size.
early_stopping (int, default=10) – Patience for early stopping.
class_weight (str or dict, optional) – Class weights: ‘balanced’ or dict mapping classes to weights.
scheduler (str, default='cosine') – Learning rate scheduler.
device (str, default='auto') – Device: ‘cuda’, ‘cpu’, or ‘auto’.
random_state (int, optional) – Random seed.
verbose (bool, default=False) – Enable verbose output.
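This page does not define what class_weight='balanced' computes. A common convention, used by scikit-learn, weights each class by n_samples / (n_classes * class_count), so that rare classes contribute as much total weight as frequent ones. The sketch below implements that convention; whether this estimator follows it exactly is an assumption, and balanced_class_weights is a hypothetical helper name.

```python
from collections import Counter


def balanced_class_weights(y):
    """Per-class weights using the scikit-learn 'balanced' heuristic.

    Assumption: shown for illustration, not taken from this library's source.
    """
    counts = Counter(y)
    n_samples = len(y)
    n_classes = len(counts)
    return {
        label: n_samples / (n_classes * count)
        for label, count in counts.items()
    }


# Three samples of class 0 and one of class 1.
weights = balanced_class_weights([0, 0, 0, 1])
```

The same mapping can also be passed explicitly as a dict, which is the other form class_weight accepts.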
- classes_¶
Unique class labels.
- Type:
ndarray
- model_¶
Fitted PyTorch model.
- Type:
_MLPModule
Examples
>>> from endgame.models.neural import MLPClassifier
>>> clf = MLPClassifier(hidden_dims=[128, 64], n_epochs=50)
>>> clf.fit(X_train, y_train, val_data=(X_val, y_val))
>>> predictions = clf.predict(X_test)
>>> probabilities = clf.predict_proba(X_test)
- fit(X, y, val_data=None)[source]¶
Fit the classifier.
- Parameters:
X (array-like of shape (n_samples, n_features)) – Training features.
y (array-like of shape (n_samples,)) – Target labels.
val_data (tuple of (X_val, y_val), optional) – Validation data for early stopping.
- Returns:
self – Fitted classifier.
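The interaction between val_data and the early_stopping patience can be pictured as a loop over per-epoch validation losses. The sketch below is a generic patience rule, not this library's training loop: it stops once the validation loss has failed to improve for patience consecutive epochs and reports the best epoch seen. The function name and the strict-improvement criterion are assumptions.

```python
def best_epoch_with_patience(val_losses, patience=10):
    """Return (best_epoch, last_epoch_run) under a simple patience rule.

    Hypothetical helper; a real loop would train one epoch per iteration
    instead of reading precomputed losses.
    """
    best_loss = float("inf")
    best_epoch = -1
    for epoch, loss in enumerate(val_losses):
        if loss < best_loss:
            # Strict improvement resets the patience counter.
            best_loss, best_epoch = loss, epoch
        elif epoch - best_epoch >= patience:
            # No improvement for `patience` epochs: stop here.
            return best_epoch, epoch
    return best_epoch, len(val_losses) - 1


losses = [0.9, 0.7, 0.6, 0.61, 0.62, 0.63, 0.59]
best, stopped = best_epoch_with_patience(losses, patience=3)
```

With patience=3 the run stops at epoch 5 and never reaches the lower loss at epoch 6, which illustrates the trade-off: a small patience saves time but can stop before a late improvement.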
- predict(X)[source]¶
Predict class labels.
- Parameters:
X (array-like of shape (n_samples, n_features)) – Input samples.
- Returns:
ndarray of shape (n_samples,) – Predicted class labels.
- predict_proba(X)[source]¶
Predict class probabilities.
- Parameters:
X (array-like of shape (n_samples, n_features)) – Input samples.
- Returns:
ndarray of shape (n_samples, n_classes) – Class probabilities.
- set_fit_request(*, val_data='$UNCHANGED$')¶
Configure whether metadata should be requested to be passed to the fit method.
Note that this method is only relevant when this estimator is used as a sub-estimator within a meta-estimator and metadata routing is enabled with enable_metadata_routing=True (see sklearn.set_config()). Please check the User Guide on how the routing mechanism works.
The options for each parameter are:
True: metadata is requested, and passed to fit if provided. The request is ignored if metadata is not provided.
False: metadata is not requested and the meta-estimator will not pass it to fit.
None: metadata is not requested, and the meta-estimator will raise an error if the user provides it.
str: metadata should be passed to the meta-estimator with this given alias instead of the original name.
The default (sklearn.utils.metadata_routing.UNCHANGED) retains the existing request. This allows you to change the request for some parameters and not others.
Added in version 1.3.
- Parameters:
val_data (str, True, False, or None, default=sklearn.utils.metadata_routing.UNCHANGED) – Metadata routing for val_data parameter in fit.
self (MLPClassifier)
- Returns:
self (object) – The updated object.
- set_score_request(*, sample_weight='$UNCHANGED$')¶
Configure whether metadata should be requested to be passed to the score method.
Note that this method is only relevant when this estimator is used as a sub-estimator within a meta-estimator and metadata routing is enabled with enable_metadata_routing=True (see sklearn.set_config()). Please check the User Guide on how the routing mechanism works.
The options for each parameter are:
True: metadata is requested, and passed to score if provided. The request is ignored if metadata is not provided.
False: metadata is not requested and the meta-estimator will not pass it to score.
None: metadata is not requested, and the meta-estimator will raise an error if the user provides it.
str: metadata should be passed to the meta-estimator with this given alias instead of the original name.
The default (sklearn.utils.metadata_routing.UNCHANGED) retains the existing request. This allows you to change the request for some parameters and not others.
Added in version 1.3.
- Parameters:
sample_weight (str, True, False, or None, default=sklearn.utils.metadata_routing.UNCHANGED) – Metadata routing for sample_weight parameter in score.
self (MLPClassifier)
- Returns:
self (object) – The updated object.
- class endgame.models.MLPRegressor(hidden_dims=None, dropout=0.3, batch_norm=True, activation='relu', learning_rate=0.001, weight_decay=1e-05, n_epochs=100, batch_size=256, early_stopping=10, loss='mse', scheduler='cosine', device='auto', random_state=None, verbose=False)[source]¶
Bases: _BaseMLPEstimator, RegressorMixin
Multi-Layer Perceptron regressor.
PyTorch-based MLP with modern techniques for tabular regression.
- Parameters:
hidden_dims (List[int], default=[256, 128]) – Hidden layer dimensions.
dropout (float, default=0.3) – Dropout rate for regularization.
batch_norm (bool, default=True) – Whether to use batch normalization.
activation (str, default='relu') – Activation function.
learning_rate (float, default=1e-3) – Initial learning rate.
weight_decay (float, default=1e-5) – L2 regularization strength.
n_epochs (int, default=100) – Maximum number of training epochs.
batch_size (int, default=256) – Training batch size.
early_stopping (int, default=10) – Patience for early stopping.
loss (str, default='mse') – Loss function: ‘mse’, ‘mae’, ‘huber’.
scheduler (str, default='cosine') – Learning rate scheduler.
device (str, default='auto') – Device: ‘cuda’, ‘cpu’, or ‘auto’.
random_state (int, optional) – Random seed.
verbose (bool, default=False) – Enable verbose output.
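The three loss options differ mainly in how they treat large residuals. The sketch below evaluates the textbook definitions on a small example containing one outlier. The Huber threshold delta=1.0 is an assumption (this page does not state the value used), and the function names are hypothetical.

```python
def huber(residual, delta=1.0):
    """Huber loss: quadratic near zero, linear beyond delta."""
    a = abs(residual)
    if a <= delta:
        return 0.5 * residual ** 2
    return delta * (a - 0.5 * delta)


def mean_loss(y_true, y_pred, kind="mse"):
    """Mean of the chosen per-sample loss over all residuals."""
    residuals = [t - p for t, p in zip(y_true, y_pred)]
    if kind == "mse":
        per_sample = [r ** 2 for r in residuals]
    elif kind == "mae":
        per_sample = [abs(r) for r in residuals]
    elif kind == "huber":
        per_sample = [huber(r) for r in residuals]
    else:
        raise ValueError(f"unknown loss: {kind}")
    return sum(per_sample) / len(per_sample)


y_true = [0.0, 0.0, 0.0]
y_pred = [0.5, -1.0, 10.0]  # the last prediction is an outlier

mse = mean_loss(y_true, y_pred, "mse")
mae = mean_loss(y_true, y_pred, "mae")
hub = mean_loss(y_true, y_pred, "huber")
```

The outlier dominates the squared-error loss, while the absolute and Huber losses grow only linearly with it, which is the usual reason to prefer 'mae' or 'huber' on noisy targets.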
- model_¶
Fitted PyTorch model.
- Type:
_MLPModule
Examples
>>> from endgame.models.neural import MLPRegressor
>>> reg = MLPRegressor(hidden_dims=[128, 64], n_epochs=50)
>>> reg.fit(X_train, y_train, val_data=(X_val, y_val))
>>> predictions = reg.predict(X_test)
- fit(X, y, val_data=None)[source]¶
Fit the regressor.
- Parameters:
X (array-like of shape (n_samples, n_features)) – Training features.
y (array-like of shape (n_samples,) or (n_samples, n_targets)) – Target values.
val_data (tuple of (X_val, y_val), optional) – Validation data for early stopping.
- Returns:
self – Fitted regressor.
- predict(X)[source]¶
Predict target values.
- Parameters:
X (array-like of shape (n_samples, n_features)) – Input samples.
- Returns:
ndarray of shape (n_samples,) or (n_samples, n_targets) – Predicted values.
- set_fit_request(*, val_data='$UNCHANGED$')¶
Configure whether metadata should be requested to be passed to the fit method.
Note that this method is only relevant when this estimator is used as a sub-estimator within a meta-estimator and metadata routing is enabled with enable_metadata_routing=True (see sklearn.set_config()). Please check the User Guide on how the routing mechanism works.
The options for each parameter are:
True: metadata is requested, and passed to fit if provided. The request is ignored if metadata is not provided.
False: metadata is not requested and the meta-estimator will not pass it to fit.
None: metadata is not requested, and the meta-estimator will raise an error if the user provides it.
str: metadata should be passed to the meta-estimator with this given alias instead of the original name.
The default (sklearn.utils.metadata_routing.UNCHANGED) retains the existing request. This allows you to change the request for some parameters and not others.
Added in version 1.3.
- Parameters:
val_data (str, True, False, or None, default=sklearn.utils.metadata_routing.UNCHANGED) – Metadata routing for val_data parameter in fit.
self (MLPRegressor)
- Returns:
self (object) – The updated object.
- set_score_request(*, sample_weight='$UNCHANGED$')¶
Configure whether metadata should be requested to be passed to the score method.
Note that this method is only relevant when this estimator is used as a sub-estimator within a meta-estimator and metadata routing is enabled with enable_metadata_routing=True (see sklearn.set_config()). Please check the User Guide on how the routing mechanism works.
The options for each parameter are:
True: metadata is requested, and passed to score if provided. The request is ignored if metadata is not provided.
False: metadata is not requested and the meta-estimator will not pass it to score.
None: metadata is not requested, and the meta-estimator will raise an error if the user provides it.
str: metadata should be passed to the meta-estimator with this given alias instead of the original name.
The default (sklearn.utils.metadata_routing.UNCHANGED) retains the existing request. This allows you to change the request for some parameters and not others.
Added in version 1.3.
- Parameters:
sample_weight (str, True, False, or None, default=sklearn.utils.metadata_routing.UNCHANGED) – Metadata routing for sample_weight parameter in score.
self (MLPRegressor)
- Returns:
self (object) – The updated object.
- class endgame.models.EmbeddingMLPClassifier(categorical_features=None, embedding_dims=None, hidden_dims=None, dropout=0.3, embedding_dropout=0.1, batch_norm=True, activation='relu', learning_rate=0.001, weight_decay=1e-05, n_epochs=100, batch_size=256, early_stopping=10, class_weight=None, scheduler='cosine', device='auto', random_state=None, verbose=False)[source]¶
Bases: ClassifierMixin, _BaseEmbeddingMLP
MLP classifier with entity embeddings for categorical features.
Learns dense representations for categorical variables, enabling effective handling of high-cardinality features.
- Parameters:
categorical_features (List[str] or List[int], optional) – Names or indices of categorical features. If None, auto-detects based on unique values.
embedding_dims (Dict[str, int] or int, optional) – Embedding dimensions per feature or default dimension.
hidden_dims (List[int], default=[256, 128]) – Hidden layer dimensions.
dropout (float, default=0.3) – Dropout rate for hidden layers.
embedding_dropout (float, default=0.1) – Dropout rate for embeddings.
batch_norm (bool, default=True) – Whether to use batch normalization.
activation (str, default='relu') – Activation function.
learning_rate (float, default=1e-3) – Initial learning rate.
weight_decay (float, default=1e-5) – L2 regularization strength.
n_epochs (int, default=100) – Maximum training epochs.
batch_size (int, default=256) – Training batch size.
early_stopping (int, default=10) – Early stopping patience.
class_weight (str or dict, optional) – Class weights: ‘balanced’ or dict.
scheduler (str, default='cosine') – Learning rate scheduler.
device (str, default='auto') – Device: ‘cuda’, ‘cpu’, or ‘auto’.
random_state (int, optional) – Random seed.
verbose (bool, default=False) – Enable verbose output.
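An entity embedding is, at its core, a lookup table that maps each category to a dense vector, which is then concatenated with the numeric features before the hidden layers. The sketch below shows that data flow in plain Python. The class EmbeddingTable, the column names, and the random initialisation are all hypothetical; in a trained model the vectors are learned by backpropagation, and nothing here reflects this library's internals.

```python
import random


class EmbeddingTable:
    """Toy lookup table: one dense vector per distinct category.

    Hypothetical illustration only; vectors are random, not learned.
    """

    def __init__(self, categories, dim, seed=0):
        rng = random.Random(seed)
        # Assign each distinct category a row index.
        self.index = {c: i for i, c in enumerate(sorted(set(categories)))}
        self.vectors = [
            [rng.gauss(0.0, 0.1) for _ in range(dim)] for _ in self.index
        ]

    def lookup(self, value):
        return self.vectors[self.index[value]]


rows = [
    {"brand": "acme", "price": 9.5},
    {"brand": "zeta", "price": 3.0},
    {"brand": "acme", "price": 4.0},
]

brand_table = EmbeddingTable([r["brand"] for r in rows], dim=4)

# Network input = embedding vector followed by the numeric feature(s).
inputs = [brand_table.lookup(r["brand"]) + [r["price"]] for r in rows]
```

Rows that share a category share the same vector, so the table size grows with the number of distinct categories times the embedding dimension, not with a one-hot width.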
- classes_¶
Unique class labels.
- Type:
ndarray
- model_¶
Fitted PyTorch model.
- Type:
_EmbeddingMLPModule
Examples
>>> from endgame.models.neural import EmbeddingMLPClassifier
>>> clf = EmbeddingMLPClassifier(
...     categorical_features=['category', 'brand'],
...     embedding_dims={'category': 10, 'brand': 8},
...     hidden_dims=[128, 64]
... )
>>> clf.fit(X_train, y_train, val_data=(X_val, y_val))
>>> predictions = clf.predict(X_test)
>>> # Get learned embeddings
>>> category_embeddings = clf.get_embeddings('category')
- fit(X, y, val_data=None)[source]¶
Fit the classifier.
- Parameters:
X (array-like of shape (n_samples, n_features)) – Training features.
y (array-like of shape (n_samples,)) – Target labels.
val_data (tuple of (X_val, y_val), optional) – Validation data for early stopping.
- Returns:
self – Fitted classifier.
- set_fit_request(*, val_data='$UNCHANGED$')¶
Configure whether metadata should be requested to be passed to the fit method.
Note that this method is only relevant when this estimator is used as a sub-estimator within a meta-estimator and metadata routing is enabled with enable_metadata_routing=True (see sklearn.set_config()). Please check the User Guide on how the routing mechanism works.
The options for each parameter are:
True: metadata is requested, and passed to fit if provided. The request is ignored if metadata is not provided.
False: metadata is not requested and the meta-estimator will not pass it to fit.
None: metadata is not requested, and the meta-estimator will raise an error if the user provides it.
str: metadata should be passed to the meta-estimator with this given alias instead of the original name.
The default (sklearn.utils.metadata_routing.UNCHANGED) retains the existing request. This allows you to change the request for some parameters and not others.
Added in version 1.3.
- Parameters:
val_data (str, True, False, or None, default=sklearn.utils.metadata_routing.UNCHANGED) – Metadata routing for val_data parameter in fit.
self (EmbeddingMLPClassifier)
- Returns:
self (object) – The updated object.
- set_score_request(*, sample_weight='$UNCHANGED$')¶
Configure whether metadata should be requested to be passed to the score method.
Note that this method is only relevant when this estimator is used as a sub-estimator within a meta-estimator and metadata routing is enabled with enable_metadata_routing=True (see sklearn.set_config()). Please check the User Guide on how the routing mechanism works.
The options for each parameter are:
True: metadata is requested, and passed to score if provided. The request is ignored if metadata is not provided.
False: metadata is not requested and the meta-estimator will not pass it to score.
None: metadata is not requested, and the meta-estimator will raise an error if the user provides it.
str: metadata should be passed to the meta-estimator with this given alias instead of the original name.
The default (sklearn.utils.metadata_routing.UNCHANGED) retains the existing request. This allows you to change the request for some parameters and not others.
Added in version 1.3.
- Parameters:
sample_weight (str, True, False, or None, default=sklearn.utils.metadata_routing.UNCHANGED) – Metadata routing for sample_weight parameter in score.
self (EmbeddingMLPClassifier)
- Returns:
self (object) – The updated object.
- class endgame.models.EmbeddingMLPRegressor(categorical_features=None, embedding_dims=None, hidden_dims=None, dropout=0.3, embedding_dropout=0.1, batch_norm=True, activation='relu', learning_rate=0.001, weight_decay=1e-05, n_epochs=100, batch_size=256, early_stopping=10, loss='mse', scheduler='cosine', device='auto', random_state=None, verbose=False)[source]¶
Bases: _BaseEmbeddingMLP, RegressorMixin
MLP regressor with entity embeddings for categorical features.
Learns dense representations for categorical variables, enabling effective handling of high-cardinality features.
- Parameters:
categorical_features (List[str] or List[int], optional) – Names or indices of categorical features.
embedding_dims (Dict[str, int] or int, optional) – Embedding dimensions per feature or default dimension.
hidden_dims (List[int], default=[256, 128]) – Hidden layer dimensions.
dropout (float, default=0.3) – Dropout rate for hidden layers.
embedding_dropout (float, default=0.1) – Dropout rate for embeddings.
batch_norm (bool, default=True) – Whether to use batch normalization.
activation (str, default='relu') – Activation function.
learning_rate (float, default=1e-3) – Initial learning rate.
weight_decay (float, default=1e-5) – L2 regularization strength.
n_epochs (int, default=100) – Maximum training epochs.
batch_size (int, default=256) – Training batch size.
early_stopping (int, default=10) – Early stopping patience.
loss (str, default='mse') – Loss function: ‘mse’, ‘mae’, ‘huber’.
scheduler (str, default='cosine') – Learning rate scheduler.
device (str, default='auto') – Device: ‘cuda’, ‘cpu’, or ‘auto’.
random_state (int, optional) – Random seed.
verbose (bool, default=False) – Enable verbose output.
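With scheduler='cosine', learning_rate is only the starting value. The sketch below evaluates the standard cosine annealing formula, which is the closed form of PyTorch's CosineAnnealingLR, over a full run. That this estimator anneals over exactly n_epochs and down to a minimum of 0 is an assumption for illustration, as is the helper name cosine_lr.

```python
import math


def cosine_lr(epoch, n_epochs, base_lr=1e-3, min_lr=0.0):
    """Learning rate at `epoch` under cosine annealing from base_lr to min_lr."""
    progress = epoch / n_epochs
    return min_lr + 0.5 * (base_lr - min_lr) * (1 + math.cos(math.pi * progress))


# One value per epoch boundary, from the start (0) to the end (100).
schedule = [cosine_lr(e, n_epochs=100) for e in range(101)]
```

The rate starts at base_lr, passes through half of it at the midpoint, and decays smoothly to min_lr at the final epoch, changing slowly at both ends and fastest in the middle.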
- model_¶
Fitted PyTorch model.
- Type:
_EmbeddingMLPModule
Examples
>>> from endgame.models.neural import EmbeddingMLPRegressor
>>> reg = EmbeddingMLPRegressor(
...     categorical_features=['store_id', 'product_id'],
...     embedding_dims=16
... )
>>> reg.fit(X_train, y_train, val_data=(X_val, y_val))
>>> predictions = reg.predict(X_test)
- set_fit_request(*, val_data='$UNCHANGED$')¶
Configure whether metadata should be requested to be passed to the fit method.
Note that this method is only relevant when this estimator is used as a sub-estimator within a meta-estimator and metadata routing is enabled with enable_metadata_routing=True (see sklearn.set_config()). Please check the User Guide on how the routing mechanism works.
The options for each parameter are:
True: metadata is requested, and passed to fit if provided. The request is ignored if metadata is not provided.
False: metadata is not requested and the meta-estimator will not pass it to fit.
None: metadata is not requested, and the meta-estimator will raise an error if the user provides it.
str: metadata should be passed to the meta-estimator with this given alias instead of the original name.
The default (sklearn.utils.metadata_routing.UNCHANGED) retains the existing request. This allows you to change the request for some parameters and not others.
Added in version 1.3.
- Parameters:
val_data (str, True, False, or None, default=sklearn.utils.metadata_routing.UNCHANGED) – Metadata routing for val_data parameter in fit.
self (EmbeddingMLPRegressor)
- Returns:
self (object) – The updated object.
- set_score_request(*, sample_weight='$UNCHANGED$')¶
Configure whether metadata should be requested to be passed to the score method.
Note that this method is only relevant when this estimator is used as a sub-estimator within a meta-estimator and metadata routing is enabled with enable_metadata_routing=True (see sklearn.set_config()). Please check the User Guide on how the routing mechanism works.
The options for each parameter are:
True: metadata is requested, and passed to score if provided. The request is ignored if metadata is not provided.
False: metadata is not requested and the meta-estimator will not pass it to score.
None: metadata is not requested, and the meta-estimator will raise an error if the user provides it.
str: metadata should be passed to the meta-estimator with this given alias instead of the original name.
The default (sklearn.utils.metadata_routing.UNCHANGED) retains the existing request. This allows you to change the request for some parameters and not others.
Added in version 1.3.
- Parameters:
sample_weight (str, True, False, or None, default=sklearn.utils.metadata_routing.UNCHANGED) – Metadata routing for sample_weight parameter in score.
self (EmbeddingMLPRegressor)
- Returns:
self (object) – The updated object.
- class endgame.models.TabNetClassifier(n_d=64, n_a=64, n_steps=5, gamma=1.5, n_independent=2, n_shared=2, momentum=0.3, clip_value=None, lambda_sparse=0.0001, optimizer_fn=None, optimizer_params=None, scheduler_fn=None, scheduler_params=None, mask_type='sparsemax', n_epochs=100, patience=15, batch_size=1024, virtual_batch_size=256, device_name='auto', random_state=None, verbose=0)[source]¶
Bases: _BaseTabNetWrapper, ClassifierMixin
TabNet classifier wrapper.
Attention-based deep learning architecture for tabular classification with built-in feature selection and interpretability.
- Parameters:
n_d (int, default=64) – Width of the decision prediction layer.
n_a (int, default=64) – Width of the attention embedding.
n_steps (int, default=5) – Number of decision steps.
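gamma (float, default=1.5) – Relaxation coefficient for feature reuse in the attention masks. A value close to 1 makes mask selection least correlated between decision steps.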
n_independent (int, default=2) – Number of independent GLU layers.
n_shared (int, default=2) – Number of shared GLU layers.
momentum (float, default=0.3) – Batch normalization momentum.
clip_value (float, optional) – Gradient clipping value.
lambda_sparse (float, default=1e-4) – Sparsity regularization coefficient.
optimizer_params (dict, optional) – Optimizer parameters.
mask_type (str, default='sparsemax') – Attention type: ‘sparsemax’ or ‘entmax’.
n_epochs (int, default=100) – Maximum training epochs.
patience (int, default=15) – Early stopping patience.
batch_size (int, default=1024) – Training batch size.
virtual_batch_size (int, default=256) – Ghost Batch Normalization batch size.
device_name (str, default='auto') – Device: ‘cuda’, ‘cpu’, or ‘auto’.
random_state (int, optional) – Random seed.
verbose (int, default=0) – Verbosity level.
optimizer_fn (Any | None)
scheduler_fn (Any | None)
scheduler_params (dict | None)
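The default mask_type='sparsemax' matters for interpretability: unlike softmax, sparsemax can assign exactly zero weight to some features. The sketch below is a plain-Python implementation of the sparsemax projection (Martins and Astudillo, 2016) next to softmax for comparison. It operates on a single list of scores and is only an illustration; it is not the batched tensor implementation used inside TabNet.

```python
import math


def softmax(scores):
    """Dense normalisation: every entry gets a strictly positive weight."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def sparsemax(scores):
    """Euclidean projection of `scores` onto the probability simplex."""
    z_sorted = sorted(scores, reverse=True)
    cumulative = 0.0
    k, support_sum = 0, 0.0
    for j, z in enumerate(z_sorted, start=1):
        cumulative += z
        # Entry j stays in the support while 1 + j * z_(j) > sum of top-j scores.
        if 1 + j * z > cumulative:
            k, support_sum = j, cumulative
    # Threshold that makes the surviving entries sum to one.
    tau = (support_sum - 1) / k
    return [max(z - tau, 0.0) for z in scores]


scores = [2.0, 0.0]
dense = softmax(scores)     # both entries non-zero
sparse = sparsemax(scores)  # second entry is exactly zero
```

For these scores sparsemax puts all weight on the first feature, whereas softmax still leaves a non-zero weight on the second.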
- classes_¶
Unique class labels.
- Type:
ndarray
- model_¶
Fitted TabNet model.
Examples
>>> from endgame.models.neural import TabNetClassifier
>>> clf = TabNetClassifier(n_steps=3, n_epochs=50)
>>> clf.fit(X_train, y_train, eval_set=[(X_val, y_val)])
>>> predictions = clf.predict(X_test)
>>> proba = clf.predict_proba(X_test)
>>> # Get feature importance masks
>>> explain_matrix, masks = clf.explain(X_test)
- fit(X, y, eval_set=None, eval_name=None, eval_metric=None, weights=None, **fit_params)[source]¶
Fit the classifier.
- Parameters:
X (array-like of shape (n_samples, n_features)) – Training features.
y (array-like of shape (n_samples,)) – Target labels.
eval_set (list of (X, y) tuples, optional) – Validation sets for early stopping.
eval_name (list of str, optional) – Names for evaluation sets.
weights (int or ndarray, optional) – Sample weights (0 for unweighted, 1 for balanced, or array).
**fit_params – Additional fit parameters.
- Returns:
self – Fitted classifier.
- predict(X)[source]¶
Predict class labels.
- Parameters:
X (array-like) – Input samples.
- Returns:
ndarray – Predicted class labels.
- predict_proba(X)[source]¶
Predict class probabilities.
- Parameters:
X (array-like) – Input samples.
- Returns:
ndarray – Class probabilities.
- set_fit_request(*, eval_metric='$UNCHANGED$', eval_name='$UNCHANGED$', eval_set='$UNCHANGED$', weights='$UNCHANGED$')¶
Configure whether metadata should be requested to be passed to the fit method.
Note that this method is only relevant when this estimator is used as a sub-estimator within a meta-estimator and metadata routing is enabled with enable_metadata_routing=True (see sklearn.set_config()). Please check the User Guide on how the routing mechanism works.
The options for each parameter are:
True: metadata is requested, and passed to fit if provided. The request is ignored if metadata is not provided.
False: metadata is not requested and the meta-estimator will not pass it to fit.
None: metadata is not requested, and the meta-estimator will raise an error if the user provides it.
str: metadata should be passed to the meta-estimator with this given alias instead of the original name.
The default (sklearn.utils.metadata_routing.UNCHANGED) retains the existing request. This allows you to change the request for some parameters and not others.
Added in version 1.3.
- Parameters:
eval_metric (str, True, False, or None, default=sklearn.utils.metadata_routing.UNCHANGED) – Metadata routing for eval_metric parameter in fit.
eval_name (str, True, False, or None, default=sklearn.utils.metadata_routing.UNCHANGED) – Metadata routing for eval_name parameter in fit.
eval_set (str, True, False, or None, default=sklearn.utils.metadata_routing.UNCHANGED) – Metadata routing for eval_set parameter in fit.
weights (str, True, False, or None, default=sklearn.utils.metadata_routing.UNCHANGED) – Metadata routing for weights parameter in fit.
self (TabNetClassifier)
- Returns:
self (object) – The updated object.
- set_score_request(*, sample_weight='$UNCHANGED$')¶
Configure whether metadata should be requested to be passed to the score method.
Note that this method is only relevant when this estimator is used as a sub-estimator within a meta-estimator and metadata routing is enabled with enable_metadata_routing=True (see sklearn.set_config()). Please check the User Guide on how the routing mechanism works.
The options for each parameter are:
True: metadata is requested, and passed to score if provided. The request is ignored if metadata is not provided.
False: metadata is not requested and the meta-estimator will not pass it to score.
None: metadata is not requested, and the meta-estimator will raise an error if the user provides it.
str: metadata should be passed to the meta-estimator with this given alias instead of the original name.
The default (sklearn.utils.metadata_routing.UNCHANGED) retains the existing request. This allows you to change the request for some parameters and not others.
Added in version 1.3.
- Parameters:
sample_weight (str, True, False, or None, default=sklearn.utils.metadata_routing.UNCHANGED) – Metadata routing for sample_weight parameter in score.
self (TabNetClassifier)
- Returns:
self (object) – The updated object.
- class endgame.models.TabNetRegressor(n_d=64, n_a=64, n_steps=5, gamma=1.5, n_independent=2, n_shared=2, momentum=0.3, clip_value=None, lambda_sparse=0.0001, optimizer_fn=None, optimizer_params=None, scheduler_fn=None, scheduler_params=None, mask_type='sparsemax', n_epochs=100, patience=15, batch_size=1024, virtual_batch_size=256, device_name='auto', random_state=None, verbose=0)[source]¶
Bases: _BaseTabNetWrapper, RegressorMixin
TabNet regressor wrapper.
Attention-based deep learning architecture for tabular regression with built-in feature selection and interpretability.
- Parameters:
n_d (int, default=64) – Width of the decision prediction layer.
n_a (int, default=64) – Width of the attention embedding.
n_steps (int, default=5) – Number of decision steps.
gamma (float, default=1.5) – Coefficient for feature reuse.
n_independent (int, default=2) – Number of independent GLU layers.
n_shared (int, default=2) – Number of shared GLU layers.
momentum (float, default=0.3) – Batch normalization momentum.
clip_value (float, optional) – Gradient clipping value.
lambda_sparse (float, default=1e-4) – Sparsity regularization coefficient.
optimizer_params (dict, optional) – Optimizer parameters.
mask_type (str, default='sparsemax') – Attention type: ‘sparsemax’ or ‘entmax’.
n_epochs (int, default=100) – Maximum training epochs.
patience (int, default=15) – Early stopping patience.
batch_size (int, default=1024) – Training batch size.
virtual_batch_size (int, default=256) – Ghost Batch Normalization batch size.
device_name (str, default='auto') – Device: ‘cuda’, ‘cpu’, or ‘auto’.
random_state (int, optional) – Random seed.
verbose (int, default=0) – Verbosity level.
optimizer_fn (Any | None)
scheduler_fn (Any | None)
scheduler_params (dict | None)
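The mask_type parameter selects between sparsemax and entmax attention. For orientation, the following is a minimal sketch of the standard sparsemax transform, which projects a score vector onto the probability simplex and can assign exactly zero weight to some features. It is a plain-Python illustration of the textbook definition, not the code used by the wrapped TabNet model; the function name and the toy scores are chosen for this example.

```python
def sparsemax(z):
    """Euclidean projection of the score vector z onto the probability simplex."""
    z_sorted = sorted(z, reverse=True)
    cumulative = 0.0
    support_size = 0
    support_sum = 0.0
    for k, value in enumerate(z_sorted, start=1):
        cumulative += value
        # Entries stay in the support while 1 + k * z_(k) > sum of the top k.
        if 1.0 + k * value > cumulative:
            support_size = k
            support_sum = cumulative
    # Threshold that makes the surviving entries sum to 1.
    tau = (support_sum - 1.0) / support_size
    return [max(v - tau, 0.0) for v in z]


# The lowest score is pushed to exactly zero, unlike with softmax.
mask = sparsemax([1.0, 0.5, -1.0])
```

Because low-scoring features receive a weight of exactly zero, each decision step attends to a small subset of features, which is what makes the resulting masks usable for interpretation.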
- model_¶
Fitted TabNet model.
Examples
>>> from endgame.models.neural import TabNetRegressor
>>> reg = TabNetRegressor(n_steps=3, n_epochs=50)
>>> reg.fit(X_train, y_train, eval_set=[(X_val, y_val)])
>>> predictions = reg.predict(X_test)
>>> # Get feature importance masks
>>> explain_matrix, masks = reg.explain(X_test)
- fit(X, y, eval_set=None, eval_name=None, eval_metric=None, weights=None, **fit_params)[source]¶
Fit the regressor.
- Parameters:
X (array-like of shape (n_samples, n_features)) – Training features.
y (array-like of shape (n_samples,) or (n_samples, n_targets)) – Target values.
eval_set (list of (X, y) tuples, optional) – Validation sets for early stopping.
eval_name (list of str, optional) – Names for evaluation sets.
weights (int or ndarray, optional) – Sample weights.
**fit_params – Additional fit parameters.
- Returns:
self – Fitted regressor.
- predict(X)[source]¶
Predict target values.
- Parameters:
X (array-like) – Input samples.
- Returns:
ndarray – Predicted values.
- set_fit_request(*, eval_metric='$UNCHANGED$', eval_name='$UNCHANGED$', eval_set='$UNCHANGED$', weights='$UNCHANGED$')¶
Configure whether metadata should be requested to be passed to the fit method.
Note that this method is only relevant when this estimator is used as a sub-estimator within a meta-estimator and metadata routing is enabled with enable_metadata_routing=True (see sklearn.set_config()). Please check the User Guide on how the routing mechanism works.
The options for each parameter are:
True: metadata is requested, and passed to fit if provided. The request is ignored if metadata is not provided.
False: metadata is not requested and the meta-estimator will not pass it to fit.
None: metadata is not requested, and the meta-estimator will raise an error if the user provides it.
str: metadata should be passed to the meta-estimator with this given alias instead of the original name.
The default (sklearn.utils.metadata_routing.UNCHANGED) retains the existing request. This allows you to change the request for some parameters and not others.
Added in version 1.3.
- Parameters:
eval_metric (str, True, False, or None, default=sklearn.utils.metadata_routing.UNCHANGED) – Metadata routing for eval_metric parameter in fit.
eval_name (str, True, False, or None, default=sklearn.utils.metadata_routing.UNCHANGED) – Metadata routing for eval_name parameter in fit.
eval_set (str, True, False, or None, default=sklearn.utils.metadata_routing.UNCHANGED) – Metadata routing for eval_set parameter in fit.
weights (str, True, False, or None, default=sklearn.utils.metadata_routing.UNCHANGED) – Metadata routing for weights parameter in fit.
self (TabNetRegressor)
- Returns:
self (object) – The updated object.
- set_score_request(*, sample_weight='$UNCHANGED$')¶
Configure whether metadata should be requested to be passed to the score method.
Note that this method is only relevant when this estimator is used as a sub-estimator within a meta-estimator and metadata routing is enabled with enable_metadata_routing=True (see sklearn.set_config()). Please check the User Guide on how the routing mechanism works.
The options for each parameter are:
True: metadata is requested, and passed to score if provided. The request is ignored if metadata is not provided.
False: metadata is not requested and the meta-estimator will not pass it to score.
None: metadata is not requested, and the meta-estimator will raise an error if the user provides it.
str: metadata should be passed to the meta-estimator with this given alias instead of the original name.
The default (sklearn.utils.metadata_routing.UNCHANGED) retains the existing request. This allows you to change the request for some parameters and not others.
Added in version 1.3.
- Parameters:
sample_weight (str, True, False, or None, default=sklearn.utils.metadata_routing.UNCHANGED) – Metadata routing for sample_weight parameter in score.
self (TabNetRegressor)
- Returns:
self (object) – The updated object.
- class endgame.models.NeuralKDBClassifier(k=2, embedding_dim=16, hidden_dim=64, n_hidden_layers=2, epochs=20, batch_size=256, learning_rate=0.001, weight_decay=1e-05, dropout=0.1, device='auto', early_stopping=5, validation_fraction=0.1, max_cardinality=100, auto_discretize=True, discretizer_strategy='mdlp', discretizer_max_bins=10, random_state=None, verbose=False)[source]¶
Bases: BaseBayesianClassifier
K-Dependence Bayes with neural conditional probability estimators.
NeuralKDB maintains the interpretable DAG structure of classical KDB but uses neural networks to estimate conditional probabilities. This enables handling of high-cardinality features and better generalization.
- Parameters:
k (int, default=2) – Maximum parents per feature (excluding class).
embedding_dim (int, default=16) – Dimensionality of value embeddings.
hidden_dim (int, default=64) – Hidden layer size in conditional networks.
n_hidden_layers (int, default=2) – Number of hidden layers per conditional network.
epochs (int, default=20) – Training epochs.
batch_size (int, default=256) – Mini-batch size for training.
learning_rate (float, default=1e-3) – Adam learning rate.
weight_decay (float, default=1e-5) – L2 regularization.
dropout (float, default=0.1) – Dropout rate in networks.
device (str, default='auto') – ‘cuda’, ‘cpu’, or ‘auto’ (detect GPU).
early_stopping (int | None, default=5) – Stop if validation loss doesn’t improve for this many epochs. None disables early stopping.
validation_fraction (float, default=0.1) – Fraction of training data held out for validation (used only if X_val is not provided).
random_state (int, optional) – Random seed.
verbose (bool, default=False) – Enable verbose output.
max_cardinality (int)
auto_discretize (bool)
discretizer_strategy (str)
discretizer_max_bins (int)
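The early_stopping parameter is a patience counter: training stops once the validation loss has failed to improve for that many consecutive epochs, and None disables the check. The sketch below illustrates that rule on a precomputed list of validation losses. It is a simplified illustration rather than the library's training loop; the function name and the loss values are made up for the example.

```python
def train_with_patience(val_losses, patience):
    """Return (best_epoch, stopped_epoch) for a sequence of validation losses.

    val_losses: per-epoch validation losses (stand-in for a real training loop).
    patience: epochs without improvement before stopping; None disables it.
    """
    best_loss = float("inf")
    best_epoch = -1
    epochs_without_improvement = 0
    for epoch, loss in enumerate(val_losses):
        if loss < best_loss:
            best_loss, best_epoch = loss, epoch
            epochs_without_improvement = 0
        else:
            epochs_without_improvement += 1
        if patience is not None and epochs_without_improvement >= patience:
            return best_epoch, epoch
    return best_epoch, len(val_losses) - 1


losses = [0.9, 0.7, 0.65, 0.66, 0.67, 0.70, 0.64, 0.63]
stopped = train_with_patience(losses, patience=3)      # stops before the late improvement
full_run = train_with_patience(losses, patience=None)  # runs every epoch
```

The toy losses show the trade-off: a short patience saves epochs but can stop before a later improvement, so the value is worth tuning together with the number of epochs.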
- structure_¶
Learned KDB structure.
- Type:
nx.DiGraph
- conditionals_¶
Neural conditional estimators for each feature.
- Type:
nn.ModuleDict
- class_prior_¶
Prior class probabilities.
- Type:
np.ndarray
Examples
>>> from endgame.models.bayesian import NeuralKDBClassifier
>>> clf = NeuralKDBClassifier(k=2, epochs=10)
>>> clf.fit(X_train, y_train)
>>> clf.predict_proba(X_test)
- fit(X, y, X_val=None, y_val=None, **fit_params)[source]¶
Fit the Neural KDB classifier.
- Parameters:
X (array-like of shape (n_samples, n_features)) – Training data. Can be continuous (will be auto-discretized if auto_discretize=True) or discrete/integer-valued.
y (array-like of shape (n_samples,)) – Target values.
X_val (np.ndarray, optional) – Validation features for early stopping.
y_val (np.ndarray, optional) – Validation targets.
- Returns:
self
- predict_proba(X)[source]¶
Compute P(Y|X) using neural conditionals.
For each class c: P(Y=c|X) ∝ P(Y=c) * ∏_i P(x_i | parents(x_i), Y=c)
- Parameters:
X (array-like of shape (n_samples, n_features)) – Samples to predict.
- Returns:
ndarray of shape (n_samples, n_classes) – Class probabilities.
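A product of many probabilities such as the one in the formula for P(Y=c|X) is usually evaluated in log space to avoid underflow. The following is a minimal sketch of that computation for a single sample, not the library's implementation: posterior_from_log_terms, its arguments, and the toy numbers are hypothetical, and the per-feature log conditionals stand in for the outputs of the neural estimators.

```python
import math


def posterior_from_log_terms(log_prior, log_conditionals):
    """Combine log P(Y=c) with per-feature log P(x_i | parents(x_i), Y=c).

    log_prior: list of length n_classes.
    log_conditionals: list of length n_classes; entry c holds one
        log-probability per feature (hypothetical stand-in for network outputs).
    Returns normalized class probabilities for a single sample.
    """
    # Unnormalized log posterior: log P(Y=c) + sum_i log P(x_i | ...).
    scores = [lp + sum(terms) for lp, terms in zip(log_prior, log_conditionals)]
    # Log-sum-exp normalization for numerical stability.
    m = max(scores)
    log_z = m + math.log(sum(math.exp(s - m) for s in scores))
    return [math.exp(s - log_z) for s in scores]


# Toy example: two classes, three features.
log_prior = [math.log(0.6), math.log(0.4)]
log_conditionals = [
    [math.log(0.5), math.log(0.2), math.log(0.9)],  # class 0
    [math.log(0.1), math.log(0.7), math.log(0.8)],  # class 1
]
proba = posterior_from_log_terms(log_prior, log_conditionals)
```

Normalizing over the classes turns the proportionality into an equality, so the returned values sum to 1 for each sample.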
- set_fit_request(*, X_val='$UNCHANGED$', y_val='$UNCHANGED$')¶
Configure whether metadata should be requested to be passed to the fit method.
Note that this method is only relevant when this estimator is used as a sub-estimator within a meta-estimator and metadata routing is enabled with enable_metadata_routing=True (see sklearn.set_config()). Please check the User Guide on how the routing mechanism works.
The options for each parameter are:
True: metadata is requested, and passed to fit if provided. The request is ignored if metadata is not provided.
False: metadata is not requested and the meta-estimator will not pass it to fit.
None: metadata is not requested, and the meta-estimator will raise an error if the user provides it.
str: metadata should be passed to the meta-estimator with this given alias instead of the original name.
The default (sklearn.utils.metadata_routing.UNCHANGED) retains the existing request. This allows you to change the request for some parameters and not others.
Added in version 1.3.
- Parameters:
X_val (str, True, False, or None, default=sklearn.utils.metadata_routing.UNCHANGED) – Metadata routing for X_val parameter in fit.
y_val (str, True, False, or None, default=sklearn.utils.metadata_routing.UNCHANGED) – Metadata routing for y_val parameter in fit.
self (NeuralKDBClassifier)
- Returns:
self (object) – The updated object.
- class endgame.models.FTTransformerClassifier(n_blocks=3, d_token=192, n_heads=8, attention_dropout=0.2, ffn_dropout=0.1, residual_dropout=0.0, d_ffn_factor=1.3333333333333333, learning_rate=0.0001, weight_decay=1e-05, n_epochs=100, batch_size=256, early_stopping=15, cat_cardinality_threshold=20, device='auto', random_state=None, verbose=False)[source]¶
Bases: ClassifierMixin, BaseEstimator
Feature Tokenizer Transformer for tabular classification.
Transforms each feature into an embedding and applies transformer layers. Often a strong deep learning baseline for tabular data.
- Parameters:
n_blocks (int, default=3) – Number of transformer blocks.
d_token (int, default=192) – Embedding dimension for each feature token.
n_heads (int, default=8) – Number of attention heads.
attention_dropout (float, default=0.2) – Attention dropout rate.
ffn_dropout (float, default=0.1) – Feed-forward dropout rate.
residual_dropout (float, default=0.0) – Residual connection dropout.
d_ffn_factor (float, default=4/3) – FFN hidden dimension factor (d_ffn = d_token * d_ffn_factor).
learning_rate (float, default=1e-4) – Learning rate.
weight_decay (float, default=1e-5) – L2 regularization.
n_epochs (int, default=100) – Maximum training epochs.
batch_size (int, default=256) – Training batch size.
early_stopping (int, default=15) – Early stopping patience.
cat_cardinality_threshold (int, default=20) – Treat features with <= this many unique values as categorical.
device (str, default='auto') – Device: ‘cuda’, ‘cpu’, or ‘auto’.
random_state (int, optional) – Random seed.
verbose (bool, default=False) – Enable verbose output.
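As a quick check of the relation d_ffn = d_token * d_ffn_factor, the sketch below evaluates it for the documented defaults. Rounding to the nearest integer and the even split of d_token across attention heads are assumptions of this example (the latter follows the usual multi-head attention convention); the helper name is hypothetical.

```python
def ffn_hidden_dim(d_token: int, d_ffn_factor: float) -> int:
    """Return the feed-forward hidden width d_ffn = d_token * d_ffn_factor.

    Rounding to the nearest integer is an assumption of this sketch.
    """
    return round(d_token * d_ffn_factor)


d_ffn = ffn_hidden_dim(192, 4 / 3)  # documented defaults

# Assumption: as in standard multi-head attention, d_token is split evenly
# across heads, so it should be divisible by n_heads.
d_token, n_heads = 192, 8
assert d_token % n_heads == 0
head_dim = d_token // n_heads
```

With the defaults this gives a feed-forward width of 256 and 24 dimensions per head, which is a useful sanity check when changing d_token or n_heads.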
- classes_¶
Unique class labels.
- Type:
ndarray
- model_¶
Fitted PyTorch model.
- Type:
_FTTransformerModule
Examples
>>> from endgame.models.tabular import FTTransformerClassifier
>>> clf = FTTransformerClassifier(n_blocks=3, d_token=192)
>>> clf.fit(X_train, y_train, eval_set=(X_val, y_val))
>>> proba = clf.predict_proba(X_test)
- fit(X, y, eval_set=None, **fit_params)[source]¶
Fit the FT-Transformer classifier.
- Parameters:
X (array-like of shape (n_samples, n_features)) – Training features.
y (array-like of shape (n_samples,)) – Training labels.
eval_set (tuple of (X_val, y_val), optional) – Validation set for early stopping.
- Returns:
self – Fitted classifier.
- predict_proba(X)[source]¶
Predict class probabilities.
- Parameters:
X (array-like) – Test samples.
- Returns:
ndarray of shape (n_samples, n_classes) – Class probabilities.
- predict(X)[source]¶
Predict class labels.
- Parameters:
X (array-like) – Test samples.
- Returns:
ndarray – Predicted class labels.
- set_fit_request(*, eval_set='$UNCHANGED$')¶
Configure whether metadata should be requested to be passed to the fit method.
Note that this method is only relevant when this estimator is used as a sub-estimator within a meta-estimator and metadata routing is enabled with enable_metadata_routing=True (see sklearn.set_config()). Please check the User Guide on how the routing mechanism works.
The options for each parameter are:
True: metadata is requested, and passed to fit if provided. The request is ignored if metadata is not provided.
False: metadata is not requested and the meta-estimator will not pass it to fit.
None: metadata is not requested, and the meta-estimator will raise an error if the user provides it.
str: metadata should be passed to the meta-estimator with this given alias instead of the original name.
The default (sklearn.utils.metadata_routing.UNCHANGED) retains the existing request. This allows you to change the request for some parameters and not others.
Added in version 1.3.
- Parameters:
eval_set (str, True, False, or None, default=sklearn.utils.metadata_routing.UNCHANGED) – Metadata routing for eval_set parameter in fit.
self (FTTransformerClassifier)
- Returns:
self (object) – The updated object.
- set_score_request(*, sample_weight='$UNCHANGED$')¶
Configure whether metadata should be requested to be passed to the score method.
Note that this method is only relevant when this estimator is used as a sub-estimator within a meta-estimator and metadata routing is enabled with enable_metadata_routing=True (see sklearn.set_config()). Please check the User Guide on how the routing mechanism works.
The options for each parameter are:
True: metadata is requested, and passed to score if provided. The request is ignored if metadata is not provided.
False: metadata is not requested and the meta-estimator will not pass it to score.
None: metadata is not requested, and the meta-estimator will raise an error if the user provides it.
str: metadata should be passed to the meta-estimator with this given alias instead of the original name.
The default (sklearn.utils.metadata_routing.UNCHANGED) retains the existing request. This allows you to change the request for some parameters and not others.
Added in version 1.3.
- Parameters:
sample_weight (str, True, False, or None, default=sklearn.utils.metadata_routing.UNCHANGED) – Metadata routing for sample_weight parameter in score.
self (FTTransformerClassifier)
- Returns:
self (object) – The updated object.
- class endgame.models.FTTransformerRegressor(n_blocks=3, d_token=192, n_heads=8, attention_dropout=0.2, ffn_dropout=0.1, residual_dropout=0.0, d_ffn_factor=1.3333333333333333, learning_rate=0.0001, weight_decay=1e-05, n_epochs=100, batch_size=256, early_stopping=15, cat_cardinality_threshold=20, device='auto', random_state=None, verbose=False)[source]¶
Bases: BaseEstimator, RegressorMixin
Feature Tokenizer Transformer for regression.
Same architecture as FTTransformerClassifier but with a regression head.
- Parameters:
n_blocks (int, default=3) – Number of transformer blocks.
d_token (int, default=192) – Embedding dimension.
n_heads (int, default=8) – Number of attention heads.
attention_dropout (float, default=0.2) – Attention dropout.
ffn_dropout (float, default=0.1) – Feed-forward dropout.
learning_rate (float, default=1e-4) – Learning rate.
weight_decay (float, default=1e-5) – L2 regularization.
n_epochs (int, default=100) – Maximum epochs.
batch_size (int, default=256) – Batch size.
early_stopping (int, default=15) – Early stopping patience.
device (str, default='auto') – Device.
random_state (int, optional) – Random seed.
verbose (bool, default=False) – Verbose output.
residual_dropout (float)
d_ffn_factor (float)
cat_cardinality_threshold (int)
Examples
>>> from endgame.models import FTTransformerRegressor
>>> reg = FTTransformerRegressor()
>>> reg.fit(X_train, y_train, eval_set=(X_val, y_val))
>>> predictions = reg.predict(X_test)
- set_fit_request(*, eval_set='$UNCHANGED$')¶
Configure whether metadata should be requested to be passed to the fit method.
Note that this method is only relevant when this estimator is used as a sub-estimator within a meta-estimator and metadata routing is enabled with enable_metadata_routing=True (see sklearn.set_config()). Please check the User Guide on how the routing mechanism works.
The options for each parameter are:
True: metadata is requested, and passed to fit if provided. The request is ignored if metadata is not provided.
False: metadata is not requested and the meta-estimator will not pass it to fit.
None: metadata is not requested, and the meta-estimator will raise an error if the user provides it.
str: metadata should be passed to the meta-estimator with this given alias instead of the original name.
The default (sklearn.utils.metadata_routing.UNCHANGED) retains the existing request. This allows you to change the request for some parameters and not others.
Added in version 1.3.
- Parameters:
eval_set (str, True, False, or None, default=sklearn.utils.metadata_routing.UNCHANGED) – Metadata routing for eval_set parameter in fit.
self (FTTransformerRegressor)
- Returns:
self (object) – The updated object.
- set_score_request(*, sample_weight='$UNCHANGED$')¶
Configure whether metadata should be requested to be passed to the score method.
Note that this method is only relevant when this estimator is used as a sub-estimator within a meta-estimator and metadata routing is enabled with enable_metadata_routing=True (see sklearn.set_config()). Please check the User Guide on how the routing mechanism works.
The options for each parameter are:
True: metadata is requested, and passed to score if provided. The request is ignored if metadata is not provided.
False: metadata is not requested and the meta-estimator will not pass it to score.
None: metadata is not requested, and the meta-estimator will raise an error if the user provides it.
str: metadata should be passed to the meta-estimator with this given alias instead of the original name.
The default (sklearn.utils.metadata_routing.UNCHANGED) retains the existing request. This allows you to change the request for some parameters and not others.
Added in version 1.3.
- Parameters:
sample_weight (str, True, False, or None, default=sklearn.utils.metadata_routing.UNCHANGED) – Metadata routing for sample_weight parameter in score.
self (FTTransformerRegressor)
- Returns:
self (object) – The updated object.
- class endgame.models.SAINTClassifier(n_layers=3, d_model=32, n_heads=4, attention_dropout=0.1, ffn_dropout=0.1, d_ffn_factor=4.0, use_intersample=True, learning_rate=0.001, weight_decay=1e-05, n_epochs=100, batch_size=256, early_stopping=15, validation_fraction=0.1, cat_cardinality_threshold=20, device='auto', random_state=None, verbose=False)[source]¶
Bases: ClassifierMixin, BaseEstimator
SAINT: Self-Attention and Intersample Attention Transformer.
Combines column-wise self-attention with row-wise (intersample) attention to capture both feature interactions and sample similarities.
- Parameters:
n_layers (int, default=3) – Number of SAINT layers. 2-4 works well for most datasets.
d_model (int, default=32) – Model dimension.
n_heads (int, default=4) – Number of attention heads.
attention_dropout (float, default=0.1) – Attention dropout.
ffn_dropout (float, default=0.1) – Feed-forward dropout.
d_ffn_factor (float, default=4.0) – FFN hidden dimension factor.
use_intersample (bool, default=True) – Whether to use intersample attention (unique to SAINT).
learning_rate (float, default=1e-3) – Learning rate. Higher rates (1e-3) often work better than 1e-4.
weight_decay (float, default=1e-5) – L2 regularization.
n_epochs (int, default=100) – Maximum epochs.
batch_size (int, default=256) – Batch size.
early_stopping (int, default=15) – Early stopping patience.
validation_fraction (float, default=0.1) – Fraction of training data to use for validation when eval_set is not provided.
cat_cardinality_threshold (int, default=20) – Threshold for categorical detection.
device (str, default='auto') – Device.
random_state (int, optional) – Random seed.
verbose (bool, default=False) – Verbose output.
- classes_¶
Class labels.
- Type:
ndarray
- model_¶
Fitted model.
- Type:
_SAINTModule
Examples
>>> from endgame.models import SAINTClassifier
>>> clf = SAINTClassifier(n_layers=3, d_model=32)
>>> clf.fit(X_train, y_train, eval_set=(X_val, y_val))
>>> proba = clf.predict_proba(X_test)
Notes
SAINT’s intersample attention allows it to consider relationships between different samples, which can be powerful for learning patterns that span across the dataset.
For best performance:
- Use an eval_set for early stopping (or validation_fraction > 0)
- Start with n_layers=3 and increase if underfitting
- Higher learning rates (1e-3) often work better than typical transformer LR
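To make the idea of intersample attention concrete, the sketch below applies scaled dot-product attention across the rows of a small batch, so that each sample's output is a weighted mix of all samples in the batch. This is a simplified illustration under stated assumptions, not the SAINT implementation: queries, keys, and values are the raw vectors (a real layer would use learned projections and multiple heads), and the function names and toy batch are made up for the example.

```python
import math


def softmax(scores):
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def intersample_attention(batch):
    """Row-wise attention: every sample attends to all samples in the batch.

    batch: list of per-sample vectors (each sample's feature embeddings
        flattened into one vector). Queries, keys, and values are the raw
        vectors here; a real layer would apply learned projections.
    Returns (outputs, attention), where attention[i][j] is the weight that
    sample i places on sample j.
    """
    dim = len(batch[0])
    scale = 1.0 / math.sqrt(dim)
    attention = []
    for query in batch:
        scores = [scale * sum(q * k for q, k in zip(query, key)) for key in batch]
        attention.append(softmax(scores))
    outputs = [
        [sum(a * value[d] for a, value in zip(row, batch)) for d in range(dim)]
        for row in attention
    ]
    return outputs, attention


# Samples 0 and 1 are similar, so sample 0 attends more to 1 than to 2.
batch = [[1.0, 0.0], [0.9, 0.1], [-1.0, 0.5]]
outputs, attention = intersample_attention(batch)
```

Since the attention runs over the rows of a batch, the output for a sample depends on which other samples share its batch, which is one reason batch_size can matter more for SAINT than for purely column-wise models.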
- set_fit_request(*, eval_set='$UNCHANGED$')¶
Configure whether metadata should be requested to be passed to the fit method.
Note that this method is only relevant when this estimator is used as a sub-estimator within a meta-estimator and metadata routing is enabled with enable_metadata_routing=True (see sklearn.set_config()). Please check the User Guide on how the routing mechanism works.
The options for each parameter are:
True: metadata is requested, and passed to fit if provided. The request is ignored if metadata is not provided.
False: metadata is not requested and the meta-estimator will not pass it to fit.
None: metadata is not requested, and the meta-estimator will raise an error if the user provides it.
str: metadata should be passed to the meta-estimator with this given alias instead of the original name.
The default (sklearn.utils.metadata_routing.UNCHANGED) retains the existing request. This allows you to change the request for some parameters and not others.
Added in version 1.3.
- Parameters:
eval_set (str, True, False, or None, default=sklearn.utils.metadata_routing.UNCHANGED) – Metadata routing for eval_set parameter in fit.
self (SAINTClassifier)
- Returns:
self (object) – The updated object.
- set_score_request(*, sample_weight='$UNCHANGED$')¶
Configure whether metadata should be requested to be passed to the score method.
Note that this method is only relevant when this estimator is used as a sub-estimator within a meta-estimator and metadata routing is enabled with enable_metadata_routing=True (see sklearn.set_config()). Please check the User Guide on how the routing mechanism works.
The options for each parameter are:
True: metadata is requested, and passed to score if provided. The request is ignored if metadata is not provided.
False: metadata is not requested and the meta-estimator will not pass it to score.
None: metadata is not requested, and the meta-estimator will raise an error if the user provides it.
str: metadata should be passed to the meta-estimator with this given alias instead of the original name.
The default (sklearn.utils.metadata_routing.UNCHANGED) retains the existing request. This allows you to change the request for some parameters and not others.
Added in version 1.3.
- Parameters:
sample_weight (str, True, False, or None, default=sklearn.utils.metadata_routing.UNCHANGED) – Metadata routing for sample_weight parameter in score.
self (SAINTClassifier)
- Returns:
self (object) – The updated object.
- class endgame.models.SAINTRegressor(n_layers=6, d_model=32, n_heads=8, attention_dropout=0.1, ffn_dropout=0.1, d_ffn_factor=4.0, use_intersample=True, learning_rate=0.0001, weight_decay=1e-05, n_epochs=100, batch_size=256, early_stopping=15, cat_cardinality_threshold=20, device='auto', random_state=None, verbose=False)[source]¶
Bases: BaseEstimator, RegressorMixin
SAINT for regression.
Same architecture as SAINTClassifier but with MSE loss.
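Parameters have the same meaning as in SAINTClassifier. Per the signatures, some defaults differ (n_layers=6, n_heads=8, learning_rate=1e-4) and there is no validation_fraction parameter.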
- Parameters:
n_layers (int)
d_model (int)
n_heads (int)
attention_dropout (float)
ffn_dropout (float)
d_ffn_factor (float)
use_intersample (bool)
learning_rate (float)
weight_decay (float)
n_epochs (int)
batch_size (int)
early_stopping (int)
cat_cardinality_threshold (int)
device (str)
random_state (int | None)
verbose (bool)
- set_fit_request(*, eval_set='$UNCHANGED$')¶
Configure whether metadata should be requested to be passed to the fit method.
Note that this method is only relevant when this estimator is used as a sub-estimator within a meta-estimator and metadata routing is enabled with enable_metadata_routing=True (see sklearn.set_config()). Please check the User Guide on how the routing mechanism works.
The options for each parameter are:
True: metadata is requested, and passed to fit if provided. The request is ignored if metadata is not provided.
False: metadata is not requested and the meta-estimator will not pass it to fit.
None: metadata is not requested, and the meta-estimator will raise an error if the user provides it.
str: metadata should be passed to the meta-estimator with this given alias instead of the original name.
The default (sklearn.utils.metadata_routing.UNCHANGED) retains the existing request. This allows you to change the request for some parameters and not others.
Added in version 1.3.
- Parameters:
eval_set (str, True, False, or None, default=sklearn.utils.metadata_routing.UNCHANGED) – Metadata routing for eval_set parameter in fit.
self (SAINTRegressor)
- Returns:
self (object) – The updated object.
- set_score_request(*, sample_weight='$UNCHANGED$')¶
Configure whether metadata should be requested to be passed to the score method.
Note that this method is only relevant when this estimator is used as a sub-estimator within a meta-estimator and metadata routing is enabled with enable_metadata_routing=True (see sklearn.set_config()). Please check the User Guide on how the routing mechanism works.
The options for each parameter are:
True: metadata is requested, and passed to score if provided. The request is ignored if metadata is not provided.
False: metadata is not requested and the meta-estimator will not pass it to score.
None: metadata is not requested, and the meta-estimator will raise an error if the user provides it.
str: metadata should be passed to the meta-estimator with this given alias instead of the original name.
The default (sklearn.utils.metadata_routing.UNCHANGED) retains the existing request. This allows you to change the request for some parameters and not others.
Added in version 1.3.
- Parameters:
sample_weight (str, True, False, or None, default=sklearn.utils.metadata_routing.UNCHANGED) – Metadata routing for sample_weight parameter in score.
self (SAINTRegressor)
- Returns:
self (object) – The updated object.
- class endgame.models.NODEClassifier(n_layers=1, n_trees=128, tree_depth=4, choice_function='softmax', bin_function='sigmoid', learning_rate=0.01, weight_decay=1e-05, n_epochs=100, batch_size=128, early_stopping=20, max_grad_norm=1.0, validation_fraction=0.1, device='auto', random_state=None, verbose=False)[source]¶
Bases: ClassifierMixin, BaseEstimator
NODE: Neural Oblivious Decision Ensembles for classification.
A differentiable ensemble of oblivious decision trees. Bridges the gap between gradient boosting and neural networks.
- Parameters:
n_layers (int, default=1) – Number of dense NODE layers. Start with 1 for most datasets.
n_trees (int, default=128) – Number of trees per layer. 64-256 works well for most datasets.
tree_depth (int, default=4) – Depth of each oblivious tree. 3-5 works well; deeper trees risk overfitting.
choice_function (str, default='softmax') – Soft choice function: ‘entmax15’, ‘softmax’. Softmax is more stable.
bin_function (str, default='sigmoid') – Binning function: ‘entmoid15’, ‘sigmoid’. Sigmoid is more stable.
learning_rate (float, default=0.01) – Learning rate. Higher values (0.01-0.1) often work better for NODE.
weight_decay (float, default=1e-5) – L2 regularization.
n_epochs (int, default=100) – Maximum training epochs.
batch_size (int, default=128) – Training batch size. Smaller batches (64-256) often work better.
early_stopping (int, default=20) – Early stopping patience.
max_grad_norm (float, default=1.0) – Maximum gradient norm for clipping.
validation_fraction (float, default=0.1) – Fraction of training data to use for validation when eval_set not provided. Set to 0 to disable internal validation split (not recommended).
device (str, default='auto') – Device: ‘cuda’, ‘cpu’, ‘auto’.
random_state (int, optional) – Random seed.
verbose (bool, default=False) – Verbose output.
- classes_¶
Class labels.
- Type:
ndarray
- model_¶
Fitted model.
- Type:
_NODEModule
Examples
>>> clf = NODEClassifier(n_layers=1, n_trees=128, tree_depth=4)
>>> clf.fit(X_train, y_train, eval_set=(X_val, y_val))
>>> proba = clf.predict_proba(X_test)
Notes
NODE works best with:
- An eval_set for early stopping (or validation_fraction > 0)
- Higher learning rates (0.01-0.1) than typical neural networks
- Smaller batch sizes (64-256)
- Fewer/shallower trees than you might expect (start small)
When using sklearn’s cross_val_score (which doesn’t support eval_set), the model will automatically create an internal validation split using validation_fraction of the training data.
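To make the phrase "differentiable ensemble of oblivious decision trees" concrete, here is a minimal NumPy sketch of the forward pass of a single soft oblivious tree, using a softmax choice function and a sigmoid bin function as named in the parameters above. The function name, argument layout, and leaf ordering are illustrative assumptions, not the internals of _NODEModule.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def softmax(z, axis=-1):
    z = z - z.max(axis=axis, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=axis, keepdims=True)

def soft_oblivious_tree(X, feature_logits, thresholds, leaf_values):
    """Forward pass of one soft oblivious tree (illustrative only).

    X:              (n_samples, n_features)
    feature_logits: (depth, n_features), one soft feature choice per level
    thresholds:     (depth,), one threshold shared by all nodes of a level
    leaf_values:    (2 ** depth,), response stored in each leaf
    """
    depth = thresholds.shape[0]
    # Choice function: softmax-weighted mix of features for each level.
    selected = X @ softmax(feature_logits, axis=1).T      # (n_samples, depth)
    # Bin function: probability of taking the "right" branch at each level.
    p_right = sigmoid(selected - thresholds)              # (n_samples, depth)
    # Each leaf corresponds to one left/right pattern across the levels;
    # its weight is the product of the matching branch probabilities.
    leaf_weights = np.ones((X.shape[0], 1))
    for level in range(depth):
        p = p_right[:, level:level + 1]
        leaf_weights = np.concatenate(
            [leaf_weights * (1.0 - p), leaf_weights * p], axis=1
        )
    return leaf_weights @ leaf_values, leaf_weights

rng = np.random.default_rng(0)
X = rng.normal(size=(5, 3))
out, w = soft_oblivious_tree(
    X,
    feature_logits=rng.normal(size=(4, 3)),   # tree_depth=4
    thresholds=rng.normal(size=4),
    leaf_values=rng.normal(size=16),
)
```

Because every operation is smooth, gradients flow to the feature logits, thresholds, and leaf values, which is what lets such trees be trained with the same optimizers as a neural network.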
- fit(X, y, eval_set=None, **fit_params)[source]¶
Fit the NODE classifier.
- Parameters:
X (array-like) – Training features.
y (array-like) – Training labels.
eval_set (tuple, optional) – Validation set for early stopping.
- Return type:
- Returns:
self
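The early_stopping parameter is described as a patience value. The sketch below shows one common interpretation: stop once the validation loss has not improved for that many consecutive epochs. The helper name and the exact stopping rule are assumptions for illustration; the library's own training loop may differ in detail.

```python
def train_with_patience(val_losses, patience):
    """Return (best_epoch, stopped_epoch) for a sequence of validation losses.

    Hypothetical helper: val_losses stands in for the per-epoch validation
    loss that would be computed on eval_set during training.
    """
    best_loss = float("inf")
    best_epoch = -1
    for epoch, loss in enumerate(val_losses):
        if loss < best_loss:
            # Improvement: remember this epoch as the new best.
            best_loss, best_epoch = loss, epoch
        elif epoch - best_epoch >= patience:
            # No improvement for `patience` epochs in a row: stop here.
            return best_epoch, epoch
    return best_epoch, len(val_losses) - 1

losses = [0.90, 0.70, 0.65, 0.66, 0.67, 0.68, 0.64]
best, stopped = train_with_patience(losses, patience=3)
```

In this toy run training halts at epoch 5, so the lower loss at epoch 6 is never reached. This is the usual trade-off when choosing a small patience value.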
- set_fit_request(*, eval_set='$UNCHANGED$')¶
Configure whether metadata should be requested to be passed to the fit method.
Note that this method is only relevant when this estimator is used as a sub-estimator within a meta-estimator and metadata routing is enabled with enable_metadata_routing=True (see sklearn.set_config()). Please check the User Guide on how the routing mechanism works.
The options for each parameter are:
- True: metadata is requested, and passed to fit if provided. The request is ignored if metadata is not provided.
- False: metadata is not requested and the meta-estimator will not pass it to fit.
- None: metadata is not requested, and the meta-estimator will raise an error if the user provides it.
- str: metadata should be passed to the meta-estimator with this given alias instead of the original name.
The default (sklearn.utils.metadata_routing.UNCHANGED) retains the existing request. This allows you to change the request for some parameters and not others.
Added in version 1.3.
- Parameters:
eval_set (str, True, False, or None, default=sklearn.utils.metadata_routing.UNCHANGED) – Metadata routing for eval_set parameter in fit.
self (NODEClassifier)
- Returns:
self (object) – The updated object.
- Return type:
- set_score_request(*, sample_weight='$UNCHANGED$')¶
Configure whether metadata should be requested to be passed to the score method.
Note that this method is only relevant when this estimator is used as a sub-estimator within a meta-estimator and metadata routing is enabled with enable_metadata_routing=True (see sklearn.set_config()). Please check the User Guide on how the routing mechanism works.
The options for each parameter are:
- True: metadata is requested, and passed to score if provided. The request is ignored if metadata is not provided.
- False: metadata is not requested and the meta-estimator will not pass it to score.
- None: metadata is not requested, and the meta-estimator will raise an error if the user provides it.
- str: metadata should be passed to the meta-estimator with this given alias instead of the original name.
The default (sklearn.utils.metadata_routing.UNCHANGED) retains the existing request. This allows you to change the request for some parameters and not others.
Added in version 1.3.
- Parameters:
sample_weight (str, True, False, or None, default=sklearn.utils.metadata_routing.UNCHANGED) – Metadata routing for sample_weight parameter in score.
self (NODEClassifier)
- Returns:
self (object) – The updated object.
- Return type:
- class endgame.models.NODERegressor(n_layers=2, n_trees=256, tree_depth=4, choice_function='softmax', bin_function='sigmoid', learning_rate=0.001, weight_decay=1e-05, n_epochs=100, batch_size=512, early_stopping=15, max_grad_norm=1.0, device='auto', random_state=None, verbose=False)[source]¶
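Bases: BaseEstimator, RegressorMixin
NODE for regression.
Same architecture as NODEClassifier but with MSE loss. See NODEClassifier for parameter descriptions. Per the signature above, several defaults differ from the classifier (n_layers=2, n_trees=256, learning_rate=0.001, batch_size=512, early_stopping=15), and there is no validation_fraction parameter.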
- set_fit_request(*, eval_set='$UNCHANGED$')¶
Configure whether metadata should be requested to be passed to the fit method.
Note that this method is only relevant when this estimator is used as a sub-estimator within a meta-estimator and metadata routing is enabled with enable_metadata_routing=True (see sklearn.set_config()). Please check the User Guide on how the routing mechanism works.
The options for each parameter are:
- True: metadata is requested, and passed to fit if provided. The request is ignored if metadata is not provided.
- False: metadata is not requested and the meta-estimator will not pass it to fit.
- None: metadata is not requested, and the meta-estimator will raise an error if the user provides it.
- str: metadata should be passed to the meta-estimator with this given alias instead of the original name.
The default (sklearn.utils.metadata_routing.UNCHANGED) retains the existing request. This allows you to change the request for some parameters and not others.
Added in version 1.3.
- Parameters:
eval_set (str, True, False, or None, default=sklearn.utils.metadata_routing.UNCHANGED) – Metadata routing for eval_set parameter in fit.
self (NODERegressor)
- Returns:
self (object) – The updated object.
- Return type:
- set_score_request(*, sample_weight='$UNCHANGED$')¶
Configure whether metadata should be requested to be passed to the score method.
Note that this method is only relevant when this estimator is used as a sub-estimator within a meta-estimator and metadata routing is enabled with enable_metadata_routing=True (see sklearn.set_config()). Please check the User Guide on how the routing mechanism works.
The options for each parameter are:
- True: metadata is requested, and passed to score if provided. The request is ignored if metadata is not provided.
- False: metadata is not requested and the meta-estimator will not pass it to score.
- None: metadata is not requested, and the meta-estimator will raise an error if the user provides it.
- str: metadata should be passed to the meta-estimator with this given alias instead of the original name.
The default (sklearn.utils.metadata_routing.UNCHANGED) retains the existing request. This allows you to change the request for some parameters and not others.
Added in version 1.3.
- Parameters:
sample_weight (str, True, False, or None, default=sklearn.utils.metadata_routing.UNCHANGED) – Metadata routing for sample_weight parameter in score.
self (NODERegressor)
- Returns:
self (object) – The updated object.
- Return type:
- class endgame.models.ModernNCAClassifier(n_neighbors=32, embedding_dim=128, hidden_dims=None, temperature=0.1, dropout=0.1, learning_rate=0.001, weight_decay=1e-05, n_epochs=100, batch_size=256, early_stopping=15, device='auto', random_state=None, verbose=False)[source]¶
Bases: ClassifierMixin, BaseEstimator
Modern Neighborhood Component Analysis classifier.
A kNN-based approach with learned distance metric using neural networks. Surprisingly competitive with gradient boosting on many tasks.
The model learns an embedding space where samples of the same class are close together and samples of different classes are far apart. At inference, it uses kNN in this learned space.
- Parameters:
n_neighbors (int, default=32) – Number of neighbors for kNN prediction.
embedding_dim (int, default=128) – Dimension of learned embedding space.
hidden_dims (List[int], optional) – Hidden layer dimensions for the embedding network. The signature default is None; the docstring gives [256, 256] as the default layout.
temperature (float, default=0.1) – Softmax temperature for neighbor weighting.
dropout (float, default=0.1) – Dropout rate in embedding network.
learning_rate (float, default=1e-3) – Learning rate.
weight_decay (float, default=1e-5) – L2 regularization.
n_epochs (int, default=100) – Training epochs.
batch_size (int, default=256) – Batch size.
early_stopping (int, default=15) – Early stopping patience.
device (str, default='auto') – Device.
random_state (int, optional) – Random seed.
verbose (bool, default=False) – Verbose output.
- classes_¶
Class labels.
- Type:
ndarray
- model_¶
Fitted embedding network.
- Type:
_EmbeddingNetwork
- train_embeddings_¶
Embeddings of training data.
- Type:
ndarray
- train_labels_¶
Training labels.
- Type:
ndarray
Examples
>>> clf = ModernNCAClassifier(n_neighbors=32, embedding_dim=128)
>>> clf.fit(X_train, y_train)
>>> proba = clf.predict_proba(X_test)
Notes
This approach is particularly effective when:
- The decision boundary is locally smooth
- Class separation can benefit from learned features
- You want probabilistic predictions based on neighborhood
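The class description says inference uses kNN in the learned embedding space, with n_neighbors and a softmax temperature for neighbor weighting. The NumPy sketch below shows one way such a prediction step could look, given embeddings that have already been computed. The function name and the use of squared Euclidean distance are assumptions for illustration, not the library's actual implementation.

```python
import numpy as np

def soft_knn_proba(query_emb, train_emb, train_labels, n_classes,
                   n_neighbors, temperature):
    """Class probabilities from the k nearest training embeddings."""
    # Squared Euclidean distance between every query and training embedding.
    d2 = ((query_emb[:, None, :] - train_emb[None, :, :]) ** 2).sum(axis=-1)
    # Indices and distances of the k closest training points per query.
    nn_idx = np.argsort(d2, axis=1)[:, :n_neighbors]
    nn_d2 = np.take_along_axis(d2, nn_idx, axis=1)
    # Softmax over negative distances; a lower temperature sharpens weights.
    logits = -nn_d2 / temperature
    logits -= logits.max(axis=1, keepdims=True)
    weights = np.exp(logits)
    weights /= weights.sum(axis=1, keepdims=True)
    # Accumulate neighbor weights per class.
    proba = np.zeros((query_emb.shape[0], n_classes))
    for c in range(n_classes):
        proba[:, c] = (weights * (train_labels[nn_idx] == c)).sum(axis=1)
    return proba

train_emb = np.array([[0.0, 0.0], [0.1, 0.0], [1.0, 1.0], [1.1, 1.0]])
train_labels = np.array([0, 0, 1, 1])
query = np.array([[0.05, 0.0], [1.05, 1.0]])
proba = soft_knn_proba(query, train_emb, train_labels,
                       n_classes=2, n_neighbors=3, temperature=0.1)
```

With a small temperature such as 0.1, distant neighbors receive almost no weight, so the prediction is dominated by the closest points even when n_neighbors is large.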
- fit(X, y, eval_set=None, **fit_params)[source]¶
Fit the ModernNCA classifier.
- Parameters:
X (array-like) – Training features.
y (array-like) – Training labels.
eval_set (tuple, optional) – Validation set.
- Return type:
- Returns:
self
- transform(X)[source]¶
Transform features to embedding space.
- Parameters:
X (array-like) – Input features.
- Return type:
- Returns:
ndarray – Learned embeddings.
- set_fit_request(*, eval_set='$UNCHANGED$')¶
Configure whether metadata should be requested to be passed to the fit method.
Note that this method is only relevant when this estimator is used as a sub-estimator within a meta-estimator and metadata routing is enabled with enable_metadata_routing=True (see sklearn.set_config()). Please check the User Guide on how the routing mechanism works.
The options for each parameter are:
- True: metadata is requested, and passed to fit if provided. The request is ignored if metadata is not provided.
- False: metadata is not requested and the meta-estimator will not pass it to fit.
- None: metadata is not requested, and the meta-estimator will raise an error if the user provides it.
- str: metadata should be passed to the meta-estimator with this given alias instead of the original name.
The default (sklearn.utils.metadata_routing.UNCHANGED) retains the existing request. This allows you to change the request for some parameters and not others.
Added in version 1.3.
- Parameters:
eval_set (str, True, False, or None, default=sklearn.utils.metadata_routing.UNCHANGED) – Metadata routing for eval_set parameter in fit.
self (ModernNCAClassifier)
- Returns:
self (object) – The updated object.
- Return type:
- set_score_request(*, sample_weight='$UNCHANGED$')¶
Configure whether metadata should be requested to be passed to the score method.
Note that this method is only relevant when this estimator is used as a sub-estimator within a meta-estimator and metadata routing is enabled with enable_metadata_routing=True (see sklearn.set_config()). Please check the User Guide on how the routing mechanism works.
The options for each parameter are:
- True: metadata is requested, and passed to score if provided. The request is ignored if metadata is not provided.
- False: metadata is not requested and the meta-estimator will not pass it to score.
- None: metadata is not requested, and the meta-estimator will raise an error if the user provides it.
- str: metadata should be passed to the meta-estimator with this given alias instead of the original name.
The default (sklearn.utils.metadata_routing.UNCHANGED) retains the existing request. This allows you to change the request for some parameters and not others.
Added in version 1.3.
- Parameters:
sample_weight (str, True, False, or None, default=sklearn.utils.metadata_routing.UNCHANGED) – Metadata routing for sample_weight parameter in score.
self (ModernNCAClassifier)
- Returns:
self (object) – The updated object.
- Return type:
- class endgame.models.NAMClassifier(n_hidden=32, n_layers=2, activation='relu', dropout=0.0, feature_dropout=0.0, learning_rate=0.005, weight_decay=1e-05, output_regularization=0.0, n_epochs=50, batch_size=1024, early_stopping=10, validation_fraction=0.1, device='auto', random_state=None, verbose=False)[source]¶
Bases: ClassifierMixin, BaseEstimator
Neural Additive Model for classification.
NAM learns a separate neural network for each input feature, providing interpretability similar to GAMs while leveraging neural network expressivity. The model is fully interpretable as you can visualize each feature’s contribution.
- Parameters:
n_hidden (int, default=32) – Number of hidden units per feature network.
n_layers (int, default=2) – Number of hidden layers per feature network.
activation (str, default='relu') – Activation function: ‘relu’ or ‘exu’ (exponential units). ExU can capture more complex shapes but is less stable.
dropout (float, default=0.0) – Dropout rate within feature networks.
feature_dropout (float, default=0.0) – Probability of dropping entire feature networks during training. Acts as regularization to prevent feature co-adaptation.
learning_rate (float, default=0.005) – Learning rate for Adam optimizer.
weight_decay (float, default=1e-5) – L2 regularization strength.
output_regularization (float, default=0.0) – Regularization on feature network outputs to encourage sparsity.
n_epochs (int, default=50) – Maximum number of training epochs.
batch_size (int, default=1024) – Training batch size.
early_stopping (int, default=10) – Early stopping patience (epochs without improvement).
validation_fraction (float, default=0.1) – Fraction of training data for validation when eval_set not provided.
device (str, default='auto') – Device to use: ‘cuda’, ‘cpu’, or ‘auto’.
random_state (int, optional) – Random seed for reproducibility.
verbose (bool, default=False) – Whether to print training progress.
- classes_¶
Unique class labels.
- Type:
ndarray
- model_¶
The fitted PyTorch module.
- Type:
_NAMModule
- feature_importances_¶
Feature importance scores.
- Type:
ndarray
Examples
>>> from endgame.models.tabular import NAMClassifier
>>> clf = NAMClassifier(n_hidden=64, n_layers=3)
>>> clf.fit(X_train, y_train)
>>> proba = clf.predict_proba(X_test)
>>> # Get feature contributions for interpretability
>>> contributions = clf.get_feature_contributions(X_test)
Notes
NAM provides several interpretability features:
- get_feature_contributions(X): Get each feature’s contribution
- feature_importances_: Overall feature importance
- plot_feature_effects(): Visualize learned feature shapes (if matplotlib available)
For best results:
- Start with default hyperparameters
- Use feature_dropout > 0 if features are correlated
- Try ‘exu’ activation for highly non-linear relationships
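The additive structure behind these interpretability features can be sketched in a few lines of NumPy. Each feature passes through its own function, the outputs are summed with a bias to give a logit, and the per-feature outputs are the contributions. The hand-written shape functions below stand in for trained feature networks, and the mean-absolute-contribution importance is one plausible definition; both are assumptions rather than the library's exact formulas.

```python
import numpy as np

# Hypothetical per-feature shape functions standing in for trained networks.
shape_functions = [
    lambda x: 0.8 * x,            # feature 0: linear effect
    lambda x: -np.abs(x),         # feature 1: penalizes large magnitudes
    lambda x: np.zeros_like(x),   # feature 2: no effect
]
bias = 0.1

def feature_contributions(X):
    """Apply each shape function to its own column only."""
    return np.column_stack([f(X[:, j]) for j, f in enumerate(shape_functions)])

X = np.array([[1.0, -2.0, 3.0],
              [0.5, 0.0, -1.0]])
contrib = feature_contributions(X)            # shape (n_samples, n_features)
logit = bias + contrib.sum(axis=1)            # additive prediction
proba_pos = 1.0 / (1.0 + np.exp(-logit))      # binary-classification case
importance = np.abs(contrib).mean(axis=0)     # one possible importance score
```

Because the prediction is a plain sum, each column of the contribution matrix can be plotted against its feature to see the learned effect in isolation.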
- fit(X, y, eval_set=None, **fit_params)[source]¶
Fit the NAM classifier.
- Parameters:
X (array-like of shape (n_samples, n_features)) – Training features.
y (array-like of shape (n_samples,)) – Training labels.
eval_set (tuple of (X_val, y_val), optional) – Validation set for early stopping. If not provided, uses validation_fraction of training data.
**fit_params (dict) – Additional parameters (ignored).
- Return type:
- Returns:
self (NAMClassifier) – Fitted classifier.
- predict_proba(X)[source]¶
Predict class probabilities.
- Parameters:
X (array-like of shape (n_samples, n_features)) – Samples to predict.
- Return type:
- Returns:
proba (ndarray of shape (n_samples, n_classes)) – Class probabilities.
- predict(X)[source]¶
Predict class labels.
- Parameters:
X (array-like of shape (n_samples, n_features)) – Samples to predict.
- Return type:
- Returns:
y_pred (ndarray of shape (n_samples,)) – Predicted class labels.
- get_feature_contributions(X)[source]¶
Get individual feature contributions for predictions.
This is the key interpretability feature of NAM. Each feature’s contribution shows how it affects the prediction independently.
- Parameters:
X (array-like of shape (n_samples, n_features)) – Samples to explain.
- Return type:
- Returns:
contributions (ndarray of shape (n_samples, n_features)) – Each feature’s contribution to the prediction. Positive values push toward higher class indices.
- plot_feature_effects(feature_idx=None, X=None, n_points=100)[source]¶
Plot learned feature effect shapes.
- Parameters:
- Returns:
fig (matplotlib Figure) – The figure object.
- set_fit_request(*, eval_set='$UNCHANGED$')¶
Configure whether metadata should be requested to be passed to the fit method.
Note that this method is only relevant when this estimator is used as a sub-estimator within a meta-estimator and metadata routing is enabled with enable_metadata_routing=True (see sklearn.set_config()). Please check the User Guide on how the routing mechanism works.
The options for each parameter are:
- True: metadata is requested, and passed to fit if provided. The request is ignored if metadata is not provided.
- False: metadata is not requested and the meta-estimator will not pass it to fit.
- None: metadata is not requested, and the meta-estimator will raise an error if the user provides it.
- str: metadata should be passed to the meta-estimator with this given alias instead of the original name.
The default (sklearn.utils.metadata_routing.UNCHANGED) retains the existing request. This allows you to change the request for some parameters and not others.
Added in version 1.3.
- Parameters:
eval_set (str, True, False, or None, default=sklearn.utils.metadata_routing.UNCHANGED) – Metadata routing for eval_set parameter in fit.
self (NAMClassifier)
- Returns:
self (object) – The updated object.
- Return type:
- set_score_request(*, sample_weight='$UNCHANGED$')¶
Configure whether metadata should be requested to be passed to the score method.
Note that this method is only relevant when this estimator is used as a sub-estimator within a meta-estimator and metadata routing is enabled with enable_metadata_routing=True (see sklearn.set_config()). Please check the User Guide on how the routing mechanism works.
The options for each parameter are:
- True: metadata is requested, and passed to score if provided. The request is ignored if metadata is not provided.
- False: metadata is not requested and the meta-estimator will not pass it to score.
- None: metadata is not requested, and the meta-estimator will raise an error if the user provides it.
- str: metadata should be passed to the meta-estimator with this given alias instead of the original name.
The default (sklearn.utils.metadata_routing.UNCHANGED) retains the existing request. This allows you to change the request for some parameters and not others.
Added in version 1.3.
- Parameters:
sample_weight (str, True, False, or None, default=sklearn.utils.metadata_routing.UNCHANGED) – Metadata routing for sample_weight parameter in score.
self (NAMClassifier)
- Returns:
self (object) – The updated object.
- Return type:
- class endgame.models.NAMRegressor(n_hidden=32, n_layers=2, activation='relu', dropout=0.0, feature_dropout=0.0, learning_rate=0.005, weight_decay=1e-05, output_regularization=0.0, n_epochs=50, batch_size=1024, early_stopping=10, validation_fraction=0.1, device='auto', random_state=None, verbose=False)[source]¶
Bases: RegressorMixin, BaseEstimator
Neural Additive Model for regression.
Same architecture as NAMClassifier but with MSE loss for continuous target prediction.
- Parameters:
n_hidden (int, default=32) – Number of hidden units per feature network.
n_layers (int, default=2) – Number of hidden layers per feature network.
activation (str, default='relu') – Activation function: ‘relu’ or ‘exu’.
dropout (float, default=0.0) – Dropout rate within feature networks.
feature_dropout (float, default=0.0) – Probability of dropping entire feature networks.
learning_rate (float, default=0.005) – Learning rate.
weight_decay (float, default=1e-5) – L2 regularization.
output_regularization (float, default=0.0) – Regularization on feature network outputs.
n_epochs (int, default=50) – Maximum training epochs.
batch_size (int, default=1024) – Training batch size.
early_stopping (int, default=10) – Early stopping patience.
validation_fraction (float, default=0.1) – Fraction for validation when eval_set not provided.
device (str, default='auto') – Device to use.
random_state (int, optional) – Random seed.
verbose (bool, default=False) – Verbose output.
- model_¶
Fitted model.
- Type:
_NAMModule
- feature_importances_¶
Feature importance scores.
- Type:
ndarray
Examples
>>> from endgame.models.tabular import NAMRegressor
>>> reg = NAMRegressor(n_hidden=64)
>>> reg.fit(X_train, y_train)
>>> predictions = reg.predict(X_test)
>>> contributions = reg.get_feature_contributions(X_test)
- fit(X, y, eval_set=None, **fit_params)[source]¶
Fit the NAM regressor.
- Parameters:
X (array-like of shape (n_samples, n_features)) – Training features.
y (array-like of shape (n_samples,)) – Training targets.
eval_set (tuple of (X_val, y_val), optional) – Validation set for early stopping.
**fit_params (dict) – Additional parameters (ignored).
- Return type:
- Returns:
self (NAMRegressor) – Fitted regressor.
- predict(X)[source]¶
Predict target values.
- Parameters:
X (array-like of shape (n_samples, n_features)) – Samples to predict.
- Return type:
- Returns:
y_pred (ndarray of shape (n_samples,)) – Predicted values.
- get_feature_contributions(X)[source]¶
Get individual feature contributions for predictions.
- Parameters:
X (array-like of shape (n_samples, n_features)) – Samples to explain.
- Return type:
- Returns:
contributions (ndarray of shape (n_samples, n_features)) – Each feature’s contribution to the prediction.
- plot_feature_effects(feature_idx=None, X=None, n_points=100)[source]¶
Plot learned feature effect shapes.
- set_fit_request(*, eval_set='$UNCHANGED$')¶
Configure whether metadata should be requested to be passed to the fit method.
Note that this method is only relevant when this estimator is used as a sub-estimator within a meta-estimator and metadata routing is enabled with enable_metadata_routing=True (see sklearn.set_config()). Please check the User Guide on how the routing mechanism works.
The options for each parameter are:
- True: metadata is requested, and passed to fit if provided. The request is ignored if metadata is not provided.
- False: metadata is not requested and the meta-estimator will not pass it to fit.
- None: metadata is not requested, and the meta-estimator will raise an error if the user provides it.
- str: metadata should be passed to the meta-estimator with this given alias instead of the original name.
The default (sklearn.utils.metadata_routing.UNCHANGED) retains the existing request. This allows you to change the request for some parameters and not others.
Added in version 1.3.
- Parameters:
eval_set (str, True, False, or None, default=sklearn.utils.metadata_routing.UNCHANGED) – Metadata routing for eval_set parameter in fit.
self (NAMRegressor)
- Returns:
self (object) – The updated object.
- Return type:
- set_score_request(*, sample_weight='$UNCHANGED$')¶
Configure whether metadata should be requested to be passed to the score method.
Note that this method is only relevant when this estimator is used as a sub-estimator within a meta-estimator and metadata routing is enabled with enable_metadata_routing=True (see sklearn.set_config()). Please check the User Guide on how the routing mechanism works.
The options for each parameter are:
- True: metadata is requested, and passed to score if provided. The request is ignored if metadata is not provided.
- False: metadata is not requested and the meta-estimator will not pass it to score.
- None: metadata is not requested, and the meta-estimator will raise an error if the user provides it.
- str: metadata should be passed to the meta-estimator with this given alias instead of the original name.
The default (sklearn.utils.metadata_routing.UNCHANGED) retains the existing request. This allows you to change the request for some parameters and not others.
Added in version 1.3.
- Parameters:
sample_weight (str, True, False, or None, default=sklearn.utils.metadata_routing.UNCHANGED) – Metadata routing for sample_weight parameter in score.
self (NAMRegressor)
- Returns:
self (object) – The updated object.
- Return type:
- class endgame.models.GPClassifier(kernel='rbf', length_scale=1.0, n_restarts_optimizer=3, max_iter_predict=100, warm_start=False, multi_class='one_vs_rest', auto_scale=True, random_state=None)[source]¶
Bases: ClassifierMixin, BaseEstimator
Gaussian Process Classifier with competition-tuned defaults.
A Bayesian kernel method that provides probabilistic predictions with principled uncertainty estimates. Different inductive bias from trees and neural networks, making it valuable for ensemble diversity.
- Parameters:
kernel (str or sklearn kernel, default='rbf') – Kernel type. Options: ‘rbf’, ‘matern’, ‘matern12’, ‘matern32’, ‘matern52’, ‘rq’, ‘linear’, or a sklearn kernel object.
length_scale (float, default=1.0) – Length scale parameter for the kernel.
n_restarts_optimizer (int, default=3) – Number of restarts for the optimizer.
max_iter_predict (int, default=100) – Maximum iterations for prediction.
warm_start (bool, default=False) – Use previous fit as initialization.
multi_class (str, default='one_vs_rest') – Multi-class strategy: ‘one_vs_rest’ or ‘one_vs_one’.
auto_scale (bool, default=True) – Automatically scale features before fitting.
random_state (int, optional) – Random seed for reproducibility.
- classes_¶
Unique class labels.
- Type:
ndarray
- model_¶
Fitted sklearn GP classifier.
- Type:
GaussianProcessClassifier
Examples
>>> from endgame.models.kernel import GPClassifier
>>> clf = GPClassifier(kernel='rbf', random_state=42)
>>> clf.fit(X_train, y_train)
>>> proba = clf.predict_proba(X_test)
>>> # Get uncertainty
>>> proba, std = clf.predict_proba(X_test, return_std=True)
Notes
Gaussian Processes excel on small to medium datasets where uncertainty matters. Exact inference scales as O(n^3) in the number of training samples, so they are not suitable for large datasets (>10k samples) without approximations.
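A quick back-of-the-envelope calculation shows why the 10k-sample guideline matters. The helpers below are illustrative only. They assume cubic time scaling as stated above and a dense float64 kernel matrix of n × n entries, which is the usual storage cost for exact GP inference.

```python
def relative_gp_cost(n_old, n_new):
    """Relative fitting cost under O(n^3) scaling."""
    return (n_new / n_old) ** 3

def kernel_matrix_megabytes(n, bytes_per_entry=8):
    """Memory for a dense n x n kernel matrix (float64 by default)."""
    return n * n * bytes_per_entry / 1e6

ratio = relative_gp_cost(1_000, 10_000)         # 10x the data
memory_mb = kernel_matrix_megabytes(10_000)     # dense kernel at n = 10k
```

Growing the training set tenfold multiplies the fitting cost by roughly a thousand, and the kernel matrix alone reaches about 800 MB at 10,000 samples.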
- fit(X, y, **fit_params)[source]¶
Fit the Gaussian Process classifier.
- Parameters:
X (array-like of shape (n_samples, n_features)) – Training features.
y (array-like of shape (n_samples,)) – Target labels.
- Return type:
- Returns:
self
- predict(X)[source]¶
Predict class labels.
- Parameters:
X (array-like of shape (n_samples, n_features)) – Samples to predict.
- Return type:
- Returns:
y_pred (ndarray of shape (n_samples,)) – Predicted class labels.
- predict_proba(X, return_std=False)[source]¶
Predict class probabilities.
- Parameters:
X (array-like of shape (n_samples, n_features)) – Samples to predict.
return_std (bool, default=False) – If True, also return uncertainty estimates.
- Return type:
- Returns:
proba (ndarray of shape (n_samples, n_classes)) – Class probabilities.
std (ndarray of shape (n_samples,), optional) – Uncertainty estimates (if return_std=True).
- set_predict_proba_request(*, return_std='$UNCHANGED$')¶
Configure whether metadata should be requested to be passed to the predict_proba method.
Note that this method is only relevant when this estimator is used as a sub-estimator within a meta-estimator and metadata routing is enabled with enable_metadata_routing=True (see sklearn.set_config()). Please check the User Guide on how the routing mechanism works.
The options for each parameter are:
- True: metadata is requested, and passed to predict_proba if provided. The request is ignored if metadata is not provided.
- False: metadata is not requested and the meta-estimator will not pass it to predict_proba.
- None: metadata is not requested, and the meta-estimator will raise an error if the user provides it.
- str: metadata should be passed to the meta-estimator with this given alias instead of the original name.
The default (sklearn.utils.metadata_routing.UNCHANGED) retains the existing request. This allows you to change the request for some parameters and not others.
Added in version 1.3.
- Parameters:
return_std (str, True, False, or None, default=sklearn.utils.metadata_routing.UNCHANGED) – Metadata routing for return_std parameter in predict_proba.
self (GPClassifier)
- Returns:
self (object) – The updated object.
- Return type:
- set_score_request(*, sample_weight='$UNCHANGED$')¶
Configure whether metadata should be requested to be passed to the score method.
Note that this method is only relevant when this estimator is used as a sub-estimator within a meta-estimator and metadata routing is enabled with enable_metadata_routing=True (see sklearn.set_config()). Please check the User Guide on how the routing mechanism works.
The options for each parameter are:
- True: metadata is requested, and passed to score if provided. The request is ignored if metadata is not provided.
- False: metadata is not requested and the meta-estimator will not pass it to score.
- None: metadata is not requested, and the meta-estimator will raise an error if the user provides it.
- str: metadata should be passed to the meta-estimator with this given alias instead of the original name.
The default (sklearn.utils.metadata_routing.UNCHANGED) retains the existing request. This allows you to change the request for some parameters and not others.
Added in version 1.3.
- Parameters:
sample_weight (str, True, False, or None, default=sklearn.utils.metadata_routing.UNCHANGED) – Metadata routing for sample_weight parameter in score.
self (GPClassifier)
- Returns:
self (object) – The updated object.
- Return type:
- class endgame.models.GPRegressor(kernel='rbf', length_scale=1.0, alpha=1e-10, n_restarts_optimizer=3, normalize_y=True, auto_scale=True, random_state=None)[source]¶
Bases: RegressorMixin, BaseEstimator
Gaussian Process Regressor with competition-tuned defaults.
A Bayesian kernel method that provides predictions with principled uncertainty estimates through the posterior predictive distribution.
- Parameters:
kernel (str or sklearn kernel, default='rbf') – Kernel type. Options: ‘rbf’, ‘matern’, ‘matern12’, ‘matern32’, ‘matern52’, ‘rq’, ‘linear’, or a sklearn kernel object.
length_scale (float, default=1.0) – Length scale parameter for the kernel.
alpha (float, default=1e-10) – Value added to diagonal for numerical stability.
n_restarts_optimizer (int, default=3) – Number of restarts for the optimizer.
normalize_y (bool, default=True) – Normalize target values.
auto_scale (bool, default=True) – Automatically scale features before fitting.
random_state (int, optional) – Random seed for reproducibility.
- model_¶
Fitted sklearn GP regressor.
- Type:
GaussianProcessRegressor
Examples
>>> from endgame.models.kernel import GPRegressor
>>> reg = GPRegressor(kernel='matern', random_state=42)
>>> reg.fit(X_train, y_train)
>>> y_pred, y_std = reg.predict(X_test, return_std=True)
>>> # Prediction intervals
>>> lower = y_pred - 1.96 * y_std
>>> upper = y_pred + 1.96 * y_std
- fit(X, y, **fit_params)[source]¶
Fit the Gaussian Process regressor.
- Parameters:
X (array-like of shape (n_samples, n_features)) – Training features.
y (array-like of shape (n_samples,)) – Target values.
- Return type:
- Returns:
self
- predict(X, return_std=False, return_cov=False)[source]¶
Predict target values.
- Parameters:
X (array-like of shape (n_samples, n_features)) – Samples to predict.
return_std (bool, default=False) – If True, return standard deviation of predictions.
return_cov (bool, default=False) – If True, return covariance of predictions.
- Returns:
y_pred (ndarray of shape (n_samples,)) – Predicted values.
y_std (ndarray of shape (n_samples,), optional) – Standard deviation (if return_std=True).
y_cov (ndarray of shape (n_samples, n_samples), optional) – Covariance matrix (if return_cov=True).
- predict_interval(X, alpha=0.05)[source]¶
Predict with prediction intervals.
- Parameters:
X (array-like of shape (n_samples, n_features)) – Samples to predict.
alpha (float, default=0.05) – Significance level (0.05 = 95% interval).
- Return type:
- Returns:
y_pred (ndarray of shape (n_samples,)) – Point predictions.
lower (ndarray of shape (n_samples,)) – Lower bound of prediction interval.
upper (ndarray of shape (n_samples,)) – Upper bound of prediction interval.
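The relationship between alpha and the interval width can be made explicit. The sketch below assumes a Gaussian predictive distribution and a symmetric two-sided interval, so the multiplier is the standard normal quantile at 1 - alpha/2. Whether predict_interval computes its bounds in exactly this way is an assumption. The helper name is hypothetical and uses only the Python standard library.

```python
from statistics import NormalDist

def gaussian_interval(y_pred, y_std, alpha=0.05):
    """Symmetric (1 - alpha) interval around Gaussian predictions."""
    # Two-sided quantile: alpha=0.05 gives roughly 1.96.
    z = NormalDist().inv_cdf(1.0 - alpha / 2.0)
    lower = [m - z * s for m, s in zip(y_pred, y_std)]
    upper = [m + z * s for m, s in zip(y_pred, y_std)]
    return lower, upper, z

lower, upper, z = gaussian_interval([2.0, 5.0], [0.5, 1.0], alpha=0.05)
```

A smaller alpha yields a larger multiplier and therefore wider intervals; for example, alpha=0.01 corresponds to a multiplier of about 2.58.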
- sample_y(X, n_samples=1, random_state=None)[source]¶
Sample from the posterior predictive distribution.
- Parameters:
X (array-like of shape (n_query, n_features)) – Query points.
n_samples (int, default=1) – Number of samples to draw.
random_state (int, optional) – Random seed.
- Returns:
samples (ndarray of shape (n_query, n_samples)) – Samples from posterior predictive.
- set_predict_request(*, return_cov='$UNCHANGED$', return_std='$UNCHANGED$')¶
Configure whether metadata should be requested to be passed to the predict method.
Note that this method is only relevant when this estimator is used as a sub-estimator within a meta-estimator and metadata routing is enabled with enable_metadata_routing=True (see sklearn.set_config()). Please check the User Guide on how the routing mechanism works.
The options for each parameter are:
True: metadata is requested, and passed to predict if provided. The request is ignored if metadata is not provided.
False: metadata is not requested and the meta-estimator will not pass it to predict.
None: metadata is not requested, and the meta-estimator will raise an error if the user provides it.
str: metadata should be passed to the meta-estimator with this given alias instead of the original name.
The default (sklearn.utils.metadata_routing.UNCHANGED) retains the existing request. This allows you to change the request for some parameters and not others.
Added in version 1.3.
- Parameters:
return_cov (str, True, False, or None, default=sklearn.utils.metadata_routing.UNCHANGED) – Metadata routing for return_cov parameter in predict.
return_std (str, True, False, or None, default=sklearn.utils.metadata_routing.UNCHANGED) – Metadata routing for return_std parameter in predict.
- Returns:
self (object) – The updated object.
- set_score_request(*, sample_weight='$UNCHANGED$')¶
Configure whether metadata should be requested to be passed to the score method.
Note that this method is only relevant when this estimator is used as a sub-estimator within a meta-estimator and metadata routing is enabled with enable_metadata_routing=True (see sklearn.set_config()). Please check the User Guide on how the routing mechanism works.
The options for each parameter are:
True: metadata is requested, and passed to score if provided. The request is ignored if metadata is not provided.
False: metadata is not requested and the meta-estimator will not pass it to score.
None: metadata is not requested, and the meta-estimator will raise an error if the user provides it.
str: metadata should be passed to the meta-estimator with this given alias instead of the original name.
The default (sklearn.utils.metadata_routing.UNCHANGED) retains the existing request. This allows you to change the request for some parameters and not others.
Added in version 1.3.
- Parameters:
sample_weight (str, True, False, or None, default=sklearn.utils.metadata_routing.UNCHANGED) – Metadata routing for sample_weight parameter in score.
- Returns:
self (object) – The updated object.
- class endgame.models.SVMClassifier(kernel='rbf', C=1.0, gamma='scale', degree=3, probability=True, class_weight='balanced', auto_scale=True, max_iter=10000, cache_size=500, random_state=None)[source]¶
Bases:
ClassifierMixin, BaseEstimator
Support Vector Machine Classifier with competition-tuned defaults.
A max-margin kernel classifier that finds the optimal separating hyperplane. Different optimization objective from probabilistic models, making it valuable for ensemble diversity.
- Parameters:
kernel (str, default='rbf') – Kernel type: ‘linear’, ‘poly’, ‘rbf’, ‘sigmoid’.
C (float, default=1.0) – Regularization parameter. Lower = more regularization.
gamma (str or float, default='scale') – Kernel coefficient for ‘rbf’, ‘poly’, ‘sigmoid’.
degree (int, default=3) – Degree for polynomial kernel.
probability (bool, default=True) – Enable probability estimates (uses Platt scaling).
class_weight (str or dict, default='balanced') – Class weights: ‘balanced’, None, or dict.
auto_scale (bool, default=True) – Automatically scale features before fitting.
max_iter (int, default=10000) – Maximum iterations for solver.
cache_size (float, default=500) – Kernel cache size in MB.
random_state (int, optional) – Random seed for reproducibility.
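In scikit-learn, class_weight='balanced' weights each class by n_samples / (n_classes * class_count). The sketch below reproduces that formula in plain Python, on the assumption that SVMClassifier forwards the option to the underlying SVC unchanged; balanced_class_weights is an illustrative name, not an endgame function.

```python
from collections import Counter


def balanced_class_weights(y):
    """Per-class weights using the 'balanced' heuristic:
    n_samples / (n_classes * count_of_class)."""
    counts = Counter(y)
    n_samples = len(y)
    n_classes = len(counts)
    return {label: n_samples / (n_classes * count) for label, count in counts.items()}


# Six samples of class 0 and two of class 1: the minority class is upweighted.
weights = balanced_class_weights([0, 0, 0, 0, 0, 0, 1, 1])
print(weights)  # class 0 -> about 0.667, class 1 -> 2.0
```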
- classes_¶
Unique class labels.
- Type:
ndarray
- model_¶
Fitted sklearn SVC.
- Type:
SVC
- support_vectors_¶
Support vectors from training.
- Type:
ndarray
Examples
>>> from endgame.models.kernel import SVMClassifier
>>> clf = SVMClassifier(kernel='rbf', C=1.0, random_state=42)
>>> clf.fit(X_train, y_train)
>>> proba = clf.predict_proba(X_test)
Notes
SVMs work best when:
- Features are scaled (auto_scale=True handles this)
- The dataset is small to medium sized (training time scales between O(n^2) and O(n^3) in the number of samples)
- Clear margin separation exists
The max-margin objective is fundamentally different from log-loss (logistic regression) or GBDT objectives, providing ensemble diversity.
- fit(X, y, sample_weight=None, **fit_params)[source]¶
Fit the SVM classifier.
- Parameters:
X (array-like of shape (n_samples, n_features)) – Training features.
y (array-like of shape (n_samples,)) – Target labels.
sample_weight (array-like of shape (n_samples,), optional) – Sample weights.
- Returns:
self
- predict(X)[source]¶
Predict class labels.
- Parameters:
X (array-like of shape (n_samples, n_features)) – Samples to predict.
- Returns:
y_pred (ndarray of shape (n_samples,)) – Predicted class labels.
- predict_proba(X)[source]¶
Predict class probabilities.
Uses Platt scaling for probability calibration.
- Parameters:
X (array-like of shape (n_samples, n_features)) – Samples to predict.
- Returns:
proba (ndarray of shape (n_samples, n_classes)) – Class probabilities.
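Platt scaling maps a decision value f to a probability through a fitted sigmoid, P(y=1 | f) = 1 / (1 + exp(A * f + B)). The sketch below assumes a binary problem and uses made-up coefficients A and B purely for illustration; in practice they are fitted internally by the underlying SVC, and platt_probability is a hypothetical helper.

```python
import math


def platt_probability(decision_value, A, B):
    """Map an SVM decision value to P(y=1) with a sigmoid."""
    return 1.0 / (1.0 + math.exp(A * decision_value + B))


# A and B are illustrative values, not fitted coefficients.
A, B = -2.0, 0.0
probs = [platt_probability(f, A, B) for f in (-1.5, 0.0, 1.5)]
print(probs)  # increases with the decision value; 0.5 at f = 0
```

Because A is negative here, larger decision values (further on the positive side of the margin) map to probabilities closer to 1.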
- decision_function(X)[source]¶
Compute decision function values.
- Parameters:
X (array-like of shape (n_samples, n_features)) – Samples to predict.
- Returns:
decision (ndarray) – Decision function values.
- property support_vectors_¶
Support vectors from training.
- property n_support_¶
Number of support vectors for each class.
- set_fit_request(*, sample_weight='$UNCHANGED$')¶
Configure whether metadata should be requested to be passed to the fit method.
Note that this method is only relevant when this estimator is used as a sub-estimator within a meta-estimator and metadata routing is enabled with enable_metadata_routing=True (see sklearn.set_config()). Please check the User Guide on how the routing mechanism works.
The options for each parameter are:
True: metadata is requested, and passed to fit if provided. The request is ignored if metadata is not provided.
False: metadata is not requested and the meta-estimator will not pass it to fit.
None: metadata is not requested, and the meta-estimator will raise an error if the user provides it.
str: metadata should be passed to the meta-estimator with this given alias instead of the original name.
The default (sklearn.utils.metadata_routing.UNCHANGED) retains the existing request. This allows you to change the request for some parameters and not others.
Added in version 1.3.
- Parameters:
sample_weight (str, True, False, or None, default=sklearn.utils.metadata_routing.UNCHANGED) – Metadata routing for sample_weight parameter in fit.
- Returns:
self (object) – The updated object.
- set_score_request(*, sample_weight='$UNCHANGED$')¶
Configure whether metadata should be requested to be passed to the score method.
Note that this method is only relevant when this estimator is used as a sub-estimator within a meta-estimator and metadata routing is enabled with enable_metadata_routing=True (see sklearn.set_config()). Please check the User Guide on how the routing mechanism works.
The options for each parameter are:
True: metadata is requested, and passed to score if provided. The request is ignored if metadata is not provided.
False: metadata is not requested and the meta-estimator will not pass it to score.
None: metadata is not requested, and the meta-estimator will raise an error if the user provides it.
str: metadata should be passed to the meta-estimator with this given alias instead of the original name.
The default (sklearn.utils.metadata_routing.UNCHANGED) retains the existing request. This allows you to change the request for some parameters and not others.
Added in version 1.3.
- Parameters:
sample_weight (str, True, False, or None, default=sklearn.utils.metadata_routing.UNCHANGED) – Metadata routing for sample_weight parameter in score.
- Returns:
self (object) – The updated object.
- class endgame.models.SVMRegressor(kernel='rbf', C=1.0, epsilon=0.1, gamma='scale', degree=3, auto_scale=True, max_iter=10000, cache_size=500)[source]¶
Bases:
RegressorMixin, BaseEstimator
Support Vector Machine Regressor with competition-tuned defaults.
Epsilon-SVR fits a function surrounded by a tube of half-width epsilon; deviations smaller than epsilon are ignored and incur no loss.
- Parameters:
kernel (str, default='rbf') – Kernel type: ‘linear’, ‘poly’, ‘rbf’, ‘sigmoid’.
C (float, default=1.0) – Regularization parameter.
epsilon (float, default=0.1) – Epsilon in the epsilon-SVR model.
gamma (str or float, default='scale') – Kernel coefficient for ‘rbf’, ‘poly’, ‘sigmoid’.
degree (int, default=3) – Degree for polynomial kernel.
auto_scale (bool, default=True) – Automatically scale features before fitting.
max_iter (int, default=10000) – Maximum iterations for solver.
cache_size (float, default=500) – Kernel cache size in MB.
- model_¶
Fitted sklearn SVR.
- Type:
SVR
Examples
>>> from endgame.models.kernel import SVMRegressor
>>> reg = SVMRegressor(kernel='rbf', C=1.0)
>>> reg.fit(X_train, y_train)
>>> y_pred = reg.predict(X_test)
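The role of epsilon can be seen in the epsilon-insensitive loss that epsilon-SVR penalizes: residuals inside the tube cost nothing, and residuals outside it are charged only for the excess. The function below is a minimal illustration of that loss, not a piece of endgame's API.

```python
def epsilon_insensitive_loss(y_true, y_pred, epsilon=0.1):
    """Mean epsilon-insensitive loss: errors inside the tube cost nothing."""
    losses = [max(0.0, abs(t - p) - epsilon) for t, p in zip(y_true, y_pred)]
    return sum(losses) / len(losses)


y_true = [1.0, 2.0, 3.0]
y_pred = [1.05, 2.5, 3.0]
# Residuals are 0.05 (inside the tube), 0.5 (0.4 outside it), and 0.0.
loss = epsilon_insensitive_loss(y_true, y_pred, epsilon=0.1)
```

Raising epsilon widens the tube, so more training points are ignored and the fitted function typically depends on fewer support vectors.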
- fit(X, y, sample_weight=None, **fit_params)[source]¶
Fit the SVM regressor.
- Parameters:
X (array-like of shape (n_samples, n_features)) – Training features.
y (array-like of shape (n_samples,)) – Target values.
sample_weight (array-like of shape (n_samples,), optional) – Sample weights.
- Returns:
self
- predict(X)[source]¶
Predict target values.
- Parameters:
X (array-like of shape (n_samples, n_features)) – Samples to predict.
- Returns:
y_pred (ndarray of shape (n_samples,)) – Predicted values.
- property support_vectors_¶
Support vectors from training.
- set_fit_request(*, sample_weight='$UNCHANGED$')¶
Configure whether metadata should be requested to be passed to the fit method.
Note that this method is only relevant when this estimator is used as a sub-estimator within a meta-estimator and metadata routing is enabled with enable_metadata_routing=True (see sklearn.set_config()). Please check the User Guide on how the routing mechanism works.
The options for each parameter are:
True: metadata is requested, and passed to fit if provided. The request is ignored if metadata is not provided.
False: metadata is not requested and the meta-estimator will not pass it to fit.
None: metadata is not requested, and the meta-estimator will raise an error if the user provides it.
str: metadata should be passed to the meta-estimator with this given alias instead of the original name.
The default (sklearn.utils.metadata_routing.UNCHANGED) retains the existing request. This allows you to change the request for some parameters and not others.
Added in version 1.3.
- Parameters:
sample_weight (str, True, False, or None, default=sklearn.utils.metadata_routing.UNCHANGED) – Metadata routing for sample_weight parameter in fit.
- Returns:
self (object) – The updated object.
- set_score_request(*, sample_weight='$UNCHANGED$')¶
Configure whether metadata should be requested to be passed to the score method.
Note that this method is only relevant when this estimator is used as a sub-estimator within a meta-estimator and metadata routing is enabled with enable_metadata_routing=True (see sklearn.set_config()). Please check the User Guide on how the routing mechanism works.
The options for each parameter are:
True: metadata is requested, and passed to score if provided. The request is ignored if metadata is not provided.
False: metadata is not requested and the meta-estimator will not pass it to score.
None: metadata is not requested, and the meta-estimator will raise an error if the user provides it.
str: metadata should be passed to the meta-estimator with this given alias instead of the original name.
The default (sklearn.utils.metadata_routing.UNCHANGED) retains the existing request. This allows you to change the request for some parameters and not others.
Added in version 1.3.
- Parameters:
sample_weight (str, True, False, or None, default=sklearn.utils.metadata_routing.UNCHANGED) – Metadata routing for sample_weight parameter in score.
- Returns:
self (object) – The updated object.
- class endgame.models.ELMClassifier(n_hidden=500, activation='sigmoid', alpha=1e-06, auto_scale=True, random_state=None)[source]¶
Bases:
ClassifierMixin, BaseEstimator
Extreme Learning Machine Classifier.
A single-hidden-layer neural network with random input weights and analytically computed output weights. Training is extremely fast (often milliseconds on small datasets) because there is no iterative optimization: the output weights come from a single linear solve.
- Parameters:
n_hidden (int, default=500) – Number of hidden neurons.
activation (str or callable, default='sigmoid') – Activation function: ‘sigmoid’, ‘tanh’, ‘relu’, ‘leaky_relu’, ‘sin’, ‘hardlim’, or a callable.
alpha (float, default=1e-6) – Regularization parameter for ridge regression.
auto_scale (bool, default=True) – Automatically scale features before fitting.
random_state (int, optional) – Random seed for reproducibility.
- classes_¶
Unique class labels.
- Type:
ndarray
- input_weights_¶
Random input-to-hidden weights.
- Type:
ndarray
- biases_¶
Random hidden layer biases.
- Type:
ndarray
- output_weights_¶
Learned hidden-to-output weights.
- Type:
ndarray
Examples
>>> from endgame.models.baselines import ELMClassifier
>>> clf = ELMClassifier(n_hidden=500, random_state=42)
>>> clf.fit(X_train, y_train)  # Milliseconds!
>>> proba = clf.predict_proba(X_test)
Notes
ELM is valuable for ensemble diversity because:
1. No backpropagation - fundamentally different optimization
2. Random projections explore different feature spaces
3. Extremely fast - can train many models for ensemble selection
4. Often surprisingly competitive with slower methods
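The analytical solution is beta = pinv(H) @ T, where H is the hidden layer output and T is the target. With a ridge penalty alpha > 0, the corresponding regularized form is beta = inv(H.T @ H + alpha * I) @ H.T @ T.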
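The closed-form fit can be sketched in pure Python. This is an illustrative toy rather than endgame's implementation: it assumes a tanh hidden layer, Gaussian random weights, a regression target for brevity, and the ridge form (H^T H + alpha * I) beta = H^T T, which approaches the pseudoinverse solution as alpha goes to 0. All function names are hypothetical.

```python
import math
import random


def hidden_layer(x, W, b):
    """tanh activations of one sample under fixed random weights."""
    return [math.tanh(sum(w_j * x_j for w_j, x_j in zip(w, x)) + b_i)
            for w, b_i in zip(W, b)]


def solve(A, rhs):
    """Solve A @ x = rhs by Gaussian elimination with partial pivoting."""
    n = len(A)
    M = [row[:] + [rhs[i]] for i, row in enumerate(A)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            factor = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= factor * M[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(M[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (M[i][n] - tail) / M[i][i]
    return x


def elm_fit(X, t, n_hidden=8, alpha=0.01, seed=0):
    """Draw random input weights, then solve for output weights in closed form."""
    rng = random.Random(seed)
    W = [[rng.gauss(0.0, 1.0) for _ in X[0]] for _ in range(n_hidden)]
    b = [rng.gauss(0.0, 1.0) for _ in range(n_hidden)]
    H = [hidden_layer(x, W, b) for x in X]
    # Ridge normal equations: (H^T H + alpha * I) beta = H^T t
    A = [[sum(h[i] * h[j] for h in H) + (alpha if i == j else 0.0)
          for j in range(n_hidden)] for i in range(n_hidden)]
    rhs = [sum(h[i] * t_n for h, t_n in zip(H, t)) for i in range(n_hidden)]
    return W, b, solve(A, rhs)


def elm_predict(X, W, b, beta):
    """Linear readout of the hidden activations."""
    return [sum(h_j * beta_j for h_j, beta_j in zip(hidden_layer(x, W, b), beta))
            for x in X]


# Toy 1-D regression problem: the only "training" is one linear solve.
X = [[i / 10.0] for i in range(-10, 11)]
t = [math.sin(3.0 * x[0]) for x in X]
W, b, beta = elm_fit(X, t)
pred = elm_predict(X, W, b, beta)
sse = sum((p - t_n) ** 2 for p, t_n in zip(pred, t))
```

The input weights W and biases b are never updated after they are drawn; only beta is learned, which is why no backpropagation is involved.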
- fit(X, y, **fit_params)[source]¶
Fit the ELM classifier.
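Computing the hidden layer is O(n * m * h), where n=samples, m=features, h=hidden units; solving for the output weights adds a cost that grows with h (roughly O(n * h^2) for a least-squares solve). The closed-form solution avoids iterative optimization, which keeps training fast.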
- Parameters:
X (array-like of shape (n_samples, n_features)) – Training features.
y (array-like of shape (n_samples,)) – Target labels.
- Returns:
self
- predict(X)[source]¶
Predict class labels.
- Parameters:
X (array-like of shape (n_samples, n_features)) – Samples to predict.
- Returns:
y_pred (ndarray of shape (n_samples,)) – Predicted class labels.
- predict_proba(X)[source]¶
Predict class probabilities.
- Parameters:
X (array-like of shape (n_samples, n_features)) – Samples to predict.
- Returns:
proba (ndarray of shape (n_samples, n_classes)) – Class probabilities (softmax normalized).
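Softmax normalization turns one row of raw output scores into probabilities that sum to 1. The sketch below shows the standard numerically stable form; how ELMClassifier computes its raw scores is not shown here, and the softmax helper is illustrative only.

```python
import math


def softmax(scores):
    """Convert raw output scores to probabilities that sum to 1."""
    shift = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - shift) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


proba = softmax([2.0, 1.0, 0.1])
# Large scores do not overflow because of the max shift.
stable = softmax([1000.0, 1000.0])
```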
- set_score_request(*, sample_weight='$UNCHANGED$')¶
Configure whether metadata should be requested to be passed to the score method.
Note that this method is only relevant when this estimator is used as a sub-estimator within a meta-estimator and metadata routing is enabled with enable_metadata_routing=True (see sklearn.set_config()). Please check the User Guide on how the routing mechanism works.
The options for each parameter are:
True: metadata is requested, and passed to score if provided. The request is ignored if metadata is not provided.
False: metadata is not requested and the meta-estimator will not pass it to score.
None: metadata is not requested, and the meta-estimator will raise an error if the user provides it.
str: metadata should be passed to the meta-estimator with this given alias instead of the original name.
The default (sklearn.utils.metadata_routing.UNCHANGED) retains the existing request. This allows you to change the request for some parameters and not others.
Added in version 1.3.
- Parameters:
sample_weight (str, True, False, or None, default=sklearn.utils.metadata_routing.UNCHANGED) – Metadata routing for sample_weight parameter in score.
- Returns:
self (object) – The updated object.
- class endgame.models.ELMRegressor(n_hidden=500, activation='tanh', alpha=0.01, auto_scale=True, random_state=None)[source]¶
Bases:
RegressorMixin, BaseEstimator
Extreme Learning Machine Regressor.
A single-layer neural network with random input weights and analytically computed output weights for regression.
- Parameters:
n_hidden (int, default=500) – Number of hidden neurons.
activation (str or callable, default='tanh') – Activation function. ‘tanh’ is preferred for regression (zero-centered and symmetric, with outputs in (-1, 1)). ‘sigmoid’ compresses to (0, 1).
alpha (float, default=0.01) – Regularization parameter for ridge regression on output weights.
auto_scale (bool, default=True) – Automatically scale features before fitting.
random_state (int, optional) – Random seed for reproducibility.
- input_weights_¶
Random input-to-hidden weights.
- Type:
ndarray
- output_weights_¶
Learned hidden-to-output weights.
- Type:
ndarray
Examples
>>> from endgame.models.baselines import ELMRegressor
>>> reg = ELMRegressor(n_hidden=500, random_state=42)
>>> reg.fit(X_train, y_train)
>>> y_pred = reg.predict(X_test)
- fit(X, y, **fit_params)[source]¶
Fit the ELM regressor.
- Parameters:
X (array-like of shape (n_samples, n_features)) – Training features.
y (array-like of shape (n_samples,)) – Target values.
- Returns:
self
- predict(X)[source]¶
Predict target values.
- Parameters:
X (array-like of shape (n_samples, n_features)) – Samples to predict.
- Returns:
y_pred (ndarray of shape (n_samples,)) – Predicted values.
- set_score_request(*, sample_weight='$UNCHANGED$')¶
Configure whether metadata should be requested to be passed to the score method.
Note that this method is only relevant when this estimator is used as a sub-estimator within a meta-estimator and metadata routing is enabled with enable_metadata_routing=True (see sklearn.set_config()). Please check the User Guide on how the routing mechanism works.
The options for each parameter are:
True: metadata is requested, and passed to score if provided. The request is ignored if metadata is not provided.
False: metadata is not requested and the meta-estimator will not pass it to score.
None: metadata is not requested, and the meta-estimator will raise an error if the user provides it.
str: metadata should be passed to the meta-estimator with this given alias instead of the original name.
The default (sklearn.utils.metadata_routing.UNCHANGED) retains the existing request. This allows you to change the request for some parameters and not others.
Added in version 1.3.
- Parameters:
sample_weight (str, True, False, or None, default=sklearn.utils.metadata_routing.UNCHANGED) – Metadata routing for sample_weight parameter in score.
- Returns:
self (object) – The updated object.
- class endgame.models.NaiveBayesClassifier(variant='auto', var_smoothing=1e-09, alpha=1.0, binarize=0.0, fit_prior=True, class_prior=None)[source]¶
Bases:
ClassifierMixin, BaseEstimator
Naive Bayes Classifier with automatic variant selection.
Automatically selects the appropriate Naive Bayes variant based on feature characteristics, or uses a specified variant.
The feature independence assumption is fundamentally different from tree-based models (which capture interactions) and neural networks (which learn complex dependencies), making this valuable for ensemble diversity.
- Parameters:
variant (str, default='auto') – Naive Bayes variant:
- ‘auto’: Automatically select based on features
- ‘gaussian’: For continuous features
- ‘bernoulli’: For binary features
- ‘multinomial’: For count/frequency features
- ‘complement’: For imbalanced text classification
var_smoothing (float, default=1e-9) – Portion of the largest variance of all features added to variances for stability (Gaussian only).
alpha (float, default=1.0) – Additive smoothing parameter (Bernoulli, Multinomial, Complement).
binarize (float or None, default=0.0) – Threshold for binarizing features (Bernoulli only). None means features are already binary.
fit_prior (bool, default=True) – Whether to learn class prior probabilities.
class_prior (array-like, optional) – Prior probabilities of the classes.
- classes_¶
Unique class labels.
- Type:
ndarray
- model_¶
Fitted Naive Bayes model.
- Type:
sklearn NB estimator
Examples
>>> from endgame.models.baselines import NaiveBayesClassifier
>>> clf = NaiveBayesClassifier(variant='auto')
>>> clf.fit(X_train, y_train)
>>> proba = clf.predict_proba(X_test)
Notes
Despite the naive independence assumption, Naive Bayes often works surprisingly well because:
1. Classification only requires correct ordering, not accurate probabilities
2. Dependencies often “cancel out” when aggregated
3. Regularization effect from the strong prior
For ensembles, NB provides diversity because it makes fundamentally different errors from models that capture feature interactions.
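The independence assumption means the class score is a log prior plus a sum of per-feature log-likelihoods, with no interaction terms. The sketch below illustrates this for the Gaussian variant, using made-up per-feature means and variances; the function names are hypothetical and this is not endgame's code.

```python
import math


def gaussian_log_pdf(x, mean, var):
    """Log density of a univariate normal distribution."""
    return -0.5 * (math.log(2.0 * math.pi * var) + (x - mean) ** 2 / var)


def nb_log_posterior(x, class_params, class_priors):
    """Unnormalized log posterior per class under feature independence."""
    scores = {}
    for label, params in class_params.items():
        # Independence: the joint log-likelihood is a plain sum over features.
        log_likelihood = sum(
            gaussian_log_pdf(x_i, mean, var) for x_i, (mean, var) in zip(x, params)
        )
        scores[label] = math.log(class_priors[label]) + log_likelihood
    return scores


# Illustrative per-feature (mean, variance) pairs for two classes.
class_params = {"a": [(0.0, 1.0), (0.0, 1.0)], "b": [(3.0, 1.0), (3.0, 1.0)]}
class_priors = {"a": 0.5, "b": 0.5}
scores = nb_log_posterior([0.2, -0.1], class_params, class_priors)
predicted = max(scores, key=scores.get)
```

Only the ordering of the scores matters for the predicted label, which is why the classifier can be accurate even when the probabilities themselves are poorly calibrated.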
- fit(X, y, sample_weight=None, **fit_params)[source]¶
Fit the Naive Bayes classifier.
- Parameters:
X (array-like of shape (n_samples, n_features)) – Training features.
y (array-like of shape (n_samples,)) – Target labels.
sample_weight (array-like of shape (n_samples,), optional) – Sample weights.
- Returns:
self
- predict(X)[source]¶
Predict class labels.
- Parameters:
X (array-like of shape (n_samples, n_features)) – Samples to predict.
- Returns:
y_pred (ndarray of shape (n_samples,)) – Predicted class labels.
- predict_proba(X)[source]¶
Predict class probabilities.
- Parameters:
X (array-like of shape (n_samples, n_features)) – Samples to predict.
- Returns:
proba (ndarray of shape (n_samples, n_classes)) – Class probabilities.
- predict_log_proba(X)[source]¶
Predict log class probabilities.
- Parameters:
X (array-like of shape (n_samples, n_features)) – Samples to predict.
- Returns:
log_proba (ndarray of shape (n_samples, n_classes)) – Log class probabilities.
- property feature_log_prob_¶
Log probability of features given a class (for discrete NB).
- property class_log_prior_¶
Log probability of each class.
- set_fit_request(*, sample_weight='$UNCHANGED$')¶
Configure whether metadata should be requested to be passed to the fit method.
Note that this method is only relevant when this estimator is used as a sub-estimator within a meta-estimator and metadata routing is enabled with enable_metadata_routing=True (see sklearn.set_config()). Please check the User Guide on how the routing mechanism works.
The options for each parameter are:
True: metadata is requested, and passed to fit if provided. The request is ignored if metadata is not provided.
False: metadata is not requested and the meta-estimator will not pass it to fit.
None: metadata is not requested, and the meta-estimator will raise an error if the user provides it.
str: metadata should be passed to the meta-estimator with this given alias instead of the original name.
The default (sklearn.utils.metadata_routing.UNCHANGED) retains the existing request. This allows you to change the request for some parameters and not others.
Added in version 1.3.
- Parameters:
sample_weight (str, True, False, or None, default=sklearn.utils.metadata_routing.UNCHANGED) – Metadata routing for sample_weight parameter in fit.
- Returns:
self (object) – The updated object.
- set_score_request(*, sample_weight='$UNCHANGED$')¶
Configure whether metadata should be requested to be passed to the score method.
Note that this method is only relevant when this estimator is used as a sub-estimator within a meta-estimator and metadata routing is enabled with enable_metadata_routing=True (see sklearn.set_config()). Please check the User Guide on how the routing mechanism works.
The options for each parameter are:
True: metadata is requested, and passed to score if provided. The request is ignored if metadata is not provided.
False: metadata is not requested and the meta-estimator will not pass it to score.
None: metadata is not requested, and the meta-estimator will raise an error if the user provides it.
str: metadata should be passed to the meta-estimator with this given alias instead of the original name.
The default (sklearn.utils.metadata_routing.UNCHANGED) retains the existing request. This allows you to change the request for some parameters and not others.
Added in version 1.3.
- Parameters:
sample_weight (str, True, False, or None, default=sklearn.utils.metadata_routing.UNCHANGED) – Metadata routing for sample_weight parameter in score.
- Returns:
self (object) – The updated object.
- class endgame.models.LDAClassifier(solver='svd', shrinkage='auto', n_components=None, store_covariance=False, tol=0.0001)[source]¶
Bases:
ClassifierMixin, BaseEstimator
Linear Discriminant Analysis Classifier.
LDA assumes that all classes share the same covariance matrix. This leads to linear decision boundaries between classes.
- Parameters:
solver (str, default='svd') – Solver: ‘svd’, ‘lsqr’, ‘eigen’.
shrinkage (str, float, or None, default='auto') – Shrinkage parameter: ‘auto’ (Ledoit-Wolf), float in [0,1], or None.
n_components (int, optional) – Number of components for dimensionality reduction.
store_covariance (bool, default=False) – Store the covariance matrix.
tol (float, default=1e-4) – Tolerance for singular value decomposition.
- classes_¶
Unique class labels.
- Type:
ndarray
- coef_¶
Weights of the features.
- Type:
ndarray
- intercept_¶
Intercept term.
- Type:
ndarray
Examples
>>> from endgame.models.baselines import LDAClassifier
>>> clf = LDAClassifier(shrinkage='auto')
>>> clf.fit(X_train, y_train)
>>> proba = clf.predict_proba(X_test)
Notes
LDA is different from logistic regression because:
1. LDA is generative (models P(X|y)), LR is discriminative (models P(y|X))
2. LDA assumes Gaussian class-conditional distributions
3. LDA can be more efficient with limited data
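The shrinkage='auto' option uses Ledoit-Wolf estimation, which improves the covariance estimate when n_features is large relative to n_samples (including n_features > n_samples). Note that in scikit-learn's LinearDiscriminantAnalysis, shrinkage is supported only by the ‘lsqr’ and ‘eigen’ solvers.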
- fit(X, y, **fit_params)[source]¶
Fit the LDA classifier.
- Parameters:
X (array-like of shape (n_samples, n_features)) – Training features.
y (array-like of shape (n_samples,)) – Target labels.
- Returns:
self
- property coef_¶
Feature weights.
- property intercept_¶
Intercept term.
- set_score_request(*, sample_weight='$UNCHANGED$')¶
Configure whether metadata should be requested to be passed to the score method.
Note that this method is only relevant when this estimator is used as a sub-estimator within a meta-estimator and metadata routing is enabled with enable_metadata_routing=True (see sklearn.set_config()). Please check the User Guide on how the routing mechanism works.
The options for each parameter are:
True: metadata is requested, and passed to score if provided. The request is ignored if metadata is not provided.
False: metadata is not requested and the meta-estimator will not pass it to score.
None: metadata is not requested, and the meta-estimator will raise an error if the user provides it.
str: metadata should be passed to the meta-estimator with this given alias instead of the original name.
The default (sklearn.utils.metadata_routing.UNCHANGED) retains the existing request. This allows you to change the request for some parameters and not others.
Added in version 1.3.
- Parameters:
sample_weight (str, True, False, or None, default=sklearn.utils.metadata_routing.UNCHANGED) – Metadata routing for sample_weight parameter in score.
- Returns:
self (object) – The updated object.
- class endgame.models.QDAClassifier(reg_param=0.0, store_covariance=False, tol=0.0001)[source]¶
Bases:
ClassifierMixin, BaseEstimator
Quadratic Discriminant Analysis Classifier.
QDA allows each class to have its own covariance matrix, leading to quadratic decision boundaries between classes.
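- Parameters:
reg_param (float, default=0.0) – Regularization of the per-class covariance estimates toward the identity matrix (see Notes).
store_covariance (bool, default=False) – Store the per-class covariance matrices.
tol (float, default=1e-4) – Numerical tolerance.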
- classes_¶
Unique class labels.
- Type:
ndarray
Examples
>>> from endgame.models.baselines import QDAClassifier
>>> clf = QDAClassifier(reg_param=0.1)
>>> clf.fit(X_train, y_train)
>>> proba = clf.predict_proba(X_test)
Notes
QDA is more flexible than LDA because it allows different class covariances. However, this requires estimating more parameters:
- LDA: O(d^2) for shared covariance
- QDA: O(K * d^2) for K classes
Use reg_param > 0 when you have few samples per class to regularize the covariance estimates toward the identity matrix.
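The parameter counts above can be made concrete. A symmetric d x d covariance matrix has d * (d + 1) / 2 free entries, so QDA estimates K times as many covariance parameters as LDA. The helper below is an illustrative calculation, not part of endgame.

```python
def covariance_parameter_count(n_features, n_classes, shared):
    """Free parameters in the covariance estimate(s).

    A symmetric d x d matrix has d * (d + 1) / 2 free entries.
    """
    per_matrix = n_features * (n_features + 1) // 2
    return per_matrix if shared else n_classes * per_matrix


# 20 features, 5 classes: one shared matrix (LDA) versus one per class (QDA).
lda_params = covariance_parameter_count(n_features=20, n_classes=5, shared=True)
qda_params = covariance_parameter_count(n_features=20, n_classes=5, shared=False)
print(lda_params, qda_params)  # 210 1050
```

When the number of samples per class is small relative to these counts, the per-class estimates become noisy, which is the situation where reg_param > 0 helps.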
- set_score_request(*, sample_weight='$UNCHANGED$')¶
Configure whether metadata should be requested to be passed to the score method.
Note that this method is only relevant when this estimator is used as a sub-estimator within a meta-estimator and metadata routing is enabled with enable_metadata_routing=True (see sklearn.set_config()). Please check the User Guide on how the routing mechanism works.
The options for each parameter are:
True: metadata is requested, and passed to score if provided. The request is ignored if metadata is not provided.
False: metadata is not requested and the meta-estimator will not pass it to score.
None: metadata is not requested, and the meta-estimator will raise an error if the user provides it.
str: metadata should be passed to the meta-estimator with this given alias instead of the original name.
The default (sklearn.utils.metadata_routing.UNCHANGED) retains the existing request. This allows you to change the request for some parameters and not others.
Added in version 1.3.
- Parameters:
sample_weight (str, True, False, or None, default=sklearn.utils.metadata_routing.UNCHANGED) – Metadata routing for sample_weight parameter in score.
- Returns:
self (object) – The updated object.
- class endgame.models.RDAClassifier(alpha=0.5, shrinkage=0.0, store_covariance=False)[source]¶
Bases:
ClassifierMixin, BaseEstimator
Regularized Discriminant Analysis Classifier.
RDA interpolates between LDA and QDA using a regularization parameter. This allows finding the optimal trade-off between the bias of LDA and the variance of QDA.
- Parameters:
alpha (float, default=0.5) – Interpolation parameter between LDA (alpha=1) and QDA (alpha=0). alpha=0.5 is a common middle ground.
shrinkage (float, default=0.0) – Shrinkage toward scaled identity: cov = (1-shrinkage)*cov + shrinkage*trace(cov)/d*I
store_covariance (bool, default=False) – Store the covariance matrices.
- classes_¶
Unique class labels.
- Type:
ndarray
Examples
>>> from endgame.models.baselines import RDAClassifier
>>> clf = RDAClassifier(alpha=0.5, shrinkage=0.1)
>>> clf.fit(X_train, y_train)
>>> proba = clf.predict_proba(X_test)
Notes
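RDA was proposed by Friedman (1989) to handle the bias-variance trade-off between LDA and QDA. The regularized covariance is computed in two steps. First, the class covariance is blended with the pooled covariance:
Sigma_k(alpha) = alpha * Sigma_pooled + (1 - alpha) * Sigma_k
Then the result is shrunk toward a scaled identity, where gamma is the shrinkage parameter and d is the number of features:
Sigma_k(alpha, gamma) = (1 - gamma) * Sigma_k(alpha) + gamma * trace(Sigma_k(alpha)) / d * I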
This provides a continuous family of classifiers that can adapt to the complexity supported by the data.
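The two regularization steps can be sketched on small nested lists: blend the class covariance with the pooled covariance using alpha, then shrink toward a scaled identity using the shrinkage formula from the parameter description. The function name rda_covariance and the example matrices are illustrative assumptions, not endgame's implementation.

```python
def rda_covariance(sigma_k, sigma_pooled, alpha=0.5, shrinkage=0.0):
    """Blend class and pooled covariance, then shrink toward scaled identity."""
    d = len(sigma_k)
    # Step 1: alpha = 1 gives the pooled (LDA) matrix, alpha = 0 the class (QDA) matrix.
    blended = [
        [alpha * sigma_pooled[i][j] + (1.0 - alpha) * sigma_k[i][j] for j in range(d)]
        for i in range(d)
    ]
    # Step 2: cov = (1 - shrinkage) * cov + shrinkage * trace(cov) / d * I
    scale = sum(blended[i][i] for i in range(d)) / d
    return [
        [
            (1.0 - shrinkage) * blended[i][j] + (shrinkage * scale if i == j else 0.0)
            for j in range(d)
        ]
        for i in range(d)
    ]


sigma_k = [[2.0, 0.8], [0.8, 1.0]]
sigma_pooled = [[1.0, 0.2], [0.2, 1.0]]
cov = rda_covariance(sigma_k, sigma_pooled, alpha=0.5, shrinkage=0.1)
```

Shrinkage scales down the off-diagonal entries and pulls the diagonal toward the average variance, which keeps the matrix well conditioned when a class has few samples.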
- set_score_request(*, sample_weight='$UNCHANGED$')¶
Configure whether metadata should be requested to be passed to the score method.
Note that this method is only relevant when this estimator is used as a sub-estimator within a meta-estimator and metadata routing is enabled with enable_metadata_routing=True (see sklearn.set_config()). Please check the User Guide on how the routing mechanism works.
The options for each parameter are:
True: metadata is requested, and passed to score if provided. The request is ignored if metadata is not provided.
False: metadata is not requested and the meta-estimator will not pass it to score.
None: metadata is not requested, and the meta-estimator will raise an error if the user provides it.
str: metadata should be passed to the meta-estimator with this given alias instead of the original name.
The default (sklearn.utils.metadata_routing.UNCHANGED) retains the existing request. This allows you to change the request for some parameters and not others.
Added in version 1.3.
- Parameters:
sample_weight (str, True, False, or None, default=sklearn.utils.metadata_routing.UNCHANGED) – Metadata routing for sample_weight parameter in score.
- Returns:
self (object) – The updated object.
- class endgame.models.KNNClassifier(n_neighbors=5, weights='distance', metric='minkowski', p=2, leaf_size=30, algorithm='auto', scale_features=True, n_jobs=-1)[source]¶
Bases: ClassifierMixin, BaseEstimator
K-Nearest Neighbors Classifier with competition-tuned defaults.
A wrapper around sklearn’s KNeighborsClassifier with automatic feature scaling and sensible defaults for competitive ML.
- Parameters:
n_neighbors (int, default=5) – Number of neighbors to use.
weights (str, default='distance') – Weight function: ‘uniform’ or ‘distance’. ‘distance’ often works better in practice.
metric (str, default='minkowski') – Distance metric: ‘minkowski’, ‘euclidean’, ‘manhattan’, ‘cosine’, etc.
p (int, default=2) – Power parameter for Minkowski metric. p=2 is Euclidean, p=1 is Manhattan.
leaf_size (int, default=30) – Leaf size for BallTree or KDTree.
algorithm (str, default='auto') – Algorithm: ‘auto’, ‘ball_tree’, ‘kd_tree’, ‘brute’.
scale_features (bool, default=True) – Whether to standardize features before fitting. Highly recommended for distance-based methods.
n_jobs (int, default=-1) – Number of parallel jobs.
- classes_¶
Unique class labels.
- Type:
ndarray
Examples
>>> from endgame.models.baselines import KNNClassifier
>>> clf = KNNClassifier(n_neighbors=5, weights='distance')
>>> clf.fit(X_train, y_train)
>>> proba = clf.predict_proba(X_test)
Notes
KNN is different from other models because:
1. Instance-based - stores training data, no explicit model
2. Non-parametric - makes no assumptions about data distribution
3. Local decision boundaries - can capture complex patterns
4. Sensitive to the curse of dimensionality in high dimensions
The scale_features=True default is important because KNN relies on distance calculations that can be dominated by features with larger scales.
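A small NumPy sketch of why scaling matters for distance-based methods. The data and the `nearest` helper are made up for illustration and do not come from the library; the standardization here uses the training mean and standard deviation, which is an assumption about what scale_features does.

```python
import numpy as np

# Two features on very different scales: income (dollars) and age (years).
X = np.array([
    [50_000.0, 25.0],
    [50_500.0, 60.0],
    [52_000.0, 26.0],
])
query = np.array([50_400.0, 27.0])

def nearest(X, q):
    """Index of the row of X closest to q in Euclidean distance."""
    return int(np.argmin(np.linalg.norm(X - q, axis=1)))

# On raw features the income column dominates the distance.
raw_nn = nearest(X, query)

# Standardize both the training rows and the query with training statistics.
mean, std = X.mean(axis=0), X.std(axis=0)
scaled_nn = nearest((X - mean) / std, (query - mean) / std)
```

On the raw data the nearest neighbor is the 60-year-old (row 1), because a 100-dollar income gap outweighs a 33-year age gap. After standardization the nearest neighbor becomes row 0, which is close on both features.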
- fit(X, y, **fit_params)[source]¶
Fit the KNN classifier.
- Parameters:
X (array-like of shape (n_samples, n_features)) – Training features.
y (array-like of shape (n_samples,)) – Target labels.
- Return type:
- Returns:
self
- kneighbors(X=None, n_neighbors=None, return_distance=True)[source]¶
Find the K-neighbors of a point.
- Parameters:
- Returns:
neigh_dist (ndarray (if return_distance=True)) – Distances to neighbors.
neigh_ind (ndarray) – Indices of neighbors.
- set_score_request(*, sample_weight='$UNCHANGED$')¶
Configure whether metadata should be requested to be passed to the score method.
Note that this method is only relevant when this estimator is used as a sub-estimator within a meta-estimator and metadata routing is enabled with enable_metadata_routing=True (see sklearn.set_config()). Please check the User Guide on how the routing mechanism works.
The options for each parameter are:
True: metadata is requested, and passed to score if provided. The request is ignored if metadata is not provided.
False: metadata is not requested and the meta-estimator will not pass it to score.
None: metadata is not requested, and the meta-estimator will raise an error if the user provides it.
str: metadata should be passed to the meta-estimator with this given alias instead of the original name.
The default (sklearn.utils.metadata_routing.UNCHANGED) retains the existing request. This allows you to change the request for some parameters and not others.
Added in version 1.3.
- Parameters:
sample_weight (str, True, False, or None, default=sklearn.utils.metadata_routing.UNCHANGED) – Metadata routing for sample_weight parameter in score.
self (KNNClassifier)
- Returns:
self (object) – The updated object.
- Return type:
- class endgame.models.KNNRegressor(n_neighbors=5, weights='distance', metric='minkowski', p=2, leaf_size=30, algorithm='auto', scale_features=True, n_jobs=-1)[source]¶
Bases: RegressorMixin, BaseEstimator
K-Nearest Neighbors Regressor with competition-tuned defaults.
A wrapper around sklearn’s KNeighborsRegressor with automatic feature scaling and sensible defaults for competitive ML.
- Parameters:
n_neighbors (int, default=5) – Number of neighbors to use.
weights (str, default='distance') – Weight function: ‘uniform’ or ‘distance’.
metric (str, default='minkowski') – Distance metric.
p (int, default=2) – Power parameter for Minkowski metric.
leaf_size (int, default=30) – Leaf size for BallTree or KDTree.
algorithm (str, default='auto') – Algorithm: ‘auto’, ‘ball_tree’, ‘kd_tree’, ‘brute’.
scale_features (bool, default=True) – Whether to standardize features before fitting.
n_jobs (int, default=-1) – Number of parallel jobs.
Examples
>>> from endgame.models.baselines import KNNRegressor
>>> reg = KNNRegressor(n_neighbors=10, weights='distance')
>>> reg.fit(X_train, y_train)
>>> predictions = reg.predict(X_test)
Notes
KNN regression averages (or weighted-averages) the target values of the k nearest neighbors. This provides a local, non-parametric estimate that can capture complex patterns but may suffer from the curse of dimensionality.
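The averaging described above can be written out directly. The following is a minimal brute-force sketch, assuming Euclidean distance and inverse-distance weights; `knn_predict` is a hypothetical helper, and the handling of exact matches (zero distance) is a choice made for this example.

```python
import numpy as np

def knn_predict(X_train, y_train, x, k=3, weights="distance"):
    """Predict one target as the (weighted) mean of the k nearest targets."""
    dists = np.linalg.norm(X_train - x, axis=1)
    idx = np.argsort(dists)[:k]
    if weights == "uniform":
        return float(y_train[idx].mean())
    d = dists[idx]
    if np.any(d == 0):
        # Exact matches would get infinite weight; average only those targets.
        return float(y_train[idx][d == 0].mean())
    w = 1.0 / d  # closer neighbors count more
    return float(np.sum(w * y_train[idx]) / np.sum(w))

X_train = np.array([[0.0], [1.0], [3.0], [10.0]])
y_train = np.array([0.0, 10.0, 30.0, 100.0])
x = np.array([2.0])
uniform = knn_predict(X_train, y_train, x, k=3, weights="uniform")
weighted = knn_predict(X_train, y_train, x, k=3, weights="distance")
```

For this query the three nearest targets are 0, 10 and 30. The uniform average is 40/3, while distance weighting pulls the estimate to 16.0 because the two neighbors at distance 1 outweigh the one at distance 2.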
- fit(X, y, **fit_params)[source]¶
Fit the KNN regressor.
- Parameters:
X (array-like of shape (n_samples, n_features)) – Training features.
y (array-like of shape (n_samples,)) – Target values.
- Return type:
- Returns:
self
- kneighbors(X=None, n_neighbors=None, return_distance=True)[source]¶
Find the K-neighbors of a point.
- Parameters:
- Returns:
neigh_dist (ndarray (if return_distance=True)) – Distances to neighbors.
neigh_ind (ndarray) – Indices of neighbors.
- set_score_request(*, sample_weight='$UNCHANGED$')¶
Configure whether metadata should be requested to be passed to the score method.
Note that this method is only relevant when this estimator is used as a sub-estimator within a meta-estimator and metadata routing is enabled with enable_metadata_routing=True (see sklearn.set_config()). Please check the User Guide on how the routing mechanism works.
The options for each parameter are:
True: metadata is requested, and passed to score if provided. The request is ignored if metadata is not provided.
False: metadata is not requested and the meta-estimator will not pass it to score.
None: metadata is not requested, and the meta-estimator will raise an error if the user provides it.
str: metadata should be passed to the meta-estimator with this given alias instead of the original name.
The default (sklearn.utils.metadata_routing.UNCHANGED) retains the existing request. This allows you to change the request for some parameters and not others.
Added in version 1.3.
- Parameters:
sample_weight (str, True, False, or None, default=sklearn.utils.metadata_routing.UNCHANGED) – Metadata routing for sample_weight parameter in score.
self (KNNRegressor)
- Returns:
self (object) – The updated object.
- Return type:
- class endgame.models.LinearClassifier(penalty='l2', C=1.0, l1_ratio=0.5, solver='lbfgs', max_iter=1000, class_weight='balanced', scale_features=True, n_jobs=-1, random_state=None)[source]¶
Bases: ClassifierMixin, BaseEstimator
Linear Classifier with competition-tuned defaults.
Wraps LogisticRegression with automatic feature scaling and sensible defaults for competitive ML. Supports L1, L2, and ElasticNet regularization.
- Parameters:
penalty (str, default='l2') – Regularization: ‘l1’, ‘l2’, ‘elasticnet’, or ‘none’.
C (float, default=1.0) – Inverse of regularization strength. Smaller values = stronger regularization.
l1_ratio (float, default=0.5) – ElasticNet mixing parameter (only used when penalty=’elasticnet’).
solver (str, default='lbfgs') – Optimization algorithm. L1 and ElasticNet penalties need a compatible solver such as ‘saga’ (in scikit-learn, ‘lbfgs’ supports only L2 or no penalty).
max_iter (int, default=1000) – Maximum iterations for solver.
class_weight (str or dict, default='balanced') – Class weights: ‘balanced’ adjusts for class imbalance.
scale_features (bool, default=True) – Whether to standardize features before fitting.
n_jobs (int, default=-1) – Number of parallel jobs.
random_state (int, optional) – Random seed for reproducibility.
- classes_¶
Unique class labels.
- Type:
ndarray
- coef_¶
Feature coefficients.
- Type:
ndarray
- intercept_¶
Intercept term.
- Type:
ndarray
Examples
>>> from endgame.models.baselines import LinearClassifier
>>> clf = LinearClassifier(penalty='l2', C=1.0)
>>> clf.fit(X_train, y_train)
>>> proba = clf.predict_proba(X_test)
Notes
Linear classifiers are different from tree-based models because:
1. Global decision boundary - same coefficients for all regions
2. Monotonic feature relationships
3. Implicit feature selection with L1 penalty
4. Well-calibrated probabilities (especially with Platt scaling)
The class_weight=’balanced’ default helps with imbalanced datasets.
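For reference, scikit-learn's 'balanced' heuristic sets each class weight to n_samples / (n_classes * count of that class). The sketch below reproduces that arithmetic; `balanced_class_weights` is a hypothetical helper written for this example, and it assumes the wrapper passes class_weight through to LogisticRegression unchanged.

```python
import numpy as np

def balanced_class_weights(y):
    """Weight each class inversely to its frequency."""
    classes, counts = np.unique(y, return_counts=True)
    weights = len(y) / (len(classes) * counts)
    return dict(zip(classes.tolist(), weights.tolist()))

# An 80/20 imbalanced binary target.
y = np.array([0, 0, 0, 0, 0, 0, 0, 0, 1, 1])
weights = balanced_class_weights(y)
```

Here the majority class gets weight 0.625 and the minority class gets 2.5, so each class contributes the same total weight (5.0) to the loss.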
- fit(X, y, sample_weight=None, **fit_params)[source]¶
Fit the linear classifier.
- Parameters:
X (array-like of shape (n_samples, n_features)) – Training features.
y (array-like of shape (n_samples,)) – Target labels.
sample_weight (array-like, optional) – Sample weights.
- Return type:
- Returns:
self
- property coef_¶
Feature coefficients.
- property intercept_¶
Intercept term.
- set_fit_request(*, sample_weight='$UNCHANGED$')¶
Configure whether metadata should be requested to be passed to the fit method.
Note that this method is only relevant when this estimator is used as a sub-estimator within a meta-estimator and metadata routing is enabled with enable_metadata_routing=True (see sklearn.set_config()). Please check the User Guide on how the routing mechanism works.
The options for each parameter are:
True: metadata is requested, and passed to fit if provided. The request is ignored if metadata is not provided.
False: metadata is not requested and the meta-estimator will not pass it to fit.
None: metadata is not requested, and the meta-estimator will raise an error if the user provides it.
str: metadata should be passed to the meta-estimator with this given alias instead of the original name.
The default (sklearn.utils.metadata_routing.UNCHANGED) retains the existing request. This allows you to change the request for some parameters and not others.
Added in version 1.3.
- Parameters:
sample_weight (str, True, False, or None, default=sklearn.utils.metadata_routing.UNCHANGED) – Metadata routing for sample_weight parameter in fit.
self (LinearClassifier)
- Returns:
self (object) – The updated object.
- Return type:
- set_score_request(*, sample_weight='$UNCHANGED$')¶
Configure whether metadata should be requested to be passed to the score method.
Note that this method is only relevant when this estimator is used as a sub-estimator within a meta-estimator and metadata routing is enabled with enable_metadata_routing=True (see sklearn.set_config()). Please check the User Guide on how the routing mechanism works.
The options for each parameter are:
True: metadata is requested, and passed to score if provided. The request is ignored if metadata is not provided.
False: metadata is not requested and the meta-estimator will not pass it to score.
None: metadata is not requested, and the meta-estimator will raise an error if the user provides it.
str: metadata should be passed to the meta-estimator with this given alias instead of the original name.
The default (sklearn.utils.metadata_routing.UNCHANGED) retains the existing request. This allows you to change the request for some parameters and not others.
Added in version 1.3.
- Parameters:
sample_weight (str, True, False, or None, default=sklearn.utils.metadata_routing.UNCHANGED) – Metadata routing for sample_weight parameter in score.
self (LinearClassifier)
- Returns:
self (object) – The updated object.
- Return type:
- class endgame.models.LinearRegressor(penalty='l2', alpha=1.0, l1_ratio=0.5, max_iter=1000, scale_features=True, random_state=None)[source]¶
Bases: RegressorMixin, BaseEstimator
Linear Regressor with competition-tuned defaults.
Wraps Ridge/Lasso/ElasticNet with automatic feature scaling and sensible defaults for competitive ML.
- Parameters:
penalty (str, default='l2') – Regularization: ‘l1’ (Lasso), ‘l2’ (Ridge), ‘elasticnet’.
alpha (float, default=1.0) – Regularization strength. Larger values = stronger regularization.
l1_ratio (float, default=0.5) – ElasticNet mixing parameter (only used when penalty=’elasticnet’).
max_iter (int, default=1000) – Maximum iterations for solver (only for L1/ElasticNet).
scale_features (bool, default=True) – Whether to standardize features before fitting.
random_state (int, optional) – Random seed for reproducibility.
- coef_¶
Feature coefficients.
- Type:
ndarray
Examples
>>> from endgame.models.baselines import LinearRegressor
>>> reg = LinearRegressor(penalty='l2', alpha=1.0)
>>> reg.fit(X_train, y_train)
>>> predictions = reg.predict(X_test)
Notes
Linear regression provides:
1. Interpretable coefficients
2. Fast training and inference
3. L1 penalty for feature selection
4. L2 penalty for multicollinearity
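To illustrate the multicollinearity point, here is a closed-form ridge fit in NumPy on two perfectly collinear features. It is a sketch of the textbook ridge estimator, not the wrapper's code: `ridge_fit` is a hypothetical helper, and it centers the data but skips the feature standardization that scale_features=True would apply.

```python
import numpy as np

def ridge_fit(X, y, alpha=1.0):
    """Closed-form ridge regression on centered data; returns (coef, intercept)."""
    x_mean, y_mean = X.mean(axis=0), y.mean()
    Xc, yc = X - x_mean, y - y_mean
    d = X.shape[1]
    # The alpha * I term makes the system solvable even when Xc.T @ Xc is singular.
    coef = np.linalg.solve(Xc.T @ Xc + alpha * np.eye(d), Xc.T @ yc)
    return coef, y_mean - x_mean @ coef

# Two identical columns: ordinary least squares has no unique solution here.
x1 = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
X = np.column_stack([x1, x1])
y = 2.0 * x1 + 1.0
coef, intercept = ridge_fit(X, y, alpha=1.0)
```

The L2 penalty splits the effect evenly across the duplicated columns (each coefficient is 20/21), and their sum is slightly below the true slope of 2 because of shrinkage.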
- fit(X, y, sample_weight=None, **fit_params)[source]¶
Fit the linear regressor.
- Parameters:
X (array-like of shape (n_samples, n_features)) – Training features.
y (array-like of shape (n_samples,)) – Target values.
sample_weight (array-like, optional) – Sample weights.
- Return type:
- Returns:
self
- property coef_¶
Feature coefficients.
- property intercept_¶
Intercept term.
- set_fit_request(*, sample_weight='$UNCHANGED$')¶
Configure whether metadata should be requested to be passed to the fit method.
Note that this method is only relevant when this estimator is used as a sub-estimator within a meta-estimator and metadata routing is enabled with enable_metadata_routing=True (see sklearn.set_config()). Please check the User Guide on how the routing mechanism works.
The options for each parameter are:
True: metadata is requested, and passed to fit if provided. The request is ignored if metadata is not provided.
False: metadata is not requested and the meta-estimator will not pass it to fit.
None: metadata is not requested, and the meta-estimator will raise an error if the user provides it.
str: metadata should be passed to the meta-estimator with this given alias instead of the original name.
The default (sklearn.utils.metadata_routing.UNCHANGED) retains the existing request. This allows you to change the request for some parameters and not others.
Added in version 1.3.
- Parameters:
sample_weight (str, True, False, or None, default=sklearn.utils.metadata_routing.UNCHANGED) – Metadata routing for sample_weight parameter in fit.
self (LinearRegressor)
- Returns:
self (object) – The updated object.
- Return type:
- set_score_request(*, sample_weight='$UNCHANGED$')¶
Configure whether metadata should be requested to be passed to the score method.
Note that this method is only relevant when this estimator is used as a sub-estimator within a meta-estimator and metadata routing is enabled with enable_metadata_routing=True (see sklearn.set_config()). Please check the User Guide on how the routing mechanism works.
The options for each parameter are:
True: metadata is requested, and passed to score if provided. The request is ignored if metadata is not provided.
False: metadata is not requested and the meta-estimator will not pass it to score.
None: metadata is not requested, and the meta-estimator will raise an error if the user provides it.
str: metadata should be passed to the meta-estimator with this given alias instead of the original name.
The default (sklearn.utils.metadata_routing.UNCHANGED) retains the existing request. This allows you to change the request for some parameters and not others.
Added in version 1.3.
- Parameters:
sample_weight (str, True, False, or None, default=sklearn.utils.metadata_routing.UNCHANGED) – Metadata routing for sample_weight parameter in score.
self (LinearRegressor)
- Returns:
self (object) – The updated object.
- Return type:
- class endgame.models.PRIMClassifier(alpha=0.05, min_support=20, pasting=True, paste_alpha=0.01, n_boxes=1)[source]¶
Bases: ClassifierMixin, BaseEstimator
PRIM for classification via one-vs-rest subgroup discovery.
Trains a PRIM regressor per class (one-vs-rest) on the binary indicator for each class. At prediction time, the class whose box gives the highest density for a sample wins; samples not in any box are assigned the majority class.
- Parameters:
alpha (float, default=0.05) – Peeling fraction.
min_support (int or float, default=20) – Minimum number of points in a box.
pasting (bool, default=True) – Whether to apply pasting after peeling.
paste_alpha (float, default=0.01) – Pasting fraction.
n_boxes (int, default=1) – Number of boxes to find per class.
Examples
>>> from endgame.models.subgroup import PRIMClassifier
>>> prim = PRIMClassifier(alpha=0.05)
>>> prim.fit(X, y)
>>> preds = prim.predict(X)
>>> print(prim.get_rules())
- fit(X, y, feature_names=None)[source]¶
Fit one PRIM model per class (one-vs-rest).
- Parameters:
X (array-like of shape (n_samples, n_features)) – Training features.
y (array-like of shape (n_samples,)) – Target labels.
feature_names (list of str, optional) – Names of features for interpretable output.
- Return type:
- Returns:
self
- predict_proba(X)[source]¶
Estimate class probabilities based on box densities.
For each sample, the probability for class c is the density of the best box that contains it, or the base rate if no box contains it. Probabilities are row-normalised.
- Parameters:
X (array-like of shape (n_samples, n_features)) – Data points.
- Return type:
- Returns:
proba (ndarray of shape (n_samples, n_classes)) – Class probabilities.
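The probability rule described above can be sketched as follows, assuming a single box per class (n_boxes=1) so that "best box" is simply that class's box. The function name and the inputs `in_box`, `box_density`, and `base_rate` are hypothetical stand-ins for quantities the fitted model would hold internally.

```python
import numpy as np

def box_probabilities(in_box, box_density, base_rate):
    """
    in_box:      bool array (n_samples, n_classes), True where a sample
                 falls inside the box found for that class.
    box_density: array (n_classes,), density of each class's box.
    base_rate:   array (n_classes,), class frequency in the training data.
    """
    in_box = np.asarray(in_box, dtype=bool)
    # Use the box density when the sample is covered, else fall back to the base rate.
    raw = np.where(in_box, box_density, base_rate)
    # Row-normalise so each sample's scores sum to 1.
    return raw / raw.sum(axis=1, keepdims=True)

in_box = np.array([[True, False], [False, False], [True, True]])
box_density = np.array([0.9, 0.8])
base_rate = np.array([0.7, 0.3])
proba = box_probabilities(in_box, box_density, base_rate)
```

The second sample lies in no box, so its probabilities are just the base rates (0.7, 0.3); the first sample is covered only by the class-0 box and gets (0.75, 0.25) after normalisation.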
- predict(X)[source]¶
Predict class labels.
- Parameters:
X (array-like of shape (n_samples, n_features)) – Data points.
- Return type:
- Returns:
labels (ndarray of shape (n_samples,)) – Predicted class labels.
- score(X, y)[source]¶
Classification accuracy.
- Parameters:
X (array-like) – Features.
y (array-like) – Target labels.
- Return type:
- Returns:
score (float) – Accuracy.
- set_fit_request(*, feature_names='$UNCHANGED$')¶
Configure whether metadata should be requested to be passed to the fit method.
Note that this method is only relevant when this estimator is used as a sub-estimator within a meta-estimator and metadata routing is enabled with enable_metadata_routing=True (see sklearn.set_config()). Please check the User Guide on how the routing mechanism works.
The options for each parameter are:
True: metadata is requested, and passed to fit if provided. The request is ignored if metadata is not provided.
False: metadata is not requested and the meta-estimator will not pass it to fit.
None: metadata is not requested, and the meta-estimator will raise an error if the user provides it.
str: metadata should be passed to the meta-estimator with this given alias instead of the original name.
The default (sklearn.utils.metadata_routing.UNCHANGED) retains the existing request. This allows you to change the request for some parameters and not others.
Added in version 1.3.
- Parameters:
feature_names (str, True, False, or None, default=sklearn.utils.metadata_routing.UNCHANGED) – Metadata routing for feature_names parameter in fit.
self (PRIMClassifier)
- Returns:
self (object) – The updated object.
- Return type:
- class endgame.models.PRIMRegressor(alpha=0.05, threshold_type='quantile', threshold=0.9, min_support=20, pasting=True, paste_alpha=0.01, n_boxes=1)[source]¶
Bases: RegressorMixin, BaseEstimator
PRIM (Patient Rule Induction Method) for regression/continuous targets.
Finds rectangular regions where the target variable has unusually high mean values. Uses iterative peeling to shrink boxes while increasing target density.
- Parameters:
alpha (float, default=0.05) – Peeling fraction - proportion of data removed in each peel. Smaller values = more “patient” peeling.
threshold_type (str, default='quantile') – How to define “interesting” regions: ‘quantile’ or ‘absolute’.
threshold (float, default=0.9) – Threshold for defining interesting regions. If ‘quantile’, fraction of top values to consider interesting.
min_support (int or float, default=20) – Minimum number of points in a box. If float, interpreted as fraction.
pasting (bool, default=True) – Whether to apply pasting (box expansion) after peeling.
paste_alpha (float, default=0.01) – Pasting fraction for box expansion.
n_boxes (int, default=1) – Number of boxes to find (sequential covering).
- result_¶
Full PRIM analysis result.
- Type:
- feature_names_in_¶
Names of features.
- Type:
ndarray
Examples
>>> from endgame.models.subgroup import PRIMRegressor
>>> prim = PRIMRegressor(alpha=0.05, min_support=30)
>>> prim.fit(X, y)
>>> print(prim.boxes_[0].to_rules())
>>> mask = prim.predict_membership(X)  # Boolean mask of points in box
Notes
PRIM works best when:
1. You’re looking for interpretable subgroups
2. The target has heterogeneous behavior across the feature space
3. You want rectangular (axis-aligned) regions
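A single peeling step can be sketched as below. This is a simplified illustration of the idea, not the library's algorithm: `peel_once` is a hypothetical helper, it handles only numeric features, and it omits min_support checks, pasting, and the peeling trajectory.

```python
import numpy as np

def peel_once(X, y, alpha=0.05):
    """
    Try trimming the lowest or highest alpha fraction along each feature
    and keep the trim that leaves the highest mean target.
    Returns (mean_after, keep_mask, feature_index, side), or None if no
    candidate peel removes some but not all points.
    """
    best = None
    for j in range(X.shape[1]):
        lo, hi = np.quantile(X[:, j], [alpha, 1.0 - alpha])
        for side, keep in (("low", X[:, j] >= lo), ("high", X[:, j] <= hi)):
            if keep.all() or not keep.any():
                continue  # this candidate removes nothing or everything
            mean = y[keep].mean()
            if best is None or mean > best[0]:
                best = (mean, keep, j, side)
    return best

rng = np.random.default_rng(0)
X = rng.uniform(0.0, 1.0, size=(200, 2))
y = (X[:, 0] > 0.5).astype(float)  # target is high only where feature 0 is large
mean_after, keep, feature, side = peel_once(X, y, alpha=0.1)
```

On this synthetic data the chosen peel trims the low end of feature 0, since that removes only points with a target of 0. Repeating the step on the surviving points shrinks the box gradually, which is where the "patient" in the method's name comes from.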
- fit(X, y, feature_names=None)[source]¶
Fit PRIM to find high-density regions.
- Parameters:
X (array-like of shape (n_samples, n_features)) – Training features.
y (array-like of shape (n_samples,)) – Target values (higher = more interesting).
feature_names (list of str, optional) – Names of features for interpretable output.
- Return type:
- Returns:
self
- predict(X)[source]¶
Predict target values based on box membership.
Points inside a box get that box’s mean target density. Points outside all boxes get the global training mean.
- Parameters:
X (array-like of shape (n_samples, n_features)) – Data points.
- Return type:
- Returns:
predictions (ndarray of shape (n_samples,)) – Predicted target values.
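The prediction rule above amounts to a lookup by box membership. In the sketch below, `predict_from_boxes` is a hypothetical helper, and giving earlier boxes precedence when a point falls in several boxes is an assumption of this example rather than documented behavior.

```python
import numpy as np

def predict_from_boxes(box_masks, box_means, global_mean):
    """
    box_masks: list of boolean arrays (one per box, in discovery order).
    box_means: mean target inside each box.
    Points in no box receive the global training mean.
    """
    n = len(box_masks[0])
    pred = np.full(n, global_mean, dtype=float)
    assigned = np.zeros(n, dtype=bool)
    for mask, mean in zip(box_masks, box_means):
        take = mask & ~assigned  # only points not claimed by an earlier box
        pred[take] = mean
        assigned |= take
    return pred

masks = [
    np.array([True, False, False, True]),
    np.array([True, True, False, False]),
]
pred = predict_from_boxes(masks, box_means=[5.0, 3.0], global_mean=1.0)
```

The first point lies in both boxes and takes the first box's mean (5.0); the third point lies in neither and falls back to the global mean (1.0).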
- predict_membership(X)[source]¶
Predict whether points fall in the found box(es).
- Parameters:
X (array-like of shape (n_samples, n_features)) – Data points.
- Return type:
- Returns:
mask (ndarray of shape (n_samples,)) – Boolean mask, True if point is in any box.
- score(X, y)[source]¶
Score the model: mean target value in predicted boxes, relative to the overall mean.
- Parameters:
X (array-like) – Features.
y (array-like) – Target values.
- Return type:
- Returns:
score (float) – Mean target value in boxes minus overall mean.
- set_fit_request(*, feature_names='$UNCHANGED$')¶
Configure whether metadata should be requested to be passed to the fit method.
Note that this method is only relevant when this estimator is used as a sub-estimator within a meta-estimator and metadata routing is enabled with enable_metadata_routing=True (see sklearn.set_config()). Please check the User Guide on how the routing mechanism works.
The options for each parameter are:
True: metadata is requested, and passed to fit if provided. The request is ignored if metadata is not provided.
False: metadata is not requested and the meta-estimator will not pass it to fit.
None: metadata is not requested, and the meta-estimator will raise an error if the user provides it.
str: metadata should be passed to the meta-estimator with this given alias instead of the original name.
The default (sklearn.utils.metadata_routing.UNCHANGED) retains the existing request. This allows you to change the request for some parameters and not others.
Added in version 1.3.
- Parameters:
feature_names (str, True, False, or None, default=sklearn.utils.metadata_routing.UNCHANGED) – Metadata routing for feature_names parameter in fit.
self (PRIMRegressor)
- Returns:
self (object) – The updated object.
- Return type:
- class endgame.models.Box(limits=<factory>, coverage=1.0, density=0.0, support=0)[source]¶
Bases: object
A rectangular region (box) in feature space.
- contains(X)[source]¶
Check which points are inside the box.
- Parameters:
X (ndarray of shape (n_samples, n_features)) – Data points to check.
- Return type:
- Returns:
mask (ndarray of shape (n_samples,)) – Boolean mask, True if point is inside box.
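A membership check for an axis-aligned box might look like the following. The layout of `limits` (a dict mapping feature index to an inclusive `(lower, upper)` pair) is an assumption made for this sketch; the actual structure of Box.limits is not specified in this section.

```python
import numpy as np

def contains(limits, X):
    """
    limits: dict mapping feature index -> (lower, upper), bounds inclusive.
            Features without an entry are unconstrained.
    Returns a boolean mask, True where a row of X lies inside the box.
    """
    X = np.asarray(X, dtype=float)
    mask = np.ones(X.shape[0], dtype=bool)
    for j, (lower, upper) in limits.items():
        mask &= (X[:, j] >= lower) & (X[:, j] <= upper)
    return mask

# Feature 0 bounded on both sides; feature 1 bounded only from above.
limits = {0: (0.2, 0.8), 1: (-np.inf, 5.0)}
X = np.array([[0.5, 1.0], [0.9, 1.0], [0.3, 7.0], [0.2, 5.0]])
mask = contains(limits, X)
```

Only the first and last rows satisfy every bound; the last row sits exactly on two boundaries and is counted as inside because the bounds are inclusive here.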
- class endgame.models.PRIMResult(boxes=<factory>, peeling_trajectory=<factory>, selected_box=None, selected_idx=-1)[source]¶
Bases: object
Result of PRIM analysis.
- Parameters:
- peeling_trajectory¶
Statistics at each peeling step.
- Type:
List[Dict]
- class endgame.models.OrdinalClassifier(variant='auto', alpha=1.0, max_iter=1000, auto_scale=True, random_state=None)[source]¶
Bases: ClassifierMixin, BaseEstimator
Unified Ordinal Regression Classifier with auto-variant selection.
Wraps mord library ordinal regression methods with automatic model selection based on data characteristics.
Ordinal regression is critical for ordered categorical targets where standard classification ignores the ordering (e.g., rating prediction, grade classification, severity levels).
- Parameters:
variant (str, default='auto') – Ordinal regression variant:
- ‘auto’: Automatically select based on data
- ‘at’: All-Threshold (LogisticAT) - most common
- ‘it’: Immediate-Threshold (LogisticIT)
- ‘se’: Squared-Error variant (LogisticSE)
- ‘lad’: Least Absolute Deviation
- ‘ridge’: Ordinal Ridge regression
alpha (float, default=1.0) – Regularization strength (inverse of C for logistic models, regularization strength for Ridge/LAD).
max_iter (int, default=1000) – Maximum iterations for optimization.
auto_scale (bool, default=True) – Whether to standardize features before fitting.
random_state (int, optional) – Random seed (not used by all variants).
- classes_¶
Ordered class labels.
- Type:
ndarray
- model_¶
Fitted ordinal regression model.
- Type:
mord estimator
- coef_¶
Feature coefficients.
- Type:
ndarray
- theta_¶
Class thresholds (boundaries).
- Type:
ndarray
Examples
>>> from endgame.models.ordinal import OrdinalClassifier
>>> clf = OrdinalClassifier(variant='at', alpha=1.0)
>>> clf.fit(X_train, y_train)  # y_train has ordered labels
>>> y_pred = clf.predict(X_test)
>>> proba = clf.predict_proba(X_test)
Notes
Ordinal regression assumes:
1. Target classes have a meaningful order
2. A latent continuous variable underlies the ordered categories
3. Thresholds partition this latent space into ordered categories
The cumulative model is:
P(Y <= j) = g(theta_j - X @ beta)
where g is a link function (logistic, probit, etc.).
- fit(X, y, sample_weight=None, **fit_params)[source]¶
Fit the ordinal regression model.
- Parameters:
X (array-like of shape (n_samples, n_features)) – Training features.
y (array-like of shape (n_samples,)) – Ordered target labels. Labels should be integers 0, 1, 2, … or will be encoded to integers preserving order.
sample_weight (array-like, optional) – Not supported by mord, ignored.
- Return type:
- Returns:
self
- predict(X)[source]¶
Predict ordinal class labels.
- Parameters:
X (array-like of shape (n_samples, n_features)) – Samples to predict.
- Return type:
- Returns:
y_pred (ndarray of shape (n_samples,)) – Predicted class labels.
- predict_proba(X)[source]¶
Predict class probabilities.
For ordinal regression, probabilities are derived from the cumulative model:
P(Y = j) = P(Y <= j) - P(Y <= j-1)
- Parameters:
X (array-like of shape (n_samples, n_features)) – Samples to predict.
- Return type:
- Returns:
proba (ndarray of shape (n_samples, n_classes)) – Class probabilities.
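The differencing of cumulative probabilities can be sketched in NumPy, assuming a logistic link and increasing thresholds. `ordinal_proba`, `beta`, and `theta` are illustrative names for this example; they mirror the symbols in the cumulative model but are not the library's internals.

```python
import numpy as np

def ordinal_proba(X, beta, theta):
    """
    Class probabilities for a cumulative-link model with a logistic link.
    theta: increasing thresholds, length n_classes - 1.
    """
    latent = np.asarray(X, dtype=float) @ np.asarray(beta, dtype=float)
    z = np.asarray(theta, dtype=float)[None, :] - latent[:, None]
    cum = 1.0 / (1.0 + np.exp(-z))  # P(Y <= j) for j = 0 .. n_classes - 2
    n = cum.shape[0]
    # Pad with P(Y <= -1) = 0 and P(Y <= last class) = 1, then difference.
    cum = np.hstack([np.zeros((n, 1)), cum, np.ones((n, 1))])
    return np.diff(cum, axis=1)

X = np.array([[0.0], [2.0]])
beta = np.array([1.0])
theta = np.array([-1.0, 1.0])  # two thresholds, so three ordered classes
proba = ordinal_proba(X, beta, theta)
```

The first sample has a latent value of 0, midway between the thresholds, so the middle class is most likely. The second sample's latent value of 2 lies above both thresholds, which shifts most of the mass to the highest class.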
- set_fit_request(*, sample_weight='$UNCHANGED$')¶
Configure whether metadata should be requested to be passed to the fit method.
Note that this method is only relevant when this estimator is used as a sub-estimator within a meta-estimator and metadata routing is enabled with enable_metadata_routing=True (see sklearn.set_config()). Please check the User Guide on how the routing mechanism works.
The options for each parameter are:
True: metadata is requested, and passed to fit if provided. The request is ignored if metadata is not provided.
False: metadata is not requested and the meta-estimator will not pass it to fit.
None: metadata is not requested, and the meta-estimator will raise an error if the user provides it.
str: metadata should be passed to the meta-estimator with this given alias instead of the original name.
The default (sklearn.utils.metadata_routing.UNCHANGED) retains the existing request. This allows you to change the request for some parameters and not others.
Added in version 1.3.
- Parameters:
sample_weight (str, True, False, or None, default=sklearn.utils.metadata_routing.UNCHANGED) – Metadata routing for sample_weight parameter in fit.
self (OrdinalClassifier)
- Returns:
self (object) – The updated object.
- Return type:
- set_score_request(*, sample_weight='$UNCHANGED$')¶
Configure whether metadata should be requested to be passed to the score method.
Note that this method is only relevant when this estimator is used as a sub-estimator within a meta-estimator and metadata routing is enabled with enable_metadata_routing=True (see sklearn.set_config()). Please check the User Guide on how the routing mechanism works.
The options for each parameter are:
True: metadata is requested, and passed to score if provided. The request is ignored if metadata is not provided.
False: metadata is not requested and the meta-estimator will not pass it to score.
None: metadata is not requested, and the meta-estimator will raise an error if the user provides it.
str: metadata should be passed to the meta-estimator with this given alias instead of the original name.
The default (sklearn.utils.metadata_routing.UNCHANGED) retains the existing request. This allows you to change the request for some parameters and not others.
Added in version 1.3.
- Parameters:
sample_weight (str, True, False, or None, default=sklearn.utils.metadata_routing.UNCHANGED) – Metadata routing for sample_weight parameter in score.
self (OrdinalClassifier)
- Returns:
self (object) – The updated object.
- Return type:
- class endgame.models.OrdinalRidge(alpha=1.0, max_iter=1000, auto_scale=True, random_state=None)[source]¶
Bases: OrdinalClassifier
Ordinal Ridge Regression.
Ridge regression for ordinal targets. Uses L2 regularization. Good for smaller datasets and many ordinal classes.
- Parameters:
Examples
>>> from endgame.models.ordinal import OrdinalRidge
>>> clf = OrdinalRidge(alpha=1.0)
>>> clf.fit(X_train, y_train)
>>> y_pred = clf.predict(X_test)
- set_fit_request(*, sample_weight='$UNCHANGED$')¶
Configure whether metadata should be requested to be passed to the fit method.
Note that this method is only relevant when this estimator is used as a sub-estimator within a meta-estimator and metadata routing is enabled with enable_metadata_routing=True (see sklearn.set_config()). Please check the User Guide on how the routing mechanism works.
The options for each parameter are:
True: metadata is requested, and passed to fit if provided. The request is ignored if metadata is not provided.
False: metadata is not requested and the meta-estimator will not pass it to fit.
None: metadata is not requested, and the meta-estimator will raise an error if the user provides it.
str: metadata should be passed to the meta-estimator with this given alias instead of the original name.
The default (sklearn.utils.metadata_routing.UNCHANGED) retains the existing request. This allows you to change the request for some parameters and not others.
Added in version 1.3.
- Parameters:
sample_weight (str, True, False, or None, default=sklearn.utils.metadata_routing.UNCHANGED) – Metadata routing for sample_weight parameter in fit.
self (OrdinalRidge)
- Returns:
self (object) – The updated object.
- Return type:
- set_score_request(*, sample_weight='$UNCHANGED$')¶
Configure whether metadata should be requested to be passed to the score method.
Note that this method is only relevant when this estimator is used as a sub-estimator within a meta-estimator and metadata routing is enabled with enable_metadata_routing=True (see sklearn.set_config()). Please check the User Guide on how the routing mechanism works.
The options for each parameter are:
True: metadata is requested, and passed to score if provided. The request is ignored if metadata is not provided.
False: metadata is not requested and the meta-estimator will not pass it to score.
None: metadata is not requested, and the meta-estimator will raise an error if the user provides it.
str: metadata should be passed to the meta-estimator with this given alias instead of the original name.
The default (sklearn.utils.metadata_routing.UNCHANGED) retains the existing request. This allows you to change the request for some parameters and not others.
Added in version 1.3.
- Parameters:
sample_weight (str, True, False, or None, default=sklearn.utils.metadata_routing.UNCHANGED) – Metadata routing for sample_weight parameter in score.
self (OrdinalRidge)
- Returns:
self (object) – The updated object.
- Return type:
- class endgame.models.LogisticAT(alpha=1.0, max_iter=1000, auto_scale=True, random_state=None)[source]¶
Bases: OrdinalClassifier
All-Threshold Ordinal Logistic Regression.
The most common ordinal regression model. Each class boundary has its own threshold parameter.
Also known as: Proportional Odds Model, Cumulative Logit Model.
- Parameters:
Examples
>>> from endgame.models.ordinal import LogisticAT
>>> clf = LogisticAT(alpha=1.0)
>>> clf.fit(X_train, y_train)
>>> proba = clf.predict_proba(X_test)
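To make the "one threshold per class boundary" idea concrete, here is a minimal sketch of how a cumulative-logit (proportional odds) model turns a linear score and a set of ordered thresholds into class probabilities. It is a textbook illustration only: cumulative_logit_proba, the score, and the threshold values are hypothetical and are not taken from the endgame implementation, which may parametrize the model differently.

```python
import math


def sigmoid(z: float) -> float:
    """Logistic function."""
    return 1.0 / (1.0 + math.exp(-z))


def cumulative_logit_proba(score: float, thresholds: list[float]) -> list[float]:
    """Class probabilities for one sample under a cumulative-logit model.

    score      -- linear predictor w.x for the sample (hypothetical value)
    thresholds -- increasing cut points, one per boundary between classes
    """
    # P(y <= k | x) = sigmoid(theta_k - score); the last cumulative value is 1.
    cumulative = [sigmoid(t - score) for t in thresholds] + [1.0]
    # Per-class probabilities are differences of consecutive cumulative values.
    proba = [cumulative[0]]
    for k in range(1, len(cumulative)):
        proba.append(cumulative[k] - cumulative[k - 1])
    return proba


# Three thresholds separate four ordered classes.
proba = cumulative_logit_proba(score=0.3, thresholds=[-1.0, 0.5, 2.0])
print(proba)
```

Because the thresholds are increasing, every class receives a positive probability and the probabilities sum to one.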
- set_fit_request(*, sample_weight='$UNCHANGED$')¶
Configure whether metadata should be requested to be passed to the fit method.
Note that this method is only relevant when this estimator is used as a sub-estimator within a meta-estimator and metadata routing is enabled with enable_metadata_routing=True (see sklearn.set_config()). Please check the User Guide on how the routing mechanism works.
The options for each parameter are:
True: metadata is requested, and passed to fit if provided. The request is ignored if metadata is not provided.
False: metadata is not requested and the meta-estimator will not pass it to fit.
None: metadata is not requested, and the meta-estimator will raise an error if the user provides it.
str: metadata should be passed to the meta-estimator with this given alias instead of the original name.
The default (sklearn.utils.metadata_routing.UNCHANGED) retains the existing request. This allows you to change the request for some parameters and not others.
Added in version 1.3.
- Parameters:
sample_weight (str, True, False, or None, default=sklearn.utils.metadata_routing.UNCHANGED) – Metadata routing for sample_weight parameter in fit.
- Returns:
self (object) – The updated object.
- set_score_request(*, sample_weight='$UNCHANGED$')¶
Configure whether metadata should be requested to be passed to the score method.
Note that this method is only relevant when this estimator is used as a sub-estimator within a meta-estimator and metadata routing is enabled with enable_metadata_routing=True (see sklearn.set_config()). Please check the User Guide on how the routing mechanism works.
The options for each parameter are:
True: metadata is requested, and passed to score if provided. The request is ignored if metadata is not provided.
False: metadata is not requested and the meta-estimator will not pass it to score.
None: metadata is not requested, and the meta-estimator will raise an error if the user provides it.
str: metadata should be passed to the meta-estimator with this given alias instead of the original name.
The default (sklearn.utils.metadata_routing.UNCHANGED) retains the existing request. This allows you to change the request for some parameters and not others.
Added in version 1.3.
- Parameters:
sample_weight (str, True, False, or None, default=sklearn.utils.metadata_routing.UNCHANGED) – Metadata routing for sample_weight parameter in score.
- Returns:
self (object) – The updated object.
- class endgame.models.LogisticIT(alpha=1.0, max_iter=1000, auto_scale=True, random_state=None)[source]¶
Bases: OrdinalClassifier
Immediate-Threshold Ordinal Logistic Regression.
Adjacent classes share threshold boundaries. More constrained than All-Threshold, which can help with small datasets.
- Parameters:
Examples
>>> from endgame.models.ordinal import LogisticIT
>>> clf = LogisticIT(alpha=1.0)
>>> clf.fit(X_train, y_train)
>>> y_pred = clf.predict(X_test)
- set_fit_request(*, sample_weight='$UNCHANGED$')¶
Configure whether metadata should be requested to be passed to the fit method.
Note that this method is only relevant when this estimator is used as a sub-estimator within a meta-estimator and metadata routing is enabled with enable_metadata_routing=True (see sklearn.set_config()). Please check the User Guide on how the routing mechanism works.
The options for each parameter are:
True: metadata is requested, and passed to fit if provided. The request is ignored if metadata is not provided.
False: metadata is not requested and the meta-estimator will not pass it to fit.
None: metadata is not requested, and the meta-estimator will raise an error if the user provides it.
str: metadata should be passed to the meta-estimator with this given alias instead of the original name.
The default (sklearn.utils.metadata_routing.UNCHANGED) retains the existing request. This allows you to change the request for some parameters and not others.
Added in version 1.3.
- Parameters:
sample_weight (str, True, False, or None, default=sklearn.utils.metadata_routing.UNCHANGED) – Metadata routing for sample_weight parameter in fit.
- Returns:
self (object) – The updated object.
- set_score_request(*, sample_weight='$UNCHANGED$')¶
Configure whether metadata should be requested to be passed to the score method.
Note that this method is only relevant when this estimator is used as a sub-estimator within a meta-estimator and metadata routing is enabled with enable_metadata_routing=True (see sklearn.set_config()). Please check the User Guide on how the routing mechanism works.
The options for each parameter are:
True: metadata is requested, and passed to score if provided. The request is ignored if metadata is not provided.
False: metadata is not requested and the meta-estimator will not pass it to score.
None: metadata is not requested, and the meta-estimator will raise an error if the user provides it.
str: metadata should be passed to the meta-estimator with this given alias instead of the original name.
The default (sklearn.utils.metadata_routing.UNCHANGED) retains the existing request. This allows you to change the request for some parameters and not others.
Added in version 1.3.
- Parameters:
sample_weight (str, True, False, or None, default=sklearn.utils.metadata_routing.UNCHANGED) – Metadata routing for sample_weight parameter in score.
- Returns:
self (object) – The updated object.
- class endgame.models.LogisticSE(alpha=1.0, max_iter=1000, auto_scale=True, random_state=None)[source]¶
Bases: OrdinalClassifier
Squared-Error Ordinal Logistic Regression.
All-Threshold variant but using squared errors in optimization. Can be more robust to outliers.
- Parameters:
Examples
>>> from endgame.models.ordinal import LogisticSE
>>> clf = LogisticSE(alpha=1.0)
>>> clf.fit(X_train, y_train)
>>> y_pred = clf.predict(X_test)
- set_fit_request(*, sample_weight='$UNCHANGED$')¶
Configure whether metadata should be requested to be passed to the fit method.
Note that this method is only relevant when this estimator is used as a sub-estimator within a meta-estimator and metadata routing is enabled with enable_metadata_routing=True (see sklearn.set_config()). Please check the User Guide on how the routing mechanism works.
The options for each parameter are:
True: metadata is requested, and passed to fit if provided. The request is ignored if metadata is not provided.
False: metadata is not requested and the meta-estimator will not pass it to fit.
None: metadata is not requested, and the meta-estimator will raise an error if the user provides it.
str: metadata should be passed to the meta-estimator with this given alias instead of the original name.
The default (sklearn.utils.metadata_routing.UNCHANGED) retains the existing request. This allows you to change the request for some parameters and not others.
Added in version 1.3.
- Parameters:
sample_weight (str, True, False, or None, default=sklearn.utils.metadata_routing.UNCHANGED) – Metadata routing for sample_weight parameter in fit.
- Returns:
self (object) – The updated object.
- set_score_request(*, sample_weight='$UNCHANGED$')¶
Configure whether metadata should be requested to be passed to the score method.
Note that this method is only relevant when this estimator is used as a sub-estimator within a meta-estimator and metadata routing is enabled with enable_metadata_routing=True (see sklearn.set_config()). Please check the User Guide on how the routing mechanism works.
The options for each parameter are:
True: metadata is requested, and passed to score if provided. The request is ignored if metadata is not provided.
False: metadata is not requested and the meta-estimator will not pass it to score.
None: metadata is not requested, and the meta-estimator will raise an error if the user provides it.
str: metadata should be passed to the meta-estimator with this given alias instead of the original name.
The default (sklearn.utils.metadata_routing.UNCHANGED) retains the existing request. This allows you to change the request for some parameters and not others.
Added in version 1.3.
- Parameters:
sample_weight (str, True, False, or None, default=sklearn.utils.metadata_routing.UNCHANGED) – Metadata routing for sample_weight parameter in score.
- Returns:
self (object) – The updated object.
- class endgame.models.LAD(alpha=1.0, max_iter=1000, auto_scale=True, random_state=None)[source]¶
Bases: OrdinalClassifier
Least Absolute Deviation Ordinal Regression.
Uses L1 loss (absolute errors) instead of L2. More robust to outliers in the target variable.
- Parameters:
Examples
>>> from endgame.models.ordinal import LAD
>>> clf = LAD(alpha=1.0)
>>> clf.fit(X_train, y_train)
>>> y_pred = clf.predict(X_test)
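The robustness claim for L1 loss can be seen on a toy problem that has nothing to do with the LAD class itself: for a constant predictor, the value minimizing total squared error is the mean, while the value minimizing total absolute error is the median. The sketch below uses made-up targets with one outlier; all names are hypothetical.

```python
import statistics

# Hypothetical targets; the last value is an outlier.
targets = [1.0, 2.0, 2.0, 3.0, 50.0]


def total_abs_error(constant: float) -> float:
    """L1 loss of predicting the same constant for every target."""
    return sum(abs(t - constant) for t in targets)


def total_sq_error(constant: float) -> float:
    """L2 loss of predicting the same constant for every target."""
    return sum((t - constant) ** 2 for t in targets)


l2_fit = statistics.mean(targets)    # minimizer of the squared error
l1_fit = statistics.median(targets)  # minimizer of the absolute error

print("L2 fit:", l2_fit, "L1 fit:", l1_fit)
```

The L2 fit is dragged toward the outlier, while the L1 fit stays with the bulk of the data.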
- set_fit_request(*, sample_weight='$UNCHANGED$')¶
Configure whether metadata should be requested to be passed to the fit method.
Note that this method is only relevant when this estimator is used as a sub-estimator within a meta-estimator and metadata routing is enabled with enable_metadata_routing=True (see sklearn.set_config()). Please check the User Guide on how the routing mechanism works.
The options for each parameter are:
True: metadata is requested, and passed to fit if provided. The request is ignored if metadata is not provided.
False: metadata is not requested and the meta-estimator will not pass it to fit.
None: metadata is not requested, and the meta-estimator will raise an error if the user provides it.
str: metadata should be passed to the meta-estimator with this given alias instead of the original name.
The default (sklearn.utils.metadata_routing.UNCHANGED) retains the existing request. This allows you to change the request for some parameters and not others.
Added in version 1.3.
- set_score_request(*, sample_weight='$UNCHANGED$')¶
Configure whether metadata should be requested to be passed to the score method.
Note that this method is only relevant when this estimator is used as a sub-estimator within a meta-estimator and metadata routing is enabled with enable_metadata_routing=True (see sklearn.set_config()). Please check the User Guide on how the routing mechanism works.
The options for each parameter are:
True: metadata is requested, and passed to score if provided. The request is ignored if metadata is not provided.
False: metadata is not requested and the meta-estimator will not pass it to score.
None: metadata is not requested, and the meta-estimator will raise an error if the user provides it.
str: metadata should be passed to the meta-estimator with this given alias instead of the original name.
The default (sklearn.utils.metadata_routing.UNCHANGED) retains the existing request. This allows you to change the request for some parameters and not others.
Added in version 1.3.
- class endgame.models.BARTClassifier(n_trees=50, n_samples=1000, n_tune=500, n_chains=2, alpha=0.95, beta=2.0, auto_scale=True, random_state=None)[source]¶
Bases: ClassifierMixin, BaseEstimator
Bayesian Additive Regression Trees Classifier.
BART for classification uses a probit or logit link function to model class probabilities. The latent function is modeled as a sum of many trees with Bayesian priors.
- Parameters:
n_trees (int, default=50) – Number of trees in the ensemble.
n_samples (int, default=1000) – Number of posterior samples.
n_tune (int, default=500) – Number of tuning samples.
n_chains (int, default=2) – Number of MCMC chains.
alpha (float, default=0.95) – Tree depth prior parameter.
beta (float, default=2.0) – Tree depth penalty parameter.
auto_scale (bool, default=True) – Whether to standardize features.
random_state (int, optional) – Random seed.
- classes_¶
Unique class labels.
- Type:
ndarray
- variable_importance_¶
Feature importance scores.
- Type:
ndarray
Examples
>>> from endgame.models.probabilistic import BARTClassifier
>>> clf = BARTClassifier(n_trees=50, n_samples=500, random_state=42)
>>> clf.fit(X_train, y_train)
>>> proba = clf.predict_proba(X_test)
Notes
- For binary classification, BART uses probit regression:
P(y=1|X) = Phi(sum of trees)
where Phi is the standard normal CDF.
For multiclass, a softmax or one-vs-rest approach is used.
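The probit link in the note above can be sketched in a few lines: the per-tree outputs are summed into a latent value, and the standard normal CDF maps that value to a probability. This is an illustration of the formula only; probit_probability and the tree outputs are hypothetical, and in the fitted model the probability would additionally be averaged over posterior samples.

```python
import math


def standard_normal_cdf(z: float) -> float:
    """Phi(z), computed from the error function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def probit_probability(tree_outputs: list[float]) -> float:
    """P(y=1 | x) = Phi(sum of tree outputs) for a single posterior draw."""
    latent = sum(tree_outputs)
    return standard_normal_cdf(latent)


# Hypothetical leaf values returned by three trees for one sample.
p = probit_probability([0.4, -0.1, 0.2])
print(p)
```

A latent value of zero maps to a probability of exactly 0.5, so the sign of the summed tree outputs decides which class is favored.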
- fit(X, y, **fit_params)[source]¶
Fit the BART classifier.
- Parameters:
X (array-like of shape (n_samples, n_features)) – Training features.
y (array-like of shape (n_samples,)) – Target class labels.
- Returns:
self
- predict(X)[source]¶
Predict class labels.
- Parameters:
X (array-like of shape (n_samples, n_features)) – Samples to predict.
- Returns:
y_pred (ndarray of shape (n_samples,)) – Predicted class labels.
- predict_proba(X)[source]¶
Predict class probabilities.
- Parameters:
X (array-like of shape (n_samples, n_features)) – Samples to predict.
- Returns:
proba (ndarray of shape (n_samples, n_classes)) – Class probabilities.
- set_score_request(*, sample_weight='$UNCHANGED$')¶
Configure whether metadata should be requested to be passed to the score method.
Note that this method is only relevant when this estimator is used as a sub-estimator within a meta-estimator and metadata routing is enabled with enable_metadata_routing=True (see sklearn.set_config()). Please check the User Guide on how the routing mechanism works.
The options for each parameter are:
True: metadata is requested, and passed to score if provided. The request is ignored if metadata is not provided.
False: metadata is not requested and the meta-estimator will not pass it to score.
None: metadata is not requested, and the meta-estimator will raise an error if the user provides it.
str: metadata should be passed to the meta-estimator with this given alias instead of the original name.
The default (sklearn.utils.metadata_routing.UNCHANGED) retains the existing request. This allows you to change the request for some parameters and not others.
Added in version 1.3.
- Parameters:
sample_weight (str, True, False, or None, default=sklearn.utils.metadata_routing.UNCHANGED) – Metadata routing for sample_weight parameter in score.
- Returns:
self (object) – The updated object.
- class endgame.models.BARTRegressor(n_trees=50, n_samples=1000, n_tune=500, n_chains=2, alpha=0.95, beta=2.0, auto_scale=True, random_state=None)[source]¶
Bases: RegressorMixin, BaseEstimator
Bayesian Additive Regression Trees Regressor.
BART models the conditional mean function as a sum of many regression trees, using Bayesian priors to regularize complexity. Unlike greedy boosting (XGBoost, LightGBM), BART uses MCMC to explore the posterior distribution of tree structures.
- Parameters:
n_trees (int, default=50) – Number of trees in the ensemble. 50-200 trees typically work well. More trees = smoother predictions but slower inference.
n_samples (int, default=1000) – Number of posterior samples to draw via MCMC.
n_tune (int, default=500) – Number of tuning samples (burn-in) before posterior sampling.
n_chains (int, default=2) – Number of MCMC chains to run in parallel.
alpha (float, default=0.95) – Base split probability in the tree-depth prior. Under the standard BART prior, a node at depth d splits with probability alpha * (1 + d) ** (-beta), so higher values favor deeper trees.
beta (float, default=2.0) – Prior rate of decrease in split probability with depth. Higher values penalize deeper trees more strongly.
auto_scale (bool, default=True) – Whether to standardize features before fitting.
random_state (int, optional) – Random seed for reproducibility.
- variable_importance_¶
Relative importance of each feature (based on split frequency).
- Type:
ndarray of shape (n_features,)
Examples
>>> from endgame.models.probabilistic import BARTRegressor
>>> reg = BARTRegressor(n_trees=50, n_samples=500, random_state=42)
>>> reg.fit(X_train, y_train)
>>> y_pred = reg.predict(X_test)
>>> intervals = reg.predict_interval(X_test, alpha=0.1)  # 90% intervals
Notes
BART’s Bayesian approach provides:
1. Uncertainty quantification: Full posterior over predictions
2. Regularization via priors: Avoids overfitting without CV
3. Variable importance: Based on posterior split frequencies
4. Different inductive bias: Complements greedy boosted trees
For ensemble diversity, BART makes fundamentally different errors than XGBoost/LightGBM because it explores tree space via MCMC rather than greedy sequential fitting.
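The alpha and beta parameters are usually read through the standard BART tree prior, in which a node at depth d splits with probability alpha * (1 + d) ** (-beta). The sketch below assumes this class follows that convention, which the text here does not state explicitly; split_probability is a hypothetical helper and not part of endgame.

```python
def split_probability(depth: int, alpha: float = 0.95, beta: float = 2.0) -> float:
    """Prior probability that a node at the given depth is split further.

    Assumes the standard BART prior: alpha * (1 + depth) ** (-beta).
    """
    return alpha * (1.0 + depth) ** (-beta)


# With the defaults, the root almost always splits, and deeper nodes rarely do.
probs = [split_probability(d) for d in range(4)]
for depth, prob in enumerate(probs):
    print(f"depth {depth}: split probability {prob:.4f}")
```

Raising beta makes the probability fall off faster with depth, and lowering alpha scales every depth down, so both changes push the prior toward shallower trees.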
- fit(X, y, **fit_params)[source]¶
Fit the BART regressor using MCMC.
- Parameters:
X (array-like of shape (n_samples, n_features)) – Training features.
y (array-like of shape (n_samples,)) – Target values.
- Returns:
self
- predict(X)[source]¶
Predict mean target values.
- Parameters:
X (array-like of shape (n_samples, n_features)) – Samples to predict.
- Returns:
y_pred (ndarray of shape (n_samples,)) – Predicted mean values.
- predict_interval(X, alpha=0.1)[source]¶
Predict credible intervals.
- Parameters:
X (array-like of shape (n_samples, n_features)) – Samples to predict.
alpha (float, default=0.1) – Significance level. Returns (1-alpha)*100% credible intervals.
- Returns:
intervals (ndarray of shape (n_samples, 2)) – Lower and upper bounds of credible intervals.
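One common way to turn posterior draws into a (1-alpha)*100% credible interval is to take the alpha/2 and 1-alpha/2 quantiles of the draws for each sample. The sketch below shows that equal-tailed construction for a single sample; it is an assumption about how such intervals are typically built, not a description of what predict_interval does internally (it could, for example, use highest-density intervals). The helper names and the draws are hypothetical.

```python
def quantile(sorted_values: list[float], q: float) -> float:
    """Linear-interpolation quantile of an already sorted list."""
    position = q * (len(sorted_values) - 1)
    lower = int(position)
    upper = min(lower + 1, len(sorted_values) - 1)
    fraction = position - lower
    return sorted_values[lower] * (1.0 - fraction) + sorted_values[upper] * fraction


def credible_interval(posterior_draws: list[float], alpha: float = 0.1) -> tuple[float, float]:
    """Equal-tailed (1 - alpha) credible interval from posterior draws."""
    ordered = sorted(posterior_draws)
    return quantile(ordered, alpha / 2.0), quantile(ordered, 1.0 - alpha / 2.0)


# Hypothetical posterior draws of the prediction for one test sample.
draws = [float(i) for i in range(101)]
low, high = credible_interval(draws, alpha=0.1)
print(low, high)
```

With alpha=0.1, five percent of the draws fall below the lower bound and five percent above the upper bound.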
- predict_std(X)[source]¶
Predict posterior standard deviation.
- Parameters:
X (array-like of shape (n_samples, n_features)) – Samples to predict.
- Returns:
std (ndarray of shape (n_samples,)) – Posterior standard deviation for each prediction.
- set_score_request(*, sample_weight='$UNCHANGED$')¶
Configure whether metadata should be requested to be passed to the score method.
Note that this method is only relevant when this estimator is used as a sub-estimator within a meta-estimator and metadata routing is enabled with enable_metadata_routing=True (see sklearn.set_config()). Please check the User Guide on how the routing mechanism works.
The options for each parameter are:
True: metadata is requested, and passed to score if provided. The request is ignored if metadata is not provided.
False: metadata is not requested and the meta-estimator will not pass it to score.
None: metadata is not requested, and the meta-estimator will raise an error if the user provides it.
str: metadata should be passed to the meta-estimator with this given alias instead of the original name.
The default (sklearn.utils.metadata_routing.UNCHANGED) retains the existing request. This allows you to change the request for some parameters and not others.
Added in version 1.3.
- Parameters:
sample_weight (str, True, False, or None, default=sklearn.utils.metadata_routing.UNCHANGED) – Metadata routing for sample_weight parameter in score.
- Returns:
self (object) – The updated object.
- class endgame.models.SymbolicRegressor(preset='default', operators='scientific', binary_operators=None, unary_operators=None, niterations=None, maxsize=None, maxdepth=None, populations=None, population_size=None, parsimony=None, model_selection='best', loss='L2DistLoss()', constraints=None, nested_constraints=None, denoise=False, select_k_features=None, turbo=False, parallelism='multithreading', procs=None, random_state=None, verbosity=0, temp_equation_file=True, output_directory=None)[source]¶
Bases: BaseEstimator, RegressorMixin
Symbolic Regression for discovering interpretable equations.
Uses multi-population genetic programming with Pareto-frontier tracking to find symbolic expressions balancing accuracy and complexity.
- Parameters:
preset (str, default="default") – Preset configuration: “fast”, “default”, “competition”, “interpretable”.
operators (str or dict, default="scientific") – Operator set name or dict with "binary_operators"/"unary_operators".
binary_operators (list of str, optional) – Explicit binary operators (overrides operators).
unary_operators (list of str, optional) – Explicit unary operators (overrides operators).
niterations (int, optional) – Number of GP iterations.
maxsize (int, optional) – Max tree complexity (nodes).
maxdepth (int, optional) – Max tree depth.
populations (int, optional) – Number of sub-populations.
population_size (int, optional) – Individuals per population.
parsimony (float, optional) – Complexity penalty added to loss.
model_selection (str, default="best") – “best” (lowest loss) or “score” (loss-complexity trade-off).
loss (str, default="L2DistLoss()") – Loss function name. Accepts Julia-style names for backward compatibility (e.g. "L2DistLoss()") or Python names ("mse", "mae", "huber").
constraints (dict, optional) – Reserved for API compatibility (not enforced in GP engine).
nested_constraints (dict, optional) – Reserved for API compatibility.
denoise (bool, default=False) – Reserved for API compatibility.
select_k_features (int, optional) – Reserved for API compatibility.
turbo (bool, default=False) – Reserved for API compatibility.
parallelism (str, default="multithreading") – Reserved for API compatibility (GP runs single-threaded).
procs (int, optional) – Reserved for API compatibility.
random_state (int, optional) – Random seed.
verbosity (int, default=0) – 0 = silent, 1 = progress, 2 = detailed.
temp_equation_file (bool, default=True) – Reserved for API compatibility.
output_directory (str, optional) – Reserved for API compatibility.
- equations_¶
All discovered equations with loss and complexity.
- Type:
DataFrame
- feature_names_in_¶
Feature names.
- Type:
ndarray
- fit(X, y, **fit_params)[source]¶
Fit symbolic regression model.
- Parameters:
X (array-like of shape (n_samples, n_features)) – Training data.
y (array-like of shape (n_samples,)) – Target values.
- Returns:
self
- sympy(index=None)[source]¶
Return SymPy expression of the best (or indexed) equation.
- Parameters:
index (int | None)
- get_pareto_frontier()[source]¶
Return Pareto-optimal equations as a DataFrame.
- Return type:
DataFrame
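The idea of a Pareto frontier over accuracy and complexity can be sketched independently of the GP engine: an equation stays on the frontier only if no equation of equal or lower complexity achieves an equal or lower loss. The example below works on plain dictionaries with hypothetical equations, complexities, and losses; the column names and layout of the DataFrame actually returned by get_pareto_frontier() are not specified here and may differ.

```python
def pareto_frontier(equations: list[dict]) -> list[dict]:
    """Keep equations that are not dominated in (complexity, loss)."""
    frontier = []
    best_loss = float("inf")
    # Walk from simplest to most complex; keep an equation only if it
    # strictly improves on the best loss seen at lower or equal complexity.
    for eq in sorted(equations, key=lambda e: (e["complexity"], e["loss"])):
        if eq["loss"] < best_loss:
            frontier.append(eq)
            best_loss = eq["loss"]
    return frontier


# Hypothetical candidates, as a GP run might produce them.
candidates = [
    {"equation": "x0", "complexity": 1, "loss": 4.0},
    {"equation": "x0 + x1", "complexity": 3, "loss": 1.5},
    {"equation": "x0 * x1", "complexity": 3, "loss": 2.0},
    {"equation": "sin(x0) + x1", "complexity": 4, "loss": 1.5},
    {"equation": "x0 + x1 * x1", "complexity": 5, "loss": 0.2},
]

frontier = pareto_frontier(candidates)
# A "best"-style selection picks the lowest-loss equation on the frontier.
best = min(frontier, key=lambda e: e["loss"])
print([e["equation"] for e in frontier], best["equation"])
```

Selecting by lowest loss always lands on the most complex frontier member, which is why a loss-complexity trade-off is offered as an alternative selection rule.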
- property feature_importances_: ndarray¶
Feature importances from equation structure (occurrence count).
- set_predict_request(*, index='$UNCHANGED$')¶
Configure whether metadata should be requested to be passed to the predict method.
Note that this method is only relevant when this estimator is used as a sub-estimator within a meta-estimator and metadata routing is enabled with enable_metadata_routing=True (see sklearn.set_config()). Please check the User Guide on how the routing mechanism works.
The options for each parameter are:
True: metadata is requested, and passed to predict if provided. The request is ignored if metadata is not provided.
False: metadata is not requested and the meta-estimator will not pass it to predict.
None: metadata is not requested, and the meta-estimator will raise an error if the user provides it.
str: metadata should be passed to the meta-estimator with this given alias instead of the original name.
The default (sklearn.utils.metadata_routing.UNCHANGED) retains the existing request. This allows you to change the request for some parameters and not others.
Added in version 1.3.
- Parameters:
index (str, True, False, or None, default=sklearn.utils.metadata_routing.UNCHANGED) – Metadata routing for index parameter in predict.
- Returns:
self (object) – The updated object.
- set_score_request(*, sample_weight='$UNCHANGED$')¶
Configure whether metadata should be requested to be passed to the score method.
Note that this method is only relevant when this estimator is used as a sub-estimator within a meta-estimator and metadata routing is enabled with enable_metadata_routing=True (see sklearn.set_config()). Please check the User Guide on how the routing mechanism works.
The options for each parameter are:
True: metadata is requested, and passed to score if provided. The request is ignored if metadata is not provided.
False: metadata is not requested and the meta-estimator will not pass it to score.
None: metadata is not requested, and the meta-estimator will raise an error if the user provides it.
str: metadata should be passed to the meta-estimator with this given alias instead of the original name.
The default (sklearn.utils.metadata_routing.UNCHANGED) retains the existing request. This allows you to change the request for some parameters and not others.
Added in version 1.3.
- Parameters:
sample_weight (str, True, False, or None, default=sklearn.utils.metadata_routing.UNCHANGED) – Metadata routing for sample_weight parameter in score.
- Returns:
self (object) – The updated object.
- class endgame.models.SymbolicClassifier(preset='default', operators='scientific', binary_operators=None, unary_operators=None, niterations=None, maxsize=None, maxdepth=None, populations=None, population_size=None, parsimony=None, model_selection='best', constraints=None, nested_constraints=None, denoise=False, select_k_features=None, turbo=False, parallelism='multithreading', procs=None, random_state=None, verbosity=0, temp_equation_file=True, output_directory=None, threshold=0.5)[source]¶
Bases: BaseEstimator, ClassifierMixin
Symbolic Classification via logistic transformation of symbolic regression.
For binary classification, fits a symbolic regression model to the log-odds and applies sigmoid transformation for probabilities.
For multiclass, uses one-vs-rest strategy with multiple symbolic regressors.
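For the binary case, the mapping from a symbolic expression to a class label can be sketched as follows: the expression is evaluated as a log-odds value, a sigmoid converts it to a probability, and the threshold decides the label. The expression 2.0 * x0 - 1.0 and the helper names are hypothetical, and whether the class uses >= or > at the threshold is an assumption.

```python
import math


def log_odds_expression(x0: float) -> float:
    """Hypothetical discovered equation, interpreted as log-odds."""
    return 2.0 * x0 - 1.0


def predict_binary(log_odds: float, threshold: float = 0.5) -> tuple[float, int]:
    """Return (probability of the positive class, predicted label)."""
    probability = 1.0 / (1.0 + math.exp(-log_odds))
    # Assumption: probabilities equal to the threshold map to the positive class.
    label = 1 if probability >= threshold else 0
    return probability, label


for x0 in (0.0, 0.5, 2.0):
    probability, label = predict_binary(log_odds_expression(x0))
    print(f"x0={x0}: p={probability:.3f}, label={label}")
```

Raising the threshold above 0.5 trades recall for precision on the positive class without refitting the equation.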
- Parameters:
All parameters from SymbolicRegressor are accepted.
threshold (float, default=0.5) – Classification threshold for binary classification.
preset (str)
niterations (int | None)
maxsize (int | None)
maxdepth (int | None)
populations (int | None)
population_size (int | None)
parsimony (float | None)
model_selection (str)
constraints (dict | None)
nested_constraints (dict | None)
denoise (bool)
select_k_features (int | None)
turbo (bool)
parallelism (str)
procs (int | None)
random_state (int | None)
verbosity (int)
temp_equation_file (bool)
output_directory (str | None)
- model_¶
Underlying symbolic regressor(s).
- Type:
- classes_¶
Unique class labels.
- Type:
ndarray
- fit(X, y, **fit_params)[source]¶
Fit symbolic classifier.
- Parameters:
X (array-like of shape (n_samples, n_features))
y (array-like of shape (n_samples,))
- Returns:
self
- set_score_request(*, sample_weight='$UNCHANGED$')¶
Configure whether metadata should be requested to be passed to the score method.
Note that this method is only relevant when this estimator is used as a sub-estimator within a meta-estimator and metadata routing is enabled with enable_metadata_routing=True (see sklearn.set_config()). Please check the User Guide on how the routing mechanism works.
The options for each parameter are:
True: metadata is requested, and passed to score if provided. The request is ignored if metadata is not provided.
False: metadata is not requested and the meta-estimator will not pass it to score.
None: metadata is not requested, and the meta-estimator will raise an error if the user provides it.
str: metadata should be passed to the meta-estimator with this given alias instead of the original name.
The default (sklearn.utils.metadata_routing.UNCHANGED) retains the existing request. This allows you to change the request for some parameters and not others.
Added in version 1.3.
- Parameters:
sample_weight (str, True, False, or None, default=sklearn.utils.metadata_routing.UNCHANGED) – Metadata routing for sample_weight parameter in score.
- Returns:
self (object) – The updated object.
- class endgame.models.NEATClassifier(population_size=150, n_generations=100, n_hidden=0, activation_default='sigmoid', random_state=None, verbose=0)[source]¶
Bases: BaseEstimator, ClassifierMixin
NEAT classifier using neat-python.
Evolves neural network topology and weights using the NEAT algorithm.
- Parameters:
population_size (int) – Number of individuals per generation.
n_generations (int) – Number of evolutionary generations.
n_hidden (int) – Initial number of hidden nodes (0 = minimal topology).
activation_default (str) – Default activation function for new nodes.
random_state (int or None) – Random seed for reproducibility.
verbose (int) – Verbosity level (0 = silent).
- set_score_request(*, sample_weight='$UNCHANGED$')¶
Configure whether metadata should be requested to be passed to the score method.
Note that this method is only relevant when this estimator is used as a sub-estimator within a meta-estimator and metadata routing is enabled with enable_metadata_routing=True (see sklearn.set_config()). Please check the User Guide on how the routing mechanism works.
The options for each parameter are:
True: metadata is requested, and passed to score if provided. The request is ignored if metadata is not provided.
False: metadata is not requested and the meta-estimator will not pass it to score.
None: metadata is not requested, and the meta-estimator will raise an error if the user provides it.
str: metadata should be passed to the meta-estimator with this given alias instead of the original name.
The default (sklearn.utils.metadata_routing.UNCHANGED) retains the existing request. This allows you to change the request for some parameters and not others.
Added in version 1.3.
- Parameters:
sample_weight (str, True, False, or None, default=sklearn.utils.metadata_routing.UNCHANGED) – Metadata routing for sample_weight parameter in score.
- Returns:
self (object) – The updated object.
- class endgame.models.NEATRegressor(population_size=150, n_generations=100, n_hidden=0, activation_default='tanh', random_state=None, verbose=0)[source]¶
Bases: BaseEstimator, RegressorMixin
NEAT regressor using neat-python.
Evolves neural network topology and weights using the NEAT algorithm, optimizing for mean squared error. Targets are normalized internally so that network outputs (near [-1, 1]) can match the target scale.
- Parameters:
population_size (int) – Number of individuals per generation.
n_generations (int) – Number of evolutionary generations.
n_hidden (int) – Initial number of hidden nodes (0 = minimal topology).
activation_default (str) – Default activation function for new nodes.
random_state (int or None) – Random seed for reproducibility.
verbose (int) – Verbosity level (0 = silent).
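The internal target normalization mentioned above can be pictured as a scale-and-restore step around the network: targets are mapped to a range the network can reach before training, and network outputs are mapped back afterwards. The sketch below uses standardization (zero mean, unit variance), which is an assumption; the class may use a different scheme such as min-max scaling. TargetScaler is a hypothetical helper, not part of endgame.

```python
import statistics


class TargetScaler:
    """Standardize targets for training and restore the scale for predictions."""

    def fit(self, y: list[float]) -> "TargetScaler":
        self.mean_ = statistics.fmean(y)
        # Fall back to 1.0 so constant targets do not cause division by zero.
        self.std_ = statistics.pstdev(y) or 1.0
        return self

    def transform(self, y: list[float]) -> list[float]:
        return [(value - self.mean_) / self.std_ for value in y]

    def inverse_transform(self, z: list[float]) -> list[float]:
        return [value * self.std_ + self.mean_ for value in z]


# Hypothetical targets on a scale far outside [-1, 1].
y_train = [10.0, 20.0, 30.0]
scaler = TargetScaler().fit(y_train)
scaled = scaler.transform(y_train)            # what the network would be trained on
restored = scaler.inverse_transform(scaled)   # what predict would hand back
print(scaled, restored)
```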
- set_score_request(*, sample_weight='$UNCHANGED$')¶
Configure whether metadata should be requested to be passed to the
scoremethod.Note that this method is only relevant when this estimator is used as a sub-estimator within a meta-estimator and metadata routing is enabled with
enable_metadata_routing=True(seesklearn.set_config()). Please check the User Guide on how the routing mechanism works.The options for each parameter are:
True: metadata is requested, and passed toscoreif provided. The request is ignored if metadata is not provided.False: metadata is not requested and the meta-estimator will not pass it toscore.None: metadata is not requested, and the meta-estimator will raise an error if the user provides it.str: metadata should be passed to the meta-estimator with this given alias instead of the original name.
The default (
sklearn.utils.metadata_routing.UNCHANGED) retains the existing request. This allows you to change the request for some parameters and not others.Added in version 1.3.
- Parameters:
sample_weight (str, True, False, or None, default=sklearn.utils.metadata_routing.UNCHANGED) – Metadata routing for
sample_weightparameter inscore.self (NEATRegressor)
- Returns:
self (object) – The updated object.
- Return type:
NEATRegressor
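The four request options for set_score_request can be read as a small decision table. The sketch below is a toy model of that table for a single metadata key. It is not scikit-learn's implementation: the function name route_to_score is hypothetical, and a plain ValueError stands in for whatever error the real meta-estimator raises.

```python
def route_to_score(request, provided):
    """Toy model of how a meta-estimator treats sample_weight for score().

    request:  True, False, None, or an alias string (the value given to
              set_score_request(sample_weight=...)).
    provided: dict of metadata the user passed to the meta-estimator.
    Returns the keyword arguments that would be forwarded to score().
    """
    if request is True:
        # Requested: forward if present, silently skip if absent.
        if "sample_weight" in provided:
            return {"sample_weight": provided["sample_weight"]}
        return {}
    if request is False:
        # Explicitly not requested: never forwarded.
        return {}
    if request is None:
        # Unset: passing the metadata anyway is treated as an error.
        if "sample_weight" in provided:
            raise ValueError("sample_weight was provided but never requested")
        return {}
    if isinstance(request, str):
        # Alias: the user supplies the data under a different name.
        if request in provided:
            return {"sample_weight": provided[request]}
        return {}
    raise TypeError("request must be True, False, None, or a str alias")


weights = [1.0, 2.0, 0.5]
forwarded = route_to_score(True, {"sample_weight": weights})
aliased = route_to_score("score_weight", {"score_weight": weights})
ignored = route_to_score(False, {"sample_weight": weights})
```

With the real estimator, the equivalent request would be set by calling set_score_request(sample_weight=True) after enabling routing through sklearn.set_config(enable_metadata_routing=True), as the method description states.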
- class endgame.models.TensorNEATClassifier(population_size=1000, n_generations=100, species_size=10, random_state=None, verbose=0)[source]¶
Bases: BaseEstimator, ClassifierMixin
TensorNEAT classifier: GPU-accelerated neuroevolution via JAX.
- Parameters:
- set_score_request(*, sample_weight='$UNCHANGED$')¶
Configure whether metadata should be requested to be passed to the score method.
Note that this method is only relevant when this estimator is used as a sub-estimator within a meta-estimator and metadata routing is enabled with enable_metadata_routing=True (see sklearn.set_config()). Please check the User Guide on how the routing mechanism works.
The options for each parameter are:
True: metadata is requested, and passed to score if provided. The request is ignored if metadata is not provided.
False: metadata is not requested and the meta-estimator will not pass it to score.
None: metadata is not requested, and the meta-estimator will raise an error if the user provides it.
str: metadata should be passed to the meta-estimator with this given alias instead of the original name.
The default (sklearn.utils.metadata_routing.UNCHANGED) retains the existing request. This allows you to change the request for some parameters and not others.
Added in version 1.3.
- Parameters:
sample_weight (str, True, False, or None, default=sklearn.utils.metadata_routing.UNCHANGED) – Metadata routing for sample_weight parameter in score.
self (TensorNEATClassifier)
- Returns:
self (object) – The updated object.
- Return type:
TensorNEATClassifier
- class endgame.models.TensorNEATRegressor(population_size=1000, n_generations=100, species_size=10, random_state=None, verbose=0)[source]¶
Bases: BaseEstimator, RegressorMixin
TensorNEAT regressor: GPU-accelerated neuroevolution via JAX.
- Parameters:
- set_score_request(*, sample_weight='$UNCHANGED$')¶
Configure whether metadata should be requested to be passed to the score method.
Note that this method is only relevant when this estimator is used as a sub-estimator within a meta-estimator and metadata routing is enabled with enable_metadata_routing=True (see sklearn.set_config()). Please check the User Guide on how the routing mechanism works.
The options for each parameter are:
True: metadata is requested, and passed to score if provided. The request is ignored if metadata is not provided.
False: metadata is not requested and the meta-estimator will not pass it to score.
None: metadata is not requested, and the meta-estimator will raise an error if the user provides it.
str: metadata should be passed to the meta-estimator with this given alias instead of the original name.
The default (sklearn.utils.metadata_routing.UNCHANGED) retains the existing request. This allows you to change the request for some parameters and not others.
Added in version 1.3.
- Parameters:
sample_weight (str, True, False, or None, default=sklearn.utils.metadata_routing.UNCHANGED) – Metadata routing for sample_weight parameter in score.
self (TensorNEATRegressor)
- Returns:
self (object) – The updated object.
- Return type:
TensorNEATRegressor