Ensemble

class endgame.ensemble.VotingClassifier(estimators, voting='soft', weights=None, flatten_transform=True, n_jobs=None, verbose=False)[source]

Bases: BaseEstimator, ClassifierMixin

Soft / hard voting meta-classifier.

Parameters:
  • estimators (list of (str, estimator) tuples) – Named base classifiers.

  • voting ({'hard', 'soft'}, default='soft') –

    • 'hard': majority-vote on predicted labels.

    • 'soft': average predicted probabilities, then argmax.

  • weights (array-like of shape (n_estimators,), optional) – Per-estimator weights. None means uniform.

  • flatten_transform (bool, default=True) – If True, transform returns shape (n_samples, n_classifiers * n_classes) instead of 3-D.

  • n_jobs (int or None, default=None) – Parallel fitting jobs. -1 uses all CPUs.

  • verbose (bool, default=False) – Print progress during fit.
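The difference between the two voting modes can be illustrated with a minimal pure-Python sketch for a single sample. The helper names `hard_vote` and `soft_vote` are hypothetical and are not part of `endgame.ensemble`; the tie-breaking rule and the weight normalization shown here are assumptions, since the reference above does not specify them.

```python
from collections import defaultdict


def hard_vote(labels_per_estimator, weights=None):
    """Weighted majority vote over one sample's predicted labels."""
    weights = weights or [1.0] * len(labels_per_estimator)
    tally = defaultdict(float)
    for label, w in zip(labels_per_estimator, weights):
        tally[label] += w
    # Assumption: ties resolve to the smallest label.
    return min(tally, key=lambda lab: (-tally[lab], lab))


def soft_vote(probas_per_estimator, classes, weights=None):
    """Weighted average of per-estimator class probabilities, then argmax."""
    weights = weights or [1.0] * len(probas_per_estimator)
    total = sum(weights)
    avg = [
        sum(w * p[k] for w, p in zip(weights, probas_per_estimator)) / total
        for k in range(len(classes))
    ]
    best = max(range(len(classes)), key=lambda k: avg[k])
    return classes[best], avg


# Three estimators, one sample, classes "a" and "b".
labels = ["a", "a", "b"]
probas = [[0.55, 0.45], [0.60, 0.40], [0.05, 0.95]]

print(hard_vote(labels))                 # two of three estimators say "a"
print(soft_vote(probas, ["a", "b"])[0])  # the confident third estimator wins
```

The two modes can disagree on the same sample: hard voting counts heads, while soft voting lets a single confident estimator outweigh two lukewarm ones.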

estimators_

Fitted clones in the same order as estimators.

Type:

list of estimator

classes_

Unique class labels.

Type:

ndarray of shape (n_classes,)

le_

Mapping from label to integer index (for hard voting).

Type:

dict

Examples

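>>> from sklearn.ensemble import RandomForestClassifier
>>> from sklearn.linear_model import LogisticRegression
>>> from endgame.ensemble import VotingClassifier
>>> vc = VotingClassifier(
...     estimators=[("rf", RandomForestClassifier()), ("lr", LogisticRegression())],
...     voting="soft",
... )
>>> vc.fit(X_train, y_train).predict(X_test)

fit(X, y, sample_weight=None, **fit_params)[source]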

Fit all base estimators.

predict(X)[source]

Predict class labels.

Parameters:

X (array-like of shape (n_samples, n_features))

Returns:

ndarray of shape (n_samples,)

predict_proba(X)[source]

Average predicted probabilities.

Parameters:

X (array-like of shape (n_samples, n_features))

Returns:

ndarray of shape (n_samples, n_classes)

transform(X)[source]

Return per-estimator predictions or probabilities.

Parameters:

X (array-like of shape (n_samples, n_features))

Returns:

ndarray
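As a rough illustration of what flatten_transform=True implies for soft-voting output, the sketch below flattens a 3-D nested list into one row per sample. It assumes the unflattened layout is indexed as [estimator][sample][class]; the reference above only says "3-D", so that axis order and the helper name `flatten_probas` are assumptions.

```python
def flatten_probas(stacked):
    """Flatten [estimator][sample][class] into [sample][estimator * class]."""
    n_samples = len(stacked[0])
    return [
        [p for per_estimator in stacked for p in per_estimator[i]]
        for i in range(n_samples)
    ]


# 2 estimators, 3 samples, 2 classes.
stacked = [
    [[0.9, 0.1], [0.2, 0.8], [0.5, 0.5]],
    [[0.7, 0.3], [0.4, 0.6], [0.1, 0.9]],
]
flat = flatten_probas(stacked)
print(len(flat), len(flat[0]))  # rows = n_samples, columns = n_classifiers * n_classes
```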

get_params(deep=True)[source]

Get parameters for this estimator.

Parameters:

deep (bool, default=True) – If True, will return the parameters for this estimator and contained subobjects that are estimators.

Returns:

params (dict) – Parameter names mapped to their values.

property named_estimators: dict[str, BaseEstimator]

Access fitted estimators by name.

set_fit_request(*, sample_weight='$UNCHANGED$')

Configure whether metadata should be requested to be passed to the fit method.

Note that this method is only relevant when this estimator is used as a sub-estimator within a meta-estimator and metadata routing is enabled with enable_metadata_routing=True (see sklearn.set_config()). Please check the User Guide on how the routing mechanism works.

The options for each parameter are:

  • True: metadata is requested, and passed to fit if provided. The request is ignored if metadata is not provided.

  • False: metadata is not requested and the meta-estimator will not pass it to fit.

  • None: metadata is not requested, and the meta-estimator will raise an error if the user provides it.

  • str: metadata should be passed to the meta-estimator with this given alias instead of the original name.

The default (sklearn.utils.metadata_routing.UNCHANGED) retains the existing request. This allows you to change the request for some parameters and not others.

Added in version 1.3.

Parameters:
  • sample_weight (str, True, False, or None, default=sklearn.utils.metadata_routing.UNCHANGED) – Metadata routing for sample_weight parameter in fit.

  • self (VotingClassifier)

Returns:

self (object) – The updated object.

Return type:

VotingClassifier

set_score_request(*, sample_weight='$UNCHANGED$')

Configure whether metadata should be requested to be passed to the score method.

Note that this method is only relevant when this estimator is used as a sub-estimator within a meta-estimator and metadata routing is enabled with enable_metadata_routing=True (see sklearn.set_config()). Please check the User Guide on how the routing mechanism works.

The options for each parameter are:

  • True: metadata is requested, and passed to score if provided. The request is ignored if metadata is not provided.

  • False: metadata is not requested and the meta-estimator will not pass it to score.

  • None: metadata is not requested, and the meta-estimator will raise an error if the user provides it.

  • str: metadata should be passed to the meta-estimator with this given alias instead of the original name.

The default (sklearn.utils.metadata_routing.UNCHANGED) retains the existing request. This allows you to change the request for some parameters and not others.

Added in version 1.3.

Parameters:
  • sample_weight (str, True, False, or None, default=sklearn.utils.metadata_routing.UNCHANGED) – Metadata routing for sample_weight parameter in score.

  • self (VotingClassifier)

Returns:

self (object) – The updated object.

Return type:

VotingClassifier

class endgame.ensemble.VotingRegressor(estimators, weights=None, n_jobs=None, verbose=False)[source]

Bases: BaseEstimator, RegressorMixin

Voting meta-regressor: averages predictions from multiple regressors.

Parameters:
  • estimators (list of (str, estimator) tuples) – Named base regressors.

  • weights (array-like of shape (n_estimators,), optional) – Per-estimator weights. None means uniform.

  • n_jobs (int or None, default=None) – Parallel fitting jobs.

  • verbose (bool, default=False) – Print progress during fit.
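A weighted average of regressor outputs can be sketched in a few lines of plain Python. The function name `weighted_average` is hypothetical, and dividing by the sum of the weights is an assumption about how the weights are normalized.

```python
def weighted_average(predictions, weights=None):
    """Combine per-estimator predictions for each sample with a weighted mean."""
    n_estimators = len(predictions)
    weights = weights or [1.0] * n_estimators
    total = float(sum(weights))
    n_samples = len(predictions[0])
    return [
        sum(w * preds[i] for w, preds in zip(weights, predictions)) / total
        for i in range(n_samples)
    ]


ridge_preds = [10.0, 20.0]    # predictions of the first estimator
forest_preds = [13.0, 26.0]   # predictions of the second estimator

# weights=[1, 2] pulls the result two thirds of the way toward the second estimator.
blended = weighted_average([ridge_preds, forest_preds], weights=[1, 2])
print(blended)
```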

estimators_

Fitted clones.

Type:

list of estimator

Examples

>>> from sklearn.ensemble import RandomForestRegressor
>>> from sklearn.linear_model import Ridge
>>> from endgame.ensemble import VotingRegressor
>>> vr = VotingRegressor(
...     estimators=[("ridge", Ridge()), ("rf", RandomForestRegressor())],
...     weights=[1, 2],
... )
>>> vr.fit(X_train, y_train).predict(X_test)

fit(X, y, sample_weight=None, **fit_params)[source]

predict(X)[source]

transform(X)[source]

property named_estimators: dict[str, BaseEstimator]

set_fit_request(*, sample_weight='$UNCHANGED$')

Configure whether metadata should be requested to be passed to the fit method.

Note that this method is only relevant when this estimator is used as a sub-estimator within a meta-estimator and metadata routing is enabled with enable_metadata_routing=True (see sklearn.set_config()). Please check the User Guide on how the routing mechanism works.

The options for each parameter are:

  • True: metadata is requested, and passed to fit if provided. The request is ignored if metadata is not provided.

  • False: metadata is not requested and the meta-estimator will not pass it to fit.

  • None: metadata is not requested, and the meta-estimator will raise an error if the user provides it.

  • str: metadata should be passed to the meta-estimator with this given alias instead of the original name.

The default (sklearn.utils.metadata_routing.UNCHANGED) retains the existing request. This allows you to change the request for some parameters and not others.

Added in version 1.3.

Parameters:
  • sample_weight (str, True, False, or None, default=sklearn.utils.metadata_routing.UNCHANGED) – Metadata routing for sample_weight parameter in fit.

  • self (VotingRegressor)

Returns:

self (object) – The updated object.

Return type:

VotingRegressor

set_score_request(*, sample_weight='$UNCHANGED$')

Configure whether metadata should be requested to be passed to the score method.

Note that this method is only relevant when this estimator is used as a sub-estimator within a meta-estimator and metadata routing is enabled with enable_metadata_routing=True (see sklearn.set_config()). Please check the User Guide on how the routing mechanism works.

The options for each parameter are:

  • True: metadata is requested, and passed to score if provided. The request is ignored if metadata is not provided.

  • False: metadata is not requested and the meta-estimator will not pass it to score.

  • None: metadata is not requested, and the meta-estimator will raise an error if the user provides it.

  • str: metadata should be passed to the meta-estimator with this given alias instead of the original name.

The default (sklearn.utils.metadata_routing.UNCHANGED) retains the existing request. This allows you to change the request for some parameters and not others.

Added in version 1.3.

Parameters:
  • sample_weight (str, True, False, or None, default=sklearn.utils.metadata_routing.UNCHANGED) – Metadata routing for sample_weight parameter in score.

  • self (VotingRegressor)

Returns:

self (object) – The updated object.

Return type:

VotingRegressor

class endgame.ensemble.BaggingClassifier(base_estimator=None, n_estimators=10, max_samples=1.0, max_features=1.0, bootstrap=True, bootstrap_features=False, oob_score=False, n_jobs=None, random_state=None, verbose=False)[source]

Bases: BaseEstimator, ClassifierMixin

Bootstrap Aggregating classifier.

Parameters:
  • base_estimator (estimator, optional) – Base learner to bag. Default: DecisionTreeClassifier().

  • n_estimators (int, default=10) – Number of bootstrap replicas.

  • max_samples (float or int, default=1.0) – Fraction (if float) or count (if int) of samples per bag.

  • max_features (float or int, default=1.0) – Fraction (if float) or count (if int) of features per bag.

  • bootstrap (bool, default=True) – Sample rows with replacement.

  • bootstrap_features (bool, default=False) – Sample features with replacement.

  • oob_score (bool, default=False) – Whether to compute out-of-bag accuracy.

  • n_jobs (int or None, default=None) – Parallel jobs. -1 uses all CPUs.

  • random_state (int or None, default=None) – Random seed.

  • verbose (bool, default=False)
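The sampling parameters above can be read as a recipe for drawing one bag. The sketch below is a minimal interpretation using only the standard library; `resolve_count` and `draw_bag` are hypothetical helpers, and the rounding of fractional sizes is an assumption.

```python
import random


def resolve_count(value, total):
    """Interpret a float as a fraction of `total` and an int as an absolute count."""
    if isinstance(value, float):
        return max(1, int(value * total))
    return value


def draw_bag(n_samples, n_features, max_samples=1.0, max_features=1.0,
             bootstrap=True, bootstrap_features=False, rng=None):
    """Return (row_indices, feature_indices) for one bag."""
    rng = rng or random.Random()
    n_rows = resolve_count(max_samples, n_samples)
    n_cols = resolve_count(max_features, n_features)
    if bootstrap:
        rows = [rng.randrange(n_samples) for _ in range(n_rows)]   # with replacement
    else:
        rows = rng.sample(range(n_samples), n_rows)                # without replacement
    if bootstrap_features:
        cols = [rng.randrange(n_features) for _ in range(n_cols)]
    else:
        cols = sorted(rng.sample(range(n_features), n_cols))
    return rows, cols


rng = random.Random(0)
rows, cols = draw_bag(n_samples=100, n_features=8, max_samples=0.5, max_features=3, rng=rng)
print(len(rows), cols)
```

Each of the n_estimators base learners would be fitted on its own `(rows, cols)` pair; the column indices are what an attribute such as estimator_features_ would record.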

estimators_

Fitted base estimators.

Type:

list of estimator

estimator_features_

Feature indices used by each estimator.

Type:

list of ndarray

oob_score_

OOB accuracy (only if oob_score=True).

Type:

float

oob_decision_function_

OOB predicted probabilities (only if oob_score=True).

Type:

ndarray
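The out-of-bag idea behind oob_score_ and oob_decision_function_ can be sketched as follows: each sample is scored only by the estimators whose bag did not contain it. The function names are hypothetical, labels are assumed to be class indices, and marking never-out-of-bag samples with None is an assumption about how such rows are handled.

```python
def oob_decision_function(bags, probas, n_samples, n_classes):
    """Average each sample's probabilities over estimators that never trained on it."""
    sums = [[0.0] * n_classes for _ in range(n_samples)]
    counts = [0] * n_samples
    for bag_rows, est_probas in zip(bags, probas):
        in_bag = set(bag_rows)
        for i in range(n_samples):
            if i not in in_bag:
                counts[i] += 1
                for k in range(n_classes):
                    sums[i][k] += est_probas[i][k]
    # Samples that were in every bag have no OOB estimate; mark them with None.
    return [
        [s / counts[i] for s in sums[i]] if counts[i] else None
        for i in range(n_samples)
    ]


def oob_accuracy(decision, y):
    """Accuracy over the samples that have an OOB estimate."""
    scored = [(row, label) for row, label in zip(decision, y) if row is not None]
    hits = sum(1 for row, label in scored if row.index(max(row)) == label)
    return hits / len(scored)


# Two estimators, four samples, two classes.
bags = [[0, 0, 1, 2], [1, 2, 3, 3]]       # rows each estimator was trained on
probas = [
    [[0.9, 0.1], [0.8, 0.2], [0.3, 0.7], [0.2, 0.8]],   # estimator 0, all samples
    [[0.6, 0.4], [0.7, 0.3], [0.4, 0.6], [0.1, 0.9]],   # estimator 1, all samples
]
y = [0, 0, 1, 1]
decision = oob_decision_function(bags, probas, n_samples=4, n_classes=2)
print(oob_accuracy(decision, y))
```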

classes_

Type:

ndarray

fit(X, y, sample_weight=None)[source]

predict(X)[source]

predict_proba(X)[source]

property feature_importances_

Average feature importances across bags (original feature space).
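When max_features selects a subset of columns, each base estimator reports importances in its own reduced feature space. One plausible way to map them back to the original feature space is sketched below; the function name is hypothetical, and treating unseen features as contributing zero is an assumption.

```python
def bagged_feature_importances(per_estimator_importances, estimator_features, n_features):
    """Scatter each estimator's importances back to original columns, then average."""
    totals = [0.0] * n_features
    for importances, features in zip(per_estimator_importances, estimator_features):
        for value, original_index in zip(importances, features):
            totals[original_index] += value
    n_estimators = len(per_estimator_importances)
    return [t / n_estimators for t in totals]


# Two estimators trained on different 2-feature subsets of a 3-feature dataset.
importances = [[0.75, 0.25], [0.5, 0.5]]
features = [[0, 2], [1, 2]]   # original column indices seen by each estimator
print(bagged_feature_importances(importances, features, n_features=3))
```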

set_fit_request(*, sample_weight='$UNCHANGED$')

Configure whether metadata should be requested to be passed to the fit method.

Note that this method is only relevant when this estimator is used as a sub-estimator within a meta-estimator and metadata routing is enabled with enable_metadata_routing=True (see sklearn.set_config()). Please check the User Guide on how the routing mechanism works.

The options for each parameter are:

  • True: metadata is requested, and passed to fit if provided. The request is ignored if metadata is not provided.

  • False: metadata is not requested and the meta-estimator will not pass it to fit.

  • None: metadata is not requested, and the meta-estimator will raise an error if the user provides it.

  • str: metadata should be passed to the meta-estimator with this given alias instead of the original name.

The default (sklearn.utils.metadata_routing.UNCHANGED) retains the existing request. This allows you to change the request for some parameters and not others.

Added in version 1.3.

Parameters:
  • sample_weight (str, True, False, or None, default=sklearn.utils.metadata_routing.UNCHANGED) – Metadata routing for sample_weight parameter in fit.

  • self (BaggingClassifier)

Returns:

self (object) – The updated object.

Return type:

BaggingClassifier

set_score_request(*, sample_weight='$UNCHANGED$')

Configure whether metadata should be requested to be passed to the score method.

Note that this method is only relevant when this estimator is used as a sub-estimator within a meta-estimator and metadata routing is enabled with enable_metadata_routing=True (see sklearn.set_config()). Please check the User Guide on how the routing mechanism works.

The options for each parameter are:

  • True: metadata is requested, and passed to score if provided. The request is ignored if metadata is not provided.

  • False: metadata is not requested and the meta-estimator will not pass it to score.

  • None: metadata is not requested, and the meta-estimator will raise an error if the user provides it.

  • str: metadata should be passed to the meta-estimator with this given alias instead of the original name.

The default (sklearn.utils.metadata_routing.UNCHANGED) retains the existing request. This allows you to change the request for some parameters and not others.

Added in version 1.3.

Parameters:
  • sample_weight (str, True, False, or None, default=sklearn.utils.metadata_routing.UNCHANGED) – Metadata routing for sample_weight parameter in score.

  • self (BaggingClassifier)

Returns:

self (object) – The updated object.

Return type:

BaggingClassifier

class endgame.ensemble.BaggingRegressor(base_estimator=None, n_estimators=10, max_samples=1.0, max_features=1.0, bootstrap=True, bootstrap_features=False, oob_score=False, n_jobs=None, random_state=None, verbose=False)[source]

Bases: BaseEstimator, RegressorMixin

Bootstrap Aggregating regressor.

Parameters:
  • base_estimator (estimator, optional) – Base learner. Default: DecisionTreeRegressor().

  • n_estimators (int, default=10)

  • max_samples (float or int, default=1.0)

  • max_features (float or int, default=1.0)

  • bootstrap (bool, default=True)

  • bootstrap_features (bool, default=False)

  • oob_score (bool, default=False)

  • n_jobs (int or None, default=None)

  • random_state (int or None, default=None)

  • verbose (bool, default=False)

estimators_

Type:

list of estimator

estimator_features_

Type:

list of ndarray

oob_score_

OOB R² score (only if oob_score=True).

Type:

float

oob_prediction_

OOB predictions (only if oob_score=True).

Type:

ndarray

fit(X, y, sample_weight=None)[source]

predict(X)[source]

property feature_importances_

set_fit_request(*, sample_weight='$UNCHANGED$')

Configure whether metadata should be requested to be passed to the fit method.

Note that this method is only relevant when this estimator is used as a sub-estimator within a meta-estimator and metadata routing is enabled with enable_metadata_routing=True (see sklearn.set_config()). Please check the User Guide on how the routing mechanism works.

The options for each parameter are:

  • True: metadata is requested, and passed to fit if provided. The request is ignored if metadata is not provided.

  • False: metadata is not requested and the meta-estimator will not pass it to fit.

  • None: metadata is not requested, and the meta-estimator will raise an error if the user provides it.

  • str: metadata should be passed to the meta-estimator with this given alias instead of the original name.

The default (sklearn.utils.metadata_routing.UNCHANGED) retains the existing request. This allows you to change the request for some parameters and not others.

Added in version 1.3.

Parameters:
  • sample_weight (str, True, False, or None, default=sklearn.utils.metadata_routing.UNCHANGED) – Metadata routing for sample_weight parameter in fit.

  • self (BaggingRegressor)

Returns:

self (object) – The updated object.

Return type:

BaggingRegressor

set_score_request(*, sample_weight='$UNCHANGED$')

Configure whether metadata should be requested to be passed to the score method.

Note that this method is only relevant when this estimator is used as a sub-estimator within a meta-estimator and metadata routing is enabled with enable_metadata_routing=True (see sklearn.set_config()). Please check the User Guide on how the routing mechanism works.

The options for each parameter are:

  • True: metadata is requested, and passed to score if provided. The request is ignored if metadata is not provided.

  • False: metadata is not requested and the meta-estimator will not pass it to score.

  • None: metadata is not requested, and the meta-estimator will raise an error if the user provides it.

  • str: metadata should be passed to the meta-estimator with this given alias instead of the original name.

The default (sklearn.utils.metadata_routing.UNCHANGED) retains the existing request. This allows you to change the request for some parameters and not others.

Added in version 1.3.

Parameters:
  • sample_weight (str, True, False, or None, default=sklearn.utils.metadata_routing.UNCHANGED) – Metadata routing for sample_weight parameter in score.

  • self (BaggingRegressor)

Returns:

self (object) – The updated object.

Return type:

BaggingRegressor

class endgame.ensemble.AdaBoostClassifier(base_estimator=None, n_estimators=50, learning_rate=1.0, algorithm='SAMME.R', random_state=None)[source]

Bases: BaseEstimator, ClassifierMixin

AdaBoost classifier (SAMME / SAMME.R).

Parameters:
  • base_estimator (estimator, optional) – Base learner. Default: DecisionTreeClassifier(max_depth=1) (stump).

  • n_estimators (int, default=50) – Maximum number of boosting rounds.

  • learning_rate (float, default=1.0) – Shrinkage applied to each estimator’s weight. Lower values typically need more estimators to reach the same training error and often generalize better.

  • algorithm ({'SAMME', 'SAMME.R'}, default='SAMME.R') –

    • 'SAMME': discrete AdaBoost using class labels.

    • 'SAMME.R': real AdaBoost using class probabilities (requires predict_proba).

  • random_state (int or None, default=None)
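For orientation, one boosting round of discrete SAMME, as described in the published algorithm, can be sketched as below. This is not taken from the `endgame` source; the function name `samme_round` is hypothetical, and the sketch assumes the weighted error lies strictly between 0 and 1 - 1/K.

```python
import math


def samme_round(sample_weights, y_true, y_pred, n_classes, learning_rate=1.0):
    """One discrete SAMME update: returns (estimator_weight, error, new_sample_weights)."""
    total = sum(sample_weights)
    miss = [int(t != p) for t, p in zip(y_true, y_pred)]
    error = sum(w * m for w, m in zip(sample_weights, miss)) / total
    # The log(K - 1) term is what makes SAMME usable with more than two classes.
    alpha = learning_rate * (math.log((1.0 - error) / error) + math.log(n_classes - 1.0))
    # Only misclassified samples are up-weighted; weights are then renormalized.
    boosted = [w * math.exp(alpha * m) for w, m in zip(sample_weights, miss)]
    norm = sum(boosted)
    return alpha, error, [w / norm for w in boosted]


weights = [0.25, 0.25, 0.25, 0.25]
alpha, error, new_weights = samme_round(
    weights, y_true=[0, 1, 1, 0], y_pred=[0, 1, 0, 0], n_classes=2
)
print(alpha, error, new_weights)
```

A smaller learning_rate scales `alpha` down, so each weak learner shifts the sample weights less and more rounds are needed.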

estimators_

Fitted weak learners.

Type:

list of estimator

estimator_weights_

Weight of each estimator (SAMME only).

Type:

ndarray

estimator_errors_

Weighted error of each estimator.

Type:

ndarray

classes_

Type:

ndarray

n_classes_

Type:

int

feature_importances_

Sum of feature importances weighted by estimator weight.

Type:

ndarray

fit(X, y, sample_weight=None)[source]

predict(X)[source]

predict_proba(X)[source]

property feature_importances_

set_fit_request(*, sample_weight='$UNCHANGED$')

Configure whether metadata should be requested to be passed to the fit method.

Note that this method is only relevant when this estimator is used as a sub-estimator within a meta-estimator and metadata routing is enabled with enable_metadata_routing=True (see sklearn.set_config()). Please check the User Guide on how the routing mechanism works.

The options for each parameter are:

  • True: metadata is requested, and passed to fit if provided. The request is ignored if metadata is not provided.

  • False: metadata is not requested and the meta-estimator will not pass it to fit.

  • None: metadata is not requested, and the meta-estimator will raise an error if the user provides it.

  • str: metadata should be passed to the meta-estimator with this given alias instead of the original name.

The default (sklearn.utils.metadata_routing.UNCHANGED) retains the existing request. This allows you to change the request for some parameters and not others.

Added in version 1.3.

Parameters:
  • sample_weight (str, True, False, or None, default=sklearn.utils.metadata_routing.UNCHANGED) – Metadata routing for sample_weight parameter in fit.

  • self (AdaBoostClassifier)

Returns:

self (object) – The updated object.

Return type:

AdaBoostClassifier

set_score_request(*, sample_weight='$UNCHANGED$')

Configure whether metadata should be requested to be passed to the score method.

Note that this method is only relevant when this estimator is used as a sub-estimator within a meta-estimator and metadata routing is enabled with enable_metadata_routing=True (see sklearn.set_config()). Please check the User Guide on how the routing mechanism works.

The options for each parameter are:

  • True: metadata is requested, and passed to score if provided. The request is ignored if metadata is not provided.

  • False: metadata is not requested and the meta-estimator will not pass it to score.

  • None: metadata is not requested, and the meta-estimator will raise an error if the user provides it.

  • str: metadata should be passed to the meta-estimator with this given alias instead of the original name.

The default (sklearn.utils.metadata_routing.UNCHANGED) retains the existing request. This allows you to change the request for some parameters and not others.

Added in version 1.3.

Parameters:
  • sample_weight (str, True, False, or None, default=sklearn.utils.metadata_routing.UNCHANGED) – Metadata routing for sample_weight parameter in score.

  • self (AdaBoostClassifier)

Returns:

self (object) – The updated object.

Return type:

AdaBoostClassifier

class endgame.ensemble.AdaBoostRegressor(base_estimator=None, n_estimators=50, learning_rate=1.0, loss='linear', random_state=None)[source]

Bases: BaseEstimator, RegressorMixin

AdaBoost.R2 regressor.

Parameters:
  • base_estimator (estimator, optional) – Default: DecisionTreeRegressor(max_depth=3).

  • n_estimators (int, default=50)

  • learning_rate (float, default=1.0)

  • loss ({'linear', 'square', 'exponential'}, default='linear') – Loss function for computing sample weights.

  • random_state (int or None, default=None)
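The three loss options correspond to the per-sample losses in Drucker's AdaBoost.R2. The sketch below follows that published description with learning_rate fixed at 1; `r2_losses` and `r2_reweight` are hypothetical names, and whether `endgame` implements exactly these steps is an assumption.

```python
import math


def r2_losses(y_true, y_pred, loss="linear"):
    """Per-sample loss in [0, 1], scaled by the largest absolute error."""
    abs_err = [abs(t - p) for t, p in zip(y_true, y_pred)]
    worst = max(abs_err)
    if worst == 0.0:
        return [0.0] * len(abs_err)
    scaled = [e / worst for e in abs_err]
    if loss == "linear":
        return scaled
    if loss == "square":
        return [e ** 2 for e in scaled]
    if loss == "exponential":
        return [1.0 - math.exp(-e) for e in scaled]
    raise ValueError(f"unknown loss: {loss!r}")


def r2_reweight(sample_weights, losses):
    """Return (beta, new_weights); well-predicted samples are down-weighted."""
    total = sum(sample_weights)
    avg_loss = sum(w * l for w, l in zip(sample_weights, losses)) / total
    # AdaBoost.R2 stops boosting once avg_loss reaches 0.5; not handled here.
    beta = avg_loss / (1.0 - avg_loss)
    updated = [w * beta ** (1.0 - l) for w, l in zip(sample_weights, losses)]
    norm = sum(updated)
    return beta, [w / norm for w in updated]


y_true = [1.0, 2.0, 3.0, 4.0]
y_pred = [1.0, 2.5, 3.0, 2.0]
losses = r2_losses(y_true, y_pred, loss="linear")
beta, new_weights = r2_reweight([0.25] * 4, losses)
print(losses, beta, new_weights)
```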

estimators_

Type:

list of estimator

estimator_weights_

Type:

ndarray

estimator_errors_

Type:

ndarray

feature_importances_

Type:

ndarray

fit(X, y, sample_weight=None)[source]

predict(X)[source]

property feature_importances_

set_fit_request(*, sample_weight='$UNCHANGED$')

Configure whether metadata should be requested to be passed to the fit method.

Note that this method is only relevant when this estimator is used as a sub-estimator within a meta-estimator and metadata routing is enabled with enable_metadata_routing=True (see sklearn.set_config()). Please check the User Guide on how the routing mechanism works.

The options for each parameter are:

  • True: metadata is requested, and passed to fit if provided. The request is ignored if metadata is not provided.

  • False: metadata is not requested and the meta-estimator will not pass it to fit.

  • None: metadata is not requested, and the meta-estimator will raise an error if the user provides it.

  • str: metadata should be passed to the meta-estimator with this given alias instead of the original name.

The default (sklearn.utils.metadata_routing.UNCHANGED) retains the existing request. This allows you to change the request for some parameters and not others.

Added in version 1.3.

Parameters:
  • sample_weight (str, True, False, or None, default=sklearn.utils.metadata_routing.UNCHANGED) – Metadata routing for sample_weight parameter in fit.

  • self (AdaBoostRegressor)

Returns:

self (object) – The updated object.

Return type:

AdaBoostRegressor

set_score_request(*, sample_weight='$UNCHANGED$')

Configure whether metadata should be requested to be passed to the score method.

Note that this method is only relevant when this estimator is used as a sub-estimator within a meta-estimator and metadata routing is enabled with enable_metadata_routing=True (see sklearn.set_config()). Please check the User Guide on how the routing mechanism works.

The options for each parameter are:

  • True: metadata is requested, and passed to score if provided. The request is ignored if metadata is not provided.

  • False: metadata is not requested and the meta-estimator will not pass it to score.

  • None: metadata is not requested, and the meta-estimator will raise an error if the user provides it.

  • str: metadata should be passed to the meta-estimator with this given alias instead of the original name.

The default (sklearn.utils.metadata_routing.UNCHANGED) retains the existing request. This allows you to change the request for some parameters and not others.

Added in version 1.3.

Parameters:
  • sample_weight (str, True, False, or None, default=sklearn.utils.metadata_routing.UNCHANGED) – Metadata routing for sample_weight parameter in score.

  • self (AdaBoostRegressor)

Returns:

self (object) – The updated object.

Return type:

AdaBoostRegressor

class endgame.ensemble.StackingEnsemble(base_estimators=None, meta_estimator=None, cv=5, passthrough=False, use_proba=True, stack_method='auto', random_state=None, verbose=False)[source]

Bases: BaseEnsemble

Multi-level stacking with out-of-fold prediction handling.

  • Level 1: diverse base models (GBDTs, NNs, etc.).

  • Level 2: meta-learner (typically Ridge or linear regression).

The meta-learner is trained on out-of-fold predictions from Level 1, so it never sees a prediction that a base model made on its own training rows. This reduces overfitting.
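Out-of-fold (OOF) prediction can be sketched with a toy base model and a hand-rolled K-fold split. Everything here is illustrative: `kfold_indices`, `MeanModel`, and `out_of_fold` are hypothetical names, the folds are contiguous and unshuffled, and the real class may split and fit differently.

```python
def kfold_indices(n_samples, n_splits):
    """Yield (train_idx, test_idx) pairs over contiguous, unshuffled folds."""
    fold_sizes = [n_samples // n_splits + (1 if i < n_samples % n_splits else 0)
                  for i in range(n_splits)]
    start = 0
    for size in fold_sizes:
        test = list(range(start, start + size))
        train = [i for i in range(n_samples) if i < start or i >= start + size]
        yield train, test
        start += size


class MeanModel:
    """Hypothetical stand-in base model: always predicts the training mean."""

    def fit(self, X, y):
        self.mean_ = sum(y) / len(y)
        return self

    def predict(self, X):
        return [self.mean_] * len(X)


def out_of_fold(model_factory, X, y, n_splits=5):
    """Predict every sample with a model that was fitted without that sample."""
    oof = [None] * len(X)
    for train, test in kfold_indices(len(X), n_splits):
        model = model_factory().fit([X[i] for i in train], [y[i] for i in train])
        for i, pred in zip(test, model.predict([X[i] for i in test])):
            oof[i] = pred
    return oof


X = [[float(i)] for i in range(6)]
y = [0.0, 1.0, 2.0, 3.0, 4.0, 5.0]
oof = out_of_fold(MeanModel, X, y, n_splits=3)
print(oof)
```

Running this once per base model yields one OOF column per model; those columns form the training matrix for the Level 2 learner.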

Parameters:
  • base_estimators (List[estimator]) – Level 1 models.

  • meta_estimator (estimator, optional) – Level 2 model. Default: Ridge for regression, LogisticRegression for classification.

  • cv (int or CV splitter, default=5) – Cross-validation strategy for OOF predictions.

  • passthrough (bool, default=False) – Whether to include original features in Level 2.

  • use_proba (bool, default=True) – Use predict_proba for classification (if available).

  • stack_method ({'auto', 'predict', 'predict_proba'}, default='auto') – Method for stacking.

  • random_state (int, optional) – Random seed.

  • verbose (bool, default=False) – Enable verbose output.
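The effect of passthrough on the Level 2 input can be sketched as a simple column concatenation. The helper `build_meta_features` is hypothetical, and placing the original features after the base-model outputs is an assumption about column order.

```python
def build_meta_features(oof_columns, X=None, passthrough=False):
    """Stack per-model OOF columns row-wise; optionally append the raw features."""
    n_samples = len(oof_columns[0])
    rows = [[col[i] for col in oof_columns] for i in range(n_samples)]
    if passthrough:
        rows = [meta + list(raw) for meta, raw in zip(rows, X)]
    return rows


oof_a = [0.2, 0.7]            # OOF predictions of the first base model
oof_b = [0.3, 0.9]            # OOF predictions of the second base model
X = [[5.0, 1.0], [6.0, 2.0]]  # original features

print(build_meta_features([oof_a, oof_b]))
print(build_meta_features([oof_a, oof_b], X, passthrough=True))
```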

base_estimators_

Fitted Level 1 models.

Type:

List[estimator]

meta_estimator_

Fitted Level 2 model.

Type:

estimator

oof_predictions_

Out-of-fold predictions used for meta-learner training.

Type:

ndarray

Examples

>>> from endgame.ensemble import StackingEnsemble
>>> base_models = [LGBMWrapper(), XGBWrapper(), CatBoostWrapper()]
>>> stacker = StackingEnsemble(base_estimators=base_models)
>>> stacker.fit(X_train, y_train)
>>> predictions = stacker.predict(X_test)

fit(X, y, sample_weight=None, **fit_params)[source]

Fit the stacking ensemble.

Parameters:
  • X (array-like of shape (n_samples, n_features)) – Training data.

  • y (array-like of shape (n_samples,)) – Target values.

  • sample_weight (array-like, optional) – Sample weights.

  • **fit_params – Additional parameters.

Return type:

StackingEnsemble

Returns:

self

predict(X)[source]

Predict using the stacking ensemble.

Parameters:

X (array-like of shape (n_samples, n_features)) – Samples to predict.

Return type:

ndarray

Returns:

ndarray – Predictions.

predict_proba(X)[source]

Predict class probabilities.

Parameters:

X (array-like of shape (n_samples, n_features)) – Samples to predict.

Return type:

ndarray

Returns:

ndarray of shape (n_samples, n_classes) – Class probabilities.

score(X, y, sample_weight=None)[source]

Return the score of the predictions on the given test data and targets.

For classification, this is the accuracy score. For regression, this is the R^2 score.

Parameters:
  • X (array-like of shape (n_samples, n_features)) – Test samples.

  • y (array-like of shape (n_samples,)) – True labels for classification, true values for regression.

  • sample_weight (array-like of shape (n_samples,), optional) – Sample weights.

Return type:

float

Returns:

float – Score of the predictions.
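The two metrics named above can be written out in plain Python to make the role of sample_weight explicit. The functions `accuracy` and `r2` are illustrative stand-ins using the standard definitions, not the library's implementation.

```python
def accuracy(y_true, y_pred, sample_weight=None):
    """Weighted fraction of exactly matching labels."""
    weights = sample_weight or [1.0] * len(y_true)
    hit_weight = sum(w for w, t, p in zip(weights, y_true, y_pred) if t == p)
    return hit_weight / sum(weights)


def r2(y_true, y_pred, sample_weight=None):
    """Weighted coefficient of determination: 1 - SS_res / SS_tot."""
    weights = sample_weight or [1.0] * len(y_true)
    total = sum(weights)
    mean = sum(w * t for w, t in zip(weights, y_true)) / total
    ss_res = sum(w * (t - p) ** 2 for w, t, p in zip(weights, y_true, y_pred))
    ss_tot = sum(w * (t - mean) ** 2 for w, t in zip(weights, y_true))
    return 1.0 - ss_res / ss_tot


print(accuracy([0, 1, 1, 0], [0, 1, 0, 0]))
print(r2([1.0, 2.0, 3.0], [2.0, 2.0, 2.0]))  # predicting the mean everywhere
```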

set_fit_request(*, sample_weight='$UNCHANGED$')

Configure whether metadata should be requested to be passed to the fit method.

Note that this method is only relevant when this estimator is used as a sub-estimator within a meta-estimator and metadata routing is enabled with enable_metadata_routing=True (see sklearn.set_config()). Please check the User Guide on how the routing mechanism works.

The options for each parameter are:

  • True: metadata is requested, and passed to fit if provided. The request is ignored if metadata is not provided.

  • False: metadata is not requested and the meta-estimator will not pass it to fit.

  • None: metadata is not requested, and the meta-estimator will raise an error if the user provides it.

  • str: metadata should be passed to the meta-estimator with this given alias instead of the original name.

The default (sklearn.utils.metadata_routing.UNCHANGED) retains the existing request. This allows you to change the request for some parameters and not others.

Added in version 1.3.

Parameters:
  • sample_weight (str, True, False, or None, default=sklearn.utils.metadata_routing.UNCHANGED) – Metadata routing for sample_weight parameter in fit.

  • self (StackingEnsemble)

Returns:

self (object) – The updated object.

Return type:

StackingEnsemble

set_score_request(*, sample_weight='$UNCHANGED$')

Configure whether metadata should be requested to be passed to the score method.

Note that this method is only relevant when this estimator is used as a sub-estimator within a meta-estimator and metadata routing is enabled with enable_metadata_routing=True (see sklearn.set_config()). Please check the User Guide on how the routing mechanism works.

The options for each parameter are:

  • True: metadata is requested, and passed to score if provided. The request is ignored if metadata is not provided.

  • False: metadata is not requested and the meta-estimator will not pass it to score.

  • None: metadata is not requested, and the meta-estimator will raise an error if the user provides it.

  • str: metadata should be passed to the meta-estimator with this given alias instead of the original name.

The default (sklearn.utils.metadata_routing.UNCHANGED) retains the existing request. This allows you to change the request for some parameters and not others.

Added in version 1.3.

Parameters:
  • sample_weight (str, True, False, or None, default=sklearn.utils.metadata_routing.UNCHANGED) – Metadata routing for sample_weight parameter in score.

  • self (StackingEnsemble)

Returns:

self (object) – The updated object.

Return type:

StackingEnsemble

class endgame.ensemble.BlendingEnsemble(base_estimators=None, meta_estimator=None, blend_fraction=0.2, use_proba=True, passthrough=False, cv=None, random_state=None, verbose=False)[source]

Bases: BaseEnsemble, ClassifierMixin

Blending ensemble that uses a hold-out set for meta-learner training.

Unlike stacking, which uses cross-validation, blending uses a hold-out portion of the training data to generate meta-features for the second-level learner.
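The hold-out split at the heart of blending can be sketched with the standard library alone. The helper `blend_split` is hypothetical; shuffling before the split and rounding the hold-out size are assumptions about the behavior behind blend_fraction and random_state.

```python
import random


def blend_split(n_samples, blend_fraction=0.2, random_state=None):
    """Shuffle indices and reserve `blend_fraction` of them for the meta-learner."""
    indices = list(range(n_samples))
    random.Random(random_state).shuffle(indices)
    n_blend = max(1, int(round(blend_fraction * n_samples)))
    return indices[n_blend:], indices[:n_blend]   # (base-model rows, hold-out rows)


train_idx, blend_idx = blend_split(n_samples=50, blend_fraction=0.2, random_state=0)
# Base models would be fitted on train_idx; their predictions on blend_idx
# become the meta-learner's training features.
print(len(train_idx), len(blend_idx))
```

Compared with out-of-fold stacking, each base model is fitted only once, at the cost of training the meta-learner on a smaller share of the data.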

Parameters:
  • base_estimators (List[estimator]) – Level 1 models.

  • meta_estimator (estimator, optional) – Level 2 model. Default: LogisticRegression for classification.

  • blend_fraction (float, default=0.2) – Fraction of training data to use for blending (meta-learner training).

  • use_proba (bool, default=True) – Use predict_proba for classification (if available).

  • passthrough (bool, default=False) – Whether to include original features in Level 2.

  • cv (int, optional) – Ignored. For API compatibility with StackingEnsemble.

  • random_state (int, optional) – Random seed.

  • verbose (bool, default=False) – Enable verbose output.

base_estimators_

Fitted Level 1 models.

Type:

List[estimator]

meta_estimator_

Fitted Level 2 model.

Type:

estimator

classes_

Unique class labels (for classification).

Type:

ndarray

Examples

>>> from sklearn.ensemble import GradientBoostingClassifier, RandomForestClassifier
>>> from endgame.ensemble import BlendingEnsemble
>>> base_models = [RandomForestClassifier(), GradientBoostingClassifier()]
>>> blender = BlendingEnsemble(base_estimators=base_models)
>>> blender.fit(X_train, y_train)
>>> predictions = blender.predict(X_test)

fit(X, y, sample_weight=None, **fit_params)[source]

Fit the blending ensemble.

Parameters:
  • X (array-like of shape (n_samples, n_features)) – Training data.

  • y (array-like of shape (n_samples,)) – Target values.

  • sample_weight (array-like, optional) – Sample weights.

Return type:

BlendingEnsemble

Returns:

self

predict(X)[source]

Predict using the blending ensemble.

Parameters:

X (array-like of shape (n_samples, n_features)) – Samples to predict.

Return type:

ndarray

Returns:

ndarray – Predictions.

predict_proba(X)[source]

Predict class probabilities.

Parameters:

X (array-like of shape (n_samples, n_features)) – Samples to predict.

Return type:

ndarray

Returns:

ndarray of shape (n_samples, n_classes) – Class probabilities.

score(X, y, sample_weight=None)[source]

Return accuracy score on the given data.

Parameters:
  • X (array-like of shape (n_samples, n_features)) – Test samples.

  • y (array-like of shape (n_samples,)) – True labels.

  • sample_weight (array-like, optional) – Sample weights.

Return type:

float

Returns:

float – Accuracy score.

set_fit_request(*, sample_weight='$UNCHANGED$')

Configure whether metadata should be requested to be passed to the fit method.

Note that this method is only relevant when this estimator is used as a sub-estimator within a meta-estimator and metadata routing is enabled with enable_metadata_routing=True (see sklearn.set_config()). Please check the User Guide on how the routing mechanism works.

The options for each parameter are:

  • True: metadata is requested, and passed to fit if provided. The request is ignored if metadata is not provided.

  • False: metadata is not requested and the meta-estimator will not pass it to fit.

  • None: metadata is not requested, and the meta-estimator will raise an error if the user provides it.

  • str: metadata should be passed to the meta-estimator with this given alias instead of the original name.

The default (sklearn.utils.metadata_routing.UNCHANGED) retains the existing request. This allows you to change the request for some parameters and not others.

Added in version 1.3.

Parameters:
  • sample_weight (str, True, False, or None, default=sklearn.utils.metadata_routing.UNCHANGED) – Metadata routing for sample_weight parameter in fit.

  • self (BlendingEnsemble)

Returns:

self (object) – The updated object.

Return type:

BlendingEnsemble

set_score_request(*, sample_weight='$UNCHANGED$')

Configure whether metadata should be requested to be passed to the score method.

Note that this method is only relevant when this estimator is used as a sub-estimator within a meta-estimator and metadata routing is enabled with enable_metadata_routing=True (see sklearn.set_config()). Please check the User Guide on how the routing mechanism works.

The options for each parameter are:

  • True: metadata is requested, and passed to score if provided. The request is ignored if metadata is not provided.

  • False: metadata is not requested and the meta-estimator will not pass it to score.

  • None: metadata is not requested, and the meta-estimator will raise an error if the user provides it.

  • str: metadata should be passed to the meta-estimator with this given alias instead of the original name.

The default (sklearn.utils.metadata_routing.UNCHANGED) retains the existing request. This allows you to change the request for some parameters and not others.

Added in version 1.3.

Parameters:
  • sample_weight (str, True, False, or None, default=sklearn.utils.metadata_routing.UNCHANGED) – Metadata routing for sample_weight parameter in score.

  • self (BlendingEnsemble)

Returns:

self (object) – The updated object.

Return type:

BlendingEnsemble

class endgame.ensemble.OptimizedBlender(metric='roc_auc', n_trials=100, weight_bounds=(0, 1), normalize=True, maximize=True, random_state=None, verbose=False)[source]

Bases: BaseEnsemble

Optuna-powered blend weight optimization.

Uses Bayesian optimization to find optimal weights for combining model predictions.

Parameters:
  • metric (str or callable, default='roc_auc') – Metric to optimize, e.g. 'roc_auc', 'rmse', or 'mae'.

  • n_trials (int, default=100) – Number of optimization trials.

  • weight_bounds (Tuple[float, float], default=(0, 1)) – Bounds for individual model weights.

  • normalize (bool, default=True) – Whether weights must sum to 1.

  • maximize (bool, default=True) – Whether to maximize or minimize the metric.

  • random_state (int, optional) – Random seed.

  • verbose (bool, default=False) – Enable verbose output.

weights_

Optimized model weights.

Type:

Dict[int, float]

best_score_

Best score achieved.

Type:

float

study_

Optuna study object for further analysis.

Type:

optuna.Study

Examples

>>> blender = OptimizedBlender(metric='roc_auc', n_trials=100)
>>> blender.fit(oof_predictions, y_train)
>>> final_pred = blender.predict(test_predictions)
fit(predictions, y_true)[source]

Optimize blend weights using Optuna.

Parameters:
  • predictions (List of arrays) – Out-of-fold predictions from each model.

  • y_true (array-like) – True target values.

Return type:

OptimizedBlender

Returns:

self

predict(predictions)[source]

Apply optimized weights.

Parameters:

predictions (List of arrays) – Predictions from each model.

Return type:

ndarray

Returns:

ndarray – Blended prediction.
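The search itself is delegated to Optuna, but the objective being searched is simple to state: score a weighted sum of the per-model predictions. The sketch below illustrates that objective with plain random search as a stand-in; it is not Bayesian optimization and does not call Optuna. The helper names (blend, neg_mse, random_search_weights) are hypothetical and are not part of endgame.

```python
import random

def blend(predictions, weights, normalize=True):
    """Weighted sum of per-model prediction lists."""
    if normalize:
        total = sum(weights)
        weights = [w / total for w in weights]
    n_samples = len(predictions[0])
    return [
        sum(w * preds[i] for w, preds in zip(weights, predictions))
        for i in range(n_samples)
    ]

def neg_mse(y_true, y_pred):
    """Negative mean squared error, so that higher is better."""
    return -sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)

def random_search_weights(predictions, y_true, metric, n_trials=200,
                          weight_bounds=(0.0, 1.0), seed=0):
    """Random search stand-in for the Optuna study (assumption, not the library's code)."""
    rng = random.Random(seed)
    lo, hi = weight_bounds
    best_w, best_score = None, float("-inf")
    for _ in range(n_trials):
        w = [rng.uniform(lo, hi) for _ in predictions]
        if sum(w) == 0:
            continue  # cannot normalize an all-zero weight vector
        score = metric(y_true, blend(predictions, w))
        if score > best_score:
            best_w, best_score = w, score
    total = sum(best_w)
    return [w / total for w in best_w], best_score

y_true = [1.0, 2.0, 3.0, 4.0]
oof = [
    [1.1, 2.1, 2.9, 4.2],  # model 0: close to the target
    [3.0, 3.0, 3.0, 3.0],  # model 1: constant, poor
]
weights, score = random_search_weights(oof, y_true, neg_mse)
```

With normalize=True only the ratio between weights matters, which is why the sketch rescales the best trial before returning it.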

set_fit_request(*, predictions='$UNCHANGED$', y_true='$UNCHANGED$')

Configure whether metadata should be requested to be passed to the fit method.

Note that this method is only relevant when this estimator is used as a sub-estimator within a meta-estimator and metadata routing is enabled with enable_metadata_routing=True (see sklearn.set_config()). Please check the User Guide on how the routing mechanism works.

The options for each parameter are:

  • True: metadata is requested, and passed to fit if provided. The request is ignored if metadata is not provided.

  • False: metadata is not requested and the meta-estimator will not pass it to fit.

  • None: metadata is not requested, and the meta-estimator will raise an error if the user provides it.

  • str: metadata should be passed to the meta-estimator with this given alias instead of the original name.

The default (sklearn.utils.metadata_routing.UNCHANGED) retains the existing request. This allows you to change the request for some parameters and not others.

Added in version 1.3.

Parameters:
  • predictions (str, True, False, or None, default=sklearn.utils.metadata_routing.UNCHANGED) – Metadata routing for predictions parameter in fit.

  • y_true (str, True, False, or None, default=sklearn.utils.metadata_routing.UNCHANGED) – Metadata routing for y_true parameter in fit.

  • self (OptimizedBlender)

Returns:

self (object) – The updated object.

Return type:

OptimizedBlender

set_predict_request(*, predictions='$UNCHANGED$')

Configure whether metadata should be requested to be passed to the predict method.

Note that this method is only relevant when this estimator is used as a sub-estimator within a meta-estimator and metadata routing is enabled with enable_metadata_routing=True (see sklearn.set_config()). Please check the User Guide on how the routing mechanism works.

The options for each parameter are:

  • True: metadata is requested, and passed to predict if provided. The request is ignored if metadata is not provided.

  • False: metadata is not requested and the meta-estimator will not pass it to predict.

  • None: metadata is not requested, and the meta-estimator will raise an error if the user provides it.

  • str: metadata should be passed to the meta-estimator with this given alias instead of the original name.

The default (sklearn.utils.metadata_routing.UNCHANGED) retains the existing request. This allows you to change the request for some parameters and not others.

Added in version 1.3.

Parameters:
  • predictions (str, True, False, or None, default=sklearn.utils.metadata_routing.UNCHANGED) – Metadata routing for predictions parameter in predict.

  • self (OptimizedBlender)

Returns:

self (object) – The updated object.

Return type:

OptimizedBlender

class endgame.ensemble.RankAverageBlender(method='average', normalize=True, weights=None, random_state=None, verbose=False)[source]

Bases: BaseEnsemble

Rank-based blending for submissions.

Converts each model's predictions to ranks before averaging, which makes the blend robust to differing prediction scales across models.

Parameters:
  • method (str, default='average') – Rank method: 'average', 'min', 'max', 'dense', or 'ordinal'.

  • normalize (bool, default=True) – Whether to normalize ranks to [0, 1].

  • weights (Dict[int, float], optional) – Optional model weights. If None, uniform weights.

  • random_state (int or None, default=None) – Random seed.

  • verbose (bool, default=False) – Enable verbose output.

Examples

>>> blender = RankAverageBlender()
>>> final_pred = blender.blend(test_predictions)
fit(predictions=None, y_true=None)[source]

Fit the blender (stores weights if provided).

Parameters:
  • predictions (ignored)

  • y_true (ignored)

Return type:

RankAverageBlender

Returns:

self

blend(predictions)[source]

Blend predictions using rank averaging.

Parameters:

predictions (List of arrays) – Predictions from each model.

Return type:

ndarray

Returns:

ndarray – Rank-averaged prediction.
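As a rough illustration of rank averaging, the sketch below ranks each model's predictions with the 'average' tie rule, rescales ranks to [0, 1], and takes a weighted mean. The function names are hypothetical, and the (rank - 1) / (n - 1) normalization is an assumption; the library may scale ranks differently.

```python
def average_ranks(values):
    """1-based ranks; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start
        # Extend the run while the next sorted value ties with the current one.
        while end + 1 < len(order) and values[order[end + 1]] == values[order[start]]:
            end += 1
        shared = (start + end) / 2 + 1  # mean of 1-based positions start+1..end+1
        for k in range(start, end + 1):
            ranks[order[k]] = shared
        start = end + 1
    return ranks

def rank_average(predictions, weights=None, normalize=True):
    """Weighted mean of per-model ranks (assumed normalization, see lead-in)."""
    n_models, n = len(predictions), len(predictions[0])
    weights = weights or [1.0 / n_models] * n_models
    total = sum(weights)
    blended = [0.0] * n
    for w, preds in zip(weights, predictions):
        ranks = average_ranks(preds)
        if normalize and n > 1:
            ranks = [(r - 1) / (n - 1) for r in ranks]
        for i, r in enumerate(ranks):
            blended[i] += (w / total) * r
    return blended

model_a = [0.10, 0.40, 0.35, 0.80]  # probabilities
model_b = [-3.0, 2.5, 1.0, 9.0]     # raw scores on a different scale
blended = rank_average([model_a, model_b])
```

Both models order the four samples identically, so the blend is the same as either model's normalized ranks even though their raw scales differ.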

predict(predictions)[source]

Alias for blend().

Return type:

ndarray

Parameters:

predictions (list[ndarray])

set_fit_request(*, predictions='$UNCHANGED$', y_true='$UNCHANGED$')

Configure whether metadata should be requested to be passed to the fit method.

Note that this method is only relevant when this estimator is used as a sub-estimator within a meta-estimator and metadata routing is enabled with enable_metadata_routing=True (see sklearn.set_config()). Please check the User Guide on how the routing mechanism works.

The options for each parameter are:

  • True: metadata is requested, and passed to fit if provided. The request is ignored if metadata is not provided.

  • False: metadata is not requested and the meta-estimator will not pass it to fit.

  • None: metadata is not requested, and the meta-estimator will raise an error if the user provides it.

  • str: metadata should be passed to the meta-estimator with this given alias instead of the original name.

The default (sklearn.utils.metadata_routing.UNCHANGED) retains the existing request. This allows you to change the request for some parameters and not others.

Added in version 1.3.

Parameters:
  • predictions (str, True, False, or None, default=sklearn.utils.metadata_routing.UNCHANGED) – Metadata routing for predictions parameter in fit.

  • y_true (str, True, False, or None, default=sklearn.utils.metadata_routing.UNCHANGED) – Metadata routing for y_true parameter in fit.

  • self (RankAverageBlender)

Returns:

self (object) – The updated object.

Return type:

RankAverageBlender

set_predict_request(*, predictions='$UNCHANGED$')

Configure whether metadata should be requested to be passed to the predict method.

Note that this method is only relevant when this estimator is used as a sub-estimator within a meta-estimator and metadata routing is enabled with enable_metadata_routing=True (see sklearn.set_config()). Please check the User Guide on how the routing mechanism works.

The options for each parameter are:

  • True: metadata is requested, and passed to predict if provided. The request is ignored if metadata is not provided.

  • False: metadata is not requested and the meta-estimator will not pass it to predict.

  • None: metadata is not requested, and the meta-estimator will raise an error if the user provides it.

  • str: metadata should be passed to the meta-estimator with this given alias instead of the original name.

The default (sklearn.utils.metadata_routing.UNCHANGED) retains the existing request. This allows you to change the request for some parameters and not others.

Added in version 1.3.

Parameters:
  • predictions (str, True, False, or None, default=sklearn.utils.metadata_routing.UNCHANGED) – Metadata routing for predictions parameter in predict.

  • self (RankAverageBlender)

Returns:

self (object) – The updated object.

Return type:

RankAverageBlender

class endgame.ensemble.PowerBlender(scores=None, power=2.0, higher_is_better=True, random_state=None, verbose=False)[source]

Bases: BaseEnsemble

Power-weighted blending based on individual scores.

Weights models by their validation scores raised to a power. A higher power gives more weight to the best models.

Parameters:
  • scores (List[float], optional) – Validation scores for each model.

  • power (float, default=2.0) – Power to raise scores to (higher = more aggressive weighting).

  • higher_is_better (bool, default=True) – Whether higher scores are better.

  • random_state (int or None, default=None) – Random seed.

  • verbose (bool, default=False) – Enable verbose output.

Examples

>>> scores = [0.85, 0.87, 0.86]
>>> blender = PowerBlender(scores=scores, power=3.0)
>>> final_pred = blender.predict(test_predictions)
fit(predictions=None, y_true=None, scores=None)[source]

Compute power-weighted blending weights.

Parameters:
  • predictions (ignored)

  • y_true (ignored)

  • scores (List[float], optional) – Model scores (overrides constructor scores).

Return type:

PowerBlender

Returns:

self

predict(predictions)[source]

Apply power weights.

Parameters:

predictions (List of arrays) – Predictions from each model.

Return type:

ndarray

Returns:

ndarray – Power-weighted prediction.
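To make the effect of power concrete, here is a minimal sketch of power weighting for the higher_is_better=True case only. It assumes weights proportional to score**power, normalized to sum to 1; how the library handles higher_is_better=False is not shown in this reference, so that case is left out. The function names are hypothetical.

```python
def power_weights(scores, power=2.0):
    """Weights proportional to score**power, normalized to sum to 1."""
    powered = [s ** power for s in scores]
    total = sum(powered)
    return [p / total for p in powered]

def apply_weights(predictions, weights):
    """Weighted sum of per-model prediction lists."""
    n = len(predictions[0])
    return [sum(w * p[i] for w, p in zip(weights, predictions)) for i in range(n)]

scores = [0.85, 0.87, 0.86]
w_mild = power_weights(scores, power=1.0)    # close to uniform
w_sharp = power_weights(scores, power=20.0)  # concentrates on the best model
```

When scores are as close together as in this example, a small power leaves the weights nearly uniform; a large power is needed before the best model clearly dominates.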

set_fit_request(*, predictions='$UNCHANGED$', scores='$UNCHANGED$', y_true='$UNCHANGED$')

Configure whether metadata should be requested to be passed to the fit method.

Note that this method is only relevant when this estimator is used as a sub-estimator within a meta-estimator and metadata routing is enabled with enable_metadata_routing=True (see sklearn.set_config()). Please check the User Guide on how the routing mechanism works.

The options for each parameter are:

  • True: metadata is requested, and passed to fit if provided. The request is ignored if metadata is not provided.

  • False: metadata is not requested and the meta-estimator will not pass it to fit.

  • None: metadata is not requested, and the meta-estimator will raise an error if the user provides it.

  • str: metadata should be passed to the meta-estimator with this given alias instead of the original name.

The default (sklearn.utils.metadata_routing.UNCHANGED) retains the existing request. This allows you to change the request for some parameters and not others.

Added in version 1.3.

Parameters:
  • predictions (str, True, False, or None, default=sklearn.utils.metadata_routing.UNCHANGED) – Metadata routing for predictions parameter in fit.

  • scores (str, True, False, or None, default=sklearn.utils.metadata_routing.UNCHANGED) – Metadata routing for scores parameter in fit.

  • y_true (str, True, False, or None, default=sklearn.utils.metadata_routing.UNCHANGED) – Metadata routing for y_true parameter in fit.

  • self (PowerBlender)

Returns:

self (object) – The updated object.

Return type:

PowerBlender

set_predict_request(*, predictions='$UNCHANGED$')

Configure whether metadata should be requested to be passed to the predict method.

Note that this method is only relevant when this estimator is used as a sub-estimator within a meta-estimator and metadata routing is enabled with enable_metadata_routing=True (see sklearn.set_config()). Please check the User Guide on how the routing mechanism works.

The options for each parameter are:

  • True: metadata is requested, and passed to predict if provided. The request is ignored if metadata is not provided.

  • False: metadata is not requested and the meta-estimator will not pass it to predict.

  • None: metadata is not requested, and the meta-estimator will raise an error if the user provides it.

  • str: metadata should be passed to the meta-estimator with this given alias instead of the original name.

The default (sklearn.utils.metadata_routing.UNCHANGED) retains the existing request. This allows you to change the request for some parameters and not others.

Added in version 1.3.

Parameters:
  • predictions (str, True, False, or None, default=sklearn.utils.metadata_routing.UNCHANGED) – Metadata routing for predictions parameter in predict.

  • self (PowerBlender)

Returns:

self (object) – The updated object.

Return type:

PowerBlender

class endgame.ensemble.HillClimbingEnsemble(metric='roc_auc', n_iterations=100, early_stopping=20, maximize=True, init_weights='best_single', random_state=None, verbose=False)[source]

Bases: BaseEnsemble

Forward ensemble selection with replacement.

Iteratively adds the model that most improves the validation metric. Useful when the metric is non-differentiable (e.g. F1, MAP@K).

Algorithm:

  1. Start with an empty ensemble.

  2. For each iteration:

     1. For each model in the pool, compute the metric that would result from adding it to the current ensemble.

     2. Add the model that provides the best improvement.

  3. Allow models to be selected repeatedly; repeats act as weights in the averaged ensemble.
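The steps above can be sketched as a greedy loop over a running sum of predictions. This is a simplified illustration that assumes a higher-is-better metric and always starts from an empty ensemble; it does not model init_weights or random tie-breaking, and the names hill_climb and neg_mse are hypothetical.

```python
def neg_mse(y_true, y_pred):
    """Negative mean squared error, so that higher is better."""
    return -sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)

def hill_climb(predictions, y_true, metric, n_iterations=100, early_stopping=20):
    """Greedy forward selection with replacement."""
    n_models, n = len(predictions), len(y_true)
    running_sum = [0.0] * n  # sum of predictions of the selected models
    selected = []            # selection history; indices may repeat
    best_score, best_len, stale = float("-inf"), 0, 0
    for _ in range(n_iterations):
        round_best, round_score = None, float("-inf")
        k = len(selected) + 1
        for m in range(n_models):
            # Score the ensemble average as if model m were added.
            candidate = [(running_sum[i] + predictions[m][i]) / k for i in range(n)]
            score = metric(y_true, candidate)
            if score > round_score:
                round_best, round_score = m, score
        selected.append(round_best)
        running_sum = [running_sum[i] + predictions[round_best][i] for i in range(n)]
        if round_score > best_score:
            best_score, best_len, stale = round_score, len(selected), 0
        else:
            stale += 1
            if stale >= early_stopping:
                break
    kept = selected[:best_len]  # drop selections made after the best score
    weights = {m: kept.count(m) / len(kept) for m in sorted(set(kept))}
    return weights, best_score, kept

y_true = [0.0, 1.0, 2.0, 3.0]
oof = [
    [0.5, 1.5, 2.5, 3.5],    # model 0: biased high
    [-0.5, 0.5, 1.5, 2.5],   # model 1: biased low
    [3.0, 3.0, 3.0, 3.0],    # model 2: constant
]
weights, best_score, history = hill_climb(oof, y_true, neg_mse)
```

Because selection is with replacement, the final weight of a model is simply the fraction of times it was picked, which is how repeated picks turn into a weighted average.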

Parameters:
  • metric (str or callable, default='roc_auc') – Metric to optimize: 'roc_auc', 'log_loss', 'f1', 'accuracy', 'rmse', 'mae', 'r2', or a custom callable(y_true, y_pred).

  • n_iterations (int, default=100) – Number of hill climbing iterations.

  • early_stopping (int, default=20) – Stop if no improvement for this many iterations.

  • maximize (bool, default=True) – Whether to maximize or minimize the metric.

  • init_weights (str, default='best_single') – Initial weight strategy: 'best_single', 'uniform', or 'none'.

  • random_state (int, optional) – Random seed for tie-breaking.

  • verbose (bool, default=False) – Enable verbose output.

weights_

Optimized model weights (by model index).

Type:

Dict[int, float]

best_score_

Best ensemble score achieved.

Type:

float

selection_history_

Order in which models were selected.

Type:

List[int]

Examples

>>> from endgame.ensemble import HillClimbingEnsemble
>>> ensemble = HillClimbingEnsemble(metric='roc_auc', n_iterations=100)
>>> ensemble.fit(oof_predictions, y_train)
>>> print(f"Weights: {ensemble.weights_}")
>>> test_pred = ensemble.predict(test_predictions)
fit(predictions, y_true)[source]

Find optimal ensemble weights via hill climbing.

Parameters:
  • predictions (List of shape (n_models, n_samples, ...)) – Out-of-fold predictions from each model.

  • y_true (array-like) – True target values.

Return type:

HillClimbingEnsemble

Returns:

self

predict(predictions)[source]

Apply learned weights to generate ensemble prediction.

Parameters:

predictions (List of shape (n_models, n_samples, ...)) – Predictions from each model.

Return type:

ndarray

Returns:

ndarray – Weighted ensemble prediction.

get_result()[source]

Get ensemble result summary.

Return type:

EnsembleResult

Returns:

EnsembleResult – Result containing weights, score, and selected models.

set_fit_request(*, predictions='$UNCHANGED$', y_true='$UNCHANGED$')

Configure whether metadata should be requested to be passed to the fit method.

Note that this method is only relevant when this estimator is used as a sub-estimator within a meta-estimator and metadata routing is enabled with enable_metadata_routing=True (see sklearn.set_config()). Please check the User Guide on how the routing mechanism works.

The options for each parameter are:

  • True: metadata is requested, and passed to fit if provided. The request is ignored if metadata is not provided.

  • False: metadata is not requested and the meta-estimator will not pass it to fit.

  • None: metadata is not requested, and the meta-estimator will raise an error if the user provides it.

  • str: metadata should be passed to the meta-estimator with this given alias instead of the original name.

The default (sklearn.utils.metadata_routing.UNCHANGED) retains the existing request. This allows you to change the request for some parameters and not others.

Added in version 1.3.

Parameters:
  • predictions (str, True, False, or None, default=sklearn.utils.metadata_routing.UNCHANGED) – Metadata routing for predictions parameter in fit.

  • y_true (str, True, False, or None, default=sklearn.utils.metadata_routing.UNCHANGED) – Metadata routing for y_true parameter in fit.

  • self (HillClimbingEnsemble)

Returns:

self (object) – The updated object.

Return type:

HillClimbingEnsemble

set_predict_request(*, predictions='$UNCHANGED$')

Configure whether metadata should be requested to be passed to the predict method.

Note that this method is only relevant when this estimator is used as a sub-estimator within a meta-estimator and metadata routing is enabled with enable_metadata_routing=True (see sklearn.set_config()). Please check the User Guide on how the routing mechanism works.

The options for each parameter are:

  • True: metadata is requested, and passed to predict if provided. The request is ignored if metadata is not provided.

  • False: metadata is not requested and the meta-estimator will not pass it to predict.

  • None: metadata is not requested, and the meta-estimator will raise an error if the user provides it.

  • str: metadata should be passed to the meta-estimator with this given alias instead of the original name.

The default (sklearn.utils.metadata_routing.UNCHANGED) retains the existing request. This allows you to change the request for some parameters and not others.

Added in version 1.3.

Parameters:
  • predictions (str, True, False, or None, default=sklearn.utils.metadata_routing.UNCHANGED) – Metadata routing for predictions parameter in predict.

  • self (HillClimbingEnsemble)

Returns:

self (object) – The updated object.

Return type:

HillClimbingEnsemble

class endgame.ensemble.SuperLearner(base_estimators, meta_learner='nnls', cv=5, use_proba=True, include_original_features=False, random_state=None, verbose=False)[source]

Bases: BaseEstimator

Cross-validated Super Learner ensemble.

Parameters:
  • base_estimators (list of (str, estimator) tuples) – Named base learners to combine.

  • meta_learner ({'nnls', 'ridge', 'best'} or estimator, default='nnls') – How to combine OOF predictions:

    • 'nnls': Non-negative least squares (convex combination).

    • 'ridge': Ridge regression on OOF predictions.

    • 'best': Use the single best base learner (no blending).

    • An sklearn estimator for custom meta-learning.

  • cv (int or CV splitter, default=5) – Cross-validation strategy for OOF predictions.

  • use_proba (bool, default=True) – Use predict_proba for classifiers (if available).

  • include_original_features (bool, default=False) – Pass original features to the meta-learner alongside OOF predictions.

  • random_state (int or None, default=None)

  • verbose (bool, default=False)

coef_

Meta-learner weights (one per base estimator).

Type:

ndarray

base_estimators_

Fitted base estimators (on full training data).

Type:

list of estimator

oof_predictions_

Out-of-fold predictions used for meta-learning.

Type:

ndarray

cv_scores_

Per-estimator cross-validated score.

Type:

dict of {name: float}

is_classifier_

Whether the fitted task is classification.

Type:

bool

References

van der Laan, M.J., Polley, E.C. & Hubbard, A.E. (2007). Super Learner. Statistical Applications in Genetics and Molecular Biology, 6(1).

fit(X, y, sample_weight=None)[source]

Fit the Super Learner.

  1. Generate OOF predictions for each base estimator.

  2. Solve for optimal combination weights.

  3. Refit all base estimators on the full training set.
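The three steps can be illustrated with two toy regressors on a single feature. This is a dependency-free sketch under stated assumptions: MeanModel, LineModel, and fit_super_learner are hypothetical names, folds are assigned by index modulo cv, and a grid search over one convex weight stands in for the NNLS solve (so it only handles two base learners).

```python
class MeanModel:
    """Predicts the training mean for every sample."""
    def fit(self, X, y):
        self.mean_ = sum(y) / len(y)
        return self
    def predict(self, X):
        return [self.mean_ for _ in X]

class LineModel:
    """Least-squares line on a single feature."""
    def fit(self, X, y):
        n = len(X)
        mx, my = sum(X) / n, sum(y) / n
        sxx = sum((x - mx) ** 2 for x in X)
        sxy = sum((x - mx) * (t - my) for x, t in zip(X, y))
        self.slope_ = sxy / sxx if sxx else 0.0
        self.intercept_ = my - self.slope_ * mx
        return self
    def predict(self, X):
        return [self.intercept_ + self.slope_ * x for x in X]

def oof_predictions(make_model, X, y, cv=5):
    """Each sample is predicted by a model that never saw it during fitting."""
    n = len(X)
    oof = [0.0] * n
    for fold in range(cv):
        test_idx = [i for i in range(n) if i % cv == fold]
        train_idx = [i for i in range(n) if i % cv != fold]
        model = make_model().fit([X[i] for i in train_idx], [y[i] for i in train_idx])
        for i, p in zip(test_idx, model.predict([X[i] for i in test_idx])):
            oof[i] = p
    return oof

def mse(y, p):
    return sum((a - b) ** 2 for a, b in zip(y, p)) / len(y)

def fit_super_learner(factories, X, y, cv=5, grid=101):
    # Step 1: OOF predictions for each base learner.
    oof = [oof_predictions(f, X, y, cv) for f in factories]
    # Step 2: combination weights (grid search stand-in for NNLS, two learners only).
    best_w, best_err = 0.0, float("inf")
    for g in range(grid):
        w = g / (grid - 1)
        err = mse(y, [w * a + (1 - w) * b for a, b in zip(oof[0], oof[1])])
        if err < best_err:
            best_w, best_err = w, err
    # Step 3: refit every base learner on the full training set.
    models = [f().fit(X, y) for f in factories]
    return models, [best_w, 1 - best_w]

X = [float(i) for i in range(20)]
y = [2.0 * x + 1.0 for x in X]
models, coef = fit_super_learner([MeanModel, LineModel], X, y)
pred = [coef[0] * a + coef[1] * b
        for a, b in zip(models[0].predict([25.0]), models[1].predict([25.0]))]
```

The weights are chosen on out-of-fold predictions rather than in-sample fits, so a base learner that merely memorizes the training data gets no unfair advantage in step 2.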

predict(X)[source]
predict_proba(X)[source]
property named_estimators
set_fit_request(*, sample_weight='$UNCHANGED$')

Configure whether metadata should be requested to be passed to the fit method.

Note that this method is only relevant when this estimator is used as a sub-estimator within a meta-estimator and metadata routing is enabled with enable_metadata_routing=True (see sklearn.set_config()). Please check the User Guide on how the routing mechanism works.

The options for each parameter are:

  • True: metadata is requested, and passed to fit if provided. The request is ignored if metadata is not provided.

  • False: metadata is not requested and the meta-estimator will not pass it to fit.

  • None: metadata is not requested, and the meta-estimator will raise an error if the user provides it.

  • str: metadata should be passed to the meta-estimator with this given alias instead of the original name.

The default (sklearn.utils.metadata_routing.UNCHANGED) retains the existing request. This allows you to change the request for some parameters and not others.

Added in version 1.3.

Parameters:
  • sample_weight (str, True, False, or None, default=sklearn.utils.metadata_routing.UNCHANGED) – Metadata routing for sample_weight parameter in fit.

  • self (SuperLearner)

Returns:

self (object) – The updated object.

Return type:

SuperLearner

class endgame.ensemble.BayesianModelAveraging(criterion='bic', prior='uniform', task='auto')[source]

Bases: BaseEstimator

Bayesian Model Averaging using information-criterion weights.

Parameters:
  • criterion ({'bic', 'aic', 'aic_c'}, default='bic') –

    • 'bic': Bayesian Information Criterion (penalizes complexity more).

    • 'aic': Akaike Information Criterion.

    • 'aic_c': Corrected AIC for small samples.

  • prior ({'uniform', 'complexity'} or array-like, optional) – Prior over models. Default is uniform.

  • task ({'auto', 'classification', 'regression'}, default='auto')

weights_

Posterior model weights summing to 1.

Type:

ndarray

ic_scores_

Information criterion values for each model.

Type:

ndarray

estimators_

Fitted estimators (references, not clones).

Type:

list of estimator

fit(estimators, X_val, y_val)[source]

Compute posterior weights from validation data.

Parameters:
  • estimators (list of fitted estimators) – Already-fitted models.

  • X_val (array-like) – Validation features.

  • y_val (array-like) – Validation target.

Return type:

BayesianModelAveraging

Returns:

self
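For intuition, information-criterion weights are commonly computed as the prior times exp(-0.5 * ΔIC), normalized to sum to 1, where ΔIC is the gap to the best (lowest) criterion value. The sketch below uses the textbook BIC and AIC definitions; how this class derives the log-likelihood and parameter count from fitted estimators is not documented here, so those inputs are supplied directly. The function names are hypothetical.

```python
import math

def bic(log_likelihood, n_params, n_samples):
    """Bayesian Information Criterion: k * ln(n) - 2 * ln(L)."""
    return n_params * math.log(n_samples) - 2.0 * log_likelihood

def aic(log_likelihood, n_params):
    """Akaike Information Criterion: 2 * k - 2 * ln(L)."""
    return 2.0 * n_params - 2.0 * log_likelihood

def ic_weights(ic_scores, prior=None):
    """Normalized weights from information criteria (lower IC is better)."""
    n = len(ic_scores)
    prior = prior or [1.0 / n] * n
    best = min(ic_scores)
    # Subtracting the minimum keeps exp() numerically stable.
    raw = [p * math.exp(-0.5 * (ic - best)) for ic, p in zip(ic_scores, prior)]
    total = sum(raw)
    return [r / total for r in raw]

# Illustrative (made-up) log-likelihoods and parameter counts for three models.
scores = [bic(-120.0, 3, 200), bic(-118.5, 6, 200), bic(-140.0, 2, 200)]
weights = ic_weights(scores)
```

The exponential makes these weights sharp: a gap of about 10 in the criterion already reduces a model's weight to under 1% of the best model's.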

predict(X)[source]
predict_proba(X)[source]
set_fit_request(*, X_val='$UNCHANGED$', estimators='$UNCHANGED$', y_val='$UNCHANGED$')

Configure whether metadata should be requested to be passed to the fit method.

Note that this method is only relevant when this estimator is used as a sub-estimator within a meta-estimator and metadata routing is enabled with enable_metadata_routing=True (see sklearn.set_config()). Please check the User Guide on how the routing mechanism works.

The options for each parameter are:

  • True: metadata is requested, and passed to fit if provided. The request is ignored if metadata is not provided.

  • False: metadata is not requested and the meta-estimator will not pass it to fit.

  • None: metadata is not requested, and the meta-estimator will raise an error if the user provides it.

  • str: metadata should be passed to the meta-estimator with this given alias instead of the original name.

The default (sklearn.utils.metadata_routing.UNCHANGED) retains the existing request. This allows you to change the request for some parameters and not others.

Added in version 1.3.

Parameters:
  • X_val (str, True, False, or None, default=sklearn.utils.metadata_routing.UNCHANGED) – Metadata routing for X_val parameter in fit.

  • estimators (str, True, False, or None, default=sklearn.utils.metadata_routing.UNCHANGED) – Metadata routing for estimators parameter in fit.

  • y_val (str, True, False, or None, default=sklearn.utils.metadata_routing.UNCHANGED) – Metadata routing for y_val parameter in fit.

  • self (BayesianModelAveraging)

Returns:

self (object) – The updated object.

Return type:

BayesianModelAveraging

class endgame.ensemble.NegativeCorrelationEnsemble(base_estimators, lambda_ncl=0.5, n_iterations=10, weights=None, random_state=None, verbose=False)[source]

Bases: BaseEstimator, RegressorMixin

Negative Correlation Learning ensemble for regression.

Trains all base learners together, each with a modified loss that includes a diversity term penalizing correlation with the ensemble average. This produces models that are individually weaker but collectively stronger.

Parameters:
  • base_estimators (list of estimator) – Base regressors to train. Must support sample_weight or partial fit.

  • lambda_ncl (float, default=0.5) – Strength of the negative correlation penalty.

    • 0: independent training (standard ensemble).

    • 1: maximum diversity pressure.

  • n_iterations (int, default=10) – Number of NCL training rounds.

  • weights (array-like, optional) – Static model weights. Default is uniform.

  • random_state (int or None, default=None)

  • verbose (bool, default=False)

estimators_

Fitted base regressors.

Type:

list of estimator

weights_

Model combination weights.

Type:

ndarray

diversity_

Measured ensemble diversity (average pairwise disagreement).

Type:

float

References

Liu, Y. & Yao, X. (1999). Ensemble Learning via Negative Correlation. Neural Networks, 12(10), 1399-1404.

fit(X, y, sample_weight=None)[source]

Fit with negative correlation learning.

Each round:

  1. Compute the ensemble prediction (average of all learners).

  2. For each learner i, compute modified sample weights that up-weight samples where learner i disagrees with the ensemble (promoting diversity).

  3. Refit each learner with the modified weights.
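For background, the diversity term in the cited Liu & Yao formulation is a per-sample penalty p_i = (f_i - f̄) · Σ_{j≠i} (f_j - f̄), added to each learner's squared error with strength lambda. The sketch below computes that penalty for one sample. It illustrates the paper's loss only; the sample-weight formula this class uses to approximate it is not shown in this reference, and the function names are hypothetical.

```python
def ncl_penalties(member_preds):
    """p_i = (f_i - f_bar) * sum over j != i of (f_j - f_bar), for one sample."""
    f_bar = sum(member_preds) / len(member_preds)
    devs = [f - f_bar for f in member_preds]
    # sum(devs) - d is the sum of deviations of all other learners.
    return [d * (sum(devs) - d) for d in devs]

def ncl_loss(member_preds, target, lambda_ncl=0.5):
    """Per-learner loss 0.5 * (f_i - y)^2 + lambda * p_i, for one sample."""
    penalties = ncl_penalties(member_preds)
    return [0.5 * (f - target) ** 2 + lambda_ncl * p
            for f, p in zip(member_preds, penalties)]

preds = [2.0, 3.0, 7.0]          # three learners' predictions for one sample
penalties = ncl_penalties(preds)  # equals -(f_i - f_bar)^2 for each learner
```

Since deviations from the ensemble mean sum to zero, each penalty reduces to -(f_i - f̄)², so a learner lowers its loss by moving away from the ensemble average. That is the diversity pressure lambda_ncl controls.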

predict(X)[source]
property feature_importances_
set_fit_request(*, sample_weight='$UNCHANGED$')

Configure whether metadata should be requested to be passed to the fit method.

Note that this method is only relevant when this estimator is used as a sub-estimator within a meta-estimator and metadata routing is enabled with enable_metadata_routing=True (see sklearn.set_config()). Please check the User Guide on how the routing mechanism works.

The options for each parameter are:

  • True: metadata is requested, and passed to fit if provided. The request is ignored if metadata is not provided.

  • False: metadata is not requested and the meta-estimator will not pass it to fit.

  • None: metadata is not requested, and the meta-estimator will raise an error if the user provides it.

  • str: metadata should be passed to the meta-estimator with this given alias instead of the original name.

The default (sklearn.utils.metadata_routing.UNCHANGED) retains the existing request. This allows you to change the request for some parameters and not others.

Added in version 1.3.

Parameters:
  • sample_weight (str, True, False, or None, default=sklearn.utils.metadata_routing.UNCHANGED) – Metadata routing for sample_weight parameter in fit.

  • self (NegativeCorrelationEnsemble)

Returns:

self (object) – The updated object.

Return type:

NegativeCorrelationEnsemble

set_score_request(*, sample_weight='$UNCHANGED$')

Configure whether metadata should be requested to be passed to the score method.

Note that this method is only relevant when this estimator is used as a sub-estimator within a meta-estimator and metadata routing is enabled with enable_metadata_routing=True (see sklearn.set_config()). Please check the User Guide on how the routing mechanism works.

The options for each parameter are:

  • True: metadata is requested, and passed to score if provided. The request is ignored if metadata is not provided.

  • False: metadata is not requested and the meta-estimator will not pass it to score.

  • None: metadata is not requested, and the meta-estimator will raise an error if the user provides it.

  • str: metadata should be passed to the meta-estimator with this given alias instead of the original name.

The default (sklearn.utils.metadata_routing.UNCHANGED) retains the existing request. This allows you to change the request for some parameters and not others.

Added in version 1.3.

Parameters:
  • sample_weight (str, True, False, or None, default=sklearn.utils.metadata_routing.UNCHANGED) – Metadata routing for sample_weight parameter in score.

  • self (NegativeCorrelationEnsemble)

Returns:

self (object) – The updated object.

Return type:

NegativeCorrelationEnsemble

class endgame.ensemble.SnapshotEnsemble(base_estimator, n_snapshots=5, epochs_per_cycle=40, initial_lr=0.1, min_lr=1e-05, verbose=False)[source]

Bases: BaseEstimator

Snapshot Ensemble via cosine annealing warm restarts.

Trains a single neural-network-like estimator with a cyclic learning rate schedule. At the end of each cycle (when LR reaches its minimum), takes a “snapshot” of the model. The final ensemble averages predictions across all snapshots.

Parameters:
  • base_estimator (estimator) – A model supporting partial_fit (e.g., MLPClassifier, SGDClassifier, SGDRegressor). Must accept learning_rate_init or eta0.

  • n_snapshots (int, default=5) – Number of snapshots (cycles) to collect.

  • epochs_per_cycle (int, default=40) – Training epochs per cosine annealing cycle.

  • initial_lr (float, default=0.1) – Peak learning rate at the start of each cycle.

  • min_lr (float, default=1e-5) – Minimum learning rate at end of each cycle (snapshot point).

  • verbose (bool, default=False)

snapshots_

Saved model snapshots.

Type:

list of estimator

lr_history_

Learning rate at each epoch.

Type:

list of float

is_classifier_

Whether the fitted task is classification.

Type:

bool

References

Huang, G., Li, Y., Pleiss, G., Liu, Z., Hopcroft, J.E., & Weinberger, K.Q. (2017). Snapshot Ensembles: Train 1, Get M for Free. ICLR.

fit(X, y, **fit_params)[source]

Train with cyclic LR and collect snapshots.

Parameters:
  • X (array-like)

  • y (array-like)

predict(X)[source]
predict_proba(X)[source]
class endgame.ensemble.CascadeEnsemble(stages, confidence_threshold=0.95, cv=3, use_proba=True, passthrough=True, max_stages=None, random_state=None, verbose=False)[source]

Bases: BaseEstimator, ClassifierMixin

Multi-stage cascade classifier with early-exit.

Parameters:
  • stages (list of list of estimator) – Each stage is a list of base classifiers. Predictions from stage k are concatenated as features for stage k+1.

  • confidence_threshold (float, default=0.95) – If a sample's maximum predicted probability exceeds this value, the sample exits the cascade early. Applied only at prediction time.

  • cv (int, default=3) – Number of cross-validation folds used to generate out-of-fold (OOF) features during training.

  • use_proba (bool, default=True) – Use predicted probabilities as cascade features (vs. labels).

  • passthrough (bool, default=True) – Include original features at every stage.

  • max_stages (int or None, default=None) – Maximum number of stages. If None, use all provided stages.

  • random_state (int or None, default=None)

  • verbose (bool, default=False)

stages_

Fitted estimators per stage.

Type:

list of list of estimator

classes_

Unique class labels.

Type:

ndarray

n_stages_

Number of fitted stages.

Type:

int

stage_scores_

Per-stage validation accuracy.

Type:

list of float

fit(X, y, sample_weight=None)[source]

Fit the cascade stage by stage.

At each stage, generate out-of-fold (OOF) predictions, concatenate them as features for the next stage, then refit the stage's estimators on all data.
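
The per-stage procedure can be approximated with plain scikit-learn calls. The following is a sketch under assumptions, not the CascadeEnsemble source: it handles a single stage, uses cross_val_predict for the out-of-fold predictions, and the variable names are illustrative.

```python
import numpy as np
from sklearn.base import clone
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_predict
from sklearn.tree import DecisionTreeClassifier

X, y = make_classification(n_samples=120, n_features=6, random_state=0)
stage = [LogisticRegression(max_iter=1000), DecisionTreeClassifier(max_depth=3, random_state=0)]

# Out-of-fold probabilities: every row is predicted by a model that did not train on it.
oof_blocks = [
    cross_val_predict(clone(est), X, y, cv=3, method="predict_proba") for est in stage
]

# Analogue of passthrough=True: keep the original features next to the OOF columns.
X_next = np.hstack([X] + oof_blocks)

# Refit each estimator on all rows so it can be used at prediction time.
fitted_stage = [clone(est).fit(X, y) for est in stage]
```

Using out-of-fold predictions keeps the next stage from training on features that already encode the labels of the same rows.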

predict(X)[source]
predict_proba(X)[source]

Predict with early exit based on confidence.

Samples whose max probability exceeds confidence_threshold at any stage are assigned their prediction from that stage. Remaining samples proceed to the next stage.
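
The early-exit rule can be illustrated with NumPy alone. This sketch assumes the per-stage probabilities are already computed for every sample (a real cascade would only evaluate later stages for samples still active) and that samples never reaching the threshold take the last stage's output. The function name early_exit_proba is hypothetical.

```python
import numpy as np


def early_exit_proba(stage_probas, confidence_threshold=0.95):
    """Pick, per sample, the probabilities of the first sufficiently confident stage."""
    stage_probas = [np.asarray(p, dtype=float) for p in stage_probas]
    n_samples = stage_probas[0].shape[0]
    last = len(stage_probas) - 1
    final = stage_probas[last].copy()          # fallback: output of the last stage
    exit_stage = np.full(n_samples, last, dtype=int)
    active = np.ones(n_samples, dtype=bool)    # samples still inside the cascade
    for k, proba in enumerate(stage_probas[:-1]):
        confident = active & (proba.max(axis=1) > confidence_threshold)
        final[confident] = proba[confident]
        exit_stage[confident] = k
        active &= ~confident
    return final, exit_stage


stage_probas = [
    np.array([[0.99, 0.01], [0.60, 0.40], [0.30, 0.70]]),
    np.array([[0.50, 0.50], [0.03, 0.97], [0.45, 0.55]]),
    np.array([[0.50, 0.50], [0.50, 0.50], [0.10, 0.90]]),
]
final, exit_stage = early_exit_proba(stage_probas)
```

Here the first sample exits at stage 0, the second at stage 1, and the third falls through to the last stage.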

set_fit_request(*, sample_weight='$UNCHANGED$')

Configure whether metadata should be requested to be passed to the fit method.

Note that this method is only relevant when this estimator is used as a sub-estimator within a meta-estimator and metadata routing is enabled with enable_metadata_routing=True (see sklearn.set_config()). Please check the User Guide on how the routing mechanism works.

The options for each parameter are:

  • True: metadata is requested, and passed to fit if provided. The request is ignored if metadata is not provided.

  • False: metadata is not requested and the meta-estimator will not pass it to fit.

  • None: metadata is not requested, and the meta-estimator will raise an error if the user provides it.

  • str: metadata should be passed to the meta-estimator with this given alias instead of the original name.

The default (sklearn.utils.metadata_routing.UNCHANGED) retains the existing request. This allows you to change the request for some parameters and not others.

Added in version 1.3.

Parameters:
  • sample_weight (str, True, False, or None, default=sklearn.utils.metadata_routing.UNCHANGED) – Metadata routing for sample_weight parameter in fit.

  • self (CascadeEnsemble)

Returns:

self (object) – The updated object.

Return type:

CascadeEnsemble

set_score_request(*, sample_weight='$UNCHANGED$')

Configure whether metadata should be requested to be passed to the score method.

Note that this method is only relevant when this estimator is used as a sub-estimator within a meta-estimator and metadata routing is enabled with enable_metadata_routing=True (see sklearn.set_config()). Please check the User Guide on how the routing mechanism works.

The options for each parameter are:

  • True: metadata is requested, and passed to score if provided. The request is ignored if metadata is not provided.

  • False: metadata is not requested and the meta-estimator will not pass it to score.

  • None: metadata is not requested, and the meta-estimator will raise an error if the user provides it.

  • str: metadata should be passed to the meta-estimator with this given alias instead of the original name.

The default (sklearn.utils.metadata_routing.UNCHANGED) retains the existing request. This allows you to change the request for some parameters and not others.

Added in version 1.3.

Parameters:
  • sample_weight (str, True, False, or None, default=sklearn.utils.metadata_routing.UNCHANGED) – Metadata routing for sample_weight parameter in score.

  • self (CascadeEnsemble)

Returns:

self (object) – The updated object.

Return type:

CascadeEnsemble

class endgame.ensemble.ThresholdOptimizer(metric='f1', search_method='grid', n_thresholds=100, threshold_range=(0.1, 0.9), multiclass=False, random_state=None, verbose=False)[source]

Bases: EndgameEstimator

Optimizes classification thresholds for target metrics.

The default 0.5 threshold is often suboptimal. This optimizer searches for a single threshold, or per-class thresholds when multiclass=True, that maximizes the target metric.

Parameters:
  • metric (str or callable, default='f1') – Metric to optimize: ‘f1’, ‘f1_macro’, ‘f1_weighted’, ‘accuracy’, ‘balanced_accuracy’, or custom callable.

  • search_method (str, default='grid') – Search method: ‘grid’, ‘optuna’, ‘hill_climb’.

  • n_thresholds (int, default=100) – Number of thresholds to search (for grid search).

  • threshold_range (Tuple[float, float], default=(0.1, 0.9)) – Range of thresholds to search.

  • multiclass (bool, default=False) – Whether to optimize per-class thresholds.

  • random_state (int, optional) – Random seed.

  • verbose (bool, default=False) – Enable verbose output.

threshold_

Optimized threshold(s).

Type:

float or Dict[int, float]

best_score_

Best score achieved.

Type:

float

Examples

>>> from endgame.ensemble import ThresholdOptimizer
>>> optimizer = ThresholdOptimizer(metric='f1')
>>> optimizer.fit(y_true, y_proba)
>>> print(f"Optimal threshold: {optimizer.threshold_}")
>>> y_pred = optimizer.predict(y_proba)
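
To make the grid search concrete, here is a dependency-free sketch for the binary F1 case. It is not the ThresholdOptimizer implementation: the helper names are hypothetical, and treating a probability equal to the threshold as a positive prediction is an assumption.

```python
def f1_at_threshold(y_true, y_proba, threshold):
    """Binary F1 when probabilities >= threshold are labeled positive."""
    tp = fp = fn = 0
    for truth, p in zip(y_true, y_proba):
        pred = 1 if p >= threshold else 0
        if pred == 1 and truth == 1:
            tp += 1
        elif pred == 1 and truth == 0:
            fp += 1
        elif pred == 0 and truth == 1:
            fn += 1
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


def grid_search_threshold(y_true, y_proba, n_thresholds=100, threshold_range=(0.1, 0.9)):
    """Evaluate evenly spaced thresholds and keep the first one with the best F1."""
    low, high = threshold_range
    step = (high - low) / (n_thresholds - 1)
    best_threshold, best_score = low, -1.0
    for i in range(n_thresholds):
        threshold = low + i * step
        score = f1_at_threshold(y_true, y_proba, threshold)
        if score > best_score:
            best_threshold, best_score = threshold, score
    return best_threshold, best_score


y_true = [0, 0, 1, 1, 1, 0]
y_proba = [0.2, 0.45, 0.4, 0.8, 0.9, 0.3]
threshold, score = grid_search_threshold(y_true, y_proba)
```

On this toy data the best F1 is reached with a threshold between 0.3 and 0.4, below the conventional 0.5.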
fit(y_true, y_proba)[source]

Find optimal threshold(s).

Parameters:
  • y_true (array-like) – True labels.

  • y_proba (array-like) – Predicted probabilities. Shape (n_samples,) for binary, (n_samples, n_classes) for multiclass.

Return type:

ThresholdOptimizer

Returns:

self

predict(y_proba)[source]

Apply optimized threshold(s) to predictions.

Parameters:

y_proba (array-like) – Predicted probabilities.

Return type:

ndarray

Returns:

ndarray – Predicted labels.

transform(y_proba)[source]

Alias for predict().

Return type:

ndarray

Parameters:

y_proba (ndarray)

set_fit_request(*, y_proba='$UNCHANGED$', y_true='$UNCHANGED$')

Configure whether metadata should be requested to be passed to the fit method.

Note that this method is only relevant when this estimator is used as a sub-estimator within a meta-estimator and metadata routing is enabled with enable_metadata_routing=True (see sklearn.set_config()). Please check the User Guide on how the routing mechanism works.

The options for each parameter are:

  • True: metadata is requested, and passed to fit if provided. The request is ignored if metadata is not provided.

  • False: metadata is not requested and the meta-estimator will not pass it to fit.

  • None: metadata is not requested, and the meta-estimator will raise an error if the user provides it.

  • str: metadata should be passed to the meta-estimator with this given alias instead of the original name.

The default (sklearn.utils.metadata_routing.UNCHANGED) retains the existing request. This allows you to change the request for some parameters and not others.

Added in version 1.3.

Parameters:
  • y_proba (str, True, False, or None, default=sklearn.utils.metadata_routing.UNCHANGED) – Metadata routing for y_proba parameter in fit.

  • y_true (str, True, False, or None, default=sklearn.utils.metadata_routing.UNCHANGED) – Metadata routing for y_true parameter in fit.

  • self (ThresholdOptimizer)

Returns:

self (object) – The updated object.

Return type:

ThresholdOptimizer

set_predict_request(*, y_proba='$UNCHANGED$')

Configure whether metadata should be requested to be passed to the predict method.

Note that this method is only relevant when this estimator is used as a sub-estimator within a meta-estimator and metadata routing is enabled with enable_metadata_routing=True (see sklearn.set_config()). Please check the User Guide on how the routing mechanism works.

The options for each parameter are:

  • True: metadata is requested, and passed to predict if provided. The request is ignored if metadata is not provided.

  • False: metadata is not requested and the meta-estimator will not pass it to predict.

  • None: metadata is not requested, and the meta-estimator will raise an error if the user provides it.

  • str: metadata should be passed to the meta-estimator with this given alias instead of the original name.

The default (sklearn.utils.metadata_routing.UNCHANGED) retains the existing request. This allows you to change the request for some parameters and not others.

Added in version 1.3.

Parameters:
  • y_proba (str, True, False, or None, default=sklearn.utils.metadata_routing.UNCHANGED) – Metadata routing for y_proba parameter in predict.

  • self (ThresholdOptimizer)

Returns:

self (object) – The updated object.

Return type:

ThresholdOptimizer

set_transform_request(*, y_proba='$UNCHANGED$')

Configure whether metadata should be requested to be passed to the transform method.

Note that this method is only relevant when this estimator is used as a sub-estimator within a meta-estimator and metadata routing is enabled with enable_metadata_routing=True (see sklearn.set_config()). Please check the User Guide on how the routing mechanism works.

The options for each parameter are:

  • True: metadata is requested, and passed to transform if provided. The request is ignored if metadata is not provided.

  • False: metadata is not requested and the meta-estimator will not pass it to transform.

  • None: metadata is not requested, and the meta-estimator will raise an error if the user provides it.

  • str: metadata should be passed to the meta-estimator with this given alias instead of the original name.

The default (sklearn.utils.metadata_routing.UNCHANGED) retains the existing request. This allows you to change the request for some parameters and not others.

Added in version 1.3.

Parameters:
  • y_proba (str, True, False, or None, default=sklearn.utils.metadata_routing.UNCHANGED) – Metadata routing for y_proba parameter in transform.

  • self (ThresholdOptimizer)

Returns:

self (object) – The updated object.

Return type:

ThresholdOptimizer

class endgame.ensemble.KnowledgeDistiller(teacher, student, temperature=3.0, alpha=0.7, augment=False, augment_ratio=1.0, augment_swap_prob=0.1, random_state=None)[source]

Bases: BaseEstimator

Knowledge distillation from teacher to student model.

Trains a simpler student model to mimic the predictions of a complex teacher model (or ensemble), enabling deployment of lightweight models with minimal accuracy loss.

Parameters:
  • teacher (estimator) – Fitted teacher model. Must have predict_proba (classification) or predict (regression).

  • student (estimator) – Unfitted student model to train.

  • temperature (float, default=3.0) – Softmax temperature for soft label generation (classification only). Higher values produce softer probability distributions that reveal more about the teacher’s learned relationships.

  • alpha (float, default=0.7) – Weight for soft labels vs hard labels. Loss = alpha * soft_loss + (1 - alpha) * hard_loss. Set to 1.0 for pure distillation.

  • augment (bool, default=False) – Whether to use MUNGE data augmentation to generate additional training data labeled by the teacher.

  • augment_ratio (float, default=1.0) – Ratio of augmented samples to original samples.

  • augment_swap_prob (float, default=0.1) – Feature swap probability for MUNGE augmentation.

  • random_state (int or None, default=None) – Random state.
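
The roles of temperature and alpha can be sketched with NumPy. This is an illustration under assumptions, not the KnowledgeDistiller source: the teacher is assumed to expose probabilities, so logits are recovered as log-probabilities; both losses are cross-entropies; and the function names are hypothetical.

```python
import numpy as np


def soften(proba, temperature=3.0, eps=1e-12):
    """Re-apply softmax to log-probabilities divided by the temperature."""
    logits = np.log(np.clip(proba, eps, 1.0)) / temperature
    logits -= logits.max(axis=1, keepdims=True)  # numerical stability
    exp = np.exp(logits)
    return exp / exp.sum(axis=1, keepdims=True)


def distillation_loss(student_proba, teacher_proba, y, temperature=3.0, alpha=0.7, eps=1e-12):
    """alpha * soft_loss + (1 - alpha) * hard_loss, as in the alpha parameter description."""
    soft_targets = soften(teacher_proba, temperature)
    log_student = np.log(np.clip(student_proba, eps, 1.0))
    soft_loss = -(soft_targets * log_student).sum(axis=1).mean()
    hard_loss = -log_student[np.arange(len(y)), y].mean()
    return alpha * soft_loss + (1 - alpha) * hard_loss


teacher = np.array([[0.9, 0.1], [0.2, 0.8]])
student = np.array([[0.7, 0.3], [0.4, 0.6]])
y = np.array([0, 1])

soft = soften(teacher, temperature=3.0)
loss = distillation_loss(student, teacher, y)
```

With temperature above 1, the softened distribution moves toward uniform, which gives the student more signal about the classes the teacher ranks second or third.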

student_

The trained student model.

Type:

estimator

teacher_score_

Teacher’s accuracy (classification) or R^2 (regression) on the training data, for reference.

Type:

float or None

student_score_

Student’s accuracy (classification) or R^2 (regression) on the training data.

Type:

float or None

is_classifier_

Whether this is a classification task.

Type:

bool

Examples

>>> from sklearn.ensemble import RandomForestClassifier
>>> from sklearn.linear_model import LogisticRegression
>>> from endgame.ensemble import KnowledgeDistiller
>>>
>>> teacher = RandomForestClassifier(n_estimators=500).fit(X, y)
>>> distiller = KnowledgeDistiller(
...     teacher=teacher,
...     student=LogisticRegression(),
...     temperature=4.0,
...     alpha=0.8,
...     augment=True
... )
>>> distiller.fit(X, y)
>>> y_pred = distiller.predict(X_test)
fit(X, y, **fit_params)[source]

Train the student model using knowledge distillation.

Parameters:
  • X (array-like of shape (n_samples, n_features)) – Training features.

  • y (array-like of shape (n_samples,)) – True labels (hard targets).

Return type:

KnowledgeDistiller

Returns:

self

predict(X)[source]

Predict using the trained student model.

Return type:

ndarray

predict_proba(X)[source]

Predict probabilities using the trained student model.

Return type:

ndarray

property feature_importances_

Feature importances from the student model.

compression_report()[source]

Generate a report comparing teacher and student performance.

Return type:

dict

Returns:

dict – Report with keys teacher_score, student_score, score_retention, teacher_type, and student_type.

class endgame.ensemble.MultiOutputClassifier(estimator=None, n_jobs=None, random_state=None, verbose=False)[source]

Bases: BaseEstimator, ClassifierMixin

Wraps a single-output classifier for multi-output classification.

Fits one independent clone of the base classifier per output column. Supports parallel fitting via joblib.

Parameters:
  • estimator (estimator) – The base classifier to clone for each output. Must implement fit and predict.

  • n_jobs (int, optional) – Number of jobs for parallel fitting. None means 1 (sequential). -1 means using all processors.

  • random_state (int, optional) – Random seed. Passed to each cloned estimator if it accepts random_state.

  • verbose (bool, default=False) – Enable verbose output during fitting.

estimators_

Fitted classifiers, one per output.

Type:

List[estimator]

classes_

Class labels for each output.

Type:

List[ndarray]

n_outputs_

Number of output columns.

Type:

int

Examples

>>> from endgame.ensemble.multi_output import MultiOutputClassifier
>>> from sklearn.tree import DecisionTreeClassifier
>>> import numpy as np
>>> X = np.random.randn(100, 5)
>>> Y = np.random.randint(0, 3, size=(100, 3))
>>> clf = MultiOutputClassifier(DecisionTreeClassifier(), n_jobs=-1)
>>> clf.fit(X, Y)
>>> preds = clf.predict(X)
>>> preds.shape
(100, 3)
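
What the wrapper does per output column can be approximated with scikit-learn's clone. The sketch below is an assumption about the mechanics described above (one independent clone per column), not the library's code, and it fits sequentially rather than through joblib.

```python
import numpy as np
from sklearn.base import clone
from sklearn.tree import DecisionTreeClassifier

rng = np.random.default_rng(0)
X = rng.normal(size=(100, 5))
Y = rng.integers(0, 3, size=(100, 3))

base = DecisionTreeClassifier(random_state=0)

# One independent clone per output column.
estimators = [clone(base).fit(X, Y[:, k]) for k in range(Y.shape[1])]

# Labels are stacked column-wise; probabilities stay as one array per output.
preds = np.column_stack([est.predict(X) for est in estimators])
probas = [est.predict_proba(X) for est in estimators]
```

Because the clones never see each other's targets, this approach cannot model dependencies between outputs.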
fit(X, Y, sample_weight=None)[source]

Fit one classifier per output column.

Parameters:
  • X (array-like of shape (n_samples, n_features)) – Training features.

  • Y (array-like of shape (n_samples, n_outputs)) – Multi-output target matrix.

  • sample_weight (array-like of shape (n_samples,), optional) – Sample weights passed to each estimator.

Returns:

self

predict(X)[source]

Predict class labels for each output.

Parameters:

X (array-like of shape (n_samples, n_features)) – Samples to predict.

Returns:

ndarray of shape (n_samples, n_outputs) – Predicted class labels.

predict_proba(X)[source]

Predict class probabilities for each output.

Parameters:

X (array-like of shape (n_samples, n_features)) – Samples to predict.

Returns:

list of ndarray – List of length n_outputs_, where each element is an array of shape (n_samples, n_classes_k) containing class probabilities for output k.

score(X, Y, sample_weight=None)[source]

Return the mean accuracy across all outputs.

Parameters:
  • X (array-like of shape (n_samples, n_features)) – Test samples.

  • Y (array-like of shape (n_samples, n_outputs)) – True labels.

  • sample_weight (array-like, optional) – Sample weights.

Returns:

float – Mean of per-output accuracy scores.
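
A small NumPy sketch shows how the mean of per-output accuracies differs from exact-match (subset) accuracy. The helper name is hypothetical and sample weights are ignored.

```python
import numpy as np


def mean_output_accuracy(Y_true, Y_pred):
    """Average the accuracy of each output column."""
    Y_true, Y_pred = np.asarray(Y_true), np.asarray(Y_pred)
    per_output = (Y_true == Y_pred).mean(axis=0)
    return float(per_output.mean()), per_output


Y_true = np.array([[0, 1], [1, 1], [2, 0], [0, 0]])
Y_pred = np.array([[0, 1], [1, 0], [2, 1], [1, 0]])

mean_acc, per_output = mean_output_accuracy(Y_true, Y_pred)
# Fraction of rows where every output is correct, shown for contrast.
subset_acc = float((Y_true == Y_pred).all(axis=1).mean())
```

In this example the per-output accuracies are 0.75 and 0.5, giving a mean of 0.625, while only one row in four is correct on every output.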

set_fit_request(*, sample_weight='$UNCHANGED$')

Configure whether metadata should be requested to be passed to the fit method.

Note that this method is only relevant when this estimator is used as a sub-estimator within a meta-estimator and metadata routing is enabled with enable_metadata_routing=True (see sklearn.set_config()). Please check the User Guide on how the routing mechanism works.

The options for each parameter are:

  • True: metadata is requested, and passed to fit if provided. The request is ignored if metadata is not provided.

  • False: metadata is not requested and the meta-estimator will not pass it to fit.

  • None: metadata is not requested, and the meta-estimator will raise an error if the user provides it.

  • str: metadata should be passed to the meta-estimator with this given alias instead of the original name.

The default (sklearn.utils.metadata_routing.UNCHANGED) retains the existing request. This allows you to change the request for some parameters and not others.

Added in version 1.3.

Parameters:
  • sample_weight (str, True, False, or None, default=sklearn.utils.metadata_routing.UNCHANGED) – Metadata routing for sample_weight parameter in fit.

  • self (MultiOutputClassifier)

Returns:

self (object) – The updated object.

Return type:

MultiOutputClassifier

set_score_request(*, sample_weight='$UNCHANGED$')

Configure whether metadata should be requested to be passed to the score method.

Note that this method is only relevant when this estimator is used as a sub-estimator within a meta-estimator and metadata routing is enabled with enable_metadata_routing=True (see sklearn.set_config()). Please check the User Guide on how the routing mechanism works.

The options for each parameter are:

  • True: metadata is requested, and passed to score if provided. The request is ignored if metadata is not provided.

  • False: metadata is not requested and the meta-estimator will not pass it to score.

  • None: metadata is not requested, and the meta-estimator will raise an error if the user provides it.

  • str: metadata should be passed to the meta-estimator with this given alias instead of the original name.

The default (sklearn.utils.metadata_routing.UNCHANGED) retains the existing request. This allows you to change the request for some parameters and not others.

Added in version 1.3.

Parameters:
  • sample_weight (str, True, False, or None, default=sklearn.utils.metadata_routing.UNCHANGED) – Metadata routing for sample_weight parameter in score.

  • self (MultiOutputClassifier)

Returns:

self (object) – The updated object.

Return type:

MultiOutputClassifier

class endgame.ensemble.MultiOutputRegressor(estimator=None, n_jobs=None, random_state=None, verbose=False)[source]

Bases: BaseEstimator, RegressorMixin

Wraps a single-output regressor for multi-output regression.

Fits one independent clone of the base regressor per output column. Supports parallel fitting via joblib.

Parameters:
  • estimator (estimator) – The base regressor to clone for each output. Must implement fit and predict.

  • n_jobs (int, optional) – Number of jobs for parallel fitting. None means 1 (sequential). -1 means using all processors.

  • random_state (int, optional) – Random seed. Passed to each cloned estimator if it accepts random_state.

  • verbose (bool, default=False) – Enable verbose output during fitting.

estimators_

Fitted regressors, one per output.

Type:

List[estimator]

n_outputs_

Number of output columns.

Type:

int

Examples

>>> from endgame.ensemble.multi_output import MultiOutputRegressor
>>> from sklearn.linear_model import Ridge
>>> import numpy as np
>>> X = np.random.randn(100, 5)
>>> Y = np.random.randn(100, 3)
>>> reg = MultiOutputRegressor(Ridge(), n_jobs=-1)
>>> reg.fit(X, Y)
>>> preds = reg.predict(X)
>>> preds.shape
(100, 3)
fit(X, Y, sample_weight=None)[source]

Fit one regressor per output column.

Parameters:
  • X (array-like of shape (n_samples, n_features)) – Training features.

  • Y (array-like of shape (n_samples, n_outputs)) – Multi-output target matrix.

  • sample_weight (array-like of shape (n_samples,), optional) – Sample weights passed to each estimator.

Returns:

self

predict(X)[source]

Predict target values for each output.

Parameters:

X (array-like of shape (n_samples, n_features)) – Samples to predict.

Returns:

ndarray of shape (n_samples, n_outputs) – Predicted values.

property feature_importances_

Average feature importances across all output estimators.

Returns:

ndarray of shape (n_features,) – Mean of feature_importances_ across fitted estimators.

Raises:

AttributeError – If the base estimators do not expose feature_importances_.

score(X, Y, sample_weight=None)[source]

Return the mean R^2 score across all outputs.

Parameters:
  • X (array-like of shape (n_samples, n_features)) – Test samples.

  • Y (array-like of shape (n_samples, n_outputs)) – True target values.

  • sample_weight (array-like, optional) – Sample weights.

Returns:

float – Mean of per-output R^2 scores.
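
The averaged score can be written out directly. This sketch ignores sample weights and assumes no target column is constant (a constant column would make the denominator zero). The helper name is hypothetical.

```python
import numpy as np


def mean_output_r2(Y_true, Y_pred):
    """Compute R^2 = 1 - SS_res / SS_tot per column, then average."""
    Y_true = np.asarray(Y_true, dtype=float)
    Y_pred = np.asarray(Y_pred, dtype=float)
    ss_res = ((Y_true - Y_pred) ** 2).sum(axis=0)
    ss_tot = ((Y_true - Y_true.mean(axis=0)) ** 2).sum(axis=0)
    per_output = 1.0 - ss_res / ss_tot
    return float(per_output.mean()), per_output


Y_true = np.array([[1.0, 10.0], [2.0, 20.0], [3.0, 30.0]])
# First output predicted exactly; second output predicted by its mean.
Y_pred = np.array([[1.0, 20.0], [2.0, 20.0], [3.0, 20.0]])

mean_r2, per_output = mean_output_r2(Y_true, Y_pred)
```

The first column scores 1.0 and the second 0.0, so the mean is 0.5.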

set_fit_request(*, sample_weight='$UNCHANGED$')

Configure whether metadata should be requested to be passed to the fit method.

Note that this method is only relevant when this estimator is used as a sub-estimator within a meta-estimator and metadata routing is enabled with enable_metadata_routing=True (see sklearn.set_config()). Please check the User Guide on how the routing mechanism works.

The options for each parameter are:

  • True: metadata is requested, and passed to fit if provided. The request is ignored if metadata is not provided.

  • False: metadata is not requested and the meta-estimator will not pass it to fit.

  • None: metadata is not requested, and the meta-estimator will raise an error if the user provides it.

  • str: metadata should be passed to the meta-estimator with this given alias instead of the original name.

The default (sklearn.utils.metadata_routing.UNCHANGED) retains the existing request. This allows you to change the request for some parameters and not others.

Added in version 1.3.

Parameters:
  • sample_weight (str, True, False, or None, default=sklearn.utils.metadata_routing.UNCHANGED) – Metadata routing for sample_weight parameter in fit.

  • self (MultiOutputRegressor)

Returns:

self (object) – The updated object.

Return type:

MultiOutputRegressor

set_score_request(*, sample_weight='$UNCHANGED$')

Configure whether metadata should be requested to be passed to the score method.

Note that this method is only relevant when this estimator is used as a sub-estimator within a meta-estimator and metadata routing is enabled with enable_metadata_routing=True (see sklearn.set_config()). Please check the User Guide on how the routing mechanism works.

The options for each parameter are:

  • True: metadata is requested, and passed to score if provided. The request is ignored if metadata is not provided.

  • False: metadata is not requested and the meta-estimator will not pass it to score.

  • None: metadata is not requested, and the meta-estimator will raise an error if the user provides it.

  • str: metadata should be passed to the meta-estimator with this given alias instead of the original name.

The default (sklearn.utils.metadata_routing.UNCHANGED) retains the existing request. This allows you to change the request for some parameters and not others.

Added in version 1.3.

Parameters:
  • sample_weight (str, True, False, or None, default=sklearn.utils.metadata_routing.UNCHANGED) – Metadata routing for sample_weight parameter in score.

  • self (MultiOutputRegressor)

Returns:

self (object) – The updated object.

Return type:

MultiOutputRegressor

class endgame.ensemble.ClassifierChain(estimator=None, order='auto', n_jobs=None, random_state=None, verbose=False)[source]

Bases: BaseEstimator, ClassifierMixin

Chain classifiers where each uses predictions of previous outputs as features.

Each classifier in the chain receives the original feature matrix X augmented with the predictions from all preceding classifiers. This allows the chain to model dependencies between outputs.

Parameters:
  • estimator (estimator) – The base classifier to clone for each link in the chain.

  • order (str or list of int, default='auto') –

    Chain ordering strategy:

    • 'auto': greedy ordering by pairwise correlation so that adjacent outputs in the chain are maximally correlated.

    • 'random': random permutation (seeded by random_state).

    • list of int: explicit column ordering.

  • n_jobs (int, optional) – Not used directly (chain is inherently sequential), but stored for API consistency.

  • random_state (int, optional) – Random seed for random ordering and estimator cloning.

  • verbose (bool, default=False) – Enable verbose output.
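
One way a greedy correlation-based ordering could work is sketched below. The actual 'auto' strategy may differ: the starting rule (the column with the largest total absolute correlation) and the function name are assumptions made for illustration, and constant columns are not handled.

```python
import numpy as np


def greedy_correlation_order(Y):
    """Order columns so that each one is followed by its most correlated unused column."""
    corr = np.abs(np.corrcoef(Y, rowvar=False))
    np.fill_diagonal(corr, 0.0)
    # Assumed starting rule: the column most correlated with all others overall.
    order = [int(np.argmax(corr.sum(axis=0)))]
    remaining = set(range(Y.shape[1])) - set(order)
    while remaining:
        last = order[-1]
        nxt = max(sorted(remaining), key=lambda j: corr[last, j])
        order.append(nxt)
        remaining.remove(nxt)
    return order


Y = np.array([
    [0, 0, 0],
    [1, 0, 1],
    [0, 1, 0],
    [1, 1, 1],
    [0, 0, 0],
    [1, 0, 1],
    [0, 1, 0],
    [1, 1, 0],
])
order = greedy_correlation_order(Y)
```

Columns 0 and 2 are strongly correlated in this toy matrix, so they end up adjacent in the resulting order.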

estimators_

Fitted classifiers in chain order.

Type:

List[estimator]

order_

The resolved output ordering.

Type:

list of int

classes_

Class labels for each output (in original column order).

Type:

List[ndarray]

n_outputs_

Number of output columns.

Type:

int

Examples

>>> from endgame.ensemble.multi_output import ClassifierChain
>>> from sklearn.linear_model import LogisticRegression
>>> import numpy as np
>>> X = np.random.randn(200, 5)
>>> Y = np.random.randint(0, 2, size=(200, 3))
>>> chain = ClassifierChain(LogisticRegression(), order='auto')
>>> chain.fit(X, Y)
>>> preds = chain.predict(X)
>>> preds.shape
(200, 3)
fit(X, Y, sample_weight=None)[source]

Fit the classifier chain.

Parameters:
  • X (array-like of shape (n_samples, n_features)) – Training features.

  • Y (array-like of shape (n_samples, n_outputs)) – Multi-output target matrix.

  • sample_weight (array-like of shape (n_samples,), optional) – Sample weights.

Returns:

self

predict(X)[source]

Predict class labels for each output.

At prediction time, the chain uses its own predictions (rather than ground truth) for augmentation.

Parameters:

X (array-like of shape (n_samples, n_features)) – Samples to predict.

Returns:

ndarray of shape (n_samples, n_outputs) – Predicted class labels in original column order.
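
The difference between training-time and prediction-time augmentation can be sketched as follows. This is an assumption-laden illustration, not the ClassifierChain source: it uses a fixed order, augments with true labels during fitting, and ignores sample weights.

```python
import numpy as np
from sklearn.base import clone
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 5))
Y = rng.integers(0, 2, size=(200, 3))
order = [2, 0, 1]  # illustrative chain order

# Fit: each link sees X plus the TRUE labels of the outputs earlier in the chain.
estimators = []
X_aug = X
for col in order:
    estimators.append(clone(LogisticRegression()).fit(X_aug, Y[:, col]))
    X_aug = np.column_stack([X_aug, Y[:, col]])

# Predict: each link sees X plus the chain's OWN earlier predictions.
preds = np.empty_like(Y)
X_aug = X
for est, col in zip(estimators, order):
    preds[:, col] = est.predict(X_aug)
    X_aug = np.column_stack([X_aug, preds[:, col]])
```

Writing each prediction into preds[:, col] returns the result in the original column order even though the chain runs in a different order.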

predict_proba(X)[source]

Predict class probabilities for each output.

Parameters:

X (array-like of shape (n_samples, n_features)) – Samples to predict.

Returns:

list of ndarray – List of length n_outputs_ (in original column order), where each element is an array of shape (n_samples, n_classes_k).

score(X, Y, sample_weight=None)[source]

Return the mean accuracy across all outputs.

Parameters:
  • X (array-like of shape (n_samples, n_features)) – Test samples.

  • Y (array-like of shape (n_samples, n_outputs)) – True labels.

  • sample_weight (array-like, optional) – Sample weights.

Returns:

float – Mean of per-output accuracy scores.

set_fit_request(*, sample_weight='$UNCHANGED$')

Configure whether metadata should be requested to be passed to the fit method.

Note that this method is only relevant when this estimator is used as a sub-estimator within a meta-estimator and metadata routing is enabled with enable_metadata_routing=True (see sklearn.set_config()). Please check the User Guide on how the routing mechanism works.

The options for each parameter are:

  • True: metadata is requested, and passed to fit if provided. The request is ignored if metadata is not provided.

  • False: metadata is not requested and the meta-estimator will not pass it to fit.

  • None: metadata is not requested, and the meta-estimator will raise an error if the user provides it.

  • str: metadata should be passed to the meta-estimator with this given alias instead of the original name.

The default (sklearn.utils.metadata_routing.UNCHANGED) retains the existing request. This allows you to change the request for some parameters and not others.

Added in version 1.3.

Parameters:
  • sample_weight (str, True, False, or None, default=sklearn.utils.metadata_routing.UNCHANGED) – Metadata routing for sample_weight parameter in fit.

  • self (ClassifierChain)

Returns:

self (object) – The updated object.

Return type:

ClassifierChain

set_score_request(*, sample_weight='$UNCHANGED$')

Configure whether metadata should be requested to be passed to the score method.

Note that this method is only relevant when this estimator is used as a sub-estimator within a meta-estimator and metadata routing is enabled with enable_metadata_routing=True (see sklearn.set_config()). Please check the User Guide on how the routing mechanism works.

The options for each parameter are:

  • True: metadata is requested, and passed to score if provided. The request is ignored if metadata is not provided.

  • False: metadata is not requested and the meta-estimator will not pass it to score.

  • None: metadata is not requested, and the meta-estimator will raise an error if the user provides it.

  • str: metadata should be passed to the meta-estimator with this given alias instead of the original name.

The default (sklearn.utils.metadata_routing.UNCHANGED) retains the existing request. This allows you to change the request for some parameters and not others.

Added in version 1.3.

Parameters:
  • sample_weight (str, True, False, or None, default=sklearn.utils.metadata_routing.UNCHANGED) – Metadata routing for sample_weight parameter in score.

  • self (ClassifierChain)

Returns:

self (object) – The updated object.

Return type:

ClassifierChain

class endgame.ensemble.RegressorChain(estimator=None, order='auto', n_jobs=None, random_state=None, verbose=False)[source]

Bases: BaseEstimator, RegressorMixin

Chain regressors where each uses predictions of previous outputs as features.

Each regressor in the chain receives the original feature matrix X augmented with the predictions from all preceding regressors. This allows the chain to model dependencies between outputs.

Parameters:
  • estimator (estimator) – The base regressor to clone for each link in the chain.

  • order (str or list of int, default='auto') –

    Chain ordering strategy:

    • 'auto': greedy ordering by pairwise correlation so that adjacent outputs in the chain are maximally correlated.

    • 'random': random permutation (seeded by random_state).

    • list of int: explicit column ordering.

  • n_jobs (int, optional) – Not used directly (chain is inherently sequential), but stored for API consistency.

  • random_state (int, optional) – Random seed for random ordering and estimator cloning.

  • verbose (bool, default=False) – Enable verbose output.

estimators_

Fitted regressors in chain order.

Type:

List[estimator]

order_

The resolved output ordering.

Type:

list of int

n_outputs_

Number of output columns.

Type:

int

Examples

>>> from endgame.ensemble.multi_output import RegressorChain
>>> from sklearn.linear_model import Ridge
>>> import numpy as np
>>> X = np.random.randn(200, 5)
>>> Y = np.random.randn(200, 3)
>>> chain = RegressorChain(Ridge(), order='auto')
>>> chain.fit(X, Y)
>>> preds = chain.predict(X)
>>> preds.shape
(200, 3)
fit(X, Y, sample_weight=None)[source]

Fit the regressor chain.

Parameters:
  • X (array-like of shape (n_samples, n_features)) – Training features.

  • Y (array-like of shape (n_samples, n_outputs)) – Multi-output target matrix.

  • sample_weight (array-like of shape (n_samples,), optional) – Sample weights.

Returns:

self

predict(X)[source]

Predict target values for each output.

At prediction time, the chain uses its own predictions (rather than ground truth) for augmentation.

Parameters:

X (array-like of shape (n_samples, n_features)) – Samples to predict.

Returns:

ndarray of shape (n_samples, n_outputs) – Predicted values in original column order.

property feature_importances_

Average feature importances across chain estimators.

Only includes importances for the original features (not the chained predictions), averaged across all estimators.

Returns:

ndarray of shape (n_features,) – Mean feature importances for the original features.

Raises:

AttributeError – If the base estimators do not expose feature_importances_.
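
The averaging over original features can be illustrated with made-up importance vectors. The sketch assumes chained predictions are appended after the original columns, so the first n_features entries of each link correspond to the original features. The helper name is hypothetical and no renormalization is applied.

```python
import numpy as np


def chain_feature_importances(per_estimator_importances, n_features):
    """Keep the first n_features entries of each link's importances and average them."""
    trimmed = [np.asarray(imp, dtype=float)[:n_features] for imp in per_estimator_importances]
    return np.mean(trimmed, axis=0)


# Illustrative values only: link k sees n_features + k input columns.
importances = [
    [0.5, 0.3, 0.2],            # link 0: original features only
    [0.4, 0.2, 0.1, 0.3],       # link 1: plus one chained prediction
    [0.3, 0.1, 0.2, 0.2, 0.2],  # link 2: plus two chained predictions
]
result = chain_feature_importances(importances, n_features=3)
```

Because the importance assigned to chained predictions is dropped, the averaged vector does not necessarily sum to 1.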

score(X, Y, sample_weight=None)[source]

Return the mean R^2 score across all outputs.

Parameters:
  • X (array-like of shape (n_samples, n_features)) – Test samples.

  • Y (array-like of shape (n_samples, n_outputs)) – True target values.

  • sample_weight (array-like, optional) – Sample weights.

Returns:

float – Mean of per-output R^2 scores.
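As a sketch of what a mean of per-output R^2 scores computes, the NumPy function below evaluates a (weighted) R^2 for each column and averages the results uniformly. mean_r2 is a hypothetical name, and the handling of sample_weight is an assumption; it is intended to correspond to sklearn.metrics.r2_score with multioutput='uniform_average'.

```python
import numpy as np

def mean_r2(Y_true, Y_pred, sample_weight=None):
    """Uniform average of per-output R^2 scores (illustrative sketch)."""
    Y_true = np.asarray(Y_true, dtype=float)
    Y_pred = np.asarray(Y_pred, dtype=float)
    if sample_weight is None:
        w = np.ones(Y_true.shape[0])
    else:
        w = np.asarray(sample_weight, dtype=float)
    w = w[:, None]  # broadcast the weights over output columns
    y_mean = np.sum(w * Y_true, axis=0) / np.sum(w)
    ss_res = np.sum(w * (Y_true - Y_pred) ** 2, axis=0)  # residual sum of squares
    ss_tot = np.sum(w * (Y_true - y_mean) ** 2, axis=0)  # total sum of squares
    return float(np.mean(1.0 - ss_res / ss_tot))

Y_true = np.array([[1.0, 2.0], [2.0, 4.0], [3.0, 6.0]])
# First output predicted exactly; second output predicted by its mean.
Y_pred = np.array([[1.0, 4.0], [2.0, 4.0], [3.0, 4.0]])
score = mean_r2(Y_true, Y_pred)
```

In this example the first output has R^2 = 1 and the second has R^2 = 0, so the mean is 0.5.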

set_fit_request(*, sample_weight='$UNCHANGED$')

Configure whether metadata should be requested to be passed to the fit method.

Note that this method is only relevant when this estimator is used as a sub-estimator within a meta-estimator and metadata routing is enabled with enable_metadata_routing=True (see sklearn.set_config()). Please check the User Guide on how the routing mechanism works.

The options for each parameter are:

  • True: metadata is requested, and passed to fit if provided. The request is ignored if metadata is not provided.

  • False: metadata is not requested and the meta-estimator will not pass it to fit.

  • None: metadata is not requested, and the meta-estimator will raise an error if the user provides it.

  • str: metadata should be passed to the meta-estimator with this given alias instead of the original name.

The default (sklearn.utils.metadata_routing.UNCHANGED) retains the existing request. This allows you to change the request for some parameters and not others.

Added in version 1.3.

Parameters:
  • sample_weight (str, True, False, or None, default=sklearn.utils.metadata_routing.UNCHANGED) – Metadata routing for sample_weight parameter in fit.


Returns:

self (object) – The updated object.

Return type:

RegressorChain

set_score_request(*, sample_weight='$UNCHANGED$')

Configure whether metadata should be requested to be passed to the score method.

Note that this method is only relevant when this estimator is used as a sub-estimator within a meta-estimator and metadata routing is enabled with enable_metadata_routing=True (see sklearn.set_config()). Please check the User Guide on how the routing mechanism works.

The options for each parameter are:

  • True: metadata is requested, and passed to score if provided. The request is ignored if metadata is not provided.

  • False: metadata is not requested and the meta-estimator will not pass it to score.

  • None: metadata is not requested, and the meta-estimator will raise an error if the user provides it.

  • str: metadata should be passed to the meta-estimator with this given alias instead of the original name.

The default (sklearn.utils.metadata_routing.UNCHANGED) retains the existing request. This allows you to change the request for some parameters and not others.

Added in version 1.3.

Parameters:
  • sample_weight (str, True, False, or None, default=sklearn.utils.metadata_routing.UNCHANGED) – Metadata routing for sample_weight parameter in score.


Returns:

self (object) – The updated object.

Return type:

RegressorChain