Ensemble¶

class endgame.ensemble.VotingClassifier(estimators, voting='soft', weights=None, flatten_transform=True, n_jobs=None, verbose=False)[source]¶

Bases: BaseEstimator, ClassifierMixin

Soft / hard voting meta-classifier.

Parameters:

estimators (list of (str, estimator) tuples) – Named base classifiers.
voting ({'hard', 'soft'}, default='soft') –
- 'hard': majority-vote on predicted labels.
- 'soft': average predicted probabilities, then argmax.
weights (array-like of shape (n_estimators,), optional) – Per-estimator weights. None means uniform.
flatten_transform (bool, default=True) – If True, transform returns a 2-D array of shape (n_samples, n_classifiers * n_classes) instead of a 3-D array.
n_jobs (int or None, default=None) – Parallel fitting jobs. -1 uses all CPUs.
verbose (bool, default=False) – Print progress during fit.

estimators_¶

Fitted clones in the same order as estimators.

Type:: list of estimator

classes_¶

Unique class labels.

Type:: ndarray of shape (n_classes,)

le_¶

Mapping from label to integer index (for hard voting).

Type:: dict

Examples

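>>> from sklearn.ensemble import RandomForestClassifier
>>> from sklearn.linear_model import LogisticRegression
>>> from endgame.ensemble import VotingClassifier
>>> vc = VotingClassifier(
...     estimators=[("rf", RandomForestClassifier()), ("lr", LogisticRegression())],
...     voting="soft",
... )
>>> vc.fit(X_train, y_train).predict(X_test)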
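The difference between the two voting modes can be illustrated without the library. The following is a minimal pure-Python sketch, not endgame's implementation; the helper names soft_vote and hard_vote and the sample probabilities are hypothetical, and dividing by the sum of the weights is an assumption about how weights are applied.

```python
from collections import Counter

def soft_vote(probas, weights=None):
    """Weighted average of per-model class probabilities, then argmax."""
    weights = weights or [1.0] * len(probas)
    total = sum(weights)
    n_classes = len(probas[0])
    avg = [
        sum(w * p[k] for w, p in zip(weights, probas)) / total
        for k in range(n_classes)
    ]
    return max(range(n_classes), key=avg.__getitem__), avg

def hard_vote(probas):
    """Majority vote over each model's own predicted label."""
    labels = [max(range(len(p)), key=p.__getitem__) for p in probas]
    return Counter(labels).most_common(1)[0][0]

# Three hypothetical models scoring one sample over classes 0 and 1.
probas = [[0.45, 0.55], [0.40, 0.60], [0.95, 0.05]]

hard_label = hard_vote(probas)        # two of the three models prefer class 1
soft_label, avg = soft_vote(probas)   # the confident third model pulls the average to class 0
```

The two modes disagree here because hard voting discards how confident each model is, while soft voting keeps that information.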

fit(X, y, sample_weight=None, **fit_params)[source]¶

Fit all base estimators.

Parameters:

X (array-like of shape (n_samples, n_features))
y (array-like of shape (n_samples,))
sample_weight (array-like, optional)

predict(X)[source]¶

Predict class labels.

Parameters:: X (array-like of shape (n_samples, n_features))
Returns:: ndarray of shape (n_samples,)

predict_proba(X)[source]¶

Average predicted probabilities.

Parameters:: X (array-like of shape (n_samples, n_features))
Returns:: ndarray of shape (n_samples, n_classes)

transform(X)[source]¶

Return per-estimator predictions or probabilities.

Parameters:: X (array-like of shape (n_samples, n_features))
Returns:: ndarray

get_params(deep=True)[source]¶

Get parameters for this estimator.

Parameters:: deep (bool, default=True) – If True, will return the parameters for this estimator and contained subobjects that are estimators.
Returns:: params (dict) – Parameter names mapped to their values.

property named_estimators: dict[str, BaseEstimator]¶

Access fitted estimators by name.

set_fit_request(*, sample_weight='$UNCHANGED$')¶

Configure whether metadata should be requested to be passed to the fit method.

Note that this method is only relevant when this estimator is used as a sub-estimator within a meta-estimator and metadata routing is enabled with enable_metadata_routing=True (see sklearn.set_config()). Please check the User Guide on how the routing mechanism works.

The options for each parameter are:

True: metadata is requested, and passed to fit if provided. The request is ignored if metadata is not provided.
False: metadata is not requested and the meta-estimator will not pass it to fit.
None: metadata is not requested, and the meta-estimator will raise an error if the user provides it.
str: metadata should be passed to the meta-estimator with this given alias instead of the original name.

The default (sklearn.utils.metadata_routing.UNCHANGED) retains the existing request. This allows you to change the request for some parameters and not others.

Added in version 1.3.

Parameters:

sample_weight (str, True, False, or None, default=sklearn.utils.metadata_routing.UNCHANGED) – Metadata routing for sample_weight parameter in fit.

Returns:

self (object) – The updated object.

Return type:

VotingClassifier

set_score_request(*, sample_weight='$UNCHANGED$')¶

Configure whether metadata should be requested to be passed to the score method.

The options for each parameter are:

True: metadata is requested, and passed to score if provided. The request is ignored if metadata is not provided.
False: metadata is not requested and the meta-estimator will not pass it to score.
None: metadata is not requested, and the meta-estimator will raise an error if the user provides it.
str: metadata should be passed to the meta-estimator with this given alias instead of the original name.

The default (sklearn.utils.metadata_routing.UNCHANGED) retains the existing request. This allows you to change the request for some parameters and not others.

Added in version 1.3.

Parameters:

sample_weight (str, True, False, or None, default=sklearn.utils.metadata_routing.UNCHANGED) – Metadata routing for sample_weight parameter in score.

Returns:

self (object) – The updated object.

Return type:

VotingClassifier

class endgame.ensemble.VotingRegressor(estimators, weights=None, n_jobs=None, verbose=False)[source]¶

Bases: BaseEstimator, RegressorMixin

Voting meta-regressor: averages predictions from multiple regressors.

Parameters:

estimators (list of (str, estimator) tuples) – Named base regressors.
weights (array-like of shape (n_estimators,), optional) – Per-estimator weights. None means uniform.
n_jobs (int or None, default=None) – Parallel fitting jobs.
verbose (bool, default=False) – Print progress during fit.

estimators_¶

Fitted clones.

Type:: list of estimator

Examples

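>>> from sklearn.ensemble import RandomForestRegressor
>>> from sklearn.linear_model import Ridge
>>> from endgame.ensemble import VotingRegressor
>>> vr = VotingRegressor(
...     estimators=[("ridge", Ridge()), ("rf", RandomForestRegressor())],
...     weights=[1, 2],
... )
>>> vr.fit(X_train, y_train).predict(X_test)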
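To make the effect of weights=[1, 2] concrete, here is a small pure-Python sketch of a weighted average of regressor outputs. It assumes the weights are normalized by their sum; the helper name weighted_average and the prediction values are hypothetical, not taken from endgame.

```python
def weighted_average(predictions, weights=None):
    """Combine per-model prediction lists into one weighted mean per sample."""
    weights = weights or [1.0] * len(predictions)
    total = sum(weights)
    return [
        sum(w * p for w, p in zip(weights, sample)) / total
        for sample in zip(*predictions)
    ]

# Hypothetical outputs of two regressors on three samples.
ridge_pred = [10.0, 20.0, 30.0]
rf_pred = [13.0, 17.0, 30.0]

# With weights [1, 2] the second model counts twice as much as the first.
blended = weighted_average([ridge_pred, rf_pred], weights=[1, 2])
```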

fit(X, y, sample_weight=None, **fit_params)[source]¶

predict(X)[source]¶

transform(X)[source]¶

property named_estimators: dict[str, BaseEstimator]¶

set_fit_request(*, sample_weight='$UNCHANGED$')¶

Configure whether metadata should be requested to be passed to the fit method.

The options for each parameter are:

True: metadata is requested, and passed to fit if provided. The request is ignored if metadata is not provided.
False: metadata is not requested and the meta-estimator will not pass it to fit.
None: metadata is not requested, and the meta-estimator will raise an error if the user provides it.
str: metadata should be passed to the meta-estimator with this given alias instead of the original name.

The default (sklearn.utils.metadata_routing.UNCHANGED) retains the existing request. This allows you to change the request for some parameters and not others.

Added in version 1.3.

Parameters:

sample_weight (str, True, False, or None, default=sklearn.utils.metadata_routing.UNCHANGED) – Metadata routing for sample_weight parameter in fit.

Returns:

self (object) – The updated object.

Return type:

VotingRegressor

set_score_request(*, sample_weight='$UNCHANGED$')¶

Configure whether metadata should be requested to be passed to the score method.

The options for each parameter are:

True: metadata is requested, and passed to score if provided. The request is ignored if metadata is not provided.
False: metadata is not requested and the meta-estimator will not pass it to score.
None: metadata is not requested, and the meta-estimator will raise an error if the user provides it.
str: metadata should be passed to the meta-estimator with this given alias instead of the original name.

The default (sklearn.utils.metadata_routing.UNCHANGED) retains the existing request. This allows you to change the request for some parameters and not others.

Added in version 1.3.

Parameters:

sample_weight (str, True, False, or None, default=sklearn.utils.metadata_routing.UNCHANGED) – Metadata routing for sample_weight parameter in score.

Returns:

self (object) – The updated object.

Return type:

VotingRegressor

class endgame.ensemble.BaggingClassifier(base_estimator=None, n_estimators=10, max_samples=1.0, max_features=1.0, bootstrap=True, bootstrap_features=False, oob_score=False, n_jobs=None, random_state=None, verbose=False)[source]¶

Bases: BaseEstimator, ClassifierMixin

Bootstrap Aggregating classifier.

Parameters:

base_estimator (estimator, optional) – Base learner to bag. Default: DecisionTreeClassifier().
n_estimators (int, default=10) – Number of bootstrap replicas.
max_samples (float or int, default=1.0) – Fraction (if float) or count (if int) of samples per bag.
max_features (float or int, default=1.0) – Fraction (if float) or count (if int) of features per bag.
bootstrap (bool, default=True) – Sample rows with replacement.
bootstrap_features (bool, default=False) – Sample features with replacement.
oob_score (bool, default=False) – Whether to compute out-of-bag accuracy.
n_jobs (int or None, default=None) – Parallel jobs. -1 uses all CPUs.
random_state (int or None, default=None) – Random seed.
verbose (bool, default=False)
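The row-sampling parameters can be sketched as follows. This is a minimal illustration of bootstrap sampling under stated assumptions, not the library's code: the helper draw_bag is hypothetical, and the handling of float versus int max_samples follows the parameter descriptions above.

```python
import random

def draw_bag(n_samples, max_samples=1.0, bootstrap=True, rng=None):
    """Return the row indices for one bag and the rows left out of it."""
    rng = rng or random.Random()
    # A float is read as a fraction of the rows, an int as an absolute count.
    if isinstance(max_samples, float):
        size = max(1, int(max_samples * n_samples))
    else:
        size = max_samples
    if bootstrap:
        in_bag = [rng.randrange(n_samples) for _ in range(size)]  # with replacement
    else:
        in_bag = rng.sample(range(n_samples), size)               # without replacement
    out_of_bag = sorted(set(range(n_samples)) - set(in_bag))
    return in_bag, out_of_bag

rng = random.Random(0)
in_bag, out_of_bag = draw_bag(n_samples=10, max_samples=1.0, bootstrap=True, rng=rng)
```

Rows that end up in out_of_bag were never seen by that bag's estimator, which is what makes an out-of-bag score possible when oob_score=True.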

estimators_¶

Fitted base estimators.

Type:: list of estimator

estimator_features_¶

Feature indices used by each estimator.

Type:: list of ndarray

oob_score_¶

OOB accuracy (only if oob_score=True).

Type:: float

oob_decision_function_¶

OOB predicted probabilities (only if oob_score=True).

Type:: ndarray

classes_¶

Type:: ndarray

fit(X, y, sample_weight=None)[source]¶

predict(X)[source]¶

predict_proba(X)[source]¶

property feature_importances_¶

Average feature importances across bags (original feature space).

set_fit_request(*, sample_weight='$UNCHANGED$')¶

Configure whether metadata should be requested to be passed to the fit method.

The options for each parameter are:

True: metadata is requested, and passed to fit if provided. The request is ignored if metadata is not provided.
False: metadata is not requested and the meta-estimator will not pass it to fit.
None: metadata is not requested, and the meta-estimator will raise an error if the user provides it.
str: metadata should be passed to the meta-estimator with this given alias instead of the original name.

The default (sklearn.utils.metadata_routing.UNCHANGED) retains the existing request. This allows you to change the request for some parameters and not others.

Added in version 1.3.

Parameters:

sample_weight (str, True, False, or None, default=sklearn.utils.metadata_routing.UNCHANGED) – Metadata routing for sample_weight parameter in fit.

Returns:

self (object) – The updated object.

Return type:

BaggingClassifier

set_score_request(*, sample_weight='$UNCHANGED$')¶

Configure whether metadata should be requested to be passed to the score method.

The options for each parameter are:

True: metadata is requested, and passed to score if provided. The request is ignored if metadata is not provided.
False: metadata is not requested and the meta-estimator will not pass it to score.
None: metadata is not requested, and the meta-estimator will raise an error if the user provides it.
str: metadata should be passed to the meta-estimator with this given alias instead of the original name.

The default (sklearn.utils.metadata_routing.UNCHANGED) retains the existing request. This allows you to change the request for some parameters and not others.

Added in version 1.3.

Parameters:

sample_weight (str, True, False, or None, default=sklearn.utils.metadata_routing.UNCHANGED) – Metadata routing for sample_weight parameter in score.

Returns:

self (object) – The updated object.

Return type:

BaggingClassifier

class endgame.ensemble.BaggingRegressor(base_estimator=None, n_estimators=10, max_samples=1.0, max_features=1.0, bootstrap=True, bootstrap_features=False, oob_score=False, n_jobs=None, random_state=None, verbose=False)[source]¶

Bases: BaseEstimator, RegressorMixin

Bootstrap Aggregating regressor.

Parameters:

base_estimator (estimator, optional) – Base learner. Default: DecisionTreeRegressor().
n_estimators (int, default=10)
max_samples (float or int, default=1.0)
max_features (float or int, default=1.0)
bootstrap (bool, default=True)
bootstrap_features (bool, default=False)
oob_score (bool, default=False)
n_jobs (int or None, default=None)
random_state (int or None, default=None)
verbose (bool, default=False)

estimators_¶

Type:: list of estimator

estimator_features_¶

Type:: list of ndarray

oob_score_¶

OOB R² score (only if oob_score=True).

Type:: float

oob_prediction_¶

OOB predictions (only if oob_score=True).

Type:: ndarray
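As an illustration of how out-of-bag predictions and an OOB R² score can be derived, the sketch below averages, for each row, the predictions of the bags that did not train on it. The helper oob_r2 and the numbers are hypothetical; skipping rows that were never out of bag is an assumption, not confirmed library behavior.

```python
def oob_r2(y, bag_predictions, bag_oob_rows):
    """Average each row's predictions from bags that did not train on it, then compute R^2."""
    sums = [0.0] * len(y)
    counts = [0] * len(y)
    for preds, oob_rows in zip(bag_predictions, bag_oob_rows):
        for i in oob_rows:
            sums[i] += preds[i]
            counts[i] += 1
    covered = [i for i, c in enumerate(counts) if c > 0]   # rows with at least one OOB vote
    oob_pred = {i: sums[i] / counts[i] for i in covered}
    mean_y = sum(y[i] for i in covered) / len(covered)
    ss_res = sum((y[i] - oob_pred[i]) ** 2 for i in covered)
    ss_tot = sum((y[i] - mean_y) ** 2 for i in covered)
    return 1.0 - ss_res / ss_tot, oob_pred

y = [1.0, 2.0, 3.0, 4.0]
# Predictions of two hypothetical bagged models on all four rows; only the
# entries listed in bag_oob_rows are used, the others are placeholders.
bag_predictions = [[1.0, 2.5, 3.0, 0.0], [1.5, 1.5, 0.0, 4.0]]
bag_oob_rows = [[1, 2], [0, 1, 3]]

score, oob_pred = oob_r2(y, bag_predictions, bag_oob_rows)
```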

fit(X, y, sample_weight=None)[source]¶

predict(X)[source]¶

property feature_importances_¶

set_fit_request(*, sample_weight='$UNCHANGED$')¶

Configure whether metadata should be requested to be passed to the fit method.

The options for each parameter are:

True: metadata is requested, and passed to fit if provided. The request is ignored if metadata is not provided.
False: metadata is not requested and the meta-estimator will not pass it to fit.
None: metadata is not requested, and the meta-estimator will raise an error if the user provides it.
str: metadata should be passed to the meta-estimator with this given alias instead of the original name.

The default (sklearn.utils.metadata_routing.UNCHANGED) retains the existing request. This allows you to change the request for some parameters and not others.

Added in version 1.3.

Parameters:

sample_weight (str, True, False, or None, default=sklearn.utils.metadata_routing.UNCHANGED) – Metadata routing for sample_weight parameter in fit.

Returns:

self (object) – The updated object.

Return type:

BaggingRegressor

set_score_request(*, sample_weight='$UNCHANGED$')¶

Configure whether metadata should be requested to be passed to the score method.

The options for each parameter are:

True: metadata is requested, and passed to score if provided. The request is ignored if metadata is not provided.
False: metadata is not requested and the meta-estimator will not pass it to score.
None: metadata is not requested, and the meta-estimator will raise an error if the user provides it.
str: metadata should be passed to the meta-estimator with this given alias instead of the original name.

The default (sklearn.utils.metadata_routing.UNCHANGED) retains the existing request. This allows you to change the request for some parameters and not others.

Added in version 1.3.

Parameters:

sample_weight (str, True, False, or None, default=sklearn.utils.metadata_routing.UNCHANGED) – Metadata routing for sample_weight parameter in score.

Returns:

self (object) – The updated object.

Return type:

BaggingRegressor

class endgame.ensemble.AdaBoostClassifier(base_estimator=None, n_estimators=50, learning_rate=1.0, algorithm='SAMME.R', random_state=None)[source]¶

Bases: BaseEstimator, ClassifierMixin

AdaBoost classifier (SAMME / SAMME.R).

Parameters:

base_estimator (estimator, optional) – Base learner. Default: DecisionTreeClassifier(max_depth=1) (stump).
n_estimators (int, default=50) – Maximum number of boosting rounds.
learning_rate (float, default=1.0) – Shrinkage applied to each estimator’s weight. Lower values usually need more estimators to reach the same training error but often generalize better.
algorithm ({'SAMME', 'SAMME.R'}, default='SAMME.R') –
- 'SAMME': discrete AdaBoost using class labels.
- 'SAMME.R': real AdaBoost using class probabilities (requires predict_proba).
random_state (int or None, default=None)
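For reference, one round of the discrete 'SAMME' update can be sketched as below. This follows the published SAMME formulas rather than endgame's source; the helper samme_round is hypothetical, and the edge cases (zero error, or error no better than random guessing) are deliberately left out.

```python
import math

def samme_round(sample_weights, misclassified, n_classes, learning_rate=1.0):
    """One discrete SAMME boosting round: estimator weight and updated sample weights."""
    total = sum(sample_weights)
    err = sum(w for w, miss in zip(sample_weights, misclassified) if miss) / total
    # Estimator weight; the log(K - 1) term extends AdaBoost to K classes.
    alpha = learning_rate * (math.log((1.0 - err) / err) + math.log(n_classes - 1.0))
    # Up-weight the samples this learner got wrong, then renormalize.
    updated = [
        w * math.exp(alpha) if miss else w
        for w, miss in zip(sample_weights, misclassified)
    ]
    norm = sum(updated)
    return alpha, err, [w / norm for w in updated]

weights = [0.25, 0.25, 0.25, 0.25]
alpha, err, new_weights = samme_round(weights, [False, False, False, True], n_classes=2)
```

After this round the single misclassified sample carries half of the total weight, so the next weak learner is pushed to get it right.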

estimators_¶

Fitted weak learners.

Type:: list of estimator

estimator_weights_¶

Weight of each estimator (SAMME only).

Type:: ndarray

estimator_errors_¶

Weighted error of each estimator.

Type:: ndarray

classes_¶

Type:: ndarray

n_classes_¶

Type:: int

feature_importances_¶

Sum of feature importances weighted by estimator weight.

Type:: ndarray

fit(X, y, sample_weight=None)[source]¶

predict(X)[source]¶

predict_proba(X)[source]¶

set_fit_request(*, sample_weight='$UNCHANGED$')¶

Configure whether metadata should be requested to be passed to the fit method.

The options for each parameter are:

True: metadata is requested, and passed to fit if provided. The request is ignored if metadata is not provided.
False: metadata is not requested and the meta-estimator will not pass it to fit.
None: metadata is not requested, and the meta-estimator will raise an error if the user provides it.
str: metadata should be passed to the meta-estimator with this given alias instead of the original name.

The default (sklearn.utils.metadata_routing.UNCHANGED) retains the existing request. This allows you to change the request for some parameters and not others.

Added in version 1.3.

Parameters:

sample_weight (str, True, False, or None, default=sklearn.utils.metadata_routing.UNCHANGED) – Metadata routing for sample_weight parameter in fit.

Returns:

self (object) – The updated object.

Return type:

AdaBoostClassifier

set_score_request(*, sample_weight='$UNCHANGED$')¶

Configure whether metadata should be requested to be passed to the score method.

The options for each parameter are:

True: metadata is requested, and passed to score if provided. The request is ignored if metadata is not provided.
False: metadata is not requested and the meta-estimator will not pass it to score.
None: metadata is not requested, and the meta-estimator will raise an error if the user provides it.
str: metadata should be passed to the meta-estimator with this given alias instead of the original name.

The default (sklearn.utils.metadata_routing.UNCHANGED) retains the existing request. This allows you to change the request for some parameters and not others.

Added in version 1.3.

Parameters:

sample_weight (str, True, False, or None, default=sklearn.utils.metadata_routing.UNCHANGED) – Metadata routing for sample_weight parameter in score.

Returns:

self (object) – The updated object.

Return type:

AdaBoostClassifier

class endgame.ensemble.AdaBoostRegressor(base_estimator=None, n_estimators=50, learning_rate=1.0, loss='linear', random_state=None)[source]¶

Bases: BaseEstimator, RegressorMixin

AdaBoost.R2 regressor.

Parameters:

base_estimator (estimator, optional) – Default: DecisionTreeRegressor(max_depth=3).
n_estimators (int, default=50)
learning_rate (float, default=1.0)
loss ({'linear', 'square', 'exponential'}, default='linear') – Loss function for computing sample weights.
random_state (int or None, default=None)
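The three loss options correspond to the per-sample losses of Drucker's AdaBoost.R2. The sketch below shows those losses and the resulting sample-weight update; it is an illustration of the published algorithm, not endgame's code. The helper names are hypothetical, learning_rate is omitted, and the early-stopping case (average loss of 0.5 or more) is not handled.

```python
import math

def r2_losses(y_true, y_pred, loss="linear"):
    """Per-sample loss in [0, 1], scaled by the largest absolute error."""
    abs_err = [abs(t - p) for t, p in zip(y_true, y_pred)]
    max_err = max(abs_err)
    if max_err == 0:
        return [0.0] * len(abs_err)   # perfect fit, nothing to reweight
    scaled = [e / max_err for e in abs_err]
    if loss == "linear":
        return scaled
    if loss == "square":
        return [e ** 2 for e in scaled]
    if loss == "exponential":
        return [1.0 - math.exp(-e) for e in scaled]
    raise ValueError(f"unknown loss: {loss!r}")

def r2_reweight(sample_weights, losses):
    """Shrink the weights of well-predicted samples by beta ** (1 - loss)."""
    total = sum(sample_weights)
    avg_loss = sum(w * l for w, l in zip(sample_weights, losses)) / total
    beta = avg_loss / (1.0 - avg_loss)
    updated = [w * beta ** (1.0 - l) for w, l in zip(sample_weights, losses)]
    norm = sum(updated)
    return beta, [w / norm for w in updated]

y_true = [1.0, 2.0, 3.0, 4.0]
y_pred = [1.0, 2.5, 3.0, 5.0]
losses = r2_losses(y_true, y_pred, loss="linear")
beta, new_weights = r2_reweight([0.25] * 4, losses)
```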

estimators_¶

Type:: list of estimator

estimator_weights_¶

Type:: ndarray

estimator_errors_¶

Type:: ndarray

feature_importances_¶

Type:: ndarray

fit(X, y, sample_weight=None)[source]¶

predict(X)[source]¶

set_fit_request(*, sample_weight='$UNCHANGED$')¶

Configure whether metadata should be requested to be passed to the fit method.

The options for each parameter are:

True: metadata is requested, and passed to fit if provided. The request is ignored if metadata is not provided.
False: metadata is not requested and the meta-estimator will not pass it to fit.
None: metadata is not requested, and the meta-estimator will raise an error if the user provides it.
str: metadata should be passed to the meta-estimator with this given alias instead of the original name.

The default (sklearn.utils.metadata_routing.UNCHANGED) retains the existing request. This allows you to change the request for some parameters and not others.

Added in version 1.3.

Parameters:

sample_weight (str, True, False, or None, default=sklearn.utils.metadata_routing.UNCHANGED) – Metadata routing for sample_weight parameter in fit.

Returns:

self (object) – The updated object.

Return type:

AdaBoostRegressor

set_score_request(*, sample_weight='$UNCHANGED$')¶

Configure whether metadata should be requested to be passed to the score method.

The options for each parameter are:

True: metadata is requested, and passed to score if provided. The request is ignored if metadata is not provided.
False: metadata is not requested and the meta-estimator will not pass it to score.
None: metadata is not requested, and the meta-estimator will raise an error if the user provides it.
str: metadata should be passed to the meta-estimator with this given alias instead of the original name.

The default (sklearn.utils.metadata_routing.UNCHANGED) retains the existing request. This allows you to change the request for some parameters and not others.

Added in version 1.3.

Parameters:

sample_weight (str, True, False, or None, default=sklearn.utils.metadata_routing.UNCHANGED) – Metadata routing for sample_weight parameter in score.

Returns:

self (object) – The updated object.

Return type:

AdaBoostRegressor

class endgame.ensemble.StackingEnsemble(base_estimators=None, meta_estimator=None, cv=5, passthrough=False, use_proba=True, stack_method='auto', random_state=None, verbose=False)[source]¶

Bases: BaseEnsemble

Multi-level stacking with out-of-fold prediction handling.

Level 1: Diverse base models (GBDTs, NNs, etc.)
Level 2: Meta-learner (typically Ridge/Linear Regression)

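The meta-learner is trained on out-of-fold predictions from Level 1, so it never sees a prediction that a base model made on its own training rows. This reduces overfitting caused by target leakage.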
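The out-of-fold idea can be sketched in a few lines of pure Python. This is an illustration only, not the library's implementation: kfold_indices, oof_predictions, and the toy mean_model are hypothetical names, and the folds here are contiguous and unshuffled for readability.

```python
def kfold_indices(n_samples, n_splits):
    """Yield (train_rows, valid_rows) for contiguous, unshuffled folds."""
    fold_sizes = [
        n_samples // n_splits + (1 if i < n_samples % n_splits else 0)
        for i in range(n_splits)
    ]
    start = 0
    for size in fold_sizes:
        valid = list(range(start, start + size))
        train = [i for i in range(n_samples) if i < start or i >= start + size]
        yield train, valid
        start += size

def oof_predictions(fit_predict, X, y, n_splits=5):
    """Predict every row with a model that never saw that row during fitting."""
    oof = [None] * len(y)
    for train, valid in kfold_indices(len(y), n_splits):
        preds = fit_predict(
            [X[i] for i in train], [y[i] for i in train], [X[i] for i in valid]
        )
        for i, p in zip(valid, preds):
            oof[i] = p
    return oof

def mean_model(X_train, y_train, X_valid):
    """Toy stand-in for a Level 1 model: always predicts the training mean."""
    mean = sum(y_train) / len(y_train)
    return [mean] * len(X_valid)

X = [[float(i)] for i in range(6)]
y = [0.0, 0.0, 3.0, 3.0, 6.0, 6.0]
oof = oof_predictions(mean_model, X, y, n_splits=3)
```

With several base models, each one contributes one such column, and the stacked columns form the training matrix for the Level 2 model.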

Parameters:

base_estimators (List[estimator]) – Level 1 models.
meta_estimator (estimator, optional) – Level 2 model. Default: Ridge for regression, LogisticRegression for classification.
cv (int or CV splitter, default=5) – Cross-validation strategy for OOF predictions.
passthrough (bool, default=False) – Whether to include original features in Level 2.
use_proba (bool, default=True) – Use predict_proba for classification (if available).
stack_method ({'auto', 'predict', 'predict_proba'}, default='auto') – Base-estimator method used to generate the stacked predictions.
random_state (int, optional) – Random seed.
verbose (bool, default=False) – Enable verbose output.

base_estimators_¶

Fitted Level 1 models.

Type:: List[estimator]

meta_estimator_¶

Fitted Level 2 model.

Type:: estimator

oof_predictions_¶

Out-of-fold predictions used for meta-learner training.

Type:: ndarray

Examples

>>> from endgame.ensemble import StackingEnsemble
>>> base_models = [LGBMWrapper(), XGBWrapper(), CatBoostWrapper()]
>>> stacker = StackingEnsemble(base_estimators=base_models)
>>> stacker.fit(X_train, y_train)
>>> predictions = stacker.predict(X_test)

fit(X, y, sample_weight=None, **fit_params)[source]¶

Fit the stacking ensemble.

Parameters:

X (array-like of shape (n_samples, n_features)) – Training data.
y (array-like of shape (n_samples,)) – Target values.
sample_weight (array-like, optional) – Sample weights.
**fit_params – Additional parameters.

Return type:

StackingEnsemble

Returns:

self

predict(X)[source]¶

Predict using the stacking ensemble.

Parameters:: X (array-like of shape (n_samples, n_features)) – Samples to predict.
Return type:: ndarray
Returns:: ndarray – Predictions.

predict_proba(X)[source]¶

Predict class probabilities.

Parameters:: X (array-like of shape (n_samples, n_features)) – Samples to predict.
Return type:: ndarray
Returns:: ndarray of shape (n_samples, n_classes) – Class probabilities.

score(X, y, sample_weight=None)[source]¶

Return the score of the ensemble on the given test data and targets.

For classification, this is the accuracy score. For regression, this is the R^2 score.

Parameters:

X (array-like of shape (n_samples, n_features)) – Test samples.
y (array-like of shape (n_samples,)) – True labels for classification, true values for regression.
sample_weight (array-like of shape (n_samples,), optional) – Sample weights.

Return type:

float

Returns:

float – Score of the predictions.

set_fit_request(*, sample_weight='$UNCHANGED$')¶

Configure whether metadata should be requested to be passed to the fit method.

The options for each parameter are:

True: metadata is requested, and passed to fit if provided. The request is ignored if metadata is not provided.
False: metadata is not requested and the meta-estimator will not pass it to fit.
None: metadata is not requested, and the meta-estimator will raise an error if the user provides it.
str: metadata should be passed to the meta-estimator with this given alias instead of the original name.

The default (sklearn.utils.metadata_routing.UNCHANGED) retains the existing request. This allows you to change the request for some parameters and not others.

Added in version 1.3.

Parameters:

sample_weight (str, True, False, or None, default=sklearn.utils.metadata_routing.UNCHANGED) – Metadata routing for sample_weight parameter in fit.

Returns:

self (object) – The updated object.

Return type:

StackingEnsemble

set_score_request(*, sample_weight='$UNCHANGED$')¶

Configure whether metadata should be requested to be passed to the score method.

The options for each parameter are:

True: metadata is requested, and passed to score if provided. The request is ignored if metadata is not provided.
False: metadata is not requested and the meta-estimator will not pass it to score.
None: metadata is not requested, and the meta-estimator will raise an error if the user provides it.
str: metadata should be passed to the meta-estimator with this given alias instead of the original name.

The default (sklearn.utils.metadata_routing.UNCHANGED) retains the existing request. This allows you to change the request for some parameters and not others.

Added in version 1.3.

Parameters:

sample_weight (str, True, False, or None, default=sklearn.utils.metadata_routing.UNCHANGED) – Metadata routing for sample_weight parameter in score.

Returns:

self (object) – The updated object.

Return type:

StackingEnsemble

class endgame.ensemble.BlendingEnsemble(base_estimators=None, meta_estimator=None, blend_fraction=0.2, use_proba=True, passthrough=False, cv=None, random_state=None, verbose=False)[source]¶

Bases: BaseEnsemble, ClassifierMixin

Blending ensemble that trains the meta-learner on a hold-out set.

Unlike stacking, which uses cross-validation, blending sets aside a hold-out portion of the training data and uses the base models' predictions on it as meta-features for the second-level learner.
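The hold-out split behind blending can be sketched as follows. This is a minimal illustration under stated assumptions: the helper blend_split is hypothetical, and whether the library shuffles or stratifies the split is not stated in this reference.

```python
import random

def blend_split(n_samples, blend_fraction=0.2, seed=None):
    """Split row indices into a base-model part and a hold-out part for the meta-learner."""
    rows = list(range(n_samples))
    random.Random(seed).shuffle(rows)
    n_blend = max(1, int(round(blend_fraction * n_samples)))
    return sorted(rows[n_blend:]), sorted(rows[:n_blend])

base_rows, blend_rows = blend_split(n_samples=10, blend_fraction=0.2, seed=0)
# Level 1 models would be fit on base_rows only; their predictions on
# blend_rows become the meta-features used to fit the Level 2 model.
```

Compared with out-of-fold stacking, each base model is fit once, but the meta-learner sees only the held-out fraction of the data.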

Parameters:

base_estimators (List[estimator]) – Level 1 models.
meta_estimator (estimator, optional) – Level 2 model. Default: LogisticRegression for classification.
blend_fraction (float, default=0.2) – Fraction of training data to use for blending (meta-learner training).
use_proba (bool, default=True) – Use predict_proba for classification (if available).
passthrough (bool, default=False) – Whether to include original features in Level 2.
cv (int, optional) – Ignored. For API compatibility with StackingEnsemble.
random_state (int, optional) – Random seed.
verbose (bool, default=False) – Enable verbose output.

base_estimators_¶

Fitted Level 1 models.

Type:: List[estimator]

meta_estimator_¶

Fitted Level 2 model.

Type:: estimator

classes_¶

Unique class labels (for classification).

Type:: ndarray

Examples

>>> from sklearn.ensemble import GradientBoostingClassifier, RandomForestClassifier
>>> from endgame.ensemble import BlendingEnsemble
>>> base_models = [RandomForestClassifier(), GradientBoostingClassifier()]
>>> blender = BlendingEnsemble(base_estimators=base_models)
>>> blender.fit(X_train, y_train)
>>> predictions = blender.predict(X_test)

fit(X, y, sample_weight=None, **fit_params)[source]¶

Fit the blending ensemble.

Parameters:

X (array-like of shape (n_samples, n_features)) – Training data.
y (array-like of shape (n_samples,)) – Target values.
sample_weight (array-like, optional) – Sample weights.

Return type:

BlendingEnsemble

Returns:

self

predict(X)[source]¶

Predict using the blending ensemble.

Parameters:: X (array-like of shape (n_samples, n_features)) – Samples to predict.
Return type:: ndarray
Returns:: ndarray – Predictions.

predict_proba(X)[source]¶

Predict class probabilities.

Parameters:: X (array-like of shape (n_samples, n_features)) – Samples to predict.
Return type:: ndarray
Returns:: ndarray of shape (n_samples, n_classes) – Class probabilities.

score(X, y, sample_weight=None)[source]¶

Return accuracy score on the given data.

Parameters:

X (array-like of shape (n_samples, n_features)) – Test samples.
y (array-like of shape (n_samples,)) – True labels.
sample_weight (array-like, optional) – Sample weights.

Return type:

float

Returns:

float – Accuracy score.

set_fit_request(*, sample_weight='$UNCHANGED$')¶

Configure whether metadata should be requested to be passed to the fit method.

The options for each parameter are:

True: metadata is requested, and passed to fit if provided. The request is ignored if metadata is not provided.
False: metadata is not requested and the meta-estimator will not pass it to fit.
None: metadata is not requested, and the meta-estimator will raise an error if the user provides it.
str: metadata should be passed to the meta-estimator with this given alias instead of the original name.

The default (sklearn.utils.metadata_routing.UNCHANGED) retains the existing request. This allows you to change the request for some parameters and not others.

Added in version 1.3.

Parameters:

sample_weight (str, True, False, or None, default=sklearn.utils.metadata_routing.UNCHANGED) – Metadata routing for sample_weight parameter in fit.

Returns:

self (object) – The updated object.

Return type:

BlendingEnsemble

set_score_request(*, sample_weight='$UNCHANGED$')¶

Configure whether metadata should be requested to be passed to the score method.

The options for each parameter are:

True: metadata is requested, and passed to score if provided. The request is ignored if metadata is not provided.
False: metadata is not requested and the meta-estimator will not pass it to score.
None: metadata is not requested, and the meta-estimator will raise an error if the user provides it.
str: metadata should be passed to the meta-estimator with this given alias instead of the original name.

The default (sklearn.utils.metadata_routing.UNCHANGED) retains the existing request. This allows you to change the request for some parameters and not others.

Added in version 1.3.

Parameters:

sample_weight (str, True, False, or None, default=sklearn.utils.metadata_routing.UNCHANGED) – Metadata routing for sample_weight parameter in score.

Returns:

self (object) – The updated object.

Return type:

BlendingEnsemble

class endgame.ensemble.OptimizedBlender(metric='roc_auc', n_trials=100, weight_bounds=(0, 1), normalize=True, maximize=True, random_state=None, verbose=False)[source]¶

Bases: BaseEnsemble

Optuna-powered blend weight optimization.

Uses Bayesian optimization to find optimal weights for combining model predictions.

Parameters:

metric (str or callable, default='roc_auc') – Metric to optimize: ‘roc_auc’, ‘rmse’, ‘mae’, etc.
n_trials (int, default=100) – Number of optimization trials.
weight_bounds (Tuple[float, float], default=(0, 1)) – Bounds for individual model weights.
normalize (bool, default=True) – Whether weights must sum to 1.
maximize (bool, default=True) – If True, maximize the metric; if False, minimize it. Set to False for error metrics such as ‘rmse’ or ‘mae’.
random_state (int, optional) – Random seed.
verbose (bool, default=False) – Enable verbose output.

weights_¶

Optimized model weights.

Type:: Dict[int, float]

best_score_¶

Best score achieved.

Type:: float

study_¶

Optuna study object for further analysis.

Type:: optuna.Study

Examples

>>> from endgame.ensemble import OptimizedBlender
>>> blender = OptimizedBlender(metric='roc_auc', n_trials=100)
>>> blender.fit(oof_predictions, y_train)
>>> final_pred = blender.predict(test_predictions)
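The underlying idea, searching for blend weights that optimize a metric on out-of-fold predictions, can be sketched without Optuna. The example below substitutes plain random search for the Optuna study and minimizes RMSE; the function names, the prediction values, and the search strategy are all assumptions made for illustration, not the library's implementation.

```python
import random

def blend(predictions, weights, normalize=True):
    """Weighted sum of per-model predictions; optionally rescale weights to sum to 1."""
    if normalize:
        total = sum(weights)
        weights = [w / total for w in weights]
    return [sum(w * p for w, p in zip(weights, sample)) for sample in zip(*predictions)]

def rmse(y_true, y_pred):
    """Root mean squared error."""
    return (sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)) ** 0.5

def search_weights(predictions, y_true, n_trials=100, weight_bounds=(0.0, 1.0), seed=0):
    """Random search over weights (a stand-in for the Optuna study), minimizing RMSE."""
    rng = random.Random(seed)
    low, high = weight_bounds
    best_w = [1.0] * len(predictions)            # start from the uniform blend
    best_score = rmse(y_true, blend(predictions, best_w))
    for _ in range(n_trials):
        trial_w = [rng.uniform(low, high) for _ in predictions]
        if sum(trial_w) == 0:
            continue                             # cannot normalize an all-zero draw
        score = rmse(y_true, blend(predictions, trial_w))
        if score < best_score:
            best_w, best_score = trial_w, score
    return best_w, best_score

y_true = [1.0, 2.0, 3.0, 4.0]
model_a = [1.1, 2.1, 2.9, 4.2]     # hypothetical out-of-fold predictions, accurate model
model_b = [3.0, 3.0, 3.0, 3.0]     # hypothetical out-of-fold predictions, weak model
weights, score = search_weights([model_a, model_b], y_true)
```

Because RMSE is an error metric, the search minimizes it, which corresponds to maximize=False in the class above.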

fit(predictions, y_true)[source]¶

Optimize blend weights using Optuna.

Parameters:

predictions (List of arrays) – Out-of-fold predictions from each model.
y_true (array-like) – True target values.

Return type:

OptimizedBlender

Returns:

self

predict(predictions)[source]¶

Apply optimized weights.

Parameters:: predictions (List of arrays) – Predictions from each model.
Return type:: ndarray
Returns:: ndarray – Blended prediction.

set_fit_request(*, predictions='$UNCHANGED$', y_true='$UNCHANGED$')¶

Configure whether metadata should be requested to be passed to the fit method.

The options for each parameter are:

True: metadata is requested, and passed to fit if provided. The request is ignored if metadata is not provided.
False: metadata is not requested and the meta-estimator will not pass it to fit.
None: metadata is not requested, and the meta-estimator will raise an error if the user provides it.
str: metadata should be passed to the meta-estimator with this given alias instead of the original name.

The default (sklearn.utils.metadata_routing.UNCHANGED) retains the existing request. This allows you to change the request for some parameters and not others.

Added in version 1.3.

Parameters:

predictions (str, True, False, or None, default=sklearn.utils.metadata_routing.UNCHANGED) – Metadata routing for predictions parameter in fit.
y_true (str, True, False, or None, default=sklearn.utils.metadata_routing.UNCHANGED) – Metadata routing for y_true parameter in fit.
self (OptimizedBlender)

Returns:

self (object) – The updated object.

Return type:

OptimizedBlender

set_predict_request(*, predictions='$UNCHANGED$')¶

Configure whether metadata should be requested to be passed to the predict method.

The options for each parameter are:

True: metadata is requested, and passed to predict if provided. The request is ignored if metadata is not provided.
False: metadata is not requested and the meta-estimator will not pass it to predict.
None: metadata is not requested, and the meta-estimator will raise an error if the user provides it.
str: metadata should be passed to the meta-estimator with this given alias instead of the original name.

The default (sklearn.utils.metadata_routing.UNCHANGED) retains the existing request. This allows you to change the request for some parameters and not others.

Added in version 1.3.

Parameters:

predictions (str, True, False, or None, default=sklearn.utils.metadata_routing.UNCHANGED) – Metadata routing for predictions parameter in predict.
self (OptimizedBlender)

Returns:

self (object) – The updated object.

Return type:

OptimizedBlender

class endgame.ensemble.RankAverageBlender(method='average', normalize=True, weights=None, random_state=None, verbose=False)[source]¶

Bases: BaseEnsemble

Rank-based blending for submissions.

Converts each model's predictions to ranks before averaging, which makes the blend robust to different prediction scales across models.

Parameters:

method (str, default='average') – Rank method: 'average', 'min', 'max', 'dense', 'ordinal'.
normalize (bool, default=True) – Whether to normalize ranks to [0, 1].
weights (Dict[int, float], optional) – Optional model weights. If None, uniform weights.
random_state (int | None)
verbose (bool)

Examples

>>> blender = RankAverageBlender()
>>> final_pred = blender.blend(test_predictions)

fit(predictions=None, y_true=None)[source]¶

Fit the blender (stores weights if provided).

Parameters:

predictions (ignored)
y_true (ignored)

Return type:

RankAverageBlender

Returns:

self

blend(predictions)[source]¶

Blend predictions using rank averaging.

Parameters:: predictions (List of arrays) – Predictions from each model.
Return type:: ndarray
Returns:: ndarray – Rank-averaged prediction.
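
A minimal sketch of rank averaging, assuming the 'average' tie-handling method and ranks scaled to [0, 1]. The function names are hypothetical and the code works on plain lists; it illustrates the idea rather than reproducing the library's implementation.

```python
def average_ranks(values):
    """Return 1-based ranks; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    pos = 0
    while pos < len(order):
        end = pos
        # Extend the run while the next sorted value ties with the current one.
        while end + 1 < len(order) and values[order[end + 1]] == values[order[pos]]:
            end += 1
        mean_rank = (pos + end) / 2 + 1
        for k in range(pos, end + 1):
            ranks[order[k]] = mean_rank
        pos = end + 1
    return ranks


def rank_average(predictions, normalize=True):
    """Average the (optionally normalized) ranks of each model's predictions."""
    n = len(predictions[0])
    blended = [0.0] * n
    for model_preds in predictions:
        ranks = average_ranks(model_preds)
        if normalize and n > 1:
            ranks = [(r - 1) / (n - 1) for r in ranks]  # map ranks to [0, 1]
        for j, r in enumerate(ranks):
            blended[j] += r / len(predictions)
    return blended


model_a = [0.10, 0.90, 0.40]  # probabilities
model_b = [-3.0, 5.0, 12.0]   # raw scores on a different scale
result = rank_average([model_a, model_b])
```

Only the ordering within each model matters here, so the second model's much larger scores do not dominate the blend.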

predict(predictions)[source]¶

Alias for blend().

Return type:: ndarray
Parameters:: predictions (list[ndarray])

set_fit_request(*, predictions='$UNCHANGED$', y_true='$UNCHANGED$')¶

Configure whether metadata should be requested to be passed to the fit method.

The options for each parameter are:

True: metadata is requested, and passed to fit if provided. The request is ignored if metadata is not provided.
False: metadata is not requested and the meta-estimator will not pass it to fit.
None: metadata is not requested, and the meta-estimator will raise an error if the user provides it.
str: metadata should be passed to the meta-estimator with this given alias instead of the original name.

The default (sklearn.utils.metadata_routing.UNCHANGED) retains the existing request. This allows you to change the request for some parameters and not others.

Added in version 1.3.

Parameters:

predictions (str, True, False, or None, default=sklearn.utils.metadata_routing.UNCHANGED) – Metadata routing for predictions parameter in fit.
y_true (str, True, False, or None, default=sklearn.utils.metadata_routing.UNCHANGED) – Metadata routing for y_true parameter in fit.
self (RankAverageBlender)

Returns:

self (object) – The updated object.

Return type:

RankAverageBlender

set_predict_request(*, predictions='$UNCHANGED$')¶

Configure whether metadata should be requested to be passed to the predict method.

The options for each parameter are:

True: metadata is requested, and passed to predict if provided. The request is ignored if metadata is not provided.
False: metadata is not requested and the meta-estimator will not pass it to predict.
None: metadata is not requested, and the meta-estimator will raise an error if the user provides it.
str: metadata should be passed to the meta-estimator with this given alias instead of the original name.

The default (sklearn.utils.metadata_routing.UNCHANGED) retains the existing request. This allows you to change the request for some parameters and not others.

Added in version 1.3.

Parameters:

predictions (str, True, False, or None, default=sklearn.utils.metadata_routing.UNCHANGED) – Metadata routing for predictions parameter in predict.
self (RankAverageBlender)

Returns:

self (object) – The updated object.

Return type:

RankAverageBlender

class endgame.ensemble.PowerBlender(scores=None, power=2.0, higher_is_better=True, random_state=None, verbose=False)[source]¶

Bases: BaseEnsemble

Power-weighted blending based on individual scores.

Weights models by their validation scores raised to a power. A higher power shifts more weight toward the best models.

Parameters:

scores (List[float]) – Validation scores for each model.
power (float, default=2.0) – Power to raise scores to (higher = more aggressive weighting).
higher_is_better (bool, default=True) – Whether higher scores are better.
random_state (int | None)
verbose (bool)

Examples

>>> scores = [0.85, 0.87, 0.86]
>>> blender = PowerBlender(scores=scores, power=3.0)
>>> final_pred = blender.predict(test_predictions)

fit(predictions=None, y_true=None, scores=None)[source]¶

Compute power-weighted blending weights.

Parameters:

predictions (ignored)
y_true (ignored)
scores (List[float], optional) – Model scores (overrides constructor scores).

Return type:

PowerBlender

Returns:

self

predict(predictions)[source]¶

Apply power weights.

Parameters:: predictions (List of arrays) – Predictions from each model.
Return type:: ndarray
Returns:: ndarray – Power-weighted prediction.
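
One plausible reading of power weighting is sketched below: each score is raised to `power` and the results are normalized to sum to 1. The helper `power_weights` is hypothetical, and inverting the scores when lower is better is an assumption of this sketch, not a documented behavior.

```python
def power_weights(scores, power=2.0, higher_is_better=True):
    """Normalized weights proportional to score ** power (hypothetical helper)."""
    if not higher_is_better:
        # Assumption: invert so that smaller errors earn larger weights.
        scores = [1.0 / s for s in scores]
    raised = [s ** power for s in scores]
    total = sum(raised)
    return [r / total for r in raised]


scores = [0.85, 0.87, 0.86]
w1 = power_weights(scores, power=1.0)
w10 = power_weights(scores, power=10.0)  # the best model's share grows
```

Comparing `w1` and `w10` shows how the exponent controls how aggressively weight concentrates on the top-scoring model.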

set_fit_request(*, predictions='$UNCHANGED$', scores='$UNCHANGED$', y_true='$UNCHANGED$')¶

Configure whether metadata should be requested to be passed to the fit method.

The options for each parameter are:

True: metadata is requested, and passed to fit if provided. The request is ignored if metadata is not provided.
False: metadata is not requested and the meta-estimator will not pass it to fit.
None: metadata is not requested, and the meta-estimator will raise an error if the user provides it.
str: metadata should be passed to the meta-estimator with this given alias instead of the original name.

The default (sklearn.utils.metadata_routing.UNCHANGED) retains the existing request. This allows you to change the request for some parameters and not others.

Added in version 1.3.

Parameters:

predictions (str, True, False, or None, default=sklearn.utils.metadata_routing.UNCHANGED) – Metadata routing for predictions parameter in fit.
scores (str, True, False, or None, default=sklearn.utils.metadata_routing.UNCHANGED) – Metadata routing for scores parameter in fit.
y_true (str, True, False, or None, default=sklearn.utils.metadata_routing.UNCHANGED) – Metadata routing for y_true parameter in fit.
self (PowerBlender)

Returns:

self (object) – The updated object.

Return type:

PowerBlender

set_predict_request(*, predictions='$UNCHANGED$')¶

Configure whether metadata should be requested to be passed to the predict method.

The options for each parameter are:

True: metadata is requested, and passed to predict if provided. The request is ignored if metadata is not provided.
False: metadata is not requested and the meta-estimator will not pass it to predict.
None: metadata is not requested, and the meta-estimator will raise an error if the user provides it.
str: metadata should be passed to the meta-estimator with this given alias instead of the original name.

The default (sklearn.utils.metadata_routing.UNCHANGED) retains the existing request. This allows you to change the request for some parameters and not others.

Added in version 1.3.

Parameters:

predictions (str, True, False, or None, default=sklearn.utils.metadata_routing.UNCHANGED) – Metadata routing for predictions parameter in predict.
self (PowerBlender)

Returns:

self (object) – The updated object.

Return type:

PowerBlender

class endgame.ensemble.HillClimbingEnsemble(metric='roc_auc', n_iterations=100, early_stopping=20, maximize=True, init_weights='best_single', random_state=None, verbose=False)[source]¶

Bases: BaseEnsemble

Forward ensemble selection with replacement.

Iteratively adds the model that most improves the validation metric. Because it only needs metric evaluations, it is a key technique for non-differentiable metrics (F1, MAP@K).

Algorithm:

1. Start with an empty ensemble.
2. For each iteration:
   a. For each model in the pool, compute the metric if it were added to the current ensemble.
   b. Add the model that provides the best improvement.
   c. Allow repeating models (weighted averaging).

Parameters:

metric (str or callable, default='roc_auc') – Metric to optimize: 'roc_auc', 'log_loss', 'f1', 'accuracy', 'rmse', 'mae', 'r2', or custom callable(y_true, y_pred).
n_iterations (int, default=100) – Number of hill climbing iterations.
early_stopping (int, default=20) – Stop if no improvement for this many iterations.
maximize (bool, default=True) – Whether to maximize or minimize the metric.
init_weights (str, default='best_single') – Initial weight strategy: 'best_single', 'uniform', 'none'.
random_state (int, optional) – Random seed for tie-breaking.
verbose (bool, default=False) – Enable verbose output.

weights_¶

Optimized model weights (by model index).

Type:: Dict[int, float]

best_score_¶

Best ensemble score achieved.

Type:: float

selection_history_¶

Order in which models were selected.

Type:: List[int]

Examples

>>> from endgame.ensemble import HillClimbingEnsemble
>>> ensemble = HillClimbingEnsemble(metric='roc_auc', n_iterations=100)
>>> ensemble.fit(oof_predictions, y_train)
>>> print(f"Weights: {ensemble.weights_}")
>>> test_pred = ensemble.predict(test_predictions)

fit(predictions, y_true)[source]¶

Find optimal ensemble weights via hill climbing.

Parameters:

predictions (List of shape (n_models, n_samples, ...)) – Out-of-fold predictions from each model.
y_true (array-like) – True target values.

Return type:

HillClimbingEnsemble

Returns:

self
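
The forward-selection loop described in the class algorithm can be sketched in a few lines. This is a simplified illustration under stated assumptions: the metric is maximized, the ensemble starts empty, and the loop stops at the first iteration with no strict improvement (a simplification of the documented `early_stopping` patience). The function names are hypothetical.

```python
def accuracy(y_true, y_score):
    """Fraction of samples where thresholding the score at 0.5 matches the label."""
    return sum((s >= 0.5) == bool(t) for t, s in zip(y_true, y_score)) / len(y_true)


def hill_climb(predictions, y_true, metric, n_iterations=100):
    """Greedy selection with replacement; returns (weights, best_score, history)."""
    n_models, n = len(predictions), len(y_true)
    running_sum = [0.0] * n  # sum of predictions of all selected models
    history = []
    best_score = float("-inf")
    for _ in range(n_iterations):
        size = len(history) + 1
        scored = []
        for m in range(n_models):
            # Score the ensemble as if model m were added once more.
            trial = [(running_sum[j] + predictions[m][j]) / size for j in range(n)]
            scored.append((metric(y_true, trial), m))
        score, chosen = max(scored, key=lambda pair: pair[0])
        if score <= best_score:
            break  # no candidate improves the ensemble
        best_score = score
        history.append(chosen)
        running_sum = [running_sum[j] + predictions[chosen][j] for j in range(n)]
    # A model's weight is the fraction of times it was selected.
    weights = {m: history.count(m) / len(history) for m in sorted(set(history))}
    return weights, best_score, history


y = [1, 0, 1, 0, 1, 0]
oof = [
    [0.9, 0.6, 0.4, 0.1, 0.8, 0.3],
    [0.6, 0.2, 0.9, 0.7, 0.3, 0.4],
]
weights, best, history = hill_climb(oof, y, accuracy)
```

In this toy data each model alone reaches 4/6 accuracy, while their equal-weight average classifies every sample correctly, so the loop selects each model once and then stops.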

predict(predictions)[source]¶

Apply learned weights to generate ensemble prediction.

Parameters:: predictions (List of shape (n_models, n_samples, ...)) – Predictions from each model.
Return type:: ndarray
Returns:: ndarray – Weighted ensemble prediction.

get_result()[source]¶

Get ensemble result summary.

Return type:: EnsembleResult
Returns:: EnsembleResult – Result containing weights, score, and selected models.

set_fit_request(*, predictions='$UNCHANGED$', y_true='$UNCHANGED$')¶

Configure whether metadata should be requested to be passed to the fit method.

The options for each parameter are:

True: metadata is requested, and passed to fit if provided. The request is ignored if metadata is not provided.
False: metadata is not requested and the meta-estimator will not pass it to fit.
None: metadata is not requested, and the meta-estimator will raise an error if the user provides it.
str: metadata should be passed to the meta-estimator with this given alias instead of the original name.

The default (sklearn.utils.metadata_routing.UNCHANGED) retains the existing request. This allows you to change the request for some parameters and not others.

Added in version 1.3.

Parameters:

predictions (str, True, False, or None, default=sklearn.utils.metadata_routing.UNCHANGED) – Metadata routing for predictions parameter in fit.
y_true (str, True, False, or None, default=sklearn.utils.metadata_routing.UNCHANGED) – Metadata routing for y_true parameter in fit.
self (HillClimbingEnsemble)

Returns:

self (object) – The updated object.

Return type:

HillClimbingEnsemble

set_predict_request(*, predictions='$UNCHANGED$')¶

Configure whether metadata should be requested to be passed to the predict method.

The options for each parameter are:

True: metadata is requested, and passed to predict if provided. The request is ignored if metadata is not provided.
False: metadata is not requested and the meta-estimator will not pass it to predict.
None: metadata is not requested, and the meta-estimator will raise an error if the user provides it.
str: metadata should be passed to the meta-estimator with this given alias instead of the original name.

The default (sklearn.utils.metadata_routing.UNCHANGED) retains the existing request. This allows you to change the request for some parameters and not others.

Added in version 1.3.

Parameters:

predictions (str, True, False, or None, default=sklearn.utils.metadata_routing.UNCHANGED) – Metadata routing for predictions parameter in predict.
self (HillClimbingEnsemble)

Returns:

self (object) – The updated object.

Return type:

HillClimbingEnsemble

class endgame.ensemble.SuperLearner(base_estimators, meta_learner='nnls', cv=5, use_proba=True, include_original_features=False, random_state=None, verbose=False)[source]¶

Bases: BaseEstimator

Cross-validated Super Learner ensemble.

Parameters:

base_estimators (list of (str, estimator) tuples) – Named base learners to combine.
meta_learner ({'nnls', 'ridge', 'best'} or estimator, default='nnls') – How to combine OOF predictions:
- 'nnls': Non-negative least squares (convex combination).
- 'ridge': Ridge regression on OOF predictions.
- 'best': Use the single best base learner (no blending).
- An sklearn estimator for custom meta-learning.
cv (int or CV splitter, default=5) – Cross-validation strategy for OOF predictions.
use_proba (bool, default=True) – Use predict_proba for classifiers (if available).
include_original_features (bool, default=False) – Pass original features to the meta-learner alongside OOF predictions.
random_state (int or None, default=None)
verbose (bool, default=False)

coef_¶

Meta-learner weights (one per base estimator).

Type:: ndarray

base_estimators_¶

Fitted base estimators (on full training data).

Type:: list of estimator

oof_predictions_¶

Out-of-fold predictions used for meta-learning.

Type:: ndarray

cv_scores_¶

Per-estimator cross-validated score.

Type:: dict of {name: float}

is_classifier_¶

Type:: bool

References

van der Laan, M.J., Polley, E.C. & Hubbard, A.E. (2007). Super Learner. Statistical Applications in Genetics and Molecular Biology, 6(1).

fit(X, y, sample_weight=None)[source]¶

Fit the Super Learner.

1. Generate OOF predictions for each base estimator.
2. Solve for optimal combination weights.
3. Refit all base estimators on the full training set.
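
The first step, generating out-of-fold (OOF) predictions, might look like the sketch below. It assumes contiguous, unshuffled folds and uses a toy `MeanRegressor` as a stand-in base learner; both helpers are hypothetical and exist only to show that each sample is predicted by a model that never saw it during training.

```python
def kfold_indices(n_samples, n_splits):
    """Yield (train_idx, test_idx) pairs over contiguous folds (no shuffling)."""
    sizes = [
        n_samples // n_splits + (1 if i < n_samples % n_splits else 0)
        for i in range(n_splits)
    ]
    start = 0
    for size in sizes:
        test = list(range(start, start + size))
        train = [i for i in range(n_samples) if i < start or i >= start + size]
        yield train, test
        start += size


class MeanRegressor:
    """Toy base learner: always predicts the mean of its training targets."""

    def fit(self, X, y):
        self.mean_ = sum(y) / len(y)
        return self

    def predict(self, X):
        return [self.mean_] * len(X)


def oof_predictions(make_model, X, y, n_splits=5):
    """Predict every sample with a model trained on the other folds."""
    oof = [None] * len(y)
    for train, test in kfold_indices(len(y), n_splits):
        model = make_model().fit([X[i] for i in train], [y[i] for i in train])
        for i, p in zip(test, model.predict([X[i] for i in test])):
            oof[i] = p
    return oof


X = [[float(i)] for i in range(6)]
y = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
oof = oof_predictions(MeanRegressor, X, y, n_splits=3)
```

The resulting OOF column per base estimator is what a meta-learner would then be fitted on.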

predict(X)[source]¶

predict_proba(X)[source]¶

property named_estimators¶

set_fit_request(*, sample_weight='$UNCHANGED$')¶

Configure whether metadata should be requested to be passed to the fit method.

The options for each parameter are:

True: metadata is requested, and passed to fit if provided. The request is ignored if metadata is not provided.
False: metadata is not requested and the meta-estimator will not pass it to fit.
None: metadata is not requested, and the meta-estimator will raise an error if the user provides it.
str: metadata should be passed to the meta-estimator with this given alias instead of the original name.

The default (sklearn.utils.metadata_routing.UNCHANGED) retains the existing request. This allows you to change the request for some parameters and not others.

Added in version 1.3.

Parameters:

sample_weight (str, True, False, or None, default=sklearn.utils.metadata_routing.UNCHANGED) – Metadata routing for sample_weight parameter in fit.
self (SuperLearner)

Returns:

self (object) – The updated object.

Return type:

SuperLearner

class endgame.ensemble.BayesianModelAveraging(criterion='bic', prior='uniform', task='auto')[source]¶

Bases: BaseEstimator

Bayesian Model Averaging using information-criterion weights.

Parameters:

criterion ({'bic', 'aic', 'aic_c'}, default='bic') –
- 'bic': Bayesian Information Criterion (penalizes complexity more).
- 'aic': Akaike Information Criterion.
- 'aic_c': Corrected AIC for small samples.
prior ({'uniform', 'complexity'} or array-like, optional) – Prior over models. Default is uniform.
task ({'auto', 'classification', 'regression'}, default='auto')

weights_¶

Posterior model weights summing to 1.

Type:: ndarray

ic_scores_¶

Information criterion values for each model.

Type:: ndarray

estimators_¶

Fitted estimators (references, not clones).

Type:: list of estimator

fit(estimators, X_val, y_val)[source]¶

Compute posterior weights from validation data.

Parameters:

estimators (list of fitted estimators) – Already-fitted models.
X_val (array-like) – Validation features.
y_val (array-like) – Validation target.

Return type:

BayesianModelAveraging

Returns:

self
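
Information-criterion weights are commonly computed as a prior-weighted exponential of half the IC difference to the best model. The sketch below shows that standard textbook form together with the usual BIC definition; the function names are hypothetical, and whether the library computes its weights in exactly this way is an assumption.

```python
import math


def bic(log_likelihood, n_params, n_samples):
    """Bayesian Information Criterion: k * ln(n) - 2 * ln(L)."""
    return n_params * math.log(n_samples) - 2.0 * log_likelihood


def ic_weights(ic_scores, prior=None):
    """Posterior-style weights: prior_i * exp(-0.5 * (IC_i - IC_min)), normalized."""
    n = len(ic_scores)
    prior = [1.0 / n] * n if prior is None else list(prior)
    best = min(ic_scores)  # subtracting the minimum avoids underflow
    unnorm = [p * math.exp(-0.5 * (ic - best)) for ic, p in zip(ic_scores, prior)]
    total = sum(unnorm)
    return [u / total for u in unnorm]


weights = ic_weights([100.0, 102.0, 110.0])
```

A model whose criterion is 2 units worse than the best receives about exp(-1) of the best model's weight under a uniform prior.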

predict(X)[source]¶

predict_proba(X)[source]¶

set_fit_request(*, X_val='$UNCHANGED$', estimators='$UNCHANGED$', y_val='$UNCHANGED$')¶

Configure whether metadata should be requested to be passed to the fit method.

The options for each parameter are:

True: metadata is requested, and passed to fit if provided. The request is ignored if metadata is not provided.
False: metadata is not requested and the meta-estimator will not pass it to fit.
None: metadata is not requested, and the meta-estimator will raise an error if the user provides it.
str: metadata should be passed to the meta-estimator with this given alias instead of the original name.

The default (sklearn.utils.metadata_routing.UNCHANGED) retains the existing request. This allows you to change the request for some parameters and not others.

Added in version 1.3.

Parameters:

X_val (str, True, False, or None, default=sklearn.utils.metadata_routing.UNCHANGED) – Metadata routing for X_val parameter in fit.
estimators (str, True, False, or None, default=sklearn.utils.metadata_routing.UNCHANGED) – Metadata routing for estimators parameter in fit.
y_val (str, True, False, or None, default=sklearn.utils.metadata_routing.UNCHANGED) – Metadata routing for y_val parameter in fit.
self (BayesianModelAveraging)

Returns:

self (object) – The updated object.

Return type:

BayesianModelAveraging

class endgame.ensemble.NegativeCorrelationEnsemble(base_estimators, lambda_ncl=0.5, n_iterations=10, weights=None, random_state=None, verbose=False)[source]¶

Bases: BaseEstimator, RegressorMixin

Negative Correlation Learning ensemble for regression.

Trains all base learners together, each with a modified loss that includes a diversity term penalizing correlation with the ensemble average. This produces models that are individually weaker but collectively stronger.

Parameters:

base_estimators (list of estimator) – Base regressors to train. Must support sample_weight or partial fit.
lambda_ncl (float, default=0.5) – Strength of the negative correlation penalty.
- 0: independent training (standard ensemble).
- 1: maximum diversity pressure.
n_iterations (int, default=10) – Number of NCL training rounds.
weights (array-like, optional) – Static model weights. Default is uniform.
random_state (int or None, default=None)
verbose (bool, default=False)

estimators_¶

Fitted base regressors.

Type:: list of estimator

weights_¶

Model combination weights.

Type:: ndarray

diversity_¶

Measured ensemble diversity (average pairwise disagreement).

Type:: float

References

Liu, Y. & Yao, X. (1999). Ensemble Learning via Negative Correlation. Neural Networks, 12(10), 1399-1404.

fit(X, y, sample_weight=None)[source]¶

Fit with negative correlation learning.

Each round:

1. Compute the ensemble prediction (average of all learners).
2. For each learner i, compute modified sample weights that up-weight samples where learner i disagrees with the ensemble (promoting diversity).
3. Refit each learner with the modified weights.
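
The reweighting step in each round could be sketched as follows. The exact weighting formula is not given in this reference, so the one below (1 plus `lambda_ncl` times the scaled absolute deviation from the ensemble, rescaled to mean 1) is purely a hypothetical choice that matches the verbal description of up-weighting disagreement.

```python
def ncl_sample_weights(learner_pred, ensemble_pred, lambda_ncl=0.5):
    """Hypothetical weights that grow with a learner's deviation from the ensemble."""
    dev = [abs(p - e) for p, e in zip(learner_pred, ensemble_pred)]
    scale = max(dev) or 1.0  # avoid dividing by zero when the learner agrees everywhere
    raw = [1.0 + lambda_ncl * d / scale for d in dev]
    mean = sum(raw) / len(raw)
    return [r / mean for r in raw]  # rescale so the weights average to 1


preds = [[1.0, 2.0, 3.0], [1.0, 2.5, 5.0]]
# Ensemble prediction: per-sample average across learners.
ensemble = [sum(col) / len(col) for col in zip(*preds)]
w = ncl_sample_weights(preds[0], ensemble, lambda_ncl=0.5)
```

With `lambda_ncl=0` every sample keeps weight 1, which corresponds to the independent-training case listed for that parameter.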

predict(X)[source]¶

property feature_importances_¶

set_fit_request(*, sample_weight='$UNCHANGED$')¶

Configure whether metadata should be requested to be passed to the fit method.

The options for each parameter are:

True: metadata is requested, and passed to fit if provided. The request is ignored if metadata is not provided.
False: metadata is not requested and the meta-estimator will not pass it to fit.
None: metadata is not requested, and the meta-estimator will raise an error if the user provides it.
str: metadata should be passed to the meta-estimator with this given alias instead of the original name.

The default (sklearn.utils.metadata_routing.UNCHANGED) retains the existing request. This allows you to change the request for some parameters and not others.

Added in version 1.3.

Parameters:

sample_weight (str, True, False, or None, default=sklearn.utils.metadata_routing.UNCHANGED) – Metadata routing for sample_weight parameter in fit.
self (NegativeCorrelationEnsemble)

Returns:

self (object) – The updated object.

Return type:

NegativeCorrelationEnsemble

set_score_request(*, sample_weight='$UNCHANGED$')¶

Configure whether metadata should be requested to be passed to the score method.

The options for each parameter are:

True: metadata is requested, and passed to score if provided. The request is ignored if metadata is not provided.
False: metadata is not requested and the meta-estimator will not pass it to score.
None: metadata is not requested, and the meta-estimator will raise an error if the user provides it.
str: metadata should be passed to the meta-estimator with this given alias instead of the original name.

The default (sklearn.utils.metadata_routing.UNCHANGED) retains the existing request. This allows you to change the request for some parameters and not others.

Added in version 1.3.

Parameters:

sample_weight (str, True, False, or None, default=sklearn.utils.metadata_routing.UNCHANGED) – Metadata routing for sample_weight parameter in score.
self (NegativeCorrelationEnsemble)

Returns:

self (object) – The updated object.

Return type:

NegativeCorrelationEnsemble

class endgame.ensemble.SnapshotEnsemble(base_estimator, n_snapshots=5, epochs_per_cycle=40, initial_lr=0.1, min_lr=1e-05, verbose=False)[source]¶

Bases: BaseEstimator

Snapshot Ensemble via cosine annealing warm restarts.

Trains a single neural-network-like estimator with a cyclic learning rate schedule. At the end of each cycle (when LR reaches its minimum), takes a “snapshot” of the model. The final ensemble averages predictions across all snapshots.

Parameters:

base_estimator (estimator) – A model supporting partial_fit (e.g., MLPClassifier, SGDClassifier, SGDRegressor). Must accept learning_rate_init or eta0.
n_snapshots (int, default=5) – Number of snapshots (cycles) to collect.
epochs_per_cycle (int, default=40) – Training epochs per cosine annealing cycle.
initial_lr (float, default=0.1) – Peak learning rate at the start of each cycle.
min_lr (float, default=1e-5) – Minimum learning rate at end of each cycle (snapshot point).
verbose (bool, default=False)

snapshots_¶

Saved model snapshots.

Type:: list of estimator

lr_history_¶

Learning rate at each epoch.

Type:: list of float

is_classifier_¶

Type:: bool

References

Huang, G., Li, Y., Pleiss, G., Liu, Z., Hopcroft, J.E., & Weinberger, K.Q. (2017). Snapshot Ensembles: Train 1, Get M for Free. ICLR.

fit(X, y, **fit_params)[source]¶

Train with cyclic LR and collect snapshots.

Parameters:

X (array-like)
y (array-like)

predict(X)[source]¶

predict_proba(X)[source]¶

class endgame.ensemble.CascadeEnsemble(stages, confidence_threshold=0.95, cv=3, use_proba=True, passthrough=True, max_stages=None, random_state=None, verbose=False)[source]¶

Bases: BaseEstimator, ClassifierMixin

Multi-stage cascade classifier with early-exit.

Parameters:

stages (list of list of estimator) – Each stage is a list of base classifiers. Predictions from stage k are concatenated as features for stage k+1.
confidence_threshold (float, default=0.95) – If max predicted probability exceeds this, the sample exits the cascade early (only at prediction time).
cv (int, default=3) – CV folds for generating OOF features during training.
use_proba (bool, default=True) – Use predicted probabilities as cascade features (vs. labels).
passthrough (bool, default=True) – Include original features at every stage.
max_stages (int or None, default=None) – Maximum number of stages. If None, use all provided stages.
random_state (int or None, default=None)
verbose (bool, default=False)

stages_¶

Fitted estimators per stage.

Type:: list of list of estimator

classes_¶

Type:: ndarray

n_stages_¶

Number of fitted stages.

Type:: int

stage_scores_¶

Per-stage validation accuracy.

Type:: list of float

fit(X, y, sample_weight=None)[source]¶

Fit the cascade stage by stage.

At each stage, generate OOF predictions, concatenate them as features for the next stage, then refit on all data.

predict(X)[source]¶

predict_proba(X)[source]¶

Predict with early exit based on confidence.

Samples whose max probability exceeds confidence_threshold at any stage are assigned their prediction from that stage. Remaining samples proceed to the next stage.
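
The early-exit rule can be illustrated for a single sample as below. Stages are modeled as hypothetical callables that map a feature list to class probabilities, and the original features are passed through alongside the previous stage's output; this is a simplified sketch, not the library's batch implementation.

```python
def cascade_predict_proba(stages, x, confidence_threshold=0.95):
    """Return (probabilities, index of the stage that produced them) for one sample."""
    features = list(x)
    proba = None
    for stage_index, stage in enumerate(stages):
        proba = stage(features)
        if max(proba) > confidence_threshold:
            return proba, stage_index          # confident enough: exit early
        features = list(x) + list(proba)       # passthrough + stage output
    return proba, len(stages) - 1              # fell through to the last stage


def stage_one(features):
    """Toy stage: very confident only when the first feature is large."""
    p = 0.99 if features[0] > 2.0 else 0.6
    return [1.0 - p, p]


def stage_two(features):
    """Toy stage: reads stage one's positive-class probability (index 2)."""
    p = 0.9 if features[2] > 0.5 else 0.1
    return [1.0 - p, p]


easy = cascade_predict_proba([stage_one, stage_two], [3.0])
hard = cascade_predict_proba([stage_one, stage_two], [1.0])
```

The easy sample exits at the first stage, while the hard one continues and receives its prediction from the second stage.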

set_fit_request(*, sample_weight='$UNCHANGED$')¶

Configure whether metadata should be requested to be passed to the fit method.

The options for each parameter are:

True: metadata is requested, and passed to fit if provided. The request is ignored if metadata is not provided.
False: metadata is not requested and the meta-estimator will not pass it to fit.
None: metadata is not requested, and the meta-estimator will raise an error if the user provides it.
str: metadata should be passed to the meta-estimator with this given alias instead of the original name.

The default (sklearn.utils.metadata_routing.UNCHANGED) retains the existing request. This allows you to change the request for some parameters and not others.

Added in version 1.3.

Parameters:

sample_weight (str, True, False, or None, default=sklearn.utils.metadata_routing.UNCHANGED) – Metadata routing for sample_weight parameter in fit.
self (CascadeEnsemble)

Returns:

self (object) – The updated object.

Return type:

CascadeEnsemble

set_score_request(*, sample_weight='$UNCHANGED$')¶

Configure whether metadata should be requested to be passed to the score method.

The options for each parameter are:

True: metadata is requested, and passed to score if provided. The request is ignored if metadata is not provided.
False: metadata is not requested and the meta-estimator will not pass it to score.
None: metadata is not requested, and the meta-estimator will raise an error if the user provides it.
str: metadata should be passed to the meta-estimator with this given alias instead of the original name.

The default (sklearn.utils.metadata_routing.UNCHANGED) retains the existing request. This allows you to change the request for some parameters and not others.

Added in version 1.3.

Parameters:

sample_weight (str, True, False, or None, default=sklearn.utils.metadata_routing.UNCHANGED) – Metadata routing for sample_weight parameter in score.
self (CascadeEnsemble)

Returns:

self (object) – The updated object.

Return type:

CascadeEnsemble

class endgame.ensemble.ThresholdOptimizer(metric='f1', search_method='grid', n_thresholds=100, threshold_range=(0.1, 0.9), multiclass=False, random_state=None, verbose=False)[source]¶

Bases: EndgameEstimator

Optimizes classification thresholds for target metrics.

The standard 0.5 threshold is often suboptimal. This optimizer searches for a single threshold, or per-class thresholds when multiclass=True, that maximizes the target metric.

Parameters:

metric (str or callable, default='f1') – Metric to optimize: 'f1', 'f1_macro', 'f1_weighted', 'accuracy', 'balanced_accuracy', or custom callable.
search_method (str, default='grid') – Search method: 'grid', 'optuna', 'hill_climb'.
n_thresholds (int, default=100) – Number of thresholds to search (for grid search).
threshold_range (Tuple[float, float], default=(0.1, 0.9)) – Range of thresholds to search.
multiclass (bool, default=False) – Whether to optimize per-class thresholds.
random_state (int, optional) – Random seed.
verbose (bool, default=False) – Enable verbose output.

threshold_¶

Optimized threshold(s).

Type:: float or Dict[int, float]

best_score_¶

Best score achieved.

Type:: float

Examples

>>> optimizer = ThresholdOptimizer(metric='f1')
>>> optimizer.fit(y_true, y_proba)
>>> print(f"Optimal threshold: {optimizer.threshold_}")
>>> y_pred = optimizer.predict(y_proba)

fit(y_true, y_proba)[source]¶

Find optimal threshold(s).

Parameters:

y_true (array-like) – True labels.
y_proba (array-like) – Predicted probabilities. Shape (n_samples,) for binary, (n_samples, n_classes) for multiclass.

Return type:

ThresholdOptimizer

Returns:

self

predict(y_proba)[source]¶

Apply optimized threshold(s) to predictions.

Parameters:: y_proba (array-like) – Predicted probabilities.
Return type:: ndarray
Returns:: ndarray – Predicted labels.
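
For the binary case, a grid search over thresholds reduces to the loop sketched below. The helper names are hypothetical, the defaults mirror the documented `n_thresholds` and `threshold_range`, and F1 is computed by hand so the example stays self-contained.

```python
def f1_score(y_true, y_pred):
    """Binary F1 from true-positive, false-positive and false-negative counts."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


def grid_search_threshold(y_true, y_proba, n_thresholds=100, threshold_range=(0.1, 0.9)):
    """Return the (threshold, score) pair with the highest F1 on an even grid."""
    lo, hi = threshold_range
    step = (hi - lo) / max(n_thresholds - 1, 1)
    best_threshold, best_score = lo, -1.0
    for k in range(n_thresholds):
        threshold = lo + k * step
        y_pred = [1 if p >= threshold else 0 for p in y_proba]
        score = f1_score(y_true, y_pred)
        if score > best_score:  # keep the first threshold reaching the best score
            best_threshold, best_score = threshold, score
    return best_threshold, best_score


y_true = [0, 0, 1, 1, 1, 0]
y_proba = [0.05, 0.20, 0.30, 0.45, 0.80, 0.25]
threshold, score = grid_search_threshold(y_true, y_proba)
```

In this toy data the default 0.5 cut-off recovers only one of the three positives, whereas a threshold just above 0.25 separates the classes perfectly.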

transform(y_proba)[source]¶

Alias for predict().

Return type:: ndarray
Parameters:: y_proba (ndarray)

set_fit_request(*, y_proba='$UNCHANGED$', y_true='$UNCHANGED$')¶

Configure whether metadata should be requested to be passed to the fit method.

The options for each parameter are:

True: metadata is requested, and passed to fit if provided. The request is ignored if metadata is not provided.
False: metadata is not requested and the meta-estimator will not pass it to fit.
None: metadata is not requested, and the meta-estimator will raise an error if the user provides it.
str: metadata should be passed to the meta-estimator with this given alias instead of the original name.

The default (sklearn.utils.metadata_routing.UNCHANGED) retains the existing request. This allows you to change the request for some parameters and not others.

Added in version 1.3.

Parameters:

y_proba (str, True, False, or None, default=sklearn.utils.metadata_routing.UNCHANGED) – Metadata routing for y_proba parameter in fit.
y_true (str, True, False, or None, default=sklearn.utils.metadata_routing.UNCHANGED) – Metadata routing for y_true parameter in fit.
self (ThresholdOptimizer)

Returns:

self (object) – The updated object.

Return type:

ThresholdOptimizer

set_predict_request(*, y_proba='$UNCHANGED$')¶

Configure whether metadata should be requested to be passed to the predict method.

The options for each parameter are:

True: metadata is requested, and passed to predict if provided. The request is ignored if metadata is not provided.
False: metadata is not requested and the meta-estimator will not pass it to predict.
None: metadata is not requested, and the meta-estimator will raise an error if the user provides it.
str: metadata should be passed to the meta-estimator with this given alias instead of the original name.

The default (sklearn.utils.metadata_routing.UNCHANGED) retains the existing request. This allows you to change the request for some parameters and not others.

Added in version 1.3.

Parameters:

y_proba (str, True, False, or None, default=sklearn.utils.metadata_routing.UNCHANGED) – Metadata routing for y_proba parameter in predict.
self (ThresholdOptimizer)

Returns:

self (object) – The updated object.

Return type:

ThresholdOptimizer

set_transform_request(*, y_proba='$UNCHANGED$')¶

Configure whether metadata should be requested to be passed to the transform method.

The options for each parameter are:

True: metadata is requested, and passed to transform if provided. The request is ignored if metadata is not provided.
False: metadata is not requested and the meta-estimator will not pass it to transform.
None: metadata is not requested, and the meta-estimator will raise an error if the user provides it.
str: metadata should be passed to the meta-estimator with this given alias instead of the original name.

The default (sklearn.utils.metadata_routing.UNCHANGED) retains the existing request. This allows you to change the request for some parameters and not others.

Added in version 1.3.

Parameters:

y_proba (str, True, False, or None, default=sklearn.utils.metadata_routing.UNCHANGED) – Metadata routing for y_proba parameter in transform.

Returns:

self (object) – The updated object.

Return type:

ThresholdOptimizer

class endgame.ensemble.KnowledgeDistiller(teacher, student, temperature=3.0, alpha=0.7, augment=False, augment_ratio=1.0, augment_swap_prob=0.1, random_state=None)[source]¶

Bases: BaseEstimator

Knowledge distillation from teacher to student model.

Trains a simpler student model to mimic the predictions of a complex teacher model (or ensemble), enabling deployment of lightweight models with minimal accuracy loss.

Parameters:

teacher (estimator) – Fitted teacher model. Must have predict_proba (classification) or predict (regression).
student (estimator) – Unfitted student model to train.
temperature (float, default=3.0) – Softmax temperature for soft label generation (classification only). Higher values produce softer probability distributions that reveal more about the teacher’s learned relationships.
alpha (float, default=0.7) – Weight for soft labels vs hard labels. Loss = alpha * soft_loss + (1 - alpha) * hard_loss. Set to 1.0 for pure distillation.
augment (bool, default=False) – Whether to use MUNGE data augmentation to generate additional training data labeled by the teacher.
augment_ratio (float, default=1.0) – Ratio of augmented samples to original samples.
augment_swap_prob (float, default=0.1) – Feature swap probability for MUNGE augmentation.
random_state (int or None, default=None) – Random state.
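
As a rough illustration of how temperature and alpha interact, the sketch below softens a teacher's probabilities and blends them with one-hot hard labels. The helper names soften and blend_targets are hypothetical, and treating log-probabilities as logits is an assumption; the library's internal scheme may differ.

```python
import numpy as np

def soften(proba, temperature):
    """Re-scale class probabilities with a softmax temperature."""
    # Assumption: log-probabilities are used as logits and divided by T.
    logits = np.log(np.clip(proba, 1e-12, 1.0)) / temperature
    logits -= logits.max(axis=1, keepdims=True)  # numerical stability
    exp = np.exp(logits)
    return exp / exp.sum(axis=1, keepdims=True)

def blend_targets(soft, y, n_classes, alpha):
    """Mix soft teacher labels with one-hot hard labels."""
    hard = np.eye(n_classes)[y]
    return alpha * soft + (1.0 - alpha) * hard

teacher_proba = np.array([[0.90, 0.09, 0.01],
                          [0.20, 0.70, 0.10]])
y = np.array([0, 1])

soft = soften(teacher_proba, temperature=3.0)   # flatter than the input
targets = blend_targets(soft, y, n_classes=3, alpha=0.7)
```

For a cross-entropy loss, blending the targets this way is equivalent to alpha * soft_loss + (1 - alpha) * hard_loss, because cross-entropy is linear in the target distribution. With temperature=1.0 the helper returns the teacher's probabilities unchanged.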

student_¶

The trained student model.

Type:: estimator

teacher_score_¶

Teacher’s accuracy (classification) or R^2 (regression) on the training data, for reference.

Type:: float or None

student_score_¶

Student’s accuracy (classification) or R^2 (regression) on the training data.

Type:: float or None

is_classifier_¶

Whether this is a classification task.

Type:: bool

Examples

>>> from sklearn.ensemble import RandomForestClassifier
>>> from sklearn.linear_model import LogisticRegression
>>>
>>> teacher = RandomForestClassifier(n_estimators=500).fit(X, y)
>>> distiller = KnowledgeDistiller(
...     teacher=teacher,
...     student=LogisticRegression(),
...     temperature=4.0,
...     alpha=0.8,
...     augment=True
... )
>>> distiller.fit(X, y)
>>> y_pred = distiller.predict(X_test)
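
The augment, augment_ratio and augment_swap_prob options refer to MUNGE-style augmentation. The following is a simplified sketch of that idea, not the library's implementation: each synthetic row starts as a copy of a real row, and each feature is replaced by the nearest neighbour's value with probability swap_prob. The function name munge_like and the plain value swap (without the Gaussian perturbation that the original MUNGE algorithm applies to continuous features) are assumptions.

```python
import numpy as np

def munge_like(X, swap_prob=0.1, ratio=1.0, seed=0):
    """Generate synthetic rows by swapping features with nearest neighbours."""
    rng = np.random.default_rng(seed)
    n, d = X.shape
    n_new = int(round(ratio * n))

    # Pairwise squared Euclidean distances; a row is never its own neighbour.
    d2 = ((X[:, None, :] - X[None, :, :]) ** 2).sum(axis=2)
    np.fill_diagonal(d2, np.inf)
    nearest = d2.argmin(axis=1)

    rows = rng.integers(0, n, size=n_new)        # rows to copy
    X_new = X[rows].copy()
    swap = rng.random((n_new, d)) < swap_prob    # which features to swap
    X_new[swap] = X[nearest[rows]][swap]
    return X_new

rng = np.random.default_rng(0)
X = rng.normal(size=(8, 3))
X_aug = munge_like(X, swap_prob=0.1, ratio=1.0)
# A fitted teacher would then label X_aug, e.g. teacher.predict_proba(X_aug).
```

The pairwise distance matrix makes this sketch quadratic in the number of samples, so it is only suitable for small data sets.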

fit(X, y, **fit_params)[source]¶

Train the student model using knowledge distillation.

Parameters:

X (array-like of shape (n_samples, n_features)) – Training features.
y (array-like of shape (n_samples,)) – True labels (hard targets).

Return type:

KnowledgeDistiller

Returns:

self

predict(X)[source]¶

Predict using the trained student model.

Return type:: ndarray

predict_proba(X)[source]¶

Predict probabilities using the trained student model.

Return type:: ndarray

property feature_importances_¶

Feature importances from the student model.

compression_report()[source]¶

Generate a report comparing teacher and student performance.

Return type:: dict
Returns:: dict with keys teacher_score, student_score, score_retention, teacher_type, student_type

class endgame.ensemble.MultiOutputClassifier(estimator=None, n_jobs=None, random_state=None, verbose=False)[source]¶

Bases: BaseEstimator, ClassifierMixin

Wraps a single-output classifier for multi-output classification.

Fits one independent clone of the base classifier per output column. Supports parallel fitting via joblib.
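
The one-clone-per-column strategy can be sketched as follows. This is a minimal sequential illustration, not the library's code: MajorityClassifier is a hypothetical stand-in base classifier, copy.deepcopy stands in for estimator cloning, and the joblib parallelism is omitted.

```python
import copy
import numpy as np

class MajorityClassifier:
    """Hypothetical base classifier: always predicts the most common label."""
    def fit(self, X, y):
        labels, counts = np.unique(y, return_counts=True)
        self.classes_ = labels
        self.label_ = labels[counts.argmax()]
        return self

    def predict(self, X):
        return np.full(len(X), self.label_)

def fit_per_column(base, X, Y):
    # One independent copy of the base estimator per output column.
    return [copy.deepcopy(base).fit(X, Y[:, k]) for k in range(Y.shape[1])]

def predict_per_column(estimators, X):
    # Stack the per-output predictions into shape (n_samples, n_outputs).
    return np.column_stack([est.predict(X) for est in estimators])

X = np.zeros((6, 2))
Y = np.array([[0, 1], [0, 2], [0, 2], [1, 2], [0, 2], [0, 1]])

estimators = fit_per_column(MajorityClassifier(), X, Y)
preds = predict_per_column(estimators, X)
```

Because each copy sees only its own target column, dependencies between outputs are not modelled.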

Parameters:

estimator (estimator) – The base classifier to clone for each output. Must implement fit and predict.
n_jobs (int, optional) – Number of jobs for parallel fitting. None means 1 (sequential). -1 means using all processors.
random_state (int, optional) – Random seed. Passed to each cloned estimator if it accepts random_state.
verbose (bool, default=False) – Enable verbose output during fitting.

estimators_¶

Fitted classifiers, one per output.

Type:: List[estimator]

classes_¶

Class labels for each output.

Type:: List[ndarray]

n_outputs_¶

Number of output columns.

Type:: int

Examples

>>> from endgame.ensemble.multi_output import MultiOutputClassifier
>>> from sklearn.tree import DecisionTreeClassifier
>>> import numpy as np
>>> X = np.random.randn(100, 5)
>>> Y = np.random.randint(0, 3, size=(100, 3))
>>> clf = MultiOutputClassifier(DecisionTreeClassifier(), n_jobs=-1)
>>> clf.fit(X, Y)
>>> preds = clf.predict(X)
>>> preds.shape
(100, 3)

fit(X, Y, sample_weight=None)[source]¶

Fit one classifier per output column.

Parameters:

X (array-like of shape (n_samples, n_features)) – Training features.
Y (array-like of shape (n_samples, n_outputs)) – Multi-output target matrix.
sample_weight (array-like of shape (n_samples,), optional) – Sample weights passed to each estimator.

Returns:

self

predict(X)[source]¶

Predict class labels for each output.

Parameters:: X (array-like of shape (n_samples, n_features)) – Samples to predict.
Returns:: ndarray of shape (n_samples, n_outputs) – Predicted class labels.

predict_proba(X)[source]¶

Predict class probabilities for each output.

Parameters:: X (array-like of shape (n_samples, n_features)) – Samples to predict.
Returns:: list of ndarray – List of length n_outputs_, where each element is an array of shape (n_samples, n_classes_k) containing class probabilities for output k.

score(X, Y, sample_weight=None)[source]¶

Return the mean accuracy across all outputs.

Parameters:

X (array-like of shape (n_samples, n_features)) – Test samples.
Y (array-like of shape (n_samples, n_outputs)) – True labels.
sample_weight (array-like, optional) – Sample weights.

Returns:

float – Mean of per-output accuracy scores.
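
A small sketch of the metric described above, ignoring sample weights. The function name mean_output_accuracy is hypothetical. The example also computes exact-match accuracy (all outputs of a row correct) to show that the two quantities differ.

```python
import numpy as np

def mean_output_accuracy(Y_true, Y_pred):
    """Average the accuracy of each output column."""
    Y_true, Y_pred = np.asarray(Y_true), np.asarray(Y_pred)
    per_output = (Y_true == Y_pred).mean(axis=0)
    return per_output.mean(), per_output

Y_true = np.array([[0, 1], [1, 1], [0, 0], [1, 0]])
Y_pred = np.array([[0, 1], [1, 1], [0, 0], [0, 0]])

score, per_output = mean_output_accuracy(Y_true, Y_pred)
exact_match = (Y_true == Y_pred).all(axis=1).mean()
# per_output is [0.75, 1.0], score is 0.875, exact_match is 0.75
```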

set_fit_request(*, sample_weight='$UNCHANGED$')¶

Configure whether metadata should be requested to be passed to the fit method.

The options for each parameter are:

True: metadata is requested, and passed to fit if provided. The request is ignored if metadata is not provided.
False: metadata is not requested and the meta-estimator will not pass it to fit.
None: metadata is not requested, and the meta-estimator will raise an error if the user provides it.
str: metadata should be passed to the meta-estimator with this given alias instead of the original name.

The default (sklearn.utils.metadata_routing.UNCHANGED) retains the existing request. This allows you to change the request for some parameters and not others.

Added in version 1.3.

Parameters:

sample_weight (str, True, False, or None, default=sklearn.utils.metadata_routing.UNCHANGED) – Metadata routing for sample_weight parameter in fit.

Returns:

self (object) – The updated object.

Return type:

MultiOutputClassifier

set_score_request(*, sample_weight='$UNCHANGED$')¶

Configure whether metadata should be requested to be passed to the score method.

The options for each parameter are:

True: metadata is requested, and passed to score if provided. The request is ignored if metadata is not provided.
False: metadata is not requested and the meta-estimator will not pass it to score.
None: metadata is not requested, and the meta-estimator will raise an error if the user provides it.
str: metadata should be passed to the meta-estimator with this given alias instead of the original name.

The default (sklearn.utils.metadata_routing.UNCHANGED) retains the existing request. This allows you to change the request for some parameters and not others.

Added in version 1.3.

Parameters:

sample_weight (str, True, False, or None, default=sklearn.utils.metadata_routing.UNCHANGED) – Metadata routing for sample_weight parameter in score.

Returns:

self (object) – The updated object.

Return type:

MultiOutputClassifier

class endgame.ensemble.MultiOutputRegressor(estimator=None, n_jobs=None, random_state=None, verbose=False)[source]¶

Bases: BaseEstimator, RegressorMixin

Wraps a single-output regressor for multi-output regression.

Fits one independent clone of the base regressor per output column. Supports parallel fitting via joblib.

Parameters:

estimator (estimator) – The base regressor to clone for each output. Must implement fit and predict.
n_jobs (int, optional) – Number of jobs for parallel fitting. None means 1 (sequential). -1 means using all processors.
random_state (int, optional) – Random seed. Passed to each cloned estimator if it accepts random_state.
verbose (bool, default=False) – Enable verbose output during fitting.

estimators_¶

Fitted regressors, one per output.

Type:: List[estimator]

n_outputs_¶

Number of output columns.

Type:: int

Examples

>>> from endgame.ensemble.multi_output import MultiOutputRegressor
>>> from sklearn.linear_model import Ridge
>>> import numpy as np
>>> X = np.random.randn(100, 5)
>>> Y = np.random.randn(100, 3)
>>> reg = MultiOutputRegressor(Ridge(), n_jobs=-1)
>>> reg.fit(X, Y)
>>> preds = reg.predict(X)
>>> preds.shape
(100, 3)

fit(X, Y, sample_weight=None)[source]¶

Fit one regressor per output column.

Parameters:

X (array-like of shape (n_samples, n_features)) – Training features.
Y (array-like of shape (n_samples, n_outputs)) – Multi-output target matrix.
sample_weight (array-like of shape (n_samples,), optional) – Sample weights passed to each estimator.

Returns:

self

predict(X)[source]¶

Predict target values for each output.

Parameters:: X (array-like of shape (n_samples, n_features)) – Samples to predict.
Returns:: ndarray of shape (n_samples, n_outputs) – Predicted values.

property feature_importances_¶

Average feature importances across all output estimators.

Returns:: ndarray of shape (n_features,) – Mean of feature_importances_ across fitted estimators.
Raises:: AttributeError – If the base estimators do not expose feature_importances_.

score(X, Y, sample_weight=None)[source]¶

Return the mean R^2 score across all outputs.

Parameters:

X (array-like of shape (n_samples, n_features)) – Test samples.
Y (array-like of shape (n_samples, n_outputs)) – True target values.
sample_weight (array-like, optional) – Sample weights.

Returns:

float – Mean of per-output R^2 scores.
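
For reference, an unweighted version of this score can be written in a few lines. The function name mean_output_r2 is hypothetical and sample weights are ignored in this sketch.

```python
import numpy as np

def mean_output_r2(Y_true, Y_pred):
    """Compute R^2 for each output column, then average."""
    Y_true = np.asarray(Y_true, dtype=float)
    Y_pred = np.asarray(Y_pred, dtype=float)
    ss_res = ((Y_true - Y_pred) ** 2).sum(axis=0)
    ss_tot = ((Y_true - Y_true.mean(axis=0)) ** 2).sum(axis=0)
    per_output = 1.0 - ss_res / ss_tot
    return per_output.mean(), per_output

Y_true = np.array([[1.0, 10.0], [2.0, 20.0], [3.0, 30.0]])
# First output predicted perfectly, second output predicted by its mean.
Y_pred = np.array([[1.0, 20.0], [2.0, 20.0], [3.0, 20.0]])

score, per_output = mean_output_r2(Y_true, Y_pred)
# per_output is [1.0, 0.0], score is 0.5
```

A constant target column makes ss_tot zero, so a real implementation needs to handle that case explicitly.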

set_fit_request(*, sample_weight='$UNCHANGED$')¶

Configure whether metadata should be requested to be passed to the fit method.

The options for each parameter are:

True: metadata is requested, and passed to fit if provided. The request is ignored if metadata is not provided.
False: metadata is not requested and the meta-estimator will not pass it to fit.
None: metadata is not requested, and the meta-estimator will raise an error if the user provides it.
str: metadata should be passed to the meta-estimator with this given alias instead of the original name.

The default (sklearn.utils.metadata_routing.UNCHANGED) retains the existing request. This allows you to change the request for some parameters and not others.

Added in version 1.3.

Parameters:

sample_weight (str, True, False, or None, default=sklearn.utils.metadata_routing.UNCHANGED) – Metadata routing for sample_weight parameter in fit.

Returns:

self (object) – The updated object.

Return type:

MultiOutputRegressor

set_score_request(*, sample_weight='$UNCHANGED$')¶

Configure whether metadata should be requested to be passed to the score method.

The options for each parameter are:

True: metadata is requested, and passed to score if provided. The request is ignored if metadata is not provided.
False: metadata is not requested and the meta-estimator will not pass it to score.
None: metadata is not requested, and the meta-estimator will raise an error if the user provides it.
str: metadata should be passed to the meta-estimator with this given alias instead of the original name.

The default (sklearn.utils.metadata_routing.UNCHANGED) retains the existing request. This allows you to change the request for some parameters and not others.

Added in version 1.3.

Parameters:

sample_weight (str, True, False, or None, default=sklearn.utils.metadata_routing.UNCHANGED) – Metadata routing for sample_weight parameter in score.

Returns:

self (object) – The updated object.

Return type:

MultiOutputRegressor

class endgame.ensemble.ClassifierChain(estimator=None, order='auto', n_jobs=None, random_state=None, verbose=False)[source]¶

Bases: BaseEstimator, ClassifierMixin

Chain classifiers where each uses predictions of previous outputs as features.

Each classifier in the chain receives the original feature matrix X augmented with the predictions from all preceding classifiers. This allows the chain to model dependencies between outputs.

Parameters:

estimator (estimator) – The base classifier to clone for each link in the chain.
order (str or list of int, default='auto') – Chain ordering strategy:
- 'auto': greedy ordering by pairwise correlation so that adjacent outputs in the chain are maximally correlated.
- 'random': random permutation (seeded by random_state).
- list of int: explicit column ordering.
n_jobs (int, optional) – Not used directly (chain is inherently sequential), but stored for API consistency.
random_state (int, optional) – Random seed for random ordering and estimator cloning.
verbose (bool, default=False) – Enable verbose output.
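
One way a greedy correlation-based ordering such as order='auto' could work is sketched below. The exact rule used by the library is not documented here, so the details are assumptions: the chain starts at the column with the largest total absolute correlation, and each step appends the unused column most correlated with the previous one. The function name greedy_order is hypothetical.

```python
import numpy as np

def greedy_order(Y):
    """Order output columns so that neighbours in the chain are highly correlated."""
    corr = np.abs(np.corrcoef(Y, rowvar=False))
    np.fill_diagonal(corr, 0.0)
    n_outputs = corr.shape[0]

    # Assumed starting rule: the column most correlated with all others.
    order = [int(corr.sum(axis=0).argmax())]
    while len(order) < n_outputs:
        last = order[-1]
        remaining = [j for j in range(n_outputs) if j not in order]
        order.append(max(remaining, key=lambda j: corr[last, j]))
    return order

rng = np.random.default_rng(0)
a = rng.normal(size=200)
b = rng.normal(size=200)                 # unrelated to a
c = a + 0.1 * rng.normal(size=200)       # strongly correlated with a
Y = np.column_stack([a, b, c])

order = greedy_order(Y)
# Columns 0 and 2 end up adjacent; the unrelated column 1 comes last.
```

Constant columns produce NaN correlations in np.corrcoef, so they would need special handling in practice.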

estimators_¶

Fitted classifiers in chain order.

Type:: List[estimator]

order_¶

The resolved output ordering.

Type:: list of int

classes_¶

Class labels for each output (in original column order).

Type:: List[ndarray]

n_outputs_¶

Number of output columns.

Type:: int

Examples

>>> from endgame.ensemble.multi_output import ClassifierChain
>>> from sklearn.linear_model import LogisticRegression
>>> import numpy as np
>>> X = np.random.randn(200, 5)
>>> Y = np.random.randint(0, 2, size=(200, 3))
>>> chain = ClassifierChain(LogisticRegression(), order='auto')
>>> chain.fit(X, Y)
>>> preds = chain.predict(X)
>>> preds.shape
(200, 3)

fit(X, Y, sample_weight=None)[source]¶

Fit the classifier chain.

Parameters:

X (array-like of shape (n_samples, n_features)) – Training features.
Y (array-like of shape (n_samples, n_outputs)) – Multi-output target matrix.
sample_weight (array-like of shape (n_samples,), optional) – Sample weights.

Returns:

self

predict(X)[source]¶

Predict class labels for each output.

At prediction time, the chain uses its own predictions (rather than ground truth) for augmentation.

Parameters:: X (array-like of shape (n_samples, n_features)) – Samples to predict.
Returns:: ndarray of shape (n_samples, n_outputs) – Predicted class labels in original column order.
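
Returning predictions in original column order means undoing the chain permutation. A minimal sketch, assuming order_ maps each chain position to its original column index (the variable names are illustrative):

```python
import numpy as np

order = [2, 0, 1]  # chain position -> original column index (assumed meaning)

# Predictions as produced link by link, i.e. columns in chain order.
chain_preds = np.array([[20, 0, 10],
                        [21, 1, 11]])

# Scatter each chain column back to its original position.
preds = np.empty_like(chain_preds)
preds[:, order] = chain_preds

# Equivalent formulation using the inverse permutation.
same = chain_preds[:, np.argsort(order)]
```

Here the first chain link predicted original column 2, so its values (20, 21) land in the last column of preds.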

predict_proba(X)[source]¶

Predict class probabilities for each output.

Parameters:: X (array-like of shape (n_samples, n_features)) – Samples to predict.
Returns:: list of ndarray – List of length n_outputs_ (in original column order), where each element is an array of shape (n_samples, n_classes_k).

score(X, Y, sample_weight=None)[source]¶

Return the mean accuracy across all outputs.

Parameters:

X (array-like of shape (n_samples, n_features)) – Test samples.
Y (array-like of shape (n_samples, n_outputs)) – True labels.
sample_weight (array-like, optional) – Sample weights.

Returns:

float – Mean of per-output accuracy scores.

set_fit_request(*, sample_weight='$UNCHANGED$')¶

Configure whether metadata should be requested to be passed to the fit method.

The options for each parameter are:

True: metadata is requested, and passed to fit if provided. The request is ignored if metadata is not provided.
False: metadata is not requested and the meta-estimator will not pass it to fit.
None: metadata is not requested, and the meta-estimator will raise an error if the user provides it.
str: metadata should be passed to the meta-estimator with this given alias instead of the original name.

The default (sklearn.utils.metadata_routing.UNCHANGED) retains the existing request. This allows you to change the request for some parameters and not others.

Added in version 1.3.

Parameters:

sample_weight (str, True, False, or None, default=sklearn.utils.metadata_routing.UNCHANGED) – Metadata routing for sample_weight parameter in fit.

Returns:

self (object) – The updated object.

Return type:

ClassifierChain

set_score_request(*, sample_weight='$UNCHANGED$')¶

Configure whether metadata should be requested to be passed to the score method.

The options for each parameter are:

True: metadata is requested, and passed to score if provided. The request is ignored if metadata is not provided.
False: metadata is not requested and the meta-estimator will not pass it to score.
None: metadata is not requested, and the meta-estimator will raise an error if the user provides it.
str: metadata should be passed to the meta-estimator with this given alias instead of the original name.

The default (sklearn.utils.metadata_routing.UNCHANGED) retains the existing request. This allows you to change the request for some parameters and not others.

Added in version 1.3.

Parameters:

sample_weight (str, True, False, or None, default=sklearn.utils.metadata_routing.UNCHANGED) – Metadata routing for sample_weight parameter in score.

Returns:

self (object) – The updated object.

Return type:

ClassifierChain

class endgame.ensemble.RegressorChain(estimator=None, order='auto', n_jobs=None, random_state=None, verbose=False)[source]¶

Bases: BaseEstimator, RegressorMixin

Chain regressors where each uses predictions of previous outputs as features.

Each regressor in the chain receives the original feature matrix X augmented with the predictions from all preceding regressors. This allows the chain to model dependencies between outputs.
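
The chaining mechanism can be illustrated with a small self-contained sketch. It is not the library's implementation: LeastSquares is a hypothetical stand-in base regressor, and augmenting with ground-truth targets during fit (and with the chain's own predictions during predict) is an assumption based on the description of predict.

```python
import numpy as np

class LeastSquares:
    """Hypothetical base regressor: ordinary least squares with an intercept."""
    def fit(self, X, y):
        A = np.column_stack([X, np.ones(len(X))])
        self.coef_, *_ = np.linalg.lstsq(A, y, rcond=None)
        return self

    def predict(self, X):
        return np.column_stack([X, np.ones(len(X))]) @ self.coef_

def fit_chain(X, Y, order):
    estimators, X_aug = [], X
    for col in order:
        estimators.append(LeastSquares().fit(X_aug, Y[:, col]))
        # Assumed: the true target is appended as a feature for later links.
        X_aug = np.column_stack([X_aug, Y[:, col]])
    return estimators

def predict_chain(estimators, X, order):
    preds = np.empty((len(X), len(order)))
    X_aug = X
    for est, col in zip(estimators, order):
        preds[:, col] = est.predict(X_aug)
        # At prediction time the chain feeds on its own predictions.
        X_aug = np.column_stack([X_aug, preds[:, col]])
    return preds

rng = np.random.default_rng(0)
X = rng.normal(size=(50, 2))
y0 = X @ np.array([1.0, -2.0]) + 0.5
y1 = 3.0 * y0 + X[:, 0]
Y = np.column_stack([y0, y1])

order = [1, 0]
estimators = fit_chain(X, Y, order)
preds = predict_chain(estimators, X, order)
```

Each later link sees one more input column than the previous one, which is why the second estimator here has four coefficients (two features, one chained output, one intercept).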

Parameters:

estimator (estimator) – The base regressor to clone for each link in the chain.
order (str or list of int, default='auto') – Chain ordering strategy:
- 'auto': greedy ordering by pairwise correlation so that adjacent outputs in the chain are maximally correlated.
- 'random': random permutation (seeded by random_state).
- list of int: explicit column ordering.
n_jobs (int, optional) – Not used directly (chain is inherently sequential), but stored for API consistency.
random_state (int, optional) – Random seed for random ordering and estimator cloning.
verbose (bool, default=False) – Enable verbose output.

estimators_¶

Fitted regressors in chain order.

Type:: List[estimator]

order_¶

The resolved output ordering.

Type:: list of int

n_outputs_¶

Number of output columns.

Type:: int

Examples

>>> from endgame.ensemble.multi_output import RegressorChain
>>> from sklearn.linear_model import Ridge
>>> import numpy as np
>>> X = np.random.randn(200, 5)
>>> Y = np.random.randn(200, 3)
>>> chain = RegressorChain(Ridge(), order='auto')
>>> chain.fit(X, Y)
>>> preds = chain.predict(X)
>>> preds.shape
(200, 3)

fit(X, Y, sample_weight=None)[source]¶

Fit the regressor chain.

Parameters:

X (array-like of shape (n_samples, n_features)) – Training features.
Y (array-like of shape (n_samples, n_outputs)) – Multi-output target matrix.
sample_weight (array-like of shape (n_samples,), optional) – Sample weights.

Returns:

self

predict(X)[source]¶

Predict target values for each output.

At prediction time, the chain uses its own predictions (rather than ground truth) for augmentation.

Parameters:: X (array-like of shape (n_samples, n_features)) – Samples to predict.
Returns:: ndarray of shape (n_samples, n_outputs) – Predicted values in original column order.

property feature_importances_¶

Average feature importances across chain estimators.

Only includes importances for the original features (not the chained predictions), averaged across all estimators.

Returns:: ndarray of shape (n_features,) – Mean feature importances for the original features.
Raises:: AttributeError – If the base estimators do not expose feature_importances_.
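
Because later links in the chain see extra columns, their importance vectors are longer than n_features. A sketch of the averaging described above, assuming the original features occupy the first n_features positions and that no renormalization is applied (both are assumptions; the stand-in estimators are hypothetical):

```python
import numpy as np
from types import SimpleNamespace

n_features = 3

# Stand-ins for fitted chain links: each later link has one more input column.
estimators = [
    SimpleNamespace(feature_importances_=np.array([0.5, 0.3, 0.2])),
    SimpleNamespace(feature_importances_=np.array([0.2, 0.2, 0.2, 0.4])),
    SimpleNamespace(feature_importances_=np.array([0.1, 0.1, 0.1, 0.3, 0.4])),
]

def chain_importances(estimators, n_features):
    """Average importances over links, keeping only the original features."""
    trimmed = [est.feature_importances_[:n_features] for est in estimators]
    return np.mean(trimmed, axis=0)

importances = chain_importances(estimators, n_features)
```

Under these assumptions the result need not sum to 1, since the importance assigned to chained predictions is discarded. An estimator without feature_importances_ raises AttributeError when the attribute is accessed.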

score(X, Y, sample_weight=None)[source]¶

Return the mean R^2 score across all outputs.

Parameters:

X (array-like of shape (n_samples, n_features)) – Test samples.
Y (array-like of shape (n_samples, n_outputs)) – True target values.
sample_weight (array-like, optional) – Sample weights.

Returns:

float – Mean of per-output R^2 scores.

set_fit_request(*, sample_weight='$UNCHANGED$')¶

Configure whether metadata should be requested to be passed to the fit method.

The options for each parameter are:

True: metadata is requested, and passed to fit if provided. The request is ignored if metadata is not provided.
False: metadata is not requested and the meta-estimator will not pass it to fit.
None: metadata is not requested, and the meta-estimator will raise an error if the user provides it.
str: metadata should be passed to the meta-estimator with this given alias instead of the original name.

The default (sklearn.utils.metadata_routing.UNCHANGED) retains the existing request. This allows you to change the request for some parameters and not others.

Added in version 1.3.

Parameters:

sample_weight (str, True, False, or None, default=sklearn.utils.metadata_routing.UNCHANGED) – Metadata routing for sample_weight parameter in fit.

Returns:

self (object) – The updated object.

Return type:

RegressorChain

set_score_request(*, sample_weight='$UNCHANGED$')¶

Configure whether metadata should be requested to be passed to the score method.

The options for each parameter are:

True: metadata is requested, and passed to score if provided. The request is ignored if metadata is not provided.
False: metadata is not requested and the meta-estimator will not pass it to score.
None: metadata is not requested, and the meta-estimator will raise an error if the user provides it.
str: metadata should be passed to the meta-estimator with this given alias instead of the original name.

The default (sklearn.utils.metadata_routing.UNCHANGED) retains the existing request. This allows you to change the request for some parameters and not others.

Added in version 1.3.

Parameters:

sample_weight (str, True, False, or None, default=sklearn.utils.metadata_routing.UNCHANGED) – Metadata routing for sample_weight parameter in score.

Returns:

self (object) – The updated object.

Return type:

RegressorChain