Persistence

endgame.persistence.save(estimator, path, backend='auto', compress=None)

Save any sklearn-compatible estimator to disk.

Parameters:

estimator (estimator object) – A fitted or unfitted sklearn-compatible estimator.
path (str) – Destination file or directory path. The appropriate extension (.egm for a single-file save, .egd for a PyTorch directory save) is appended automatically.
backend (str, default="auto") – Serialization backend. "auto" selects "torch" for estimators containing nn.Module attributes and "joblib" otherwise. Explicit options: "joblib", "torch", "pickle".
compress (int or None, default=None) – Compression level (0-9). Used only by the joblib backend; None disables compression.

Return type:

str

Returns:

str – The actual path where the model was saved.

Examples

>>> from sklearn.linear_model import LogisticRegression
>>> import endgame as eg
>>> model = LogisticRegression().fit(X_train, y_train)
>>> eg.save(model, "/tmp/my_model")
'/tmp/my_model.egm'
>>> loaded = eg.load("/tmp/my_model.egm")
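
The "auto" backend rule and the automatic extension described above can be pictured roughly as follows. This is a minimal sketch, not endgame's implementation: choose_backend and resolve_path are hypothetical names, and the way nn.Module attributes are detected is an assumption.

```python
import sys


def choose_backend(estimator, backend="auto"):
    """Hypothetical helper: resolve "auto" to a concrete backend name."""
    if backend != "auto":
        if backend not in {"joblib", "torch", "pickle"}:
            raise ValueError(f"unknown backend: {backend!r}")
        return backend
    # Only consult torch if it has already been imported, so environments
    # without PyTorch never attempt the import.
    torch = sys.modules.get("torch")
    if torch is not None:
        attrs = getattr(estimator, "__dict__", {})
        if any(isinstance(v, torch.nn.Module) for v in attrs.values()):
            return "torch"
    return "joblib"


def resolve_path(path, backend):
    """Hypothetical helper: append .egd for torch saves, .egm otherwise."""
    suffix = ".egd" if backend == "torch" else ".egm"
    path = str(path)
    return path if path.endswith(suffix) else path + suffix
```

Under these assumptions, resolve_path("/tmp/my_model", "joblib") gives "/tmp/my_model.egm", which matches the return value shown in the example above.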

endgame.persistence.load(path, map_location=None)

Load an estimator from disk.

Parameters:

path (str) – Path to a .egm file or .egd directory.
map_location (str or None) – PyTorch map_location for loading tensors (e.g. "cpu"). Only relevant for .egd (PyTorch) saves.

Return type:

Any

Returns:

estimator – The loaded estimator.

Examples

>>> import endgame as eg
>>> model = eg.load("/tmp/my_model.egm")
>>> model.predict(X_test)
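
Because load accepts either a .egm file or a .egd directory, a caller may want to know which kind of save a path points to before deciding whether map_location matters. The helper below is a hypothetical illustration based only on the extensions documented here; it is not part of endgame.

```python
from pathlib import Path


def detect_save_format(path):
    """Hypothetical helper: classify a saved model by its extension."""
    p = Path(path)
    if p.suffix == ".egm":
        return "single-file"      # map_location is not relevant here
    if p.suffix == ".egd":
        return "torch-directory"  # map_location (e.g. "cpu") applies here
    raise ValueError(f"expected a .egm file or .egd directory, got {p.name!r}")
```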

class endgame.persistence.ModelMetadata(endgame_version='', format_version=1, model_class='', model_params=<factory>, created_at='', python_version='', dependencies=<factory>, n_features_in_=None, feature_names_in_=None, classes_=None, is_fitted=False, backend='joblib', compression=None)

Bases: object

Metadata stored alongside a persisted model.

Parameters:

endgame_version (str)
format_version (int)
model_class (str)
model_params (dict[str, Any])
created_at (str)
python_version (str)
dependencies (dict[str, str])
n_features_in_ (int | None)
feature_names_in_ (list[str] | None)
classes_ (list | None)
is_fitted (bool)
backend (str)
compression (int | None)

endgame_version

Version of endgame that saved the model.

Type: str

format_version

Persistence format version for forward compatibility.

Type: int

model_class

Fully qualified class name of the estimator.

Type: str

model_params

Parameters from get_params().

Type: dict

created_at

ISO 8601 timestamp of when the model was saved.

Type: str

python_version

Python version used to save the model.

Type: str

dependencies

Versions of key dependencies (numpy, sklearn, torch, etc.).

Type: dict

n_features_in_

Number of input features (if fitted).

Type: int or None

feature_names_in_

Input feature names (if available).

Type: list of str or None

classes_

Class labels for classifiers.

Type: list or None

is_fitted

Whether the estimator was fitted when saved.

Type: bool

backend

Backend used for serialization ("joblib", "torch", or "pickle").

Type: str

compression

Compression level used.

Type: int or None

endgame_version: str = ''

format_version: int = 1

model_class: str = ''

model_params: dict[str, Any]

created_at: str = ''

python_version: str = ''

dependencies: dict[str, str]

n_features_in_: int | None = None

feature_names_in_: list[str] | None = None

classes_: list | None = None

is_fitted: bool = False

backend: str = 'joblib'

compression: int | None = None

to_dict()

Convert to a JSON-serializable dictionary.

Return type: dict[str, Any]

classmethod from_dict(d)

Create a ModelMetadata instance from a dictionary.

Parameters: d (dict)
Return type: ModelMetadata
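
The to_dict / from_dict pair is what lets metadata travel through JSON. The sketch below illustrates that round trip with a small stand-in dataclass. MetadataSketch is a hypothetical class that copies only a subset of the documented fields, and ignoring unknown keys in from_dict is an assumption, not confirmed behaviour of ModelMetadata.

```python
import json
from dataclasses import asdict, dataclass, field, fields
from typing import Any, Dict, Optional


@dataclass
class MetadataSketch:
    """Illustrative subset of the documented ModelMetadata fields."""
    endgame_version: str = ""
    format_version: int = 1
    model_class: str = ""
    model_params: Dict[str, Any] = field(default_factory=dict)
    classes_: Optional[list] = None
    is_fitted: bool = False
    backend: str = "joblib"

    def to_dict(self) -> Dict[str, Any]:
        # asdict recursively copies nested dicts and lists.
        return asdict(self)

    @classmethod
    def from_dict(cls, d: Dict[str, Any]) -> "MetadataSketch":
        # Assumption: drop keys this class does not know about.
        known = {f.name for f in fields(cls)}
        return cls(**{k: v for k, v in d.items() if k in known})


meta = MetadataSketch(
    model_class="sklearn.linear_model.LogisticRegression",
    model_params={"C": 1.0},
    classes_=[0, 1],
    is_fitted=True,
)
# Serialize to JSON text and rebuild an equal instance from it.
restored = MetadataSketch.from_dict(json.loads(json.dumps(meta.to_dict())))
```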

endgame.persistence.export_onnx(estimator, path, sample_input=None, opset_version=15, backend='auto')

Export a fitted estimator to ONNX format.

Auto-detects the best conversion backend based on the estimator type:

sklearn models -> skl2onnx
Tree-based GBDTs (LightGBM, XGBoost, CatBoost) -> skl2onnx (with registered converters from onnxmltools)
PyTorch nn.Module-backed models -> torch.onnx.export
Fallback -> hummingbird-ml

Parameters:

estimator (Any) – Fitted sklearn-compatible estimator.
path (str | Path) – Output file path. The .onnx extension is added automatically if not present.
sample_input (ndarray | None) – Sample input array for shape inference. Required for PyTorch models; strongly recommended for all others.
opset_version (int) – ONNX opset version. Default is 15, which provides broad operator coverage.
backend (str) – Conversion backend. "auto" selects the best backend based on estimator type. Explicit options: "skl2onnx", "hummingbird", "torch".

Return type:

str

Returns:

Path to the saved ONNX file.

Raises:

ValueError – If the backend is unknown or input shape cannot be inferred.
RuntimeError – If the ONNX conversion fails.

Examples

Export a scikit-learn model:

>>> from sklearn.ensemble import RandomForestClassifier
>>> import endgame as eg
>>> model = RandomForestClassifier(n_estimators=10).fit(X, y)
>>> eg.export_onnx(model, "rf_model.onnx", sample_input=X[:1])
'rf_model.onnx'

Export a LightGBM model:

>>> from endgame.models.wrappers import LGBMWrapper
>>> model = LGBMWrapper(task='classification').fit(X, y)
>>> eg.export_onnx(model, "lgbm.onnx", sample_input=X[:1])
'lgbm.onnx'

Export with a specific backend:

>>> eg.export_onnx(model, "model.onnx", backend='hummingbird')
'model.onnx'
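
Two of the documented behaviours are easy to illustrate in isolation: the .onnx extension is appended when missing, and an unknown backend raises ValueError. The helper below is a hypothetical sketch of that argument handling only. It performs no conversion and is not endgame's code.

```python
VALID_BACKENDS = ("auto", "skl2onnx", "hummingbird", "torch")


def normalize_export_args(path, backend="auto"):
    """Hypothetical helper: validate the backend and fix up the output path."""
    if backend not in VALID_BACKENDS:
        raise ValueError(
            f"unknown backend {backend!r}; expected one of {VALID_BACKENDS}"
        )
    path = str(path)  # accepts both str and pathlib.Path
    if not path.endswith(".onnx"):
        path += ".onnx"
    return path, backend
```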

class endgame.persistence.ModelServer(model_path, providers=None, session_options=None)

Bases: object

Lightweight inference server using ONNX Runtime.

Loads an ONNX model and provides predict and predict_proba methods that match the sklearn estimator interface. Supports both CPU and GPU execution providers.

Parameters:

model_path (str | Path) – Path to the .onnx model file.
providers (Sequence[str] | None) – ONNX Runtime execution providers. Defaults to ["CPUExecutionProvider"]. Use ["CUDAExecutionProvider", "CPUExecutionProvider"] for GPU.
session_options (Any | None) – Optional ort.SessionOptions for tuning thread count, graph optimisation level, etc.

session: The underlying ort.InferenceSession.

input_names: Names of the model's input tensors.

output_names: Names of the model's output tensors.

metadata: ONNX model metadata properties (if any).

Examples

Basic inference:

>>> from endgame.persistence import ModelServer
>>> server = ModelServer("model.onnx")
>>> preds = server.predict(X_test)

With probability outputs:

>>> proba = server.predict_proba(X_test)
>>> proba.shape
(100, 2)

GPU inference:

>>> server = ModelServer(
...     "model.onnx",
...     providers=["CUDAExecutionProvider", "CPUExecutionProvider"],
... )

Custom session options:

>>> import onnxruntime as ort
>>> opts = ort.SessionOptions()
>>> opts.intra_op_num_threads = 4
>>> server = ModelServer("model.onnx", session_options=opts)

predict(X)

Generate predictions from the ONNX model.

For classifiers exported via skl2onnx, the first output is typically the predicted label. For regressors, it is the predicted value.

Parameters: X (ndarray) – Input feature array of shape (n_samples, n_features).
Return type: ndarray
Returns: Predictions of shape (n_samples,) for single-output models, or (n_samples, n_outputs) for multi-output models.

Examples

>>> server = ModelServer("model.onnx")
>>> preds = server.predict(X_test)
>>> preds.shape
(100,)

predict_proba(X)

Generate class probability estimates from the ONNX model.

For classifiers exported via skl2onnx, the second output is typically the probability array. If only one output exists, it is returned directly.

Parameters: X (ndarray) – Input feature array of shape (n_samples, n_features).
Return type: ndarray
Returns: Probability array of shape (n_samples, n_classes).
Raises: RuntimeError – If the model does not produce probability outputs.

Examples

>>> server = ModelServer("model.onnx")
>>> proba = server.predict_proba(X_test)
>>> proba.shape
(100, 2)
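
The output-selection rule described for predict_proba (second output if present, otherwise the only output) can be sketched in plain Python. By default, skl2onnx wraps classifier probabilities in a ZipMap, so the second output is often a list of {label: probability} dicts rather than an array. The function below, select_proba, is a hypothetical illustration of handling both cases and is not ModelServer's implementation.

```python
def select_proba(outputs, classes=None):
    """Hypothetical post-processing of raw ONNX outputs into probability rows."""
    if not outputs:
        raise RuntimeError("model produced no outputs")
    # Second output if present, otherwise the only output.
    proba = outputs[1] if len(outputs) > 1 else outputs[0]
    # ZipMap-style output: one {class_label: probability} dict per sample.
    if len(proba) > 0 and isinstance(proba[0], dict):
        labels = classes if classes is not None else sorted(proba[0])
        return [[row[label] for label in labels] for row in proba]
    return proba


# Labels first, then ZipMap-style probabilities for two samples.
raw = [[0, 1], [{0: 0.9, 1: 0.1}, {0: 0.2, 1: 0.8}]]
rows = select_proba(raw)
```

Here rows holds one list per sample, with columns ordered by sorted class label unless an explicit classes order is passed.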

predict_raw(X)

Return all raw ONNX model outputs without post-processing.

Useful for debugging or when the model has non-standard outputs.

Parameters: X (ndarray) – Input feature array of shape (n_samples, n_features).
Return type: list[ndarray]
Returns: List of all output arrays from the ONNX model.

Examples

>>> server = ModelServer("model.onnx")
>>> outputs = server.predict_raw(X_test)
>>> len(outputs)
2

property input_shapes: list[list | None]

Return the expected input shapes.

Returns: List of shape lists, one per input. Dimensions may be None for dynamic axes.

property output_shapes: list[list | None]

Return the output shapes.

Returns: List of shape lists, one per output. Dimensions may be None for dynamic axes.
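
Since dynamic axes are reported as None, a shape from input_shapes can be used to sanity-check an array before inference by treating None as a wildcard. The helper below is a hypothetical sketch of such a check. It assumes shapes are plain lists of ints and None, as documented above.

```python
def shape_matches(expected, actual):
    """Return True if `actual` fits `expected`, treating None as a wildcard."""
    if expected is None:
        return True  # shape unknown, nothing to check
    if len(expected) != len(actual):
        return False  # rank mismatch
    return all(e is None or e == a for e, a in zip(expected, actual))
```

For example, an expected shape of [None, 4] accepts any batch size with four features, so shape_matches([None, 4], (100, 4)) holds while shape_matches([None, 4], (100, 5)) does not.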