Persistence

endgame.persistence.save(estimator, path, backend='auto', compress=None)

Save any sklearn-compatible estimator to disk.

Parameters:
  • estimator (estimator object) – A fitted (or unfitted) sklearn-compatible estimator.

  • path (str) – Destination file or directory path. The appropriate extension (.egm for single-file, .egd for PyTorch directory) will be added automatically.

  • backend (str, default="auto") – Serialization backend. "auto" selects "torch" for estimators containing nn.Module attributes, "joblib" otherwise. Explicit options: "joblib", "torch", "pickle".

  • compress (int or None) – Compression level (0-9). Only used by the joblib backend. None means no compression.

Return type:

str

Returns:

str – The actual path where the model was saved.
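
The extension handling described for path can be pictured with a small standalone sketch. It is not endgame's implementation: the helper name resolve_save_path, the mapping of "pickle" to .egm, and the rule that an existing matching suffix is left alone are assumptions made for illustration. Only the .egm/.egd suffixes and the backend names come from the description above.

```python
from pathlib import PurePosixPath

# Hypothetical mapping, not part of endgame. The text states .egm for
# single-file saves and .egd for PyTorch directory saves; treating "pickle"
# as a single-file backend is an assumption.
SUFFIX_BY_BACKEND = {"joblib": ".egm", "pickle": ".egm", "torch": ".egd"}


def resolve_save_path(path: str, backend: str) -> str:
    """Return `path` with the suffix implied by `backend` appended if missing."""
    if backend not in SUFFIX_BY_BACKEND:
        raise ValueError(f"unknown backend: {backend!r}")
    suffix = SUFFIX_BY_BACKEND[backend]
    target = PurePosixPath(path)
    # Assumption: a path that already ends in the right suffix is kept as-is.
    if target.suffix != suffix:
        target = target.with_name(target.name + suffix)
    return str(target)


print(resolve_save_path("/tmp/my_model", "joblib"))  # /tmp/my_model.egm
print(resolve_save_path("/tmp/my_net", "torch"))     # /tmp/my_net.egd
```

Returning the resolved path, as save is documented to do, lets the caller pass the result straight to load without reconstructing the extension.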

Examples

>>> from sklearn.linear_model import LogisticRegression
>>> import endgame as eg
>>> model = LogisticRegression().fit(X_train, y_train)
>>> eg.save(model, "/tmp/my_model")
'/tmp/my_model.egm'
>>> loaded = eg.load("/tmp/my_model.egm")

endgame.persistence.load(path, map_location=None)

Load an estimator from disk.

Parameters:
  • path (str) – Path to a .egm file or .egd directory.

  • map_location (str or None) – PyTorch map_location for loading tensors (e.g. "cpu"). Only relevant for .egd (PyTorch) saves.

Return type:

Any

Returns:

estimator – The loaded estimator.
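
Because path may point at either a .egm file or a .egd directory, a loader has to tell the two apart before reading anything. The sketch below shows one way that check could look. The function name detect_saved_format and its return strings are hypothetical, and endgame may perform this step differently.

```python
import tempfile
from pathlib import Path


def detect_saved_format(path: str) -> str:
    """Classify a saved-model path as 'single-file' or 'torch-directory'."""
    target = Path(path)
    if target.suffix == ".egm" and target.is_file():
        return "single-file"
    if target.suffix == ".egd" and target.is_dir():
        return "torch-directory"
    raise ValueError(f"not a recognised saved model: {path}")


with tempfile.TemporaryDirectory() as tmp:
    file_save = Path(tmp) / "model.egm"
    file_save.write_bytes(b"")  # empty stand-in for a real single-file save
    dir_save = Path(tmp) / "net.egd"
    dir_save.mkdir()            # empty stand-in for a PyTorch directory save
    print(detect_saved_format(str(file_save)))  # single-file
    print(detect_saved_format(str(dir_save)))   # torch-directory
```

Under this reading, map_location would only matter on the 'torch-directory' branch, which matches the note that it is relevant only for .egd saves.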

Examples

>>> import endgame as eg
>>> model = eg.load("/tmp/my_model.egm")
>>> model.predict(X_test)

class endgame.persistence.ModelMetadata(endgame_version='', format_version=1, model_class='', model_params=<factory>, created_at='', python_version='', dependencies=<factory>, n_features_in_=None, feature_names_in_=None, classes_=None, is_fitted=False, backend='joblib', compression=None)

Bases: object

Metadata stored alongside a persisted model.

Parameters:
  • endgame_version (str)

  • format_version (int)

  • model_class (str)

  • model_params (dict[str, Any])

  • created_at (str)

  • python_version (str)

  • dependencies (dict[str, str])

  • n_features_in_ (int | None)

  • feature_names_in_ (list[str] | None)

  • classes_ (list | None)

  • is_fitted (bool)

  • backend (str)

  • compression (int | None)

endgame_version

Version of endgame that saved the model.

Type:

str

format_version

Persistence format version for forward compatibility.

Type:

int

model_class

Fully qualified class name of the estimator.

Type:

str

model_params

Parameters from get_params().

Type:

dict

created_at

ISO 8601 timestamp of when the model was saved.

Type:

str

python_version

Python version used to save the model.

Type:

str

dependencies

Versions of key dependencies (numpy, sklearn, torch, etc.).

Type:

dict

n_features_in_

Number of input features (if fitted).

Type:

int or None

feature_names_in_

Input feature names (if available).

Type:

list of str or None

classes_

Class labels for classifiers.

Type:

list or None

is_fitted

Whether the estimator was fitted when saved.

Type:

bool

backend

Backend used for serialization ("joblib", "torch", or "pickle").

Type:

str

compression

Compression level used.

Type:

int or None

endgame_version: str = ''
format_version: int = 1
model_class: str = ''
model_params: dict[str, Any]
created_at: str = ''
python_version: str = ''
dependencies: dict[str, str]
n_features_in_: int | None = None
feature_names_in_: list[str] | None = None
classes_: list | None = None
is_fitted: bool = False
backend: str = 'joblib'
compression: int | None = None
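
The created_at, python_version, and dependencies fields describe the environment at save time. As a rough illustration of where such values can come from, the sketch below gathers them with the standard library. The function name collect_environment and the default package list are assumptions, and this is not necessarily how endgame populates the fields.

```python
import platform
from datetime import datetime, timezone
from importlib import metadata


def collect_environment(packages=("numpy", "scikit-learn", "torch")):
    """Gather values of the kind stored in created_at, python_version, dependencies."""
    dependencies = {}
    for name in packages:
        try:
            dependencies[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            # Package is not installed in this environment: leave it out.
            continue
    return {
        "created_at": datetime.now(timezone.utc).isoformat(),  # ISO 8601, UTC
        "python_version": platform.python_version(),
        "dependencies": dependencies,
    }


env = collect_environment()
print(env["python_version"], sorted(env["dependencies"]))
```

Recording these values next to the model makes it possible to compare the saving environment with the loading one when a restored estimator behaves unexpectedly.
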

to_dict()

Convert to a JSON-serializable dictionary.

Return type:

dict[str, Any]

classmethod from_dict(d)

Create from a dictionary.

Return type:

ModelMetadata

Parameters:

d (dict)
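
The to_dict / from_dict pair is meant to round-trip through JSON. A minimal sketch of that pattern follows, using a reduced, hypothetical stand-in class (MetadataSketch, with four of the fields listed above) rather than the real ModelMetadata. Ignoring unknown keys in from_dict is an assumption about how forward compatibility might be handled.

```python
import json
from dataclasses import asdict, dataclass, field, fields
from typing import Optional


@dataclass
class MetadataSketch:
    """Reduced, hypothetical stand-in for ModelMetadata."""

    model_class: str = ""
    # Mutable defaults need a factory; this is what <factory> in the
    # ModelMetadata signature refers to.
    model_params: dict = field(default_factory=dict)
    n_features_in_: Optional[int] = None
    is_fitted: bool = False

    def to_dict(self) -> dict:
        """Convert to a plain dictionary that json.dumps can handle."""
        return asdict(self)

    @classmethod
    def from_dict(cls, d: dict) -> "MetadataSketch":
        # Assumption: keys this class does not know about are dropped.
        known = {f.name for f in fields(cls)}
        return cls(**{k: v for k, v in d.items() if k in known})


meta = MetadataSketch(
    model_class="sklearn.linear_model.LogisticRegression",
    model_params={"C": 1.0},
    n_features_in_=4,
    is_fitted=True,
)
restored = MetadataSketch.from_dict(json.loads(json.dumps(meta.to_dict())))
print(restored == meta)  # True
```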

endgame.persistence.export_onnx(estimator, path, sample_input=None, opset_version=15, backend='auto')

Export a fitted estimator to ONNX format.

Auto-detects the best conversion backend based on the estimator type:

  • sklearn models -> skl2onnx

  • Gradient-boosted tree models (LightGBM, XGBoost, CatBoost) -> skl2onnx (with registered converters from onnxmltools)

  • PyTorch nn.Module-backed models -> torch.onnx.export

  • Fallback -> hummingbird-ml

Parameters:
  • estimator (Any) – Fitted sklearn-compatible estimator.

  • path (str | Path) – Output file path. The .onnx extension is added automatically if not present.

  • sample_input (ndarray | None) – Sample input array for shape inference. Required for PyTorch models; strongly recommended for all others.

  • opset_version (int) – ONNX opset version. Default is 15, which provides broad operator coverage.

  • backend (str) – Conversion backend. "auto" selects the best backend based on estimator type. Explicit options: "skl2onnx", "hummingbird", "torch".

Return type:

str

Returns:

Path to the saved ONNX file.

Raises:
  • ValueError – If the backend is unknown or the input shape cannot be inferred.

  • RuntimeError – If the ONNX conversion fails.

Examples

Export a scikit-learn model:

>>> from sklearn.ensemble import RandomForestClassifier
>>> import endgame as eg
>>> model = RandomForestClassifier(n_estimators=10).fit(X, y)
>>> eg.export_onnx(model, "rf_model.onnx", sample_input=X[:1])
'rf_model.onnx'

Export a LightGBM model:

>>> from endgame.models.wrappers import LGBMWrapper
>>> model = LGBMWrapper(task='classification').fit(X, y)
>>> eg.export_onnx(model, "lgbm.onnx", sample_input=X[:1])
'lgbm.onnx'

Export with a specific backend:

>>> eg.export_onnx(model, "model.onnx", backend='hummingbird')
'model.onnx'
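
The backend="auto" rules listed for export_onnx can be sketched as a small dispatch function. Everything here is illustrative: pick_onnx_backend, has_torch_module, and ensure_onnx_suffix are hypothetical names, and classifying estimators by the top-level module of their class is a simplification (endgame's own wrappers would need their own detection). The fake classes only stand in for real estimators so the sketch runs without sklearn or torch installed.

```python
# Top-level packages assumed to be handled by skl2onnx in this sketch.
SKL2ONNX_ROOTS = {"sklearn", "lightgbm", "xgboost", "catboost"}


def has_torch_module(estimator) -> bool:
    """True if any instance attribute has a class defined under torch.nn."""
    for value in getattr(estimator, "__dict__", {}).values():
        if any(c.__module__.startswith("torch.nn") for c in type(value).__mro__):
            return True
    return False


def pick_onnx_backend(estimator) -> str:
    """Choose a conversion backend following the documented order of preference."""
    if has_torch_module(estimator):
        return "torch"
    root = type(estimator).__module__.split(".")[0]
    if root in SKL2ONNX_ROOTS:
        return "skl2onnx"
    return "hummingbird"  # documented fallback


def ensure_onnx_suffix(path: str) -> str:
    """Append .onnx unless the path already ends with it."""
    return path if path.endswith(".onnx") else path + ".onnx"


# Stand-ins for real estimators: only their module names matter here.
class FakeForest:
    pass


class FakeLinearLayer:
    pass


class FakeNetEstimator:
    def __init__(self):
        self.module_ = FakeLinearLayer()


FakeForest.__module__ = "sklearn.ensemble._forest"
FakeLinearLayer.__module__ = "torch.nn.modules.linear"

print(pick_onnx_backend(FakeForest()))        # skl2onnx
print(pick_onnx_backend(FakeNetEstimator()))  # torch
print(pick_onnx_backend(object()))            # hummingbird
print(ensure_onnx_suffix("rf_model"))         # rf_model.onnx
```
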

class endgame.persistence.ModelServer(model_path, providers=None, session_options=None)

Bases: object

Lightweight inference server using ONNX Runtime.

Loads an ONNX model and provides predict and predict_proba methods that match the sklearn estimator interface. Supports both CPU and GPU execution providers.

Parameters:
  • model_path (str | Path) – Path to the .onnx model file.

  • providers (Sequence[str] | None) – ONNX Runtime execution providers. Defaults to ["CPUExecutionProvider"]. Use ["CUDAExecutionProvider", "CPUExecutionProvider"] for GPU.

  • session_options (Any | None) – Optional ort.SessionOptions for tuning thread count, graph optimisation level, etc.
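
When the same code has to run on machines with and without a GPU, the providers list can be built from what the local ONNX Runtime installation reports. The helper below is a hypothetical convenience and not part of ModelServer. It takes the available providers as an argument so that the sketch runs without onnxruntime installed.

```python
def choose_providers(available, prefer_gpu=True):
    """Order execution providers, always ending with the CPU fallback."""
    providers = []
    if prefer_gpu and "CUDAExecutionProvider" in available:
        providers.append("CUDAExecutionProvider")
    providers.append("CPUExecutionProvider")
    return providers


# In a real program the list would come from
# onnxruntime.get_available_providers().
print(choose_providers(["CUDAExecutionProvider", "CPUExecutionProvider"]))
# ['CUDAExecutionProvider', 'CPUExecutionProvider']
print(choose_providers(["CPUExecutionProvider"]))
# ['CPUExecutionProvider']
```

Keeping "CPUExecutionProvider" last mirrors the GPU example in the parameter description, where the CPU provider acts as the fallback.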

session

The underlying ort.InferenceSession.

input_names

Names of the model’s input tensors.

output_names

Names of the model’s output tensors.

metadata

ONNX model metadata properties (if any).

Examples

Basic inference:

>>> from endgame.persistence import ModelServer
>>> server = ModelServer("model.onnx")
>>> preds = server.predict(X_test)

With probability outputs:

>>> proba = server.predict_proba(X_test)
>>> proba.shape
(100, 2)

GPU inference:

>>> server = ModelServer(
...     "model.onnx",
...     providers=["CUDAExecutionProvider", "CPUExecutionProvider"],
... )

Custom session options:

>>> import onnxruntime as ort
>>> opts = ort.SessionOptions()
>>> opts.intra_op_num_threads = 4
>>> server = ModelServer("model.onnx", session_options=opts)

predict(X)

Generate predictions from the ONNX model.

For classifiers exported via skl2onnx, the first output is typically the predicted label. For regressors, it is the predicted value.

Parameters:

X (ndarray) – Input feature array of shape (n_samples, n_features).

Return type:

ndarray

Returns:

Predictions of shape (n_samples,) for single-output models, or (n_samples, n_outputs) for multi-output models.

Examples

>>> server = ModelServer("model.onnx")
>>> preds = server.predict(X_test)
>>> preds.shape
(100,)

predict_proba(X)

Generate class probability estimates from the ONNX model.

For classifiers exported via skl2onnx, the second output is typically the probability array. If only one output exists, it is returned directly.

Parameters:

X (ndarray) – Input feature array of shape (n_samples, n_features).

Return type:

ndarray

Returns:

Probability array of shape (n_samples, n_classes).

Raises:

RuntimeError – If the model does not produce probability outputs.

Examples

>>> server = ModelServer("model.onnx")
>>> proba = server.predict_proba(X_test)
>>> proba.shape
(100, 2)
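
The output-selection rule described for predict_proba (second output if present, otherwise the only output) can be sketched in plain Python. The second helper covers a related detail: skl2onnx classifiers converted with the default ZipMap setting return probabilities as a list of {class: probability} dictionaries rather than an array. Whether ModelServer performs this conversion itself is not stated here, so both function names are hypothetical, and lists stand in for arrays.

```python
def pick_probabilities(outputs):
    """Return the second output when present, otherwise the only output."""
    if not outputs:
        raise RuntimeError("model produced no outputs")
    return outputs[1] if len(outputs) > 1 else outputs[0]


def zipmap_to_rows(proba, classes):
    """Turn a list of {class: probability} dicts into rows ordered by `classes`."""
    return [[row[c] for c in classes] for row in proba]


# Stand-in for raw session outputs of a two-class classifier:
# first the predicted labels, then ZipMap-style probabilities.
labels = [0, 1]
zipmapped = [{0: 0.9, 1: 0.1}, {0: 0.2, 1: 0.8}]
outputs = [labels, zipmapped]

rows = zipmap_to_rows(pick_probabilities(outputs), classes=[0, 1])
print(rows)  # [[0.9, 0.1], [0.2, 0.8]]
```
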

predict_raw(X)

Return all raw ONNX model outputs without post-processing.

Useful for debugging or when the model has non-standard outputs.

Parameters:

X (ndarray) – Input feature array of shape (n_samples, n_features).

Return type:

list[ndarray]

Returns:

List of all output arrays from the ONNX model.

Examples

>>> server = ModelServer("model.onnx")
>>> outputs = server.predict_raw(X_test)
>>> len(outputs)
2
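
When debugging, it often helps to pair each raw output with its tensor name. The sketch below does that with a hypothetical helper, label_outputs. The names used in the demonstration are typical of skl2onnx classifier exports but are assumptions here, and in practice they would come from the server's output_names attribute.

```python
def label_outputs(output_names, outputs):
    """Map each output tensor name to the corresponding raw output."""
    if len(output_names) != len(outputs):
        raise ValueError("number of names and outputs differ")
    return dict(zip(output_names, outputs))


# Stand-ins for server.output_names and server.predict_raw(X_test).
names = ["output_label", "output_probability"]
raw = [[1, 0], [[0.1, 0.9], [0.8, 0.2]]]

named = label_outputs(names, raw)
print(named["output_label"])  # [1, 0]
```
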

property input_shapes: list[list | None]

Return the expected input shapes.

Returns:

List of shape lists, one per input. Dimensions may be None for dynamic axes.

property output_shapes: list[list | None]

Return the output shapes.

Returns:

List of shape lists, one per output. Dimensions may be None for dynamic axes.
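
ONNX Runtime reports a dynamic axis as a symbolic name (a string) or as None, so a shape list with None entries can be treated as a pattern with wildcards. The sketch below normalises such a shape and checks a concrete input shape against it. Both helpers are hypothetical and operate on plain lists, so they make no claim about how input_shapes is computed internally.

```python
def normalize_shape(raw_shape):
    """Replace symbolic or missing dimensions with None."""
    return [d if isinstance(d, int) else None for d in raw_shape]


def shape_is_compatible(expected, actual):
    """Check a concrete shape against an expected shape with None wildcards."""
    if len(expected) != len(actual):
        return False
    return all(e is None or e == a for e, a in zip(expected, actual))


expected = normalize_shape(["batch", 4])  # dynamic batch axis, 4 features
print(expected)                                 # [None, 4]
print(shape_is_compatible(expected, (100, 4)))  # True
print(shape_is_compatible(expected, (100, 5)))  # False
```

A check like this before calling predict gives a clearer error than a failure inside the runtime when the feature count does not match the exported model.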