Persistence
- endgame.persistence.save(estimator, path, backend='auto', compress=None)
Save any sklearn-compatible estimator to disk.
- Parameters:
estimator (estimator object) – A fitted (or unfitted) sklearn-compatible estimator.
path (str) – Destination file or directory path. The appropriate extension (.egm for single-file, .egd for PyTorch directory) will be added automatically.
backend (str, default="auto") – Serialization backend. "auto" selects "torch" for estimators containing nn.Module attributes, "joblib" otherwise. Explicit options: "joblib", "torch", "pickle".
compress (int or None) – Compression level (0-9). Only used by the joblib backend. None means no compression.
- Return type:
str
- Returns:
The actual path where the model was saved.
Examples
>>> from sklearn.linear_model import LogisticRegression
>>> import endgame as eg
>>> model = LogisticRegression().fit(X_train, y_train)
>>> eg.save(model, "/tmp/my_model")
'/tmp/my_model.egm'
>>> loaded = eg.load("/tmp/my_model.egm")
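The extension rule described for path can be sketched in a few lines. This is only an illustration of the documented behaviour, not the library's implementation: the helper name resolve_save_path and the _EXTENSIONS table are hypothetical, and treating "pickle" as a single-file (.egm) backend is an assumption.

```python
# Hypothetical mapping inferred from the `path` description: single-file
# backends get ".egm", the PyTorch directory backend gets ".egd".
# Treating "pickle" as a single-file backend is an assumption.
_EXTENSIONS = {"joblib": ".egm", "pickle": ".egm", "torch": ".egd"}


def resolve_save_path(path: str, backend: str) -> str:
    """Append the backend's extension to `path` unless it is already there."""
    if backend not in _EXTENSIONS:
        raise ValueError(f"Unknown backend: {backend!r}")
    ext = _EXTENSIONS[backend]
    return path if path.endswith(ext) else path + ext


print(resolve_save_path("/tmp/my_model", "joblib"))      # /tmp/my_model.egm
print(resolve_save_path("/tmp/my_model.egm", "joblib"))  # /tmp/my_model.egm
print(resolve_save_path("/tmp/my_net", "torch"))         # /tmp/my_net.egd
```

Because the extension is appended for you, it is safest to pass the string returned by save, rather than the path you supplied, to load.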
- endgame.persistence.load(path, map_location=None)
Load an estimator from disk.
- Parameters:
- Return type:
- Returns:
estimator – The loaded estimator.
Examples
>>> import endgame as eg
>>> model = eg.load("/tmp/my_model.egm")
>>> model.predict(X_test)
- class endgame.persistence.ModelMetadata(endgame_version='', format_version=1, model_class='', model_params=<factory>, created_at='', python_version='', dependencies=<factory>, n_features_in_=None, feature_names_in_=None, classes_=None, is_fitted=False, backend='joblib', compression=None)
Bases: object
Metadata stored alongside a persisted model.
- Parameters:
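The `<factory>` defaults in the signature suggest a dataclass with default_factory fields. The stand-in below mirrors the documented field names and defaults to show how such a record could be built and round-tripped through JSON. The class name ModelMetadataSketch, the dict type chosen for model_params and dependencies, the other type annotations, and the JSON encoding are all assumptions made for illustration.

```python
import json
from dataclasses import asdict, dataclass, field
from typing import Optional


@dataclass
class ModelMetadataSketch:
    """Illustrative stand-in mirroring the documented ModelMetadata signature."""

    endgame_version: str = ""
    format_version: int = 1
    model_class: str = ""
    model_params: dict = field(default_factory=dict)   # type is an assumption
    created_at: str = ""
    python_version: str = ""
    dependencies: dict = field(default_factory=dict)   # type is an assumption
    n_features_in_: Optional[int] = None
    feature_names_in_: Optional[list] = None
    classes_: Optional[list] = None
    is_fitted: bool = False
    backend: str = "joblib"
    compression: Optional[int] = None


meta = ModelMetadataSketch(
    model_class="sklearn.linear_model.LogisticRegression",
    model_params={"C": 1.0},
    n_features_in_=4,
    is_fitted=True,
)

# Serialize to JSON and rebuild an equal record from the decoded dict.
payload = json.dumps(asdict(meta), sort_keys=True)
restored = ModelMetadataSketch(**json.loads(payload))
print(restored == meta)
```

Using default_factory gives every instance its own model_params and dependencies containers, which avoids the shared-mutable-default pitfall.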
- endgame.persistence.export_onnx(estimator, path, sample_input=None, opset_version=15, backend='auto')
Export a fitted estimator to ONNX format.
Auto-detects the best conversion backend based on the estimator type:
sklearn models -> skl2onnx
Tree-based GBDTs (LightGBM, XGBoost, CatBoost) -> skl2onnx (with registered converters from onnxmltools)
PyTorch nn.Module-backed models -> torch.onnx.export
Fallback -> hummingbird-ml
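The detection order above could be approximated as follows. This is a rough sketch, not the dispatcher endgame actually uses: choose_onnx_backend and its helpers are hypothetical names, and the checks rely on module names rather than isinstance tests so the example runs without sklearn or torch installed. Wrapper classes defined inside endgame itself (such as LGBMWrapper) would need extra handling that is not shown.

```python
def _root_packages(obj):
    """Top-level package names of every class in the object's MRO."""
    return {cls.__module__.split(".")[0] for cls in type(obj).__mro__}


def _looks_like_torch_module(obj):
    """True if some base class is torch.nn's Module (checked by name, no import)."""
    return any(
        cls.__name__ == "Module" and cls.__module__.startswith("torch.nn")
        for cls in type(obj).__mro__
    )


def choose_onnx_backend(estimator):
    """Hypothetical dispatcher following the documented auto-detection order."""
    attrs = getattr(estimator, "__dict__", {}).values()
    # Check torch first: a torch-backed estimator may also inherit from sklearn.
    if any(_looks_like_torch_module(v) for v in attrs):
        return "torch"
    if _root_packages(estimator) & {"sklearn", "lightgbm", "xgboost", "catboost"}:
        return "skl2onnx"
    return "hummingbird"


# Stand-in classes so the sketch runs without any ML library installed.
Module = type("Module", (), {"__module__": "torch.nn.modules.module"})
SkModel = type("SkModel", (), {"__module__": "sklearn.linear_model"})


class Net(Module):
    pass


class TorchBacked(SkModel):
    def __init__(self):
        self.net = Net()  # an nn.Module-like attribute


class Unknown:
    pass


print(choose_onnx_backend(SkModel()))      # skl2onnx
print(choose_onnx_backend(TorchBacked()))  # torch
print(choose_onnx_backend(Unknown()))      # hummingbird
```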
- Parameters:
estimator (Any) – Fitted sklearn-compatible estimator.
path (Text | Path) – Output file path. The .onnx extension is added automatically if not present.
sample_input (ndarray | None) – Sample input array for shape inference. Required for PyTorch models; strongly recommended for all others.
opset_version (int) – ONNX opset version. Default is 15, which provides broad operator coverage.
backend (Text) – Conversion backend. "auto" selects the best backend based on estimator type. Explicit options: "skl2onnx", "hummingbird", "torch".
- Return type:
- Returns:
Path to the saved ONNX file.
- Raises:
ValueError – If the backend is unknown or input shape cannot be inferred.
RuntimeError – If the ONNX conversion fails.
Examples
Export a scikit-learn model:
>>> from sklearn.ensemble import RandomForestClassifier
>>> import endgame as eg
>>> model = RandomForestClassifier(n_estimators=10).fit(X, y)
>>> eg.export_onnx(model, "rf_model.onnx", sample_input=X[:1])
'rf_model.onnx'
Export a LightGBM model:
>>> from endgame.models.wrappers import LGBMWrapper
>>> model = LGBMWrapper(task='classification').fit(X, y)
>>> eg.export_onnx(model, "lgbm.onnx", sample_input=X[:1])
'lgbm.onnx'
Export with a specific backend:
>>> eg.export_onnx(model, "model.onnx", backend='hummingbird')
'model.onnx'
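The role of sample_input in shape inference, and the ValueError raised when the shape cannot be inferred, can be illustrated with a small helper. The function infer_input_shape is hypothetical, and the fallback to a feature count such as sklearn's n_features_in_ is an assumption about how a shape might be recovered when no sample is given.

```python
def infer_input_shape(sample_input=None, n_features_in=None):
    """Return a [batch, n_features] shape with a dynamic (None) batch axis."""
    if sample_input is not None:
        shape = getattr(sample_input, "shape", None)
        if shape is None or len(shape) != 2:
            raise ValueError("sample_input must be 2-D: (n_samples, n_features)")
        return [None, int(shape[1])]
    if n_features_in is not None:
        # Fitted sklearn estimators usually expose this as `n_features_in_`.
        return [None, int(n_features_in)]
    raise ValueError("Cannot infer input shape; pass sample_input")


class FakeArray:
    """Stand-in for a NumPy array; only `.shape` is consulted."""

    shape = (1, 4)


print(infer_input_shape(FakeArray()))       # [None, 4]
print(infer_input_shape(n_features_in=10))  # [None, 10]
```

With skl2onnx, a shape of this form is what is typically passed as FloatTensorType([None, n_features]) in the initial_types argument, which is why a one-row slice such as X[:1] is enough.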
- class endgame.persistence.ModelServer(model_path, providers=None, session_options=None)
Bases: object
Lightweight inference server using ONNX Runtime.
Loads an ONNX model and provides predict and predict_proba methods that match the sklearn estimator interface. Supports both CPU and GPU execution providers.
- Parameters:
- session
The underlying ort.InferenceSession.
- input_names
Names of the model’s input tensors.
- output_names
Names of the model’s output tensors.
- metadata
ONNX model metadata properties (if any).
Examples
Basic inference:
>>> from endgame.persistence import ModelServer
>>> server = ModelServer("model.onnx")
>>> preds = server.predict(X_test)
With probability outputs:
>>> proba = server.predict_proba(X_test)
>>> proba.shape
(100, 2)
GPU inference:
>>> server = ModelServer(
...     "model.onnx",
...     providers=["CUDAExecutionProvider", "CPUExecutionProvider"],
... )
Custom session options:
>>> import onnxruntime as ort
>>> opts = ort.SessionOptions()
>>> opts.intra_op_num_threads = 4
>>> server = ModelServer("model.onnx", session_options=opts)
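To make the relationship between the attributes and the wrapped session concrete, here is a stripped-down analogue. MiniServer and FakeSession are hypothetical names introduced for this sketch; the only ONNX Runtime calls mirrored are get_inputs(), get_outputs(), and run(output_names, input_feed), and the fake session stands in for a real InferenceSession so the example runs without onnxruntime or a model file.

```python
from types import SimpleNamespace


class MiniServer:
    """Hypothetical, stripped-down analogue of ModelServer."""

    def __init__(self, session):
        # `session` is expected to behave like onnxruntime.InferenceSession,
        # e.g. ort.InferenceSession(path, sess_options=opts, providers=[...]).
        self.session = session
        self.input_names = [i.name for i in session.get_inputs()]
        self.output_names = [o.name for o in session.get_outputs()]

    def predict_raw(self, X):
        # Passing None as the first argument requests every model output.
        # With a real session, X must be a NumPy array of the model's dtype.
        return self.session.run(None, {self.input_names[0]: X})


class FakeSession:
    """Stand-in that returns a label list and a probability list."""

    def get_inputs(self):
        return [SimpleNamespace(name="X")]

    def get_outputs(self):
        return [SimpleNamespace(name="label"), SimpleNamespace(name="probabilities")]

    def run(self, output_names, feeds):
        rows = feeds["X"]
        return [[0 for _ in rows], [[1.0, 0.0] for _ in rows]]


server = MiniServer(FakeSession())
print(server.input_names)                             # ['X']
print(server.output_names)                            # ['label', 'probabilities']
print(len(server.predict_raw([[1.0, 2.0], [3.0, 4.0]])))  # 2
```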
- predict(X)
Generate predictions from the ONNX model.
For classifiers exported via skl2onnx, the first output is typically the predicted label. For regressors, it is the predicted value.
- Parameters:
X (ndarray) – Input feature array of shape (n_samples, n_features).
- Return type:
- Returns:
Predictions of shape (n_samples,) for single-output models, or (n_samples, n_outputs) for multi-output.
Examples
>>> server = ModelServer("model.onnx")
>>> preds = server.predict(X_test)
>>> preds.shape
(100,)
- predict_proba(X)
Generate class probability estimates from the ONNX model.
For classifiers exported via skl2onnx, the second output is typically the probability array. If only one output exists, it is returned directly.
- Parameters:
X (ndarray) – Input feature array of shape (n_samples, n_features).
- Return type:
- Returns:
Probability array of shape (n_samples, n_classes).
- Raises:
RuntimeError – If the model does not produce probability outputs.
Examples
>>> server = ModelServer("model.onnx")
>>> proba = server.predict_proba(X_test)
>>> proba.shape
(100, 2)
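The output-selection convention described for predict and predict_proba (first output for labels, second for probabilities, single output returned directly) can be sketched as below. The function names are hypothetical. The handling of a list-of-dicts probability output reflects the ZipMap format that skl2onnx classifiers commonly emit under default conversion options; whether ModelServer performs this conversion itself is an assumption.

```python
def select_labels(outputs):
    """Labels (or regression values) are taken from the first output."""
    return outputs[0]


def select_proba(outputs):
    """Probabilities come from the second output, or the only one if single."""
    proba = outputs[1] if len(outputs) > 1 else outputs[0]
    # ZipMap-style output: one {class_label: probability} dict per sample.
    if len(proba) > 0 and isinstance(proba[0], dict):
        classes = sorted(proba[0])
        return [[row[c] for c in classes] for row in proba]
    return proba


# Raw outputs as a classifier with ZipMap enabled might return them.
raw = [[0, 1], [{0: 0.9, 1: 0.1}, {0: 0.2, 1: 0.8}]]
print(select_labels(raw))  # [0, 1]
print(select_proba(raw))   # [[0.9, 0.1], [0.2, 0.8]]

# A model with a single output is passed through unchanged.
print(select_proba([[[0.5, 0.5]]]))  # [[0.5, 0.5]]
```

When the outputs do not follow this layout, predict_raw is the safer entry point, since it returns every output without interpretation.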
- predict_raw(X)
Return all raw ONNX model outputs without post-processing.
Useful for debugging or when the model has non-standard outputs.
- Parameters:
X (ndarray) – Input feature array of shape (n_samples, n_features).
- Return type:
- Returns:
List of all output arrays from the ONNX model.
Examples
>>> server = ModelServer("model.onnx")
>>> outputs = server.predict_raw(X_test)
>>> len(outputs)
2