Python API
This page summarizes the main Python entry points in OmniGBDT. For installation and runnable examples, see Installation and Examples.
Common data requirements
Feature arrays should be float64 and two-dimensional with shape (n_samples, n_features).
Multi-output labels should be float64 or int32 and two-dimensional with shape (n_samples, out_dim).
Single-output labels should be contiguous float64 or int32 one-dimensional arrays.
When slicing one column out of a 2D label matrix, use np.ascontiguousarray(...) before passing it to SingleOutputGBDT.
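The requirements above can be illustrated with NumPy alone. This is a minimal sketch on synthetic data; the array sizes and variable names are illustrative assumptions, not part of OmniGBDT.

```python
import numpy as np

rng = np.random.default_rng(0)

# Features: 2D float64 with shape (n_samples, n_features).
X = np.ascontiguousarray(rng.normal(size=(100, 5)), dtype=np.float64)

# Multi-output labels: 2D float64 with shape (n_samples, out_dim).
Y = np.ascontiguousarray(rng.normal(size=(100, 3)), dtype=np.float64)

# Slicing one column out of a C-ordered 2D array gives a strided view,
# which is not contiguous in memory.
y_view = Y[:, 1]

# Copy the column into a contiguous 1D array before handing it to
# SingleOutputGBDT.
y_single = np.ascontiguousarray(y_view)

print(y_view.flags["C_CONTIGUOUS"], y_single.flags["C_CONTIGUOUS"])
```

The column view shares memory with Y, while y_single is an independent contiguous copy.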
Core models
MultiOutputGBDT
- class MultiOutputGBDT(lib=None, out_dim=1, params=None)
Multi-output boosted tree model.
- Parameters:
MultiOutputGBDT is the main entry point to learn multiple outputs jointly.
When params["base_score"] is left as None with loss=b"mse", the initial prediction is inferred from the training-label mean for each output column.
When params["deterministic"] is True, repeated CPU runs on the same platform are intended to be repeatable for a fixed num_threads setting.
- set_data(train_set=None, eval_set=None)
Register training and optional evaluation data.
train_set and eval_set are tuples of (X, y).
X must be a 2D float64 array.
y may be None or a 2D float64/int32 array with one column per output.
- train(num, objective=None, eval_metric=None, maximize=None)
Train the model for num boosting rounds.
When objective is omitted, OmniGBDT uses the built-in native loss from params["loss"].
When objective is provided, it must return (grad, hess) from the current prediction matrix and label matrix.
For MultiOutputGBDT, those callback arrays are 2D with shape (n_samples, out_dim).
eval_metric may be used to report a scalar metric for the train and eval splits during custom-objective training.
If early_stop > 0 and evaluation labels are registered on the custom-objective path, then eval_metric and maximize must also be provided.
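As a rough illustration of the callback contract, the sketch below writes a squared-error objective and an RMSE metric for 2D arrays. The argument order (preds, labels) and the metric signature are assumptions made for this example; check them against the installed version before relying on them.

```python
import numpy as np


def mse_objective(preds, labels):
    """Gradient and hessian of 0.5 * (pred - label)**2, element-wise.

    Assumed call signature: (preds, labels), both shaped
    (n_samples, out_dim).
    """
    preds = np.asarray(preds, dtype=np.float64)
    labels = np.asarray(labels, dtype=np.float64)
    grad = preds - labels          # first derivative w.r.t. the prediction
    hess = np.ones_like(preds)     # second derivative is constant 1
    return grad, hess


def rmse_metric(preds, labels):
    """Scalar root-mean-squared error over all outputs (assumed signature)."""
    diff = np.asarray(preds, dtype=np.float64) - np.asarray(labels, dtype=np.float64)
    return float(np.sqrt(np.mean(diff ** 2)))
```

Under those assumptions, the pair would be passed as train(num, objective=mse_objective, eval_metric=rmse_metric, maximize=False), where maximize=False is assumed to mean that a lower metric is better.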
- predict(x, num_trees=0)
Predict on a 2D float64 feature matrix.
When num_trees == 0, all learned trees are used.
Returns a 2D array with shape (n_samples, out_dim).
- dump(path)
Write the learned model to a text file.
path accepts str, bytes, and pathlib.Path.
- load(path)
Load a text-dumped model from disk.
path accepts str, bytes, and pathlib.Path.
- _set_gh(g, h)
Set gradient and hessian arrays for the next call to boost(). This is an advanced escape hatch for manual custom-loss workflows.
- _set_label(x, is_train)
Replace labels for the training or evaluation dataset without rebuilding the feature binning.
- boost()
Grow a single tree after calling _set_gh(...).
- close()
Release the underlying native model explicitly. This is optional, but useful in longer-running scripts.
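The manual workflow built on _set_gh(...) and boost() might look like the sketch below. The helper name manual_mse_boost is hypothetical. It assumes the model has already received its training data through set_data, that predict(X) reflects the trees grown so far, and that _set_gh accepts arrays shaped like the label matrix. None of these assumptions is confirmed by this page.

```python
import numpy as np


def manual_mse_boost(model, X, Y, rounds):
    """Grow `rounds` trees by supplying squared-error gradients by hand.

    `model` is expected to offer predict, _set_gh and boost as described
    above; `X` and `Y` are the training features and 2D labels.
    """
    for _ in range(rounds):
        # Assumption: predict() returns the current ensemble prediction.
        preds = np.asarray(model.predict(X), dtype=np.float64)
        grad = np.ascontiguousarray(preds - Y, dtype=np.float64)
        hess = np.ones_like(grad)
        model._set_gh(grad, hess)  # stage gradients for the next tree
        model.boost()              # grow exactly one tree from them
    return model
```

Calling predict on the full training set every round is the simplest way to obtain current predictions using only the methods documented here, though it may not be the cheapest.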
SingleOutputGBDT
- class SingleOutputGBDT(lib=None, out_dim=1, params=None)
Single-output boosted tree model.
- Parameters:
SingleOutputGBDT can be used to train one model per target column as a simple baseline.
When params["base_score"] is left as None with loss=b"mse", the initial prediction is inferred from the training-label mean.
When params["deterministic"] is True, repeated CPU runs on the same platform are intended to be repeatable for a fixed num_threads setting.
- set_data(train_set=None, eval_set=None)
Register training and optional evaluation data.
train_set and eval_set are tuples of (X, y), where X is a 2D float64 array and y is typically a contiguous 1D float64 or int32 array.
- train(num, objective=None, eval_metric=None, maximize=None)
Train a single-output model for num boosting rounds.
When objective is omitted, OmniGBDT uses the built-in native loss from params["loss"].
When objective is provided, it must return (grad, hess) from the current prediction vector and label vector.
For SingleOutputGBDT, callback arrays are 1D with shape (n_samples,).
The custom-objective path is only supported for the normal out_dim == 1 workflow.
If early_stop > 0 and evaluation labels are registered on the custom-objective path, then eval_metric and maximize must also be provided.
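For the 1D case, a binary logistic objective paired with an accuracy metric shows a situation where a higher metric is better. As before, the (preds, labels) argument order and the metric signature are assumptions for illustration, and the predictions are treated as raw scores (logits).

```python
import numpy as np


def logistic_objective(preds, labels):
    """Gradient and hessian of the binary log loss w.r.t. raw scores.

    Assumed call signature: (preds, labels), both shaped (n_samples,),
    with labels in {0, 1}.
    """
    preds = np.asarray(preds, dtype=np.float64)
    labels = np.asarray(labels, dtype=np.float64)
    prob = 1.0 / (1.0 + np.exp(-preds))  # sigmoid of the raw score
    grad = prob - labels
    hess = prob * (1.0 - prob)
    return grad, hess


def accuracy_metric(preds, labels):
    """Fraction of samples whose score sign matches the 0/1 label."""
    predicted = np.asarray(preds) > 0.0
    actual = np.asarray(labels) > 0.5
    return float(np.mean(predicted == actual))
```

Under those assumptions, the call would be train(num, objective=logistic_objective, eval_metric=accuracy_metric, maximize=True).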
- predict(x, num_trees=0)
Predict on a 2D float64 feature matrix.
With out_dim == 1, the return value is a 1D array.
With out_dim > 1, the return value is shaped as (n_samples, out_dim).
- train_multi(num)
Legacy helper used by the original code for multiclass-classification-style workflows.
- reset()
Clear learned trees and reset predictions back to the resolved base score.
- close()
Release the underlying native model explicitly.
Optional sklearn wrappers
The sklearn-compatible wrappers are optional and require the sklearn extra:
pip install "omnigbdt[sklearn]"
This is a fork-specific addition intended to make OmniGBDT work with sklearn tooling such as sklearn.inspection.permutation_importance.
SingleOutputGBDTRegressor
- class SingleOutputGBDTRegressor(...)
sklearn-compatible single-target regressor wrapper around SingleOutputGBDT.
It exposes fit(...), predict(...), and score(...) for use with tools such as sklearn.inspection.permutation_importance.
Its constructor also accepts objective=None, eval_metric=None, maximize=None, and deterministic=True and forwards them to SingleOutputGBDT.
MultiOutputGBDTRegressor
- class MultiOutputGBDTRegressor(...)
sklearn-compatible multi-output regressor wrapper around MultiOutputGBDT.
It exposes fit(...), predict(...), and score(...) for sklearn-style multi-output workflows.
Its constructor also accepts objective=None, eval_metric=None, maximize=None, and deterministic=True and forwards them to MultiOutputGBDT.
Utilities
load_lib
- load_lib(path=None)
Load the compiled native library and return a configured ctypes handle.
path may be omitted, in which case the packaged native library is loaded automatically; a direct path to the compiled library file; or a directory that contains the compiled library.
Most users do not need to call this directly.
Verbosity
- class Verbosity
Small enum-like helper for training output levels.
Verbosity.SILENT: no native training output
Verbosity.SUMMARY: only the final best score when evaluation data is present
Verbosity.FULL: per-round metrics plus the final best score
create_graph
- create_graph(file_name, tree_index=0, value_list=None)
Build a graphviz.Digraph from a dumped text model.
This helper is optional and requires the plotting dependency:
pip install "omnigbdt[plot]"
- Parameters:
file_name – path to a text model dump
tree_index (int) – zero-based tree index
value_list – optional list of output indices to display in leaf nodes