Parameters
This page describes the Python parameter dictionary used by SingleOutputGBDT and MultiOutputGBDT in this fork. Unless noted otherwise, defaults come from omnigbdt.lib_utils.default_params(). Several defaults intentionally differ from the original package; see Differences From Upstream.
General
- loss: default = b"mse", type = bytes
  - Supported values are b"mse", b"bce", b"ce", and b"ce_column"
  - b"ce_column" is only relevant to legacy SingleOutputGBDT classification-style workflows
  - The Python API expects a byte string, for example b"mse"
  - when training with a custom objective=..., the native booster still requires loss to be a supported built-in value at construction time, but custom rounds will use the custom callback instead of the built-in objective
- verbosity: default = Verbosity.FULL (2), type = Verbosity or int
  - Verbosity.SILENT / 0 prints nothing from the native trainer
  - Verbosity.SUMMARY / 1 prints only the final best score when evaluation data is present
  - Verbosity.FULL / 2 prints per-round metrics and the final best score
- verbose: default = True, type = bool
  - Backward-compatible alias for the old two-level behavior
  - False maps to Verbosity.SILENT
  - True maps to Verbosity.FULL
- num_threads: default = 2, type = int
  - Number of training threads
- deterministic: default = True, type = bool
  - Enables the documented fixed-thread CPU repeatability mode on the same platform
  - The current packaged CPU implementation already uses deterministic split selection
- seed: default = 0, type = int
  - Retained for API compatibility
  - The current deterministic CPU training path does not actively use randomness during tree growth
- hist_cache: default = 16, type = int
  - Maximum number of histogram caches
- max_bins: default = 128, type = int
  - Maximum number of bins for each input feature
- topk: default = 0, type = int
  - Sparse split-finding parameter
  - If 0, the dense split-search path is used
- one_side: default = True, type = bool
  - Selects the sparse split-search variant
  - Only used when topk != 0
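As a rough illustration of the General options, the sketch below builds a plain dictionary by hand. It deliberately does not import the package, so it runs anywhere. The variable names general_params and loss_name are illustrative only, and verbosity is given as an int rather than a Verbosity member.

```python
# Sketch of a params dictionary covering the General options.
# Values mirror the documented defaults unless a comment says otherwise.
general_params = {
    "loss": b"mse",         # must be bytes, not str
    "verbosity": 1,         # int form of Verbosity.SUMMARY; the documented default is 2
    "num_threads": 2,
    "deterministic": True,
    "seed": 0,              # retained for API compatibility
    "hist_cache": 16,
    "max_bins": 128,
    "topk": 0,              # 0 selects the dense split-search path
    "one_side": True,       # only consulted when topk != 0
}

# A plain str is an easy mistake to make; encode it before storing it.
loss_name = "bce"
general_params["loss"] = loss_name.encode("ascii")
```

In practice these keys would typically be layered on top of the dictionary returned by omnigbdt.lib_utils.default_params(); that step is left out here so the snippet stays self-contained.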
Tree
- max_depth: default = 4, type = int
  - Maximum tree depth
  - Must be at least 1
- max_leaves: default = 32, type = int
  - Maximum number of leaves per tree
- min_samples: default = 20, type = int
  - Minimum number of samples allowed in a leaf
- early_stop: default = 15, type = int
  - Early-stopping patience in rounds
  - If no evaluation labels are registered, early stopping stays inactive
Learning
- lr: default = 0.05, type = float
  - Learning rate
- base_score: default = None, type = None | float | sequence of floats
  - None enables automatic regression mean initialization
  - SingleOutputGBDT resolves one scalar base score
  - MultiOutputGBDT accepts either one scalar or one value per output column
- reg_l1: default = 0.0, type = float
  - L1 regularization term
  - The upstream code notes that this is not currently used for sparse split finding
- reg_l2: default = 1.0, type = float
  - L2 regularization term
- gamma: default = 1e-3, type = float
  - Minimum objective gain required for a split
  - Applies to the root split as well as deeper nodes
- subsample: default = 1.0, type = float
  - Present in the Python defaults for compatibility
  - The current native implementation does not actively use it
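The three accepted forms of base_score can be pictured with a small NumPy helper. This is only a sketch: resolve_base_score is a hypothetical name, not part of the package, and treating None as "per-column label mean" for multi-output labels is an assumption about how the automatic mean initialization generalizes.

```python
import numpy as np

def resolve_base_score(base_score, y, out_dim):
    """Hypothetical helper: expand base_score to one value per output column."""
    y = np.asarray(y, dtype=np.float64)
    y = y.reshape(len(y), -1)  # (n_samples,) becomes (n_samples, 1)
    if base_score is None:
        # Assumed meaning of automatic mean initialization: label mean per column.
        return y.mean(axis=0)
    values = np.atleast_1d(np.asarray(base_score, dtype=np.float64)).ravel()
    if values.size == 1:
        # One scalar is broadcast to every output column.
        return np.full(out_dim, values[0])
    if values.size != out_dim:
        raise ValueError("base_score needs one value or one value per output column")
    return values

Y = np.array([[1.0, 10.0], [3.0, 30.0]])
auto_score = resolve_base_score(None, Y, out_dim=2)
scalar_score = resolve_base_score(0.5, Y, out_dim=2)
per_column_score = resolve_base_score([1.0, 2.0], Y, out_dim=2)
```

For a single-output model the same helper with out_dim=1 yields a one-element array, which corresponds to the single scalar that SingleOutputGBDT resolves.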
Training call hooks
The public callback hooks live on train(...) rather than inside the params dictionary:
- train(num, objective=None, eval_metric=None, maximize=None)
  - available on SingleOutputGBDT and MultiOutputGBDT
  - objective(preds, y_true) must return (grad, hess)
  - eval_metric(preds, y_true) must return a scalar float
  - maximize controls whether larger evaluation metric values are better
Shape rules:
- SingleOutputGBDT.train(..., objective=...) uses 1D prediction and label arrays shaped (n_samples,)
- MultiOutputGBDT.train(..., objective=...) uses 2D prediction and label arrays shaped (n_samples, out_dim)
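A minimal pair of callbacks that satisfies the contract and shape rules above might look like the following. The names mse_objective and rmse_metric are illustrative, and the use of float64 is an assumption, since the dtype expected by the native side is not stated here. Both functions are element-wise, so the same code handles the 1D and 2D cases.

```python
import numpy as np

def mse_objective(preds, y_true):
    """Return (grad, hess) of 0.5 * (preds - y_true)**2, element by element."""
    preds = np.asarray(preds, dtype=np.float64)
    y_true = np.asarray(y_true, dtype=np.float64)
    grad = preds - y_true          # first derivative with respect to preds
    hess = np.ones_like(preds)     # second derivative is constant 1
    return grad, hess

def rmse_metric(preds, y_true):
    """Return root mean squared error as a plain Python float."""
    diff = np.asarray(preds, dtype=np.float64) - np.asarray(y_true, dtype=np.float64)
    return float(np.sqrt(np.mean(diff ** 2)))

# 1D arrays, as used on the SingleOutputGBDT path.
g1, h1 = mse_objective(np.array([0.0, 1.0, 2.0]), np.array([1.0, 1.0, 0.0]))

# 2D arrays, as used on the MultiOutputGBDT path.
p2, y2 = np.zeros((3, 2)), np.ones((3, 2))
g2, h2 = mse_objective(p2, y2)
score = rmse_metric(p2, y2)
```

Given a configured model, these would be passed as train(num, objective=mse_objective, eval_metric=rmse_metric, maximize=False), since a lower RMSE is better.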
Custom early stopping:
- if early_stop > 0 and evaluation labels are registered on the custom-objective path, then eval_metric and maximize must also be provided
- the protected _set_gh(...) plus boost() workflow remains available for advanced manual control
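The reason maximize has to accompany a custom eval_metric is that patience-based early stopping needs to know which direction counts as an improvement. The function below is a generic sketch of that logic, not the native trainer's code; best_round_with_patience is a hypothetical name, and details such as strict versus non-strict improvement are assumptions.

```python
def best_round_with_patience(scores, patience, maximize):
    """Return (best_round, last_round_run) for a list of per-round metric values."""
    best_round, best_score = None, None
    for rnd, score in enumerate(scores):
        if best_score is None:
            improved = True
        elif maximize:
            improved = score > best_score
        else:
            improved = score < best_score
        if improved:
            best_round, best_score = rnd, score
        elif patience > 0 and rnd - best_round >= patience:
            # No improvement for `patience` consecutive rounds: stop here.
            return best_round, rnd
    return best_round, len(scores) - 1

scores = [1.0, 0.8, 0.9, 0.85, 0.95]
lower_is_better = best_round_with_patience(scores, patience=2, maximize=False)
higher_is_better = best_round_with_patience(scores, patience=2, maximize=True)
```

The same score sequence stops at different rounds depending on maximize, which is why the direction cannot be inferred from the metric callback alone.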
Model-specific notes
- MultiOutputGBDT expects multi-output labels shaped like (n_samples, out_dim)
- SingleOutputGBDT is best used with one target column at a time
- for a comparison with a multi-output baseline using SingleOutputGBDT, train one model per target column and stack their predictions manually
- SingleOutputGBDT.train_multi(...) is a legacy helper for multi-class classification style workflows, not the common baseline path used in this fork’s examples
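The per-column baseline described above can be sketched as a loop over target columns followed by a column stack. The fit_predict callable is a hypothetical stand-in for whatever code constructs, trains, and queries one SingleOutputGBDT; mean_baseline exists only so the example runs without the package installed.

```python
import numpy as np

def stack_per_column(fit_predict, X_train, Y_train, X_test):
    """Train one single-output model per target column and stack the predictions.

    fit_predict(X_train, y_train, X_test) is a hypothetical callable that must
    return a 1D array of shape (n_test,).
    """
    Y_train = np.asarray(Y_train)
    columns = [
        np.asarray(fit_predict(X_train, Y_train[:, j], X_test))
        for j in range(Y_train.shape[1])
    ]
    # Result has shape (n_test, out_dim), matching the multi-output layout.
    return np.column_stack(columns)

def mean_baseline(X_train, y_train, X_test):
    """Stand-in model: predicts the training mean of the column."""
    return np.full(len(X_test), y_train.mean())

X_train = np.zeros((4, 3))
Y_train = np.array([[1.0, 10.0], [3.0, 30.0], [1.0, 10.0], [3.0, 30.0]])
X_test = np.zeros((2, 3))
stacked = stack_per_column(mean_baseline, X_train, Y_train, X_test)
```

The stacked array can then be compared directly against MultiOutputGBDT predictions of shape (n_samples, out_dim).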