Python API (sklearn-style)
==========================

:class:`~robert.api.RobertModel` runs the full ROBERT workflow (CURATE, GENERATE, VERIFY,
PREDICT) and exposes ``fit`` / ``predict`` / ``score`` on :class:`pandas.DataFrame` or
:class:`numpy.ndarray` inputs.

PREDICT writes CSV columns aligned with the pipeline:

- ``{y}_pred``: point prediction from the
  **selected estimator refit on all training data** (deployment-style mean).
- ``{y}_pred_sd``: per-row
  **standard deviation across repeated cross-validation predictions**
  (disagreement between refits on overlapping training folds; related to epistemic
  instability, not a calibrated predictive distribution).
- ``{y}_pred_conformal_hw`` (**regression only**): a single
  **symmetric interval half-width** from split-style conformal calibration (absolute
  residuals on a held-out calibration slice of the training set when large enough,
  otherwise residuals vs CV out-of-fold means). Because the reported point predictor is
  refit on **all** training data, finite-sample **coverage is approximate** at the nominal
  ``conformal_coverage`` (default 0.9); tune with ``conformal_enable``,
  ``conformal_calib_frac``, and ``conformal_coverage`` in
  :class:`~robert.api.RobertModel` kwargs. For **classification**, this column is present
  but filled with NaN; ``{y}_pred_sd`` reflects **vote spread** across CV refits, not
  class probabilities.

``predict`` returns ``{y}_pred`` values aligned to input rows. Uncertainty:

- ``return_std=True`` is equivalent to ``return_uncertainty="cv_sd"`` and returns
  ``(y, sd_cv)``.
- ``return_uncertainty="conformal"`` (regression only) returns ``(y, half_width)``.
- ``return_uncertainty="both"`` returns ``(y, sd_cv, half_width)``.
- If both ``return_std`` and ``return_uncertainty`` are set, ``return_uncertainty`` wins
  and a warning is issued.

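
The two uncertainty quantities measure different things, and a small sketch may make the
distinction concrete. The code below is an illustration only, not ROBERT's internal
implementation: ``cv_prediction_sd`` and ``split_conformal_half_width`` are hypothetical
helper names, the population standard deviation (``ddof=0``) is an assumption, and the
quantile rule shown is the textbook split-conformal one, which may differ in detail from
what ROBERT computes.

.. code-block:: python

   import math

   import numpy as np


   def cv_prediction_sd(preds_per_refit):
       """Per-row spread of predictions across CV refits.

       ``preds_per_refit`` has shape (n_refits, n_rows). Assumption: ddof=0.
       """
       preds = np.asarray(preds_per_refit, dtype=float)
       return np.std(preds, axis=0)


   def split_conformal_half_width(y_calib, y_calib_pred, coverage=0.9):
       """Single symmetric half-width from absolute calibration residuals.

       Textbook split-conformal rule: take the k-th smallest absolute
       residual with k = ceil((n + 1) * coverage). If k exceeds n, the
       calibration set is too small and the half-width is infinite.
       """
       residuals = np.sort(
           np.abs(np.asarray(y_calib, dtype=float) - np.asarray(y_calib_pred, dtype=float))
       )
       n = residuals.size
       if n == 0:
           raise ValueError("calibration set is empty")
       k = math.ceil((n + 1) * coverage)
       if k > n:
           return float("inf")
       return float(residuals[k - 1])

Under these assumptions, the interval for a row is ``y_pred - half_width`` to
``y_pred + half_width``, with the same half-width for every row, whereas the
cross-validation standard deviation varies row by row.
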
Pipeline semantics
------------------

- **Single high-level estimator.** Encoding and curation happen in CURATE; training
  matrices are scaled inside ROBERT (``StandardScaler`` on the design matrix in
  ``prepare_sets``). This is not a composable sklearn ``Pipeline`` of separate
  ``TransformerMixin`` steps on :class:`~robert.api.RobertModel` itself.
- **Do not** stack another ``StandardScaler`` (or similar) in front of the same raw
  descriptor table unless you know exactly how it interacts with CURATE outputs; you would
  usually double-scale or break column semantics.
- **Row order.** ``predict`` returns one value per input row, aligned to ``X`` even if
  ROBERT writes prediction CSVs in a different row order (alignment uses the names column
  from CURATE).

Matplotlib
----------

During ``fit`` and ``predict``, Matplotlib is switched to the non-interactive ``Agg``
backend so plotting does not require a GUI; the prior backend is restored afterward.
Figures are still written under ``workdir`` like the CLI workflow.

Example
-------

.. code-block:: python

   from robert import RobertModel
   import pandas as pd

   df = pd.read_csv("Robert_example.csv")
   X = df.drop(columns=["Target_values"])
   y = df["Target_values"]

   model = RobertModel(
       problem_type="reg",
       workdir="./robert_run",
       model=["RF"],
       n_iter=2,
       init_points=2,
   )
   model.fit(X.iloc[:25], y.iloc[:25])

   preds = model.predict(X.iloc[25:])
   preds, sd_cv = model.predict(X.iloc[25:], return_std=True)
   preds2, hw = model.predict(X.iloc[25:], return_uncertainty="conformal")
   preds3, sd_cv2, hw2 = model.predict(X.iloc[25:], return_uncertainty="both")
   r2 = model.score(X.iloc[25:], y.iloc[25:])

.. autoclass:: robert.api.RobertModel
   :members:
   :no-inherited-members:
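
As an illustration of the row-order guarantee described under Pipeline semantics, the
sketch below shows one way predictions stored in a different row order could be realigned
to the input by a names column. It is a hedged example, not ROBERT's code:
``align_predictions`` is a hypothetical helper, and the column names ``Name`` and
``Target_values_pred`` in the usage note are assumptions chosen for the example.

.. code-block:: python

   import pandas as pd


   def align_predictions(input_names, pred_table, names_col, pred_col):
       """Return predictions ordered like ``input_names``.

       ``pred_table`` may list rows in any order; rows are matched on
       ``names_col``, which must hold unique identifiers.
       """
       by_name = pred_table.set_index(names_col)[pred_col]
       if by_name.index.has_duplicates:
           raise ValueError("names column must be unique to align rows")
       missing = set(input_names) - set(by_name.index)
       if missing:
           raise KeyError(f"no prediction found for: {sorted(missing)}")
       # reindex reorders the predictions to follow the input row order
       return by_name.reindex(list(input_names)).to_numpy()

For instance, with a table whose ``Name`` column reads ``c, a, b`` and whose
``Target_values_pred`` column reads ``3.0, 1.0, 2.0``, calling
``align_predictions(["a", "b", "c"], table, "Name", "Target_values_pred")`` yields the
values in input order, ``1.0, 2.0, 3.0``.
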