Builders and Pipeline

Abacus exposes one public YAML builder and one structured pipeline runner.

Use these surfaces when you want configuration-driven model construction or a staged run directory with machine-readable artefacts.

YAML builder

Import path:

from abacus.mmm.builders.yaml import build_mmm_from_yaml

Call form (X and y are optional and are passed here as keyword arguments):

model = build_mmm_from_yaml(
    config_path,
    X=X,
    y=y,
    model_kwargs=None,
    holidays_path=None,
)

Main inputs:

Argument        Meaning
config_path     YAML file path
X               Optional pre-loaded feature data
y               Optional pre-loaded target data
model_kwargs    Model init overrides
holidays_path   Optional holiday CSV override

build_mmm_from_yaml returns a built PanelMMM.

The builder orchestrates:

  • model construction
  • optional additive effects
  • holiday augmentation
  • build_model(X, y)
  • optional original_scale_vars
  • optional calibration steps
  • optional inference-data attachment
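
As a minimal sketch of calling the builder, the wrapper below forwards a config path and optional pre-loaded data to build_mmm_from_yaml. The wrapper name build_from_config is hypothetical. The assumption that X and y can be left as None when they are not pre-loaded is inferred from the "optional" wording in the argument table, not confirmed against the library.

```python
def build_from_config(config_path, X=None, y=None, holidays_path=None):
    """Hypothetical convenience wrapper around build_mmm_from_yaml."""
    # Imported lazily so this module can be loaded even where Abacus
    # is not installed; the import path is the one documented above.
    from abacus.mmm.builders.yaml import build_mmm_from_yaml

    # Assumption: None means "not pre-loaded" for X and y.
    return build_mmm_from_yaml(
        config_path,
        X=X,
        y=y,
        model_kwargs=None,
        holidays_path=holidays_path,
    )
```

With feature and target data already in memory, the call would look like model = build_from_config("config.yml", X=X, y=y).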

Structured pipeline runner

Top-level import path:

from abacus.pipeline import PipelineRunConfig, PipelineRunResult, run_pipeline

PipelineRunConfig

PipelineRunConfig is the user-facing run configuration dataclass.

Key fields:

Field                        Meaning
config_path                  YAML config file
output_dir                   Output root for run directories
run_name                     Optional logical run name
dataset_path                 Optional combined dataset CSV
x_path / y_path              Optional separate feature and target CSVs
holidays_path                Optional holiday CSV override
target_column                Optional target-column override
prior_samples                Prior predictive sample count
draws, tune, chains, cores   Sampler overrides
random_seed                  Global random seed override
curve_samples                Curve summary sample count
curve_points                 Curve summary x-axis resolution

It also exposes:

  • effective_run_name()
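
The sampler fields can be grouped and reused across runs. The sketch below collects a small set of overrides for a quick smoke-test run and passes them to PipelineRunConfig. The helper names, the specific values (200 draws, 2 chains, seed 42), and the "runs" output directory are illustrative assumptions. The field names come from the table above, and it is assumed that all of them are accepted as keyword arguments in the same way as config_path and dataset_path.

```python
def quick_run_overrides(seed=42):
    """Hypothetical helper: small sampler settings for a fast smoke test."""
    return {
        "draws": 200,
        "tune": 200,
        "chains": 2,
        "cores": 2,
        "random_seed": seed,
    }


def make_quick_config(config_path, dataset_path, output_dir="runs"):
    """Build a PipelineRunConfig using the overrides above."""
    # Imported lazily so the helper above works without Abacus installed.
    from abacus.pipeline import PipelineRunConfig

    return PipelineRunConfig(
        config_path=config_path,
        dataset_path=dataset_path,
        output_dir=output_dir,
        **quick_run_overrides(),
    )
```

Keeping the overrides in a plain dictionary makes it easy to log them or swap in larger values for a production run.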

run_pipeline(...)

Use run_pipeline(...) to execute the structured runner:

from abacus.pipeline import PipelineRunConfig, run_pipeline

result = run_pipeline(
    PipelineRunConfig(
        config_path="config.yml",
        dataset_path="data.csv",
    )
)

run_pipeline(...) does the following:

  • loads the YAML config
  • loads data from the configured or overridden paths
  • resolves sampler overrides
  • creates the run directory and manifest
  • runs the stage sequence

The stage sequence is:

  1. metadata
  2. preflight
  3. fit
  4. assessment
  5. decomposition
  6. diagnostics
  7. curves
  8. optimisation

PipelineRunResult

PipelineRunResult is a small dataclass with:

Field           Meaning
run_dir         Concrete run directory path
manifest_path   Manifest JSON path
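
Because the manifest is JSON, it can be read back with the standard library. The sketch below uses two hypothetical helpers, load_manifest and list_run_files. It makes no assumption about the manifest's schema and only parses the file and lists what the run directory contains.

```python
import json
from pathlib import Path


def load_manifest(manifest_path):
    """Parse the manifest JSON file and return it as a Python object."""
    with Path(manifest_path).open(encoding="utf-8") as fh:
        return json.load(fh)


def list_run_files(run_dir):
    """Return the sorted names of the entries directly inside run_dir."""
    return sorted(entry.name for entry in Path(run_dir).iterdir())
```

Given a result from run_pipeline(...), these would be called as load_manifest(result.manifest_path) and list_run_files(result.run_dir).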

CLI entry point

The CLI entry point lives in abacus.pipeline.runner:

python -m abacus.pipeline.runner --config config.yml --dataset-path data.csv
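
The same command can be launched from Python, for example from a scheduler or test script. The sketch below builds the argument list using only the two flags shown above. The helper names are hypothetical, and it is an assumption that the CLI returns a non-zero exit code on failure, which is what check=True relies on.

```python
import subprocess
import sys


def build_cli_command(config_path, dataset_path):
    """Return the argument list for the documented CLI invocation."""
    return [
        sys.executable,  # run with the current interpreter
        "-m",
        "abacus.pipeline.runner",
        "--config",
        str(config_path),
        "--dataset-path",
        str(dataset_path),
    ]


def run_cli(config_path, dataset_path):
    """Run the CLI and raise CalledProcessError on a non-zero exit code."""
    return subprocess.run(build_cli_command(config_path, dataset_path), check=True)
```

Passing the arguments as a list avoids shell quoting problems with paths that contain spaces.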

For full CLI usage, see CLI Reference.