Quickstart: Pipeline Runner
Use the pipeline runner when you want a full staged run with artefacts written to disk, rather than only an in-memory model fit.
The runner writes:
- a run manifest
- copied and resolved config files
- fitted model artefacts
- posterior predictive assessment outputs
- decomposition, diagnostics, and response-curve artefacts
Fastest first run: bundled demo
From the repository root, the quickest way to see a real structured run is the demo launcher:
Other bundled demos are:
- geo_panel
- geo_brand_panel
List them explicitly with:
runme.py is a convenience wrapper around the structured pipeline. It resolves
the demo config under data/demo/<demo_name>/config.yml and runs the pipeline
for you.
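The config lookup described above can be sketched in a few lines. This is only an illustration of the data/demo/<demo_name>/config.yml layout, not the code inside runme.py; `demo_config_path` and `available_demos` are hypothetical helper names.

```python
from pathlib import Path


def demo_config_path(repo_root: str, demo_name: str) -> Path:
    """Build the config location for a bundled demo.

    Hypothetical helper: it only mirrors the documented layout
    data/demo/<demo_name>/config.yml and is not part of runme.py.
    """
    return Path(repo_root) / "data" / "demo" / demo_name / "config.yml"


def available_demos(repo_root: str) -> list[str]:
    """List demo names that have a config.yml under data/demo/.

    Hypothetical helper: assumes every demo is a directory containing
    a config.yml file.
    """
    demo_root = Path(repo_root) / "data" / "demo"
    if not demo_root.is_dir():
        return []
    return sorted(
        child.name
        for child in demo_root.iterdir()
        if (child / "config.yml").is_file()
    )
```

Calling `available_demos(".")` from the repository root would return the demo folder names in alphabetical order, provided the layout assumption holds.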
Run the pipeline from Python
The direct Python API is:
If the YAML config already contains data.dataset_path, you do not need to
pass dataset_path again.
Run the thin CLI directly
The pipeline also exposes a thin CLI in abacus.pipeline.runner:
The CLI prints the final run directory when the pipeline completes.
Override data paths
Use one of these patterns:
| Pattern | Arguments |
|---|---|
| Combined dataset override | dataset_path= in Python or --dataset-path in the CLI |
| Separate feature and target files | x_path= and y_path= in Python or --x-path and --y-path in the CLI |
| Target column override | target_column= in Python or --target-column in the CLI |
Configured relative paths are resolved relative to the YAML config directory.
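The resolution rule can be sketched with pathlib. This is a minimal illustration rather than the runner's actual implementation: `resolve_configured_path` is a hypothetical helper, and leaving absolute paths unchanged is an assumption.

```python
from pathlib import Path


def resolve_configured_path(config_file: str, configured: str) -> Path:
    """Resolve a path taken from the YAML against the config file's directory.

    Hypothetical helper for illustration; not part of the runner's API.
    """
    candidate = Path(configured)
    if candidate.is_absolute():
        # Assumption: absolute paths are used exactly as given.
        return candidate
    # Relative paths are anchored at the directory that contains the YAML
    # config, not at the current working directory.
    return Path(config_file).parent / candidate
```

For example, with a config at data/demo/geo_panel/config.yml, a configured value of dataset.csv (a made-up file name) would resolve to data/demo/geo_panel/dataset.csv regardless of where the command is launched from.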
If you want Stage 50 to use different warn/fail cutoffs, add a runner-only
diagnostics.thresholds block to the YAML. See
YAML Configuration.
What you get back
run_pipeline(...) returns a PipelineRunResult with:
- run_dir
- manifest_path
The output directory contains stage folders such as:
- 00_run_metadata
- 20_model_fit
- 30_model_assessment
- 50_diagnostics
- 60_response_curves
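To inspect a finished run, the stage folders can be listed in stage order. The sketch below assumes that stage folders follow a two-digit-prefix naming pattern, as the names above suggest; `list_stage_dirs` is a hypothetical helper, not part of the pipeline.

```python
import re
from pathlib import Path

# Assumption: stage folders are named "<two-digit number>_<stage name>".
STAGE_PATTERN = re.compile(r"^(\d{2})_(.+)$")


def list_stage_dirs(run_dir: str) -> list[tuple[int, str]]:
    """Return (stage_number, folder_name) pairs sorted by stage number.

    Hypothetical helper: entries that are not directories or do not match
    the assumed naming pattern are ignored.
    """
    stages = []
    for child in Path(run_dir).iterdir():
        match = STAGE_PATTERN.match(child.name)
        if child.is_dir() and match:
            stages.append((int(match.group(1)), child.name))
    return sorted(stages)
```

Passing the run_dir value from the returned PipelineRunResult (converted to a string if needed) would give a quick overview of which stages produced output.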
60_response_curves now includes three complementary curve families:
- saturation-only transformation artefacts
- forward-pass direct contribution artefacts built from scaled observed history
- adstock carryover artefacts
When to use the runner
Choose the runner when you want:
- a reproducible run directory on disk
- structured metadata and manifest files
- staged artefacts for diagnostics and reporting
- a config-driven workflow for repeated runs
If you only need to fit a model interactively in a notebook or script, start with Quickstart: Python API or Quickstart: YAML Builder.