Post-Modeling

Use this section after fitting PanelMMM.

It covers posterior predictive checks, diagnostics, contribution analysis, response curves, efficiency metrics, and the tabular summary surfaces that Abacus exposes from fitted InferenceData.

Pages

  • Posterior Predictive: Sample fitted or future predictions and compare them with observed data where available.
  • Diagnostics: Run design-matrix, MCMC, and predictive diagnostics and export machine-readable reports.
  • Contributions and Decomposition: Inspect channel, baseline, control, seasonality, and event contributions.
  • Response Curves: Sample and summarise posterior saturation and adstock curves, and understand the runner’s forward-pass direct contribution artefacts.
  • ROAS and Metrics: Calculate ROAS, CPA-style metrics, spend tables, and predictive error metrics.
  • Summary and Export: Work with MMMSummaryFactory, HDI settings, time aggregation, and DataFrame export.

Subsections of Post-Modeling

Diagnostics

Abacus exposes diagnostics through mmm.diagnostics.

Use this surface to check the design matrix, posterior sampling quality, and posterior predictive fit. For fitted-value plots and predictive sampling, see Posterior Predictive.

Diagnostic surfaces

mmm.diagnostics provides three groups of checks.

| Area | Summary method | Report method | What it covers |
|---|---|---|---|
| Raw input screening | design_summary(X) | design_report(X) | Collinearity, constants, and near-constant regressors on raw input columns |
| MCMC | mcmc_summary() | mcmc_report() | r_hat, ESS, divergences, BFMI, tree depth, acceptance rate |
| Predictive | predictive_summary() | predictive_report() | RMSE, MAE, NRMSE, NMAE, CRPS, residual moments |

The summary methods return pandas DataFrames. The report methods return typed report objects with a to_dict() method for JSON-ready export.

Raw input screening

Use design_summary(X) on the raw design matrix you want to inspect:

design = mmm.diagnostics.design_summary(X)

By default, Abacus checks:

  • all channel_columns
  • all control_columns, when present

You can limit the check to specific variables:

design = mmm.diagnostics.design_summary(
    X,
    variables=["tv", "search", "price_index"],
    vif_threshold=10.0,
    near_constant_threshold=0.99,
)

The returned table includes:

  • variable
  • mean
  • std
  • n_unique
  • dominant_share
  • is_constant
  • is_near_constant
  • vif
  • high_vif
  • max_abs_corr

design_report(X) returns a compact roll-up with matrix rank, condition number, maximum VIF, maximum absolute correlation, and lists of flagged variables.
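As a rough illustration of the near-constant check, the sketch below assumes that dominant_share is the fraction of rows taken by a column's most frequent value, and that a column is flagged once this share reaches near_constant_threshold. Both definitions are assumptions about how Abacus computes these columns, and screen_column is a hypothetical helper rather than part of the library.

```python
from collections import Counter

def screen_column(values, near_constant_threshold=0.99):
    """Hypothetical helper: summarise how dominated a column is by one value."""
    counts = Counter(values)
    # Assumed definition: share of rows taken by the most frequent value.
    dominant_share = counts.most_common(1)[0][1] / len(values)
    return {
        "n_unique": len(counts),
        "dominant_share": dominant_share,
        "is_constant": len(counts) == 1,
        # Assumed rule: flag when the dominant share reaches the threshold.
        "is_near_constant": dominant_share >= near_constant_threshold,
    }

promo_flag = [0.0] * 199 + [1.0]                 # one promotion week in 200 rows
tv_spend = [float(i % 50) for i in range(200)]   # 50 distinct spend levels

promo_summary = screen_column(promo_flag)
tv_summary = screen_column(tv_spend)
```

In this toy data the promotion flag is not constant, but a single value covers 99.5% of its rows, so it would be flagged as near-constant under the assumed rule.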

Screening requirements

Raw input screening requires:

  • all requested columns to exist in X
  • all checked columns to be numeric

Abacus raises a ValueError if a variable is missing or non-numeric.

The method names remain design_summary() and design_report(), but the pipeline now treats them as raw input screening rather than as checks on the transformed model geometry.

MCMC diagnostics

Use mcmc_summary() after fitting:

mcmc = mmm.diagnostics.mcmc_summary(
    rhat_threshold=1.01,
    ess_threshold=400.0,
)

The summary comes from arviz.summary(..., kind="diagnostics") and adds flag columns such as:

  • high_rhat
  • low_ess_bulk
  • low_ess_tail
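The flag columns can be read as simple threshold comparisons. The sketch below assumes strict comparisons (r_hat above the threshold, ESS below it) and uses made-up parameter names and values; add_mcmc_flags is a hypothetical helper, and the exact rule Abacus applies may differ.

```python
def add_mcmc_flags(rows, rhat_threshold=1.01, ess_threshold=400.0):
    """Hypothetical helper: add boolean flag fields to per-parameter diagnostics."""
    flagged = []
    for row in rows:
        flagged.append({
            **row,
            "high_rhat": row["r_hat"] > rhat_threshold,
            "low_ess_bulk": row["ess_bulk"] < ess_threshold,
            "low_ess_tail": row["ess_tail"] < ess_threshold,
        })
    return flagged

# Illustrative diagnostics rows; the parameter names are invented for this sketch.
diagnostics = [
    {"parameter": "saturation_beta[tv]", "r_hat": 1.00, "ess_bulk": 2150.0, "ess_tail": 1800.0},
    {"parameter": "adstock_alpha[search]", "r_hat": 1.03, "ess_bulk": 310.0, "ess_tail": 520.0},
]

flagged = add_mcmc_flags(diagnostics)
needs_attention = [
    row["parameter"]
    for row in flagged
    if row["high_rhat"] or row["low_ess_bulk"] or row["low_ess_tail"]
]
```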

mcmc_report() adds model-level diagnostics, including:

  • divergence_count
  • divergence_rate
  • max_rhat
  • min_ess_bulk
  • min_ess_tail
  • bfmi_mean
  • bfmi_min
  • max_tree_depth_hits
  • max_tree_depth_observed
  • mean_acceptance_rate

If idata is missing, Abacus raises an error and tells you to fit the model first.

Example MCMC diagnostic output:

[Figure: Trace plot example]

Predictive diagnostics

Predictive diagnostics use the observed target and stored posterior predictive samples:

mmm.sample_posterior_predictive(
    X=X,
    random_seed=42,
    progressbar=False,
)

predictive = mmm.diagnostics.predictive_summary(original_scale=True)

The predictive summary is a one-row DataFrame with:

  • scale
  • num_observations
  • rmse
  • mae
  • nrmse
  • nmae
  • crps
  • residual_mean
  • residual_std

Abacus aligns target and prediction coordinates before flattening. That includes mixed datetime coordinate dtypes when needed.
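To make the point-error columns concrete, here is a minimal sketch that computes them from observed values and posterior predictive means. It assumes residuals are observed minus predicted and that the normalised metrics divide by the observed range; Abacus may use a different sign or normalisation convention. CRPS is left out because it needs the full set of predictive draws. point_metrics is a hypothetical helper.

```python
import math

def point_metrics(observed, predicted_mean):
    """Hypothetical helper: point-error metrics from observations and predictive means."""
    # Assumed sign convention: residual = observed - predicted.
    residuals = [o - p for o, p in zip(observed, predicted_mean)]
    n = len(residuals)
    rmse = math.sqrt(sum(r * r for r in residuals) / n)
    mae = sum(abs(r) for r in residuals) / n
    # Assumed normalisation: divide by the observed range.
    scale = max(observed) - min(observed)
    residual_mean = sum(residuals) / n
    residual_std = math.sqrt(sum((r - residual_mean) ** 2 for r in residuals) / n)
    return {
        "num_observations": n,
        "rmse": rmse,
        "mae": mae,
        "nrmse": rmse / scale,
        "nmae": mae / scale,
        "residual_mean": residual_mean,
        "residual_std": residual_std,
    }

observed = [100.0, 120.0, 90.0, 110.0]
predicted_mean = [98.0, 123.0, 94.0, 109.0]
metrics = point_metrics(observed, predicted_mean)
```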

Example residual diagnostics:

[Figure: Residuals over time]

[Figure: Residual histogram]

[Figure: Residuals versus fitted]

[Figure: Residual autocorrelation]

Export reports

Use the report objects when you want a compact export format:

import json

report = mmm.diagnostics.mcmc_report()
payload = report.to_dict()

with open("mcmc_report.json", "w", encoding="utf-8") as handle:
    json.dump(payload, handle, indent=2)

The same pattern works for design_report(...) and predictive_report().

Pipeline outputs

The pipeline diagnostics stage uses the same retained diagnostic surfaces to write report tables and text summaries. If you run the pipeline, those stage artefacts should match the behaviour documented here.

In the structured pipeline, the raw-input screening rows in diagnostics_report.csv use the phase label raw_input_screening instead of design so the machine-readable output matches the wording here.

Common pitfalls

  • Running mcmc_summary() or mcmc_report() before fitting
  • Running predictive diagnostics before sampling posterior predictive values
  • Passing non-numeric columns into design_summary(X)
  • Treating predictive diagnostics as a substitute for design or MCMC checks

Posterior Predictive Checks

Use posterior predictive draws to check in-sample fit and to generate predictions for new rows that follow the fitted panel layout.

For diagnostic metrics after sampling, see Diagnostics. For table export, see Summary and Export.

Sample posterior predictive draws

Use PanelMMM.sample_posterior_predictive(...) on a fitted model:

posterior_predictive = mmm.sample_posterior_predictive(
    X=X,
    random_seed=42,
    progressbar=False,
)

sample_posterior_predictive(...):

  • requires X
  • uses the fitted posterior stored on mmm.idata
  • reshapes X into the model’s panel xarray layout
  • runs pymc.sample_posterior_predictive(...)
  • returns an extracted xarray.Dataset

By default, combined=True, so the returned dataset uses a sample dimension. If you want separate chain and draw dimensions, set combined=False.
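The difference between the two layouts can be pictured without xarray. The sketch below is a plain-Python analogy with toy numbers, not the structure Abacus returns: it shows the same draws addressed by (chain, draw) and then stacked into one flat sample axis.

```python
# draws[chain][draw] holds one predicted value per posterior draw (toy numbers).
draws = [
    [10.1, 10.4, 9.8],   # chain 0
    [10.0, 10.6, 9.9],   # chain 1
]

# combined=False analogy: values stay addressable by (chain, draw).
by_chain_and_draw = {
    (chain_index, draw_index): value
    for chain_index, chain in enumerate(draws)
    for draw_index, value in enumerate(chain)
}

# combined=True analogy: chain and draw are stacked into one flat sample axis.
samples = [value for chain in draws for value in chain]
posterior_mean = sum(samples) / len(samples)
```

Summaries such as the mean are the same either way; the separate layout mainly matters when you want per-chain comparisons.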

Store or return only

By default, Abacus also writes the predictive samples back to mmm.idata:

posterior_predictive = mmm.sample_posterior_predictive(
    X=X,
    extend_idata=True,
    random_seed=42,
    progressbar=False,
)

With extend_idata=True, Abacus adds:

  • idata.posterior_predictive
  • idata.posterior_predictive_constant_data

If you only want the returned samples and do not want to update mmm.idata, set extend_idata=False.

Check training-fit values against observed data

For an in-sample check, pass the same design matrix you used for fitting. This is the same pattern used by the pipeline’s Stage 30 training-fit assessment:

mmm.sample_posterior_predictive(
    X=X,
    random_seed=42,
    progressbar=False,
)

fit_table = mmm.summary.posterior_predictive(hdi_probs=[0.94])
figure, axes = mmm.plot.posterior_predictive(
    var=[mmm.output_var],
    hdi_prob=0.94,
)

Example posterior predictive output:

[Figure: Posterior predictive example]

[Figure: Fitted versus observed over time]

mmm.summary.posterior_predictive() returns a table with:

  • observed target values
  • posterior predictive mean and median
  • HDI bound columns such as abs_error_94_lower and abs_error_94_upper

You can also access the predictive draws directly:

predictive = mmm.data.get_posterior_predictive(original_scale=True)
errors = mmm.data.get_errors(original_scale=True)

Blocked holdout validation

For the structured pipeline’s Stage 35 validation, Abacus fits a fresh model on the training window and then scores only the holdout dates:

holdout_predictive = validation_mmm.sample_posterior_predictive(
    X=X_holdout,
    include_last_observations=True,
    random_seed=42,
    progressbar=False,
)

That holdout path is different from the in-sample check above:

  • the model is fit on X_train and y_train only
  • the holdout X contains only future dates
  • include_last_observations=True keeps lag history for adstock carryover
  • the returned samples are used to compute holdout metrics such as RMSE, MAE, NRMSE, NMAE, CRPS, bias, and coverage at 50%, 80%, and 94%

The holdout stage is more expensive than the in-sample check because it adds a second fit.
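Interval coverage is the least familiar of the holdout metrics, so here is a small sketch of one way to compute it. It assumes coverage is the share of holdout observations that fall inside the highest-density interval of their predictive draws at each level; whether Abacus uses exactly this interval definition is an assumption. The hdi and coverage helpers and all numbers are illustrative.

```python
import math

def hdi(samples, prob):
    """Narrowest interval that contains roughly `prob` of the samples."""
    ordered = sorted(samples)
    n = len(ordered)
    # Number of index steps the interval spans, kept within the sample.
    span = min(max(1, math.floor(prob * n)), n - 1)
    widths = [(ordered[i + span] - ordered[i], i) for i in range(n - span)]
    _, start = min(widths)
    return ordered[start], ordered[start + span]

def coverage(observed, draws_per_date, prob):
    """Share of observations inside the interval for their own date."""
    hits = 0
    for y, draws in zip(observed, draws_per_date):
        lower, upper = hdi(draws, prob)
        hits += lower <= y <= upper
    return hits / len(observed)

# Toy predictive draws: the same spread around a different centre per holdout date.
offsets = [-9, -6, -4, -3, -2, -1, 0, 1, 2, 3, 4, 6, 9]
centres = [100.0, 110.0, 120.0, 130.0]
draws_per_date = [[centre + offset for offset in offsets] for centre in centres]
observed = [101.0, 104.0, 121.0, 139.5]

coverage_by_level = {p: coverage(observed, draws_per_date, p) for p in (0.5, 0.8, 0.94)}
```

A well-calibrated model should show coverage close to the nominal level; with only four toy dates the numbers here are too coarse to judge calibration.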

Predict on new dates

For future prediction, pass a new X with the same structural columns as the training data:

future_predictive = mmm.sample_posterior_predictive(
    X=X_future,
    include_last_observations=True,
    random_seed=42,
    progressbar=False,
)

sample_posterior_predictive(...) does not take y. For a holdout or future window, keep the actual target outside the model and align it yourself if you want external evaluation.

Use include_last_observations correctly

Set include_last_observations=True when the forecast window needs lag history for adstock carryover.

When enabled, Abacus:

  • prepends the last adstock.l_max training observations internally
  • samples posterior predictive values on the padded data
  • removes the prepended rows from the returned result

This only works when the input dates do not overlap with the training dates. If they do overlap, Abacus raises a ValueError.
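The padding steps above can be sketched with a toy geometric adstock. This is an assumed stand-in for whatever adstock the fitted model uses, with made-up alpha and l_max values; it only shows why prepending training rows changes the first forecast rows.

```python
def geometric_adstock(spend, alpha, l_max):
    """Toy geometric adstock: sum the current and previous l_max - 1 spends."""
    out = []
    for t in range(len(spend)):
        total = 0.0
        for lag in range(l_max):
            if t - lag >= 0:
                total += (alpha ** lag) * spend[t - lag]
        out.append(total)
    return out

l_max = 3
alpha = 0.5
train_spend = [10.0, 20.0, 30.0, 40.0]
future_spend = [5.0, 0.0, 0.0]

# Without history, the first future rows ignore carryover from training.
cold_start = geometric_adstock(future_spend, alpha, l_max)

# With history: prepend the last l_max training rows, transform, then drop them.
padded = train_spend[-l_max:] + future_spend
warm_start = geometric_adstock(padded, alpha, l_max)[l_max:]

# Reference: transform the whole series and keep only the future rows.
reference = geometric_adstock(train_spend + future_spend, alpha, l_max)[len(train_spend):]
```

Here warm_start matches reference, while cold_start understates the first two future rows because the carryover from the last training weeks is missing.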

Practical guidance

  • Use the training X for fitted-versus-observed checks.
  • Use future-only dates for forward prediction.
  • Use the training-window refit pattern for blocked holdout validation.
  • Keep combined=True if you want a simpler sample dimension.
  • Use combined=False if you need explicit chain and draw dimensions.
  • Call sample_posterior_predictive(...) before using mmm.diagnostics.predictive_summary() or mmm.summary.posterior_predictive().

Common pitfalls

  • Calling sample_posterior_predictive(...) without X
  • Expecting y to be passed into the predictive method
  • Using include_last_observations=True on dates that overlap with training data
  • Forgetting that the returned object is extracted samples, while the stored idata.posterior_predictive group keeps the native posterior predictive structure

Contributions and Decomposition

Abacus stores additive contribution terms for fitted PanelMMM models and exposes them through the data wrapper, summary tables, and plotting suite.

Use this page to inspect media, baseline, control, seasonality, and event effects. For channel efficiency ratios built from media contributions, see ROAS and Metrics.

Contribution surfaces

You can work with contributions at three levels.

| Surface | Use it for |
|---|---|
| mmm.data | Raw xarray contribution samples |
| mmm.summary | DataFrames with posterior means, medians, and HDIs |
| mmm.plot | Time-series and waterfall visualisations |

Read raw contribution samples

The lowest-level accessor is mmm.data.get_contributions(...):

contributions = mmm.data.get_contributions(
    original_scale=True,
    include_baseline=True,
    include_controls=True,
    include_seasonality=True,
    include_events=True,
)

Depending on the fitted model, the returned dataset can contain:

  • channels
  • baseline
  • controls
  • seasonality
  • events

baseline includes the intercept contribution and any Mundlak contribution when the fitted model uses Mundlak CRE terms.

For media-only contribution samples, use:

channel_contributions = mmm.data.get_channel_contributions(original_scale=True)

Summarise one contribution type

Use mmm.summary.contributions(...) when you want a tidy table with posterior summary statistics:

channel_df = mmm.summary.contributions(
    component="channel",
    hdi_probs=[0.80, 0.94],
)

Supported component values are:

  • channel or channels
  • control or controls
  • seasonality
  • baseline

The returned table includes:

  • identifying columns such as date, channel, control, and any panel dims
  • mean
  • median
  • HDI bound columns such as abs_error_94_lower and abs_error_94_upper

mmm.summary.contributions(...) does not expose event effects. For event effects, use mmm.data.get_contributions(include_events=True) or mmm.summary.mean_contributions_over_time().

Create a wide decomposition table

Use mmm.summary.mean_contributions_over_time(...) when you want one row per time point and panel slice:

decomposition = mmm.summary.mean_contributions_over_time(
    original_scale=True,
)

This table contains posterior means only. It widens the contribution data so that each retained component becomes a column.

Typical output looks like this:

| date | geo | TV | Search | baseline | seasonality |
|---|---|---|---|---|---|
| 2024-01-01 | UK | 1240.5 | 822.1 | 5110.7 | -95.4 |
| 2024-01-08 | UK | 1302.8 | 801.6 | 5076.9 | 22.7 |

When present, the wide table also includes:

  • control columns
  • event columns named from posterior variables that end with _total_effect
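The widening step can be pictured as a pivot from one row per component to one row per date and panel slice. The sketch below uses plain dictionaries and the values from the typical output; the long-format field names component and mean are illustrative assumptions, not the library's internal layout.

```python
# Long format: one row per date, panel slice, and component (posterior means).
long_rows = [
    {"date": "2024-01-01", "geo": "UK", "component": "TV", "mean": 1240.5},
    {"date": "2024-01-01", "geo": "UK", "component": "Search", "mean": 822.1},
    {"date": "2024-01-01", "geo": "UK", "component": "baseline", "mean": 5110.7},
    {"date": "2024-01-01", "geo": "UK", "component": "seasonality", "mean": -95.4},
    {"date": "2024-01-08", "geo": "UK", "component": "TV", "mean": 1302.8},
    {"date": "2024-01-08", "geo": "UK", "component": "Search", "mean": 801.6},
    {"date": "2024-01-08", "geo": "UK", "component": "baseline", "mean": 5076.9},
    {"date": "2024-01-08", "geo": "UK", "component": "seasonality", "mean": 22.7},
]

# Wide format: one row per (date, geo), one column per component.
wide = {}
for row in long_rows:
    key = (row["date"], row["geo"])
    record = wide.setdefault(key, {"date": row["date"], "geo": row["geo"]})
    record[row["component"]] = row["mean"]

table = list(wide.values())
```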

Aggregate total contribution by component

Use mmm.summary.total_contribution(...) when you want one row per date and component type after summing across individual channels or controls:

totals = mmm.summary.total_contribution(frequency="monthly")

This is useful when you want a component-level roll-up, for example total media versus baseline.

Inspect change over time

Use mmm.summary.change_over_time(...) for percentage change in channel contributions between consecutive periods:

changes = mmm.summary.change_over_time(frequency="monthly")

This summary requires a date dimension. Do not use frequency="all_time".
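A short sketch shows why a date axis is needed. It assumes the change is computed as (current - previous) / previous * 100 on period totals, which is a common convention but not confirmed for Abacus; percentage_change and the monthly figures are hypothetical.

```python
def percentage_change(totals):
    """Hypothetical helper: percentage change between consecutive period totals."""
    changes = [None]  # the first period has no predecessor
    for previous, current in zip(totals, totals[1:]):
        if previous == 0:
            changes.append(None)  # undefined when the previous period is zero
        else:
            changes.append((current - previous) / previous * 100.0)
    return changes

# Illustrative monthly TV contribution totals.
monthly_tv = {"2024-01": 5000.0, "2024-02": 5500.0, "2024-03": 4400.0}
changes = dict(zip(monthly_tv, percentage_change(list(monthly_tv.values()))))
```

With a single all-time period there is no consecutive pair to compare, so the result would contain nothing but the empty first entry.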

Plot decomposition outputs

Use the plotting suite for visual inspection:

waterfall_figure, waterfall_axes = mmm.plot.waterfall_components_decomposition(
    original_scale=True,
)

area_figure, area_axes = mmm.plot.media_contribution_over_time(
    original_scale=True,
)

Useful plotting methods are:

  • waterfall_components_decomposition(...)
  • media_contribution_over_time(...)
  • contributions_over_time(...)
  • channel_contribution_share_hdi(...)

Example decomposition output:

[Figure: Waterfall decomposition example]

[Figure: Media contribution over time]

Practical guidance

  • Use original_scale=True when you want business-unit interpretation.
  • Use mmm.summary.contributions(...) for tidy per-component tables.
  • Use mmm.summary.mean_contributions_over_time() for decomposition exports.
  • Use mmm.summary.total_contribution() when you only need component-level totals.

Common pitfalls

  • Expecting mmm.summary.contributions(...) to include event effects
  • Forgetting that baseline can include more than the intercept when Mundlak CRE is enabled
  • Using frequency="all_time" with mean_contributions_over_time() or change_over_time()

Response Curves

Use response curves to inspect the fitted media transformations directly.

Abacus exposes posterior saturation and adstock curves through both the fitted model and mmm.summary. For decomposition of realised contributions over time, see Contributions and Decomposition.

Sample saturation curves

Use sample_saturation_curve(...) on a fitted PanelMMM:

saturation_curve = mmm.sample_saturation_curve(
    max_value=1.0,
    num_points=100,
    num_samples=500,
    random_state=42,
    original_scale=True,
)

The returned xarray.DataArray contains:

  • the curve axis x
  • channel
  • any panel dims
  • a posterior sample dimension

original_scale=True converts the curve’s y-values to original target units. It does not convert the x-axis. x remains in scaled channel units.

If you want to choose max_value from original channel units, divide by the relevant value from mmm.data.get_channel_scale().
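As a worked example of that conversion, the sketch below hard-codes a channel scale instead of reading it from mmm.data.get_channel_scale(). The scale and spend figures are invented, and mapping x back to original units by multiplying assumes the scaling is a plain division.

```python
# Assumed value: the scale reported for one channel, hard-coded for illustration.
tv_channel_scale = 250_000.0

# Upper end of the curve, expressed in original spend units.
max_spend_original = 400_000.0

# Convert to the scaled channel axis expected by max_value.
max_value = max_spend_original / tv_channel_scale

# Map scaled x positions back to original units, for example for axis labels.
x_scaled = [max_value * i / 4 for i in range(5)]
x_original = [x * tv_channel_scale for x in x_scaled]
```

Here max_value comes out at 1.6, meaning the curve extends to 1.6 times the channel's scale value.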

Sample adstock curves

Use sample_adstock_curve(...) to inspect carryover weights:

adstock_curve = mmm.sample_adstock_curve(
    amount=1.0,
    num_samples=500,
    random_state=42,
)

The returned array contains:

  • time since exposure
  • channel
  • any panel dims
  • a posterior sample dimension

The adstock curve is the fitted decay pattern for an impulse of size amount. It does not use an original_scale option because the returned weights are not target-unit contributions.

Runner-generated direct contribution artefacts

If you use the retained pipeline runner, Stage 60_response_curves also writes a forward-pass direct contribution artefact alongside the saturation and adstock transformation curves:

  • forward_pass_contribution_curve.nc
  • forward_pass_contribution_curve_summary.csv
  • forward_pass_contribution_curve.png

This artefact is different from the saturation-only curve:

  • the saturation-only curve shows the fitted saturation transform itself
  • the forward-pass direct contribution curve runs spend through the full fitted model path, including adstock and saturation

The retained Stage 60 forward-pass plot uses one explicit scenario so the curve is interpretable: it rescales the full observed historical spend path from 0% to 200%, then plots total channel spend against total channel contribution in original units. The marker at 100% highlights the fitted total contribution for the observed historical spend path.
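The scenario can be sketched for a single posterior draw. The code below assumes a geometric adstock and an exponential saturation with made-up parameters, which are stand-ins rather than the transformations the runner actually uses. It rescales a toy spend path and records total spend against total contribution.

```python
import math

def geometric_adstock(spend, alpha, l_max):
    """Toy geometric adstock over the current and previous l_max - 1 periods."""
    return [
        sum((alpha ** lag) * spend[t - lag] for lag in range(l_max) if t - lag >= 0)
        for t in range(len(spend))
    ]

def saturate(x, lam):
    """Toy saturation: rises from 0 towards 1 with diminishing returns."""
    return 1.0 - math.exp(-lam * x)

def total_contribution(spend, alpha=0.5, l_max=3, lam=0.02, beta=1000.0):
    """Run spend through adstock, then saturation, and sum over time."""
    adstocked = geometric_adstock(spend, alpha, l_max)
    return sum(beta * saturate(x, lam) for x in adstocked)

historical_spend = [10.0, 40.0, 0.0, 25.0, 60.0]
multipliers = [0.0, 0.5, 1.0, 1.5, 2.0]   # 0% to 200% of the observed path

curve = []
for multiplier in multipliers:
    scaled_path = [multiplier * s for s in historical_spend]
    curve.append((sum(scaled_path), total_contribution(scaled_path)))
```

The entry at multiplier 1.0 corresponds to the marker at 100%. Because the toy saturation is concave, doubling the spend path yields less than double the contribution.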

Summarise curves as DataFrames

If you want tabular summaries, use mmm.summary:

saturation_df = mmm.summary.saturation_curves(
    hdi_probs=[0.80, 0.94],
    num_points=100,
    num_samples=500,
    random_state=42,
    original_scale=True,
)

adstock_df = mmm.summary.adstock_curves(
    hdi_probs=[0.94],
    amount=1.0,
    num_samples=500,
    random_state=42,
)

These methods return DataFrames with posterior mean, median, and HDI bound columns.

saturation_curves(...) includes an x column. adstock_curves(...) uses time since exposure.

MMMSummaryFactory requirement

Curve summaries need access to both the fitted data and the fitted model transformations.

mmm.summary already satisfies that requirement. If you construct MMMSummaryFactory manually, pass model=mmm:

from abacus.mmm.summary import MMMSummaryFactory

summary = MMMSummaryFactory(mmm.data, model=mmm)
curves = summary.saturation_curves()

If you omit model=mmm, Abacus raises a ValueError.

Plot saturation curves

You can plot sampled curves directly:

curve = mmm.sample_saturation_curve(
    num_points=100,
    random_state=42,
    original_scale=True,
)

figure, axes = mmm.plot.saturation_curves(
    curve=curve,
    original_scale=True,
)

You can also inspect the fitted relationship in the observed data with:

figure, axes = mmm.plot.saturation_scatterplot(original_scale=True)

Example curve output:

[Figure: Saturation curve example]

[Figure: Adstock curve example]

Practical guidance

  • Use num_samples to trade off speed against posterior resolution.
  • Use original_scale=True when you want the saturation y-axis in target units.
  • Keep in mind that the saturation x-axis stays on the scaled channel axis.
  • Use the summary methods when you need exportable tables.

Common pitfalls

  • Reading x from saturation curves as original spend units
  • Forgetting to pass model=mmm when manually constructing MMMSummaryFactory
  • Comparing adstock curves across models without matching the amount parameter

ROAS and Metrics

Use this page for channel-efficiency outputs and aggregate predictive metrics.

Abacus separates these into two surfaces:

  • mmm.summary and mmm.data for ROAS and cost-per-target outputs
  • mmm.diagnostics for RMSE, MAE, NRMSE, NMAE, and CRPS

For contribution tables that feed these ratios, see Contributions and Decomposition.

Element-wise ROAS and cost per target

The lowest-level efficiency accessors live on mmm.data:

roas_samples = mmm.data.get_elementwise_roas(original_scale=True)
cost_per_target_samples = mmm.data.get_elementwise_cost_per_target(
    original_scale=True,
)

These are direct ratios built from fitted media contributions and channel spend:

  • ROAS = contribution / spend
  • cost per target = spend / contribution

The arrays are element-wise over time, channel, and any panel dims, with posterior sample dimensions on top.

Abacus returns NaN when it would otherwise divide by zero.
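The two ratios and the divide-by-zero rule can be illustrated with plain lists. This is a minimal sketch with invented numbers; safe_ratio is a hypothetical helper, not an Abacus function.

```python
import math

def safe_ratio(numerator, denominator):
    """Return numerator / denominator, or NaN when the denominator is zero."""
    return numerator / denominator if denominator != 0 else math.nan

# One channel over three periods (illustrative values).
contribution = [1200.0, 0.0, 300.0]
spend = [400.0, 150.0, 0.0]

roas = [safe_ratio(c, s) for c, s in zip(contribution, spend)]
cost_per_target = [safe_ratio(s, c) for c, s in zip(contribution, spend)]
```

In the third period spend is zero, so ROAS is NaN while cost per target is 0. In the second period contribution is zero, so cost per target is NaN while ROAS is 0.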

Summarise ROAS

Use mmm.summary.roas(...) for a tidy summary table:

roas_df = mmm.summary.roas(
    hdi_probs=[0.80, 0.94],
    frequency="monthly",
    start_date="2024-01-01",
    end_date="2024-06-30",
)

Abacus applies start_date and end_date before any optional aggregation.

The returned table includes:

  • identifying columns such as date, channel, and any panel dims
  • mean
  • median
  • HDI bound columns such as abs_error_94_lower and abs_error_94_upper

Summarise cost per target

For conversion-style targets, use cost_per_target(...):

cpa_df = mmm.summary.cost_per_target(frequency="monthly")

This is the same retained summary surface that mmm.summary.efficiency() uses for target_type="conversion".

Use the default efficiency metric

Abacus chooses the default efficiency metric from the target type:

| target_type | mmm.summary.efficiency() returns | Label |
|---|---|---|
| revenue | roas() | ROAS |
| conversion | cost_per_target() | CPA |

You can inspect the selected metric with:

metric_key = mmm.summary.efficiency_metric
metric_label = mmm.summary.efficiency_metric_label

Export channel spend

Use channel_spend() when you want the raw spend table with no posterior aggregation:

spend_df = mmm.summary.channel_spend()

This returns the observed channel spend with columns such as date, channel, panel dims, and channel_data.

Predictive error metrics

Predictive metrics live under mmm.diagnostics.predictive_summary():

mmm.sample_posterior_predictive(
    X=X,
    random_seed=42,
    progressbar=False,
)

predictive_metrics = mmm.diagnostics.predictive_summary()

The returned one-row DataFrame includes:

  • rmse
  • mae
  • nrmse
  • nmae
  • crps
  • residual_mean
  • residual_std

These metrics are calculated from the stored posterior predictive samples and the observed target.

Practical guidance

  • Use roas() for revenue targets.
  • Use cost_per_target() for conversion targets.
  • Use efficiency() when you want target-type-aware reporting.
  • Sample posterior predictive values before using predictive metrics.

Common pitfalls

  • Interpreting Abacus ROAS as something other than contribution divided by spend
  • Forgetting that zero spend produces NaN for ROAS and zero contribution produces NaN for cost per target
  • Using predictive diagnostics before calling sample_posterior_predictive(...)

Summary and Export

mmm.summary is the retained tabular summary surface for fitted PanelMMM models.

It is backed by MMMSummaryFactory and returns pandas or Polars DataFrames that you can export with normal DataFrame methods.

For predictive diagnostics and JSON-ready reports, see Diagnostics.

Use mmm.summary

The simplest path is the bound summary factory on the fitted model:

posterior_df = mmm.summary.posterior_predictive()
contributions_df = mmm.summary.contributions(component="channel")
roas_df = mmm.summary.roas(frequency="monthly")

mmm.summary already has access to:

  • mmm.data
  • the fitted PanelMMM
  • the default summary settings

Construct MMMSummaryFactory manually

If you want custom defaults, build the factory yourself:

from abacus.mmm.summary import MMMSummaryFactory

summary = MMMSummaryFactory(
    mmm.data,
    model=mmm,
    hdi_probs=(0.80, 0.94),
    output_format="polars",
)

This is useful when you want one summary object with consistent HDI and output settings across multiple tables.

Common summary methods

| Method | What it returns |
|---|---|
| posterior_predictive() | Posterior predictive summaries aligned to the wrapped target data |
| contributions() | Tidy contribution summaries by component type |
| mean_contributions_over_time() | Wide decomposition table |
| roas() / cost_per_target() / efficiency() | Efficiency summaries |
| channel_spend() | Raw spend table |
| saturation_curves() / adstock_curves() | Transformation-curve summaries |
| total_contribution() | Component-level totals |
| change_over_time() | Period-on-period percentage change in channel contributions |

Choose output format

MMMSummaryFactory supports:

  • output_format="pandas"
  • output_format="polars"

Example:

summary = MMMSummaryFactory(mmm.data, model=mmm, output_format="polars")
roas_df = summary.roas()

If you request polars without Polars installed, Abacus raises an ImportError.

Configure HDI probabilities

Pass HDI probabilities as numbers strictly between 0 and 1:

posterior_df = mmm.summary.posterior_predictive(hdi_probs=[0.80, 0.94])

Do not pass percentages such as 80 or 94.

Summary tables include interval columns named from those probabilities. For example, hdi_probs=[0.94] produces columns such as:

  • abs_error_94_lower
  • abs_error_94_upper

These columns are the current HDI bound columns used by the retained summary surface.
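If you need to select the interval columns programmatically, a small helper can build their names. The sketch below assumes the probability is rendered as a rounded whole percentage, which matches the 0.94 example but is otherwise an assumption. The ValueError for out-of-range inputs is this sketch's own choice and not necessarily the error Abacus raises.

```python
def hdi_column_names(hdi_probs):
    """Hypothetical helper: validate probabilities and build bound column names."""
    names = []
    for prob in hdi_probs:
        if not 0 < prob < 1:
            raise ValueError(
                f"HDI probability must be strictly between 0 and 1, got {prob}"
            )
        # Assumed naming rule: probability rendered as a whole percentage.
        pct = round(prob * 100)
        names += [f"abs_error_{pct}_lower", f"abs_error_{pct}_upper"]
    return names

columns = hdi_column_names([0.80, 0.94])
```

Passing a percentage such as 94 fails the range check in this sketch, which mirrors the guidance above.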

Aggregate over time

Many summary methods accept frequency with one of these values:

  • original
  • weekly
  • monthly
  • quarterly
  • yearly
  • all_time

Example:

monthly = mmm.summary.posterior_predictive(frequency="monthly")
quarterly_roas = mmm.summary.roas(frequency="quarterly")

all_time removes the date dimension. That is useful for fully aggregated tables, but date-dependent summaries still need a date axis.

Do not use all_time with:

  • mean_contributions_over_time()
  • change_over_time()

Export tables

Abacus does not add a separate export wrapper on top of the returned DataFrames. Use the normal DataFrame methods from your selected backend:

posterior_df = mmm.summary.posterior_predictive()
posterior_df.to_csv("posterior_predictive.csv", index=False)

With Polars:

summary = MMMSummaryFactory(mmm.data, model=mmm, output_format="polars")
roas_df = summary.roas()
roas_df.write_csv("roas.csv")

Export diagnostic reports

Diagnostic report objects expose to_dict() for JSON-ready export:

import json

report = mmm.diagnostics.predictive_report()
with open("predictive_report.json", "w", encoding="utf-8") as handle:
    json.dump(report.to_dict(), handle, indent=2)

Common pitfalls

  • Expecting a dedicated file-export API on mmm.summary
  • Passing 94 instead of 0.94 in hdi_probs
  • Using saturation_curves() or adstock_curves() from a manual factory without model=mmm
  • Using all_time on summaries that require a date dimension