Prior Predictive Checks

If you come from classical econometrics, you are used to checking assumptions after estimation: residual plots, heteroskedasticity tests, outlier influence, and maybe out-of-sample fit. Bayesian workflow adds one earlier question:

Before fitting anything, do my priors imply plausible behaviour for the target variable?

That is what prior predictive checking answers.

1. Why parameter-level priors are not enough

A prior can look sensible when you inspect it in isolation and still imply absurd behaviour once it flows through the whole model.

For example:

an intercept prior may look “weakly informative” on paper
a channel coefficient prior may look “reasonably positive”
a likelihood sigma prior may look “safely diffuse”

But jointly, those choices might imply:

weekly revenue that is far above anything you could ever observe
negative conversions for a business where the target is always non-negative
far more volatility than the real series could possibly have

Classical econometrics rarely forces you to check this explicitly because you usually specify penalties or constraints directly on the coefficient space. Bayesian MMM requires one more layer of discipline: inspect the implied distribution of y, not just the configured priors on the parameters.
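A minimal sketch of that point, using only the Python standard library. The toy model (one intercept, one media coefficient, Gaussian noise), the prior scales, and the assumption that the target has been scaled to lie roughly between 0 and 1 are all illustrative choices, not Abacus defaults.

```python
import random

random.seed(0)

N_DRAWS = 2000
SPEND = 0.5  # hypothetical scaled media input for a single week


def half_normal(scale):
    """Draw from a half-normal by folding a zero-mean normal."""
    return abs(random.gauss(0.0, scale))


def prior_predictive_draw():
    """Draw parameters from their priors, then one target value from the likelihood."""
    intercept = random.gauss(0.0, 5.0)  # looks "weakly informative"
    beta = half_normal(5.0)             # looks "reasonably positive"
    sigma = half_normal(5.0)            # looks "safely diffuse"
    mu = intercept + beta * SPEND
    return random.gauss(mu, sigma)


draws = sorted(prior_predictive_draw() for _ in range(N_DRAWS))
lo = draws[int(0.05 * N_DRAWS)]
hi = draws[int(0.95 * N_DRAWS)]
share_negative = sum(d < 0 for d in draws) / N_DRAWS

# The target is assumed to live in roughly [0, 1] on this scale.
print(f"central 90% of prior predictive draws: [{lo:.1f}, {hi:.1f}]")
print(f"share of negative draws: {share_negative:.0%}")
```

Each prior is unremarkable on its own, yet the implied target spans many multiples of its assumed range and is negative in a large share of draws.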

2. What prior predictive checking does

Prior predictive checking asks:

If the priors were true, what kinds of target series would this model generate before seeing the actual data?

The workflow is:

Build the model with your chosen priors and structure.
Sample from the prior predictive distribution.
Compare those simulated target draws with the scale and shape of the real target series.

This is not a convergence check and it is not a causal test. It is a plausibility check on the model you are about to fit.
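In symbols, and using generic notation rather than anything Abacus-specific: let \theta collect every model parameter, p(\theta) be the joint prior, and X be the inputs. The prior predictive distribution is the likelihood averaged over the prior, and step 2 of the workflow approximates it by simulation with S draws.

```latex
p(\tilde{y} \mid X) = \int p(\tilde{y} \mid \theta, X)\, p(\theta)\, d\theta,
\qquad
\theta^{(s)} \sim p(\theta), \quad
\tilde{y}^{(s)} \sim p(\tilde{y} \mid \theta^{(s)}, X), \quad
s = 1, \dots, S.
```

The observed y never enters the right-hand side, which is why this check can run before fitting.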

3. How Abacus supports it

Abacus exposes prior predictive sampling directly on PanelMMM:

prior = mmm.sample_prior_predictive(
    X=X,
    y=y,
    samples=100,
    random_seed=42,
)

If you want a quick visual check, Abacus also exposes a retained plot surface:

figure, axes = mmm.plot.prior_predictive(var=mmm.output_var)

In the structured runner, this is Stage 10, the preflight stage. The pipeline writes:

10_pre_diagnostics/prior_predictive.nc
10_pre_diagnostics/prior_predictive.png

Abacus currently gives you the sampled draws and the plot. It does not apply an automatic plausibility score or a hard pass/fail gate for you.

4. What to look for

A useful prior predictive check is not about matching the data exactly. That would defeat the point of a prior. The question is whether the implied target behaviour is at least in the right universe.

Look for the following.

Level

Do the simulated draws live on roughly the same order of magnitude as the observed target?

If your historical weekly revenue is in the low millions, prior predictive draws in the billions are a red flag.
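One way to put a number on the level check is the gap, in orders of magnitude, between typical simulated and typical observed values. The helper name `magnitude_gap` and the toy numbers are hypothetical; how you extract the simulated values from the sampled draws depends on the object Abacus returns, which is not shown here.

```python
import math
import statistics


def magnitude_gap(simulated, observed):
    """Orders of magnitude between typical |simulated| and typical |observed|.

    Assumes both medians are non-zero.
    """
    sim_level = statistics.median(abs(v) for v in simulated)
    obs_level = statistics.median(abs(v) for v in observed)
    return math.log10(sim_level / obs_level)


# Hypothetical weekly revenue in the low millions.
observed = [1.8e6, 2.1e6, 2.4e6, 1.9e6, 2.2e6]
plausible_draws = [0.9e6, 3.5e6, 2.0e6, 5.1e6, 1.2e6]
absurd_draws = [2.0e9, 7.5e8, 4.1e9, 1.3e9, 3.2e9]

gap_plausible = magnitude_gap(plausible_draws, observed)
gap_absurd = magnitude_gap(absurd_draws, observed)
print(f"plausible priors: {gap_plausible:+.2f} orders of magnitude")
print(f"absurd priors:    {gap_absurd:+.2f} orders of magnitude")
```

A gap near zero means the draws sit at the observed level; a gap near 3 means the priors put revenue in the billions when history is in the millions.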

Dispersion

Is the implied volatility remotely plausible?

If the prior predictive draws are spread much more widely than the observed series ever moves, your likelihood sigma or contribution priors are probably too loose.
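A rough sketch of a dispersion comparison, assuming each prior predictive draw is available as a plain list of weekly values. The helper `volatility_ratio`, the noise-only simulator, and both prior scales are hypothetical and only serve to show the contrast.

```python
import random
import statistics

random.seed(1)


def volatility_ratio(simulated_series, observed):
    """Median within-series standard deviation of the draws, relative to the observed one."""
    sim_sd = statistics.median(statistics.stdev(s) for s in simulated_series)
    return sim_sd / statistics.stdev(observed)


# Hypothetical observed target on the model scale: 52 weeks around 0.6.
observed = [0.6 + random.gauss(0.0, 0.05) for _ in range(52)]


def simulate(sigma_prior_scale, n_draws=200):
    """Noise-only series whose sigma is drawn from a half-normal prior."""
    series = []
    for _ in range(n_draws):
        sigma = abs(random.gauss(0.0, sigma_prior_scale))
        series.append([0.6 + random.gauss(0.0, sigma) for _ in range(52)])
    return series


loose = volatility_ratio(simulate(sigma_prior_scale=2.0), observed)
tight = volatility_ratio(simulate(sigma_prior_scale=0.1), observed)
print(f"loose sigma prior: {loose:.1f}x observed volatility")
print(f"tight sigma prior: {tight:.1f}x observed volatility")
```

A ratio of a few times the observed volatility is usually unremarkable for a prior; a ratio in the tens suggests the observation noise prior is doing most of the talking.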

Sign and support

Does the model imply values that violate business reality?

For example:

negative conversions
implausibly negative revenue
large oscillations around zero for a strictly positive KPI

These are often signs that the prior scale is too permissive relative to the data scaling and likelihood choice.
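Support violations are easy to count once the draws are in hand. The sketch below assumes each draw is a list of simulated target values; both helper names and the toy numbers are hypothetical.

```python
def share_below_zero(simulated_series):
    """Fraction of all simulated values that are negative."""
    total = sum(len(s) for s in simulated_series)
    negative = sum(1 for s in simulated_series for v in s if v < 0)
    return negative / total


def share_of_draws_with_negatives(simulated_series):
    """Fraction of draws (whole series) containing at least one negative value."""
    flagged = sum(1 for s in simulated_series if any(v < 0 for v in s))
    return flagged / len(simulated_series)


# Three hypothetical draws of a conversions series over four weeks.
draws = [
    [120.0, 95.0, 130.0, 110.0],
    [40.0, -15.0, 60.0, -5.0],
    [-80.0, -20.0, 10.0, 35.0],
]

value_share = share_below_zero(draws)
draw_share = share_of_draws_with_negatives(draws)
print(f"{value_share:.0%} of simulated values are negative")
print(f"{draw_share:.0%} of draws contain a negative week")
```

For a strictly non-negative KPI, a small share of negative values under a Gaussian likelihood may be tolerable; a large share points at the prior scale or the likelihood choice.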

Time pattern

Do the implied trajectories look structurally plausible?

You are not looking for a perfect seasonal pattern before fitting, but you should ask whether the prior predictive draws look like something that could have come from your business rather than from a random-number generator with no economic interpretation.
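One crude numeric proxy for "structurally plausible" is lag-1 autocorrelation: real business series tend to move smoothly from week to week, while a draw dominated by observation noise does not. This is only one possible summary, and the series below are hypothetical stand-ins for the observed target and a single prior predictive draw.

```python
import math
import random

random.seed(2)


def lag1_autocorr(series):
    """Sample lag-1 autocorrelation of a series."""
    n = len(series)
    mean = sum(series) / n
    num = sum((series[t] - mean) * (series[t - 1] - mean) for t in range(1, n))
    den = sum((v - mean) ** 2 for v in series)
    return num / den


# Hypothetical observed target: annual seasonality plus small noise.
observed = [
    1.0 + 0.3 * math.sin(2 * math.pi * t / 52) + random.gauss(0.0, 0.03)
    for t in range(104)
]
# Hypothetical prior predictive draw dominated by observation noise.
noisy_draw = [1.0 + random.gauss(0.0, 0.5) for _ in range(104)]

rho_observed = lag1_autocorr(observed)
rho_draw = lag1_autocorr(noisy_draw)
print(f"observed lag-1 autocorrelation: {rho_observed:.2f}")
print(f"noisy draw lag-1 autocorrelation: {rho_draw:.2f}")
```

If nearly every draw has autocorrelation close to zero while the observed series is strongly persistent, the priors are describing noise rather than a business. Plotting a handful of individual draws remains the more informative check.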

5. Common failure modes

Several practical pathologies show up repeatedly.

The intercept is too loose

A very wide intercept prior can dominate the prior predictive distribution, especially when the target has been scaled but the intercept prior is still too diffuse for the transformed space.

The likelihood sigma is too loose

If the prior predictive draws look far too noisy, the problem is often not the media priors at all. It is the observation model allowing implausibly large residual variance.

Media transformation priors are too permissive

Adstock and saturation priors that allow unrealistically persistent carryover or unrealistically steep response can imply contributions that are wildly too large before the data has had any say.
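To see why persistence matters so much, consider an unnormalized geometric adstock, where each week carries over a share alpha of the previous adstocked value. This form is an assumption made for illustration; the source does not state which adstock Abacus uses, and implementations that normalize the weights do not inflate the level this way.

```python
def geometric_adstock(spend, alpha):
    """Unnormalized geometric adstock: x_t plus alpha times last period's adstocked value."""
    carried = 0.0
    out = []
    for x in spend:
        carried = x + alpha * carried
        out.append(carried)
    return out


spend = [1.0] * 200  # constant one unit of spend per week

for alpha in (0.3, 0.7, 0.95):
    level = geometric_adstock(spend, alpha)[-1]
    # Under constant spend the level approaches 1 / (1 - alpha).
    print(f"alpha={alpha}: adstocked level settles near {level:.2f}x weekly spend")
```

Under this form, a prior that puts real mass on alpha near 0.95 multiplies the effective media input by roughly 20 before the coefficient is even applied, so a coefficient prior that looked reasonable can imply very large contributions.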

Flexible baseline terms are too unconstrained

Time-varying intercepts, seasonality, events, and other additive effects can all inject structure into the prior predictive distribution. If those priors are too loose, the target series can become implausibly volatile or pattern-heavy before fitting.

6. What to do when the prior predictive check looks bad

Do not proceed directly to posterior interpretation. Fix the model first.

Typical remedies:

tighten the intercept prior
tighten the likelihood sigma prior
make media priors weakly informative, concentrated on the economically plausible region, rather than completely diffuse
reduce unnecessary model flexibility before the data has justified it
check whether your scaling choices make the configured priors too wide or too narrow on the model scale

This is the Bayesian analogue of catching a broken specification before you start arguing about p-values.
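The scaling remedy in the list above often comes down to simple arithmetic. The sketch assumes the target is divided by its maximum absolute value before modelling; that is a common choice, but it is an assumption here, so check how your own configuration scales the target. The helper name is hypothetical.

```python
def prior_sd_on_original_scale(model_scale_sd, target_scale):
    """Convert a prior standard deviation from the model scale back to target units."""
    return model_scale_sd * target_scale


# Hypothetical weekly revenue in the low millions.
observed = [1.8e6, 2.1e6, 2.4e6, 1.9e6, 2.2e6]
target_scale = max(abs(v) for v in observed)  # assumed max-abs scaling

for sd in (0.1, 1.0, 10.0):
    original = prior_sd_on_original_scale(sd, target_scale)
    print(f"intercept prior sd {sd} on the model scale = {original:,.0f} in revenue units")
```

With these numbers, a standard deviation of 10 on the model scale corresponds to 24 million in revenue terms, ten times the largest week ever observed.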

7. What prior predictive checks do not tell you

Passing a prior predictive check does not mean:

the model is causally identified
the model will fit well
the posteriors will converge cleanly
the attribution decomposition will be trustworthy

It only means the configured priors do not imply obviously absurd target behaviour before seeing the data.

You still need to establish each of those separately.

8. Practical recommendation

Treat prior predictive checking as a standard pre-fit step, not as an optional extra for purists.

In Abacus terms, the workflow should usually be:

Specify the model and priors.
Run sample_prior_predictive(...).
Inspect the implied target behaviour.
Revise the priors if needed.
Fit only once the prior predictive behaviour is broadly plausible.

That sequence is usually cheaper than fitting a badly specified Bayesian MMM and then discovering that the posterior is unstable for reasons you could have caught before sampling.