
Conversation

@cetagostini (Contributor) commented Sep 5, 2025

Description

Hey team, major updates to the utility functions. Feel free to jump in, comment, or modify if you think something can be done better!

New Masked Distribution Prior

Now that users have multidimensional capabilities, models can grow without control. For example, it is currently possible to build a model with dims (date, country, state, city, brand, product, channel) & shape (720, 50, 90, 10, 100, 12).

Even though it is great to bring this flexibility to users, given such a huge dimensional space the data will probably not be perfectly balanced, and some cells in the matrix may be missing. MaskedDist lets the user add a mask over the prior, modifying the tensor graph behind it so that non-existent parameters are never sampled, which makes the process more efficient.

What it does (in short)

  1. Wraps a Prior that is defined over named dims.
  2. Applies a boolean mask to those dims to:
     • Build an “active” 1D RV only for unmasked positions (under a new active dim).
     • Scatter the active RV back to the original full grid and return a deterministic with the original dims.
  3. Can also create masked likelihoods so only active positions contribute to the log-likelihood.

Benefits

  1. Efficient parameterization: optimizes only the active positions, reducing parameter count and gradients.
  2. Performance: faster sampling/optimization and lower memory use when grids are large and sparse.
  3. Shape safety: validates against model coords; handles scalars, broadcastable arrays, and nested priors consistently.
  4. Likelihood masking: excludes masked cells from the logp cleanly (no NaN gymnastics).
  5. Portable: supports to_dict/from_dict for serialization.

A minimal example:
import numpy as np
import pymc as pm
from pymc_extras.prior import Prior
from pymc_marketing.pytensor_utils import MaskedDist

# Define a 3x4 grid
coords = {"row": [0, 1, 2], "col": [0, 1, 2, 3]}

# Create mask - only activate positions (0,0), (1,2), (2,3)
mask = np.array(
    [
        [True, False, False, False],
        [False, False, True, False],
        [False, False, False, True],
    ]
)

# Define prior over full grid
prior = Prior("Normal", mu=0, sigma=1, dims=("row", "col"))
masked_dist = MaskedDist(prior, mask)

with pm.Model(coords=coords):
    # Create masked distribution
    coeff = masked_dist.create_variable("coeff")
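
In practice the mask often comes from the data itself. Here is a minimal sketch of one way to build it, assuming a hypothetical long-format frame df with "row", "col", and "y" columns, where a cell is active only if it has at least one observation:

import pandas as pd

# Hypothetical: mark a (row, col) cell as active only if `df` has data for it.
mask = (
    df.pivot_table(index="row", columns="col", values="y", aggfunc="size")
    .reindex(index=coords["row"], columns=coords["col"])
    .notna()
    .to_numpy()
)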

Users can combine this with their transformation functions, such as saturation or adstock, to mask the corresponding parameters.

LogisticSaturation(
    priors={
        "lam": MaskedDist(
            Prior("HalfNormal", sigma=1.0, dims=("country", "region")),
            mask=mask,
        ),
        "beta": MaskedDist(
            Prior("HalfNormal", sigma=1.0, dims=("country", "region")),
            mask=mask,
        ),
    }
)
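
For context, a minimal sketch of what mask is assumed to look like for the (country, region) grid above; the coords and the missing panel are purely illustrative:

coords = {"country": ["US", "CA"], "region": ["east", "west", "north"]}
mask = np.ones((len(coords["country"]), len(coords["region"])), dtype=bool)
mask[1, 2] = False  # e.g. no data for the ("CA", "north") panel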

If some panels are missing from the full grid, users can mask the likelihood as well, so the model does not sample on dates, or date/geo combinations, that do not exist. (In the example below, coords and mask are assumed to be redefined over the (date, geo) grid.)

mu_prior = Prior("Normal", mu=0.0, sigma=1.0, dims=("date", "geo"))
masked_mu = MaskedDist(mu_prior, mask=mask)

observed = np.zeros((len(coords["date"]), len(coords["geo"])))

likelihood_prior = Prior(
    "Normal",
    sigma=Prior("HalfNormal", sigma=1.0),
    dims=("date", "geo"),
)
masked_lik = MaskedDist(likelihood_prior, mask=mask)

with pm.Model(coords=coords):
    # Masked deterministic mean over full grid (zeros where mask is False)
    mu_full = masked_mu.create_variable("mu_full")

    # Build likelihood as a Prior and mask it, so both mu and observed are masked
    masked_lik.create_likelihood_variable(
        name="y",
        mu=mu_full,
        observed=observed,
    )
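
A hedged usage note: if the model above is bound to a name (e.g. with pm.Model(coords=coords) as model:), sampling proceeds as usual and only the unmasked cells contribute to the log-likelihood:

with model:
    idata = pm.sample(draws=500, tune=500, chains=2, random_seed=1)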

Reducing the dimensionality brings big efficiency improvements. Some examples already show, even for low-dimensional problems, around 30% less computation time with a mask that cuts just a few diagonals of the matrix.

Important

Out-of-sample prediction with only test data cannot handle the masking around the likelihood. To make out-of-sample predictions, you must pass the full X_train and X_test to sample_posterior_predictive.
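
A minimal sketch of what that looks like, assuming an already fitted mmm with the usual sample_posterior_predictive(X, ...) signature; X_train and X_test are illustrative frames over the same panel grid:

import pandas as pd

# Pass the full design matrix (train + test), not the test split alone,
# so the masked likelihood still sees every panel it was built over.
X_full = pd.concat([X_train, X_test], ignore_index=True)
preds = mmm.sample_posterior_predictive(X_full, extend_idata=False)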

Model JAX sampling estimation

Related to the above, I added a JAX-based helper so users can estimate the minimum sampling time of their models. With it, users can compare the minimum amount of time a given model will take.

with pm.Model() as model:
    pm.Normal("x", mu=0.0, sigma=1.0, observed=np.array([0.0]))

estimator = ModelSamplerEstimator(
    tune=100, draws=200, chains=1, sequential_chains=1, seed=123
)

df = estimator.run(model)

This tool is great for comparison: it helps you get a sense of how heavy your model is and make changes before even running a first proper sample.
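
Because the output is a DataFrame, the same estimator can be run on several variants and the results compared side by side. A minimal sketch, assuming estimator.run returns a pandas DataFrame as in the example above; wide_model is purely illustrative:

import pandas as pd

# Illustrative: a deliberately larger variant of the toy model above.
with pm.Model() as wide_model:
    pm.Normal("x", mu=0.0, sigma=1.0, observed=np.zeros(1_000))

# Compare the estimated sampling cost of both variants in one table.
comparison = pd.concat(
    [estimator.run(model), estimator.run(wide_model)],
    keys=["small", "wide"],
)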

Merge graphical models

Combines multiple independent PyMC Models into a single Model.

As we move forward with causality and discover new DAGs, one way to recover direct or indirect effects is to create different regressions, each one adjusting for the minimum sufficient set of variables. Each regression responds to the same DAG but uncovers a different effect per channel depending on the adjustment.

Nevertheless, we will probably want to optimize all regressions together, or join the models to perform several operations. But how? Remember that PyMC models are graphical generative models, so each model can be represented as a function graph and its nodes/variables can be prefixed (e.g., model1_, model2_). We can then decide which variable or data to merge on with merge_on="some_var"; that variable is kept unprefixed and shared across the merged models, and all outputs and coordinates are merged into a single function graph.

Quick example, imagine you have the following DAG:

digraph G {
  rankdir=LR; splines=true; nodesep=0.4; ranksep=0.5;
  node [shape=ellipse, fontsize=12];

  C1 -> X1; C1 -> Y;
  C2 -> X2; C2 -> Y;
  C3 -> X3; C3 -> Y;

  X1 -> X2; X2 -> X3;
  X1 -> Y;  X2 -> Y;  X3 -> Y;

  // styling
  {rank=same; C1; C2; C3}
  {rank=same; X1; X2; X3}
  Y [shape=doublecircle];
}

This structure makes the minimal sufficient adjustment sets different for each channel’s direct effect on Y:

  • For X1: backdoor via C1; mediated paths via X2 (→ X3).
    Adjust for {C1, X2} (conditioning on X2 also blocks the X1 → X2 → X3 → Y path).

  • For X2: backdoors via C2 and via X1 (since X1 → X2 and X1 → Y); mediated path via X3.
    Adjust for {C2, X1, X3}.

  • For X3: backdoors via C3 and via X2 (since X2 → X3 and X2 → Y).
    Adjust for {C3, X2}.

That means we need three models to uncover the effects:

# Get x1 over Y
mask = np.array([True, False, False])
mmm_x1_effect = MMM(  # (Y ~ X1 + C1 + X2)
    date_column="date",
    target_column="Y",
    channel_columns=[
        "X1",
        "X2",
        "X3",
    ],  # always use all channels, but mask the ones that are not used
    control_columns=["X2", "C1"],
    adstock=NoAdstock(l_max=1),
    saturation=NoSaturation(
        priors={
            "beta": MaskedDist(
                Prior("HalfNormal", sigma=1.0, dims="channel"), mask=mask
            )
        }
    ),
)

# Get X2 over Y
# Mask the channels so only X2 is active (X1 and X3 are False)
mask2 = np.array([False, True, False])
mmm_x2_effect = MMM(  # (Y ~ X2 + C2 + X1 + X3)
    date_column="date",
    target_column="Y",
    channel_columns=[
        "X1",
        "X2",
        "X3",
    ],  # always use all channels, but mask the ones that are not used
    control_columns=["X1", "C2", "X3"],
    adstock=NoAdstock(l_max=1),
    saturation=NoSaturation(
        priors={
            "beta": MaskedDist(
                Prior("HalfNormal", sigma=1.0, dims="channel"), mask=mask2
            )
        }
    ),
)

# Get X3 over Y
# Mask the channels so only X3 is active (X1 and X2 are False)
mask3 = np.array([False, False, True])
mmm_x3_effect = MMM(  # (Y ~ X3 + C3 + X2)
    date_column="date",
    target_column="Y",
    channel_columns=[
        "X1",
        "X2",
        "X3",
    ],  # always use all channels, but mask the ones that are not used
    control_columns=["X2", "C3"],
    adstock=NoAdstock(l_max=1),
    saturation=NoSaturation(
        priors={
            "beta": MaskedDist(
                Prior("HalfNormal", sigma=1.0, dims="channel"), mask=mask3
            )
        }
    ),
)

As you can see, the masks let us zero out the distributions so that we avoid computing contributions for the channels we don't want to adjust for in a given regression. Note that even when channels are zeroed out by the priors and excluded from sampling, the data input is the same for all three models.

We can merge the models on that shared data.

optimizable_model1 = MultiDimensionalBudgetOptimizerWrapper(
    model=mmm_x1_effect, start_date="2026-01-01", end_date="2026-01-31"
)

optimizable_model2 = MultiDimensionalBudgetOptimizerWrapper(
    model=mmm_x2_effect, start_date="2026-01-01", end_date="2026-01-31"
)

optimizable_model3 = MultiDimensionalBudgetOptimizerWrapper(
    model=mmm_x3_effect, start_date="2026-01-01", end_date="2026-01-31"
)

multi_model = MultiModelWrapper(
    models=[optimizable_model1, optimizable_model2, optimizable_model3],
    prefixes=["model1", "model2", "model3"],
    merge_on="channel_data",
)

And finally, optimize.

optimizer = BudgetOptimizer(
    num_periods=multi_model.num_periods,
    model=multi_model,
    response_variable="total_media_contribution_original_scale",
)
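
As a follow-up, running the allocation might look like the sketch below; the allocate_budget call and its return values mirror the existing BudgetOptimizer API and may differ for merged multi-model setups:

# Illustrative only: allocate a total budget across channels of the merged model.
optimal_budgets, result = optimizer.allocate_budget(total_budget=100.0)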

This opens the door to optimizing different effects (total, indirect, or direct) together, and allows for multi-objective optimization: every model can have a different target variable, but because the targets depend on the same inputs we can optimize them jointly.

Conclusion

These examples are just the tip of the iceberg, but they let users scale their models in dimensionality and optimize them as much as possible, even with panel imbalance; estimate sampling time to decide whether MCMC is feasible or VI methods would be a better option; and, when they end up with multiple models, merge them to consolidate their learnings or optimize them jointly across different outputs.

Related Issue

  • Closes #
  • Related to #

Checklist


📚 Documentation preview 📚: https://pymc-marketing--1927.org.readthedocs.build/en/1927/

Check out this pull request on ReviewNB to see visual diffs & provide feedback on Jupyter Notebooks.

@github-actions bot added the docs and tests labels Sep 5, 2025

codecov bot commented Sep 5, 2025

Codecov Report

❌ Patch coverage is 77.75120% with 93 lines in your changes missing coverage. Please review.
✅ Project coverage is 91.26%. Comparing base (49ed036) to head (eaae949).
⚠️ Report is 1 commits behind head on main.

Files with missing lines Patch % Lines
pymc_marketing/pytensor_utils.py 74.92% 79 Missing ⚠️
pymc_marketing/mmm/budget_optimizer.py 85.26% 14 Missing ⚠️
Additional details and impacted files
@@            Coverage Diff             @@
##             main    #1927      +/-   ##
==========================================
- Coverage   91.99%   91.26%   -0.73%     
==========================================
  Files          66       66              
  Lines        7843     8253     +410     
==========================================
+ Hits         7215     7532     +317     
- Misses        628      721      +93     

☔ View full report in Codecov by Sentry.

@cetagostini (Contributor, Author) commented Sep 5, 2025

Big kudos to @ricardoV94 because all initial draft functions came from him. I'm just a parrot opening the PRs and showing my use cases!

@cetagostini (Contributor, Author)

PS: Very open to change names, I feel the current names are all quite bad 😅

@TeemuSailynoja (Contributor)

Wow! Just, wow! This will take a moment to process, but looks amazing.

@williambdean (Contributor)

Why one giant PR 😓

@cetagostini (Contributor, Author)

Why one giant PR 😓

Sorry for it, but I added a good level of tests. Maybe try it out over different scenarios (?) and then we can move on! I just want to put it all together. @williambdean

@github-actions bot added the mlflow label Sep 6, 2025
@williambdean (Contributor) commented Sep 7, 2025

There might be a version / behavior change with the latest mlflow. Can you check the version? Maybe an mlflow 3.0 thing.

class MaskedDist:
    """Create a masked deterministic from a Prior over full dims.

    The foal is to reduce the number of parameters in the model by creating a masked deterministic

Sorry for horsing around, but this should be "The goal" 😉

@williambdean (Contributor)

MLflow stuff should be fixed

@cetagostini (Contributor, Author)

MLflow stuff should be fixed

Thanks a lot, you are the best. I'll follow your guide there, and continue adding tests.

Comment on lines +320 to +324
full_dims: tuple[str, ...] = (
    (self.dims if isinstance(self.dims, tuple) else (self.dims,))
    if self.dims
    else ()
)

Prior class should already do this

Labels
API, docs, enhancement, maintenance, major, mlflow, MMM, multidimensional, optimizer, Prior class, tests