Narwhals support for CLV aggregation #1809

williambdean · 2025-07-03T15:50:24Z

Description

Still a work in progress.

The LazyFrame-like libraries will require a provided observation_end_date. However, that can be found outside of the

Still building out the functionality for:

remove first observation: the mean with NaNs might differ between backends

Related Issue

Closes #
Related to #

Checklist

Checked that the pre-commit linting/style checks pass. Feel free to comment pre-commit.ci autofix to auto-fix.
Included tests that prove the fix is effective or that the new feature works
Added necessary documentation (docstrings and/or example notebooks) using numpydoc format.
If you are a pro: each commit corresponds to a relevant logical change

📚 Documentation preview 📚: https://pymc-marketing--1809.org.readthedocs.build/en/1809/

codecov · 2025-07-03T15:53:23Z

Codecov Report

Attention: Patch coverage is 26.31579% with 14 lines in your changes missing coverage. Please review.

Project coverage is 40.45%. Comparing base (ca0c420) to head (f4804c8).

Files with missing lines	Patch %	Lines
pymc_marketing/clv/utils.py	26.31%	14 Missing ⚠️

❗ There is a different number of reports uploaded between BASE (ca0c420) and HEAD (f4804c8).

HEAD has 9 uploads less than BASE

Flag BASE (ca0c420) HEAD (f4804c8)

22 13

Additional details and impacted files

@@             Coverage Diff             @@
##             main    #1809       +/-   ##
===========================================
- Coverage   92.28%   40.45%   -51.84%     
===========================================
  Files          62       62               
  Lines        7469     7487       +18     
===========================================
- Hits         6893     3029     -3864     
- Misses        576     4458     +3882


juanitorduz · 2025-07-12T08:11:04Z

OMG! 100% yes! 🥳

ColtAllen · 2025-07-15T07:57:18Z

> OMG! 100% yes! 🥳

Indeed, thanks for starting this!

Do you think the current pandas functions should still be retained for a time even after this is merged? Also, it seems like the PR description in the original message requires more details.

williambdean · 2025-07-24T17:05:51Z

> Do you think the current pandas functions should still be retained for a time even after this is merged? Also, it seems like the PR description in the original message requires more details.

I am just comparing the two at the moment. However, I think the new one should simply take its place.

williambdean · 2025-09-08T06:10:29Z

Maybe @ColtAllen is interested in taking this over?

FBruzzesi

Hey @williambdean 👋🏼 I just found out about this PR 🔥 I left a couple of comments that might help 😇

FBruzzesi · 2025-09-11T12:26:06Z

pymc_marketing/clv/utils.py

+    if observation_period_end is None:
+        observation_period_end = transactions[datetime_col].cast(nw.Datetime).max()


I am not sure if this would work/is supported, but you might try to do:

Suggested change

-    if observation_period_end is None:
-        observation_period_end = transactions[datetime_col].cast(nw.Datetime).max()
+    if observation_period_end is None:
+        observation_period_end = pl.col("max").max()

to get the global max datetime value.

This might also help to avoid this requirement:

> The LazyFrame-like libraries will require a provided observation_end_date. However, that can be found outside of the

FBruzzesi · 2025-09-11T12:27:19Z

pymc_marketing/clv/utils.py

+) -> IntoFrameT:
+    transactions = nw.from_native(transactions)
+
+    date = nw.col(datetime_col).cast(nw.Datetime)


This is very tempting, but consider creating a new column between operations. I would be afraid that, for pandas, the cast happens multiple times instead of once.

FBruzzesi · 2025-09-11T12:29:52Z

pymc_marketing/clv/utils.py

+
+    customers = (
+        nw.from_native(repeated_transactions)
+        .group_by(customer_id_col)


For some time now, it should be possible to pass an expression so that you can avoid the renaming further down the pipeline, but it's definitely more of a personal preference 😇

Suggested change

-        .group_by(customer_id_col)
+        .group_by(nw.col(customer_id_col).alias("customer_id"))

williambdean added 2 commits July 1, 2025 08:36

add to environment

1cd335b

initial pass through

a5bb346

github-actions bot added CLV tests labels Jul 3, 2025

Merge branch 'main' into narwhals-support

ff3087a

Merge branch 'main' into narwhals-support

f4804c8

FBruzzesi reviewed Sep 11, 2025
