Conversation

Dav1dde
Member

@Dav1dde Dav1dde commented Jul 24, 2025

The first big step at supporting inbound filters for logs.

Expansion and normalization of logs must happen in the same stage as inbound filters. Inbound filters must run at the earliest point possible to minimize unnecessary processing and traffic, so normalization must happen at that point as well.

This change enables processing of logs in all Relays except Relays running in proxy mode. Proxy mode does not have access to the necessary processing config to apply inbound filters or do full processing.
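The mode rule above can be sketched as a single exhaustive match. This is a minimal illustration, not Relay's actual code: the `RelayMode` enum and the `processes_logs` function are hypothetical names introduced here, and the variants only mirror the modes mentioned in this description (proxy, static, managed).

```rust
/// Hypothetical stand-in for Relay's operating modes (names are assumptions).
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub enum RelayMode {
    Proxy,
    Static,
    Managed,
}

/// Returns whether logs are expanded and processed in the given mode.
/// Proxy mode is excluded because it has no processing config available.
pub fn processes_logs(mode: RelayMode) -> bool {
    match mode {
        RelayMode::Proxy => false,
        RelayMode::Static | RelayMode::Managed => true,
    }
}
```

Using a `match` without a wildcard arm means that adding a new mode forces an explicit decision about log processing at compile time.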

As a side effect, static and other customer-managed Relays will now (correctly) apply PII rules, as they always should have.

Tests have been improved to always use a Relay chain; size assertions are done on both a trusted and an untrusted Relay chain.

I also replaced the is_from_trusted_relay with a proper enum, to avoid passing somewhat important information around as a boolean. This adds some type safety.
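A minimal sketch of the boolean-to-enum idea, assuming a two-variant enum. The type name `RequestTrust` appears in the diff of this PR, but the variant names and the `is_trusted` helper below are assumptions made for illustration and may not match the real definition.

```rust
/// Trust level of an incoming request.
/// Variant names are assumptions; only the type name comes from the PR.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub enum RequestTrust {
    /// The request came from a Relay that is trusted.
    Trusted,
    /// The request came from an unknown or untrusted source.
    Untrusted,
}

impl RequestTrust {
    /// Hypothetical convenience helper for call sites that only need a yes/no answer.
    pub fn is_trusted(self) -> bool {
        matches!(self, RequestTrust::Trusted)
    }
}
```

Compared to a bare `bool`, a call such as `handle(meta, RequestTrust::Trusted)` documents itself, and the argument cannot be silently swapped with another boolean parameter.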

Things to be done in a follow-up:

  • Actually implement inbound filters
  • Split logs processing into more granular stages: normalization and scrubbing. We may want to run scrubbing only after filters, to avoid doing unnecessary work on data that will be dropped.

Closes: #4990
Closes: INGEST-469

@Dav1dde Dav1dde self-assigned this Jul 25, 2025
@Dav1dde Dav1dde force-pushed the dav1d/process-logs-early branch 2 times, most recently from a9df6b2 to 9f30258 Compare July 29, 2025 13:17
@Dav1dde Dav1dde changed the title ref(logs): Process logs in all managed Relays ref(logs): Process logs in all non-proxy Relays Jul 29, 2025
@Dav1dde Dav1dde force-pushed the dav1d/process-logs-early branch 10 times, most recently from 8aed8e2 to 4067cf8 Compare July 30, 2025 11:03
@Dav1dde Dav1dde marked this pull request as ready for review July 30, 2025 11:03
@Dav1dde Dav1dde requested a review from a team as a code owner July 30, 2025 11:03
@Dav1dde Dav1dde requested review from k-fish and Zylphrex July 30, 2025 11:03
@Dav1dde Dav1dde force-pushed the dav1d/process-logs-early branch from 4067cf8 to 0a9d954 Compare July 30, 2025 11:04
if ctx.is_processing() {
    let mut logs = process::expand(logs, ctx);
    process::process(&mut logs, ctx);
    self.limiter.enforce_quotas(&mut logs, ctx).await?;
Contributor

This means quota enforcement now happens after expansion and processing, when it was the other way around before. I assume that's intentional?

Member Author

Not yet relevant, but it will be once we have inbound filters: we don't want to count things we're going to drop.

Also, the previous order was technically wrong.
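To illustrate why the order matters, here is a toy sketch comparing the two orderings. Nothing in it is Relay code: `Log`, `is_filtered`, and both pipeline functions are hypothetical, the filter rule is invented, and `truncate` is only a crude stand-in for quota enforcement.

```rust
#[derive(Clone, Debug, PartialEq)]
pub struct Log {
    pub body: String,
}

/// Invented stand-in for an inbound filter (not part of this PR).
pub fn is_filtered(log: &Log) -> bool {
    log.body.contains("healthcheck")
}

/// Filter first, then enforce the quota: dropped logs never consume quota.
pub fn filter_then_enforce(logs: Vec<Log>, quota: usize) -> Vec<Log> {
    let mut kept: Vec<Log> = logs.into_iter().filter(|l| !is_filtered(l)).collect();
    kept.truncate(quota); // toy quota: keep at most `quota` logs
    kept
}

/// Enforce the quota first, then filter: filtered logs use up quota slots.
pub fn enforce_then_filter(mut logs: Vec<Log>, quota: usize) -> Vec<Log> {
    logs.truncate(quota);
    logs.into_iter().filter(|l| !is_filtered(l)).collect()
}
```

With three logs where the first one is filtered and a quota of two, the first ordering keeps both remaining logs, while the second keeps only one because the filtered log consumed a quota slot.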

@@ -247,7 +247,7 @@ pub struct RequestMeta<D = PartialDsn> {
     ///
     /// NOTE: This is internal-only and not exposed to Envelope headers.
     #[serde(skip)]
-    from_internal_relay: bool,
+    request_trust: Option<RequestTrust>,
Contributor

Is there an actual advantage to making this optional? AFAICT None has the same meaning as Some(Untrusted) in all cases.

Member Author

Yes, copy_to will now handle a missing value correctly, where before it would not. This was technically a pre-existing bug that became obvious with stricter typing.
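A rough sketch of why `Option` helps when merging metadata. This is not the actual `copy_to` implementation: the `merge_trust` function, its fill-if-missing semantics, and the enum variants are all assumptions chosen to show how `None` ("not set") differs from an explicit untrusted value.

```rust
/// Variant names are assumptions; only the type name comes from the diff above.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub enum RequestTrust {
    Trusted,
    Untrusted,
}

/// Hypothetical merge rule: keep the target's value if it has one,
/// otherwise fall back to the source's value.
pub fn merge_trust(
    source: Option<RequestTrust>,
    target: Option<RequestTrust>,
) -> Option<RequestTrust> {
    target.or(source)
}
```

With a plain `bool`, an unset field and an explicit `false` are indistinguishable, so a merge like this cannot tell whether the target's value should be preserved or filled in.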

@Dav1dde Dav1dde added this pull request to the merge queue Jul 31, 2025
@github-merge-queue github-merge-queue bot removed this pull request from the merge queue due to a conflict with the base branch Jul 31, 2025
@Dav1dde Dav1dde added this pull request to the merge queue Jul 31, 2025
Merged via the queue into master with commit a3aaac2 Jul 31, 2025
29 checks passed
@Dav1dde Dav1dde deleted the dav1d/process-logs-early branch July 31, 2025 07:51
Development

Successfully merging this pull request may close these issues.

Relay PII scrubbing not applied in static Relay
2 participants