Skip to content

Conversation

loewenheim
Copy link
Contributor

@loewenheim loewenheim commented Jun 18, 2025

The FiniteF64 type is intended for f64 values that can't be ±∞ or NaN. It's currently only used for metrics, but measurements ought to work the same way. Therefore, this PR tries to use FiniteF64 in measurements.

  • It moves FiniteF64 to relay-protocol so that all parts of Relay that need it have access to it.
  • It implements a bunch of traits for FiniteF64: FromValue, IntoValue, ProcessValue, Empty, PartialOrd<f64>, PartialEq<f64>, and AddAssign. The only notable thing here is that in order to implement ProcessValue I added a new method process_finite_f64 to the Processor trait, as simply reusing process_f64 wouldn't be sound.

I've hit a roadblock with the statistical functions in normalize/utils.rs. Specifically, calculate_cdf_score returns NaN when the p50 parameter is 0, making it unsound to use with FiniteF64.

How do we deal with this? Do we wrap the calculation in Option and then discard any None (inf or NaN) values we get? As far as I understand, such values never make it to Kafka anyway because of this check in store.rs.

Closes #4817. Closes RELAY-109.

@loewenheim loewenheim force-pushed the sebastian/measurements-finite branch from 461866d to 9284fea Compare June 24, 2025 12:42
@loewenheim loewenheim marked this pull request as ready for review June 24, 2025 12:59
@loewenheim loewenheim requested a review from a team as a code owner June 24, 2025 12:59
@@ -643,80 +641,4 @@ mod tests {
]
"###);
}

#[test]
fn skip_nonfinite_float() {
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This test can't even be expressed now.

@@ -134,6 +134,19 @@ impl<T> Annotated<T> {
})
}

/// Tries to convert a `U` value into `Annotated<T>`.
pub fn try_from<U>(value: U) -> Self
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This function's signature is kinda wild, but it's pretty useful for creating Annotated<FiniteF64>.

@loewenheim loewenheim enabled auto-merge (squash) June 24, 2025 13:22
loewenheim added a commit that referenced this pull request Jun 24, 2025
Some fields of `SpanKafkaMessage` were previously deserialized as
`serde_json::Value`, which was partly caused by having to parse and
validate `f64` values in `measurements`. As of #4828 that validation is
no longer necessary. Consequently, we can now use `RawValue` everywhere.
The `inner_produce_protobuf_span` function needs to match on
`serde_json::Value`, but we can deserialize the raw values there. The
same applies for setting the sample rates in `backfill_data`.

This change also removes some unnecessary `Cow`s in `SpanKafkaMessage`
where nothing was ever changed or inserted.
@loewenheim loewenheim merged commit 61dc452 into master Jun 24, 2025
44 of 47 checks passed
@loewenheim loewenheim deleted the sebastian/measurements-finite branch June 24, 2025 14:09
loewenheim added a commit that referenced this pull request Jun 24, 2025
Some fields of `SpanKafkaMessage` were previously deserialized as
`serde_json::Value`, which was partly caused by having to parse and
validate `f64` values in `measurements`. As of #4828 that validation is
no longer necessary. Consequently, we can now use `RawValue` everywhere.
The `inner_produce_protobuf_span` function needs to match on
`serde_json::Value`, but we can deserialize the raw values there. The
same applies for setting the sample rates in `backfill_data`.

This change also removes some unnecessary `Cow`s in `SpanKafkaMessage`
where nothing was ever changed or inserted.
github-merge-queue bot pushed a commit that referenced this pull request Jun 30, 2025
Some fields of `SpanKafkaMessage` were previously deserialized as
`serde_json::Value`, which was partly caused by having to parse and
validate `f64` values in `measurements`. As of #4828 that validation is
no longer necessary. Consequently, we can now use `RawValue` everywhere.
The `inner_produce_protobuf_span` function needs to match on
`serde_json::Value`, but we can deserialize the raw values there. The
same applies for setting the sample rates in `backfill_data`.

#skip-changelog
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Use FiniteF64 in Measurement
2 participants