[AI Evaluation] Add dedicated EvaluationMetric property for reason / justification #6032

@shyamnamboodiripad

Description

Currently, only RelevanceTruthAndCompletenessEvaluator can generate a reason / justification for the score represented in an EvaluationMetric. The reason / justification is included as an EvaluationDiagnostic with Informational severity.
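For context, the current pattern attaches the reason as just another informational diagnostic. A rough sketch of what that looks like today, assuming a `NumericMetric` type and `EvaluationDiagnostic.Informational` / `AddDiagnostics` helpers (names are illustrative and may differ from the shipped API):

```csharp
// Sketch only: today the reason rides along as one diagnostic among many.
var metric = new NumericMetric("Relevance", value: 4);
metric.AddDiagnostics(
    EvaluationDiagnostic.Informational(
        "The response addresses the question directly and uses the supplied context."));
```

Because the reason is indistinguishable from other informational diagnostics, report tooling has no reliable way to surface it specially.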

However, we are starting to see instances where custom evaluators introduce their own ad hoc abstractions to represent the reason / justification. We have also heard feedback that it would be useful to display this information prominently in the evaluation report (instead of displaying it as one among many diagnostics).

This issue tracks the following changes that we would like to introduce:

  • Introduce an optional first-class property (with type string?) on EvaluationMetric to represent the reason / justification.
  • Remove the includeReasoning option on RelevanceTruthAndCompletenessEvaluator and compute the reasons unconditionally, since they are almost always useful (and may also lead to better scoring in general, since we are asking the LLM to think).
  • Update the generated evaluation report to display the reason / justification. For now, we can display it (if available) when hovering over the card for a particular metric. But eventually, it should be possible to click on a metric's card and view the reasons, diagnostics, etc. associated with that metric in a details section below the cards.
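The first bullet above might look roughly like the following. This is a hypothetical sketch, not a committed API shape; the property name `Reason` and the class's other members are illustrative:

```csharp
// Hypothetical sketch of the proposed first-class property on EvaluationMetric.
public abstract class EvaluationMetric
{
    public string Name { get; }

    // Proposed: optional reason / justification for the metric's value,
    // surfaced distinctly from Diagnostics in the evaluation report.
    public string? Reason { get; set; }

    protected EvaluationMetric(string name) => Name = name;
}
```

With a dedicated property in place, RelevanceTruthAndCompletenessEvaluator could set `Reason` directly instead of emitting an informational diagnostic, and the report could render it on (or near) the metric's card.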

Metadata

Labels

area-ai-eval (Microsoft.Extensions.AI.Evaluation and related)
