[AI Evaluation] Add dedicated EvaluationMetric property for reason / justification #6032

@shyamnamboodiripad

Description

Currently, only RelevanceTruthAndCompletenessEvaluator can generate a reason / justification for the score represented in an EvaluationMetric. The reason / justification is included as an EvaluationDiagnostic with Informational severity.
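For context, the current pattern attaches the reason as just another informational diagnostic. A rough sketch of what that looks like today, assuming a `NumericMetric` type and `EvaluationDiagnostic.Informational` / `AddDiagnostics` helpers (names are illustrative and may differ from the shipped API):

```csharp
// Sketch only: today the reason rides along as one diagnostic among many.
var metric = new NumericMetric("Relevance", value: 4);
metric.AddDiagnostics(
    EvaluationDiagnostic.Informational(
        "The response addresses the question directly and uses the supplied context."));
```

Because the reason is indistinguishable from other informational diagnostics, report tooling has no reliable way to surface it specially.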

However, we are starting to see instances where custom evaluators introduce their own ad hoc abstractions to represent the reason / justification. We have also heard feedback that it would be useful to display this information prominently in the evaluation report (instead of displaying it as one among many diagnostics).

This issue tracks the following changes that we would like to introduce:

  • Introduce an optional first-class property (with type string?) on EvaluationMetric to represent the reason / justification.
  • Remove the includeReasoning option on RelevanceTruthAndCompletenessEvaluator and compute the reasons unconditionally, since they are almost always useful (and may also lead to better scoring in general, since we are asking the LLM to think).
  • Update the generated evaluation report to display the reason / justification. For now, we can display it (if available) when hovering over the card for a particular metric. But eventually, it should be possible to click on a metric's card and view the reasons, diagnostics, etc. associated with that metric in a details section below the cards.
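The first bullet above might look roughly like the following. This is a hypothetical sketch, not a committed API shape; the property name `Reason` and the class's other members are illustrative:

```csharp
// Hypothetical sketch of the proposed first-class property on EvaluationMetric.
public abstract class EvaluationMetric
{
    public string Name { get; }

    // Proposed: optional reason / justification for the metric's value,
    // surfaced distinctly from Diagnostics in the evaluation report.
    public string? Reason { get; set; }

    protected EvaluationMetric(string name) => Name = name;
}
```

With a dedicated property in place, RelevanceTruthAndCompletenessEvaluator could set `Reason` directly instead of emitting an informational diagnostic, and the report could render it on (or near) the metric's card.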

Metadata

Labels

area-ai-eval (Microsoft.Extensions.AI.Evaluation and related)
