stephentoub
Member

@stephentoub commented Jul 16, 2025

Adds support to M.E.AI.Abstractions for annotations.

Some design notes:

  • Right now, I have the Annotations property on TextContent. We can push it down to the base later without it being a breaking change.
  • Most services only support citations. OpenAI has a broader notion of annotations, and it looks like Anthropic will be adding other kinds of annotations as well. I modeled this by having a base AIAnnotation and a derived CitationAnnotation.
  • Services vary widely in what they support for citations. I went back and forth on whether CitationAnnotation should be a flat set of properties or be modeled as further derived types, e.g. UrlCitationAnnotation. I've gone with the former, but could be swayed to the latter if y'all think that's wiser. Note that there's a variety of things OpenAI doesn't support or expose that others do, e.g. Google, Anthropic, and AWS all support including snippets of the cited content; I've included such things here, as well.
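
For illustration, the flat option described in the last note might look roughly like the sketch below. The type and property names here (AnnotationSketch, CitationSketch, Title, Url, FileId, Snippet) are hypothetical stand-ins chosen for this example, not the shapes added by this PR. The idea is that every citation field is optional and a consumer checks which ones a given service filled in.

```csharp
#nullable enable
using System;
using System.Collections.Generic;

// Hypothetical base type, standing in for a general annotation.
public class AnnotationSketch
{
    public IDictionary<string, object?>? AdditionalProperties { get; set; }
}

// Flat design: one citation type whose properties are all optional.
// A service that cannot supply a value simply leaves it null.
public class CitationSketch : AnnotationSketch
{
    public string? Title { get; set; }
    public Uri? Url { get; set; }
    public string? FileId { get; set; }
    public string? Snippet { get; set; }
}

public static class CitationFormatter
{
    // Consumers inspect which fields are populated instead of
    // switching on further derived types such as a URL-only citation.
    public static string Describe(AnnotationSketch annotation) => annotation switch
    {
        CitationSketch { Url: not null } c => $"{c.Title ?? "(untitled)"} <{c.Url}>",
        CitationSketch { FileId: not null } c => $"{c.Title ?? "(untitled)"} [file {c.FileId}]",
        CitationSketch c => c.Snippet ?? c.Title ?? "(citation)",
        _ => "(unknown annotation)",
    };
}
```

With further derived types the same switch would match on the concrete type instead; the flat form trades some type safety for fewer types to learn.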

@stephentoub requested a review from a team as a code owner July 16, 2025 18:55
@github-actions github-actions bot added the area-ai Microsoft.Extensions.AI libraries label Jul 16, 2025
Contributor

@Copilot Copilot AI left a comment

Pull Request Overview

Adds support for content annotations across AI abstractions and OpenAI clients, introducing new annotation types, serialization support, client mapping, and tests.

  • Introduce AIAnnotation and derived CitationAnnotation with serialization metadata
  • Add Annotations property to TextContent and update JSON metadata
  • Map provider annotations in OpenAI clients and adjust coalescing logic
  • Extend tests to cover annotation serialization and client integration

Reviewed Changes

Copilot reviewed 15 out of 15 changed files in this pull request and generated no comments.

| File | Description |
| --- | --- |
| src/Libraries/Microsoft.Extensions.AI.Abstractions/Contents/AIAnnotation.cs | Add base annotation type with JSON polymorphic support |
| src/Libraries/Microsoft.Extensions.AI.Abstractions/Contents/CitationAnnotation.cs | Define citation-specific annotation properties |
| src/Libraries/Microsoft.Extensions.AI.Abstractions/Contents/TextContent.cs | Add Annotations property to text content |
| src/Libraries/Microsoft.Extensions.AI.Abstractions/ChatCompletion/ChatResponseExtensions.cs | Update coalescing to skip annotated content |
| src/Libraries/Microsoft.Extensions.AI.OpenAI/OpenAIResponseChatClient.cs | Map service output annotations into TextContent |
| src/Libraries/Microsoft.Extensions.AI.OpenAI/OpenAIChatClient.cs | Attach message-level annotations in chat responses |
| src/Libraries/Microsoft.Extensions.AI.OpenAI/OpenAIAssistantChatClient.cs | Handle streaming text annotations from assistants |
| src/Libraries/Microsoft.Extensions.AI.Abstractions/Microsoft.Extensions.AI.Abstractions.json | Update API metadata for new annotation types |
| test/Libraries/Microsoft.Extensions.AI.Abstractions.Tests/Utilities/AIJsonUtilitiesTests.cs | Adjust test for null-ignoring during serialization |
| test/Libraries/Microsoft.Extensions.AI.Abstractions.Tests/Contents/*.cs | Add unit tests for annotation roundtrips |
| test/Libraries/Microsoft.Extensions.AI.OpenAI.Tests/OpenAIResponseClientIntegrationTests.cs | Add integration test for web search annotations |
| test/Libraries/Microsoft.Extensions.AI.Integration.Tests/ChatClientIntegrationTests.cs | Refactor _chatClient field to ChatClient property |

Comments suppressed due to low confidence (5)

test/Libraries/Microsoft.Extensions.AI.OpenAI.Tests/OpenAIResponseClientIntegrationTests.cs:23

  • Consider adding a streaming version of this annotation test (e.g., using GetStreamingResponseAsync) to ensure annotations are correctly surfaced in streaming scenarios as well.
    [ConditionalFact]

src/Libraries/Microsoft.Extensions.AI.OpenAI/OpenAIAssistantChatClient.cs:218

  • Add unit tests for the OpenAIAssistantChatClient to validate that TextAnnotation mappings (placeholder, start/end index, tool names) are correctly applied in streaming responses.
                    if (mcu.TextAnnotation is { } tau)

src/Libraries/Microsoft.Extensions.AI.OpenAI/OpenAIAssistantChatClient.cs:218

  • [nitpick] The variable name tau is not immediately descriptive. Consider renaming it to textAnnotation for clarity.
                    if (mcu.TextAnnotation is { } tau)

src/Libraries/Microsoft.Extensions.AI.OpenAI/OpenAIChatClient.cs:490

  • [nitpick] Consider replacing the LINQ OfType<>().FirstOrDefault() call with a simple loop to find the first TextContent to avoid potential overhead in high-throughput scenarios.
            TextContent? annotationContent = returnMessage.Contents.OfType<TextContent>().FirstOrDefault();
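
A minimal sketch of the loop alternative this nitpick suggests, assuming an indexable list of contents. ContentSketch, TextSketch, OtherSketch, and FirstText are hypothetical names invented for this example; whether the change is measurably faster in practice is not established here.

```csharp
#nullable enable
using System.Collections.Generic;

// Hypothetical stand-ins for a content base type and its text subtype.
public abstract class ContentSketch { }
public sealed class TextSketch : ContentSketch { public string Text { get; set; } = ""; }
public sealed class OtherSketch : ContentSketch { }

public static class ContentSearch
{
    // Returns the first text item, or null if there is none.
    // Same result as OfType<TextSketch>().FirstOrDefault(), but without
    // allocating a LINQ iterator.
    public static TextSketch? FirstText(IList<ContentSketch> contents)
    {
        for (int i = 0; i < contents.Count; i++)
        {
            if (contents[i] is TextSketch text)
            {
                return text;
            }
        }

        return null;
    }
}
```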

src/Libraries/Microsoft.Extensions.AI.Abstractions/ChatCompletion/ChatResponseExtensions.cs:236

  • [nitpick] The nested TryAsCoalescable helper improves clarity but could be extracted as a private method or moved outside the loop for readability and easier unit testing.
                static bool TryAsCoalescable(AIContent content, [NotNullWhen(true)] out TContent? coalescable)

Member

@SteveSandersonMS left a comment

Looks good!

I added some questions and comments on the parts whose meaning I think may be hard to interpret, but the broad concept makes sense and feels valuable.

> Note that there's a variety of things OpenAI doesn't support or expose that others do, e.g. Google, Anthropic, and AWS all support including snippets of the cited content; I've included such things here, as well.

That sounds wise to me. Including a "snippet" is extremely helpful to the app developer - maybe OpenAI (or its client library) will add that later. It's good we're not restricted to a pure lowest-common-denominator set of information and instead are modelling what's common across a broad range of providers.

@rogerbarreto
Contributor

It's worth checking the W3C Annotation Model.

It could help in defining the specialized annotations, which that document calls Selectors:

  • TextPositionSelector
  • SvgSelector
  • FragmentSelector
  • CssSelector
  • XPathSelector
  • RangeSelector ...

@stephentoub
Member Author

I've addressed the feedback.

@rogerbarreto and @SteveSandersonMS, could you both please take another look, including at the naming choices I made while addressing Roger's feedback?

@eiriktsarpalis, can you also review please?

Thanks.

@SteveSandersonMS
Member

@stephentoub Looks good to me. I see you've ended up going for the most general design. Given that it's a relatively advanced feature, that seems a reasonable tradeoff even if people have to do a bit of casting to access character indices. Moving annotations to AIContent sounds good too if you think that's where it would end up going in the long run anyway.
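
The "bit of casting" mentioned above might look something like the following sketch, assuming an annotation exposes its target regions through a general base type and a text-specific subtype carries the character indices. RegionSketch, TextSpanRegionSketch, StartIndex, EndIndex, and the exclusive-end convention are all assumptions made for this example rather than the API in this PR.

```csharp
#nullable enable
using System.Collections.Generic;

// Hypothetical general region type: says nothing about how the target is addressed.
public abstract class RegionSketch { }

// Hypothetical text-specific region carrying character indices.
public sealed class TextSpanRegionSketch : RegionSketch
{
    public int StartIndex { get; set; }
    public int EndIndex { get; set; } // assumed exclusive for this sketch
}

public static class RegionReader
{
    // Yields the substring covered by each text-span region,
    // skipping other region kinds and out-of-range spans.
    public static IEnumerable<string> CitedText(string text, IEnumerable<RegionSketch> regions)
    {
        foreach (RegionSketch region in regions)
        {
            if (region is TextSpanRegionSketch { StartIndex: >= 0 } span &&
                span.StartIndex <= span.EndIndex &&
                span.EndIndex <= text.Length)
            {
                yield return text.Substring(span.StartIndex, span.EndIndex - span.StartIndex);
            }
        }
    }
}
```

The pattern match is the cost of the general design: callers who only care about text spans pay one type test per region, while other region kinds can be added later without changing the base type.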

This was referenced Aug 27, 2025
@github-actions github-actions bot locked and limited conversation to collaborators Aug 28, 2025