Standardizing XML Tool Call Parsing for Language Models #15703

bold84 · 2025-08-31T18:57:45Z

Hi everyone,

As we continue to expand tool calling support for various language models in llama.cpp, we're encountering significant challenges with XML-based tool calling formats. Developers have been working on support for models like GLM-4.5 and Qwen3-Coder, and the same issues keep coming up:

- Inconsistent XML formats: each model has a slightly different XML structure for tool calls
- Parsing complexity: handling nested parameters, whitespace, and incomplete tags
- Template compatibility: Jinja templates need careful adjustment for each model
- Type conversion challenges: converting between string and typed arguments

We believe it's time to establish a more standardized approach to XML tool call parsing that can be extended for different models while maintaining consistency. This would help:

- Reduce code duplication across model-specific implementations
- Improve reliability of tool calling across different quantizations
- Simplify the process of adding support for new XML-based models
- Create a foundation for better testing and validation

We'd like to invite the community to discuss:

- What components should be standardized in XML tool call parsing?
- How can we create extensible parsers that work across different models?
- What are the best practices for handling edge cases in XML parsing?
- How should we approach type conversion and validation?
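
As one possible starting point for the type conversion question, here is a minimal Python sketch that coerces string arguments using the `type` field of a JSON-Schema-style parameter description. The names `coerce_argument` and `coerce_arguments` and the shape of `properties` are assumptions made for illustration; none of this is existing llama.cpp code.

```python
import json


def coerce_argument(raw: str, schema_type: str):
    """Convert one raw string argument to the type named by `schema_type`."""
    text = raw.strip()
    if schema_type == "string":
        return raw  # keep strings untouched, including whitespace
    if schema_type == "integer":
        return int(text)
    if schema_type == "number":
        return float(text)
    if schema_type == "boolean":
        lowered = text.lower()
        if lowered in ("true", "false"):
            return lowered == "true"
        raise ValueError(f"not a boolean: {raw!r}")
    if schema_type in ("object", "array"):
        return json.loads(text)
    raise ValueError(f"unsupported type: {schema_type}")


def coerce_arguments(raw_args: dict, properties: dict) -> dict:
    """Coerce every argument; names missing from the schema stay strings."""
    return {
        name: coerce_argument(value, properties.get(name, {}).get("type", "string"))
        for name, value in raw_args.items()
    }


# Hypothetical tool schema and raw arguments as they might come out of an XML parser.
properties = {"city": {"type": "string"}, "days": {"type": "integer"}}
typed = coerce_arguments({"city": "Paris", "days": " 3 "}, properties)
```

Raising `ValueError` on a bad value (rather than silently passing the string through) is a design choice in this sketch, so that validation failures surface early.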

Let's work together to create a robust, extensible foundation for XML-based tool calling in llama.cpp. Please share your experiences, ideas, and suggestions below!

ExtReMLapin · 2025-08-31T19:42:50Z

Besides XML parsing, I think it's important to set requirements for the tool-calling implementations. For example, some of them are missing:

Streaming mode support:
- Misc. bug: Granite chat parser doesn't stream content section #15681
- Model: Seed OSS thinking + tool call support #15552 (comment)

Thinking mode + `tool_choice` = required:
- Qwen3: Fixes #15247 | Update chat.cpp to support (at least) qwen3 reasoning + tool_choice = required #15248
- Granite, DeepSeek, gpt-oss: Fixes #15247 | Update chat.cpp to support (at least) qwen3 reasoning + tool_choice = required #15248 (comment)
- Seed OSS: Model: Seed OSS thinking + tool call support #15552 (comment)

Keeping a list of supported models would be nice, maybe with feature-level tool support (parallel vs. single tool calling).

1 reply

pwilkin Sep 1, 2025
Collaborator

I'll fix the tool_choice = required mode for Seed OSS as soon as I get some free time for it.

marceldev89 · 2025-08-31T19:44:34Z

I think ideally it should parse the XML and transform it to JSON so that the (generic) JSON parser can handle it. That would make streaming tool call content work as well. In theory anyway.

From what I gathered while trying to make that work with the XML PR, there's a whole "healing" system implemented for JSON that (seemingly) handles the streaming content for tool calls.
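
To make the XML-to-JSON idea concrete, here is a rough Python sketch. The tag layout (`<function=NAME>` wrapping `<parameter=KEY>` blocks) is a hypothetical format chosen for illustration, and `xml_tool_calls_to_json` is an invented name; real models differ in their exact tags, and this is not how llama.cpp implements it.

```python
import json
import re

# Hypothetical tag layout, for illustration only. It is not well-formed XML,
# so regular expressions are used here instead of an XML library.
FUNCTION_RE = re.compile(r"<function=([^>\s]+)>(.*?)</function>", re.DOTALL)
PARAMETER_RE = re.compile(r"<parameter=([^>\s]+)>(.*?)</parameter>", re.DOTALL)


def xml_tool_calls_to_json(text: str) -> str:
    """Turn every complete <function=...> block in `text` into a JSON array string."""
    calls = []
    for name, body in FUNCTION_RE.findall(text):
        # Values are kept as strings; typed conversion would be a separate step.
        arguments = {key: value.strip() for key, value in PARAMETER_RE.findall(body)}
        calls.append({"name": name, "arguments": arguments})
    return json.dumps(calls)


sample = (
    "<tool_call>\n<function=get_weather>\n"
    "<parameter=city>\nParis\n</parameter>\n"
    "<parameter=days>3</parameter>\n"
    "</function>\n</tool_call>"
)
converted = xml_tool_calls_to_json(sample)
```

The resulting string can then go through whatever generic JSON handling already exists. Note that this sketch only picks up complete blocks; unclosed tags in a partial stream are skipped.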

2 replies

pwilkin Sep 1, 2025
Collaborator

Yeah, the healing system should be incorporated here as well. Basically, the healing marker works by inserting a unique string marker; the expected approach is then to take the JSON dump string and cut it at the start of the healing marker.

I didn't want to bother with it since there were a lot of issues anyway, so I just left the tool calls as non-streaming, but that should ideally be fixed as well (though it's a non-correctness issue, so it's a bit low on the list of priorities).
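
The marker-and-cut idea described above can be illustrated with a small Python sketch. This is only a toy model of the technique, not the llama.cpp code: `HEALING_MARKER` and `partial_arguments_json` are hypothetical names, and the marker is assumed never to occur in real argument text.

```python
import json

# Hypothetical marker; it only needs to be a string that cannot appear in real arguments.
HEALING_MARKER = "__HEAL_7f3a9c__"


def partial_arguments_json(complete: dict, partial_key: str, partial_value: str) -> str:
    """Return the JSON prefix that is safe to stream while one value is still arriving."""
    healed = dict(complete)
    # Append the marker to the unfinished value so the object can be dumped as valid JSON.
    healed[partial_key] = partial_value + HEALING_MARKER
    dumped = json.dumps(healed)
    # Cut at the start of the marker: everything before it is already final.
    return dumped[: dumped.index(HEALING_MARKER)]


earlier = partial_arguments_json({"city": "Paris"}, "note", "bring an umb")
later = partial_arguments_json({"city": "Paris"}, "note", "bring an umbrella")
```

Because each cut string is a prefix of the next one, a streaming client can be sent only the newly added suffix on every update.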

marceldev89 Sep 1, 2025

> (though it's a non-correctness issue, so it's a bit low on the list of priorities).

Yeah, definitely, but it would be cool to have it working for the fancy. 😛
