Skip to content

Conversation

xinyao27
Copy link

Background

Tools that generate media content (screenshots, images, etc.) currently need to return base64-encoded data directly, which makes tool implementations more complex and less efficient. Many tools naturally produce URLs (from CDN, cloud storage, or generation services) but have to download and encode the content themselves to work with models that don't support URLs.

Summary

Added automatic URL to base64 conversion for media content in tool result outputs. When a tool returns a URL in its media content and the model doesn't support URLs, the SDK will automatically:

  • Download the content from the URL
  • Convert it to base64 format
  • Include it in the prompt sent to the model

This allows tools to return URLs instead of base64 data, making tool implementations cleaner while ensuring compatibility with all models.

Manual Verification

  • Created a test tool that returns image URLs in its output
  • Verified that URLs are automatically downloaded when the model doesn't support them
  • Confirmed that base64 conversion happens correctly for downloaded content
  • Tested with both URL-supporting and non-URL-supporting model configurations
  • Verified existing functionality remains unchanged

Tasks

  • Tests have been added / updated (for bug fixes / features)
  • Documentation has been added / updated (for bug fixes / features)
  • A patch changeset for relevant packages has been added (for bug fixes / features - run pnpm changeset in the project root)
  • Formatting issues have been fixed (run pnpm prettier-fix in the project root)
  • I have reviewed this pull request (self-review)

Future Work

Related Issues

Add support for converting URLs to base64 in tool result outputs when the model
doesn't support URLs. This enables tools to return media content as URLs which
will be automatically downloaded and converted to base64 during prompt conversion.

Changes:
- Extract media URLs from tool result content for downloading
- Convert downloaded media to base64 in tool result outputs
- Add test coverage for URL conversion in tool results

This allows tools like screenshot or image generation tools to return URLs
instead of base64 data, making the tool implementation cleaner while ensuring
compatibility with models that don't support URLs.
@lgrammel
Copy link
Collaborator

can you resolve merge conflicts?

Copy link
Collaborator

@gr2m gr2m left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think there are some ways to simplify the implementation, it's a bit hard to read. Could you have another look?

Comment on lines +411 to +428
if (result.type === 'media' && result.data && result.mediaType) {
const { data: convertedData, mediaType: convertedMediaType } =
convertToLanguageModelV2DataContent(result.data);
if (convertedData instanceof URL) {
const downloadedFile = downloadedAssets[convertedData.toString()];
if (downloadedFile) {
return {
type: 'media' as const,
data: convertToBase64(downloadedFile.data),
mediaType:
downloadedFile.mediaType ??
convertedMediaType ??
result.mediaType,
};
}
}
}
return result;
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

to avoid nesting, can you please implement an early return pattern?

const downloadedFiles = await Promise.all(
allUrls.map(async url => ({
url,
data: await downloadImplementation({ url }),
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The variable downloadImplementation is undefined - it should be download to reference the function parameter.

View Details
📝 Patch Details
diff --git a/packages/ai/src/prompt/convert-to-language-model-prompt.test.ts b/packages/ai/src/prompt/convert-to-language-model-prompt.test.ts
index d34044049..21707edb9 100644
--- a/packages/ai/src/prompt/convert-to-language-model-prompt.test.ts
+++ b/packages/ai/src/prompt/convert-to-language-model-prompt.test.ts
@@ -1455,12 +1455,13 @@ describe('convertToLanguageModelMessage', () => {
           ],
         },
         supportedUrls: {},
-        downloadImplementation: async ({ url }) => {
-          expect(url).toEqual(new URL('https://example.com/screenshot.png'));
-          return {
+        download: async (downloads) => {
+          expect(downloads).toHaveLength(1);
+          expect(downloads[0].url).toEqual(new URL('https://example.com/screenshot.png'));
+          return [{
             data: new Uint8Array([137, 80, 78, 71]), // PNG magic bytes
             mediaType: 'image/png',
-          };
+          }];
         },
       });
 
diff --git a/packages/ai/src/prompt/convert-to-language-model-prompt.ts b/packages/ai/src/prompt/convert-to-language-model-prompt.ts
index 13ddda99a..1f7b3a7e1 100644
--- a/packages/ai/src/prompt/convert-to-language-model-prompt.ts
+++ b/packages/ai/src/prompt/convert-to-language-model-prompt.ts
@@ -279,30 +279,35 @@ async function downloadAssets(
     )
     .map(item => item!.data);
 
+  // Extract URLs from plannedDownloads that need to be downloaded
+  const urls = plannedDownloads
+    .filter(download => !download.isUrlSupportedByModel)
+    .map(download => download.url);
+
   const allUrls = [...urls, ...toolUrls];
 
-  // download in parallel:
-  const downloadedFiles = await Promise.all(
-    allUrls.map(async url => ({
-      url,
-      data: await downloadImplementation({ url }),
-    })),
-  );
+  // download in parallel using the batch download function:
+  const downloadRequests = [
+    ...plannedDownloads.filter(download => !download.isUrlSupportedByModel),
+    ...toolUrls.map(url => ({ url, isUrlSupportedByModel: false }))
+  ];
+  
+  const downloadResults = await download(downloadRequests);
 
   return Object.fromEntries(
-    downloadedFiles
-      .filter(
-        (
-          downloadedFile,
-        ): downloadedFile is {
-          mediaType: string | undefined;
-          data: Uint8Array;
-        } => downloadedFile?.data != null,
-      )
-      .map(({ data, mediaType }, index) => [
-        plannedDownloads[index].url.toString(),
-        { data, mediaType },
-      ]),
+    downloadResults
+      .map((result, index) => {
+        if (result?.data != null) {
+          return [
+            downloadRequests[index].url.toString(),
+            { data: result.data, mediaType: result.mediaType },
+          ];
+        }
+        return null;
+      })
+      .filter((entry): entry is [string, { data: Uint8Array; mediaType: string | undefined }] => 
+        entry !== null
+      ),
   );
 }
 

Analysis

Undefined Variables in Download Assets Function

Issue Summary

The downloadAssets function in packages/ai/src/prompt/convert-to-language-model-prompt.ts contains multiple undefined variables that cause TypeScript compilation errors and would result in runtime failures.

Specific Problems

1. Undefined Variable: downloadImplementation (Line 288)

The code calls downloadImplementation({ url }) but this variable does not exist in scope. The function parameter is named download.

// Current (incorrect) code:
data: await downloadImplementation({ url }),

// Should be:
data: await download(plannedDownloads.filter(...)),

2. Undefined Variable: urls (Line 282)

The code references [...urls, ...toolUrls] but urls is never defined. This should be extracting URLs from the plannedDownloads array.

// Current (incorrect) code:
const allUrls = [...urls, ...toolUrls];

// Should extract URLs from plannedDownloads:
const urls = plannedDownloads
  .filter(download => !download.isUrlSupportedByModel)
  .map(download => download.url);

3. Function Signature Mismatch

The DownloadFunction type expects an array of download requests and returns an array of results, but the current code calls it with individual URL objects. This architectural mismatch shows the code expects batch downloads but implements individual calls.

Compilation Impact

These errors prevent the TypeScript build from completing:

packages/ai/src/prompt/convert-to-language-model-prompt.ts(282,23): error TS2552: Cannot find name 'urls'. Did you mean 'URL'?
packages/ai/src/prompt/convert-to-language-model-prompt.ts(288,19): error TS2552: Cannot find name 'downloadImplementation'. Did you mean 'DOMImplementation'?

Runtime Impact

If the TypeScript errors were somehow bypassed, this would cause:

  • ReferenceError: downloadImplementation is not defined when URLs need downloading
  • ReferenceError: urls is not defined when processing URL collections
  • Application crashes when users upload files or images with URLs

Root Cause

This appears to be incomplete refactoring where:

  1. A download parameter was renamed but call sites weren't updated
  2. URL extraction logic was removed but the usage wasn't updated
  3. The function signature was changed from individual to batch processing without updating the implementation

url,
data: await downloadImplementation({ url }),
})),
);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The array index mapping is incorrect - plannedDownloads[index] references the wrong array when both regular URLs and tool URLs are processed.

View Details
📝 Patch Details
diff --git a/packages/ai/src/prompt/convert-to-language-model-prompt.test.ts b/packages/ai/src/prompt/convert-to-language-model-prompt.test.ts
index d34044049..00c87dcf3 100644
--- a/packages/ai/src/prompt/convert-to-language-model-prompt.test.ts
+++ b/packages/ai/src/prompt/convert-to-language-model-prompt.test.ts
@@ -1455,15 +1455,17 @@ describe('convertToLanguageModelMessage', () => {
           ],
         },
         supportedUrls: {},
-        downloadImplementation: async ({ url }) => {
-          expect(url).toEqual(new URL('https://example.com/screenshot.png'));
-          return {
-            data: new Uint8Array([137, 80, 78, 71]), // PNG magic bytes
-            mediaType: 'image/png',
+        download: async (downloads) => {
+          return downloads.map(({ url }) => {
+            expect(url).toEqual(new URL('https://example.com/screenshot.png'));
+            return {
+              data: new Uint8Array([137, 80, 78, 71]), // PNG magic bytes
+              mediaType: 'image/png',
           };
+          });
+
         },
       });
-
       expect(result).toEqual([
         {
           role: 'tool',
diff --git a/packages/ai/src/prompt/convert-to-language-model-prompt.ts b/packages/ai/src/prompt/convert-to-language-model-prompt.ts
index 13ddda99a..bb5650a72 100644
--- a/packages/ai/src/prompt/convert-to-language-model-prompt.ts
+++ b/packages/ai/src/prompt/convert-to-language-model-prompt.ts
@@ -279,30 +279,27 @@ async function downloadAssets(
     )
     .map(item => item!.data);
 
-  const allUrls = [...urls, ...toolUrls];
+  const urls = plannedDownloads.map(pd => pd.url);
+  const allDownloads = [
+    ...plannedDownloads.map(pd => ({ url: pd.url, isUrlSupportedByModel: pd.isUrlSupportedByModel })),
+    ...toolUrls.map(url => ({ url, isUrlSupportedByModel: false }))
+  ];
 
   // download in parallel:
-  const downloadedFiles = await Promise.all(
-    allUrls.map(async url => ({
-      url,
-      data: await downloadImplementation({ url }),
-    })),
-  );
+  const downloadedFiles = await download(allDownloads);
 
   return Object.fromEntries(
     downloadedFiles
-      .filter(
-        (
-          downloadedFile,
-        ): downloadedFile is {
-          mediaType: string | undefined;
-          data: Uint8Array;
-        } => downloadedFile?.data != null,
-      )
-      .map(({ data, mediaType }, index) => [
-        plannedDownloads[index].url.toString(),
-        { data, mediaType },
-      ]),
+      .map((downloadedFile, index) => {
+        if (downloadedFile?.data != null) {
+          return [
+            allDownloads[index].url.toString(),
+            { data: downloadedFile.data, mediaType: downloadedFile.mediaType },
+          ];
+        }
+        return null;
+      })
+      .filter(entry => entry !== null) as [string, { data: Uint8Array; mediaType: string | undefined }][],
   );
 }
 

Analysis

Array Index Mapping Bug in URL Download Function

Technical Issue Description

The downloadAssets function in packages/ai/src/prompt/convert-to-language-model-prompt.ts contains multiple critical bugs that prevent compilation and would cause runtime errors:

Primary Issues Identified

  1. Undefined Variable urls (Line 282): The code references urls in const allUrls = [...urls, ...toolUrls]; but this variable is never defined. The TypeScript compiler fails with error TS2552: Cannot find name 'urls'.

  2. Undefined Function downloadImplementation (Line 288): The code calls downloadImplementation({ url }) but this function doesn't exist. The correct parameter is download passed to the function.

  3. Array Index Mismatch (Line 303): As reported, the mapping logic uses plannedDownloads[index].url.toString() but downloadedFiles corresponds to allUrls, not plannedDownloads. When tool URLs are present, this creates an index mismatch that would cause runtime errors.

  4. Type Predicate Error: The filter function expects objects with url property but receives objects with different structure.

Code Analysis

The downloadAssets function:

  • Extracts URLs from user messages into plannedDownloads (objects with url property)
  • Extracts URLs from tool messages into toolUrls (just URL objects)
  • Attempts to combine them in allUrls = [...urls, ...toolUrls] but urls is undefined
  • Downloads files parallel to allUrls array
  • Maps results back using plannedDownloads[index] which is incorrect

Impact

  • Compilation Failure: TypeScript compilation fails with multiple errors
  • Runtime Errors: If somehow compiled, would throw TypeError when accessing undefined array elements
  • Data Corruption: Incorrect URL mapping would return wrong file associations

Root Cause

The code appears to be in an incomplete or partially refactored state where:

  1. urls should be plannedDownloads.map(p => p.url) or similar extraction
  2. downloadImplementation should be the download parameter
  3. The final mapping should use allUrls[index] instead of plannedDownloads[index]

This represents a fundamental logical error in array correspondence that would prevent the feature from working correctly when both user message URLs and tool message URLs are present.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants