Skip to content

Conversation

dibarbet
Copy link
Member

@dibarbet dibarbet commented Jan 9, 2025

Resolves dotnet/vscode-csharp#7638
Resolves microsoft/vscode-dotnettools#1932

Related to dotnet/runtime#64707

Problem

System.Uri will throw attempting to parse URIs that have sub-delims in the host name. This is because it does additional host name validation beyond what is defined in the URI RFC spec. See above linked issues for specific examples.

When we get passed these URIs from VSCode, the server crashes because we cannot successfully deserialize the LSP URI string into a System.Uri. We are unable to easily recover from these as it happens before we get to the queue processing.

Solution

The approach I took in this PR is to stop parsing URIs during LSP message deserialization and remove usages of System.Uri from LSP directly. Instead, I created a wrapper type DocumentUri which initially stores just the string representation of the URI (exactly how LSP defines URIs). The System.Uri can be optionally retrieved from this (which will parse it). Only places which need to extract information from the URI should retrieve the System.Uri and must handle failures.
 
This allows URI parsing to be delayed until much later (for example looking at the scheme or file path to find a matching document in the workspace). While we still cannot parse the URI, we can handle the error and provide a misc document instead of crashing the server. These aren't normal files anyway (they don't conform to the file:///<path> format) and would go to misc regardless of if we succeed in parsing it or not.

If at some point the runtime adds a new parsing mode, or we switch to our own URI parser (similar to O#), this still provides value. Deserialization of the URI string and parsing the URI string are two separate logical operations and should not be tied together. Additionally, any runtime improvements here would only be available in .NET 10.

Key areas to look at

  1. DocumentUri
  2. DocumentUriConverter
  3. LspWorkspaceManager
  4. ProtocolConversions

Razor PR - https://github.com/dotnet/razor/pull/11390/files

@ghost ghost added Area-IDE untriaged Issues and PRs which have not yet been triaged by a lead labels Jan 9, 2025
@dibarbet dibarbet force-pushed the uri_crash branch 9 times, most recently from a6bf829 to e0e73bb Compare January 14, 2025 00:28
/// TODO: document.
/// Converts the LSP spec URI string into our custom wrapper for URI strings.
/// We do not convert directly to <see cref="System.Uri"/> as it is unable to handle
/// certain valid RFC spec URIs. We do not want serialization / deserialization to fail if we cannot parse the URI.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

consider linking to the runtime issue.

// Valid URI, but System.Uri cannot parse it.
[InlineData(true, "perforce://@=1454483/some/file/here/source.cs")]
[InlineData(false, "perforce://@=1454483/some/file/here/source.cs")]
public async Task TestOpenDocumentWithInvalidUri(bool mutatingLspWorkspace, string uriString)
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

new tests for these URIs

// If either of the URIs cannot be parsed, we'll compare the original URI strings.
if (otherUri.ParsedUri is null || this.ParsedUri is null)
{
return this.UriString == otherUri.UriString;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this must be false, right? because we did this exact check above, and would have returned 'true' if it succeeded. so why not return false here?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yup, that should definitely return false. good catch

/// In order to gracefully handle these issues, we defer the parsing of the URI until someone
/// actually asks for it (and can handle the failure).
/// </remarks>
internal sealed class DocumentUri : IEquatable<DocumentUri>
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

the main new wrapper type

/// </summary>
internal class DocumentUriConverter : JsonConverter<Uri>
internal class DocumentUriConverter : JsonConverter<DocumentUri>
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

new serializer for new wrapper type

private class LspUriComparer : IEqualityComparer<Uri>
{
public static readonly LspUriComparer Instance = new();
public bool Equals(Uri? x, Uri? y)
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

moved to the new type

@@ -112,7 +62,7 @@ public int GetHashCode(Uri obj)
/// the URI.
/// <para/> Access to this is guaranteed to be serial by the <see cref="RequestExecutionQueue{RequestContextType}"/>
/// </summary>
private ImmutableDictionary<Uri, (SourceText Text, string LanguageId)> _trackedDocuments = ImmutableDictionary<Uri, (SourceText, string)>.Empty.WithComparers(LspUriComparer.Instance);
private ImmutableDictionary<DocumentUri, (SourceText Text, string LanguageId)> _trackedDocuments = ImmutableDictionary<DocumentUri, (SourceText SourceText, string LanguageId)>.Empty;
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

rest of this file is interesting - contains changes to attempt to parse and handle issues

@dibarbet dibarbet marked this pull request as ready for review January 14, 2025 23:53
@dibarbet dibarbet requested a review from a team as a code owner January 14, 2025 23:53
@jasonmalinowski jasonmalinowski self-requested a review January 15, 2025 02:09
@dibarbet
Copy link
Member Author

As for the pause here - I need to update XAML with these changes, but have been working on infra. Will get back to this shortly

@dibarbet
Copy link
Member Author

dibarbet commented May 5, 2025

As for the pause here - I need to update XAML with these changes, but have been working on infra. Will get back to this shortly

This is ready for re-review. The change is 90% the same, except that I made it backwards compatible by not deleting the original URI properties on the protocol types. Instead they are obsoleted and I added new DocumentUri properties to all the ones consumed by Razor/Xaml. We can then switch them over to the new properties at our own time, instead of managing a triple insertion.

Copy link
Member

@JoeRobich JoeRobich left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Signing off with the understanding that the Uri compatibility shims are only present in the Protocol types used by Razor/Xaml.

@dibarbet
Copy link
Member Author

dibarbet commented May 6, 2025

@dibarbet dibarbet merged commit 4796e92 into dotnet:main May 6, 2025
25 checks passed
@dibarbet dibarbet deleted the uri_crash branch May 6, 2025 17:44
@dotnet-policy-service dotnet-policy-service bot added this to the Next milestone May 6, 2025
@RikkiGibson RikkiGibson modified the milestones: Next, 18.0 P1 Aug 19, 2025
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Area-IDE untriaged Issues and PRs which have not yet been triaged by a lead VSCode
Projects
None yet
5 participants