Skip to content

Conversation

bricknerb
Copy link
Contributor

@bricknerb bricknerb commented Jul 10, 2025

Added test coverage to demonstrate name lookup following import of a type of a function parameter.
SemIR changes show that the same decl isn't imported multiple times.

Part of #5533.

…mported.

This is the result of not finding that imported declaration when looking up its name, which is the result of unsupported or badly supported use case.
@zygoloid
Copy link
Contributor

This seems to be adding a diagnostic for a case that used to work; can you explain more about what this is for? In #5787 I made the opposite change for classes of the change you're making here for namespaces. The idea there is that we import the class declaration the first time it's reached when importing anything (including when naming the class directly) and subsequent uses reuse that same imported entity, and we only add it to the scope when it's found by name lookup in that scope.

@zygoloid zygoloid self-requested a review July 10, 2025 20:47
@bricknerb
Copy link
Contributor Author

This seems to be adding a diagnostic for a case that used to work; can you explain more about what this is for? In #5787 I made the opposite change for classes of the change you're making here for namespaces. The idea there is that we import the class declaration the first time it's reached when importing anything (including when naming the class directly) and subsequent uses reuse that same imported entity, and we only add it to the scope when it's found by name lookup in that scope.

Thanks! I wasn't aware of that PR.
My approach was that once we import a Clang declaration and create an instruction, we might as well add its name so we don't have to do Clang lookup.
I can see why if the name isn't used, we could save memory by not adding its name, though I assume the import itself would take significantly more memory.
The main thing I've tried to solve is avoiding double import. I've noticed that if we import a declaration before looking up its name, the name lookup would cause another import. In records we avoid it, but not in namespaces etc.
If will apply the idea of trying to always not add the name to the scope before a name lookup and merge with #5787.

@bricknerb bricknerb changed the title When importing a namespace of a parameter type, make sure to add the namespace name to its parent name scope When importing a looked up name, reuse previously imported declaration instead of importing the same declaration twice Jul 11, 2025
@bricknerb bricknerb changed the title When importing a looked up name, reuse previously imported declaration instead of importing the same declaration twice When importing a looked up name, reuse previously imported declaration instead of importing the same declaration again Jul 11, 2025
@bricknerb
Copy link
Contributor Author

This seems to be adding a diagnostic for a case that used to work; can you explain more about what this is for? In #5787 I made the opposite change for classes of the change you're making here for namespaces. The idea there is that we import the class declaration the first time it's reached when importing anything (including when naming the class directly) and subsequent uses reuse that same imported entity, and we only add it to the scope when it's found by name lookup in that scope.

Thanks! I wasn't aware of that PR. My approach was that once we import a Clang declaration and create an instruction, we might as well add its name so we don't have to do Clang lookup. I can see why if the name isn't used, we could save memory by not adding its name, though I assume the import itself would take significantly more memory. The main thing I've tried to solve is avoiding double import. I've noticed that if we import a declaration before looking up its name, the name lookup would cause another import. In records we avoid it, but not in namespaces etc. If will apply the idea of trying to always not add the name to the scope before a name lookup and merge with #5787.

Done, PTAL.

Copy link
Contributor

@zygoloid zygoloid left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks!

Thinking about this some more, I think there's a strong philosophical argument for not adding names to scopes anywhere other than in the name lookup path. Fundamentally, C++ name lookup is complicated and difficult; if we only add names to Carbon scopes when C++ name lookup has told us that name is in that scope, we have a straightforward way to make the Cpp.* name lookup behavior match C++ name lookup. But if we add names to scopes anywhere else, then we're effectively predicting what we think C++ name lookup would do and prepopulating our name lookup tables with that information, and that invites bugs. For example:

namespace N { struct X {}; }
N::X f();
inline namespace { namespace N {} }

Here, we can reference the function f, and call it from Carbon. But lookup for the name Cpp.N should fail with an ambiguity error. However, if we add the name N to scope when creating the namespace, as part of importing the type X, then I think name lookup for Cpp.N would incorrectly succeed if Cpp.f() were used first, and fail otherwise.

The same kind of concern probably comes up in other guises in the import logic: for example, ImportNamespaceDecl is given a parent_scope_id to use as the parent of the created namespace, and that's probably going to be subtly wrong in some cases (anonymous namespaces, inline namespaces, namespace aliases). Instead, I think we should always use the AsCarbonNamespace logic to look up or add the parent namespace, even when we're adding a namespace that was found by name lookup.

Comment on lines 1026 to 1031
if (SemIR::ClangDeclId clang_decl_id = clang_decls.Lookup(clang_decl);
clang_decl_id.has_value()) {
inst_id = clang_decls.Get(clang_decl_id).inst_id;
} else {
inst_id = ImportNameDecl(context, loc_id, scope_id, name_id, clang_decl);
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Performing the check at the top level here will mean we do the same hash table lookup multiple times in some cases. For example, for type handling we will now look up a class type twice, once here and once when we reach MapRecordType. Also the map lookup will never succeed for a declaration of a kind that we don't store in the map, such as a typedef or using declaration, so it's a wasted lookup in those cases.

I think due to the above, it'd be more efficient to sink the clang_decls lookups into the individual decl handling functions. That'd mean that instead of callingImportNamespaceDecl directly in ImportNameDecl, we should be calling AsCarbonNamespace, and we should add logic to do the clang_decls lookup to the ImportFunctionDecl code path.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done, though I think reusing AsCarbonNamespace(), which we use when we only have the namespace declaration but not its scope actually generates another lookup to search for its parent instruction.
Am I missing something?
I could add the scope as another (optional?) parameter so if we have it we can pass it. WDYT?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is another case of trying to predict what clang will do. We don't know what the owning scope of the namespace is, we only know which scope name lookup found the name in, and that's not necessarily the same.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ok, I'll leave it as is for now.

I would like to understand this better though.
If we're searching for a name in Cpp.X, and we find it by a Clang lookup for that name in X, we shouldn't necessarily add that name to Cpp.X?
IIUC, if we don't, the next time we will search for Cpp.X, we will have to do another name lookup in Clang.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe we're talking about different things.

When we do a lookup for Cpp.X, we should always add the result as the value of X in the Carbon name scope Cpp (and likewise in nested scopes), as a way of caching the result. What I'm arguing is that we should not set that Cpp is the parent scope of X when we create X itself -- that isn't necessarily correct. For example:

inline namespace Y {
  namespace X {
    void F();
  }
}

When Carbon code first names Cpp.X.F, Clang's name lookup will find X inside Cpp, and we should add the name to the Carbon name scope for Cpp. When Carbon code first names Cpp.Y.X.F, Clang's name lookup will find X inside Y, and we should add the name to the Carbon name scope for Cpp.Y. But either way, Carbon's parent scope for X should be Y.

So what I'm thinking is that the parent scope in which we found X should not be passed to AsCarbonNamespace (nor to ImportNameDecl or ImportFunctionDecl), because the scope in which we found the name isn't necessarily related to the scope that semantically owns the function. (We shouldn't assume they're the same, because in C++ in general, they're not.)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks, this is super helpful!
I've added a TODO to properly address this as a follow up, hopefully with some tests that catch this.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I believe the TODO was handled in #5821.
Adding inline namespace tests in #5826.

@josh11b josh11b removed their request for review July 11, 2025 21:15
@bricknerb bricknerb requested a review from zygoloid July 14, 2025 08:39
Copy link
Contributor

@zygoloid zygoloid left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks!

clang_decls.Lookup(clang::dyn_cast<clang::Decl>(decl_context));
context_clang_decl_id.has_value()) {
return clang_decls.Get(context_clang_decl_id).inst_id;
// Check if the declaration is already mapped.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Optional: I think we can avoid this check entirely if we rotate the loops below so they treat the given decl_context the same at its enclosing contexts.

Copy link
Contributor Author

@bricknerb bricknerb Jul 16, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks, added a TODO to try and do this in a separate change to avoid delaying this bug fix.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

TODO removed with #5821.
Hopefully this makes sense.

@bricknerb bricknerb enabled auto-merge July 16, 2025 11:10
@bricknerb bricknerb added this pull request to the merge queue Jul 16, 2025
Merged via the queue into carbon-language:trunk with commit 4f8d064 Jul 16, 2025
9 checks passed
@bricknerb bricknerb deleted the class3 branch July 16, 2025 11:37
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants