Skip to content

Conversation

owenca
Copy link
Contributor

@owenca owenca commented Oct 4, 2025

Fixes #160513

@llvmbot
Copy link
Member

llvmbot commented Oct 4, 2025

@llvm/pr-subscribers-clang-format

Author: owenca (owenca)

Changes

Fixes #160513


Full diff: https://github.com/llvm/llvm-project/pull/161944.diff

2 Files Affected:

  • (modified) clang/lib/Format/TokenAnnotator.cpp (+1-1)
  • (modified) clang/unittests/Format/TokenAnnotatorTest.cpp (+5)
diff --git a/clang/lib/Format/TokenAnnotator.cpp b/clang/lib/Format/TokenAnnotator.cpp index 0c9c88a426c89..99d537f513901 100644 --- a/clang/lib/Format/TokenAnnotator.cpp +++ b/clang/lib/Format/TokenAnnotator.cpp @@ -3802,7 +3802,7 @@ static bool isFunctionDeclarationName(const LangOptions &LangOpts, const auto *Prev = Current.getPreviousNonComment(); assert(Prev); - if (Prev->is(tok::coloncolon)) + if (Prev->is(tok::coloncolon) && Prev->hasWhitespaceBefore()) Prev = Prev->Previous; if (!Prev) diff --git a/clang/unittests/Format/TokenAnnotatorTest.cpp b/clang/unittests/Format/TokenAnnotatorTest.cpp index 4a8f27f656f1d..e004a6f3f02f5 100644 --- a/clang/unittests/Format/TokenAnnotatorTest.cpp +++ b/clang/unittests/Format/TokenAnnotatorTest.cpp @@ -1129,6 +1129,11 @@ TEST_F(TokenAnnotatorTest, UnderstandsOverloadedOperators) { ASSERT_EQ(Tokens.size(), 7u) << Tokens; // Not TT_FunctionDeclarationName. EXPECT_TOKEN(Tokens[3], tok::kw_operator, TT_Unknown); + + Tokens = annotate("SomeAPI::operator()();"); + ASSERT_EQ(Tokens.size(), 9u) << Tokens; + // Not TT_FunctionDeclarationName. + EXPECT_TOKEN(Tokens[2], tok::kw_operator, TT_Unknown); } TEST_F(TokenAnnotatorTest, OverloadedOperatorInTemplate) { 
assert(Prev);

if (Prev->is(tok::coloncolon))
if (Prev->is(tok::coloncolon) && Prev->hasWhitespaceBefore())
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

But SomeAPI ::operator()(); has space before and still gets miss annotated.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

But SomeAPI ::operator()(); has space before and still gets miss annotated.

Not necessarily. See e.g.

Tokens = annotate("friend ostream& ::operator<<(ostream& lhs, foo& rhs);");

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

But I feed exactly that string and it got the TT_FunctionDeclarationName.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, because it is a function declaration there. So we annotate this kind of constructs as follows:

Foo::operator()(); // function call Bar ::operator<<(); // function declaration
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No it is not. It's the same, just with a space in the input:

echo "SomeAPI::operator()(); SomeAPI ::operator()();" | clang-format --debug-only=format-token-annotator AnnotatedTokens(L=0, P=0, T=6, C=0): I=0 M=0 C=0 T=Unknown S=1 F=0 B=0 BK=0 P=0 Name=identifier N=0 L=7 PPK=2 FakeLParens= FakeRParens=0 II=0x558d281139e0 Text='SomeAPI' I=0 M=0 C=0 T=Unknown S=0 F=0 B=0 BK=0 P=23 Name=coloncolon N=0 L=9 PPK=2 FakeLParens= FakeRParens=0 II=0x0 Text='::' I=0 M=0 C=1 T=Unknown S=0 F=0 B=0 BK=0 P=520 Name=operator N=0 L=17 PPK=2 FakeLParens= FakeRParens=0 II=0x558d281067c8 Text='operator' I=0 M=0 C=0 T=OverloadedOperator S=0 F=0 B=0 BK=0 P=23 Name=l_paren N=0 L=18 PPK=2 FakeLParens= FakeRParens=0 II=0x0 Text='(' I=0 M=0 C=0 T=OverloadedOperator S=0 F=0 B=0 BK=0 P=140 Name=r_paren N=1 L=19 PPK=2 FakeLParens= FakeRParens=0 II=0x0 Text=')' I=0 M=0 C=0 T=OverloadedOperatorLParen S=0 F=0 B=0 BK=0 P=23 Name=l_paren N=0 L=20 PPK=2 FakeLParens= FakeRParens=0 II=0x0 Text='(' I=0 M=0 C=0 T=Unknown S=0 F=0 B=0 BK=0 P=140 Name=r_paren N=1 L=21 PPK=2 FakeLParens= FakeRParens=0 II=0x0 Text=')' I=0 M=0 C=0 T=Unknown S=0 F=0 B=0 BK=0 P=23 Name=semi N=0 L=22 PPK=2 FakeLParens= FakeRParens=0 II=0x0 Text=';' ---- AnnotatedTokens(L=0, P=0, T=6, C=0): I=0 M=0 C=0 T=Unknown S=1 F=0 B=0 BK=0 P=0 Name=identifier N=0 L=7 PPK=2 FakeLParens= FakeRParens=0 II=0x558d281139e0 Text='SomeAPI' I=0 M=0 C=0 T=Unknown S=1 F=0 B=0 BK=0 P=23 Name=coloncolon N=0 L=10 PPK=2 FakeLParens= FakeRParens=0 II=0x0 Text='::' I=0 M=0 C=1 T=FunctionDeclarationName S=0 F=0 B=0 BK=0 P=520 Name=operator N=0 L=18 PPK=2 FakeLParens= FakeRParens=0 II=0x558d281067c8 Text='operator' I=0 M=0 C=0 T=OverloadedOperator S=0 F=0 B=0 BK=0 P=23 Name=l_paren N=0 L=19 PPK=2 FakeLParens= FakeRParens=0 II=0x0 Text='(' I=0 M=0 C=0 T=OverloadedOperator S=0 F=0 B=0 BK=0 P=140 Name=r_paren N=1 L=20 PPK=2 FakeLParens= FakeRParens=0 II=0x0 Text=')' I=0 M=0 C=0 T=OverloadedOperatorLParen S=0 F=0 B=0 BK=0 P=23 Name=l_paren N=0 L=21 PPK=2 FakeLParens= FakeRParens=0 II=0x0 Text='(' I=0 M=0 C=0 T=Unknown S=0 F=0 B=0 BK=0 P=140 Name=r_paren N=1 L=22 PPK=2 FakeLParens= FakeRParens=0 II=0x0 Text=')' I=0 M=0 C=0 T=Unknown S=0 F=0 B=0 BK=0 P=23 Name=semi N=0 L=23 PPK=2 FakeLParens= FakeRParens=0 II=0x0 Text=';' ---- AnnotatedTokens(L=0, P=0, T=6, C=0): I=0 M=0 C=0 T=Unknown S=1 F=0 B=0 BK=0 P=0 Name=eof N=0 L=0 PPK=2 FakeLParens= FakeRParens=0 II=0x0 Text='' ---- SomeAPI::operator()(); SomeAPI ::operator()(); 
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

::operator preceded by a tok::identifier can very well be a function decl. See e.g. Out ::operator<<( in here.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes it can be, that was never the point of any discussion (on my side). But isn't this PR not about to not mark a function call to ::operator as declaration? And I just show, that the same code, just with a space before :: is annotated differently.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That's why I said we should annotate them as intended by the user.

So we annotate this kind of constructs as follows:

Foo::operator()(); // function call Bar ::operator<<(); // function declaration
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't know why you always bring the up function declaration.
Before your patch:

 auto Tokens = annotate("SomeAPI::operator()();\n" "SomeAPI ::operator()();"); EXPECT_TOKEN(Tokens[2], tok::kw_operator, TT_FunctionDeclarationName); EXPECT_TOKEN(Tokens[10], tok::kw_operator, TT_FunctionDeclarationName);

After your patch:

 auto Tokens = annotate("SomeAPI::operator()();\n" "SomeAPI ::operator()();"); EXPECT_TOKEN(Tokens[2], tok::kw_operator, TT_Unknown); EXPECT_TOKEN(Tokens[10], tok::kw_operator, TT_FunctionDeclarationName);

It only correctly annotates the first line, with no space before the ::.

That's all I say.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I give up on trying to explain further as it's a moot point with #164048.

@owenca owenca marked this pull request as draft October 16, 2025 04:09
@owenca
Copy link
Contributor Author

owenca commented Oct 18, 2025

Closing in favor of #164048

@owenca owenca closed this Oct 18, 2025
@owenca owenca deleted the 160513 branch October 18, 2025 05:07
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

3 participants