@kenballus

Starting in version 15, GCC emits a .base64 directive instead of .string or .ascii for char arrays of length >= 3.

See this godbolt link for an example.

This patch adds support for the .base64 directive to AsmParser.cpp, so tools like llvm-mc can process the output of GCC more effectively.

This addresses #165499.
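
To illustrate what the directive asks of an assembler, here is a minimal standalone sketch of decoding a base64 operand into the raw bytes that would be emitted. It is not LLVM's `decodeBase64` and not the code in this patch; `sextet` and `decodeBase64Sketch` are hypothetical names, and the sketch assumes the standard alphabet with `=` padding.

```cpp
#include <cstddef>
#include <cstdint>
#include <optional>
#include <string>
#include <string_view>

// Hypothetical helper (not part of LLVM): maps one base64 character to its
// 6-bit value, or returns -1 if it is outside the standard alphabet.
static int sextet(char c) {
  if (c >= 'A' && c <= 'Z') return c - 'A';
  if (c >= 'a' && c <= 'z') return c - 'a' + 26;
  if (c >= '0' && c <= '9') return c - '0' + 52;
  if (c == '+') return 62;
  if (c == '/') return 63;
  return -1;
}

// Hypothetical decoder for padded base64. Returns std::nullopt on malformed
// input, which an assembler would report as a parse error.
std::optional<std::string> decodeBase64Sketch(std::string_view in) {
  if (in.size() % 4 != 0) return std::nullopt;
  std::string out;
  for (std::size_t i = 0; i < in.size(); i += 4) {
    int pad = 0;
    std::uint32_t group = 0;
    for (std::size_t j = 0; j < 4; ++j) {
      char c = in[i + j];
      if (c == '=') {
        // Padding may only fill the last one or two slots of the final group.
        if (i + 4 != in.size() || j < 2) return std::nullopt;
        ++pad;
        group <<= 6;
      } else {
        if (pad != 0) return std::nullopt; // data after padding
        int v = sextet(c);
        if (v < 0) return std::nullopt;
        group = (group << 6) | static_cast<std::uint32_t>(v);
      }
    }
    // Each 4-character group carries up to 3 bytes; padding drops the tail.
    out.push_back(static_cast<char>((group >> 16) & 0xFF));
    if (pad < 2) out.push_back(static_cast<char>((group >> 8) & 0xFF));
    if (pad < 1) out.push_back(static_cast<char>(group & 0xFF));
  }
  return out;
}
```

For instance, the operand "aGVsbG8A" decodes to the six bytes of "hello" followed by a NUL byte, so a directive carrying that operand would contribute those six bytes to the current section.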


@llvmbot llvmbot added the llvm:mc Machine (object) code label Oct 29, 2025
@llvmbot
Member

llvmbot commented Oct 29, 2025

@llvm/pr-subscribers-llvm-mc

Author: Ben Kallus (kenballus)

Full diff: https://github.com/llvm/llvm-project/pull/165549.diff

1 Files Affected:

  • (modified) llvm/lib/MC/MCParser/AsmParser.cpp (+21)
diff --git a/llvm/lib/MC/MCParser/AsmParser.cpp b/llvm/lib/MC/MCParser/AsmParser.cpp
index dd1bc2be5feb4..54bb1451a5a73 100644
--- a/llvm/lib/MC/MCParser/AsmParser.cpp
+++ b/llvm/lib/MC/MCParser/AsmParser.cpp
@@ -46,6 +46,7 @@
 #include "llvm/MC/MCSymbolMachO.h"
 #include "llvm/MC/MCTargetOptions.h"
 #include "llvm/MC/MCValue.h"
+#include "llvm/Support/Base64.h"
 #include "llvm/Support/Casting.h"
 #include "llvm/Support/CommandLine.h"
 #include "llvm/Support/ErrorHandling.h"
@@ -530,6 +531,7 @@ class AsmParser : public MCAsmParser {
     DK_LTO_SET_CONDITIONAL,
     DK_CFI_MTE_TAGGED_FRAME,
     DK_MEMTAG,
+    DK_BASE64,
     DK_END
   };
 
@@ -552,6 +554,7 @@ class AsmParser : public MCAsmParser {
 
   // ".ascii", ".asciz", ".string"
   bool parseDirectiveAscii(StringRef IDVal, bool ZeroTerminated);
+  bool parseDirectiveBase64(); // ".base64"
   bool parseDirectiveReloc(SMLoc DirectiveLoc); // ".reloc"
   bool parseDirectiveValue(StringRef IDVal,
                            unsigned Size); // ".byte", ".long", ...
@@ -1953,6 +1956,8 @@ bool AsmParser::parseStatement(ParseStatementInfo &Info,
   case DK_ASCIZ:
   case DK_STRING:
     return parseDirectiveAscii(IDVal, true);
+  case DK_BASE64:
+    return parseDirectiveBase64();
   case DK_BYTE:
   case DK_DC_B:
     return parseDirectiveValue(IDVal, 1);
@@ -3076,6 +3081,21 @@ bool AsmParser::parseDirectiveAscii(StringRef IDVal, bool ZeroTerminated) {
   return parseMany(parseOp);
 }
 
+/// parseDirectiveBase64:
+// ::= .base64 "string"
+bool AsmParser::parseDirectiveBase64() {
+  std::vector<char> Decoded;
+
+  std::string str;
+
+  if (parseEscapedString(str) || str.empty() || decodeBase64(str, Decoded)) {
+    return true;
+  }
+
+  getStreamer().emitBytes(std::string(Decoded.begin(), Decoded.end()));
+  return false;
+}
+
 /// parseDirectiveReloc
 /// ::= .reloc expression , identifier [ , expression ]
 bool AsmParser::parseDirectiveReloc(SMLoc DirectiveLoc) {
@@ -5345,6 +5365,7 @@ void AsmParser::initializeDirectiveKindMap() {
   DirectiveKindMap[".asciz"] = DK_ASCIZ;
   DirectiveKindMap[".string"] = DK_STRING;
   DirectiveKindMap[".byte"] = DK_BYTE;
+  DirectiveKindMap[".base64"] = DK_BASE64;
   DirectiveKindMap[".short"] = DK_SHORT;
   DirectiveKindMap[".value"] = DK_VALUE;
   DirectiveKindMap[".2byte"] = DK_2BYTE;
@lenary
Member

lenary commented Oct 29, 2025

Please add tests.
