[RFC] An opt-in CMake option for 64-bit Source Location

TL;DR: implement 64-bit SourceLocation as an opt-in feature via a CMake option (CLANG_ENABLE_64_BIT_SOURCE_LOCATIONS).


The initial proposal to switch clang entirely to 64-bit SourceLocation raised concerns during discussion about performance overhead and maintenance burden. Based on community feedback, this revised RFC proposes a compromise: an opt-in feature that provides 64-bit support only for those who need it.

Proposal
I propose a new CMake option, CLANG_ENABLE_64_BIT_SOURCE_LOCATIONS:

  • When OFF (default): clang continues to be compiled with the 32-bit SourceLocation. This ensures zero impact on performance and memory usage for the majority of users.
  • When ON: clang is compiled with a 64-bit SourceLocation. This expands the address space, mitigating the exhaustion issue.
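Concretely, a downstream build could opt in at configure time, something like the following sketch (the option name is from this proposal; the source/build paths and generator are illustrative):

```shell
# Configure a clang build with 64-bit source locations enabled
# (CLANG_ENABLE_64_BIT_SOURCE_LOCATIONS is the proposed option;
#  paths and generator below are just an example setup).
cmake -G Ninja -S llvm -B build \
  -DLLVM_ENABLE_PROJECTS=clang \
  -DCLANG_ENABLE_64_BIT_SOURCE_LOCATIONS=ON
ninja -C build clang
```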

Support & Maintenance Plan
To ensure this option is sustainable without burdening the community, the following support plan is proposed:

  • Support tier: this option will be supported under LLVM’s Peripheral Support Tier.
  • Continuous integration: we will set up a dedicated buildbot with this option enabled to prevent breakages and ensure the feature remains functional.
  • Responsibility: the primary responsibility for maintaining this configuration and fixing bugs specific to it will fall on the users of the feature.

Feedback and comments are welcome.


I don’t have objections to this, but I think it would be nice to repeat here the small maintenance burden on the community you mentioned in Revisiting 64-bit source locations - #42 by hokein

I realize this reads as ‘not motivating’/‘trying to re-open a closed door’, but I still disagree with the CAT. I think the performance hit is incredibly minor, and the maintenance overhead of having BOTH of these in our CMake/source code is a much greater cost.

I realize many are not particularly motivated here, but the perf is about equal in the initial proposal, the additional stack space is reasonably minor, etc., and it solves a big problem.

But I’ll shut up now :slight_smile:

To me, the big issue here is that this is a one-way door: if we ship 64-bit slocs, people will rely on them, and we won’t be able to go back. I think the perf hit is minor: if someone was trying to optimize slocs from 64-to-32-bits today, we’d probably declare this wasn’t worth it.

However, in the call we had last week, the overwhelming feedback in the “room” was that there are other implementation limits hiding just behind the sloc limit, so raising this one doesn’t make it meaningfully easier to adopt modules. It just increases our support burden, because now there will be less pressure on users to avoid duplicate textual headers. The sloc limit is a documented implementation limit that pushes users to modularize from the bottom-up in a way that clearly (perhaps it could be more clear) makes it the user’s responsibility to manage their include graph. If we relax the limit, users will instead run into a long tail of declaration merging internal compiler bugs that we are unlikely as a community to come together and fix.

Adding the CMake option gives us a way to gather data on the usefulness of 64-bit slocs for modularization, without forcing everyone to go through the one-way door and eat the performance cost forever.


This I definitely agree with.

For me it is about more than just enabling modules. But I also have little sympathy for the argument that "if we fix bugs/limitations, people will just complain about DIFFERENT limitations." I’ve never decided not to fix an issue because fixing it would let people run into other bugs.

Perhaps? I wonder if we could do something similar WITHOUT CMake and do all of the changes EXCEPT the larger sloc (that is, the proposal adds a ton of infrastructure, increases the size of a SourceLocation, and increases the stored size to 42 bits). I wonder if we could do the infrastructure and SourceLocation changes, but set the size to 32 bits and look for regressions that way? (Or maybe make the CMake option control JUST that setting?)

I very much would like us to just be in a world where I never have to hear about source-locations being limiting because they are 32 bits again in my life :smiley:


I agree with Reid here. I think people aren’t comfortable with this yet and this patch is a good path forward to resolving that one way or another.

Unless LLVM intends to actually support the option perpetually, I recommend making it more explicit that this is an experimental flag. For example, requiring CLANG_EXPERIMENTAL_ENABLE_64BIT_SOURCE_LOCATIONS=I_ACCEPT_EXPERIMENTAL_TERMS_AND_CONDITIONS.


It is not just about enabling modules. It is about enabling an approach to modularizing a software stack that has failed many, many times across many companies. By far, the clang modules community has the most experience in how to get modularization and speed at the same time. Unfortunately, the solution is painful – you have to work all the way up from libc to your dependencies. That works because your modules then contain only what they need, much like a class holding a pointer to an object instead of a copy.

The current limit exists (boost preprocessor aside) because the other modularization strategies need header copies per module, which will never be more performant than referencing them by pointer.

We have proposed several downstream suggestions to be tried out first before talking about lifting limits, because that comes at a huge cost for the community. All of them were discarded at that point without sufficient investment of effort or discussion, in my opinion.

Unfortunately, in this particular case lifting a limit does not have a positive connotation, because the clang community has spent many years finding sweet spots across many software stacks – much as we can’t just randomly change optimization pass thresholds, for example.

Many of us have been burning in hell for almost a decade optimizing the last bit out of modules to make their use feasible at scale. Our recommendation for modules adopters should be to start from the bottom of their software stack. Otherwise we risk jeopardizing modules adoption, because everybody will start putting everything everywhere and complain that things are slow, unusable, and buggy with clang.

The problem for the ASTReader is that once you lift the limit on source locations, you will have to fix other assumptions, like offsets within the AST files. That will also affect the overall file size, which in turn will start hurting the clients that have done the bottom-up modularization, which in turn hampers the incentive to go the recommended way…

As I’ve mentioned in the previous RFC there are two ways to approach the particular problem:

  1. Canonicalize the duplicate source files in the ASTReader.
  2. Move all of the middle layer copies into a PCH which serves as a baseline for the modularization.

Of course the 32-bit sloc is not cast in stone. However, I’d still want to see a stronger use case for why we should do that.

Thank you for your proposal. I am fairly opposed to multiplying configuration options.

Either we decide as a community to support it, or not support it, but the in-between of “doing both” means, as you noticed, that we will need more CI, more maintenance, etc. And everywhere we do clever bit packing, we will have to assume 64 bits anyway.
(In terms of CI, we probably would need to test at least a few host systems with different address sizes.)

So it would be more burdensome than making a choice, and there would be a performance hit anyway, because it would become riskier to do anything “interesting” with SourceLocation in either configuration (i.e., the assumption that SourceLocation is, say, 42 bits would be hard to leverage if it can also be 32 bits).

It’s unlikely that we would be able to get resources for this configuration on llvm-compile-time-tracker.com, so it would not collect data on performance.

The group of people convinced that this is the main blocker to module adoption might use it. Other issues and limits will creep up; (I don’t understand how including many copies of the same headers, and subsequently building and preserving long redeclaration chains, will satisfy anyone). I also agree with Reid that there would be no coming back.

I understand that tackling the underlying issue is extremely difficult, and there might not be even much that can be done in clang, which is a deeply unsatisfying answer.

But having an optional band-aid over that problem seems like it would double maintenance efforts while failing to answer the question or satisfy anyone.

So my position remains that we should either collectively accept the status quo (while fixing wastage around macro expansion, duplication, etc, if we can), or decide to make them bigger unconditionally, along the lines of what Erich is suggesting.

I didn’t notice the previous discussion, so I didn’t know the reason why we rejected the idea. Was it due to performance overhead?

I agree that making this an option will introduce more burden on the community, while we are clearly lacking human resources.

For the problem itself, I feel (not think) it would be better to make it 64 bits if we can avoid bugs. Erasing limitations always sounds good to me.

Note that I don’t tie this to modules. As @rnk said, it will make it easier for people to use modules in an inefficient way. A good example is [C++20 modules] Large module compilations run into repeated header include limits. · Issue #127561 · llvm/llvm-project · GitHub. So, in the short term, module users might face more problems since they may make that mistake easily. I just feel it is better if we can erase the limitation for the VERY long term, provided we can avoid bugs.

Personally, I think this is a reasonable option to support. As Reid said, a full switch to 64-bit source locations is not a decision we’re going to undo once we make it. And as others have pointed out, once we make that switch, we’re now on the hook for all the follow-on issues that happen because the compilation now gets significantly farther along than it did when we’d run out of source locations at 32 bits. It’s those follow-on issues that worry me; the user’s experience is almost certainly going to be poor because we have to actually parse all that code. We already sometimes run into problems with running out of memory/stack space when compiling code which doesn’t hit the 32-bit limit, and the community has not organically elected to improve that in a systematic way. I don’t expect anyone will be volunteering to fix bugs that only happen when you run into the need for 64-bit source locations. Even triaging those bugs will be painful, because there’s every chance you can’t reduce the code to a small reproducer without causing the issues to disappear.

So I think a CMake option which the community only has to do minimal maintenance for strikes a nice balance. It gives downstreams a way to support their use cases while gathering more field experience with what follow-on problems happen, without a significant burden on the rest of the community. Having a better understanding of the ramifications of the change before making it a permanent decision makes me more comfortable.

I hope that, as a community, this time we can reach a conclusion on whether we should move to 64-bit source locations (or whether 32-bit source locations are a hard limit in clang), given that we’ve already invested significant effort into this topic – this being the third serious attempt.

However, based on the current feedback, it seems we aren’t yet ready to make a final decision. Some alternative approaches still haven’t been thoroughly explored; these would be entirely new projects requiring non-trivial effort. Personally, I don’t have the bandwidth to investigate them further.

That said, while the proposed CMake option isn’t intended as a perfect or final solution, it appears to be a reasonable compromise that allows us to move forward.

I won’t deny that introducing an opt-in CMake option carries some maintenance cost. However, I believe that cost is manageable and reasonable. The likelihood of major changes to the SourceLocation structure seems low. And in practice, when developing new features (example), we’re already accounting for 64-bit locations, so we are arguably already paying this cost.


That is a pity. I’ve started putting together an example of how one can mix PCHs and PCMs. If you do not intend to look at the example, I should probably not invest time in it.

What is the advantage of having this as a separate upstream option and not a downstream solution for the particular set of requirements your project has?

I’d like to respectfully disagree. I don’t find it convincing that we should commit to a one-way switch – especially without first exploring the alternative proposals that have been put forward.

Moreover, you’ve mentioned several times that you currently lack the bandwidth to address adjacent concerns beyond the scope of the current project. If we’re aiming to move the community toward what appears to be a controversial change, we’ll need stronger assurances that someone will be available to follow through on the related issues that may arise.


Thanks for offering to help! We’re working on a large monolithic codebase, and our build graph is complex. I’m not sure how this technology could be applied at scale in our case.

The alternative approach of deduplicating headers across modules would, I think, address our module-related use cases. As I mentioned in the meeting, it comes at the cost of reduced diagnostic quality – we would lose the include stack information. We might be able to reconstruct it, but that’s another problem to solve. And I anticipate that a file-path-based deduplication strategy would likely face tricky implementation challenges (symlinks, etc.).
That said, I think this approach could work, but it’s a big task.

These approaches only address the module-related case. They don’t address the “heavy macro” use cases that motivated the earlier RFCs introducing 64-bit source locations. I understand there hasn’t been a clear conclusion on whether this scenario will be officially supported, so perhaps it’s not a strong argument at this point.

A downstream solution is possible, but it is not ideal:

  • the patch is non-trivial, and we’d prefer not to carry a large local patch;
  • this isn’t a problem unique to my project; other users have encountered the same limitation and would benefit from an upstream solution;
  • it ensures the current 64-bit implementation isn’t lost: if the community ever commits to a permanent switch, the work won’t need to be redone a third time.

This idea was also raised during the previous RFC discussions. In fact, some experienced clang developers assumed that such a configuration switch already existed.

I agree with your point, and I want to clarify – my use of the word “hope” was about a potential, future, and final decision on 64-bit source location. I agree we shouldn’t commit to do it without fully exploring all alternatives at this point.


I agree. If we can write a three-paragraph project proposal, I can probably help with finding a young professional to work on this… That’s something the entire community, across companies, will benefit from. Would you be willing to invest some time in scoping a project and eventually giving feedback once the work starts?


Thanks, that’s great. While I may not be able to commit the time to explore a complete design and write a full proposal, I’m happy to provide feedback, review code, and help test the implementation within our codebase.

Pulled comment from the previous RFC:

There are currently only four people signed up (including me as the host) and the RFC author is not available for any of the discussion slots. Should we hold off on having that meeting until the author is available, should we go forward with scheduling, something else?

CC @efriedma-quic @rnk


(Personal statement, not a Clang Area Team communication.)

Oddly enough, I ran into exactly this situation today while testing Clang’s behavior for some papers in WG14. There’s a proposal to standardize __COUNTER__, and part of that proposal discusses limits on its value. I went to test Clang’s behavior around those limits because I think we have a bug in our current implementation, but I ran out of source locations and could not hit it. Once we allow 64-bit source locations, users will be exposed to the fact that __COUNTER__ wraps around to zero in Clang, so it will start to repeat values (and given that this macro exists largely to help generate unique identifiers, that’s bad behavior).

Note, it’s not “let’s stick with 32-bit source locations so we don’t have to fix __COUNTER__”, but it is “once we switch to 64-bit source locations, we’ll get more follow-on bugs as a result and those take community resources to triage and maintain, so we need to keep that in mind as part of the decision”.

I don’t have any major updates since our last discussion, so my availability shouldn’t block the meeting. Feel free to hold the meeting without me to keep things moving (I will try to join if possible).

I see it from a different perspective. I believe clang’s current behavior is undefined when the SourceLocation limit is violated. This is only caught by a debug-only assertion, meaning most users on release builds are silently exposed to this potential instability. Addressing these follow-on issues is an improvement that makes clang more robust and its behavior more well-defined.


We’re going to go ahead with the meeting and hold it Tue Aug 12 at 1pm ET. You can join the meeting at https://meet.google.com/oun-bocb-ixr