Rust, LLVM, and LLDB

Hello Community
I am in the process of understanding the current state of debugging in Rust. For this I am trying to get a feeling for the general sentiment here on:

  • Using lldb to debug Rust via python formatters (this approach is slow and limits how well you can interact with variables)

  • Creating a native lldb TypeSystem for Rust to replace the large set of python formatters which is currently being used (see the link at the bottom)

  • Writing lldb source code in Rust (Or other subsystems in Rust)

  • Current collaboration between the Rust core development team and the LLDB development team.

As somewhat of a side note:

  • Writing a debugger from scratch in Rust for example:

A person called “walnut356” wrote an article where he tried to make sense of what it would take to make Rust work natively with LLDB:

I believe that there is a lot to gain for both LLDB (and by extension LLVM) and Rust by trying to research, design, and develop a good solution to this issue. I think that there are few Rust developers that are experienced enough in C/C++ to fix these issues themselves (myself included).
Kind regards,
Christian Brunbjerg

I would certainly like to see better LLDB support for Rust. (That being said, I am not a maintainer, just a semi-regular contributor to LLVM / LLDB / lldb-dap. So take my word with a grain of salt).

A person called “walnut356” wrote an article where he tried to make sense of what it would take to make Rust work natively with LLDB:

He actually is also on this forum - CC @Walnut356 :wink:

Yes, it is a known pain point that LLDB’s extensibility interfaces are not as good as they could be and there is agreement that it would be great to make new languages easier to add. See New Language Support for LLDB - Improving Extensibility for a recent discussion on that.

current state of debugging in Rust

I assume you are aware of @vadimcn’s CodeLLDB? Note that this is more than just a packaged lldb. It is lldb + extensions for debugging Rust.

  • Current collaboration between the Rust core development team and the LLDB development team.

I think any upstream changes to simplify LLDB’s interfaces to make it simpler to add new languages will be gladly accepted.

When it comes to accepting Rust support in upstream LLDB, see this previous thread: Rust support in LLDB, again. Back then @JDevlieghere wrote

I guess this statement still applies, but not sure. As already said - I am not a maintainer, others will have to make those calls :slightly_smiling_face:

Also, besides the question of “accept Rust language upstream?”, the thread Rust support in LLDB, again might be worth a read. It contains a bit of background information on the Rust-specific support already in LLDB today, in particular around DW_TAG_variant.

Howdy!

I’ve been contacted privately regarding debugging by several Rust and LLDB maintainers. I’ve been a bit swamped so I’ve been putting off following up on it, but I can share what I know atm.

I don’t want to say too much on anyone’s behalf, but Rust maintainers have expressed some interest.

They are currently looking into overhauling the compiler’s testing suite for debug info because it’s more or less non-functional and very brittle to changes in the debuggers themselves. It’s hard to account for the versions of the ambient tools, and those version changes can cause huge breakages. It’s also hard to replicate CI failures on local machines (or vice versa). Last I checked, CI only ran the tests on Linux, and there was talk of disabling them altogether until they’re fixed. I was unable to get them running on Windows locally at all.

That’s not a direct blocker to debug info changes on Rust’s end, but there’s definitely some hesitance because of it.

The major blockers seem to be lack of experts from those debuggers and/or lack of interest in general. The Windows side of things is especially neglected (for both *-gnu and *-msvc targets). There is some potential for help or information from Microsoft? Take that with a HUGE grain of salt though. The most I know is a Rust maintainer knows a guy who works on Rust UX at MS and was willing to pass on a few of my questions. PDB as a format is also still seemingly getting new features, so any info we can get from them would be awesome.

I’ve been back and forth on if it makes more sense to upstream TypeSystemRust or just add more conditional behavior to TypeSystemClang, but I’m just about done adding PDB support to TypeSystemRust and that has tipped my opinion towards having a dedicated plugin. There are too many little edge cases, and it would solve a lot of the bigger annoyances with debug info maintenance (e.g. having to have separate Synthetic Providers for gnu and msvc, problems with type lookups, missing template args). Certain problems are hard to solve in TypeSystemClang because of the semantic differences between the languages. References aren’t objects in C++, so ref-to-ref isn’t valid, but it is in Rust. Normalizing the type names that Rust generates (e.g. tuple$<u8, ref$<u32>>) to play nice with CDB is also an issue.
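To make the type-name problem concrete, here is a minimal sketch of what normalizing those MSVC-style names back into Rust syntax could look like. The function names are hypothetical, only the `tuple$`, `ref$` and `ref_mut$` wrappers are handled, and this is not a description of what LLDB, CodeLLDB, or any TypeSystem actually does. Note that the ref-to-ref case falls out naturally once the names are treated as Rust rather than C++.

```rust
/// Rewrites MSVC-style Rust type names into Rust syntax.
/// Only a few wrappers are handled (an assumption for this sketch);
/// everything else is passed through unchanged.
pub fn normalize_msvc_name(name: &str) -> String {
    let name = name.trim();
    // Split `head<args>` into the head and the text between the outer <>.
    let Some(open) = name.find('<') else {
        return name.to_string();
    };
    if !name.ends_with('>') {
        return name.to_string();
    }
    let head = &name[..open];
    let args: Vec<String> = split_top_level(&name[open + 1..name.len() - 1])
        .into_iter()
        .map(normalize_msvc_name)
        .collect();
    match head {
        "ref$" if args.len() == 1 => format!("&{}", args[0]),
        "ref_mut$" if args.len() == 1 => format!("&mut {}", args[0]),
        "tuple$" if args.len() == 1 => format!("({},)", args[0]),
        "tuple$" => format!("({})", args.join(", ")),
        // Ordinary generics keep their head, e.g. `Vec<u8>`.
        _ => format!("{}<{}>", head, args.join(", ")),
    }
}

/// Splits on commas that are not nested inside angle brackets.
fn split_top_level(s: &str) -> Vec<&str> {
    let mut parts = Vec::new();
    let (mut depth, mut start) = (0usize, 0usize);
    for (i, c) in s.char_indices() {
        match c {
            '<' => depth += 1,
            '>' => depth = depth.saturating_sub(1),
            ',' if depth == 0 => {
                parts.push(&s[start..i]);
                start = i + 1;
            }
            _ => {}
        }
    }
    if !s[start..].trim().is_empty() {
        parts.push(&s[start..]);
    }
    parts
}
```

With this, `normalize_msvc_name("tuple$<u8, ref$<u32>>")` yields `(u8, &u32)`. A real implementation would need the full set of wrappers rustc emits and proper handling of paths and function types.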

Some issues require changes to the output that Rust is generating, and I have some fixes for those, but some are a matter of “do we put in a super hacky fix now and then rip it out later, or do we just wait until it’s unnecessary?” and I haven’t asked about that yet.

I’ve heard second-hand as recently as a few months ago (through a Rust contributor that works with them) that Greg Clayton expressed interest in more formal Rust support in LLDB, including compiler integration similar to TypeSystemClang, so long as there’s proper testing and such.

One thing I want to point out is that even though codelldb no longer uses a bespoke TypeSystemRust, I’m pretty sure it does still use a slightly-custom version of LLDB. The major tipoff is that codelldb seems to properly handle Rust type names from PDB debug info. Mainline LLDB uses the “unique name” field from the PDB data for some reason (which is just a number) rather than the “name” field. I don’t think the branch codelldb uses is public, but it may be worth asking what changes they’ve made and upstreaming them if they’re up for it.

CodeLLDB also has the best Rust<->LLDB bindings that exist right now by far, but it’s exclusively the public SB api iirc. The crates aren’t on crates.io, but they are in the codelldb repo (see also: weaklink which takes care of dynamic linking to lldb).

I definitely think making good bindings for the TypeSystem and adjacent APIs (DWARF/PDB parsers, lldb_private::Type, etc.) would help a lot for getting contributors. There’s a lot of wrapped LLVM APIs already in the Rust compiler, so it wouldn’t be out of place. I’m not an expert on FFI or anything, but it definitely seems do-able.

An lldb-c API similar to llvm-c and clang-c would help a lot since automated binding tools require a lot less intervention with C code. Being able to build LLDB as a static lib would also be pretty nice, since it could be bundled into an lldb-sys FFI crate. That would allow crates.io’s versioning to guarantee specific LLDB versions and isolate from ambient tools on the system (duckdb and rusqlite work this way via a feature flag).

On the opposite end, without bindings it’s not 100% necessary to write Rust code at all to write the TypeSystem. There might be C++ devs willing to work on it if it boils down to “here’s a document of Rust debug info quirks” but all the actual coding is C++ and they don’t have to dig through the Rust compiler source.

Also reading through this thread was really interesting. The original poster, vadimcn, is the author of CodeLLDB and they did write and maintain a TypeSystemRust for several years. It’s funny that we independently came to the exact same conclusion: TypeSystemClang is too opinionated to use as the “default” TypeSystem, and it seems like there’s a layer missing between DWARF/PDB ast parsing and TypeSystem implementers that exposes the debug info in a language-agnostic way.

With something like a DWARFASTParserLLVM/PDBASTParserLLVM/TypeSystemLLVM/LLVMType that doesn’t have the C-centric constraints that TypeSystemClang does, it would make it easier (or entirely unnecessary) to implement a bespoke TypeSystem for every language that wants one. Languages could much more easily get away with just modifying their own generated debug info to get the output they want (as Rust currently does to appear more C-like than it really is).

That would give the SB api more power since you’re interacting with a more-or-less 1:1 reading of the raw debug info. It would also be WAY faster and easier to build a custom TypeSystem on top of that because you don’t necessarily need to know the details of how DWARF and PDB work (PDB especially is a GIGANTIC time sink).
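As a rough illustration of what a language-agnostic layer could expose, here is a sketch of a type tree with no C or Rust semantics baked in. Every name here (`TypeNode`, `Field`, `describe`) is made up for the example; nothing like this exists in LLDB today, and a real design would need far more detail (qualifiers, templates, bitfields, and so on).

```rust
/// Hypothetical language-agnostic view of a type read from debug info.
pub enum TypeNode {
    Base { name: String, byte_size: u64 },
    /// A pointer or reference. Nesting is unrestricted, so a
    /// reference to a reference is representable.
    Pointer { pointee: Box<TypeNode>, byte_size: u64 },
    Struct { name: String, byte_size: u64, fields: Vec<Field> },
    /// A tagged union: one of several layouts chosen by a discriminant.
    Variants { name: String, byte_size: u64, cases: Vec<TypeNode> },
}

pub struct Field {
    pub name: String,
    pub offset: u64,
    pub ty: TypeNode,
}

impl TypeNode {
    /// Renders the tree without any language-specific syntax.
    pub fn describe(&self) -> String {
        match self {
            TypeNode::Base { name, .. } => name.clone(),
            TypeNode::Pointer { pointee, .. } => format!("ptr({})", pointee.describe()),
            TypeNode::Struct { name, fields, .. } => {
                let body: Vec<String> = fields
                    .iter()
                    .map(|f| format!("{}@{}: {}", f.name, f.offset, f.ty.describe()))
                    .collect();
                format!("struct {} {{ {} }}", name, body.join(", "))
            }
            TypeNode::Variants { name, cases, .. } => {
                let body: Vec<String> = cases.iter().map(|c| c.describe()).collect();
                format!("variants {} [{}]", name, body.join(" | "))
            }
        }
    }
}
```

A language plugin would then map this tree onto its own syntax (`&&u32`, `Option<T>`, and so on) instead of parsing DWARF or PDB itself.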

If that’s a direction the LLDB maintainers are interested in, I wouldn’t mind putting in some legwork for it.

Speaking as a maintainer, I believe that there is long-term value in defining a proper plugin surface for language plugins in LLDB, especially for languages that have compilers implemented on top of either a downstream fork of LLVM, or a stable release of LLVM. For LLDB developers, these workflows make coordination with upstream and testing really hard. It also forces them to implement the plugin in C++, which can make it difficult to find volunteers who can contribute. For users, it means they may need to choose from different versions of LLDB depending on what language they want to debug, which is at the very least, confusing.

The problem with doing this is that the surface area for a language plugin is very large: it includes Language, TypeSystem, LanguageRuntime, Host, Expression, and often some very low-level connection between different SymbolFile plugins and TypeSystems. I am not sure if it should be an extension to the SBAPI. I think it may need to be a separate API that, e.g., can get away with only supporting the last N LLVM releases, so we can continue improving LLDB architecturally and aren’t held back by backwards compatibility.

I mentioned in the other thread that I would like to experiment with a new Swift Lite language plugin, developed in tree, and with no dependencies on the compiler, and then in-tree, develop that into a dynamically loaded plugin to help us determine what the surface area for a functional language plugin would need to be. I’d be excited to coordinate with developers who are interested in implementing other languages. This is not going to be easy or quickly achievable, but I believe it’s the right move forward.

Apologies for the delay. I have been reading up on different sources that I could find:

I believe that ideally Rust should have a debugger experience that matches the quality of other tools that are available (e.g. rust-analyzer, cargo, rustfmt, etc.). In this light the lldb debugger often feels a little like a hack, as some things are not intuitive for users without IDE integrations and python formatters.

I assume that there are many different ways of proceeding here.

Three approaches:

  • Make a native Rust debugging experience entirely in the LLDB repo (pure lldb similar to Swift)
  • Make a hybrid solution with some LLDB and some custom implementation (rust-lldb, codelldb, rudy-lldb)
  • Make a completely new debugger written in Rust itself (like BugStalker, bs)

If I were at work I would say that the most important thing is getting the most relevant actors together (i.e. Rust dev team, LLDB dev team, creators of lldb, RustTypeSystem in LLDB, BugStalker bs, codelldb, and rudy-lldb). Below I have tried to make an initial list of relevant people. Any help in locating more people would be highly appreciated, especially in the Rust core team.

Stakeholders: Who to contact? Who can sign off? Who can code it? (Incomplete)

Rust Development Team

LLDB Maintainers

  • adrian.prantl*

People that can code the LLDB RustTypeSystem

Other debugger Projects

* as I am new in here I can only mention two people per post…

Two example issues with the current Rust debugging

  • Crate owners do not specify debug printers for types. BugStalker bs would allow us to call the Debug implementation on types in the future. This means that the source code of packages would indirectly tell the debugger how to format the type.
  • At the moment this is how I call a method on Foo in Rust, and it should be noted that if the method was declared in a trait this functionality would currently not be possible (rudy-lldb seems to fix this nicely).
    fn main() {
        let foo = Foo { data: 3 };
        foo.bar();
    }

    struct Foo {
        data: u32,
    }

    impl Foo {
        pub fn bar(&self) -> u32 {
            println!("{}", &self.data);
            self.data + 4
        }
    }

    (lldb) expr -l rust -- test_lldb_dap::Foo::bar(&foo)
    3
    (int) $0 = 7
  • It seems that, across many attempts to improve the debugging experience, it is difficult to tell whether the debugger’s behavior will break in the future, either due to changes in rustc or in lldb.
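For comparison with the inherent-method example in the list above, this is the trait-based variant I mean. The trait name `Bar` is made up for the illustration. In Rust source the call is written with fully qualified syntax, `<Foo as Bar>::bar(&foo)`, and as far as I can tell that is the form the expression evaluator cannot resolve today.

```rust
struct Foo {
    data: u32,
}

// Hypothetical trait; only the shape matters here.
trait Bar {
    fn bar(&self) -> u32;
}

impl Bar for Foo {
    fn bar(&self) -> u32 {
        println!("{}", self.data);
        self.data + 4
    }
}

/// The same call as in the inherent-method example, but dispatched
/// through the trait using fully qualified syntax.
fn call_through_trait(foo: &Foo) -> u32 {
    <Foo as Bar>::bar(foo)
}
```

The method now belongs to the `impl Bar for Foo` block rather than to `Foo` itself, so a path like `test_lldb_dap::Foo::bar` no longer names it.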

Path forward

I am talking with Derevtsov, author of bs, about helping him on the debugger but the primary goal should be to get everyone aligned on what the best path forward is here. We can spend a lot of time developing high quality code that will not be used if we are not aligned.

I believe that filling up the “Stakeholder” section above with the right people is a possible first step, followed by reaching a common understanding among them of where this should be headed. I believe that it would be great if Rust had a high quality debugger, a rust-debugger that worked as well as the one in C#, ideally with the help of LLDB.

Any suggestions and corrections are more than welcome!

Some thoughts on this:

New Debugger in Rust

Make a completely new debugger written in Rust itself (like BugStalker, bs )

I’m skeptical about whether such an effort would be worthwhile. While greenfield projects can be exciting, it would likely take a long time to reach feature parity across all architectures and platforms that LLDB currently supports. Most projects like this run out of steam long before reaching that level. Also, LLDB benefits from the LLVM infrastructure, allowing it to reuse instruction set decoders and other components.


impl Debug

BugStalker bs would allow us to call the Debug implementation on types in the future. This means that the source code of packages would indirectly tell the debugger how to format the type.

Do you mean running target code for impl Debug ? That probably wouldn’t fly in this form:

  • impl Debug may allocate, which won’t work for no_std targets.
  • Even if we came up with a new trait that doesn’t allocate, it might be undesirable or even impossible to emit all that code into the target binary, especially for embedded systems.
  • How would that work for crash dumps?
  • The data being inspected may be uninitialized or in an inconsistent state when a visualizer is invoked. Rust relies on strict data invariants, so this could easily corrupt process state.
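A small sketch of the first and last points, using a made-up `Stack` type: the `Debug` impl is ordinary target code that relies on the type’s invariant, and rendering it to text with `format!` allocates a `String`. If a debugger invoked it while `len` held garbage, the slice would panic inside the debuggee.

```rust
use std::fmt;

/// Invariant: `len <= items.len()`.
struct Stack {
    items: [u32; 4],
    len: usize,
}

impl fmt::Debug for Stack {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        // Panics if `len` is out of range, e.g. when the value is
        // inspected before it has been fully initialized.
        f.debug_list().entries(&self.items[..self.len]).finish()
    }
}

/// Formats the value into a heap-allocated String, or returns None if
/// the Debug impl panicked.
fn debug_string(s: &Stack) -> Option<String> {
    std::panic::catch_unwind(|| format!("{:?}", s)).ok()
}
```

Here the failure is a clean panic that `catch_unwind` can contain. With `unsafe` code behind the invariant it could just as well be undefined behavior, and on a `panic = "abort"` target the process would simply die.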

I think the visualizer should be sandboxed from the debuggee and, in remote debugging scenarios, run on the host machine.


Developing RustTypeSystem

  • Requires a lot of effort and would only benefit LLDB. Other debuggers would have to re-implement this functionality.
  • Still does not solve semantic visualization of types like Vec or HashMap because it uses low-level DWARF/PDB type descriptions.

IMO, it would make the most sense to develop a common Rust debug info access library, like Rudy, and then provide minimal shims for plugging into specific debuggers. …And always emit DWARF debug info on Windows so we don’t need to deal with PDB quirks.

We should probably also define a DSL for defining high-level visualizers for complex types. This could be lowered into some bytecode format similar to LLDB bytecode. Although I could see DWARF location expressions being used instead, as more “standard”.
Or even skip the DSL altogether and write visualizers in Rust, compile to WASM, etc. This would be much better for types like HashMap, which have very complicated lookup logic.

Requires a lot of effort and would only benefit LLDB. Other debuggers would have to re-implement this functionality.

GDB largely already has, and CDB is in Microsoft’s hands so it’s up to them to handle it. They’ve made some effort towards it via natvis and small modifications to the CodeView format from what I can tell.

If anything, LLDB is the outlier in regards to Rust support.

And always emit DWARF debug info on Windows so we don’t need to deal with PDB quirks.

I could be wrong but I don’t think this is an option. Even if Microsoft’s tools (which underpin *-msvc Rust toolchains) could generate DWARF info for MSVC ABI, and I’m not sure they can, using DWARF info would likely break a lot of the MSVC tooling that expects PDB info. The goal of *-msvc is to provide as much of an “it just works (with Visual Studio)” experience as possible, so it’s a dealbreaker if that’s affected.


Rudy is almost certainly the best bang-for-your-buck option. It’s almost as powerful as a TypeSystem, but is LLDB-version agnostic, can be written entirely in whatever language you want, doesn’t need to live in LLVM (so less coordination, and it can follow the more aggressive patch cycle of Rust), and also doesn’t need to distribute a full build of LLDB to work.

The way it works could be made more “official” with (probably) not much effort.

From the brief look I gave it, it hijacks the custom python commands to embed an entirely custom version of (essentially) DWARFASTParser, TypeSystem, Language, and some expression parsing. It bypasses LLDB internals but still piggybacks on things like breakpoints, exotic target support, etc.

If there was a more “blackbox” option for the visualizer API (e.g. LLDB passes <some form of identifier, debug info node, etc.>, gets back a raw summary string or a json struct with field:value pairs or whatever) it would expose a minimal surface area and allow “custom” backends like this one. You skip all the intermediaries between, say, variable name and SBValue, which is a huge portion of language support.
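To sketch how small that surface could be: the debugger hands over an identifier plus raw bytes, and the backend returns either a summary string or field:value pairs. All of the names below (`ValueRequest`, `Rendered`, `Visualizer`) are hypothetical, and this is not an existing LLDB or Rudy API.

```rust
use std::collections::BTreeMap;

/// What the debugger would pass to the backend (hypothetical shape).
pub struct ValueRequest<'a> {
    pub type_name: &'a str,
    /// Raw bytes of the value, read from the debuggee by the debugger.
    pub bytes: &'a [u8],
}

/// What the backend hands back.
pub enum Rendered {
    Summary(String),
    Fields(BTreeMap<String, String>),
}

/// The whole interface a custom backend would have to implement.
pub trait Visualizer {
    /// Returns None if this backend does not know the type.
    fn render(&self, req: &ValueRequest<'_>) -> Option<Rendered>;
}

/// Toy backend that only understands little-endian `u32`.
pub struct U32Visualizer;

impl Visualizer for U32Visualizer {
    fn render(&self, req: &ValueRequest<'_>) -> Option<Rendered> {
        if req.type_name != "u32" {
            return None;
        }
        let b = req.bytes.get(..4)?;
        let value = u32::from_le_bytes([b[0], b[1], b[2], b[3]]);
        Some(Rendered::Summary(value.to_string()))
    }
}
```

Everything else (type lookup, layout knowledge, formatting) stays on the backend’s side of the boundary, which is what keeps it independent of the LLDB version.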

Still does not solve semantic visualization of types like Vec or HashMap because it uses low-level DWARF/PDB type descriptions.

If the TypeSystem used rustc to power it (with rustc submodule or whatever in LLVM), the same way clang powers TypeSystemClang, I think rustc could be made to take care of this automatically.

Also, with a better testing environment (which is in the works iirc) they might also be a bit more conscientious about changing the internal representation of types since it’ll cause tests to fail. That would force the debug info fix in the same PR as the type change is made. I could also try talking to some maintainers about adding more strict debug info guidelines for changes on core container types.

I’m skeptical about whether such an effort would be worthwhile. While greenfield projects can be exciting, it would likely take a long time to reach feature parity across all architectures and platforms that LLDB currently supports. Most projects like this run out of steam long before reaching that level. Also, LLDB benefits from the LLVM infrastructure, allowing it to reuse instruction set decoders and other components.

@vadimcn That is a good point, and it makes me hesitant to spend too much effort developing the bs project. To me the end goal should be a native-feeling debugger. bs could potentially deliver that given enough time and effort. Being able to use the gdb-based rr debugger (deterministic replay of parallel programs) would also be great for complex concurrent Rust programs.


  • How would that work for crash dumps?
  • The data being inspected may be uninitialized or in an inconsistent state when a visualizer is invoked. Rust relies on strict data invariants, so this could easily corrupt process state.

  • Even if we came up with a new trait that doesn’t allocate, it might be undesirable or even impossible to emit all that code into the target binary, especially for embedded systems.

Could this be solved by the Swift approach below? According to the link you shared on LLDB bytecode, Swift does the following:

    @DebugDescription
    struct Organization: CustomDebugStringConvertible {
        var id: String
        var name: String
        var manager: Person
        // ... and more

        var debugDescription: String {
            "#\(id) \(name) [\(manager.name)]"
        }
    }

which then returns:

    (lldb) p myOrg
    (Organization) myOrg = "`#100 Worldwide Travel [Jonathan Swift]`"

Could this be implemented using a derive macro in Rust, which you could then also implement manually? This would solve the issue of debugging types like chrono’s DateTime, for example, which has an optimized memory layout where extracting the real date requires bit shifting. The source code would then document internally how types should be formatted.
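Roughly what I have in mind, written out by hand instead of derived. The `DebugDescription` trait does not exist in Rust, and the bit layout of `PackedDate` is invented for the example; it is not chrono’s real layout.

```rust
/// Hypothetical packed date: year in the upper bits, then 4 bits of
/// month and 5 bits of day. NOT chrono's actual representation.
pub struct PackedDate(u32);

impl PackedDate {
    pub fn new(year: u32, month: u32, day: u32) -> Self {
        PackedDate((year << 9) | ((month & 0xF) << 5) | (day & 0x1F))
    }
    pub fn year(&self) -> u32 {
        self.0 >> 9
    }
    pub fn month(&self) -> u32 {
        (self.0 >> 5) & 0xF
    }
    pub fn day(&self) -> u32 {
        self.0 & 0x1F
    }
}

/// Hypothetical trait that a derive macro could implement for simple
/// types and crate authors could implement by hand for packed ones.
pub trait DebugDescription {
    fn debug_description(&self) -> String;
}

impl DebugDescription for PackedDate {
    fn debug_description(&self) -> String {
        // The bit shifting lives next to the type definition, so the
        // crate itself documents how the value should be displayed.
        format!("{:04}-{:02}-{:02}", self.year(), self.month(), self.day())
    }
}
```

The open question is how the debugger would get at this logic without running it inside the debuggee.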

This is a little bit above my understanding, but maybe that is what you meant by:

We should probably also define a DSL for defining high-level visualizers for complex types. This could be lowered into some bytecode format similar to LLDB bytecode. Although I could see DWARF location expressions being used instead, as more “standard”.
Or even skip the DSL altogether and write visualizers in Rust, compile to WASM, etc. This would be much better for types like HashMap, which have very complicated lookup logic.

@vadimcn Just to be completely sure, is it correct that you feel that something like rudy should be the approach going forward?


@Walnut356

Rudy is almost certainly the best bang-for-your-buck option. It’s almost as powerful as a TypeSystem, but is LLDB-version agnostic, can be written entirely in whatever language you want, doesn’t need to live in LLVM (so less coordination, and it can follow the more aggressive patch cycle of Rust), and also doesn’t need to distribute a full build of LLDB to work.

Based on this and the work that you have done on the RustTypeSystem would you feel confident about talking with official Rust core developers about rudy?

I think that the next step for me would be talking with “Sam Scott” and ideally someone from official Rust about what they think, maybe include some of the people in this thread as well?

What do you guys think about this?

Not sure if this is correct. The main MS tool that -msvc toolchain depends upon is the linker, and even that is mostly because of PDB: at the time LLD could not handle merging of PDB debug info from object files. However, these days LLD is up to this task and Rust is planning to eventually use it on Windows by default?

We could generate DWARF in addition to PDB, or have debug info format chosen by a compiler flag independent of runtime ABI target, or design some other compromise that keeps MS tools that people care about working.

TypeSystemClang still needs type formatters to turn a std::vector into something human-readable. It is my understanding that currently Rudy hard-codes type formatters for some well-known std types; however, this is prone to breaking when std changes and does not help 3rd-party crate developers.

This is actually already possible. Could be improved, for sure, but definitely usable.

Re LLDB bytecode: sure, it can perform this sort of light formatting; however, a debugDescription like the one in the blog falls short for more complicated types, like your example of chrono’s DateTime, where one needs to evaluate at least basic expressions. Not to mention types like HashMap or BTreeMap, which need conditionals and loops as well, so basically a Turing-complete language is required.
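To illustrate why a format string is not enough for containers, here is a sketch of a visualizer that walks a snapshot of a toy hash table. The slot layout (one tag byte, then a little-endian `u32` key and `u32` value) is invented for the example and is much simpler than what std’s HashMap uses. The point is only that even this needs a loop and a conditional, and that it can run on bytes copied out of the debuggee without executing any target code.

```rust
/// Reads all occupied entries from a memory snapshot of a toy
/// open-addressing table. Assumed layout per 9-byte slot:
///   byte 0      tag (1 = occupied, anything else = empty)
///   bytes 1..5  key, little-endian u32
///   bytes 5..9  value, little-endian u32
pub fn read_entries(memory: &[u8]) -> Vec<(u32, u32)> {
    const SLOT: usize = 9;
    let mut out = Vec::new();
    // Loop over every slot; trailing partial slots are ignored.
    for slot in memory.chunks_exact(SLOT) {
        // Conditional: skip slots that hold no entry.
        if slot[0] != 1 {
            continue;
        }
        let key = u32::from_le_bytes([slot[1], slot[2], slot[3], slot[4]]);
        let value = u32::from_le_bytes([slot[5], slot[6], slot[7], slot[8]]);
        out.push((key, value));
    }
    out
}
```

A function of this shape takes only a byte slice, so it works equally well on a live process, a remote target, or a crash dump.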

Mind you, the LLDB bytecode does have all these features. IMO, the problem is how to expose them to developers. Unfortunately, the Swift blog does not explain how they intend to do that. Would they have the developers writing visualizers directly in bytecode assembly? Seems like this would cause a lot of friction. Which is why I suggested higher-level DSL, that is lowered to the bytecode. Though this still does not solve the problem of LLDB bytecode being LLDB-only…

So my final idea was WASM - after all, Rust and most other languages already have a backend for it, so visualizers could be written in the same source language.