[RFC] Software watchpoints support

1. Context and Motivation

Currently, LLDB lacks support for software watchpoints. This gap means that certain debugging functionalities available in GDB are not present in LLDB. As outlined in the GDB Internals Watchpoints documentation, software watchpoints are crucial in several scenarios:

  • The watched memory region is too large for the underlying hardware watchpoint support. For example, each x86 debug register can watch up to 4 bytes of memory, so trying to watch data structures larger than 16 bytes will cause GDB to use software watchpoints.
  • The value of the expression to be watched depends on data held in registers (as opposed to memory).
  • Too many different watchpoints requested. (On some architectures, this situation is impossible to detect until the debugged program is resumed.) Note that x86 debug registers are used for both hardware breakpoints and watchpoints, so setting too many hardware breakpoints might cause watchpoint insertion to fail.
  • No hardware-assisted watchpoints provided by the target implementation.

It is important to emphasize that software watchpoints, as implemented in GDB, incur a severe performance penalty, often slowing down program execution by orders of magnitude (tens to hundreds of times). This is a fundamental consequence of the single-stepping approach and represents a heavy trade-off for the user. We believe that, despite this cost, the option should be available for debugging scenarios where convenience is critical and performance is secondary. LLDB must clearly indicate when this “slow mode” is active and expose settings to control it.

2. Proposed Implementation

The GDB Remote Serial Protocol (RSP) does not define packets for software watchpoints. Consequently, in GDB, this feature is implemented entirely on the host side, with the gdbserver remaining unaware of it.

We propose a similar approach for LLDB: implementing the logic for software watchpoints entirely on the host side. The core of this implementation is a new ThreadPlan type: WatchpointStepInstructionThreadPlan.

  • Inheritance: This plan inherits from StepInstructionThreadPlan, whose primary function is to execute a single instruction step.
  • Custom Logic: The WatchpointStepInstructionThreadPlan overrides the DoPlanExplainsStop and ShouldStop methods. Within these methods, it checks whether the value of any enabled software watchpoint has changed (indicating a watchpoint hit). If a hit is detected, it sets a WatchpointStopReason for the thread.
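
To make the check concrete, here is a minimal standalone sketch of the comparison such a plan would perform after each step (a model only, not actual LLDB code; SoftwareWatchpoint and ReadMemoryFn stand in for the real watchpoint object and Process::ReadMemory):

    #include <cstddef>
    #include <cstdint>
    #include <functional>
    #include <utility>
    #include <vector>

    // Model of the state a software watchpoint carries: the watched address
    // and a snapshot of its bytes taken before the last single step.
    struct SoftwareWatchpoint {
      uint64_t addr;
      std::vector<uint8_t> old_bytes;
    };

    // Stand-in for Process::ReadMemory: fill `dst` with `size` bytes read
    // from `addr`, returning true on success.
    using ReadMemoryFn =
        std::function<bool(uint64_t addr, void *dst, std::size_t size)>;

    // Returns true if any enabled software watchpoint's region changed across
    // the last single step, i.e. the plan should report a watchpoint hit.
    bool AnyWatchedRegionModified(std::vector<SoftwareWatchpoint> &wps,
                                  const ReadMemoryFn &read_memory) {
      bool hit = false;
      for (auto &wp : wps) {
        std::vector<uint8_t> new_bytes(wp.old_bytes.size());
        if (!read_memory(wp.addr, new_bytes.data(), new_bytes.size()))
          continue; // region unreadable; treat as no hit
        if (new_bytes != wp.old_bytes) {
          hit = true;
          wp.old_bytes = std::move(new_bytes); // refresh snapshot for the next step
        }
      }
      return hit;
    }

On a hit, the real plan would additionally record the old and new bytes so the stop can be reported with the same information as a hardware watchpoint hit.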

Workflow:

  1. Before resuming a thread, if any enabled software watchpoints exist, a WatchpointStepInstructionThreadPlan is pushed onto the top of the thread plan stack (or just after a StepOverBreakpoint plan if one is active).
  2. The thread executes a single instruction and control returns to the debugger.
  3. The debugger walks the thread plan stack, calling DoPlanExplainsStop/ShouldStop. Our plan checks all enabled software watchpoints for modifications.
  4. If a hit occurs: Control is returned to the user with the appropriate watchpoint stop reason.
  5. If no hit occurs and no other thread plans request a stop, execution resumes. In either case, the current WatchpointStepInstructionThreadPlan is popped from the stack.
  6. On the next resume operation (if software watchpoints are still enabled), a new WatchpointStepInstructionThreadPlan is pushed, and the cycle repeats for the next instruction.
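
Putting the workflow together, the per-resume cycle can be modeled roughly as follows (again a sketch reusing the types from the snippet above; single_step stands in for pushing the plan and letting the thread execute exactly one instruction):

    // Rough model of the resume/step/check cycle described above.
    void RunWithSoftwareWatchpoints(std::vector<SoftwareWatchpoint> &wps,
                                    const ReadMemoryFn &read_memory,
                                    const std::function<bool()> &single_step) {
      while (single_step()) {                  // steps 1-2: push plan, step once
        if (AnyWatchedRegionModified(wps, read_memory)) {
          // steps 3-4: the plan explains the stop; report a watchpoint
          // stop reason and return control to the user.
          break;
        }
        // steps 5-6: no hit and nothing else wants to stop; the plan is
        // popped and a fresh one is pushed on the next resume.
      }
    }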

3. Rationale and Considerations

This is an intrusive solution as it significantly interacts with the thread plan mechanism. However, for a host-side implementation, it is challenging to envision a fundamentally different approach. Implementing this on the server (e.g., in lldb-server) is less desirable, as it would require substantial changes to the gdb-remote protocol. Such changes could also complicate future support for bare-metal targets.

We acknowledge the potential performance impact of single-stepping. To mitigate this, the implementation should include clear performance warnings in the documentation.

4. Questions and Alternatives

We would greatly appreciate feedback on this proposal, specifically regarding alternative, less intrusive methods for implementing this functionality within LLDB’s architecture.

Thank you for your time and consideration. We look forward to your thoughts and suggestions.

This is a big change, and as you say very intrusive. I also suspect that it is going to be too slow in practice to actually be useful. Watchpoints are really needed when the complexity of the program you are debugging outstrips your ability to reason about when variables are getting touched, so if it is only viable in simple programs that would make it hard to justify the maintenance burden of the added complexity. I don’t think “gdb has this not-very useful feature so lldb must” is a particularly compelling argument either.

You seem to have a working implementation based on your previous patches. Can you post some performance numbers for using this in a reasonably sized program? How much does this slow down execution, and were you able to use it to solve some actual problem? You say in vague terms that it is slower, but that doesn’t give a good sense of “slow but still useful” or “not going to hit that watchpoint before the heat death of the universe”.


Most architectures have some instructions that can’t be single-stepped over. These instructions tend not to be branches, so lldb won’t try to single step over them in normal usage. Instead, it sets breakpoints on branches and “continues” from branch to branch. But your software single-step approach forces you to step-i over every instruction, and so you may end up either stalling the program or causing it to behave incorrectly.

You can work around that by detecting these instructions and doing a “break at the next instruction and continue” over them instead. Sometimes these “unsteppable instructions” come in pairs as well, so you might have to be a little more clever than that…
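
As a concrete illustration of such a pair (just an example, not tied to the patch): a compare-and-swap loop like the one below typically compiles to an lr/sc sequence on RISC-V (or ldxr/stxr on AArch64 without LSE atomics), and a debug trap taken between the paired instructions can clear the reservation, so the store-conditional keeps failing under naive instruction stepping.

    #include <atomic>

    std::atomic<int> counter{0};

    // Typically becomes a load-reserved ... store-conditional retry loop
    // (lr.w/sc.w on RISC-V, ldxr/stxr on AArch64). A debug exception taken
    // between the two halves drops the reservation, which is why the whole
    // sequence has to be run over in one go rather than single-stepped.
    void increment() {
      int expected = counter.load(std::memory_order_relaxed);
      while (!counter.compare_exchange_weak(expected, expected + 1,
                                            std::memory_order_acq_rel,
                                            std::memory_order_relaxed)) {
        // expected is refreshed by compare_exchange_weak; just retry.
      }
    }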

I concur with Jim; having some actual numbers would help weigh the benefits and the costs. We have about 50 watchpoint tests; a good start would be a comparison of the tests’ execution time when using hardware and software watchpoints.

My other concern is that, for historical reasons, breakpoints and watchpoints don’t share as much of their implementation as they could, despite being very similar conceptually. Unifying the two has been a long-standing goal, and we have to make sure that this change doesn’t diverge them further but instead brings them closer.

The patches submitted previously that included this feature have really been portmanteau patches: they add this feature but also fix unrelated issues and include refactorings unrelated to the feature.

There should be no reason why the implementation of software watchpoints should require any changes in the “what happens once we’ve decided a watchpoint was hit” part of lldb - and the changes in those patches that touch these areas bear that out - they seem unrelated.

So far as I can tell you should be able to make a patch for just this feature that doesn’t touch the parts of the Watchpoint that we want to unify with Breakpoints (which includes all the reaction code in StopInfoWatchpoint and StopInfoBreakpoint). So if the patch backing this PR sticks to JUST implementing software watchpoint support, it shouldn’t need to touch any of the areas we want to unify with Breakpoints.

And since we strongly discourage portmanteau patches, we do want a patch that only implements this feature.

Of course, we have no objection to fixing other parts of the watchpoint support, but those should be separate patches and if you were to try your hand at that, we should first have a discussion about how to do that without adding further code duplication between watchpoint reactions and breakpoint ones.

Here are the benchmark results for the test-suite timing I collected. For a quick overview, I have extracted the 10 longest software watchpoint tests into a comparative table.

I have also attached the complete logs I obtained: hardware_watchpoints.txt (105.6 KB) and software_watchpoints.txt (159.3 KB).

For reproduction, you can use the following command if you’re interested: llvm-lit -a --filter="watchpoint" --time-tests build/tools/lldb/test/API.

The results show that, in these test scenarios, software watchpoints do not exhibit severe performance degradation: the total testing time increased approximately threefold.

It’s important to note that alongside the tests for software watchpoints, the tests for hardware watchpoints were also running. This means the total number of tests executed doubled. Therefore, I consider a mere threefold increase in the overall testing time to be more than acceptable.

Frankly, I believe the performance concerns regarding software watchpoints are not substantial. In GDB, this functionality has found grateful users in the scenarios I described above. Even when a processor provides the hardware triggers required for hardware watchpoints, the number of such triggers is critically small, and some processors lack them entirely. Note that even in LLDB testing, there was a test unsupported on x86 that now passes (see TestLargeWatchpoint.py in the table). In these cases, where there are no alternatives, software watchpoints are better than nothing, provided the user understands the risks associated with using them.

Of course, it is always possible to find a specific test where the execution time would be unacceptable. For instance, during LLDB testing I had to reduce the number of iterations in the TestUnalignedLargeWatchpoint.py test; otherwise, the test would time out. However, we are writing a debugger for users who debug a wide variety of programs under vastly different conditions.

When you are comparing timing using the running of lldb test suite tests, remember that pretty much all the time of the test is going to be in Python, setting up the test environment, then in lldb making the target, getting it to start up and run to the first main breakpoint before you start doing any work. Only a tiny proportion of the original test is actually the stepping part. So for instance for TestWatchlocation.py, which goes from 14 seconds to 58 seconds, there was likely less than a second spent actually stepping in the original 14 seconds. But all the setup work is the same, so that means you are actually reporting a slowdown of 1 second → 45 seconds for the part where you were doing real single-stepping. And that 1 second is probably an over-estimate. This is also a test that only has three threads and doesn’t get to run very much code, because we stop at the first watchpoint hit and quit the test.

So what that experiment tells me is that running a very small program with three threads for not very long is at least 45x the running time. That could be more, and I can’t tell how this would scale to a complex and long-running program.

Again, watchpoints are not really a crucial feature in small programs or programs that don’t have much concurrency or have very short execution times, where you can usually reason about what’s going on without them. To be an actually useful feature they need to be useable in complex long-running programs where such reasoning is actually hard.

I don’t want to put a lot of work into a feature that is only useful for toys, but more importantly I don’t want to have a feature in lldb that leads someone along, thinking this feature will be useful to them, only to have it choke in most real world situations.

A useful experiment would be a program that makes 100 threads that all do actual work for say 5 minutes before it changes the variable that you are watching. How long would it take for that program to hit the watchpoint through single-stepping vs. the same run using hardware watchpoints?

It shouldn’t be hard to build a little harness that can vary the number of threads and the work done, and see how this actually scales. If in the end it’s on the order of 50x, that might be acceptable. It makes hitting a bug in a 5 minute run go to 4 hours, which you might be willing to wait for if the bug was hard enough to figure out. But if it’s 1000x, that takes a 5 minute run to 80-some hours, and I doubt anyone would want to use that.
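
Something along these lines would do; this is just a sketch, and the thread count, run time, and the name g_watched are arbitrary:

    #include <atomic>
    #include <chrono>
    #include <cstdio>
    #include <cstdlib>
    #include <thread>
    #include <vector>

    volatile int g_watched = 0; // put the watchpoint on this

    int main(int argc, char **argv) {
      // Usage: harness [num_threads] [seconds_of_work]
      const int num_threads = argc > 1 ? std::atoi(argv[1]) : 100;
      const int seconds = argc > 2 ? std::atoi(argv[2]) : 300;

      std::atomic<bool> done{false};
      std::vector<std::thread> workers;
      for (int i = 0; i < num_threads; ++i) {
        workers.emplace_back([&done] {
          // Busy work that never touches g_watched.
          volatile unsigned long sink = 1;
          while (!done.load(std::memory_order_relaxed))
            sink = sink * 3 + 1;
        });
      }

      // Let the workers run for the requested time, then modify the
      // watched variable so the watchpoint fires near the end of the run.
      std::this_thread::sleep_for(std::chrono::seconds(seconds));
      g_watched = 42; // the watchpoint should trigger here
      done = true;

      for (auto &t : workers)
        t.join();
      std::printf("g_watched = %d\n", g_watched);
      return 0;
    }

Timing from “continue” to the watchpoint hit on g_watched, once with a hardware watchpoint and once with the software variant, would give the scaling numbers I’m after.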

@jingham I don’t believe these performance measurements would be very useful, as it’s obvious that hardware-assisted watchpoints are orders of magnitude faster than a software implementation.

Again, watchpoints are not really a crucial feature in small programs or programs that don’t have much concurrency or have very short execution times, where you can usually reason about what’s going on without them. To be an actually useful feature they need to be useable in complex long-running programs where such reasoning is actually hard.

I don’t fully agree with this statement. The utility of a feature is highly dependent on the specific program and the user’s debugging scenario. “Small programs” still require debugging, and there is nothing wrong with supporting that use case.

Furthermore, not all programs are concurrent or multi-threaded. The LLVM-based compiler (clang) itself is a good example. I believe having this feature available would be beneficial, not harmful.

I don’t want to have a feature in lldb that leads someone along, thinking this feature will be useful to them, only to have it choke in most real world situations.

@jingham, @JDevlieghere This patch does not remove hardware watchpoint support. Instead, it provides the user with an option. If the concern is that users might be misled by the feature’s performance, we could add an additional warning. Moreover, this feature is not used unless the user explicitly provides an additional flag (“watchpoint set -S”).

Could you please elaborate on your concerns? Are you completely opposed to having this functionality in the LLDB codebase, or is your concern about the quality of the patch itself—specifically that it is quite large and intrusive?

If it is the latter, I can certainly try to split the patch into more digestible chunks.

So far as I can tell you should be able to make a patch for just this feature that doesn’t touch the parts of the Watchpoint that we want to unify with Breakpoints (which includes all the reaction code in StopInfoWatchpoint and StopInfoBreakpoint).

Could you please clarify which specific functionality you are referring to here? Please note that I’ve already factored out the refactoring-related changes into a separate PR: [lldb] refactor watchpoint functionality by dlav-sc · Pull Request #159807 · llvm/llvm-project · GitHub. I’ve outlined the reasons for the refactoring there; it is mainly motivated by the desire to make the LLDB codebase cleaner. @jingham, @JDevlieghere any chance you have the bandwidth to take a look while we are on this topic?

You are correct that, for the initial implementation of the software watchpoint feature, I could have avoided touching this logic. However, while working on the feature I also tried to make the codebase easier to understand; in my opinion the original code is a little messy, and I think the refactoring would not hurt.

Most architectures have some instructions that can’t be single-stepped over.

Yes, I encountered this issue while testing software watchpoints on the RISC-V architecture. Specifically, LLDB did not support single-stepping through lr/sc (Load-Reserved/Store-Conditional) atomic sequences on RISC-V. To fix this, I implemented the necessary logic for RISC-V and created a more convenient interface for handling such special instructions on other architectures (https://github.com/llvm/llvm-project/pull/127505).

This solution doesn’t cover all cases, but it can be improved in the future.
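
For readers unfamiliar with the problem, the handling amounts to roughly the following. This is only a standalone model of the idea, not the interface from that patch, and all of the helper names here are hypothetical:

    #include <cstdint>
    #include <functional>

    // Hypothetical decode result: does the instruction at `pc` start an
    // atomic (e.g. lr/sc) sequence, and if so, where does the sequence end?
    struct AtomicSequence {
      bool is_sequence_start = false;
      uint64_t end_pc = 0; // first instruction after the sequence
    };

    using DecodeFn = std::function<AtomicSequence(uint64_t pc)>;
    using StepFn = std::function<void()>;             // plain single step
    using RunToFn = std::function<void(uint64_t pc)>; // breakpoint at pc + continue

    // Advance by one "step unit": a plain single step, unless we are at the
    // start of an atomic sequence, in which case the whole sequence is run
    // in one go so no debug trap lands between the paired instructions.
    void StepOne(uint64_t pc, const DecodeFn &decode, const StepFn &step,
                 const RunToFn &run_to) {
      AtomicSequence seq = decode(pc);
      if (seq.is_sequence_start)
        run_to(seq.end_pc);
      else
        step();
    }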

Just a small aside, last time I looked (YEARS AGO), having lldb stop & resume a process repeatedly was something we could do maybe 2000 times a second. It could be plus or minus an order of magnitude now, but I bet it’s probably close to that still. A modern processor can execute billions of instructions per second, when the process is scheduled on-core and not blocked. It’s an almost unimaginable difference in speed.
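In round numbers that is about 10^9 / 2,000, i.e. a slowdown on the order of 500,000x for code that would otherwise be running freely.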

From a users’ POV – not commenting on implementation details – I have found software watchpoints useful in gdb in the past, and having them in LLDB seems like a good idea, too.

As mentioned, yes, they’re very slow. So, they’re not often the best choice. But the way I’ve successfully used them is when I have a smaller region of execution (within a large program), where I know that something’s being changed, but I don’t know exactly where the change is occurring (e.g. “Well, it’s modified sometime after breakpoint X but before breakpoint Y”). Sure, I could try to narrow it down more myself…or I could just have the debugger tell me. Even if it takes 2 minutes, that can still be a good option.

I know I said I’d look at the PR but things got away from me sorry about that. Now this thread exists so I’ll reply here.

TLDR: if it’s always opt-in and explained as clearly as GDB explains it, I think this is a positive for users. On the code itself, I can’t offer any intelligent comments.

We also have open bugs for Arm’s atomic sequences. If software watchpoints are compatible with the special handling then I just consider it a bug that we do not have such handling yet.

If there are instructions/instruction sequences that cannot be handled at all, I’d like to know what they are.

I wonder if the non-determinism that affects RR would also be a problem here? There are a few things on Arm that cannot be handled by RR purely because of architecture decisions; users just have to be aware of that, unfortunately.

This matches my experience with other debug tools. There are two extremes:

  • Software watchpoints where you can run any expression written in a normal way.
    • This “works” immediately but you have to wait a while for it to hit.
  • Making the user mentally compile that expression into some chain of complex breakpoint triggers.
    • This requires you to become an expert on the arcane breakpoint configurations hardware has (which I do not recommend to anyone).
    • Adding support for the new arcane thing takes an expert’s time, so it may not be supported in your tools yet.

We tried building “recipes” to encode common logic and little compilers but ultimately even power users just wanted to write conditions in a way they were used to (and we were not compiler experts tbf). It took forever to execute a lot of the time, but it was better than nothing.

(when “a lot of time” was weeks we would step in and help but this was a commercial product where you could pay for that kind of support)

So from a user’s perspective, if it doesn’t get in the way of the usability of the tools, having software watchpoints is a positive.

I do think people will underestimate just how slow this can get, but that was always going to happen and there’s only so much messaging we can do before we say “some people will get the wrong end of the stick and we’re ok with that”.

As you have already said, it should never be the default or a fallback without the user taking extra actions to choose it. That is, passing a specific API flag or adding a command option.

Even in the case where you ran out of watchpoints, the most we should be doing is hinting that software watchpoints exist. Make them go read the help doc to find out how to use them.

In addition to that, you can add a doc/section of a doc on the website that explains how this works and we can cite that. GDB’s page says something like “can be 100s of times slower” and I found that via Google. So we can make sure lldb has an equivalent.

I am not very familiar with that area of the code but this is my main concern. I can’t add anything constructive here though so I defer to someone who can.

Certainly if you think it can be simplified regardless of adding software watchpoints or not, that would be welcome.

The tough thing here is setting a cut-off where it goes from “really clearly warn users” to “this isn’t worth having at all”. That doesn’t mean we should randomly pick one though; we should at least make an attempt to measure, I agree with that.

I do wonder if we might have some inefficiencies in our single step that make it worse than GDB; that would be worth checking for low-hanging fruit.

Not suggesting we optimise for this case but if a software watch is 100x slower with lldb than gdb, that should be investigated.

(GDB’s local debug all being in the same process with no separate debug server might make a difference)

If the argument is that it’s useful for some GDB users so it would be cool to have in LLDB, we could require that it perform roughly the same?

Certainly that’s what I would expect as a user, and this means we benefit from years of GDB setting expectations for performance.

I thought maybe we could use an RR-like strategy, where you run freely to checkpoints, then restore to an earlier point and single step. However, this is trivially broken if, for example, a variable changes value between checkpoints but has the same value at each checkpoint.

A server side implementation would probably be a byte code expression, this is what the commercial tools I worked on had (though we never exposed this to users IIRC).

GDB has this: Using Agent Expressions (Debugging with GDB), which sounds like it, but it’s the first time I’m reading about it.

Also this is related to Tracepoints (Debugging with GDB) which are another class of “point” it has. So I don’t know that anyone has compiled software watchpoint expressions into this.

Normally I’d take that as a hint that doing so isn’t a great idea, but then again, GDB has a lot of features someone added at some point for their use case and no one has connected the dots yet.

I think it could be done on dedicated probes or simulator debug stubs but in-kernel stubs would be awkward. Certainly would be nice for it to “just work (slowly)” for all of those.

Also, I wonder, for hosted same-system debugging, whether a server-side implementation would save that much time. It would save some inter-process communication time in LLDB’s case, I guess. It also depends on how “low level” the expression can get: can we get it down to just addresses, or does the server have to start parsing debug information?

Another way to think about this:

If software watchpoints were 10,000x slower in LLDB, but 100x slower in GDB, would it be better for users if we just said “if you really want this feature, use GDB”.

For the most basic Linux-distro-using user, that would be the thinking. Can we apply that to everyone? No.

  • Anyone on MacOS can’t use GDB (at least on Arm64?)
  • Anyone not given GDB by their vendor can’t complain to their vendor when GDB doesn’t work well
  • Anyone shipping an LLDB with support for their specific hardware may not have an equivalent GDB
  • Some entities cannot / do not want to use GPL software

So ultimately this line of thinking only applies to the broad reputation of LLDB but perhaps worth thinking about for setting a first approximation of “acceptable” performance.

Thanks for collecting these numbers. Although these tests may not represent real-world use cases, they confirm that we can use the existing tests to add coverage for the software variant. This addresses my first concern: that the feature would be too slow to include in automated testing.

Hold on… If you changed the tests then this isn’t really a fair comparison anymore. Did you change anything else?

I don’t think that’s a conclusion we can draw from the tests which, as Jim points out above, aren’t necessarily representative of real world usage.

To use the terminology used in the book “Working in Public” (1,2), I’m trying to get a sense of whether the addition of software watchpoints is extractive or not.

Extractive contributions are those where the marginal cost of reviewing and merging that contribution is greater than the marginal benefit to the project’s producers.

In other words, the (admittedly subjective) evaluation of cost-vs-reward. Various factors contribute to that, which includes the perceived value and the intrusiveness of the change. I realize that’s not very actionable on your part, so let me try to be more concrete, speaking strictly for myself:

  1. Is this generally too slow to be reasonably usable in a real world scenario? Jim seems to be leaning towards no. David seems to be leaning towards yes. I’m still on the fence.
  2. Can we have automated tests for this? Based on the numbers you collected, it seems we can run the existing watchpoint tests in both hardware and software mode.
  3. Can we set expectations for our users? This concern is addressed by not making this the default/fallback and with documentation on the website, which we seem to have consensus on.
  4. Does this actually work reliably? The first law of debuggers is that they never lie. What are the expectations in terms of false positives and false negatives?
  5. Does this get us closer to or further from the breakpoint/watchpoint unification? If the change is intrusive but gets us closer to something we’re already working towards, you’re more likely to find motivated contributors to help you shepherd this through.
  6. Is this a stopgap solution until RISC-V hardware watchpoint support is added to the kernel, or will you remain motivated to maintain this feature regardless?

I think those are the main things for me, with (2) and (3) already addressed. For the changes to the ThreadPlan machinery, I’ll defer to Jim as that’s very much his area of expertise.

FWIW, I also did a quick search through our internal bug tracker and I can’t find any requests or mentions of software watchpoints.

I’m guessing that the fact that we don’t see any concrete timings - which the running of the test suite cases didn’t provide any insight into - means that the performance probably is as bad as we suspect. I also still suspect this feature is more cursed at than actually used to good purpose. But provided that it does no work if you don’t use it, my suspicion on those grounds shouldn’t be determinative.

I’m also concerned about the effects on the stability of the execution control state machine. I didn’t look at that part in detail in the previous patch because it was a mashup of this and a bunch of other changes that didn’t seem relevant to the proposed feature, which made that core part of it hard to assess.

So I’d really want to look at a patch that just pushes these “single-step-and-check-value” plans onto all the thread plan stacks that are going to continue on any given step, and manages converting “trace” stop reasons to “watchpoint” stop reasons when appropriate.

If done right, that should be a pretty small patch, since it’s playing the same role - albeit in a more intrusive way - as the “step over breakpoint” and the “timed step-over” plans, so you shouldn’t have to invent any new machinery for the purpose. And it should not have any effect either on how watchpoints are set (beyond the addition of the command flag), or on the reactions given in the “StopInfoWatchpoint”. The only job of this feature is to (a) make all threads proceed one by one by single-step and (b) respond to the single-step stop that has changed the watched memory region by converting the stop to a Watchpoint stop with the correct “new” and “old” values. If you wanted to also add support for only slowing down the watched threads for thread specific watchpoints, that would be okay to include.

But even things like being more clever about printing the value change for the larger regions you might be watching (which is one of the uses of this approach) should not be in this patch. OTOH, if you are really watching a big region you’ll need to do some kind of diff presentation, since printing two 64K buffers as “new” and “old” isn’t terribly helpful. Then again, many modern hardware watchpoint systems allow you to watch much larger regions already, so that’s not an issue particular to this feature.

If you wanted to also help with refactoring the current Watchpoint support, for instance to reduce code and behavior duplication between Watchpoints & Breakpoints that would be great. But that should all happen in separate patches.

@DavidSpickett,

Not suggesting we optimise for this case but if a software watch is 100x slower with lldb than gdb, that should be investigated.

Out of curiosity, I decided to compare the performance of a software watchpoint in LLDB and GDB. I created a loop where a watchpoint triggers on the 100th iteration:

    #include <stdio.h>

    int main() {
      int value = 7;
      for (unsigned idx = 0; idx < 1000; ++idx) {
        printf("Iteration %u starts...\n", idx);
        printf("value = %d\n", value);
        if (idx == 100)
          value = 42;
        printf("Iteration %u ends...\n", idx);
      }
    }

I compiled this code with -O0 for x86 and measured the time from resume until the watchpoint hit on my machine:

  • GDB: 1 minute, 1 second
  • LLDB: 1 minute, 50 seconds

LLDB performed slightly worse, but the results are within the same order of magnitude.

@JDevlieghere,

Hold on… If you changed the tests then this isn’t really a fair comparison anymore. Did you change anything else?

No, that is the only modification. Furthermore, it is expected that a software watchpoint would not handle a loop with 16,776,960 iterations.

Is this generally too slow to be reasonably usable in a real world scenario?

It depends entirely on the specific scenario. Here are several key points to consider when using software watchpoints:

  1. Software watchpoints can only be of the ‘modify’ type. They cannot monitor direct accesses to the watched region, meaning a watchpoint will not trigger on reads (read type) or on writes of the same value (write type). Therefore, using a software watchpoint to track, for example, reads is not possible, and the debugger should return an error if the user attempts to set a non-modify software watchpoint.
  2. As has been noted repeatedly, the execution speed of software watchpoints is relatively low. For instance, single-stepping through a large loop with a software watchpoint can be time-consuming, and in some cases, if the loop contains many instructions, the time required may be unacceptable. Here, I can only note that I compared the performance of software watchpoints in GDB and LLDB and did not observe a significant difference.
  3. On targets without hardware stepping support (such as ARM, RISC-V, and several others), one should be prepared for even longer execution times with software watchpoints.
  4. Scenarios where software watchpoints can be useful are outlined in the RFC description.

Does this actually work reliably?

Yes, but there is one important caveat. As mentioned above, there is no guarantee that LLDB can single-step through every instruction correctly. Determining the location of the next instruction sometimes requires non-trivial logic, such as when dealing with atomic regions. As I noted previously, I have created a patch that simplifies adding support for such instructions and also implements stepping through atomic regions on RISC-V (#127505). We can expand the set of supported special instructions on demand, which would make software watchpoints and the step-instruction command more robust.

Does this get us closer to or further from the breakpoint/watchpoint unification?

Regarding software watchpoints specifically, they are unrelated to this unification. Their implementation did not necessitate a major refactoring of StopInfoWatchpoint or, most likely, the Watchpoint class itself. Initially, I simply wanted to avoid duplicating the watchpoint checking code, which prompted me to start extracting this logic from StopInfoWatchpoint. Subsequently, I decided that making the code cleaner would also be beneficial, leading to this more extensive refactoring.

Is this a stopgap solution until RISC-V hardware watchpoint support is added to the kernel, or will you remain motivated to maintain this feature regardless?

Providing watchpoint functionality for targets that do not support hardware watchpoints (whether due to missing software support, such as in the Linux API in the RISC-V case, or due to the absence of hardware support) is one of the four motivations for adding software watchpoints. All four motivations are presented in the RFC above and largely reference GDB’s documentation.

FWIW, I also did a quick search through our internal bug tracker and I can’t find any requests or mentions of software watchpoints.

I also did some searching and found at least these issues: #47102, #31147, #108492. The first two issues concern watchpoints on registers, which software watchpoints could theoretically handle. The last issue is an attempt to use a hardware watchpoint on a platform without the necessary hardware triggers.

@jingham,

If done right, that should be a pretty small patch, since it’s playing the same role.

Yes, you are correct here. In that case, I will strive to create a separate pull request containing only the changes related to the implementation of software watchpoints. It might require duplicating some code from StopInfoWatchpoint in places, but I will aim to keep changes to StopInfoWatchpoint and the Watchpoint class to an absolute minimum. Most likely, they will not need to be modified at all.

But even things like being more clever about printing the value change for the larger regions you might be watching (which is one of the uses of this approach) should not be in this patch.

Yes, I have seen the output for large watched regions in the tests, and it is indeed difficult to understand where exactly the change occurred. If the decision is made to add software watchpoints in lldb, we can then consider how to make the output more readable.

If you wanted to also help with refactoring the current Watchpoint support, for instance to reduce code and behavior duplication between Watchpoints & Breakpoints that would be great. But that should all happen in separate patches.

Yes, I had already attempted to move the refactoring into a separate pull request, which I referenced in my previous message, but I agree it still ended up being quite large. I will try to split it into several smaller, more digestible pull requests.

The next steps I propose are:

  • I will create a pull request with changes pertaining only to software watchpoints, where I will strive not to touch any other logic. A verdict on whether software watchpoints are needed in LLDB can be made based on this PR.
  • I will attempt to break down the refactoring pull request into smaller, more digestible parts. I believe it will make the breakpoint/watchpoint unification easier; therefore, I hope that the refactoring, along with minor fixes (each of which I will also submit as a separate PR), will be accepted regardless of the decision regarding software watchpoints.

Sounds like a good plan.

Be sure to pay attention to how the execution control machinery handles the “ThreadPlanStepOverBreakpoint”. You are going to have to do a similar sort of “last minute” intervention to re-implement whatever the rest of the plan stack thought it was doing. If they can be done using a similar mechanism, that would be cleaner; otherwise they might fight over ordering.

And remember in testing this that it won’t be useful if it only works for continue. It also has to behave correctly when users say step and next etc…