update ManualDWARFIndex::Index to use std::once #165896

zhyty · 2025-10-31T18:10:44Z

Small change to use (what I think is) a better practice -- we were using the m_indexed bool member to make sure we called Index() once, but we should just use std::once! This change shouldn't affect functionality.

This change may also make concurrent access to Index() thread-safe, though the ManualDWARFIndex API isn't completely thread-safe due to Decode(). I'm not sure if ManualDWARFIndex was ever intended to be thread-safe.

Test Plan:

ninja check-lldb

Tested basic debugging workflow of a couple of large projects I had built. Basically:

(lldb) target create <project> (lldb) b main (lldb) r (lldb) step ...

I A/B tested the performance of launching several modules with parallel module loading and didn't observe any performance regressions.

Summary: Test Plan: Reviewers: Subscribers: Tasks: Tags:

llvmbot · 2025-10-31T18:11:28Z

@llvm/pr-subscribers-lldb

Author: Tom Yang (zhyty)

Changes

Small change to use (what I think is) a better practice -- we were using the m_indexed bool member to make sure we called Index() once, but we should just use std::once! This change shouldn't affect functionality.

This change may also make concurrent access to Index() thread-safe, though the ManualDWARFIndex API isn't completely thread-safe due to Decode(). I'm not sure if ManualDWARFIndex was ever intended to be thread-safe.

Test Plan:

ninja check-lldb

Tested basic debugging workflow of a couple of large projects I had built. Basically:

(lldb) target create &lt;project&gt; (lldb) b main (lldb) r (lldb) step ...

I A/B tested the performance of launching several modules with parallel module loading and didn't observe any performance regressions.

Full diff: https://github.com/llvm/llvm-project/pull/165896.diff

2 Files Affected:

(modified) lldb/source/Plugins/SymbolFile/DWARF/ManualDWARFIndex.cpp (+100-103)
(modified) lldb/source/Plugins/SymbolFile/DWARF/ManualDWARFIndex.h (+1-1)

diff --git a/lldb/source/Plugins/SymbolFile/DWARF/ManualDWARFIndex.cpp b/lldb/source/Plugins/SymbolFile/DWARF/ManualDWARFIndex.cpp index d90108f687f84..aa30682faed90 100644 --- a/lldb/source/Plugins/SymbolFile/DWARF/ManualDWARFIndex.cpp +++ b/lldb/source/Plugins/SymbolFile/DWARF/ManualDWARFIndex.cpp @@ -22,7 +22,6 @@ #include "lldb/Utility/Stream.h" #include "lldb/Utility/Timer.h" #include "lldb/lldb-private-enumerations.h" -#include "llvm/Support/FormatVariadic.h" #include "llvm/Support/ThreadPool.h" #include <atomic> #include <optional> @@ -33,118 +32,116 @@ using namespace lldb_private::plugin::dwarf; using namespace llvm::dwarf; void ManualDWARFIndex::Index() { - if (m_indexed) - return; - m_indexed = true; - - ElapsedTime elapsed(m_index_time); - LLDB_SCOPED_TIMERF("%p", static_cast<void *>(m_dwarf)); - if (LoadFromCache()) { - m_dwarf->SetDebugInfoIndexWasLoadedFromCache(); - return; - } + std::call_once(m_indexed_flag, [this]() { + ElapsedTime elapsed(m_index_time); + LLDB_SCOPED_TIMERF("%p", static_cast<void *>(m_dwarf)); + if (LoadFromCache()) { + m_dwarf->SetDebugInfoIndexWasLoadedFromCache(); + return; + } - DWARFDebugInfo &main_info = m_dwarf->DebugInfo(); - SymbolFileDWARFDwo *dwp_dwarf = m_dwarf->GetDwpSymbolFile().get(); - DWARFDebugInfo *dwp_info = dwp_dwarf ? &dwp_dwarf->DebugInfo() : nullptr; - - std::vector<DWARFUnit *> units_to_index; - units_to_index.reserve(main_info.GetNumUnits() + - (dwp_info ? dwp_info->GetNumUnits() : 0)); - - // Process all units in the main file, as well as any type units in the dwp - // file. Type units in dwo files are handled when we reach the dwo file in - // IndexUnit. - for (size_t U = 0; U < main_info.GetNumUnits(); ++U) { - DWARFUnit *unit = main_info.GetUnitAtIndex(U); - if (unit && m_units_to_avoid.count(unit->GetOffset()) == 0) - units_to_index.push_back(unit); - } - if (dwp_info && dwp_info->ContainsTypeUnits()) { - for (size_t U = 0; U < dwp_info->GetNumUnits(); ++U) { - if (auto *tu = - llvm::dyn_cast<DWARFTypeUnit>(dwp_info->GetUnitAtIndex(U))) { - if (!m_type_sigs_to_avoid.contains(tu->GetTypeHash())) - units_to_index.push_back(tu); + DWARFDebugInfo &main_info = m_dwarf->DebugInfo(); + SymbolFileDWARFDwo *dwp_dwarf = m_dwarf->GetDwpSymbolFile().get(); + DWARFDebugInfo *dwp_info = dwp_dwarf ? &dwp_dwarf->DebugInfo() : nullptr; + + std::vector<DWARFUnit *> units_to_index; + units_to_index.reserve(main_info.GetNumUnits() + + (dwp_info ? dwp_info->GetNumUnits() : 0)); + + // Process all units in the main file, as well as any type units in the dwp + // file. Type units in dwo files are handled when we reach the dwo file in + // IndexUnit. + for (size_t U = 0; U < main_info.GetNumUnits(); ++U) { + DWARFUnit *unit = main_info.GetUnitAtIndex(U); + if (unit && m_units_to_avoid.count(unit->GetOffset()) == 0) + units_to_index.push_back(unit); + } + if (dwp_info && dwp_info->ContainsTypeUnits()) { + for (size_t U = 0; U < dwp_info->GetNumUnits(); ++U) { + if (auto *tu = + llvm::dyn_cast<DWARFTypeUnit>(dwp_info->GetUnitAtIndex(U))) { + if (!m_type_sigs_to_avoid.contains(tu->GetTypeHash())) + units_to_index.push_back(tu); + } } } - } - if (units_to_index.empty()) - return; - - StreamString module_desc; - m_module.GetDescription(module_desc.AsRawOstream(), - lldb::eDescriptionLevelBrief); - - // Include 2 passes per unit to index for extracting DIEs from the unit and - // indexing the unit, and then extra entries for finalizing each index in the - // set. - const auto indices = IndexSet<NameToDIE>::Indices(); - const uint64_t total_progress = units_to_index.size() * 2 + indices.size(); - Progress progress("Manually indexing DWARF", module_desc.GetData(), - total_progress, /*debugger=*/nullptr, - Progress::kDefaultHighFrequencyReportTime); - - // Share one thread pool across operations to avoid the overhead of - // recreating the threads. - llvm::ThreadPoolTaskGroup task_group(Debugger::GetThreadPool()); - const size_t num_threads = Debugger::GetThreadPool().getMaxConcurrency(); - - // Run a function for each compile unit in parallel using as many threads as - // are available. This is significantly faster than submiting a new task for - // each unit. - auto for_each_unit = [&](auto &&fn) { - std::atomic<size_t> next_cu_idx = 0; - auto wrapper = [&fn, &next_cu_idx, &units_to_index, - &progress](size_t worker_id) { - size_t cu_idx; - while ((cu_idx = next_cu_idx.fetch_add(1, std::memory_order_relaxed)) < - units_to_index.size()) { - fn(worker_id, cu_idx, units_to_index[cu_idx]); - progress.Increment(); - } - }; + if (units_to_index.empty()) + return; - for (size_t i = 0; i < num_threads; ++i) - task_group.async(wrapper, i); + StreamString module_desc; + m_module.GetDescription(module_desc.AsRawOstream(), + lldb::eDescriptionLevelBrief); + + // Include 2 passes per unit to index for extracting DIEs from the unit and + // indexing the unit, and then extra entries for finalizing each index in + // the set. + const auto indices = IndexSet<NameToDIE>::Indices(); + const uint64_t total_progress = units_to_index.size() * 2 + indices.size(); + Progress progress("Manually indexing DWARF", module_desc.GetData(), + total_progress, /*debugger=*/nullptr, + Progress::kDefaultHighFrequencyReportTime); + + // Share one thread pool across operations to avoid the overhead of + // recreating the threads. + llvm::ThreadPoolTaskGroup task_group(Debugger::GetThreadPool()); + const size_t num_threads = Debugger::GetThreadPool().getMaxConcurrency(); + + // Run a function for each compile unit in parallel using as many threads as + // are available. This is significantly faster than submiting a new task for + // each unit. + auto for_each_unit = [&](auto &&fn) { + std::atomic<size_t> next_cu_idx = 0; + auto wrapper = [&fn, &next_cu_idx, &units_to_index, + &progress](size_t worker_id) { + size_t cu_idx; + while ((cu_idx = next_cu_idx.fetch_add(1, std::memory_order_relaxed)) < + units_to_index.size()) { + fn(worker_id, cu_idx, units_to_index[cu_idx]); + progress.Increment(); + } + }; - task_group.wait(); - }; - - // Extract dies for all DWARFs unit in parallel. Figure out which units - // didn't have their DIEs already parsed and remember this. If no DIEs were - // parsed prior to this index function call, we are going to want to clear the - // CU dies after we are done indexing to make sure we don't pull in all DWARF - // dies, but we need to wait until all units have been indexed in case a DIE - // in one unit refers to another and the indexes accesses those DIEs. - std::vector<std::optional<DWARFUnit::ScopedExtractDIEs>> clear_cu_dies( - units_to_index.size()); - for_each_unit([&clear_cu_dies](size_t, size_t idx, DWARFUnit *unit) { - clear_cu_dies[idx] = unit->ExtractDIEsScoped(); - }); + for (size_t i = 0; i < num_threads; ++i) + task_group.async(wrapper, i); - // Now index all DWARF unit in parallel. - std::vector<IndexSet<NameToDIE>> sets(num_threads); - for_each_unit( - [this, dwp_dwarf, &sets](size_t worker_id, size_t, DWARFUnit *unit) { - IndexUnit(*unit, dwp_dwarf, sets[worker_id]); - }); + task_group.wait(); + }; - // Merge partial indexes into a single index. Process each index in a set in - // parallel. - for (NameToDIE IndexSet<NameToDIE>::*index : indices) { - task_group.async([this, &sets, index, &progress]() { - NameToDIE &result = m_set.*index; - for (auto &set : sets) - result.Append(set.*index); - result.Finalize(); - progress.Increment(); + // Extract dies for all DWARFs unit in parallel. Figure out which units + // didn't have their DIEs already parsed and remember this. If no DIEs were + // parsed prior to this index function call, we are going to want to clear + // the CU dies after we are done indexing to make sure we don't pull in all + // DWARF dies, but we need to wait until all units have been indexed in case + // a DIE in one unit refers to another and the indexes accesses those DIEs. + std::vector<std::optional<DWARFUnit::ScopedExtractDIEs>> clear_cu_dies( + units_to_index.size()); + for_each_unit([&clear_cu_dies](size_t, size_t idx, DWARFUnit *unit) { + clear_cu_dies[idx] = unit->ExtractDIEsScoped(); }); - } - task_group.wait(); - SaveToCache(); + // Now index all DWARF unit in parallel. + std::vector<IndexSet<NameToDIE>> sets(num_threads); + for_each_unit( + [this, dwp_dwarf, &sets](size_t worker_id, size_t, DWARFUnit *unit) { + IndexUnit(*unit, dwp_dwarf, sets[worker_id]); + }); + + // Merge partial indexes into a single index. Process each index in a set in + // parallel. + for (NameToDIE IndexSet<NameToDIE>::*index : indices) { + task_group.async([this, &sets, index, &progress]() { + NameToDIE &result = m_set.*index; + for (auto &set : sets) + result.Append(set.*index); + result.Finalize(); + progress.Increment(); + }); + } + task_group.wait(); + + SaveToCache(); + }); } void ManualDWARFIndex::IndexUnit(DWARFUnit &unit, SymbolFileDWARFDwo *dwp, diff --git a/lldb/source/Plugins/SymbolFile/DWARF/ManualDWARFIndex.h b/lldb/source/Plugins/SymbolFile/DWARF/ManualDWARFIndex.h index 0b5b2f3e84309..f7d45b5a24990 100644 --- a/lldb/source/Plugins/SymbolFile/DWARF/ManualDWARFIndex.h +++ b/lldb/source/Plugins/SymbolFile/DWARF/ManualDWARFIndex.h @@ -170,7 +170,7 @@ class ManualDWARFIndex : public DWARFIndex { llvm::DenseSet<uint64_t> m_type_sigs_to_avoid; IndexSet<NameToDIE> m_set; - bool m_indexed = false; + std::once_flag m_indexed_flag; }; } // namespace dwarf } // namespace lldb_private::plugin

JDevlieghere

To avoid the extra indentation, can you rename the current method to IndexImpl, make it private, and then call that from inside the call_once in ::Index?

Summary: Test Plan: Reviewers: Subscribers: Tasks: Tags:

JDevlieghere

LGTM!

Summary: Test Plan: Reviewers: Subscribers: Tasks: Tags:

jeffreytan81

Much more readable!

Small change to use (what I think is) a better practice -- we were using the `m_indexed` bool member to make sure we called `Index()` once, but we should just use `std::once`! This change shouldn't affect functionality. This change may also make concurrent access to `Index()` thread-safe, though the ManualDWARFIndex API isn't completely thread-safe due to `Decode()`. I'm not sure if ManualDWARFIndex was ever intended to be thread-safe. Test Plan: `ninja check-lldb` Tested basic debugging workflow of a couple of large projects I had built. Basically: ``` (lldb) target create <project> (lldb) b main (lldb) r (lldb) step ... ``` I A/B tested the performance of launching several modules with parallel module loading and didn't observe any performance regressions. --------- Co-authored-by: Tom Yang <toyang@fb.com>

Tom Yang added 2 commits October 31, 2025 10:38

update ManualDWARFIndex::Index to use std::once

fa605ef

Summary: Test Plan: Reviewers: Subscribers: Tasks: Tags:

remove TODO comment

bf2c756

Summary: Test Plan: Reviewers: Subscribers: Tasks: Tags:

zhyty requested review from clayborg and jeffreytan81 October 31, 2025 18:10

zhyty requested a review from JDevlieghere as a code owner October 31, 2025 18:10

llvmbot added the lldb label Oct 31, 2025

JDevlieghere reviewed Oct 31, 2025

View reviewed changes

reduce indentation by moving body of call_once into IndexImpl

7348607

Summary: Test Plan: Reviewers: Subscribers: Tasks: Tags:

JDevlieghere approved these changes Oct 31, 2025

View reviewed changes

Tom Yang and others added 2 commits October 31, 2025 14:07

undo some comment formatting changes

48d9eeb

Summary: Test Plan: Reviewers: Subscribers: Tasks: Tags:

Merge branch 'main' into upstream-manual-dwarf-index-once-flag

eb6357c

jeffreytan81 approved these changes Oct 31, 2025

View reviewed changes

zhyty merged commit 74d4870 into llvm:main Oct 31, 2025
8 of 9 checks passed

zhyty deleted the upstream-manual-dwarf-index-once-flag branch October 31, 2025 22:08

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

update ManualDWARFIndex::Index to use std::once #165896

update ManualDWARFIndex::Index to use std::once #165896

zhyty commented Oct 31, 2025

llvmbot commented Oct 31, 2025

JDevlieghere left a comment

JDevlieghere left a comment

jeffreytan81 left a comment

Uh oh!

Labels

4 participants

update ManualDWARFIndex::Index to use std::once #165896

update ManualDWARFIndex::Index to use std::once #165896

Conversation

zhyty commented Oct 31, 2025

llvmbot commented Oct 31, 2025

JDevlieghere left a comment

Choose a reason for hiding this comment

JDevlieghere left a comment

Choose a reason for hiding this comment

jeffreytan81 left a comment

Choose a reason for hiding this comment

Uh oh!

Labels

4 participants