Skip to content

Conversation

@yxsamliu
Copy link
Collaborator

The tool was crashing on Windows with ROCm 7.1 due to two issues: misuse of hipDeviceGet which should not be used (it worked before by accident but was undefined behavior), and ABI incompatibility from hipDeviceProp_t struct layout changes between HIP versions where the gcnArchName offset changed from 396 to 1160 bytes.

The fix removes hipDeviceGet and queries properties directly by device index. It defines separate struct layouts for R0600 (HIP 6.x+) and R0000 (legacy) to handle the different memory layouts correctly.

An automatic API fallback mechanism tries R0600, then R0000, then the unversioned API until one succeeds, ensuring compatibility across different HIP runtime versions. A new --hip-api-version option allows manually selecting the API version when needed.

Additional improvements include enhanced error handling with hipGetErrorString, verbose logging throughout the detection process, and runtime version detection using hipRuntimeGetVersion when available. The versioned API functions provide stable ABI across HIP versions.

Fixes: SWDEV-564272

The tool was crashing on Windows with ROCm 7.1 due to two issues: misuse of hipDeviceGet which should not be used (it worked before by accident but was undefined behavior), and ABI incompatibility from hipDeviceProp_t struct layout changes between HIP versions where the gcnArchName offset changed from 396 to 1160 bytes. The fix removes hipDeviceGet and queries properties directly by device index. It defines separate struct layouts for R0600 (HIP 6.x+) and R0000 (legacy) to handle the different memory layouts correctly. An automatic API fallback mechanism tries R0600, then R0000, then the unversioned API until one succeeds, ensuring compatibility across different HIP runtime versions. A new --hip-api-version option allows manually selecting the API version when needed. Additional improvements include enhanced error handling with hipGetErrorString, verbose logging throughout the detection process, and runtime version detection using hipRuntimeGetVersion when available. The versioned API functions provide stable ABI across HIP versions. Fixes: SWDEV-564272
@yxsamliu yxsamliu requested review from Artem-B and jhuber6 November 12, 2025 14:31
@llvmbot llvmbot added clang Clang issues not falling into any other category backend:AMDGPU labels Nov 12, 2025
@llvmbot
Copy link
Member

llvmbot commented Nov 12, 2025

@llvm/pr-subscribers-backend-amdgpu

Author: Yaxun (Sam) Liu (yxsamliu)

Changes

The tool was crashing on Windows with ROCm 7.1 due to two issues: misuse of hipDeviceGet which should not be used (it worked before by accident but was undefined behavior), and ABI incompatibility from hipDeviceProp_t struct layout changes between HIP versions where the gcnArchName offset changed from 396 to 1160 bytes.

The fix removes hipDeviceGet and queries properties directly by device index. It defines separate struct layouts for R0600 (HIP 6.x+) and R0000 (legacy) to handle the different memory layouts correctly.

An automatic API fallback mechanism tries R0600, then R0000, then the unversioned API until one succeeds, ensuring compatibility across different HIP runtime versions. A new --hip-api-version option allows manually selecting the API version when needed.

Additional improvements include enhanced error handling with hipGetErrorString, verbose logging throughout the detection process, and runtime version detection using hipRuntimeGetVersion when available. The versioned API functions provide stable ABI across HIP versions.

Fixes: SWDEV-564272


Full diff: https://github.com/llvm/llvm-project/pull/167695.diff

2 Files Affected:

  • (modified) clang/tools/offload-arch/AMDGPUArchByHIP.cpp (+166-24)
  • (modified) clang/tools/offload-arch/OffloadArch.cpp (+3-1)
diff --git a/clang/tools/offload-arch/AMDGPUArchByHIP.cpp b/clang/tools/offload-arch/AMDGPUArchByHIP.cpp index ff39a85d15628..7ca94ec42f2e9 100644 --- a/clang/tools/offload-arch/AMDGPUArchByHIP.cpp +++ b/clang/tools/offload-arch/AMDGPUArchByHIP.cpp @@ -32,22 +32,54 @@ using namespace llvm; -typedef struct { +// R0600 struct layout (HIP 6.x+) +typedef struct alignas(8) { + char padding[1160]; + char gcnArchName[256]; + char padding2[56]; +} hipDeviceProp_tR0600; + +// R0000 struct layout (legacy) +typedef struct alignas(8) { char padding[396]; char gcnArchName[256]; char padding2[1024]; -} hipDeviceProp_t; +} hipDeviceProp_tR0000; typedef enum { hipSuccess = 0, } hipError_t; typedef hipError_t (*hipGetDeviceCount_t)(int *); -typedef hipError_t (*hipDeviceGet_t)(int *, int); -typedef hipError_t (*hipGetDeviceProperties_t)(hipDeviceProp_t *, int); +typedef hipError_t (*hipGetDevicePropertiesR0600_t)(hipDeviceProp_tR0600 *, + int); +typedef hipError_t (*hipGetDevicePropertiesR0000_t)(hipDeviceProp_tR0000 *, + int); +typedef hipError_t (*hipGetDeviceProperties_t)(hipDeviceProp_tR0000 *, int); +typedef hipError_t (*hipRuntimeGetVersion_t)(int *); +typedef const char *(*hipGetErrorString_t)(hipError_t); extern cl::opt<bool> Verbose; +cl::OptionCategory AMDGPUArchByHIPCategory("amdgpu-arch (HIP) options"); + +enum class HipApiVersion { + Auto, // Automatic fallback (R0600 -> R0000 -> unversioned) + R0600, // Force R0600 API (HIP 6.x+) + R0000, // Force R0000 API (legacy HIP) + Unversioned // Force unversioned API (very old HIP) +}; + +static cl::opt<HipApiVersion> HipApi( + "hip-api-version", cl::desc("Select HIP API version for device properties"), + cl::values(clEnumValN(HipApiVersion::Auto, "auto", + "Auto-detect (R0600 -> R0000 -> unversioned)"), + clEnumValN(HipApiVersion::R0600, "r0600", "Force R0600 API"), + clEnumValN(HipApiVersion::R0000, "r0000", "Force R0000 API"), + clEnumValN(HipApiVersion::Unversioned, "unversioned", + "Force unversioned API")), + cl::init(HipApiVersion::Auto), cl::cat(AMDGPUArchByHIPCategory)); + #ifdef _WIN32 static std::vector<std::string> getSearchPaths() { std::vector<std::string> Paths; @@ -177,6 +209,10 @@ int printGPUsByHIP() { return 1; } + if (Verbose) { + outs() << "Successfully loaded HIP runtime library\n"; + } + #define DYNAMIC_INIT_HIP(SYMBOL) \ { \ void *SymbolPtr = DynlibHandle->getAddressOfSymbol(#SYMBOL); \ @@ -184,42 +220,148 @@ int printGPUsByHIP() { llvm::errs() << "Failed to find symbol " << #SYMBOL << '\n'; \ return 1; \ } \ + if (Verbose) \ + outs() << "Found symbol: " << #SYMBOL << '\n'; \ SYMBOL = reinterpret_cast<decltype(SYMBOL)>(SymbolPtr); \ } hipGetDeviceCount_t hipGetDeviceCount; - hipDeviceGet_t hipDeviceGet; - hipGetDeviceProperties_t hipGetDeviceProperties; + hipRuntimeGetVersion_t hipRuntimeGetVersion = nullptr; + hipGetDevicePropertiesR0600_t hipGetDevicePropertiesR0600 = nullptr; + hipGetDevicePropertiesR0000_t hipGetDevicePropertiesR0000 = nullptr; + hipGetDeviceProperties_t hipGetDeviceProperties = nullptr; + hipGetErrorString_t hipGetErrorString = nullptr; DYNAMIC_INIT_HIP(hipGetDeviceCount); - DYNAMIC_INIT_HIP(hipDeviceGet); - DYNAMIC_INIT_HIP(hipGetDeviceProperties); #undef DYNAMIC_INIT_HIP - int deviceCount; - hipError_t err = hipGetDeviceCount(&deviceCount); - if (err != hipSuccess) { - llvm::errs() << "Failed to get device count\n"; + auto LoadSymbol = [&](const char *Name, auto &FuncPtr, + const char *Desc = "") { + void *Sym = DynlibHandle->getAddressOfSymbol(Name); + if (Sym) { + FuncPtr = reinterpret_cast<decltype(FuncPtr)>(Sym); + if (Verbose) + outs() << "Found symbol: " << Name << (Desc[0] ? " " : "") << Desc + << '\n'; + return true; + } + return false; + }; + + LoadSymbol("hipGetErrorString", hipGetErrorString); + + if (LoadSymbol("hipRuntimeGetVersion", hipRuntimeGetVersion)) { + int RuntimeVersion = 0; + if (hipRuntimeGetVersion(&RuntimeVersion) == hipSuccess) { + int Major = RuntimeVersion / 10000000; + int Minor = (RuntimeVersion / 100000) % 100; + int Patch = RuntimeVersion % 100000; + if (Verbose) + outs() << "HIP Runtime Version: " << Major << "." << Minor << "." + << Patch << '\n'; + } + } + + LoadSymbol("hipGetDevicePropertiesR0600", hipGetDevicePropertiesR0600, + "(HIP 6.x+ API)"); + LoadSymbol("hipGetDevicePropertiesR0000", hipGetDevicePropertiesR0000, + "(legacy API)"); + if (!hipGetDevicePropertiesR0600 && !hipGetDevicePropertiesR0000) + LoadSymbol("hipGetDeviceProperties", hipGetDeviceProperties, + "(unversioned legacy API)"); + + int DeviceCount; + if (Verbose) + outs() << "Calling hipGetDeviceCount...\n"; + hipError_t Err = hipGetDeviceCount(&DeviceCount); + if (Err != hipSuccess) { + llvm::errs() << "Failed to get device count"; + if (hipGetErrorString) { + llvm::errs() << ": " << hipGetErrorString(Err); + } + llvm::errs() << " (error code: " << Err << ")\n"; return 1; } - for (int i = 0; i < deviceCount; ++i) { - int deviceId; - err = hipDeviceGet(&deviceId, i); - if (err != hipSuccess) { - llvm::errs() << "Failed to get device id for ordinal " << i << '\n'; - return 1; + if (Verbose) + outs() << "Found " << DeviceCount << " device(s)\n"; + + auto TryGetProperties = [&](auto *ApiFunc, auto *DummyProp, const char *Name, + int DeviceId) -> std::string { + if (!ApiFunc) + return ""; + + if (Verbose) + outs() << "Using " << Name << "...\n"; + + using PropType = std::remove_pointer_t<decltype(DummyProp)>; + PropType Prop; + hipError_t Err = ApiFunc(&Prop, DeviceId); + + if (Err == hipSuccess) { + if (Verbose) { + outs() << Name << " struct: sizeof = " << sizeof(PropType) + << " bytes, offsetof(gcnArchName) = " + << offsetof(PropType, gcnArchName) << " bytes\n"; + } + return Prop.gcnArchName; } - hipDeviceProp_t prop; - err = hipGetDeviceProperties(&prop, deviceId); - if (err != hipSuccess) { - llvm::errs() << "Failed to get device properties for device " << deviceId - << '\n'; + if (Verbose) + llvm::errs() << Name << " failed (error code: " << Err << ")\n"; + return ""; + }; + + for (int i = 0; i < DeviceCount; ++i) { + if (Verbose) + outs() << "Processing device " << i << "...\n"; + + std::string ArchName; + auto TryR0600 = [&](int Dev) -> bool { + if (!hipGetDevicePropertiesR0600) + return false; + ArchName = TryGetProperties(hipGetDevicePropertiesR0600, + (hipDeviceProp_tR0600 *)nullptr, + "R0600 API (HIP 6.x+)", Dev); + return !ArchName.empty(); + }; + auto TryR0000 = [&](int Dev) -> bool { + if (!hipGetDevicePropertiesR0000) + return false; + ArchName = TryGetProperties(hipGetDevicePropertiesR0000, + (hipDeviceProp_tR0000 *)nullptr, + "R0000 API (legacy HIP)", Dev); + return !ArchName.empty(); + }; + auto TryUnversioned = [&](int Dev) -> bool { + if (!hipGetDeviceProperties) + return false; + ArchName = TryGetProperties(hipGetDeviceProperties, + (hipDeviceProp_tR0000 *)nullptr, + "unversioned API (very old HIP)", Dev); + return !ArchName.empty(); + }; + + if (HipApi == HipApiVersion::Auto) { + (TryR0600(i) || TryR0000(i) || TryUnversioned(i)); + } else if (HipApi == HipApiVersion::R0600) { + (void)TryR0600(i); + } else if (HipApi == HipApiVersion::R0000) { + (void)TryR0000(i); + } else { + (void)TryUnversioned(i); + } + + if (ArchName.empty()) { + llvm::errs() << "Failed to get device properties for device " << i + << " - no APIs available or all failed\n"; return 1; } - llvm::outs() << prop.gcnArchName << '\n'; + + if (Verbose) + outs() << "Device " << i << " arch name: "; + llvm::outs() << ArchName << '\n'; } return 0; diff --git a/clang/tools/offload-arch/OffloadArch.cpp b/clang/tools/offload-arch/OffloadArch.cpp index 3c5131eb7c06c..91493676918cb 100644 --- a/clang/tools/offload-arch/OffloadArch.cpp +++ b/clang/tools/offload-arch/OffloadArch.cpp @@ -17,6 +17,8 @@ static cl::opt<bool> Help("h", cl::desc("Alias for -help"), cl::Hidden); // Mark all our options with this category. static cl::OptionCategory OffloadArchCategory("offload-arch options"); +extern cl::OptionCategory AMDGPUArchByHIPCategory; + enum VendorName { all, amdgpu, @@ -62,7 +64,7 @@ const std::array<std::pair<VendorName, function_ref<int()>>, 3> VendorTable{ {VendorName::intel, printIntel}}}; int main(int argc, char *argv[]) { - cl::HideUnrelatedOptions(OffloadArchCategory); + cl::HideUnrelatedOptions({&OffloadArchCategory, &AMDGPUArchByHIPCategory}); cl::SetVersionPrinter(PrintVersion); cl::ParseCommandLineOptions( 
@llvmbot
Copy link
Member

llvmbot commented Nov 12, 2025

@llvm/pr-subscribers-clang

Author: Yaxun (Sam) Liu (yxsamliu)

Changes

The tool was crashing on Windows with ROCm 7.1 due to two issues: misuse of hipDeviceGet which should not be used (it worked before by accident but was undefined behavior), and ABI incompatibility from hipDeviceProp_t struct layout changes between HIP versions where the gcnArchName offset changed from 396 to 1160 bytes.

The fix removes hipDeviceGet and queries properties directly by device index. It defines separate struct layouts for R0600 (HIP 6.x+) and R0000 (legacy) to handle the different memory layouts correctly.

An automatic API fallback mechanism tries R0600, then R0000, then the unversioned API until one succeeds, ensuring compatibility across different HIP runtime versions. A new --hip-api-version option allows manually selecting the API version when needed.

Additional improvements include enhanced error handling with hipGetErrorString, verbose logging throughout the detection process, and runtime version detection using hipRuntimeGetVersion when available. The versioned API functions provide stable ABI across HIP versions.

Fixes: SWDEV-564272


Full diff: https://github.com/llvm/llvm-project/pull/167695.diff

2 Files Affected:

  • (modified) clang/tools/offload-arch/AMDGPUArchByHIP.cpp (+166-24)
  • (modified) clang/tools/offload-arch/OffloadArch.cpp (+3-1)
diff --git a/clang/tools/offload-arch/AMDGPUArchByHIP.cpp b/clang/tools/offload-arch/AMDGPUArchByHIP.cpp index ff39a85d15628..7ca94ec42f2e9 100644 --- a/clang/tools/offload-arch/AMDGPUArchByHIP.cpp +++ b/clang/tools/offload-arch/AMDGPUArchByHIP.cpp @@ -32,22 +32,54 @@ using namespace llvm; -typedef struct { +// R0600 struct layout (HIP 6.x+) +typedef struct alignas(8) { + char padding[1160]; + char gcnArchName[256]; + char padding2[56]; +} hipDeviceProp_tR0600; + +// R0000 struct layout (legacy) +typedef struct alignas(8) { char padding[396]; char gcnArchName[256]; char padding2[1024]; -} hipDeviceProp_t; +} hipDeviceProp_tR0000; typedef enum { hipSuccess = 0, } hipError_t; typedef hipError_t (*hipGetDeviceCount_t)(int *); -typedef hipError_t (*hipDeviceGet_t)(int *, int); -typedef hipError_t (*hipGetDeviceProperties_t)(hipDeviceProp_t *, int); +typedef hipError_t (*hipGetDevicePropertiesR0600_t)(hipDeviceProp_tR0600 *, + int); +typedef hipError_t (*hipGetDevicePropertiesR0000_t)(hipDeviceProp_tR0000 *, + int); +typedef hipError_t (*hipGetDeviceProperties_t)(hipDeviceProp_tR0000 *, int); +typedef hipError_t (*hipRuntimeGetVersion_t)(int *); +typedef const char *(*hipGetErrorString_t)(hipError_t); extern cl::opt<bool> Verbose; +cl::OptionCategory AMDGPUArchByHIPCategory("amdgpu-arch (HIP) options"); + +enum class HipApiVersion { + Auto, // Automatic fallback (R0600 -> R0000 -> unversioned) + R0600, // Force R0600 API (HIP 6.x+) + R0000, // Force R0000 API (legacy HIP) + Unversioned // Force unversioned API (very old HIP) +}; + +static cl::opt<HipApiVersion> HipApi( + "hip-api-version", cl::desc("Select HIP API version for device properties"), + cl::values(clEnumValN(HipApiVersion::Auto, "auto", + "Auto-detect (R0600 -> R0000 -> unversioned)"), + clEnumValN(HipApiVersion::R0600, "r0600", "Force R0600 API"), + clEnumValN(HipApiVersion::R0000, "r0000", "Force R0000 API"), + clEnumValN(HipApiVersion::Unversioned, "unversioned", + "Force unversioned API")), + cl::init(HipApiVersion::Auto), cl::cat(AMDGPUArchByHIPCategory)); + #ifdef _WIN32 static std::vector<std::string> getSearchPaths() { std::vector<std::string> Paths; @@ -177,6 +209,10 @@ int printGPUsByHIP() { return 1; } + if (Verbose) { + outs() << "Successfully loaded HIP runtime library\n"; + } + #define DYNAMIC_INIT_HIP(SYMBOL) \ { \ void *SymbolPtr = DynlibHandle->getAddressOfSymbol(#SYMBOL); \ @@ -184,42 +220,148 @@ int printGPUsByHIP() { llvm::errs() << "Failed to find symbol " << #SYMBOL << '\n'; \ return 1; \ } \ + if (Verbose) \ + outs() << "Found symbol: " << #SYMBOL << '\n'; \ SYMBOL = reinterpret_cast<decltype(SYMBOL)>(SymbolPtr); \ } hipGetDeviceCount_t hipGetDeviceCount; - hipDeviceGet_t hipDeviceGet; - hipGetDeviceProperties_t hipGetDeviceProperties; + hipRuntimeGetVersion_t hipRuntimeGetVersion = nullptr; + hipGetDevicePropertiesR0600_t hipGetDevicePropertiesR0600 = nullptr; + hipGetDevicePropertiesR0000_t hipGetDevicePropertiesR0000 = nullptr; + hipGetDeviceProperties_t hipGetDeviceProperties = nullptr; + hipGetErrorString_t hipGetErrorString = nullptr; DYNAMIC_INIT_HIP(hipGetDeviceCount); - DYNAMIC_INIT_HIP(hipDeviceGet); - DYNAMIC_INIT_HIP(hipGetDeviceProperties); #undef DYNAMIC_INIT_HIP - int deviceCount; - hipError_t err = hipGetDeviceCount(&deviceCount); - if (err != hipSuccess) { - llvm::errs() << "Failed to get device count\n"; + auto LoadSymbol = [&](const char *Name, auto &FuncPtr, + const char *Desc = "") { + void *Sym = DynlibHandle->getAddressOfSymbol(Name); + if (Sym) { + FuncPtr = reinterpret_cast<decltype(FuncPtr)>(Sym); + if (Verbose) + outs() << "Found symbol: " << Name << (Desc[0] ? " " : "") << Desc + << '\n'; + return true; + } + return false; + }; + + LoadSymbol("hipGetErrorString", hipGetErrorString); + + if (LoadSymbol("hipRuntimeGetVersion", hipRuntimeGetVersion)) { + int RuntimeVersion = 0; + if (hipRuntimeGetVersion(&RuntimeVersion) == hipSuccess) { + int Major = RuntimeVersion / 10000000; + int Minor = (RuntimeVersion / 100000) % 100; + int Patch = RuntimeVersion % 100000; + if (Verbose) + outs() << "HIP Runtime Version: " << Major << "." << Minor << "." + << Patch << '\n'; + } + } + + LoadSymbol("hipGetDevicePropertiesR0600", hipGetDevicePropertiesR0600, + "(HIP 6.x+ API)"); + LoadSymbol("hipGetDevicePropertiesR0000", hipGetDevicePropertiesR0000, + "(legacy API)"); + if (!hipGetDevicePropertiesR0600 && !hipGetDevicePropertiesR0000) + LoadSymbol("hipGetDeviceProperties", hipGetDeviceProperties, + "(unversioned legacy API)"); + + int DeviceCount; + if (Verbose) + outs() << "Calling hipGetDeviceCount...\n"; + hipError_t Err = hipGetDeviceCount(&DeviceCount); + if (Err != hipSuccess) { + llvm::errs() << "Failed to get device count"; + if (hipGetErrorString) { + llvm::errs() << ": " << hipGetErrorString(Err); + } + llvm::errs() << " (error code: " << Err << ")\n"; return 1; } - for (int i = 0; i < deviceCount; ++i) { - int deviceId; - err = hipDeviceGet(&deviceId, i); - if (err != hipSuccess) { - llvm::errs() << "Failed to get device id for ordinal " << i << '\n'; - return 1; + if (Verbose) + outs() << "Found " << DeviceCount << " device(s)\n"; + + auto TryGetProperties = [&](auto *ApiFunc, auto *DummyProp, const char *Name, + int DeviceId) -> std::string { + if (!ApiFunc) + return ""; + + if (Verbose) + outs() << "Using " << Name << "...\n"; + + using PropType = std::remove_pointer_t<decltype(DummyProp)>; + PropType Prop; + hipError_t Err = ApiFunc(&Prop, DeviceId); + + if (Err == hipSuccess) { + if (Verbose) { + outs() << Name << " struct: sizeof = " << sizeof(PropType) + << " bytes, offsetof(gcnArchName) = " + << offsetof(PropType, gcnArchName) << " bytes\n"; + } + return Prop.gcnArchName; } - hipDeviceProp_t prop; - err = hipGetDeviceProperties(&prop, deviceId); - if (err != hipSuccess) { - llvm::errs() << "Failed to get device properties for device " << deviceId - << '\n'; + if (Verbose) + llvm::errs() << Name << " failed (error code: " << Err << ")\n"; + return ""; + }; + + for (int i = 0; i < DeviceCount; ++i) { + if (Verbose) + outs() << "Processing device " << i << "...\n"; + + std::string ArchName; + auto TryR0600 = [&](int Dev) -> bool { + if (!hipGetDevicePropertiesR0600) + return false; + ArchName = TryGetProperties(hipGetDevicePropertiesR0600, + (hipDeviceProp_tR0600 *)nullptr, + "R0600 API (HIP 6.x+)", Dev); + return !ArchName.empty(); + }; + auto TryR0000 = [&](int Dev) -> bool { + if (!hipGetDevicePropertiesR0000) + return false; + ArchName = TryGetProperties(hipGetDevicePropertiesR0000, + (hipDeviceProp_tR0000 *)nullptr, + "R0000 API (legacy HIP)", Dev); + return !ArchName.empty(); + }; + auto TryUnversioned = [&](int Dev) -> bool { + if (!hipGetDeviceProperties) + return false; + ArchName = TryGetProperties(hipGetDeviceProperties, + (hipDeviceProp_tR0000 *)nullptr, + "unversioned API (very old HIP)", Dev); + return !ArchName.empty(); + }; + + if (HipApi == HipApiVersion::Auto) { + (TryR0600(i) || TryR0000(i) || TryUnversioned(i)); + } else if (HipApi == HipApiVersion::R0600) { + (void)TryR0600(i); + } else if (HipApi == HipApiVersion::R0000) { + (void)TryR0000(i); + } else { + (void)TryUnversioned(i); + } + + if (ArchName.empty()) { + llvm::errs() << "Failed to get device properties for device " << i + << " - no APIs available or all failed\n"; return 1; } - llvm::outs() << prop.gcnArchName << '\n'; + + if (Verbose) + outs() << "Device " << i << " arch name: "; + llvm::outs() << ArchName << '\n'; } return 0; diff --git a/clang/tools/offload-arch/OffloadArch.cpp b/clang/tools/offload-arch/OffloadArch.cpp index 3c5131eb7c06c..91493676918cb 100644 --- a/clang/tools/offload-arch/OffloadArch.cpp +++ b/clang/tools/offload-arch/OffloadArch.cpp @@ -17,6 +17,8 @@ static cl::opt<bool> Help("h", cl::desc("Alias for -help"), cl::Hidden); // Mark all our options with this category. static cl::OptionCategory OffloadArchCategory("offload-arch options"); +extern cl::OptionCategory AMDGPUArchByHIPCategory; + enum VendorName { all, amdgpu, @@ -62,7 +64,7 @@ const std::array<std::pair<VendorName, function_ref<int()>>, 3> VendorTable{ {VendorName::intel, printIntel}}}; int main(int argc, char *argv[]) { - cl::HideUnrelatedOptions(OffloadArchCategory); + cl::HideUnrelatedOptions({&OffloadArchCategory, &AMDGPUArchByHIPCategory}); cl::SetVersionPrinter(PrintVersion); cl::ParseCommandLineOptions( 
Copy link
Contributor

@jhuber6 jhuber6 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

seems reasonable, unfortunate that the ABI broke like this

Comment on lines +212 to +214
if (Verbose) {
outs() << "Successfully loaded HIP runtime library\n";
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
if (Verbose) {
outs() << "Successfully loaded HIP runtime library\n";
}
if (Verbose)
outs() << "Successfully loaded HIP runtime library\n";
return "";
};

for (int i = 0; i < DeviceCount; ++i) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
for (int i = 0; i < DeviceCount; ++i) {
for (int i = 0; i < DeviceCount; ++i) {

Fix case

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Completely optional nit: For enumerating things I grew to like for(auto i: llvm::seq<N>()). It's both concise, and appears to express the enumeration as the primary purpose better than a plain loop.

if (HipApi == HipApiVersion::Auto) {
(TryR0600(i) || TryR0000(i) || TryUnversioned(i));
} else if (HipApi == HipApiVersion::R0600) {
(void)TryR0600(i);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Prefer [[maybe_unused]] for C++17

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Weird () around the whole statement

return "";
};

for (int i = 0; i < DeviceCount; ++i) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Completely optional nit: For enumerating things I grew to like for(auto i: llvm::seq<N>()). It's both concise, and appears to express the enumeration as the primary purpose better than a plain loop.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

backend:AMDGPU clang Clang issues not falling into any other category

5 participants