Skip to content

Conversation

@mshockwave
Copy link
Member

This patch creates a new syntax intended for -mtune, such that we can write

-mtune=xyz:foo,no-bar 

to reuse the tune CPU definition of xyz but "customize" it by enabling feature foo while disabling feature bar. The "foo,no-bar" is the new tune feature string we're introducing here.

The grammar is simple:

directive ::= ("no-")? featureName tuneFeatureStr ::= directive ("," directive)* 

Where featureName is one of the subtarget features that are marked by a special TableGen class RISCVTuneFeature.

Directives with a "no-" prefix are considered negative ones -- they are supposed to only subtract subtarget features from the base tune CPU. Otherwise, a directive is considered a positive one, which adds a new subtarget feature.

None of the subtarget features generated from positive directives -- either explicitly or implicitly (i.e. implied features) -- should overlaps with those generated from the negative directives. This is a deliberated decision to simplify the logics and prevent it from becoming unmanageable in the future.


Credit to Sam for the rules and semantics describes in #162716 (comment) of which this PR builds on top.

I still need to add documentation. There will also be another follow-up patch to integrate this into Clang.

Co-Authored-By: Sam Elliott <aelliott@qti.qualcomm.com>
@llvmbot
Copy link
Member

llvmbot commented Nov 15, 2025

@llvm/pr-subscribers-backend-risc-v

Author: Min-Yih Hsu (mshockwave)

Changes

This patch creates a new syntax intended for -mtune, such that we can write

-mtune=xyz:foo,no-bar 

to reuse the tune CPU definition of xyz but "customize" it by enabling feature foo while disabling feature bar. The "foo,no-bar" is the new tune feature string we're introducing here.

The grammar is simple:

directive ::= ("no-")? featureName tuneFeatureStr ::= directive ("," directive)* 

Where featureName is one of the subtarget features that are marked by a special TableGen class RISCVTuneFeature.

Directives with a "no-" prefix are considered negative ones -- they are supposed to only subtract subtarget features from the base tune CPU. Otherwise, a directive is considered a positive one, which adds a new subtarget feature.

None of the subtarget features generated from positive directives -- either explicitly or implicitly (i.e. implied features) -- should overlaps with those generated from the negative directives. This is a deliberated decision to simplify the logics and prevent it from becoming unmanageable in the future.


Credit to Sam for the rules and semantics describes in #162716 (comment) of which this PR builds on top.

I still need to add documentation. There will also be another follow-up patch to integrate this into Clang.


Patch is 21.35 KiB, truncated to 20.00 KiB below, full version: https://github.com/llvm/llvm-project/pull/168160.diff

5 Files Affected:

  • (modified) llvm/include/llvm/TargetParser/RISCVTargetParser.h (+4)
  • (modified) llvm/lib/Target/RISCV/RISCVFeatures.td (+26-22)
  • (modified) llvm/lib/TargetParser/RISCVTargetParser.cpp (+99)
  • (modified) llvm/unittests/TargetParser/RISCVTargetParserTest.cpp (+88)
  • (modified) llvm/utils/TableGen/Basic/RISCVTargetDefEmitter.cpp (+48-1)
diff --git a/llvm/include/llvm/TargetParser/RISCVTargetParser.h b/llvm/include/llvm/TargetParser/RISCVTargetParser.h index 2ac58a539d5ee..2c6a59a59a993 100644 --- a/llvm/include/llvm/TargetParser/RISCVTargetParser.h +++ b/llvm/include/llvm/TargetParser/RISCVTargetParser.h @@ -16,6 +16,7 @@ #include "llvm/ADT/StringRef.h" #include "llvm/Support/Compiler.h" +#include "llvm/Support/Error.h" #include "llvm/Support/MathExtras.h" #include "llvm/Support/raw_ostream.h" @@ -54,6 +55,9 @@ static constexpr unsigned RVVBytesPerBlock = RVVBitsPerBlock / 8; LLVM_ABI void getFeaturesForCPU(StringRef CPU, SmallVectorImpl<std::string> &EnabledFeatures, bool NeedPlus = false); +LLVM_ABI void getAllTuneFeatures(SmallVectorImpl<StringRef> &TuneFeatures); +LLVM_ABI Error parseTuneFeatureString( + StringRef TFString, SmallVectorImpl<std::string> &TuneFeatures); LLVM_ABI bool parseCPU(StringRef CPU, bool IsRV64); LLVM_ABI bool parseTuneCPU(StringRef CPU, bool IsRV64); LLVM_ABI StringRef getMArchFromMcpu(StringRef CPU); diff --git a/llvm/lib/Target/RISCV/RISCVFeatures.td b/llvm/lib/Target/RISCV/RISCVFeatures.td index 0b964c4808d8a..ef5203445b55c 100644 --- a/llvm/lib/Target/RISCV/RISCVFeatures.td +++ b/llvm/lib/Target/RISCV/RISCVFeatures.td @@ -1740,6 +1740,10 @@ def HasVendorXSMTVDot // LLVM specific features and extensions //===----------------------------------------------------------------------===// +class RISCVTuneFeature<string name, string field_name, string value, + string description, list<SubtargetFeature> implied = []> + : SubtargetFeature<name, field_name, value, description, implied>; + // Feature32Bit exists to mark CPUs that support RV32 to distinguish them from // tuning CPU names. def Feature32Bit @@ -1788,46 +1792,46 @@ def FeatureUnalignedVectorMem "loads and stores">; def TuneNLogNVRGather - : SubtargetFeature<"log-vrgather", "RISCVVRGatherCostModel", "NLog2N", + : RISCVTuneFeature<"log-vrgather", "RISCVVRGatherCostModel", "NLog2N", "Has vrgather.vv with LMUL*log2(LMUL) latency">; -def TunePostRAScheduler : SubtargetFeature<"use-postra-scheduler", +def TunePostRAScheduler : RISCVTuneFeature<"use-postra-scheduler", "UsePostRAScheduler", "true", "Schedule again after register allocation">; -def TuneDisableMISchedLoadClustering : SubtargetFeature<"disable-misched-load-clustering", +def TuneDisableMISchedLoadClustering : RISCVTuneFeature<"disable-misched-load-clustering", "EnableMISchedLoadClustering", "false", "Disable load clustering in the machine scheduler">; -def TuneDisableMISchedStoreClustering : SubtargetFeature<"disable-misched-store-clustering", +def TuneDisableMISchedStoreClustering : RISCVTuneFeature<"disable-misched-store-clustering", "EnableMISchedStoreClustering", "false", "Disable store clustering in the machine scheduler">; -def TuneDisablePostMISchedLoadClustering : SubtargetFeature<"disable-postmisched-load-clustering", +def TuneDisablePostMISchedLoadClustering : RISCVTuneFeature<"disable-postmisched-load-clustering", "EnablePostMISchedLoadClustering", "false", "Disable PostRA load clustering in the machine scheduler">; -def TuneDisablePostMISchedStoreClustering : SubtargetFeature<"disable-postmisched-store-clustering", +def TuneDisablePostMISchedStoreClustering : RISCVTuneFeature<"disable-postmisched-store-clustering", "EnablePostMISchedStoreClustering", "false", "Disable PostRA store clustering in the machine scheduler">; def TuneDisableLatencySchedHeuristic - : SubtargetFeature<"disable-latency-sched-heuristic", "DisableLatencySchedHeuristic", "true", + : RISCVTuneFeature<"disable-latency-sched-heuristic", "DisableLatencySchedHeuristic", "true", "Disable latency scheduling heuristic">; def TunePredictableSelectIsExpensive - : SubtargetFeature<"predictable-select-expensive", "PredictableSelectIsExpensive", "true", + : RISCVTuneFeature<"predictable-select-expensive", "PredictableSelectIsExpensive", "true", "Prefer likely predicted branches over selects">; def TuneOptimizedZeroStrideLoad - : SubtargetFeature<"optimized-zero-stride-load", "HasOptimizedZeroStrideLoad", + : RISCVTuneFeature<"optimized-zero-stride-load", "HasOptimizedZeroStrideLoad", "true", "Optimized (perform fewer memory operations)" "zero-stride vector load">; foreach nf = {2-8} in def TuneOptimizedNF#nf#SegmentLoadStore : - SubtargetFeature<"optimized-nf"#nf#"-segment-load-store", + RISCVTuneFeature<"optimized-nf"#nf#"-segment-load-store", "HasOptimizedNF"#nf#"SegmentLoadStore", "true", "vlseg"#nf#"eN.v and vsseg"#nf#"eN.v are " "implemented as a wide memory op and shuffle">; def TuneVLDependentLatency - : SubtargetFeature<"vl-dependent-latency", "HasVLDependentLatency", "true", + : RISCVTuneFeature<"vl-dependent-latency", "HasVLDependentLatency", "true", "Latency of vector instructions is dependent on the " "dynamic value of vl">; @@ -1839,50 +1843,50 @@ def Experimental // and instead split over multiple cycles. DLEN refers to the datapath width // that can be done in parallel. def TuneDLenFactor2 - : SubtargetFeature<"dlen-factor-2", "DLenFactor2", "true", + : RISCVTuneFeature<"dlen-factor-2", "DLenFactor2", "true", "Vector unit DLEN(data path width) is half of VLEN">; def TuneNoDefaultUnroll - : SubtargetFeature<"no-default-unroll", "EnableDefaultUnroll", "false", + : RISCVTuneFeature<"no-default-unroll", "EnableDefaultUnroll", "false", "Disable default unroll preference.">; // SiFive 7 is able to fuse integer ALU operations with a preceding branch // instruction. def TuneShortForwardBranchOpt - : SubtargetFeature<"short-forward-branch-opt", "HasShortForwardBranchOpt", + : RISCVTuneFeature<"short-forward-branch-opt", "HasShortForwardBranchOpt", "true", "Enable short forward branch optimization">; def HasShortForwardBranchOpt : Predicate<"Subtarget->hasShortForwardBranchOpt()">; def NoShortForwardBranchOpt : Predicate<"!Subtarget->hasShortForwardBranchOpt()">; def TuneShortForwardBranchIMinMax - : SubtargetFeature<"short-forward-branch-i-minmax", "HasShortForwardBranchIMinMax", + : RISCVTuneFeature<"short-forward-branch-i-minmax", "HasShortForwardBranchIMinMax", "true", "Enable short forward branch optimization for min,max instructions in Zbb", [TuneShortForwardBranchOpt]>; def TuneShortForwardBranchIMul - : SubtargetFeature<"short-forward-branch-i-mul", "HasShortForwardBranchIMul", + : RISCVTuneFeature<"short-forward-branch-i-mul", "HasShortForwardBranchIMul", "true", "Enable short forward branch optimization for mul instruction", [TuneShortForwardBranchOpt]>; // Some subtargets require a S2V transfer buffer to move scalars into vectors. // FIXME: Forming .vx/.vf/.wx/.wf can reduce register pressure. def TuneNoSinkSplatOperands - : SubtargetFeature<"no-sink-splat-operands", "SinkSplatOperands", + : RISCVTuneFeature<"no-sink-splat-operands", "SinkSplatOperands", "false", "Disable sink splat operands to enable .vx, .vf," ".wx, and .wf instructions">; def TunePreferWInst - : SubtargetFeature<"prefer-w-inst", "PreferWInst", "true", + : RISCVTuneFeature<"prefer-w-inst", "PreferWInst", "true", "Prefer instructions with W suffix">; def TuneConditionalCompressedMoveFusion - : SubtargetFeature<"conditional-cmv-fusion", "HasConditionalCompressedMoveFusion", + : RISCVTuneFeature<"conditional-cmv-fusion", "HasConditionalCompressedMoveFusion", "true", "Enable branch+c.mv fusion">; def HasConditionalMoveFusion : Predicate<"Subtarget->hasConditionalMoveFusion()">; def NoConditionalMoveFusion : Predicate<"!Subtarget->hasConditionalMoveFusion()">; def TuneHasSingleElementVecFP64 - : SubtargetFeature<"single-element-vec-fp64", "HasSingleElementVectorFP64", "true", + : RISCVTuneFeature<"single-element-vec-fp64", "HasSingleElementVectorFP64", "true", "Certain vector FP64 operations produce a single result " "element per cycle">; @@ -1899,11 +1903,11 @@ def TuneVentanaVeyron : SubtargetFeature<"ventana-veyron", "RISCVProcFamily", "V def TuneAndes45 : SubtargetFeature<"andes45", "RISCVProcFamily", "Andes45", "Andes 45-Series processors">; -def TuneVXRMPipelineFlush : SubtargetFeature<"vxrm-pipeline-flush", "HasVXRMPipelineFlush", +def TuneVXRMPipelineFlush : RISCVTuneFeature<"vxrm-pipeline-flush", "HasVXRMPipelineFlush", "true", "VXRM writes causes pipeline flush">; def TunePreferVsetvliOverReadVLENB - : SubtargetFeature<"prefer-vsetvli-over-read-vlenb", + : RISCVTuneFeature<"prefer-vsetvli-over-read-vlenb", "PreferVsetvliOverReadVLENB", "true", "Prefer vsetvli over read vlenb CSR to calculate VLEN">; diff --git a/llvm/lib/TargetParser/RISCVTargetParser.cpp b/llvm/lib/TargetParser/RISCVTargetParser.cpp index 5ea63a973ea37..aa2039a00a550 100644 --- a/llvm/lib/TargetParser/RISCVTargetParser.cpp +++ b/llvm/lib/TargetParser/RISCVTargetParser.cpp @@ -12,7 +12,11 @@ //===----------------------------------------------------------------------===// #include "llvm/TargetParser/RISCVTargetParser.h" +#include "llvm/ADT/SetOperations.h" +#include "llvm/ADT/SmallSet.h" #include "llvm/ADT/SmallVector.h" +#include "llvm/ADT/StringExtras.h" +#include "llvm/ADT/StringSet.h" #include "llvm/ADT/StringSwitch.h" #include "llvm/TargetParser/RISCVISAInfo.h" @@ -145,6 +149,101 @@ void getFeaturesForCPU(StringRef CPU, EnabledFeatures.push_back(F.substr(1)); } +using RISCVImpliedTuneFeature = std::pair<const char *, const char *>; + +#define GET_TUNE_FEATURES +#define GET_IMPLIED_TUNE_FEATURES +#include "llvm/TargetParser/RISCVTargetParserDef.inc" + +void getAllTuneFeatures(SmallVectorImpl<StringRef> &Features) { + Features.assign(std::begin(TuneFeatures), std::end(TuneFeatures)); +} + +Error parseTuneFeatureString(StringRef TFString, + SmallVectorImpl<std::string> &ResFeatures) { + const StringSet<> AllTuneFeatureSet(llvm::from_range, TuneFeatures); + using SmallStringSet = SmallSet<StringRef, 4>; + + TFString = TFString.trim(); + // Note: StringSet is not really ergnomic to use in this case here. + SmallStringSet PositiveFeatures; + SmallStringSet NegativeFeatures; + // Phase 1: Collect explicit features. + StringRef FeatureStr; + do { + std::tie(FeatureStr, TFString) = TFString.split(","); + if (AllTuneFeatureSet.count(FeatureStr)) { + if (!PositiveFeatures.insert(FeatureStr).second) + return createStringError(inconvertibleErrorCode(), + "cannot specify more than one instance of '" + + Twine(FeatureStr) + "'"); + } else if (FeatureStr.starts_with("no-")) { + // Check if this is a negative feature, like `no-foo` for `foo`. + StringRef ActualFeature = FeatureStr.drop_front(3); + if (AllTuneFeatureSet.count(ActualFeature)) { + if (!NegativeFeatures.insert(ActualFeature).second) + return createStringError( + inconvertibleErrorCode(), + "cannot specify more than one instance of '" + Twine(FeatureStr) + + "'"); + } + } else { + return createStringError(inconvertibleErrorCode(), + "unrecognized tune feature directive '" + + Twine(FeatureStr) + "'"); + } + } while (!TFString.empty()); + + auto Intersection = + llvm::set_intersection(PositiveFeatures, NegativeFeatures); + if (!Intersection.empty()) { + std::string IntersectedStr = join(Intersection, "', '"); + return createStringError(inconvertibleErrorCode(), + "Feature(s) '" + Twine(IntersectedStr) + + "' cannot appear in both " + "positive and negative directives"); + } + + // Phase 2: Derive implied features. + StringMap<SmallVector<StringRef, 2>> ImpliedFeatureMap; + StringMap<SmallVector<StringRef, 2>> InverseImpliedFeatureMap; + for (auto [Feature, ImpliedFeature] : ImpliedTuneFeatures) { + ImpliedFeatureMap[Feature].push_back(ImpliedFeature); + InverseImpliedFeatureMap[ImpliedFeature].push_back(Feature); + } + + for (StringRef PF : PositiveFeatures) { + auto ItFeatures = ImpliedFeatureMap.find(PF); + if (ItFeatures != ImpliedFeatureMap.end()) + PositiveFeatures.insert(ItFeatures->second.begin(), + ItFeatures->second.end()); + } + for (StringRef NF : NegativeFeatures) { + auto ItFeatures = InverseImpliedFeatureMap.find(NF); + if (ItFeatures != InverseImpliedFeatureMap.end()) + NegativeFeatures.insert(ItFeatures->second.begin(), + ItFeatures->second.end()); + } + + Intersection = llvm::set_intersection(PositiveFeatures, NegativeFeatures); + if (!Intersection.empty()) { + std::string IntersectedStr = join(Intersection, "', '"); + return createStringError(inconvertibleErrorCode(), + "Feature(s) '" + Twine(IntersectedStr) + + "' were implied by both " + "positive and negative directives"); + } + + // Export the result. + const std::string PosPrefix("+"); + const std::string NegPrefix("-"); + for (StringRef PF : PositiveFeatures) + ResFeatures.emplace_back(PosPrefix + PF.str()); + for (StringRef NF : NegativeFeatures) + ResFeatures.emplace_back(NegPrefix + NF.str()); + + return Error::success(); +} } // namespace RISCV namespace RISCVVType { diff --git a/llvm/unittests/TargetParser/RISCVTargetParserTest.cpp b/llvm/unittests/TargetParser/RISCVTargetParserTest.cpp index 63ac8f993ecdc..e85b08a5df5ff 100644 --- a/llvm/unittests/TargetParser/RISCVTargetParserTest.cpp +++ b/llvm/unittests/TargetParser/RISCVTargetParserTest.cpp @@ -7,6 +7,8 @@ //===----------------------------------------------------------------------===// #include "llvm/TargetParser/RISCVTargetParser.h" +#include "llvm/ADT/STLExtras.h" +#include "llvm/ADT/SmallVector.h" #include "gtest/gtest.h" using namespace llvm; @@ -30,4 +32,90 @@ TEST(RISCVVType, CheckSameRatioLMUL) { EXPECT_EQ(RISCVVType::LMUL_F2, RISCVVType::getSameRatioLMUL(8, RISCVVType::LMUL_F4, 16)); } + +TEST(RISCVTuneFeature, AllTuneFeatures) { + SmallVector<StringRef> AllTuneFeatures; + RISCV::getAllTuneFeatures(AllTuneFeatures); + EXPECT_EQ(AllTuneFeatures.size(), 28U); + for (auto F : {"conditional-cmv-fusion", + "dlen-factor-2", + "disable-latency-sched-heuristic", + "disable-misched-load-clustering", + "disable-misched-store-clustering", + "disable-postmisched-load-clustering", + "disable-postmisched-store-clustering", + "single-element-vec-fp64", + "log-vrgather", + "no-default-unroll", + "no-sink-splat-operands", + "optimized-nf2-segment-load-store", + "optimized-nf3-segment-load-store", + "optimized-nf4-segment-load-store", + "optimized-nf5-segment-load-store", + "optimized-nf6-segment-load-store", + "optimized-nf7-segment-load-store", + "optimized-nf8-segment-load-store", + "optimized-zero-stride-load", + "use-postra-scheduler", + "predictable-select-expensive", + "prefer-vsetvli-over-read-vlenb", + "prefer-w-inst", + "short-forward-branch-i-minmax", + "short-forward-branch-i-mul", + "short-forward-branch-opt", + "vl-dependent-latency", + "vxrm-pipeline-flush"}) + EXPECT_TRUE(is_contained(AllTuneFeatures, F)); +} + +TEST(RISCVTuneFeature, LegalTuneFeatureStrings) { + SmallVector<std::string> Result; + EXPECT_FALSE(errorToBool(RISCV::parseTuneFeatureString( + "log-vrgather,no-short-forward-branch-opt,vl-dependent-latency", + Result))); + EXPECT_TRUE(is_contained(Result, "+log-vrgather")); + EXPECT_TRUE(is_contained(Result, "+vl-dependent-latency")); + EXPECT_TRUE(is_contained(Result, "-short-forward-branch-opt")); + EXPECT_TRUE(is_contained(Result, "-short-forward-branch-i-minmax")); + EXPECT_TRUE(is_contained(Result, "-short-forward-branch-i-mul")); + + Result.clear(); + EXPECT_FALSE(errorToBool(RISCV::parseTuneFeatureString( + "no-short-forward-branch-i-mul,short-forward-branch-i-minmax", Result))); + EXPECT_TRUE(is_contained(Result, "+short-forward-branch-i-minmax")); + EXPECT_TRUE(is_contained(Result, "+short-forward-branch-opt")); + EXPECT_TRUE(is_contained(Result, "-short-forward-branch-i-mul")); + + Result.clear(); + EXPECT_FALSE(errorToBool(RISCV::parseTuneFeatureString( + "no-no-default-unroll,no-sink-splat-operands", Result))); + EXPECT_TRUE(is_contained(Result, "+no-sink-splat-operands")); + EXPECT_TRUE(is_contained(Result, "-no-default-unroll")); +} + +TEST(RISCVTuneFeature, UnrecognizedTuneFeature) { + SmallVector<std::string> Result; + EXPECT_EQ(toString(RISCV::parseTuneFeatureString("32bit", Result)), + "unrecognized tune feature directive '32bit'"); +} + +TEST(RISCVTuneFeature, DuplicatedFeatures) { + SmallVector<std::string> Result; + EXPECT_EQ(toString(RISCV::parseTuneFeatureString("log-vrgather,log-vrgather", + Result)), + "cannot specify more than one instance of 'log-vrgather'"); + + EXPECT_EQ(toString(RISCV::parseTuneFeatureString( + "log-vrgather,no-log-vrgather,short-forward-branch-i-mul,no-" + "short-forward-branch-i-mul", + Result)), + "Feature(s) 'log-vrgather', 'short-forward-branch-i-mul' cannot " + "appear in both positive and negative directives"); + + EXPECT_EQ( + toString(RISCV::parseTuneFeatureString( + "short-forward-branch-i-mul,no-short-forward-branch-opt", Result)), + "Feature(s) 'short-forward-branch-i-mul', 'short-forward-branch-opt' " + "were implied by both positive and negative directives"); +} } // namespace diff --git a/llvm/utils/TableGen/Basic/RISCVTargetDefEmitter.cpp b/llvm/utils/TableGen/Basic/RISCVTargetDefEmitter.cpp index f7959376adc4a..7420fa439ecd4 100644 --- a/llvm/utils/TableGen/Basic/RISCVTargetDefEmitter.cpp +++ b/llvm/utils/TableGen/Basic/RISCVTargetDefEmitter.cpp @@ -14,6 +14,7 @@ #include "llvm/ADT/DenseSet.h" #include "llvm/Support/Format.h" #include "llvm/Support/RISCVISAUtils.h" +#include "llvm/TableGen/Error.h" #include "llvm/TableGen/Record.h" #include "llvm/TableGen/TableGenBackend.h" @@ -259,7 +260,52 @@ static void emitRISCVExtensionBitmask(const RecordKeeper &RK, raw_ostream &OS) { << "},\n"; } OS << "};\n"; - OS << "#endif\n"; + OS << "#endif\n\n"; +} + +static void emitRISCVTuneFeatures(const RecordKeeper &RK, raw_ostream &OS) { + std::vector<const Record *> TuneFeatureRecords = + RK.getAllDerivedDefinitionsIfDefined("RISCVTuneFeature"); + + SmallVector<StringRef> TuneFeatures; + // A list of {Feature -> Implied Feature} + SmallVector<std::pair<StringRef, StringRef>> ImpliedFeatures; + for (const auto *R : TuneFeatureRecords) { + StringRef FeatureName = R->getValueAsString("Name"); + TuneFeatures.push_back(FeatureName); + std::vector<const Record *> Implies = R->getValueAsListOfDefs("Implies"); + for (const auto *ImpliedRecord : Implies) { + if (!ImpliedRecord->isSubClassOf("RISCVTuneFeature") || + ImpliedRecord == R) { + PrintError(ImpliedRecord,... [truncated] 
@llvmbot
Copy link
Member

llvmbot commented Nov 15, 2025

@llvm/pr-subscribers-tablegen

Author: Min-Yih Hsu (mshockwave)

Changes

This patch creates a new syntax intended for -mtune, such that we can write

-mtune=xyz:foo,no-bar 

to reuse the tune CPU definition of xyz but "customize" it by enabling feature foo while disabling feature bar. The "foo,no-bar" is the new tune feature string we're introducing here.

The grammar is simple:

directive ::= ("no-")? featureName tuneFeatureStr ::= directive ("," directive)* 

Where featureName is one of the subtarget features that are marked by a special TableGen class RISCVTuneFeature.

Directives with a "no-" prefix are considered negative ones -- they are supposed to only subtract subtarget features from the base tune CPU. Otherwise, a directive is considered a positive one, which adds a new subtarget feature.

None of the subtarget features generated from positive directives -- either explicitly or implicitly (i.e. implied features) -- should overlaps with those generated from the negative directives. This is a deliberated decision to simplify the logics and prevent it from becoming unmanageable in the future.


Credit to Sam for the rules and semantics describes in #162716 (comment) of which this PR builds on top.

I still need to add documentation. There will also be another follow-up patch to integrate this into Clang.


Patch is 21.35 KiB, truncated to 20.00 KiB below, full version: https://github.com/llvm/llvm-project/pull/168160.diff

5 Files Affected:

  • (modified) llvm/include/llvm/TargetParser/RISCVTargetParser.h (+4)
  • (modified) llvm/lib/Target/RISCV/RISCVFeatures.td (+26-22)
  • (modified) llvm/lib/TargetParser/RISCVTargetParser.cpp (+99)
  • (modified) llvm/unittests/TargetParser/RISCVTargetParserTest.cpp (+88)
  • (modified) llvm/utils/TableGen/Basic/RISCVTargetDefEmitter.cpp (+48-1)
diff --git a/llvm/include/llvm/TargetParser/RISCVTargetParser.h b/llvm/include/llvm/TargetParser/RISCVTargetParser.h index 2ac58a539d5ee..2c6a59a59a993 100644 --- a/llvm/include/llvm/TargetParser/RISCVTargetParser.h +++ b/llvm/include/llvm/TargetParser/RISCVTargetParser.h @@ -16,6 +16,7 @@ #include "llvm/ADT/StringRef.h" #include "llvm/Support/Compiler.h" +#include "llvm/Support/Error.h" #include "llvm/Support/MathExtras.h" #include "llvm/Support/raw_ostream.h" @@ -54,6 +55,9 @@ static constexpr unsigned RVVBytesPerBlock = RVVBitsPerBlock / 8; LLVM_ABI void getFeaturesForCPU(StringRef CPU, SmallVectorImpl<std::string> &EnabledFeatures, bool NeedPlus = false); +LLVM_ABI void getAllTuneFeatures(SmallVectorImpl<StringRef> &TuneFeatures); +LLVM_ABI Error parseTuneFeatureString( + StringRef TFString, SmallVectorImpl<std::string> &TuneFeatures); LLVM_ABI bool parseCPU(StringRef CPU, bool IsRV64); LLVM_ABI bool parseTuneCPU(StringRef CPU, bool IsRV64); LLVM_ABI StringRef getMArchFromMcpu(StringRef CPU); diff --git a/llvm/lib/Target/RISCV/RISCVFeatures.td b/llvm/lib/Target/RISCV/RISCVFeatures.td index 0b964c4808d8a..ef5203445b55c 100644 --- a/llvm/lib/Target/RISCV/RISCVFeatures.td +++ b/llvm/lib/Target/RISCV/RISCVFeatures.td @@ -1740,6 +1740,10 @@ def HasVendorXSMTVDot // LLVM specific features and extensions //===----------------------------------------------------------------------===// +class RISCVTuneFeature<string name, string field_name, string value, + string description, list<SubtargetFeature> implied = []> + : SubtargetFeature<name, field_name, value, description, implied>; + // Feature32Bit exists to mark CPUs that support RV32 to distinguish them from // tuning CPU names. def Feature32Bit @@ -1788,46 +1792,46 @@ def FeatureUnalignedVectorMem "loads and stores">; def TuneNLogNVRGather - : SubtargetFeature<"log-vrgather", "RISCVVRGatherCostModel", "NLog2N", + : RISCVTuneFeature<"log-vrgather", "RISCVVRGatherCostModel", "NLog2N", "Has vrgather.vv with LMUL*log2(LMUL) latency">; -def TunePostRAScheduler : SubtargetFeature<"use-postra-scheduler", +def TunePostRAScheduler : RISCVTuneFeature<"use-postra-scheduler", "UsePostRAScheduler", "true", "Schedule again after register allocation">; -def TuneDisableMISchedLoadClustering : SubtargetFeature<"disable-misched-load-clustering", +def TuneDisableMISchedLoadClustering : RISCVTuneFeature<"disable-misched-load-clustering", "EnableMISchedLoadClustering", "false", "Disable load clustering in the machine scheduler">; -def TuneDisableMISchedStoreClustering : SubtargetFeature<"disable-misched-store-clustering", +def TuneDisableMISchedStoreClustering : RISCVTuneFeature<"disable-misched-store-clustering", "EnableMISchedStoreClustering", "false", "Disable store clustering in the machine scheduler">; -def TuneDisablePostMISchedLoadClustering : SubtargetFeature<"disable-postmisched-load-clustering", +def TuneDisablePostMISchedLoadClustering : RISCVTuneFeature<"disable-postmisched-load-clustering", "EnablePostMISchedLoadClustering", "false", "Disable PostRA load clustering in the machine scheduler">; -def TuneDisablePostMISchedStoreClustering : SubtargetFeature<"disable-postmisched-store-clustering", +def TuneDisablePostMISchedStoreClustering : RISCVTuneFeature<"disable-postmisched-store-clustering", "EnablePostMISchedStoreClustering", "false", "Disable PostRA store clustering in the machine scheduler">; def TuneDisableLatencySchedHeuristic - : SubtargetFeature<"disable-latency-sched-heuristic", "DisableLatencySchedHeuristic", "true", + : RISCVTuneFeature<"disable-latency-sched-heuristic", "DisableLatencySchedHeuristic", "true", "Disable latency scheduling heuristic">; def TunePredictableSelectIsExpensive - : SubtargetFeature<"predictable-select-expensive", "PredictableSelectIsExpensive", "true", + : RISCVTuneFeature<"predictable-select-expensive", "PredictableSelectIsExpensive", "true", "Prefer likely predicted branches over selects">; def TuneOptimizedZeroStrideLoad - : SubtargetFeature<"optimized-zero-stride-load", "HasOptimizedZeroStrideLoad", + : RISCVTuneFeature<"optimized-zero-stride-load", "HasOptimizedZeroStrideLoad", "true", "Optimized (perform fewer memory operations)" "zero-stride vector load">; foreach nf = {2-8} in def TuneOptimizedNF#nf#SegmentLoadStore : - SubtargetFeature<"optimized-nf"#nf#"-segment-load-store", + RISCVTuneFeature<"optimized-nf"#nf#"-segment-load-store", "HasOptimizedNF"#nf#"SegmentLoadStore", "true", "vlseg"#nf#"eN.v and vsseg"#nf#"eN.v are " "implemented as a wide memory op and shuffle">; def TuneVLDependentLatency - : SubtargetFeature<"vl-dependent-latency", "HasVLDependentLatency", "true", + : RISCVTuneFeature<"vl-dependent-latency", "HasVLDependentLatency", "true", "Latency of vector instructions is dependent on the " "dynamic value of vl">; @@ -1839,50 +1843,50 @@ def Experimental // and instead split over multiple cycles. DLEN refers to the datapath width // that can be done in parallel. def TuneDLenFactor2 - : SubtargetFeature<"dlen-factor-2", "DLenFactor2", "true", + : RISCVTuneFeature<"dlen-factor-2", "DLenFactor2", "true", "Vector unit DLEN(data path width) is half of VLEN">; def TuneNoDefaultUnroll - : SubtargetFeature<"no-default-unroll", "EnableDefaultUnroll", "false", + : RISCVTuneFeature<"no-default-unroll", "EnableDefaultUnroll", "false", "Disable default unroll preference.">; // SiFive 7 is able to fuse integer ALU operations with a preceding branch // instruction. def TuneShortForwardBranchOpt - : SubtargetFeature<"short-forward-branch-opt", "HasShortForwardBranchOpt", + : RISCVTuneFeature<"short-forward-branch-opt", "HasShortForwardBranchOpt", "true", "Enable short forward branch optimization">; def HasShortForwardBranchOpt : Predicate<"Subtarget->hasShortForwardBranchOpt()">; def NoShortForwardBranchOpt : Predicate<"!Subtarget->hasShortForwardBranchOpt()">; def TuneShortForwardBranchIMinMax - : SubtargetFeature<"short-forward-branch-i-minmax", "HasShortForwardBranchIMinMax", + : RISCVTuneFeature<"short-forward-branch-i-minmax", "HasShortForwardBranchIMinMax", "true", "Enable short forward branch optimization for min,max instructions in Zbb", [TuneShortForwardBranchOpt]>; def TuneShortForwardBranchIMul - : SubtargetFeature<"short-forward-branch-i-mul", "HasShortForwardBranchIMul", + : RISCVTuneFeature<"short-forward-branch-i-mul", "HasShortForwardBranchIMul", "true", "Enable short forward branch optimization for mul instruction", [TuneShortForwardBranchOpt]>; // Some subtargets require a S2V transfer buffer to move scalars into vectors. // FIXME: Forming .vx/.vf/.wx/.wf can reduce register pressure. def TuneNoSinkSplatOperands - : SubtargetFeature<"no-sink-splat-operands", "SinkSplatOperands", + : RISCVTuneFeature<"no-sink-splat-operands", "SinkSplatOperands", "false", "Disable sink splat operands to enable .vx, .vf," ".wx, and .wf instructions">; def TunePreferWInst - : SubtargetFeature<"prefer-w-inst", "PreferWInst", "true", + : RISCVTuneFeature<"prefer-w-inst", "PreferWInst", "true", "Prefer instructions with W suffix">; def TuneConditionalCompressedMoveFusion - : SubtargetFeature<"conditional-cmv-fusion", "HasConditionalCompressedMoveFusion", + : RISCVTuneFeature<"conditional-cmv-fusion", "HasConditionalCompressedMoveFusion", "true", "Enable branch+c.mv fusion">; def HasConditionalMoveFusion : Predicate<"Subtarget->hasConditionalMoveFusion()">; def NoConditionalMoveFusion : Predicate<"!Subtarget->hasConditionalMoveFusion()">; def TuneHasSingleElementVecFP64 - : SubtargetFeature<"single-element-vec-fp64", "HasSingleElementVectorFP64", "true", + : RISCVTuneFeature<"single-element-vec-fp64", "HasSingleElementVectorFP64", "true", "Certain vector FP64 operations produce a single result " "element per cycle">; @@ -1899,11 +1903,11 @@ def TuneVentanaVeyron : SubtargetFeature<"ventana-veyron", "RISCVProcFamily", "V def TuneAndes45 : SubtargetFeature<"andes45", "RISCVProcFamily", "Andes45", "Andes 45-Series processors">; -def TuneVXRMPipelineFlush : SubtargetFeature<"vxrm-pipeline-flush", "HasVXRMPipelineFlush", +def TuneVXRMPipelineFlush : RISCVTuneFeature<"vxrm-pipeline-flush", "HasVXRMPipelineFlush", "true", "VXRM writes causes pipeline flush">; def TunePreferVsetvliOverReadVLENB - : SubtargetFeature<"prefer-vsetvli-over-read-vlenb", + : RISCVTuneFeature<"prefer-vsetvli-over-read-vlenb", "PreferVsetvliOverReadVLENB", "true", "Prefer vsetvli over read vlenb CSR to calculate VLEN">; diff --git a/llvm/lib/TargetParser/RISCVTargetParser.cpp b/llvm/lib/TargetParser/RISCVTargetParser.cpp index 5ea63a973ea37..aa2039a00a550 100644 --- a/llvm/lib/TargetParser/RISCVTargetParser.cpp +++ b/llvm/lib/TargetParser/RISCVTargetParser.cpp @@ -12,7 +12,11 @@ //===----------------------------------------------------------------------===// #include "llvm/TargetParser/RISCVTargetParser.h" +#include "llvm/ADT/SetOperations.h" +#include "llvm/ADT/SmallSet.h" #include "llvm/ADT/SmallVector.h" +#include "llvm/ADT/StringExtras.h" +#include "llvm/ADT/StringSet.h" #include "llvm/ADT/StringSwitch.h" #include "llvm/TargetParser/RISCVISAInfo.h" @@ -145,6 +149,101 @@ void getFeaturesForCPU(StringRef CPU, EnabledFeatures.push_back(F.substr(1)); } +using RISCVImpliedTuneFeature = std::pair<const char *, const char *>; + +#define GET_TUNE_FEATURES +#define GET_IMPLIED_TUNE_FEATURES +#include "llvm/TargetParser/RISCVTargetParserDef.inc" + +void getAllTuneFeatures(SmallVectorImpl<StringRef> &Features) { + Features.assign(std::begin(TuneFeatures), std::end(TuneFeatures)); +} + +Error parseTuneFeatureString(StringRef TFString, + SmallVectorImpl<std::string> &ResFeatures) { + const StringSet<> AllTuneFeatureSet(llvm::from_range, TuneFeatures); + using SmallStringSet = SmallSet<StringRef, 4>; + + TFString = TFString.trim(); + // Note: StringSet is not really ergnomic to use in this case here. + SmallStringSet PositiveFeatures; + SmallStringSet NegativeFeatures; + // Phase 1: Collect explicit features. + StringRef FeatureStr; + do { + std::tie(FeatureStr, TFString) = TFString.split(","); + if (AllTuneFeatureSet.count(FeatureStr)) { + if (!PositiveFeatures.insert(FeatureStr).second) + return createStringError(inconvertibleErrorCode(), + "cannot specify more than one instance of '" + + Twine(FeatureStr) + "'"); + } else if (FeatureStr.starts_with("no-")) { + // Check if this is a negative feature, like `no-foo` for `foo`. + StringRef ActualFeature = FeatureStr.drop_front(3); + if (AllTuneFeatureSet.count(ActualFeature)) { + if (!NegativeFeatures.insert(ActualFeature).second) + return createStringError( + inconvertibleErrorCode(), + "cannot specify more than one instance of '" + Twine(FeatureStr) + + "'"); + } + } else { + return createStringError(inconvertibleErrorCode(), + "unrecognized tune feature directive '" + + Twine(FeatureStr) + "'"); + } + } while (!TFString.empty()); + + auto Intersection = + llvm::set_intersection(PositiveFeatures, NegativeFeatures); + if (!Intersection.empty()) { + std::string IntersectedStr = join(Intersection, "', '"); + return createStringError(inconvertibleErrorCode(), + "Feature(s) '" + Twine(IntersectedStr) + + "' cannot appear in both " + "positive and negative directives"); + } + + // Phase 2: Derive implied features. + StringMap<SmallVector<StringRef, 2>> ImpliedFeatureMap; + StringMap<SmallVector<StringRef, 2>> InverseImpliedFeatureMap; + for (auto [Feature, ImpliedFeature] : ImpliedTuneFeatures) { + ImpliedFeatureMap[Feature].push_back(ImpliedFeature); + InverseImpliedFeatureMap[ImpliedFeature].push_back(Feature); + } + + for (StringRef PF : PositiveFeatures) { + auto ItFeatures = ImpliedFeatureMap.find(PF); + if (ItFeatures != ImpliedFeatureMap.end()) + PositiveFeatures.insert(ItFeatures->second.begin(), + ItFeatures->second.end()); + } + for (StringRef NF : NegativeFeatures) { + auto ItFeatures = InverseImpliedFeatureMap.find(NF); + if (ItFeatures != InverseImpliedFeatureMap.end()) + NegativeFeatures.insert(ItFeatures->second.begin(), + ItFeatures->second.end()); + } + + Intersection = llvm::set_intersection(PositiveFeatures, NegativeFeatures); + if (!Intersection.empty()) { + std::string IntersectedStr = join(Intersection, "', '"); + return createStringError(inconvertibleErrorCode(), + "Feature(s) '" + Twine(IntersectedStr) + + "' were implied by both " + "positive and negative directives"); + } + + // Export the result. + const std::string PosPrefix("+"); + const std::string NegPrefix("-"); + for (StringRef PF : PositiveFeatures) + ResFeatures.emplace_back(PosPrefix + PF.str()); + for (StringRef NF : NegativeFeatures) + ResFeatures.emplace_back(NegPrefix + NF.str()); + + return Error::success(); +} } // namespace RISCV namespace RISCVVType { diff --git a/llvm/unittests/TargetParser/RISCVTargetParserTest.cpp b/llvm/unittests/TargetParser/RISCVTargetParserTest.cpp index 63ac8f993ecdc..e85b08a5df5ff 100644 --- a/llvm/unittests/TargetParser/RISCVTargetParserTest.cpp +++ b/llvm/unittests/TargetParser/RISCVTargetParserTest.cpp @@ -7,6 +7,8 @@ //===----------------------------------------------------------------------===// #include "llvm/TargetParser/RISCVTargetParser.h" +#include "llvm/ADT/STLExtras.h" +#include "llvm/ADT/SmallVector.h" #include "gtest/gtest.h" using namespace llvm; @@ -30,4 +32,90 @@ TEST(RISCVVType, CheckSameRatioLMUL) { EXPECT_EQ(RISCVVType::LMUL_F2, RISCVVType::getSameRatioLMUL(8, RISCVVType::LMUL_F4, 16)); } + +TEST(RISCVTuneFeature, AllTuneFeatures) { + SmallVector<StringRef> AllTuneFeatures; + RISCV::getAllTuneFeatures(AllTuneFeatures); + EXPECT_EQ(AllTuneFeatures.size(), 28U); + for (auto F : {"conditional-cmv-fusion", + "dlen-factor-2", + "disable-latency-sched-heuristic", + "disable-misched-load-clustering", + "disable-misched-store-clustering", + "disable-postmisched-load-clustering", + "disable-postmisched-store-clustering", + "single-element-vec-fp64", + "log-vrgather", + "no-default-unroll", + "no-sink-splat-operands", + "optimized-nf2-segment-load-store", + "optimized-nf3-segment-load-store", + "optimized-nf4-segment-load-store", + "optimized-nf5-segment-load-store", + "optimized-nf6-segment-load-store", + "optimized-nf7-segment-load-store", + "optimized-nf8-segment-load-store", + "optimized-zero-stride-load", + "use-postra-scheduler", + "predictable-select-expensive", + "prefer-vsetvli-over-read-vlenb", + "prefer-w-inst", + "short-forward-branch-i-minmax", + "short-forward-branch-i-mul", + "short-forward-branch-opt", + "vl-dependent-latency", + "vxrm-pipeline-flush"}) + EXPECT_TRUE(is_contained(AllTuneFeatures, F)); +} + +TEST(RISCVTuneFeature, LegalTuneFeatureStrings) { + SmallVector<std::string> Result; + EXPECT_FALSE(errorToBool(RISCV::parseTuneFeatureString( + "log-vrgather,no-short-forward-branch-opt,vl-dependent-latency", + Result))); + EXPECT_TRUE(is_contained(Result, "+log-vrgather")); + EXPECT_TRUE(is_contained(Result, "+vl-dependent-latency")); + EXPECT_TRUE(is_contained(Result, "-short-forward-branch-opt")); + EXPECT_TRUE(is_contained(Result, "-short-forward-branch-i-minmax")); + EXPECT_TRUE(is_contained(Result, "-short-forward-branch-i-mul")); + + Result.clear(); + EXPECT_FALSE(errorToBool(RISCV::parseTuneFeatureString( + "no-short-forward-branch-i-mul,short-forward-branch-i-minmax", Result))); + EXPECT_TRUE(is_contained(Result, "+short-forward-branch-i-minmax")); + EXPECT_TRUE(is_contained(Result, "+short-forward-branch-opt")); + EXPECT_TRUE(is_contained(Result, "-short-forward-branch-i-mul")); + + Result.clear(); + EXPECT_FALSE(errorToBool(RISCV::parseTuneFeatureString( + "no-no-default-unroll,no-sink-splat-operands", Result))); + EXPECT_TRUE(is_contained(Result, "+no-sink-splat-operands")); + EXPECT_TRUE(is_contained(Result, "-no-default-unroll")); +} + +TEST(RISCVTuneFeature, UnrecognizedTuneFeature) { + SmallVector<std::string> Result; + EXPECT_EQ(toString(RISCV::parseTuneFeatureString("32bit", Result)), + "unrecognized tune feature directive '32bit'"); +} + +TEST(RISCVTuneFeature, DuplicatedFeatures) { + SmallVector<std::string> Result; + EXPECT_EQ(toString(RISCV::parseTuneFeatureString("log-vrgather,log-vrgather", + Result)), + "cannot specify more than one instance of 'log-vrgather'"); + + EXPECT_EQ(toString(RISCV::parseTuneFeatureString( + "log-vrgather,no-log-vrgather,short-forward-branch-i-mul,no-" + "short-forward-branch-i-mul", + Result)), + "Feature(s) 'log-vrgather', 'short-forward-branch-i-mul' cannot " + "appear in both positive and negative directives"); + + EXPECT_EQ( + toString(RISCV::parseTuneFeatureString( + "short-forward-branch-i-mul,no-short-forward-branch-opt", Result)), + "Feature(s) 'short-forward-branch-i-mul', 'short-forward-branch-opt' " + "were implied by both positive and negative directives"); +} } // namespace diff --git a/llvm/utils/TableGen/Basic/RISCVTargetDefEmitter.cpp b/llvm/utils/TableGen/Basic/RISCVTargetDefEmitter.cpp index f7959376adc4a..7420fa439ecd4 100644 --- a/llvm/utils/TableGen/Basic/RISCVTargetDefEmitter.cpp +++ b/llvm/utils/TableGen/Basic/RISCVTargetDefEmitter.cpp @@ -14,6 +14,7 @@ #include "llvm/ADT/DenseSet.h" #include "llvm/Support/Format.h" #include "llvm/Support/RISCVISAUtils.h" +#include "llvm/TableGen/Error.h" #include "llvm/TableGen/Record.h" #include "llvm/TableGen/TableGenBackend.h" @@ -259,7 +260,52 @@ static void emitRISCVExtensionBitmask(const RecordKeeper &RK, raw_ostream &OS) { << "},\n"; } OS << "};\n"; - OS << "#endif\n"; + OS << "#endif\n\n"; +} + +static void emitRISCVTuneFeatures(const RecordKeeper &RK, raw_ostream &OS) { + std::vector<const Record *> TuneFeatureRecords = + RK.getAllDerivedDefinitionsIfDefined("RISCVTuneFeature"); + + SmallVector<StringRef> TuneFeatures; + // A list of {Feature -> Implied Feature} + SmallVector<std::pair<StringRef, StringRef>> ImpliedFeatures; + for (const auto *R : TuneFeatureRecords) { + StringRef FeatureName = R->getValueAsString("Name"); + TuneFeatures.push_back(FeatureName); + std::vector<const Record *> Implies = R->getValueAsListOfDefs("Implies"); + for (const auto *ImpliedRecord : Implies) { + if (!ImpliedRecord->isSubClassOf("RISCVTuneFeature") || + ImpliedRecord == R) { + PrintError(ImpliedRecord,... [truncated] 
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

2 participants