- Notifications
You must be signed in to change notification settings - Fork 15.2k
[RISCV] Introduce a new tune feature string syntax and its parser #168160
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Conversation
Co-Authored-By: Sam Elliott <aelliott@qti.qualcomm.com>
| @llvm/pr-subscribers-backend-risc-v Author: Min-Yih Hsu (mshockwave) ChangesThis patch creates a new syntax intended for to reuse the tune CPU definition of The grammar is simple: Where Directives with a "no-" prefix are considered negative ones -- they are supposed to only subtract subtarget features from the base tune CPU. Otherwise, a directive is considered a positive one, which adds a new subtarget feature. None of the subtarget features generated from positive directives -- either explicitly or implicitly (i.e. implied features) -- should overlaps with those generated from the negative directives. This is a deliberated decision to simplify the logics and prevent it from becoming unmanageable in the future. Credit to Sam for the rules and semantics describes in #162716 (comment) of which this PR builds on top. I still need to add documentation. There will also be another follow-up patch to integrate this into Clang. Patch is 21.35 KiB, truncated to 20.00 KiB below, full version: https://github.com/llvm/llvm-project/pull/168160.diff 5 Files Affected:
diff --git a/llvm/include/llvm/TargetParser/RISCVTargetParser.h b/llvm/include/llvm/TargetParser/RISCVTargetParser.h index 2ac58a539d5ee..2c6a59a59a993 100644 --- a/llvm/include/llvm/TargetParser/RISCVTargetParser.h +++ b/llvm/include/llvm/TargetParser/RISCVTargetParser.h @@ -16,6 +16,7 @@ #include "llvm/ADT/StringRef.h" #include "llvm/Support/Compiler.h" +#include "llvm/Support/Error.h" #include "llvm/Support/MathExtras.h" #include "llvm/Support/raw_ostream.h" @@ -54,6 +55,9 @@ static constexpr unsigned RVVBytesPerBlock = RVVBitsPerBlock / 8; LLVM_ABI void getFeaturesForCPU(StringRef CPU, SmallVectorImpl<std::string> &EnabledFeatures, bool NeedPlus = false); +LLVM_ABI void getAllTuneFeatures(SmallVectorImpl<StringRef> &TuneFeatures); +LLVM_ABI Error parseTuneFeatureString( + StringRef TFString, SmallVectorImpl<std::string> &TuneFeatures); LLVM_ABI bool parseCPU(StringRef CPU, bool IsRV64); LLVM_ABI bool parseTuneCPU(StringRef CPU, bool IsRV64); LLVM_ABI StringRef getMArchFromMcpu(StringRef CPU); diff --git a/llvm/lib/Target/RISCV/RISCVFeatures.td b/llvm/lib/Target/RISCV/RISCVFeatures.td index 0b964c4808d8a..ef5203445b55c 100644 --- a/llvm/lib/Target/RISCV/RISCVFeatures.td +++ b/llvm/lib/Target/RISCV/RISCVFeatures.td @@ -1740,6 +1740,10 @@ def HasVendorXSMTVDot // LLVM specific features and extensions //===----------------------------------------------------------------------===// +class RISCVTuneFeature<string name, string field_name, string value, + string description, list<SubtargetFeature> implied = []> + : SubtargetFeature<name, field_name, value, description, implied>; + // Feature32Bit exists to mark CPUs that support RV32 to distinguish them from // tuning CPU names. def Feature32Bit @@ -1788,46 +1792,46 @@ def FeatureUnalignedVectorMem "loads and stores">; def TuneNLogNVRGather - : SubtargetFeature<"log-vrgather", "RISCVVRGatherCostModel", "NLog2N", + : RISCVTuneFeature<"log-vrgather", "RISCVVRGatherCostModel", "NLog2N", "Has vrgather.vv with LMUL*log2(LMUL) latency">; -def TunePostRAScheduler : SubtargetFeature<"use-postra-scheduler", +def TunePostRAScheduler : RISCVTuneFeature<"use-postra-scheduler", "UsePostRAScheduler", "true", "Schedule again after register allocation">; -def TuneDisableMISchedLoadClustering : SubtargetFeature<"disable-misched-load-clustering", +def TuneDisableMISchedLoadClustering : RISCVTuneFeature<"disable-misched-load-clustering", "EnableMISchedLoadClustering", "false", "Disable load clustering in the machine scheduler">; -def TuneDisableMISchedStoreClustering : SubtargetFeature<"disable-misched-store-clustering", +def TuneDisableMISchedStoreClustering : RISCVTuneFeature<"disable-misched-store-clustering", "EnableMISchedStoreClustering", "false", "Disable store clustering in the machine scheduler">; -def TuneDisablePostMISchedLoadClustering : SubtargetFeature<"disable-postmisched-load-clustering", +def TuneDisablePostMISchedLoadClustering : RISCVTuneFeature<"disable-postmisched-load-clustering", "EnablePostMISchedLoadClustering", "false", "Disable PostRA load clustering in the machine scheduler">; -def TuneDisablePostMISchedStoreClustering : SubtargetFeature<"disable-postmisched-store-clustering", +def TuneDisablePostMISchedStoreClustering : RISCVTuneFeature<"disable-postmisched-store-clustering", "EnablePostMISchedStoreClustering", "false", "Disable PostRA store clustering in the machine scheduler">; def TuneDisableLatencySchedHeuristic - : SubtargetFeature<"disable-latency-sched-heuristic", "DisableLatencySchedHeuristic", "true", + : RISCVTuneFeature<"disable-latency-sched-heuristic", "DisableLatencySchedHeuristic", "true", "Disable latency scheduling heuristic">; def TunePredictableSelectIsExpensive - : SubtargetFeature<"predictable-select-expensive", "PredictableSelectIsExpensive", "true", + : RISCVTuneFeature<"predictable-select-expensive", "PredictableSelectIsExpensive", "true", "Prefer likely predicted branches over selects">; def TuneOptimizedZeroStrideLoad - : SubtargetFeature<"optimized-zero-stride-load", "HasOptimizedZeroStrideLoad", + : RISCVTuneFeature<"optimized-zero-stride-load", "HasOptimizedZeroStrideLoad", "true", "Optimized (perform fewer memory operations)" "zero-stride vector load">; foreach nf = {2-8} in def TuneOptimizedNF#nf#SegmentLoadStore : - SubtargetFeature<"optimized-nf"#nf#"-segment-load-store", + RISCVTuneFeature<"optimized-nf"#nf#"-segment-load-store", "HasOptimizedNF"#nf#"SegmentLoadStore", "true", "vlseg"#nf#"eN.v and vsseg"#nf#"eN.v are " "implemented as a wide memory op and shuffle">; def TuneVLDependentLatency - : SubtargetFeature<"vl-dependent-latency", "HasVLDependentLatency", "true", + : RISCVTuneFeature<"vl-dependent-latency", "HasVLDependentLatency", "true", "Latency of vector instructions is dependent on the " "dynamic value of vl">; @@ -1839,50 +1843,50 @@ def Experimental // and instead split over multiple cycles. DLEN refers to the datapath width // that can be done in parallel. def TuneDLenFactor2 - : SubtargetFeature<"dlen-factor-2", "DLenFactor2", "true", + : RISCVTuneFeature<"dlen-factor-2", "DLenFactor2", "true", "Vector unit DLEN(data path width) is half of VLEN">; def TuneNoDefaultUnroll - : SubtargetFeature<"no-default-unroll", "EnableDefaultUnroll", "false", + : RISCVTuneFeature<"no-default-unroll", "EnableDefaultUnroll", "false", "Disable default unroll preference.">; // SiFive 7 is able to fuse integer ALU operations with a preceding branch // instruction. def TuneShortForwardBranchOpt - : SubtargetFeature<"short-forward-branch-opt", "HasShortForwardBranchOpt", + : RISCVTuneFeature<"short-forward-branch-opt", "HasShortForwardBranchOpt", "true", "Enable short forward branch optimization">; def HasShortForwardBranchOpt : Predicate<"Subtarget->hasShortForwardBranchOpt()">; def NoShortForwardBranchOpt : Predicate<"!Subtarget->hasShortForwardBranchOpt()">; def TuneShortForwardBranchIMinMax - : SubtargetFeature<"short-forward-branch-i-minmax", "HasShortForwardBranchIMinMax", + : RISCVTuneFeature<"short-forward-branch-i-minmax", "HasShortForwardBranchIMinMax", "true", "Enable short forward branch optimization for min,max instructions in Zbb", [TuneShortForwardBranchOpt]>; def TuneShortForwardBranchIMul - : SubtargetFeature<"short-forward-branch-i-mul", "HasShortForwardBranchIMul", + : RISCVTuneFeature<"short-forward-branch-i-mul", "HasShortForwardBranchIMul", "true", "Enable short forward branch optimization for mul instruction", [TuneShortForwardBranchOpt]>; // Some subtargets require a S2V transfer buffer to move scalars into vectors. // FIXME: Forming .vx/.vf/.wx/.wf can reduce register pressure. def TuneNoSinkSplatOperands - : SubtargetFeature<"no-sink-splat-operands", "SinkSplatOperands", + : RISCVTuneFeature<"no-sink-splat-operands", "SinkSplatOperands", "false", "Disable sink splat operands to enable .vx, .vf," ".wx, and .wf instructions">; def TunePreferWInst - : SubtargetFeature<"prefer-w-inst", "PreferWInst", "true", + : RISCVTuneFeature<"prefer-w-inst", "PreferWInst", "true", "Prefer instructions with W suffix">; def TuneConditionalCompressedMoveFusion - : SubtargetFeature<"conditional-cmv-fusion", "HasConditionalCompressedMoveFusion", + : RISCVTuneFeature<"conditional-cmv-fusion", "HasConditionalCompressedMoveFusion", "true", "Enable branch+c.mv fusion">; def HasConditionalMoveFusion : Predicate<"Subtarget->hasConditionalMoveFusion()">; def NoConditionalMoveFusion : Predicate<"!Subtarget->hasConditionalMoveFusion()">; def TuneHasSingleElementVecFP64 - : SubtargetFeature<"single-element-vec-fp64", "HasSingleElementVectorFP64", "true", + : RISCVTuneFeature<"single-element-vec-fp64", "HasSingleElementVectorFP64", "true", "Certain vector FP64 operations produce a single result " "element per cycle">; @@ -1899,11 +1903,11 @@ def TuneVentanaVeyron : SubtargetFeature<"ventana-veyron", "RISCVProcFamily", "V def TuneAndes45 : SubtargetFeature<"andes45", "RISCVProcFamily", "Andes45", "Andes 45-Series processors">; -def TuneVXRMPipelineFlush : SubtargetFeature<"vxrm-pipeline-flush", "HasVXRMPipelineFlush", +def TuneVXRMPipelineFlush : RISCVTuneFeature<"vxrm-pipeline-flush", "HasVXRMPipelineFlush", "true", "VXRM writes causes pipeline flush">; def TunePreferVsetvliOverReadVLENB - : SubtargetFeature<"prefer-vsetvli-over-read-vlenb", + : RISCVTuneFeature<"prefer-vsetvli-over-read-vlenb", "PreferVsetvliOverReadVLENB", "true", "Prefer vsetvli over read vlenb CSR to calculate VLEN">; diff --git a/llvm/lib/TargetParser/RISCVTargetParser.cpp b/llvm/lib/TargetParser/RISCVTargetParser.cpp index 5ea63a973ea37..aa2039a00a550 100644 --- a/llvm/lib/TargetParser/RISCVTargetParser.cpp +++ b/llvm/lib/TargetParser/RISCVTargetParser.cpp @@ -12,7 +12,11 @@ //===----------------------------------------------------------------------===// #include "llvm/TargetParser/RISCVTargetParser.h" +#include "llvm/ADT/SetOperations.h" +#include "llvm/ADT/SmallSet.h" #include "llvm/ADT/SmallVector.h" +#include "llvm/ADT/StringExtras.h" +#include "llvm/ADT/StringSet.h" #include "llvm/ADT/StringSwitch.h" #include "llvm/TargetParser/RISCVISAInfo.h" @@ -145,6 +149,101 @@ void getFeaturesForCPU(StringRef CPU, EnabledFeatures.push_back(F.substr(1)); } +using RISCVImpliedTuneFeature = std::pair<const char *, const char *>; + +#define GET_TUNE_FEATURES +#define GET_IMPLIED_TUNE_FEATURES +#include "llvm/TargetParser/RISCVTargetParserDef.inc" + +void getAllTuneFeatures(SmallVectorImpl<StringRef> &Features) { + Features.assign(std::begin(TuneFeatures), std::end(TuneFeatures)); +} + +Error parseTuneFeatureString(StringRef TFString, + SmallVectorImpl<std::string> &ResFeatures) { + const StringSet<> AllTuneFeatureSet(llvm::from_range, TuneFeatures); + using SmallStringSet = SmallSet<StringRef, 4>; + + TFString = TFString.trim(); + // Note: StringSet is not really ergnomic to use in this case here. + SmallStringSet PositiveFeatures; + SmallStringSet NegativeFeatures; + // Phase 1: Collect explicit features. + StringRef FeatureStr; + do { + std::tie(FeatureStr, TFString) = TFString.split(","); + if (AllTuneFeatureSet.count(FeatureStr)) { + if (!PositiveFeatures.insert(FeatureStr).second) + return createStringError(inconvertibleErrorCode(), + "cannot specify more than one instance of '" + + Twine(FeatureStr) + "'"); + } else if (FeatureStr.starts_with("no-")) { + // Check if this is a negative feature, like `no-foo` for `foo`. + StringRef ActualFeature = FeatureStr.drop_front(3); + if (AllTuneFeatureSet.count(ActualFeature)) { + if (!NegativeFeatures.insert(ActualFeature).second) + return createStringError( + inconvertibleErrorCode(), + "cannot specify more than one instance of '" + Twine(FeatureStr) + + "'"); + } + } else { + return createStringError(inconvertibleErrorCode(), + "unrecognized tune feature directive '" + + Twine(FeatureStr) + "'"); + } + } while (!TFString.empty()); + + auto Intersection = + llvm::set_intersection(PositiveFeatures, NegativeFeatures); + if (!Intersection.empty()) { + std::string IntersectedStr = join(Intersection, "', '"); + return createStringError(inconvertibleErrorCode(), + "Feature(s) '" + Twine(IntersectedStr) + + "' cannot appear in both " + "positive and negative directives"); + } + + // Phase 2: Derive implied features. + StringMap<SmallVector<StringRef, 2>> ImpliedFeatureMap; + StringMap<SmallVector<StringRef, 2>> InverseImpliedFeatureMap; + for (auto [Feature, ImpliedFeature] : ImpliedTuneFeatures) { + ImpliedFeatureMap[Feature].push_back(ImpliedFeature); + InverseImpliedFeatureMap[ImpliedFeature].push_back(Feature); + } + + for (StringRef PF : PositiveFeatures) { + auto ItFeatures = ImpliedFeatureMap.find(PF); + if (ItFeatures != ImpliedFeatureMap.end()) + PositiveFeatures.insert(ItFeatures->second.begin(), + ItFeatures->second.end()); + } + for (StringRef NF : NegativeFeatures) { + auto ItFeatures = InverseImpliedFeatureMap.find(NF); + if (ItFeatures != InverseImpliedFeatureMap.end()) + NegativeFeatures.insert(ItFeatures->second.begin(), + ItFeatures->second.end()); + } + + Intersection = llvm::set_intersection(PositiveFeatures, NegativeFeatures); + if (!Intersection.empty()) { + std::string IntersectedStr = join(Intersection, "', '"); + return createStringError(inconvertibleErrorCode(), + "Feature(s) '" + Twine(IntersectedStr) + + "' were implied by both " + "positive and negative directives"); + } + + // Export the result. + const std::string PosPrefix("+"); + const std::string NegPrefix("-"); + for (StringRef PF : PositiveFeatures) + ResFeatures.emplace_back(PosPrefix + PF.str()); + for (StringRef NF : NegativeFeatures) + ResFeatures.emplace_back(NegPrefix + NF.str()); + + return Error::success(); +} } // namespace RISCV namespace RISCVVType { diff --git a/llvm/unittests/TargetParser/RISCVTargetParserTest.cpp b/llvm/unittests/TargetParser/RISCVTargetParserTest.cpp index 63ac8f993ecdc..e85b08a5df5ff 100644 --- a/llvm/unittests/TargetParser/RISCVTargetParserTest.cpp +++ b/llvm/unittests/TargetParser/RISCVTargetParserTest.cpp @@ -7,6 +7,8 @@ //===----------------------------------------------------------------------===// #include "llvm/TargetParser/RISCVTargetParser.h" +#include "llvm/ADT/STLExtras.h" +#include "llvm/ADT/SmallVector.h" #include "gtest/gtest.h" using namespace llvm; @@ -30,4 +32,90 @@ TEST(RISCVVType, CheckSameRatioLMUL) { EXPECT_EQ(RISCVVType::LMUL_F2, RISCVVType::getSameRatioLMUL(8, RISCVVType::LMUL_F4, 16)); } + +TEST(RISCVTuneFeature, AllTuneFeatures) { + SmallVector<StringRef> AllTuneFeatures; + RISCV::getAllTuneFeatures(AllTuneFeatures); + EXPECT_EQ(AllTuneFeatures.size(), 28U); + for (auto F : {"conditional-cmv-fusion", + "dlen-factor-2", + "disable-latency-sched-heuristic", + "disable-misched-load-clustering", + "disable-misched-store-clustering", + "disable-postmisched-load-clustering", + "disable-postmisched-store-clustering", + "single-element-vec-fp64", + "log-vrgather", + "no-default-unroll", + "no-sink-splat-operands", + "optimized-nf2-segment-load-store", + "optimized-nf3-segment-load-store", + "optimized-nf4-segment-load-store", + "optimized-nf5-segment-load-store", + "optimized-nf6-segment-load-store", + "optimized-nf7-segment-load-store", + "optimized-nf8-segment-load-store", + "optimized-zero-stride-load", + "use-postra-scheduler", + "predictable-select-expensive", + "prefer-vsetvli-over-read-vlenb", + "prefer-w-inst", + "short-forward-branch-i-minmax", + "short-forward-branch-i-mul", + "short-forward-branch-opt", + "vl-dependent-latency", + "vxrm-pipeline-flush"}) + EXPECT_TRUE(is_contained(AllTuneFeatures, F)); +} + +TEST(RISCVTuneFeature, LegalTuneFeatureStrings) { + SmallVector<std::string> Result; + EXPECT_FALSE(errorToBool(RISCV::parseTuneFeatureString( + "log-vrgather,no-short-forward-branch-opt,vl-dependent-latency", + Result))); + EXPECT_TRUE(is_contained(Result, "+log-vrgather")); + EXPECT_TRUE(is_contained(Result, "+vl-dependent-latency")); + EXPECT_TRUE(is_contained(Result, "-short-forward-branch-opt")); + EXPECT_TRUE(is_contained(Result, "-short-forward-branch-i-minmax")); + EXPECT_TRUE(is_contained(Result, "-short-forward-branch-i-mul")); + + Result.clear(); + EXPECT_FALSE(errorToBool(RISCV::parseTuneFeatureString( + "no-short-forward-branch-i-mul,short-forward-branch-i-minmax", Result))); + EXPECT_TRUE(is_contained(Result, "+short-forward-branch-i-minmax")); + EXPECT_TRUE(is_contained(Result, "+short-forward-branch-opt")); + EXPECT_TRUE(is_contained(Result, "-short-forward-branch-i-mul")); + + Result.clear(); + EXPECT_FALSE(errorToBool(RISCV::parseTuneFeatureString( + "no-no-default-unroll,no-sink-splat-operands", Result))); + EXPECT_TRUE(is_contained(Result, "+no-sink-splat-operands")); + EXPECT_TRUE(is_contained(Result, "-no-default-unroll")); +} + +TEST(RISCVTuneFeature, UnrecognizedTuneFeature) { + SmallVector<std::string> Result; + EXPECT_EQ(toString(RISCV::parseTuneFeatureString("32bit", Result)), + "unrecognized tune feature directive '32bit'"); +} + +TEST(RISCVTuneFeature, DuplicatedFeatures) { + SmallVector<std::string> Result; + EXPECT_EQ(toString(RISCV::parseTuneFeatureString("log-vrgather,log-vrgather", + Result)), + "cannot specify more than one instance of 'log-vrgather'"); + + EXPECT_EQ(toString(RISCV::parseTuneFeatureString( + "log-vrgather,no-log-vrgather,short-forward-branch-i-mul,no-" + "short-forward-branch-i-mul", + Result)), + "Feature(s) 'log-vrgather', 'short-forward-branch-i-mul' cannot " + "appear in both positive and negative directives"); + + EXPECT_EQ( + toString(RISCV::parseTuneFeatureString( + "short-forward-branch-i-mul,no-short-forward-branch-opt", Result)), + "Feature(s) 'short-forward-branch-i-mul', 'short-forward-branch-opt' " + "were implied by both positive and negative directives"); +} } // namespace diff --git a/llvm/utils/TableGen/Basic/RISCVTargetDefEmitter.cpp b/llvm/utils/TableGen/Basic/RISCVTargetDefEmitter.cpp index f7959376adc4a..7420fa439ecd4 100644 --- a/llvm/utils/TableGen/Basic/RISCVTargetDefEmitter.cpp +++ b/llvm/utils/TableGen/Basic/RISCVTargetDefEmitter.cpp @@ -14,6 +14,7 @@ #include "llvm/ADT/DenseSet.h" #include "llvm/Support/Format.h" #include "llvm/Support/RISCVISAUtils.h" +#include "llvm/TableGen/Error.h" #include "llvm/TableGen/Record.h" #include "llvm/TableGen/TableGenBackend.h" @@ -259,7 +260,52 @@ static void emitRISCVExtensionBitmask(const RecordKeeper &RK, raw_ostream &OS) { << "},\n"; } OS << "};\n"; - OS << "#endif\n"; + OS << "#endif\n\n"; +} + +static void emitRISCVTuneFeatures(const RecordKeeper &RK, raw_ostream &OS) { + std::vector<const Record *> TuneFeatureRecords = + RK.getAllDerivedDefinitionsIfDefined("RISCVTuneFeature"); + + SmallVector<StringRef> TuneFeatures; + // A list of {Feature -> Implied Feature} + SmallVector<std::pair<StringRef, StringRef>> ImpliedFeatures; + for (const auto *R : TuneFeatureRecords) { + StringRef FeatureName = R->getValueAsString("Name"); + TuneFeatures.push_back(FeatureName); + std::vector<const Record *> Implies = R->getValueAsListOfDefs("Implies"); + for (const auto *ImpliedRecord : Implies) { + if (!ImpliedRecord->isSubClassOf("RISCVTuneFeature") || + ImpliedRecord == R) { + PrintError(ImpliedRecord,... [truncated] |
| @llvm/pr-subscribers-tablegen Author: Min-Yih Hsu (mshockwave) ChangesThis patch creates a new syntax intended for to reuse the tune CPU definition of The grammar is simple: Where Directives with a "no-" prefix are considered negative ones -- they are supposed to only subtract subtarget features from the base tune CPU. Otherwise, a directive is considered a positive one, which adds a new subtarget feature. None of the subtarget features generated from positive directives -- either explicitly or implicitly (i.e. implied features) -- should overlaps with those generated from the negative directives. This is a deliberated decision to simplify the logics and prevent it from becoming unmanageable in the future. Credit to Sam for the rules and semantics describes in #162716 (comment) of which this PR builds on top. I still need to add documentation. There will also be another follow-up patch to integrate this into Clang. Patch is 21.35 KiB, truncated to 20.00 KiB below, full version: https://github.com/llvm/llvm-project/pull/168160.diff 5 Files Affected:
diff --git a/llvm/include/llvm/TargetParser/RISCVTargetParser.h b/llvm/include/llvm/TargetParser/RISCVTargetParser.h index 2ac58a539d5ee..2c6a59a59a993 100644 --- a/llvm/include/llvm/TargetParser/RISCVTargetParser.h +++ b/llvm/include/llvm/TargetParser/RISCVTargetParser.h @@ -16,6 +16,7 @@ #include "llvm/ADT/StringRef.h" #include "llvm/Support/Compiler.h" +#include "llvm/Support/Error.h" #include "llvm/Support/MathExtras.h" #include "llvm/Support/raw_ostream.h" @@ -54,6 +55,9 @@ static constexpr unsigned RVVBytesPerBlock = RVVBitsPerBlock / 8; LLVM_ABI void getFeaturesForCPU(StringRef CPU, SmallVectorImpl<std::string> &EnabledFeatures, bool NeedPlus = false); +LLVM_ABI void getAllTuneFeatures(SmallVectorImpl<StringRef> &TuneFeatures); +LLVM_ABI Error parseTuneFeatureString( + StringRef TFString, SmallVectorImpl<std::string> &TuneFeatures); LLVM_ABI bool parseCPU(StringRef CPU, bool IsRV64); LLVM_ABI bool parseTuneCPU(StringRef CPU, bool IsRV64); LLVM_ABI StringRef getMArchFromMcpu(StringRef CPU); diff --git a/llvm/lib/Target/RISCV/RISCVFeatures.td b/llvm/lib/Target/RISCV/RISCVFeatures.td index 0b964c4808d8a..ef5203445b55c 100644 --- a/llvm/lib/Target/RISCV/RISCVFeatures.td +++ b/llvm/lib/Target/RISCV/RISCVFeatures.td @@ -1740,6 +1740,10 @@ def HasVendorXSMTVDot // LLVM specific features and extensions //===----------------------------------------------------------------------===// +class RISCVTuneFeature<string name, string field_name, string value, + string description, list<SubtargetFeature> implied = []> + : SubtargetFeature<name, field_name, value, description, implied>; + // Feature32Bit exists to mark CPUs that support RV32 to distinguish them from // tuning CPU names. def Feature32Bit @@ -1788,46 +1792,46 @@ def FeatureUnalignedVectorMem "loads and stores">; def TuneNLogNVRGather - : SubtargetFeature<"log-vrgather", "RISCVVRGatherCostModel", "NLog2N", + : RISCVTuneFeature<"log-vrgather", "RISCVVRGatherCostModel", "NLog2N", "Has vrgather.vv with LMUL*log2(LMUL) latency">; -def TunePostRAScheduler : SubtargetFeature<"use-postra-scheduler", +def TunePostRAScheduler : RISCVTuneFeature<"use-postra-scheduler", "UsePostRAScheduler", "true", "Schedule again after register allocation">; -def TuneDisableMISchedLoadClustering : SubtargetFeature<"disable-misched-load-clustering", +def TuneDisableMISchedLoadClustering : RISCVTuneFeature<"disable-misched-load-clustering", "EnableMISchedLoadClustering", "false", "Disable load clustering in the machine scheduler">; -def TuneDisableMISchedStoreClustering : SubtargetFeature<"disable-misched-store-clustering", +def TuneDisableMISchedStoreClustering : RISCVTuneFeature<"disable-misched-store-clustering", "EnableMISchedStoreClustering", "false", "Disable store clustering in the machine scheduler">; -def TuneDisablePostMISchedLoadClustering : SubtargetFeature<"disable-postmisched-load-clustering", +def TuneDisablePostMISchedLoadClustering : RISCVTuneFeature<"disable-postmisched-load-clustering", "EnablePostMISchedLoadClustering", "false", "Disable PostRA load clustering in the machine scheduler">; -def TuneDisablePostMISchedStoreClustering : SubtargetFeature<"disable-postmisched-store-clustering", +def TuneDisablePostMISchedStoreClustering : RISCVTuneFeature<"disable-postmisched-store-clustering", "EnablePostMISchedStoreClustering", "false", "Disable PostRA store clustering in the machine scheduler">; def TuneDisableLatencySchedHeuristic - : SubtargetFeature<"disable-latency-sched-heuristic", "DisableLatencySchedHeuristic", "true", + : RISCVTuneFeature<"disable-latency-sched-heuristic", "DisableLatencySchedHeuristic", "true", "Disable latency scheduling heuristic">; def TunePredictableSelectIsExpensive - : SubtargetFeature<"predictable-select-expensive", "PredictableSelectIsExpensive", "true", + : RISCVTuneFeature<"predictable-select-expensive", "PredictableSelectIsExpensive", "true", "Prefer likely predicted branches over selects">; def TuneOptimizedZeroStrideLoad - : SubtargetFeature<"optimized-zero-stride-load", "HasOptimizedZeroStrideLoad", + : RISCVTuneFeature<"optimized-zero-stride-load", "HasOptimizedZeroStrideLoad", "true", "Optimized (perform fewer memory operations)" "zero-stride vector load">; foreach nf = {2-8} in def TuneOptimizedNF#nf#SegmentLoadStore : - SubtargetFeature<"optimized-nf"#nf#"-segment-load-store", + RISCVTuneFeature<"optimized-nf"#nf#"-segment-load-store", "HasOptimizedNF"#nf#"SegmentLoadStore", "true", "vlseg"#nf#"eN.v and vsseg"#nf#"eN.v are " "implemented as a wide memory op and shuffle">; def TuneVLDependentLatency - : SubtargetFeature<"vl-dependent-latency", "HasVLDependentLatency", "true", + : RISCVTuneFeature<"vl-dependent-latency", "HasVLDependentLatency", "true", "Latency of vector instructions is dependent on the " "dynamic value of vl">; @@ -1839,50 +1843,50 @@ def Experimental // and instead split over multiple cycles. DLEN refers to the datapath width // that can be done in parallel. def TuneDLenFactor2 - : SubtargetFeature<"dlen-factor-2", "DLenFactor2", "true", + : RISCVTuneFeature<"dlen-factor-2", "DLenFactor2", "true", "Vector unit DLEN(data path width) is half of VLEN">; def TuneNoDefaultUnroll - : SubtargetFeature<"no-default-unroll", "EnableDefaultUnroll", "false", + : RISCVTuneFeature<"no-default-unroll", "EnableDefaultUnroll", "false", "Disable default unroll preference.">; // SiFive 7 is able to fuse integer ALU operations with a preceding branch // instruction. def TuneShortForwardBranchOpt - : SubtargetFeature<"short-forward-branch-opt", "HasShortForwardBranchOpt", + : RISCVTuneFeature<"short-forward-branch-opt", "HasShortForwardBranchOpt", "true", "Enable short forward branch optimization">; def HasShortForwardBranchOpt : Predicate<"Subtarget->hasShortForwardBranchOpt()">; def NoShortForwardBranchOpt : Predicate<"!Subtarget->hasShortForwardBranchOpt()">; def TuneShortForwardBranchIMinMax - : SubtargetFeature<"short-forward-branch-i-minmax", "HasShortForwardBranchIMinMax", + : RISCVTuneFeature<"short-forward-branch-i-minmax", "HasShortForwardBranchIMinMax", "true", "Enable short forward branch optimization for min,max instructions in Zbb", [TuneShortForwardBranchOpt]>; def TuneShortForwardBranchIMul - : SubtargetFeature<"short-forward-branch-i-mul", "HasShortForwardBranchIMul", + : RISCVTuneFeature<"short-forward-branch-i-mul", "HasShortForwardBranchIMul", "true", "Enable short forward branch optimization for mul instruction", [TuneShortForwardBranchOpt]>; // Some subtargets require a S2V transfer buffer to move scalars into vectors. // FIXME: Forming .vx/.vf/.wx/.wf can reduce register pressure. def TuneNoSinkSplatOperands - : SubtargetFeature<"no-sink-splat-operands", "SinkSplatOperands", + : RISCVTuneFeature<"no-sink-splat-operands", "SinkSplatOperands", "false", "Disable sink splat operands to enable .vx, .vf," ".wx, and .wf instructions">; def TunePreferWInst - : SubtargetFeature<"prefer-w-inst", "PreferWInst", "true", + : RISCVTuneFeature<"prefer-w-inst", "PreferWInst", "true", "Prefer instructions with W suffix">; def TuneConditionalCompressedMoveFusion - : SubtargetFeature<"conditional-cmv-fusion", "HasConditionalCompressedMoveFusion", + : RISCVTuneFeature<"conditional-cmv-fusion", "HasConditionalCompressedMoveFusion", "true", "Enable branch+c.mv fusion">; def HasConditionalMoveFusion : Predicate<"Subtarget->hasConditionalMoveFusion()">; def NoConditionalMoveFusion : Predicate<"!Subtarget->hasConditionalMoveFusion()">; def TuneHasSingleElementVecFP64 - : SubtargetFeature<"single-element-vec-fp64", "HasSingleElementVectorFP64", "true", + : RISCVTuneFeature<"single-element-vec-fp64", "HasSingleElementVectorFP64", "true", "Certain vector FP64 operations produce a single result " "element per cycle">; @@ -1899,11 +1903,11 @@ def TuneVentanaVeyron : SubtargetFeature<"ventana-veyron", "RISCVProcFamily", "V def TuneAndes45 : SubtargetFeature<"andes45", "RISCVProcFamily", "Andes45", "Andes 45-Series processors">; -def TuneVXRMPipelineFlush : SubtargetFeature<"vxrm-pipeline-flush", "HasVXRMPipelineFlush", +def TuneVXRMPipelineFlush : RISCVTuneFeature<"vxrm-pipeline-flush", "HasVXRMPipelineFlush", "true", "VXRM writes causes pipeline flush">; def TunePreferVsetvliOverReadVLENB - : SubtargetFeature<"prefer-vsetvli-over-read-vlenb", + : RISCVTuneFeature<"prefer-vsetvli-over-read-vlenb", "PreferVsetvliOverReadVLENB", "true", "Prefer vsetvli over read vlenb CSR to calculate VLEN">; diff --git a/llvm/lib/TargetParser/RISCVTargetParser.cpp b/llvm/lib/TargetParser/RISCVTargetParser.cpp index 5ea63a973ea37..aa2039a00a550 100644 --- a/llvm/lib/TargetParser/RISCVTargetParser.cpp +++ b/llvm/lib/TargetParser/RISCVTargetParser.cpp @@ -12,7 +12,11 @@ //===----------------------------------------------------------------------===// #include "llvm/TargetParser/RISCVTargetParser.h" +#include "llvm/ADT/SetOperations.h" +#include "llvm/ADT/SmallSet.h" #include "llvm/ADT/SmallVector.h" +#include "llvm/ADT/StringExtras.h" +#include "llvm/ADT/StringSet.h" #include "llvm/ADT/StringSwitch.h" #include "llvm/TargetParser/RISCVISAInfo.h" @@ -145,6 +149,101 @@ void getFeaturesForCPU(StringRef CPU, EnabledFeatures.push_back(F.substr(1)); } +using RISCVImpliedTuneFeature = std::pair<const char *, const char *>; + +#define GET_TUNE_FEATURES +#define GET_IMPLIED_TUNE_FEATURES +#include "llvm/TargetParser/RISCVTargetParserDef.inc" + +void getAllTuneFeatures(SmallVectorImpl<StringRef> &Features) { + Features.assign(std::begin(TuneFeatures), std::end(TuneFeatures)); +} + +Error parseTuneFeatureString(StringRef TFString, + SmallVectorImpl<std::string> &ResFeatures) { + const StringSet<> AllTuneFeatureSet(llvm::from_range, TuneFeatures); + using SmallStringSet = SmallSet<StringRef, 4>; + + TFString = TFString.trim(); + // Note: StringSet is not really ergnomic to use in this case here. + SmallStringSet PositiveFeatures; + SmallStringSet NegativeFeatures; + // Phase 1: Collect explicit features. + StringRef FeatureStr; + do { + std::tie(FeatureStr, TFString) = TFString.split(","); + if (AllTuneFeatureSet.count(FeatureStr)) { + if (!PositiveFeatures.insert(FeatureStr).second) + return createStringError(inconvertibleErrorCode(), + "cannot specify more than one instance of '" + + Twine(FeatureStr) + "'"); + } else if (FeatureStr.starts_with("no-")) { + // Check if this is a negative feature, like `no-foo` for `foo`. + StringRef ActualFeature = FeatureStr.drop_front(3); + if (AllTuneFeatureSet.count(ActualFeature)) { + if (!NegativeFeatures.insert(ActualFeature).second) + return createStringError( + inconvertibleErrorCode(), + "cannot specify more than one instance of '" + Twine(FeatureStr) + + "'"); + } + } else { + return createStringError(inconvertibleErrorCode(), + "unrecognized tune feature directive '" + + Twine(FeatureStr) + "'"); + } + } while (!TFString.empty()); + + auto Intersection = + llvm::set_intersection(PositiveFeatures, NegativeFeatures); + if (!Intersection.empty()) { + std::string IntersectedStr = join(Intersection, "', '"); + return createStringError(inconvertibleErrorCode(), + "Feature(s) '" + Twine(IntersectedStr) + + "' cannot appear in both " + "positive and negative directives"); + } + + // Phase 2: Derive implied features. + StringMap<SmallVector<StringRef, 2>> ImpliedFeatureMap; + StringMap<SmallVector<StringRef, 2>> InverseImpliedFeatureMap; + for (auto [Feature, ImpliedFeature] : ImpliedTuneFeatures) { + ImpliedFeatureMap[Feature].push_back(ImpliedFeature); + InverseImpliedFeatureMap[ImpliedFeature].push_back(Feature); + } + + for (StringRef PF : PositiveFeatures) { + auto ItFeatures = ImpliedFeatureMap.find(PF); + if (ItFeatures != ImpliedFeatureMap.end()) + PositiveFeatures.insert(ItFeatures->second.begin(), + ItFeatures->second.end()); + } + for (StringRef NF : NegativeFeatures) { + auto ItFeatures = InverseImpliedFeatureMap.find(NF); + if (ItFeatures != InverseImpliedFeatureMap.end()) + NegativeFeatures.insert(ItFeatures->second.begin(), + ItFeatures->second.end()); + } + + Intersection = llvm::set_intersection(PositiveFeatures, NegativeFeatures); + if (!Intersection.empty()) { + std::string IntersectedStr = join(Intersection, "', '"); + return createStringError(inconvertibleErrorCode(), + "Feature(s) '" + Twine(IntersectedStr) + + "' were implied by both " + "positive and negative directives"); + } + + // Export the result. + const std::string PosPrefix("+"); + const std::string NegPrefix("-"); + for (StringRef PF : PositiveFeatures) + ResFeatures.emplace_back(PosPrefix + PF.str()); + for (StringRef NF : NegativeFeatures) + ResFeatures.emplace_back(NegPrefix + NF.str()); + + return Error::success(); +} } // namespace RISCV namespace RISCVVType { diff --git a/llvm/unittests/TargetParser/RISCVTargetParserTest.cpp b/llvm/unittests/TargetParser/RISCVTargetParserTest.cpp index 63ac8f993ecdc..e85b08a5df5ff 100644 --- a/llvm/unittests/TargetParser/RISCVTargetParserTest.cpp +++ b/llvm/unittests/TargetParser/RISCVTargetParserTest.cpp @@ -7,6 +7,8 @@ //===----------------------------------------------------------------------===// #include "llvm/TargetParser/RISCVTargetParser.h" +#include "llvm/ADT/STLExtras.h" +#include "llvm/ADT/SmallVector.h" #include "gtest/gtest.h" using namespace llvm; @@ -30,4 +32,90 @@ TEST(RISCVVType, CheckSameRatioLMUL) { EXPECT_EQ(RISCVVType::LMUL_F2, RISCVVType::getSameRatioLMUL(8, RISCVVType::LMUL_F4, 16)); } + +TEST(RISCVTuneFeature, AllTuneFeatures) { + SmallVector<StringRef> AllTuneFeatures; + RISCV::getAllTuneFeatures(AllTuneFeatures); + EXPECT_EQ(AllTuneFeatures.size(), 28U); + for (auto F : {"conditional-cmv-fusion", + "dlen-factor-2", + "disable-latency-sched-heuristic", + "disable-misched-load-clustering", + "disable-misched-store-clustering", + "disable-postmisched-load-clustering", + "disable-postmisched-store-clustering", + "single-element-vec-fp64", + "log-vrgather", + "no-default-unroll", + "no-sink-splat-operands", + "optimized-nf2-segment-load-store", + "optimized-nf3-segment-load-store", + "optimized-nf4-segment-load-store", + "optimized-nf5-segment-load-store", + "optimized-nf6-segment-load-store", + "optimized-nf7-segment-load-store", + "optimized-nf8-segment-load-store", + "optimized-zero-stride-load", + "use-postra-scheduler", + "predictable-select-expensive", + "prefer-vsetvli-over-read-vlenb", + "prefer-w-inst", + "short-forward-branch-i-minmax", + "short-forward-branch-i-mul", + "short-forward-branch-opt", + "vl-dependent-latency", + "vxrm-pipeline-flush"}) + EXPECT_TRUE(is_contained(AllTuneFeatures, F)); +} + +TEST(RISCVTuneFeature, LegalTuneFeatureStrings) { + SmallVector<std::string> Result; + EXPECT_FALSE(errorToBool(RISCV::parseTuneFeatureString( + "log-vrgather,no-short-forward-branch-opt,vl-dependent-latency", + Result))); + EXPECT_TRUE(is_contained(Result, "+log-vrgather")); + EXPECT_TRUE(is_contained(Result, "+vl-dependent-latency")); + EXPECT_TRUE(is_contained(Result, "-short-forward-branch-opt")); + EXPECT_TRUE(is_contained(Result, "-short-forward-branch-i-minmax")); + EXPECT_TRUE(is_contained(Result, "-short-forward-branch-i-mul")); + + Result.clear(); + EXPECT_FALSE(errorToBool(RISCV::parseTuneFeatureString( + "no-short-forward-branch-i-mul,short-forward-branch-i-minmax", Result))); + EXPECT_TRUE(is_contained(Result, "+short-forward-branch-i-minmax")); + EXPECT_TRUE(is_contained(Result, "+short-forward-branch-opt")); + EXPECT_TRUE(is_contained(Result, "-short-forward-branch-i-mul")); + + Result.clear(); + EXPECT_FALSE(errorToBool(RISCV::parseTuneFeatureString( + "no-no-default-unroll,no-sink-splat-operands", Result))); + EXPECT_TRUE(is_contained(Result, "+no-sink-splat-operands")); + EXPECT_TRUE(is_contained(Result, "-no-default-unroll")); +} + +TEST(RISCVTuneFeature, UnrecognizedTuneFeature) { + SmallVector<std::string> Result; + EXPECT_EQ(toString(RISCV::parseTuneFeatureString("32bit", Result)), + "unrecognized tune feature directive '32bit'"); +} + +TEST(RISCVTuneFeature, DuplicatedFeatures) { + SmallVector<std::string> Result; + EXPECT_EQ(toString(RISCV::parseTuneFeatureString("log-vrgather,log-vrgather", + Result)), + "cannot specify more than one instance of 'log-vrgather'"); + + EXPECT_EQ(toString(RISCV::parseTuneFeatureString( + "log-vrgather,no-log-vrgather,short-forward-branch-i-mul,no-" + "short-forward-branch-i-mul", + Result)), + "Feature(s) 'log-vrgather', 'short-forward-branch-i-mul' cannot " + "appear in both positive and negative directives"); + + EXPECT_EQ( + toString(RISCV::parseTuneFeatureString( + "short-forward-branch-i-mul,no-short-forward-branch-opt", Result)), + "Feature(s) 'short-forward-branch-i-mul', 'short-forward-branch-opt' " + "were implied by both positive and negative directives"); +} } // namespace diff --git a/llvm/utils/TableGen/Basic/RISCVTargetDefEmitter.cpp b/llvm/utils/TableGen/Basic/RISCVTargetDefEmitter.cpp index f7959376adc4a..7420fa439ecd4 100644 --- a/llvm/utils/TableGen/Basic/RISCVTargetDefEmitter.cpp +++ b/llvm/utils/TableGen/Basic/RISCVTargetDefEmitter.cpp @@ -14,6 +14,7 @@ #include "llvm/ADT/DenseSet.h" #include "llvm/Support/Format.h" #include "llvm/Support/RISCVISAUtils.h" +#include "llvm/TableGen/Error.h" #include "llvm/TableGen/Record.h" #include "llvm/TableGen/TableGenBackend.h" @@ -259,7 +260,52 @@ static void emitRISCVExtensionBitmask(const RecordKeeper &RK, raw_ostream &OS) { << "},\n"; } OS << "};\n"; - OS << "#endif\n"; + OS << "#endif\n\n"; +} + +static void emitRISCVTuneFeatures(const RecordKeeper &RK, raw_ostream &OS) { + std::vector<const Record *> TuneFeatureRecords = + RK.getAllDerivedDefinitionsIfDefined("RISCVTuneFeature"); + + SmallVector<StringRef> TuneFeatures; + // A list of {Feature -> Implied Feature} + SmallVector<std::pair<StringRef, StringRef>> ImpliedFeatures; + for (const auto *R : TuneFeatureRecords) { + StringRef FeatureName = R->getValueAsString("Name"); + TuneFeatures.push_back(FeatureName); + std::vector<const Record *> Implies = R->getValueAsListOfDefs("Implies"); + for (const auto *ImpliedRecord : Implies) { + if (!ImpliedRecord->isSubClassOf("RISCVTuneFeature") || + ImpliedRecord == R) { + PrintError(ImpliedRecord,... [truncated] |
This patch creates a new syntax intended for
-mtune, such that we can writeto reuse the tune CPU definition of
xyzbut "customize" it by enabling featurefoowhile disabling featurebar. The "foo,no-bar" is the new tune feature string we're introducing here.The grammar is simple:
Where
featureNameis one of the subtarget features that are marked by a special TableGen classRISCVTuneFeature.Directives with a "no-" prefix are considered negative ones -- they are supposed to only subtract subtarget features from the base tune CPU. Otherwise, a directive is considered a positive one, which adds a new subtarget feature.
None of the subtarget features generated from positive directives -- either explicitly or implicitly (i.e. implied features) -- should overlaps with those generated from the negative directives. This is a deliberated decision to simplify the logics and prevent it from becoming unmanageable in the future.
Credit to Sam for the rules and semantics describes in #162716 (comment) of which this PR builds on top.
I still need to add documentation. There will also be another follow-up patch to integrate this into Clang.