Skip to content

Conversation

@tex3d
Copy link
Contributor

@tex3d tex3d commented Mar 25, 2025

This change adds vector and multi-dimensional overload support for DXIL operations.

Multi-dimensional (or "extended") overloads are added, where two or more types in a DXIL Op function signature may vary independently, such as both the return type and a parameter type. Until now, only one overload dimension has been necessary.

For single-dim overloads, any number of parameters in a DXIL op may refer to this single overload type.
For multi-dim overloads, each type that can vary must have a unique overload dimension, even when two or more types must be the same. This follows a pattern from llvm intrinsics. If two or more of the types need to be the same, this constraint must be handled manually, outside the automatic overload constraints defined by the DXIL op definitions.

Vector overloads are also added, requiring an additional set of scalar overload types to define the allowed vector element types, on top of the original set describing the allowed scalar overloads for an operation, since both scalar and vector overloads may be allowed on the same operation.

There are several components involved in handling DXIL operation overloads, with some changes:

  • DXIL Op definitions in hctdb.py use a string of characters to define the allowed overloads, and special type names used in parameter definitions that refer to the overload type.
    • Overload string syntax updated and more heavily validated.
    • ',' may separate dimensions for multi-dim overloads
    • '<' indicates that a vector overload is allowed, in which case, scalar components on the left indicate normal scalar overloads allowed, and scalar components on the right indicate the allowed vector element overloads.
      • If scalar overloads are present to the left, and omitted to the right, the scalar components are replicated to the right automatically. For instance: "hf<" is equivalent to "hf<hf".
    • dxil_max_overload_dims = 2 is introduced to define the maximum number of overload dimensions currently supported.
      • This is used to generate the DXIL::kDxilMaxOloadDims definition in DxilConstants.h.
    • "$x0" and "$x1" are used to reference each overloaded dxil type in parameter definitions when more than one overload dimension is defined for a DXIL op. Other special overload types are not allowed for multi-dim overloads, which means you cannot (currently) describe a multi-dim overload where a returned overload type is wrapped in a resource return struct along with residency status. This could be changed in the future if necessary.
    • Enforced rules for multi-dim overloads keep them compatible with the llvm intrinsic overloading scheme.
  • hctdb_instrhelp.py translates overload and param type info from DXIL operation definitions into code inserted into DxilOperations.cpp.
  • DxilOperations.h|cpp encodes allowed overloads inside OpCodeProperty state for each operation in the m_OpCodeProps table. It uses this information, along with generated code, to enforce overload rules on DXIL ops.
    • The allowed overload definition in OpCodeProperty has been rewritten to use a more compact OverloadMask type, support multi-dim overloads, and add a second layer AllowedVectorElements for vector overloads for each dimension.
    • There are assumptions that one llvm::Type* describes the overload type, such as with: GetOpFunc, GetOpFuncList, GetOverloadType, IsOverloadLegal, and m_OpCodeClassCache. The scheme used for multi-dim overloads is to encode each of the overload types in a single unnamed StructType, like type {i32, <4 x float>}. This makes it compatible with all these existing mechanisms without requiring an API overhaul impacting the broader code base. GetExtendedOverloadType is used to construct this type from multiple types.

While updating DxilOperations.h|cpp, I noticed and removed some unused methods: IsDxilOpTypeName, IsDxilOpType, IsDupDxilOpType, GetOriginalDxilOpType.

@tex3d tex3d requested a review from a team as a code owner March 25, 2025 23:05
@github-actions
Copy link
Contributor

github-actions bot commented Mar 25, 2025

✅ With the latest revision this PR passed the C/C++ code formatter.

@tex3d tex3d force-pushed the exp-multi-overload branch from 4920482 to 96d07b8 Compare March 26, 2025 00:42
Add a new native vector overload type to DXIL intrinsics.
@tex3d tex3d force-pushed the exp-multi-overload branch 4 times, most recently from c5d628f to 73a8ca8 Compare March 26, 2025 03:07
@tex3d tex3d force-pushed the exp-multi-overload branch from 73a8ca8 to 1ac4991 Compare March 26, 2025 04:18
Copy link
Collaborator

@pow2clk pow2clk left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Mostly requests for clarification, but a few comments that I think might identify areas where bad input might lead to bad output.

I'm curious if you were tempted to use the new extended mechanism for all intrinsics. The array of bools is too long to be easily readable and I think a bitfield would be no worse and more concise giving us one way of indicating supported parameter types instead of two. If we were to just allow extended types to have a single entry, might it "just work"? By that I mean essentially removing the restriction against "fd," and preventing the unneeded dot separator might get us 90% there? Naturally we wouldn't want to require the comma in all cases, but it seems that would get us where I'm thinking of under the new code if it were allowed.


// Extended Type Overloads:
// This is an encoding for a multi-dimensional overload.
// 1. Only bAllowOverload[kExtendedTypeSlot] is set to true
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not sure what these numbers indicate, but this one is most confusing to me as it seems that bAllowedOverload[kVectorTypeSlot] being set to true is also an option?

I think we are describing two possibilities here:

  1. we are using extended types:
  • only bAllowedOverload[kExtenededTypeSlot] is true.
  • ExtendedOverloads comes into play
  1. we are using at least one vector type:
  • bAllowedOverload[kExtenededTypeSlot] is true but others can be also
  • AllowedVectorElements comes into play

Is that right?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not quite.

There are basically two overload modes: 1. one overload dimension, and 2. extended overload dimensions.
These modes are orthogonal to whether or not vectors are allowed.

bAllowedOverload[kExtenededTypeSlot] tells you which mode you're in. If it's false, you're in 1. one overload dimension. If it's true, you're in 2. extended overload dimensions.

When in 1. one overload dimension:

  • bAllowedOverload[kExtenededTypeSlot] is false (that's how you know)
  • the ExtendedOverloads are unused and should be zero
  • if bAllowedOverload[kVectorTypeSlot] is false:
    • no changes to overload handling
    • AllowedVectorElements is unused/zero
  • if bAllowedOverload[kVectorTypeSlot] is true, then:
    • Other bits in bAllowedOverload may still be set to describe legal scalar overloads.
    • AllowedVectorElements[0] describes the legal vector element overloads.

When in 2. extended overload dimensions:

  • bAllowedOverload[kExtenededTypeSlot] is true (that's how you know)
  • all other bools in bAllowedOverload must be false
  • each overload dimension is described by an array element of ExtendedOverloads instead of using bAllowedOverload
  • for each dimension n:
    • ExtendedOverloads[n] is used for scalar overloads and a vector bit to indicate vector overload support
    • AllowedVectorElements[n] encodes the element overloads when ExtendedOverloads[n][kVectorTypeSlot] is set, otherwise it should be zero
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This has been heavily reworked.

I've eliminated the bAllowedOverload array in favor of a NumOverloadDims and use of the bit masks.

Recommend reviewing separate commits, and see at the commit comment for the main refactor.

DXASSERT(IsDxilOpExtendedOverload(opCode),
"otherwise, Dxil Op does not support extended overload");
return GetOpFunc(opCode, GetExtendedOverloadType(ExtendedOverloads));
}
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

To potentially clarify for others. This isn't used yet, but will be used by autogenerated code when extended overloads are introduced.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It will be used by incoming built-in linalg lowering code in HLOperationLower, but not technically by any "autogenerated" code.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh, now I know what you meant by autogenerated code, but was confused by "used by". Rather, this will lead to extracting elements in an upcoming autogenerated code path for new DXIL ops.

@pow2clk
Copy link
Collaborator

pow2clk commented Mar 26, 2025

Some minor notes about the description:

This change adds vector and multi-dimensional overload support for DXIL operations.

Given that this line presents this as a two-pronged change, I think a section describing the vector parts and one for multi-dimensional overloads might be helpful to orient readers about what is being added before getting into implementation details.. In particular because I think an explanation of what exactly multi-dimension overloads are might be helpful.

  • Updates vector character from 't' to '<'

Since the change using 't' never made it into main, I think it can be left unmentioned.

  • Allows legal vector element overloads to be specified after vector overload character.
  • Defaults legal vector element overloads to the scalar element overloads, if any.

I think these might be clearer if there's a bullet above them saying something like "Introduces the '<' character that, when placed after the scalar overloads, indicates that vectors overloads are allowed" This might replace the comment about changing 't' to '<'.

  • Adds 'x' for extended overload mechanism, which supports up to 2 overload dimensions at this point, but is easily expandable if necessary.

Since this isn't visible to users of hctdb.py, I think the second note is more important. The 'x' not being directly accessible to hctdb.py entries might bear mentioning.

  • Processes syntax using ',' to separate multiple overloads and defaulting vector element overloads into new breakdown of main overload string, vector overload list of strings, and list of extended overloads (used if main overload set to 'x').
  • DxilOperations:

    • Extend OpCodeProperty with ExtendedOverloads and AllowedVectorElements arrays using new OverloadMask.
    • When extended bit set in main overloads, ExtendedOverloads array is used for each overload dimension
    • When vector bit set in main overloads or each ExtendedOverloads dimension, AllowedVectorElements are set for corresponding dimension index for allowed element types.
    • Updated generated DXIL op table
@pow2clk
Copy link
Collaborator

pow2clk commented Mar 26, 2025

By way of guidance for other reviewers, it would be most convenient to mostly ignore the last commit as it only contains generated code. Worth glancing at to see what is added, but it really slows down the diff and causes DxilOperations.cpp to be closed by default and the more meaningful additions to get lost in the noise. I reviewed this mostly: https://github.com/microsoft/DirectXShaderCompiler/pull/7259/files/f538ebdc392bca7cb3e95752c0fdd8470249b5f5

@damyanp damyanp added this to the Next Release milestone Mar 28, 2025
tex3d added 5 commits March 28, 2025 14:41
I noticed these functions when adding multi-overload, and meant to delete them. In fact, I had deleted them from the cpp, but then my change was wiped out by a failed hctgen run. I think these were leftover from an attempt to work around a name collision ultimately caused by dx.types.CBufRet.f16|i16 between min-precision (with 4 elements) and native low precision (with 8 elements), caused by failing to initialize the min precision mode correctly for linking. That issue was fixed, and the names made unique by adding ".8" to the end of the 8-element native low precision cbuffer return type.
- Use Twine, raw_svector_ostream, and SmallVector storage to replace uses of std::string - Use SmallVector instead of std::vector for ArgTypes in GetOpFunc
- Functions IsDxilOpType and IsDxilOpTypeName are unused, this removes them.
@github-actions
Copy link
Contributor

github-actions bot commented Mar 29, 2025

✅ With the latest revision this PR passed the Python code formatter.

@tex3d tex3d force-pushed the exp-multi-overload branch 2 times, most recently from da1c4ab to dccb5d7 Compare March 29, 2025 04:22
tex3d added 2 commits March 28, 2025 21:35
Add comments explaining the new system. Eliminate bool array in favor of array of masks for up to N dimensions. Add NumOverloadDims instead of two-mode system. Rework TypeSlots: - use enum, categorize basic, limit masks to used bits - void doesn't need a type slot (NumOverloadDims == 0 instead) - m_OverloadTypeName only contains basic type names Handle multi-overload in FixOverloadNames; new MayHaveNonCanonicalOverload is used to determine whether the overload name could need fixing. Extended overload is still a distinction because of the way the overloads must be wrapped in an unnamed StructType. However, it does not need a bit in the overload mask. Renamed GetVectorType to GetStructVectorType, since it's just used to get a struct for a particular vector type, not a vector type itself. In hctdb.py, no longer separate extended and vector overloads, just verify correctness of the incoming string, and add default vector overloads if necessary. In hctdb_instrhelp.py, update according to changes in hctdb.py, and eliminate needless, problematic, outdated comment printing.
@tex3d tex3d force-pushed the exp-multi-overload branch from dccb5d7 to ad784af Compare March 29, 2025 04:35
pow2clk
pow2clk previously approved these changes Mar 31, 2025
Copy link
Collaborator

@pow2clk pow2clk left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think I have any feedback that requires a change, but I did make a few small suggestions.

return str;
raw_svector_ostream OS(Storage);
Ty->print(OS);
return OS.str();
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't see anything wrong with it, but I'm curious what advantages the change from std::string to SmallVector provides. The print to string mechanism is nice, but was possible before.

I also wonder about the twining for vector types compared to the print to string for extended types in a similar scenario. Again, I have no clear preference here, just wondering at the discrepancy.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It should be more efficient and require no heap allocations 99% of the time.

Regarding the use of Twine, that should be maximally efficient, as it will know the final size in advance of any allocation into the final buffer (which shouldn't be necessary) rather than incrementally adding content and checking if a heap allocation and copy is needed.

But Twine is most useful for a single expression that results in a single concatenation into the final string buffer and return of a StringRef. Anything more complex and you might as well use the raw_svector_ostream.

return false;
if (opCode == OpCode::NumOpCodes)
if (static_cast<unsigned>(opCode) >=
static_cast<unsigned>(OpCode::NumOpCodes))
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nit: do these need to be casted to integers to be compared?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Perhaps not, but I'm concerned that an unsanitized opCode value could be supplied which may be greater than OpCode::NumOpCodes. If there's an assumption that a value of type OpCode will only ever be set to one of the values defined in the enum, I'm concerned that an invalid opcode value will not be caught.

llvm::Type *GetCBufferRetType(llvm::Type *pOverloadType);
llvm::Type *GetVectorType(unsigned numElements, llvm::Type *pOverloadType);
llvm::Type *GetStructVectorType(unsigned numElements,
llvm::Type *pOverloadType);
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I support this renaming to clarify it's not a native vector type, but a struct for unpacked vectors. I considered suggesting adding the unpacked ness to it, but it would get kind of wordy and confusing as GetUnpackedStructVectorType

if last_category != None:
print("")
print(
" // {category:118} void, h, f, d, i1, i8, i16, i32, i64, udt, obj, vec, function attribute, ext oload, vec oload".format(
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

While I hope it was clear that my previous comment wasn't demanding this change, I won't miss the giant boolean matrix.

bob80905
bob80905 previously approved these changes Mar 31, 2025
Copy link
Collaborator

@bob80905 bob80905 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just nits, looks good to me

@pow2clk pow2clk self-requested a review April 1, 2025 00:07
@tex3d tex3d dismissed stale reviews from bob80905 and pow2clk via 15447a3 April 1, 2025 01:24
Copy link
Collaborator

@pow2clk pow2clk left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good! Thanks!

@tex3d tex3d merged commit 30bfd82 into microsoft:main Apr 1, 2025
12 checks passed
@github-project-automation github-project-automation bot moved this from New to Done in HLSL Roadmap Apr 1, 2025
@damyanp
Copy link
Member

damyanp commented Apr 1, 2025 via email

@tex3d tex3d deleted the exp-multi-overload branch April 1, 2025 20:05
@damyanp damyanp moved this to Closed in HLSL Support Apr 25, 2025
@damyanp damyanp removed this from HLSL Support Jun 25, 2025
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

4 participants