Skip precise-related scalarization where native vectors supported#8529
Skip precise-related scalarization where native vectors supported#8529pow2clk wants to merge 2 commits into
Conversation
This caused a crash when an element was extracted from a native vector alloca because it used a GEP which was not among the expected operations. Shouldn't be scalarizing native vectors in this case anyway. It was done during DXIL's partial mem2reg to keep the precise indication applied where mem2reg'ing would have erased it. As part of that, they did an SROA because it is skipped elsewhere for the sake of this pass. This allows the vectors to be represented natively as precise. In addition, the marking of the vector as precise using a specialized call is modified to allow vectors, preventing unnecessary extraction and replacement, which didn't cause any failures, but wasn't necessary for 6.9+ Finally, changes the way precise is indicated since it relied on applying metadata to a dx.attribute.precise function that lacked a body. This disallows pass testing for precise since the verifier objects to bodyless functions having metadata. It is simpler and more consistent to apply attributes to the function the way dxilnoop and similar functions do anyway Fixes microsoft#8528
The dx.annotate.precise() internal function call is used to mark variables as precise and is identified by the metadata associated with it. The function never makes it into final dxil. It serves only as an intermediate mechanism to mark things as precise. The LL assembler verifier rejects llvm IR that has a function with no body with attached metadata. In order to have pass tests that involve precise variables indicated by this temporarly internal call, it has to be marked some other way. This changes the precise marking to a function attribute that is used in other places such as the temporary dxil noop call. In practice, it has little impact as variables themselves are still marked as precise using metadata where and when possible. Instead of sharing that mechanism, this gives the temporary function its own that satisfies the LLVM assembler Includes the pass tests for precise native vectors that this change makes possible.
|
Note that the second commit may be excluded with no effect on the compiler functionality, however, the pass tests will be impossible without it or an equivalent fix due to the LLVM IR assembler verifier. I left it separate in case there was anxiety about changing the way the dx.attribute.precise call is annotated. I'd appreciate @dmpots's insight into this mechanism if he remembers how it came about. #323 didn't introduce the mechanism, but it suggests he may have once been familiar with it checks date 9 YEARS AGO! How is that possible? |
|
/azp run |
|
Azure Pipelines successfully started running 1 pipeline(s). |
There was a problem hiding this comment.
Pull request overview
This PR fixes a crash in dxil-cond-mem2reg when operating on precise native vector allocas under SM 6.9+ by avoiding unnecessary scalarization, and improves how “precise” is represented/propagated for native vectors (including updating the marker mechanism to use function attributes instead of function metadata).
Changes:
- Skip precise-vector alloca scalarization in
DxilConditionalMem2Regwhen SM 6.9+ native vector support is available. - Update precise-marking helper generation to support vector overloads under SM 6.9+, and switch “precise marker” identification from function metadata to a
"dx.precise"function attribute. - Add targeted regression tests covering SROA and conditional mem2reg behavior for precise native vectors in SM 6.9+.
Reviewed changes
Copilot reviewed 7 out of 7 changed files in this pull request and generated 2 comments.
Show a summary per file
| File | Description |
|---|---|
| tools/clang/test/CodeGenDXIL/passes/longvec-precise-sroa.ll | New pass test ensuring vector precise marking uses a vector overload call (no scalar extraction) under SM 6.9+. |
| tools/clang/test/CodeGenDXIL/passes/longvec-precise-mem2reg.ll | New pass test ensuring dxil-cond-mem2reg preserves precise native vector allocas (no scalarization) under SM 6.9+. |
| tools/clang/test/CodeGenDXIL/hlsl/types/longvec-precise.hlsl | New source HLSL test for precise native vector element update patterns under SM 6.9+. |
| lib/Transforms/Scalar/DxilConditionalMem2Reg.cpp | Skips precise-vector alloca scalarization when SM 6.9+ native vectors are supported. |
| lib/HLSL/HLModule.cpp | Enables vector overloads for precise marking under SM 6.9+ and uses a function attribute for precise marker functions. |
| lib/DXIL/DxilModule.cpp | Adds DXIL::kPreciseString constant ("dx.precise"). |
| include/dxc/DXIL/DxilConstants.h | Declares DXIL::kPreciseString. |
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
|
|
||
| precise float4 main (float4 pos : POSITION, float4 scale : SCL, float4 shift : OFF) : SV_Position { | ||
| precise float4 position = pos; | ||
| // Initial multiplication to avoid optimizaton just using the input scalar |
| // RUN: %dxc -T vs_6_9 %s | FileCheck %s | ||
| // Tests a specific case of a precise native vector requiring extraction | ||
| // and reinsertion during the alloca phase where conditionalmem2reg is concerned. | ||
| // Serves as the source for longvec-precise.ll and its specific pass tests |
llvm-beanz
left a comment
There was a problem hiding this comment.
I'm not sure this change is safe or legal to make without gating on a shader model.
DXC only ever sets dx.precise as metadata, not as a function attribute, this change makes it a function attribute.
I don't believe we have any testing that expects drivers to read or respect the attribute as a function attribute, so I'm not sure this transformation results in legal DXIL generation.
I believe this only impacts intermediate IR. The function attribute is used specifically for an internal function |
There was a problem hiding this comment.
I think this change looks good to merge.
WRT. #323 and your question to @dmpots, that change impacted the interface for precise metadata on an Instruction (used in DXIL) rather than the intermediate high-level Function metadata, which is not intended to survive DXIL lowering.
This should be safe to merge as far as know.
(I should note that some minor, valid copilot comments could still be addressed before merging)
This caused a crash when an element was extracted from a native vector alloca because it used a GEP which was not among the expected operations. Shouldn't be scalarizing native vectors in this case anyway. It was done during DXIL's partial mem2reg to keep the precise indication applied where mem2reg'ing would have erased it. As part of that, they did an SROA because it is skipped elsewhere for the sake of this pass. This allows the vectors to be represented natively as precise.
In addition, the marking of the vector as precise using a specialized call is modified to allow vectors, preventing unnecessary extraction and replacement, which didn't cause any failures, but wasn't necessary for 6.9+
Finally, changes the way precise is indicated since it relied on applying metadata to a dx.attribute.precise function that lacked a body. This disallows pass testing for precise since the verifier objects to bodyless functions having metadata. It is simpler and more consistent to apply attributes to the function the way dxilnoop and similar functions do anyway
Fixes #8528