Clang/buildFMulAdd: Use negated attribute #121038
base: main
Conversation
This reverts commit 67789aa.
Use the negated attribute when negMul or negAdd is set, so that fneg+fmuladd can be lowered to fmul+fsub when needed. 1) It can save one machine instruction: fneg/fmul/fadd vs. fmul/fsub. 2) In strict mode, `c-a*b` may differ from `c+(-a)*b`.
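For illustration only (not part of the patch), here is a minimal C++ sketch of point 2. Under a directed rounding mode, which strict mode must honor, `(-a)*b` and `-(a*b)` can round to different values, so rewriting `c - a*b` as `c + (-a)*b` is not semantics-preserving; the negated marker records that such a rewrite happened so it can be undone later. The values of `a` and `b` below are chosen only so that the product is inexact; build without optimization (or with a strict FP model) so the dynamic rounding mode is respected.

```cpp
// Sketch: -(a*b) and (-a)*b need not be bitwise equal once the rounding mode
// is not round-to-nearest.
#include <cfenv>
#include <cfloat>
#include <cstdio>

int main() {
  std::fesetround(FE_UPWARD);             // directed rounding
  volatile double a = 1.0 + DBL_EPSILON;  // volatile keeps the products from
  volatile double b = 1.0 + DBL_EPSILON;  // being folded at compile time
  double neg_of_product = -(a * b);       // round a*b up, then negate
  double product_of_neg = (-a) * b;       // negate a, then round (-a)*b up
  // The exact product 1 + 2*eps + eps^2 is not representable, so the two
  // expressions land on adjacent doubles and differ by one ulp.
  std::printf("-(a*b) = %.17g\n(-a)*b = %.17g\n", neg_of_product, product_of_neg);
  std::fesetround(FE_TONEAREST);
  return 0;
}
```

On a typical IEEE-754 target this prints two values one ulp apart, which is exactly the discrepancy the strict-mode lowering has to avoid.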
@llvm/pr-subscribers-llvm-ir @llvm/pr-subscribers-clang-codegen @llvm/pr-subscribers-clang @llvm/pr-subscribers-llvm-transforms

Author: YunQiang Su (wzssyqa)

Changes: Use the negated attribute if negMul or negAdd, so that we can lower fneg+fmuladd to fmul+fsub if needed.

Full diff: https://github.com/llvm/llvm-project/pull/121038.diff

7 Files Affected:
```diff
diff --git a/clang/lib/CodeGen/CGExprScalar.cpp b/clang/lib/CodeGen/CGExprScalar.cpp
index 4b71bd730ce12c..14d73de055d8ec 100644
--- a/clang/lib/CodeGen/CGExprScalar.cpp
+++ b/clang/lib/CodeGen/CGExprScalar.cpp
@@ -4120,6 +4120,10 @@ static Value* buildFMulAdd(llvm::Instruction *MulOp, Value *Addend,
           CGF.CGM.getIntrinsic(llvm::Intrinsic::experimental_constrained_fmuladd,
                                Addend->getType()),
           {MulOp0, MulOp1, Addend});
+      if (negMul)
+        dyn_cast<llvm::CallBase>(FMulAdd)->addParamAttr(0, llvm::Attribute::Negated);
+      if (negAdd)
+        dyn_cast<llvm::CallBase>(FMulAdd)->addParamAttr(2, llvm::Attribute::Negated);
   } else {
     FMulAdd = Builder.CreateCall(
         CGF.CGM.getIntrinsic(llvm::Intrinsic::fmuladd, Addend->getType()),
diff --git a/clang/test/CodeGen/constrained-math-builtins.c b/clang/test/CodeGen/constrained-math-builtins.c
index 68b9e75283c547..f044f15e98918b 100644
--- a/clang/test/CodeGen/constrained-math-builtins.c
+++ b/clang/test/CodeGen/constrained-math-builtins.c
@@ -392,12 +392,12 @@ void bar(float f) {
   // CHECK: call float @llvm.experimental.constrained.fmuladd.f32(float %{{.*}}, float %{{.*}}, float %{{.*}}, metadata !"round.tonearest", metadata !"fpexcept.strict")
   // CHECK: fneg
-  // CHECK: call double @llvm.experimental.constrained.fmuladd.f64(double %{{.*}}, double %{{.*}}, double %{{.*}}, metadata !"round.tonearest", metadata !"fpexcept.strict")
+  // CHECK: call double @llvm.experimental.constrained.fmuladd.f64(double %{{.*}}, double %{{.*}}, double negated %{{.*}}, metadata !"round.tonearest", metadata !"fpexcept.strict")
   // CHECK: fneg
   // CHECK: call x86_fp80 @llvm.experimental.constrained.fmuladd.f80(x86_fp80 %{{.*}}, x86_fp80 %{{.*}}, x86_fp80 %{{.*}}, metadata !"round.tonearest", metadata !"fpexcept.strict")
   // CHECK: fneg
   // CHECK: fneg
-  // CHECK: call float @llvm.experimental.constrained.fmuladd.f32(float %{{.*}}, float %{{.*}}, float %{{.*}}, metadata !"round.tonearest", metadata !"fpexcept.strict")
+  // CHECK: call float @llvm.experimental.constrained.fmuladd.f32(float negated %{{.*}}, float %{{.*}}, float negated %{{.*}}, metadata !"round.tonearest", metadata !"fpexcept.strict")
   // CHECK: fneg
-  // CHECK: call float @llvm.experimental.constrained.fmuladd.f32(float %{{.*}}, float %{{.*}}, float %{{.*}}, metadata !"round.tonearest", metadata !"fpexcept.strict")
+  // CHECK: call float @llvm.experimental.constrained.fmuladd.f32(float negated %{{.*}}, float %{{.*}}, float %{{.*}}, metadata !"round.tonearest", metadata !"fpexcept.strict")
 };
diff --git a/llvm/docs/LangRef.rst b/llvm/docs/LangRef.rst
index 7e01331b20c570..bf37e6a788c4b6 100644
--- a/llvm/docs/LangRef.rst
+++ b/llvm/docs/LangRef.rst
@@ -1573,6 +1573,12 @@ Currently, only the following parameter attributes are defined:
    | pinf  | Positive infinity    | 512           |
    +-------+----------------------+---------------+
 
+``negated``
+    The function parameter marked with this attribute is negated from
+    its opposite number by the frontend like Clang. The middle end or
+    backend should convert it back if possible. For example if -(a*b)
+    is converted to (-a)*b, the arg0 of `fmul` instruction should be
+    marked with `negated` attribute.
+
 ``alignstack(<n>)``
     This indicates the alignment that should be considered by the backend when
diff --git a/llvm/include/llvm/Bitcode/LLVMBitCodes.h b/llvm/include/llvm/Bitcode/LLVMBitCodes.h
index 21fd27d9838db7..7e9d174db22026 100644
--- a/llvm/include/llvm/Bitcode/LLVMBitCodes.h
+++ b/llvm/include/llvm/Bitcode/LLVMBitCodes.h
@@ -788,6 +788,7 @@ enum AttributeKindCodes {
   ATTR_KIND_NO_EXT = 99,
   ATTR_KIND_NO_DIVERGENCE_SOURCE = 100,
   ATTR_KIND_SANITIZE_TYPE = 101,
+  ATTR_KIND_NEGATED = 102,
 };
 
 enum ComdatSelectionKindCodes {
diff --git a/llvm/include/llvm/IR/Attributes.td b/llvm/include/llvm/IR/Attributes.td
index 61955cf883c3f1..baeca5d53f3c46 100644
--- a/llvm/include/llvm/IR/Attributes.td
+++ b/llvm/include/llvm/IR/Attributes.td
@@ -162,6 +162,9 @@ def Memory : IntAttr<"memory", IntersectCustom, [FnAttr]>;
 /// Forbidden floating-point classes.
 def NoFPClass : IntAttr<"nofpclass", IntersectCustom, [ParamAttr, RetAttr]>;
 
+/// Converted from the opposite number
+def Negated : EnumAttr<"negated", IntersectAnd, [ParamAttr, RetAttr]>;
+
 /// Function must be optimized for size first.
 def MinSize : EnumAttr<"minsize", IntersectPreserve, [FnAttr]>;
diff --git a/llvm/lib/Bitcode/Writer/BitcodeWriter.cpp b/llvm/lib/Bitcode/Writer/BitcodeWriter.cpp
index b4efd3928a2e6f..e87c9d2e13883d 100644
--- a/llvm/lib/Bitcode/Writer/BitcodeWriter.cpp
+++ b/llvm/lib/Bitcode/Writer/BitcodeWriter.cpp
@@ -755,6 +755,8 @@ static uint64_t getAttrKindEncoding(Attribute::AttrKind Kind) {
     return bitc::ATTR_KIND_MEMORY;
   case Attribute::NoFPClass:
     return bitc::ATTR_KIND_NOFPCLASS;
+  case Attribute::Negated:
+    return bitc::ATTR_KIND_NEGATED;
   case Attribute::Naked:
     return bitc::ATTR_KIND_NAKED;
   case Attribute::Nest:
diff --git a/llvm/lib/Transforms/Utils/CodeExtractor.cpp b/llvm/lib/Transforms/Utils/CodeExtractor.cpp
index 7ddb9e22c83441..4e1a8c560078aa 100644
--- a/llvm/lib/Transforms/Utils/CodeExtractor.cpp
+++ b/llvm/lib/Transforms/Utils/CodeExtractor.cpp
@@ -918,6 +918,7 @@ Function *CodeExtractor::constructFunctionDeclaration(
   case Attribute::PresplitCoroutine:
   case Attribute::Memory:
   case Attribute::NoFPClass:
+  case Attribute::Negated:
   case Attribute::CoroDestroyOnlyWhenComplete:
   case Attribute::CoroElideSafe:
   case Attribute::NoDivergenceSource:
```
Depends on #121027
You can test this locally with the following command:

git-clang-format --diff 4cb2a519db10f54815c8a4ccd5accbedc1cdfd07 9a8925b18e609ac646b2c16da81264a261545513 --extensions cpp,h,c -- clang/lib/CodeGen/CGExprScalar.cpp clang/test/CodeGen/constrained-math-builtins.c llvm/include/llvm/Bitcode/LLVMBitCodes.h llvm/lib/Bitcode/Writer/BitcodeWriter.cpp llvm/lib/Transforms/Utils/CodeExtractor.cpp

View the diff from clang-format here:

```diff
diff --git a/clang/lib/CodeGen/CGExprScalar.cpp b/clang/lib/CodeGen/CGExprScalar.cpp
index 14d73de055..367ca42053 100644
--- a/clang/lib/CodeGen/CGExprScalar.cpp
+++ b/clang/lib/CodeGen/CGExprScalar.cpp
@@ -4121,9 +4121,11 @@ static Value* buildFMulAdd(llvm::Instruction *MulOp, Value *Addend,
                                Addend->getType()),
           {MulOp0, MulOp1, Addend});
       if (negMul)
-        dyn_cast<llvm::CallBase>(FMulAdd)->addParamAttr(0, llvm::Attribute::Negated);
+        dyn_cast<llvm::CallBase>(FMulAdd)->addParamAttr(0,
+                                                        llvm::Attribute::Negated);
       if (negAdd)
-        dyn_cast<llvm::CallBase>(FMulAdd)->addParamAttr(2, llvm::Attribute::Negated);
+        dyn_cast<llvm::CallBase>(FMulAdd)->addParamAttr(2,
+                                                        llvm::Attribute::Negated);
   } else {
     FMulAdd = Builder.CreateCall(
         CGF.CGM.getIntrinsic(llvm::Intrinsic::fmuladd, Addend->getType()),
```
arsenm left a comment
Can't really use an attribute for this. Attributes should not be used for adding constraints to an operation
```cpp
                               Addend->getType()),
          {MulOp0, MulOp1, Addend});
      if (negMul)
        dyn_cast<llvm::CallBase>(FMulAdd)->addParamAttr(0, llvm::Attribute::Negated);
```
unchecked dyn_cast
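A hedged sketch (not the author's committed fix) of one way the unchecked dyn_cast could be addressed: null-check the result of `dyn_cast<>`, or use `cast<>`, which asserts that the cast succeeds. `FMulAdd`, `negMul`, and `negAdd` refer to the surrounding buildFMulAdd code quoted above.

```cpp
// dyn_cast<> returns nullptr on failure, so dereferencing it unchecked can
// crash; guard the result, or use cast<> since CreateCall returns a call here.
if (auto *CB = llvm::dyn_cast<llvm::CallBase>(FMulAdd)) {
  if (negMul)
    CB->addParamAttr(0, llvm::Attribute::Negated);
  if (negAdd)
    CB->addParamAttr(2, llvm::Attribute::Negated);
}
// Or, asserting the type instead of silently skipping:
//   llvm::cast<llvm::CallBase>(FMulAdd)->addParamAttr(0, llvm::Attribute::Negated);
```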
Use the negated attribute if negMul or negAdd, so that we can lower
fneg+fmuladd to fmul+fsub if needed.
1) It can save one machine instruction: fneg/fmul/fadd vs. fmul/fsub.
2) In strict mode, `c-a*b` may differ from `c+(-a)*b`.