A lot of code uses X->getType()->isFloatingPoint() to disable transformations that are unsafe on FP math. We should audit these and change them to call Ty->isFPOrFPVector() if they are also unsafe for vectors of floats (which is probably most of them). Of course, it would be far better to just have FADD/FMUL/etc instructions separate from ADD/MUL, etc, but that will take time. -Chris
Created attachment 578 [details] pr970-1.patch Performing this audit found several minor issues. This is the first of a small series of patches. It includes changes which should be uncontroversial, including several typo fixes and stripping some code which was found to be dead. There's one diff which doesn't read well; it's just removing an if block and correcting interior indentation.
Created attachment 579 [details] pr970-2.patch This patch modifies several assertions to be correct. Comparison instructions do not apply to vectors, and SCEV will introduce canonical induction variables only of an integer type.
Created attachment 580 [details] pr970-3.patch This patch includes two changes I'm not entirely certain of. First, in PatternMatch.h, the m_Neg matcher is incorrect for VP vectors. It matches sub <+0.0, +0.0, ...>, %x but the correct pattern is sub <-0.0, -0.0, ...>, %x. Since this matcher is dead, I simply commented it out with a note that it was broken. It would be simple to delete or fix it instead. Second, in GlobalOpt.cpp, the ShrinkGlobalToBoolean transformation excludes FP, but allows vectors. It doesn't explain why. FP exclusion was added last November to fix a crash, it looks like. I made it exclude vectors, too, but there's no overwhelming reason why.
Created attachment 581 [details] pr970-4.patch This patch fixes several routines which deal with negation but failed to use the proper floating point negation formula, f(x) = -0.0 - x, for vectors. I simply factored out the logic to get the properly-signed "zero" value and extended it to get the correct floating-point vector. In most cases, these local logic errors were harmless because callers operated exclusively on scalar integers, but in four cases there look to be potential problems or missed optimizations. This patch fixes the dead m_Neg matcher that patch 3 commented out. at lib/Transforms/Scalar/InstructionCombining.cpp:6788: NegVal = InsertNewInstBefore( BinaryOperator::createNeg(SubOp->getOperand(1), "tmp"), SI); This can pass vectors to createNeg and looks like it might miscompile code by inserting a sub <+0.0, +0.0, etc.>, %x as it simplifies an expression. This patch should fix this case. at lib/Target/CBackend/CBackend.cpp:2129 This is now PR1126. at lib/Transforms/Scalar/InstructionCombining.cpp:325: Please review this call site in getComplexity. It is clearly reachable with FP vector operands, and isNeg will now return true for vector negation, which is not a single machine instruction as fneg or integer negation are. Therefore, existing behavior might be preferable. If so, exclude vectors: - if (BinaryOperator::isNeg(V) || BinaryOperator::isNot(V)) + if ((!isa<PackedType>(V->getType()) && BinaryOperator::isNeg(V)) || + BinaryOperator::isNot(V)) at lib/Transforms/Scalar/InstructionCombining.cpp:473: static inline Value *dyn_castNegVal(Value *V) { if (BinaryOperator::isNeg(V)) return BinaryOperator::getNegArgument(V); This could clearly be invoked with FP vector operands. With this patch, the following transformations would be extended to FP vectors: -A + B --> B - A A + -B --> A - B A - (-B) --> A + B -X * -Y --> X * Y These would already have applied to scalar FP, so I have to assume that they're IEEE-approved. Just one more!
Created attachment 582 [details] pr970-5.patch This final patch simplifies the logic in SelectionDAGISel.cpp. No functional change. This construct was in use: if (I.getType()->isFloatingPoint()) visitFPBinary(I, ISD::FADD, ISD::VADD); else visitIntBinary(I, ISD::ADD, ISD::VADD); Which confusingly sent even floating-point vector operations through the "visitIntBinary" routine. However, this was not a bug because visitFPBinary and visitIntBinary were identical (as is the vector opcode). This patch streamlines the logic.
Oh, and one more thing... In LowerSelect.cpp: bool LowerSelect::runOnFunction(Function &F) { bool Changed = false; for (Function::iterator BB = F.begin(), E = F.end(); BB != E; ++BB) for (BasicBlock::iterator I = BB->begin(), E = BB->end(); I != E; ++I) { if (SelectInst *SI = dyn_cast<SelectInst>(I)) if (!OnlyFP || SI->getType()->isFloatingPoint()) { // Split this basic block in half right before the select instruction. This pass is dead except for 'opt -lowerselect', so there's no client to say which (isFloatingPoint or isFPOrFPVector) is the correct behavior for the OnlyFP option. The pass could be deleted, the OnlyFP option disabled, the test switched to !OnlyFP || SI->getType()->isFPOrFPVector(), or the pass extended to lower any combination of FP and integer scalars and vectors.
Gordon, With a few minor cleanups, these patches have all been committed. Thanks for the great patches! Reid.
Mine.
Thanks, Reid.