With there now being up to 8 million integer types, we need the ability to overload intrinsic functions by the type of their arguments. To this end, this PR tracks work that I'm doing to: 1. Create a new MVT::ValueType, iAny which is a place holder for an integer type of any bit width 2. Use iAny in the definition of "llvm_int_ty" in Intrinsics.td 3. Use llvm_int_ty in the definition of overloaded intrinsics that work on a variety of integer bit widths. For example llvm.ctpop will now work on any integer bit width. 4. Convert existing intrinsics that should be overloaded to use the overloading mechanisms (bswap, ctpop, cttz, ctlz). 5. Enhance TableGen to deal with all of the above, especially the generation of the recognizers where only the prefix should be matched for overloaded intrinsics. 6. Implement new intrinsics to support bit manipulation of integer types: bit_select (select one bit from many) bit_set (set a bit in an integer of arbitrary width) bit_part_set (set a range of bits in an arbitrary integer) bit_part_select (select a set of bits from an integer type) bit_concat (concat two integer values of arbitrary width) bit_and_reduce (reduce an iAny to i1 by applying AND bitwise) bit_or_reduce (reduce an iAny to i1 by applying OR bitwise) bit_xor_reduce (reduce an iAny to i1 by applying XOR bitwise) bit_nand_reduce (reduce an iAny to i1 by applying NAND bitwise) bit_nor_reduce (reduce an iAny to i1 by applying NOR bitwise) bit_nxor_reduce (reduce an iAny to i1 by applying NXOR bitwise) 7. Make the Verifier check the names of overloaded intrinsics to ensure that the suffix is of the correct form (. followed by the actual type for each iAny type). 8. Implement lowering of the overloaded intrinsics so that they can be executed regardless of whether a target supports lowering them itself (to an instruction). Targets (and CBE and LLI) can decide which of these intrinsics they wish to support directly. The others will be lowered to internal functions of a similar name (. -> _) with bodies defined in LLVM IR. These function bodies would then need to be code generated. Only those overloadings actually used (called) will be generated.
sounds good to me
Created attachment 740 [details] Patch to implement these features This is an initial patch that implements the intrinsic overloading feature. The ctpop, cttz, ctlz, and bswap intrinsics are modified to use the overloading. New bitwise intrinsics are declared in Intrinsics.td, but not implemented. Changes to support the new mnemonics (without the type) are made in InstructionCombining, ConstantFolding, SelectionDAGISel, and IntrinsicLowering. Remaining to do: 1. Revise InstructionCombining to deal with intrinsics that need a function body generated for them if they aren't handled by the target/codegen 2. Implement the new bitwise intrinsics (conversion from a pass that I have) 3. Testing.
Created attachment 741 [details] Updated patch to implement overloaded intrinsics This patch replaces the previous patch. It corrects a few typos, declares three more bitwise overloaded intrinsics, and supports usage by front ends. In particular, the front ends want to get a FunctionType for a given intrinsic. With overloading, the FE will need to specify the actual types of the arguments. This patch changes the getName, getType, and getDeclaration methods to include two new parameters: an array of Type* and a count for the size of the array. These parameters are provided by the FE to specify the actual types of any type place holders in the intrinsic's definition. Patch for llvm-gcc to follow shortly.
Created attachment 742 [details] llvm-gcc patch This patch just simplifies the ctpop/ctlz/cttz handling so that it can work on any integer bit width and is updated for the intrface changes in Intrinsics.h
Chris, The patches attached to this PR (overloaded intrinsics) are ready for review. Thanks, Reid.
Comments here: http://lists.cs.uiuc.edu/pipermail/llvm-commits/Week-of-Mon-20070326/046576.html In general, please email patches to llvm-commits. I will see them sooner than if you attach them to a bug. Thanks! -Chris
I generally do but mentioned adding this one to the PR in IRC and you said "okay" and I also sent you an explicit email reminder. Anyway, thinks for the review :) > In Intrinsics.td, it looks like you didn't choose to make the "count" > intrinsics always return i32. As such, you need to update the > LangRef.html doc to talk about bswap.i32.i32 (etc) and ctpop.i47.i47, > not bswap.i32 and ctpop.i47. Actually, I did and reverted it. For some reason I care not to investigate this causes no end of trouble for SelectionDAG. It wants the parameter and result widths to be the same. After fixing a couple asserts it then started producing bad code so I just reverted that portion of the change as its a separate issue from this PR. If you like, it can be tackled later. Furthermore, the name change you suggest above is incompatible with the existing convetion. The return type is not included in the overloading suffix. Rather than impact a large number of .ll files (llvm/test), front ends, and etc. I figured it was fine to ignore the result type because we don't have any intrinsics that are overloaded solely by result type. I'll implement all your other suggestions. This one we can revisit separately.
> For some reason I care not to investigate this > causes no end of trouble for SelectionDAG. It wants the parameter and result > widths to be the same. Right. In SDISel, you need to lower it to create the ctpop node (f.e.) with matching source and destination types. If the destination is not i32, you would then extend or truncate the node's result to the appropriate size. Once you do this, you won't have to touch any of the rest of the code generator. > Furthermore, the name > change you suggest above is incompatible with the existing convetion. The return > type is not included in the overloading suffix. This is a serious problem. > Rather than impact a large number of .ll files (llvm/test), front ends, and etc. I figured it was fine to > ignore the result type because we don't have any intrinsics that are overloaded solely by result type. Perhaps not yet, but we may in the future. The name should be mangled to include one suffix for every "overloadable" aspect of its signature. If it has 13 variable width arguments and an arbitrary result, it needs 14 suffixes. If you want to avoid changing everything, I think that fixing the codegen and making the "count" intrinsics always return i32 is the right way to go. -Chris
Okay, I'll give SDISel one more shot. If it doesn't get happy then I'm going to commit this the way it is as an increment (its big enough already). Either way, I will add the return type overloading in a subsequent patch.
Created attachment 750 [details] llvm-gcc patch (revised) This patch corrects handling of the "actual" types passed to getDeclaration. The result type, although not overloaded, must also be included in the vector of types.
This is done, right?
This is now done and working with 5 intrinsics.