1297 – Support Overloaded Intrinsic Functions

LLVM Bugzilla is read-only and represents the historical archive of all LLVM issues filled before November 26, 2021. Use github to submit LLVM bugs

Bug 1297 - Support Overloaded Intrinsic Functions

Summary: Support Overloaded Intrinsic Functions

Status:	RESOLVED FIXED

Alias:	None

Product:	tools
Classification:	Unclassified
Component:	TableGen (show other bugs)
Version:	trunk
Hardware:	All All

Importance:	P enhancement
Assignee:	Reid Spencer

URL:
Keywords:	new-feature

Depends on:
Blocks:

Reported:	2007-03-30 21:39 PDT by Reid Spencer
Modified:	2010-02-22 12:46 PST (History)
CC List:	3 users (show)

See Also:
Fixed By Commit(s):

Attachments
Patch to implement these features (32.24 KB, patch) 2007-03-31 09:17 PDT, Reid Spencer	Details
Updated patch to implement overloaded intrinsics (39.23 KB, patch) 2007-03-31 10:44 PDT, Reid Spencer	Details
llvm-gcc patch (4.20 KB, patch) 2007-03-31 10:47 PDT, Reid Spencer	Details
llvm-gcc patch (revised) (5.10 KB, patch) 2007-04-01 03:37 PDT, Reid Spencer	Details
Show Obsolete (2) Add an attachment (proposed patch, testcase, etc.)

Note You need to log in before you can comment on or make changes to this bug.

Description Reid Spencer 2007-03-30 21:39:59 PDT

With there now being up to 8 million integer types, we need the ability to
overload intrinsic functions by the type of their arguments. To this end, this
PR tracks work that I'm doing to:

1. Create a new MVT::ValueType, iAny which is a place holder for an integer type
   of any bit width
2. Use iAny in the definition of "llvm_int_ty" in Intrinsics.td
3. Use llvm_int_ty in the definition of overloaded intrinsics that work on a 
   variety of integer bit widths. For example llvm.ctpop will now work on any
   integer bit width.
4. Convert existing intrinsics that should be overloaded to use the overloading
   mechanisms (bswap, ctpop, cttz, ctlz).
5. Enhance TableGen to deal with all of the above, especially the generation of
   the recognizers where only the prefix should be matched for overloaded
   intrinsics.
6. Implement new intrinsics to support bit manipulation of integer types:
     bit_select (select one bit from many) 
     bit_set (set a bit in an integer of arbitrary width)
     bit_part_set (set a range of bits in an arbitrary integer)
     bit_part_select (select a set of bits from an integer type)
     bit_concat (concat two integer values of arbitrary width)
     bit_and_reduce (reduce an iAny to i1 by applying AND bitwise)
     bit_or_reduce (reduce an iAny to i1 by applying OR bitwise)
     bit_xor_reduce (reduce an iAny to i1 by applying XOR bitwise)
     bit_nand_reduce (reduce an iAny to i1 by applying NAND bitwise)
     bit_nor_reduce (reduce an iAny to i1 by applying NOR bitwise)
     bit_nxor_reduce (reduce an iAny to i1 by applying NXOR bitwise)
7. Make the Verifier check the names of overloaded intrinsics to ensure that
   the suffix is of the correct form (. followed by the actual type for each
   iAny type).
8. Implement lowering of the overloaded intrinsics so that they can be executed
   regardless of whether a target supports lowering them itself (to an
   instruction).  Targets (and CBE and LLI) can decide which of these intrinsics
   they wish to support directly. The others will be lowered to internal 
   functions of a similar name (. -> _) with bodies defined in LLVM IR. These
   function bodies would then need to be code generated. Only those overloadings
   actually used (called) will be generated.

Comment 1 Chris Lattner 2007-03-30 22:03:45 PDT

sounds good to me

Comment 2 Reid Spencer 2007-03-31 09:17:37 PDT

Created attachment 740 [details]
Patch to implement these features

This is an initial patch that implements the intrinsic overloading feature. The
ctpop, cttz, ctlz, and bswap intrinsics are modified to use the overloading.
New bitwise intrinsics are declared in Intrinsics.td, but not implemented.
Changes to support the new mnemonics (without the type) are made in
InstructionCombining, ConstantFolding, SelectionDAGISel, and IntrinsicLowering.


Remaining to do: 
1. Revise InstructionCombining to deal with intrinsics that need a function 
   body generated for them if they aren't handled by the target/codegen
2. Implement the new bitwise intrinsics (conversion from a pass that I have)
3. Testing.

Comment 3 Reid Spencer 2007-03-31 10:44:39 PDT

Created attachment 741 [details]
Updated patch to implement overloaded intrinsics

This patch replaces the previous patch. It corrects a few typos, declares three
more bitwise overloaded intrinsics, and supports usage by front ends. In
particular, the front ends want to get a FunctionType for a given intrinsic.
With overloading, the FE will need to specify the actual types of the
arguments. This patch changes the getName, getType, and getDeclaration methods
to include two new parameters: an array of Type* and a count for the size of
the array. These parameters are provided by the FE to specify the actual types
of any type place holders in the intrinsic's definition. 

Patch for llvm-gcc to follow shortly.

Comment 4 Reid Spencer 2007-03-31 10:47:58 PDT

Created attachment 742 [details]
llvm-gcc patch

This patch just simplifies the ctpop/ctlz/cttz handling so that it can work on
any integer bit width and is updated for the intrface changes in Intrinsics.h

Comment 5 Reid Spencer 2007-03-31 10:49:06 PDT

Chris,

The patches attached to this PR (overloaded intrinsics) are ready for review.

Thanks,

Reid.

Comment 6 Chris Lattner 2007-03-31 21:12:12 PDT

Comments here:
http://lists.cs.uiuc.edu/pipermail/llvm-commits/Week-of-Mon-20070326/046576.html

In general, please email patches to llvm-commits.  I will see them sooner than if you attach them to a bug.  
Thanks!

-Chris

Comment 7 Reid Spencer 2007-03-31 23:52:20 PDT

I generally do but mentioned adding this one to the PR in IRC and you said
"okay" and I also sent you an explicit email reminder.  Anyway, thinks for the
review :)

> In Intrinsics.td, it looks like you didn't choose to make the "count"  
> intrinsics always return i32.  As such, you need to update the  
> LangRef.html doc to talk about bswap.i32.i32 (etc) and ctpop.i47.i47,  
> not bswap.i32 and ctpop.i47.

Actually, I did and reverted it. For some reason I care not to investigate this
causes no end of trouble for SelectionDAG. It wants the parameter and result
widths to be the same. After fixing a couple asserts it then started producing
bad code so I just reverted that portion of the change as its a separate issue
from this PR. If you like, it can be tackled later.  Furthermore, the name
change you suggest above is incompatible with the existing convetion. The return
type is not included in the overloading suffix. Rather than impact a large
number of .ll files (llvm/test), front ends, and etc. I figured it was fine to
ignore the result type because we don't have any intrinsics that are overloaded
solely by result type. 

I'll implement all your other suggestions. This one we can revisit separately.

Comment 8 Chris Lattner 2007-04-01 00:41:21 PDT

> For some reason I care not to investigate this
> causes no end of trouble for SelectionDAG. It wants the parameter and result
> widths to be the same.

Right.  In SDISel, you need to lower it to create the ctpop node (f.e.) with matching source and 
destination types.  If the destination is not i32, you would then extend or truncate the node's result to 
the appropriate size.  Once you do this, you won't have to touch any of the rest of the code generator.

> Furthermore, the name
> change you suggest above is incompatible with the existing convetion. The return
> type is not included in the overloading suffix.

This is a serious problem.

> Rather than impact a large number of .ll files (llvm/test), front ends, and etc. I figured it was fine to
> ignore the result type because we don't have any intrinsics that are overloaded solely by result type. 

Perhaps not yet, but we may in the future.  The name should be mangled to include one suffix for every 
"overloadable" aspect of its signature.  If it has 13 variable width arguments and an arbitrary result, it 
needs 14 suffixes.

If you want to avoid changing everything, I think that fixing the codegen and making the "count" 
intrinsics always return i32 is the right way to go.

-Chris

Comment 9 Reid Spencer 2007-04-01 00:51:23 PDT

Okay, I'll give SDISel one more shot. If it doesn't get happy then I'm going to
commit this the way it is as an increment (its big enough already).

Either way, I will add the return type overloading in a subsequent patch.

Comment 10 Reid Spencer 2007-04-01 03:37:58 PDT

Created attachment 750 [details]
llvm-gcc patch (revised)

This patch corrects handling of the "actual" types passed to getDeclaration.
The result type, although not overloaded, must also be included in the vector
of types.

Comment 11 Chris Lattner 2007-04-04 21:40:28 PDT

This is done, right?

Comment 12 Reid Spencer 2007-04-04 21:47:03 PDT

This is now done and working with 5 intrinsics.