Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Should enhance LLVM switch instruction to take case ranges #1627

Open
lattner opened this issue Mar 13, 2007 · 69 comments
Open

Should enhance LLVM switch instruction to take case ranges #1627

lattner opened this issue Mar 13, 2007 · 69 comments

Comments

@lattner
Copy link
Collaborator

lattner commented Mar 13, 2007

Bugzilla Link 1255
Version trunk
OS All
CC @asl,@nunoplopes,@kaomoneus

Extended Description

The LLVM switch instruction should handle ranges of case values that go to the same destination without
enumerating all of them.

This would be good to do before 2.0, and would be a great project for someone who wants to learn/touch
more of LLVM.

-Chris

@asl
Copy link
Collaborator

asl commented Mar 13, 2007

That's definitely that thing, I've thought about after discussing with Duncan
his switch-handling patch.

In fact, it will make switch instruction much more "natural" and will eliminate
the need of "clusterifying" during switch lowering/codegen.

@llvmbot
Copy link
Collaborator

llvmbot commented Mar 20, 2007

Maybe this can be added to the project page as something suitable for
the google SoC.

@asl
Copy link
Collaborator

asl commented Mar 20, 2007

Maybe, however Open Projects page already mention "code-cleanup" bugs. Maybe
it's just worth to set keyword here? :)

@asl
Copy link
Collaborator

asl commented Jun 30, 2007

I'm currently thinking about implementing this feature (otherwise we can simple
forget about it :)

Main question is: how should case ranges implemented. It seems, that these
conditions should be met:

  1. Only first and last element of the range should be stored. This is pretty
    natural. I've said before, that LLVM's ConstantRange class is not very good
    suitable for storing case ranges, because it's in form [Lower, Upper), so in
    many real-world cases Upper can overflow, which pretty inconvenient for user to
    deal with.

  2. CaseRange should be inherited from User in order not to break usual LLVM API

  3. CaseRange should be some form of "Constant".

Currently I'm planning to play with some specialized version of ConstantStruct.
Any thoughts/objections?

@lattner
Copy link
Collaborator Author

lattner commented Jul 2, 2007

I don't think that CaseRange should be a value. I think that Switch should just have triples of operands.
Just store "lower, upper, destbb" as three operands.

-Chris

@llvmbot
Copy link
Collaborator

llvmbot commented Jul 2, 2007

I've said before, that LLVM's ConstantRange class is not very good
suitable for storing case ranges, because it's in form [Lower, Upper),
so in many real-world cases Upper can overflow, which pretty inconvenient
for user to deal with.

I don't understand your objection to ConstantRange. First off let me
say that I think it useful to support wrapping case ranges: with Ada,
I'm sure you'll fairly often see case statements in which very positive
and very negative values branch to the same label. This can be naturally
represented using a wrapped range.

Secondly, there's another reason to handle wrapping ranges. The root cause
is that case statements can be signed or unsigned in gcc (Ada supports both)
unlike in C. Of course internally LLVM is presumably going to represent
switches unsigned (or maybe signed, in any case one of the two). That means
that signed case statements need to be mapped to unsigned. Consider the
following range:
case -1 .. 1 => do_something
If mapped directly to unsigned, this naturally becomes a wrapped range. (The
same phenomenon occurs if switches are represented as signed in LLVM, when
converting unsigned switches).

If the LLVM switch stuff is going to have to handle wrapped ranges anyway, I
see no reason not to use ConstantRange.

@lattner
Copy link
Collaborator Author

lattner commented Jul 2, 2007

I don't understand your objection to ConstantRange.

Ah, I don't object to constant range. To be more clear, I object to making ConstantRange itself be a
value*, which would be required to make it an operand of SwitchInst.

Actually the problem I think is that we currently represent the value as an explicit operand of switch.
For example, saying "1 -> bb1, 2 -> bb2, ..." etc, currently has two operands for each source and
destination. Instead of doing this, we could just store the BB's as operands and have a separate list of
the values that correspond to them. Because they do not need to be Value*'s, we could now use
ConstantRanges for these.

Does that make sense?

-Chris

@llvmbot
Copy link
Collaborator

llvmbot commented Sep 16, 2007

FWIW case ranges are a language feature in standard Fortran. Does this mean that this bug needs to be resolved before llvm-gfortran can work, or does LLVM currently lower switch ranges produced by the gcc front ends to an enumeration of all case values?

@asl
Copy link
Collaborator

asl commented Sep 16, 2007

FWIW case ranges are a language feature in standard Fortran. Does this mean
that this bug needs to be resolved before llvm-gfortran can work, or does LLVM
currently lower switch ranges produced by the gcc front ends to an enumeration
of all case values?

Yes. Exactly. With some threshold value: if case range is too big it emits just range comparison and 'if'.

@lattner
Copy link
Collaborator Author

lattner commented Sep 16, 2007

The SWITCH_EXPR lowering code in llvm-gcc currently converts case ranges to an enumerated set of cases in an llvm switch instruction if the case range is less than 64 elements. If it is more than 64 elements, it emits an "if" before the switch to evaluate the case range. This is not necessarily optimal, but is fully correct.

@kaomoneus
Copy link
Contributor

Hi, guys. What the state of case ranges is by now?

@llvmbot
Copy link
Collaborator

llvmbot commented Sep 7, 2011

No one did anything.

@kaomoneus
Copy link
Contributor

May be we can use ConstantArray instead of ConstantRanges? Something like this:

switch i32 %val, label %otherwise [ i32 0, label %onzero
[2 x i32] [i32 1, i32 5], label %onzerofive
i32 7, label %onsix ]

We also can simplify syntax in this case omitting [2 x i32]:

switch i32 %val, label %otherwise [ i32 0, label %onzero
[i32 1, i32 5], label %onzerofive
i32 7, label %onsix ]

@kaomoneus
Copy link
Contributor

So what about moving case-values to another collection? I can try to do this if no-one against...

@lattner
Copy link
Collaborator Author

lattner commented Sep 28, 2011

FWIW, I don't think that using ConstantArray's here make sense. Instead, SwitchInst should move to not using Value*'s to represent the arguments, they should be a non-Value* datastructure that just contains APInts or something.

@kaomoneus
Copy link
Contributor

It seems that some optimizations becames impossible after case values was moved to non-operands collection. I mean optimizations that replaces Use objects with more optimal ones. "CodeGenPrepare::OptimizeExtUses" for example do it (CodeGenPrepare.cpp, string 1054).

@llvmbot
Copy link
Collaborator

llvmbot commented Sep 29, 2011

I don't understand what you are saying. Currently switch cases are ConstantInt,
i.e. just numbers like 2 or 9. As such they do not use other values, so can't
possibly be impacted in the way you seem to suggest. Indeed they can't possibly
be simplified: how do you simplify the number 3?

@kaomoneus
Copy link
Contributor

Switch condition is stored like a case value too, with idx=0. It is may be a non-constant value. So, it is may be optimized.
But generally you right. It seems that "3" could not be simplified itself :-)

@llvmbot
Copy link
Collaborator

llvmbot commented Sep 29, 2011

I don't think anyone was suggesting changing the way the switch condition is
represented, only the case values.

@kaomoneus
Copy link
Contributor

OK. What about next format of op-list:
[condition, default-dest, bb1, bb2, ...] ?

@asl
Copy link
Collaborator

asl commented Sep 29, 2011

OK. What about next format of op-list:
[condition, default-dest, bb1, bb2, ...] ?
This looks reasonable, might definitely be a "proper" Use

@lattner
Copy link
Collaborator Author

lattner commented Sep 29, 2011

Right. Also, note that changing the storage of the case values in memory doesn't mean that the syntax in .ll files has to change.

@nunoplopes
Copy link
Member

I have two concerns about the current design:

First, the implementation of SwitchInst::CaseIteratorT::getCaseValue() seems to be buggy, since it only iterates over the Low value of each range. It should keep state and be able to iterate over all elements of each case range.
(http://llvm.org/docs/doxygen/html/Instructions_8h_source.html#l02506)

Second, I couldn't find a simple way to cast the output of SwitchInst::CaseIteratorT::getCaseValueEx() (a ConstantRangesSet) to a ConstantRange, which seems to be the most natural object for users that can actually handle case range values. Can this be made available?

@kaomoneus
Copy link
Contributor

SwitchInst::CaseIteratorT::getCaseValue() is deprecated old style method. It is kept for unchanged passes that are not adopted to new case ranges feature.
Currently I'm in middle of case ranges implementation. And SwitchInst is already uses ConstantRangesSet but works in old style, currently each case is a still single number. I'm going to construct SwitchInst with real ranges after all dependend passes will adopted to case-ranges feature.

About ConstantRangesSet and ConstantRange. It is a slightly different classes. Don't try to convert CRS to CR.
Below little examples how to work with CRS:

// Example of enumeration ranges in case
for (SwitchInst::CaseIt i = I.case_begin(), e = I.case_end(); i != e; ++i) {
ConstantRangesSet Case = i.getCaseValueEx();
for (unsigned n = 0, en = Case.getNumItems(); n != en; ++n) {
ConstantRangesSet::Range r = Case.getItem(n);
ConstantInt* Low = r.Low;
ConstantInt* High = r.High;
// Do something with Low and High.
}
}

// Example of looking up for case that satisfies condition:
for (SwitchInst::CaseIt i = I.case_begin(), e = I.case_end(); i != e; ++i) {
ConstantRangesSet Case = i.getCaseValueEx();
if (Case.isSatisfies(Condition)) {
// Do something
break;
}
}

Also note that after all passes will updated and SwitchInst will work with real cases, the ConstantRangesSet will rebased from ConstantInt numbers to the APInt numbers.

@kaomoneus
Copy link
Contributor

Of course the last example is a little bit redundant since you can use
"CaseIt SwitchInst::findCaseValue(const ConstantInt* C)" instead.

@nunoplopes
Copy link
Member

I guess a missed the discussion on the mailing list.
But for me it would make more sense to have ConstantRangesSet be, well, a set of ConstantRanges. Why have a new Range class, if we already have ConstantRange? Furthermore, many optimizations already know about ConstantRanges, so they wouldn't need any change to take advantage of this new case ranges.
Again, for current users, it would be nice to have some way to iterate over a switch's case ranges and get ConstantRanges instead.

@llvmbot
Copy link
Collaborator

llvmbot commented May 22, 2012

I think the name ConstantRangeSet is unfortunate. In fact I think it should
just be a completely general set of integers that happens to store in an
efficient way sets that are unions of intervals (this list of intervals being
exposed for the convenience of users).

I think the implementation is also unfortunate, i.e. poor. Why is it mucking
around with Constants at all, it should only be dealing with APInts? They
should all have the same bitwidth, so basically the constructor should take the
bitwidth and that's it. Under the hood it should probably be a SmallVector of
APInt pairs [Low, High) and that's it. The whole "holder" concept is bad in my
opinion. Breaking the abstraction with IsWide is bad. Right at the start of
the file there is

template struct CRSConstantTypes {
typedef ConstantInt ConstantIntTy;
typedef ConstantRangesSet ConstantRangesSetTy;
};

template <>
struct CRSConstantTypes {
typedef const ConstantInt ConstantIntTy;
typedef const ConstantRangesSet ConstantRangesSetTy;
};

WTF? And this is only on the first screen, I didn't have the courage to look
further.

@llvmbot
Copy link
Collaborator

llvmbot commented May 22, 2012

Alternatively, if ConstantRangesSet is not a real abstract data type, but just a
helper for SwitchInst, then I reckon it should be moved to Instructions.cpp. In
fact it would be great if lots of the SwitchInst stuff in Instructions.h could
be moved out of line. The first thing you see at the start of the public
SwitchInst class is a ton of incomprehensible (to me) iterator stuff,

template <class SwitchInstTy, class ConstantIntTy, class BasicBlockTy>
class CaseIteratorT {
protected:

SwitchInstTy *SI;
unsigned Index;

public:

typedef CaseIteratorT<SwitchInstTy, ConstantIntTy, BasicBlockTy> Self;

/// Initializes case iterator for given SwitchInst and for given
/// case number.    
CaseIteratorT(SwitchInstTy *SI, unsigned CaseNum) {
  this->SI = SI;
  Index = CaseNum;
}

/// Initializes case iterator for given SwitchInst and for given
/// TerminatorInst's successor index.

and so on. I would love it if that could all be replaced with a forward
declaration, and have everything else hidden away out of line in
Instructions.cpp.

@kaomoneus
Copy link
Contributor

Duncan, you right million times. Wrapping concept is temporary and looks terrible. So all this stuff like
template struct CRSConstantTypes {
typedef ConstantInt ConstantIntTy;
typedef ConstantRangesSet ConstantRangesSetTy;
};
will gone away. Since at the last stage I'll replace ConstantInt with APInt.

Relative to Instruction.h - OK. I'll make it with forward declared.

The strategy of case ranges implementation:

  1. Implement new classes. Classes doesn't affect anything. They still work with ConstantInt base values at this stage.
  2. Fictitious replacement of current ConstantInt case values with ConstantRangesSet. Case ranges set will still hold single value, and ConstantInt *getCaseValue() will return it. But additionally implement new method in SwitchInst that allows to work with case ranges. Currenly I think it should be some wrapper that returns either single value or ConstantRangesSet object.
  3. Step-by-step replacement of old "ConstantInt* getCaseValue()" with new alternative. Modify algorithms for all passes that works with SwitchInst. But don't modify LLParser and BitcodeReader/Writer. Still hold single value in each ConstantRangesSet object. On this stage some parts of LLVM will use old-style methods, and some ones new-style. (I am here now)
  4. After all getCaseValue() usages will removed and whole LLVM and its clients will work in new style - modify LLParser, Reader and Writer. Remove getCaseValue().
  5. Replace ConstantInt*-based case ranges set items with APInt ones.

@llvmbot
Copy link
Collaborator

llvmbot commented May 23, 2012

OK, thanks for explaining!

@llvmbot
Copy link
Collaborator

llvmbot commented Sep 9, 2013

I've reverted Stepan's patches in llvm svn 190328.

It would still be great to add this feature sometime, though!

@llvmbot
Copy link
Collaborator

llvmbot commented Sep 3, 2019

I'd like to work on implementing this, if it is still a desried feature, and if no one else is currently working on this. Might need some help along the way though :)

@kaomoneus
Copy link
Contributor

I'd like to work on implementing this, if it is still a desried feature, and
if no one else is currently working on this. Might need some help along the
way though :)

Don't repeat my mistakes! Don't ever try to enhance existing SwitchInst. For this way supposes simultaneous enhancement of existing passes. A lot of them. You can't just add new feature to SwitchInst and keep the rest of llvm unaffected.

I would rather suggest you to introduce new instruction. And at first just to lower it to set of condition blocks. Then you could turn on its support in passes one by one.

Here is my 6 years old code with sketches for new 'switchr' instruction, which could be helpfull to you:

https://github.com/kaomoneus/llvm.stepan.dyatkovskiy/tree/switchr

@llvmbot
Copy link
Collaborator

llvmbot commented Sep 3, 2019

Hi Stepan,

Thanks a lot for letting me know, and also for sharing the old code, I know it'll be really helpful. The idea of introducing a new instruction seems good to me too, should I start working on it? Or should I wait for other people's review on this idea?

@llvmbot
Copy link
Collaborator

llvmbot commented Sep 3, 2019

I'd like to work on implementing this, if it is still a desried feature, and
if no one else is currently working on this. Might need some help along the
way though :)

Don't repeat my mistakes! Don't ever try to enhance existing SwitchInst. For
this way supposes simultaneous enhancement of existing passes. A lot of
them. You can't just add new feature to SwitchInst and keep the rest of llvm
unaffected.

I would rather suggest you to introduce new instruction. And at first just
to lower it to set of condition blocks. Then you could turn on its support
in passes one by one.

Here is my 6 years old code with sketches for new 'switchr' instruction,
which could be helpfull to you:

https://github.com/kaomoneus/llvm.stepan.dyatkovskiy/tree/switchr

Also, does the old code contains your work on changing the switch instruction too? Or does it contain only the switchr sketches?

Missed the reply button last time ':)

@kaomoneus
Copy link
Contributor

Also, does the old code contains your work on changing the switch
instruction too? Or does it contain only the switchr sketches?

Missed the reply button last time ':)

This code contains only new instruction and related modifications. I haven't checked it for years, but I believe it's compilable and runnable. I tried not to touch rest of llvm on purpose. Any line of new code in core should be done very carefully, for you have a lot of different

The best way to check what was done there is to compare 'switchr' branch with those-days master branch.

Smth like this:
$ git clone https://github.com/kaomoneus/llvm.stepan.dyatkovskiy.git llvm-switchr
$ cd llvm-switchr
$ git diff master switchr

As for waiting for review. This is definitely should be reviewed and approved. But may be you could start prototyping already, so then you would had something at your hands during review discussion.

@kaomoneus
Copy link
Contributor

*for you have a lot of different buildbots with different settings and targets. And each of them should be green (successful) all the time.

Also Think about some special trigger which allows to disable maximum of your code for legacy behaviour.

@llvmbot
Copy link
Collaborator

llvmbot commented Sep 3, 2019

*for you have a lot of different buildbots with different settings and
targets. And each of them should be green (successful) all the time.

Also Think about some special trigger which allows to disable maximum of
your code for legacy behaviour.

Thanks a lot for your help, I will be checking out the diff between the master and the switchr, and will ask you if I am confused about something.

@kaomoneus
Copy link
Contributor

Sure! You're welcome!

@llvmbot
Copy link
Collaborator

llvmbot commented Sep 4, 2019

Hi Stepan,
I have a few questions regarding the code that you gave me, should I start asking them over here? Or can I ping you on some other platform probably to avoid spamming a lot of simple questions here? Maybe on IRC if you're there?

Also the first question:
I am not able to understand the way you have implemented ranges, here are your comments:

/// If we have flat and ordered collection of numbers, then range
/// [Low, High) may be represented as Low=collection[i], High=collection[i+1].
/// Thus all ranges are covered, also including the wrapped range:
/// Low=collection[collection.size()-1], High=collection[0].

I couldn't really understand what does collection actually represent.

@llvmbot
Copy link
Collaborator

llvmbot commented Sep 4, 2019

Hi Stepan,
I have a few questions regarding the code that you gave me, should I start
asking them over here? Or can I ping you on some other platform probably to
avoid spamming a lot of simple questions here? Maybe on IRC if you're there?

Also the first question:
I am not able to understand the way you have implemented ranges, here are
your comments:

/// If we have flat and ordered collection of numbers, then range
/// [Low, High) may be represented as Low=collection[i],
High=collection[i+1].
/// Thus all ranges are covered, also including the wrapped range:
/// Low=collection[collection.size()-1], High=collection[0].

I couldn't really understand what does collection actually represent.

If collections represents the collection of numbers used with the switchr instruction, what does it being wrapped mean?

@kaomoneus
Copy link
Contributor

/// If we have flat and ordered collection of numbers, then range
/// [Low, High) may be represented as Low=collection[i],
High=collection[i+1].
/// Thus all ranges are covered, also including the wrapped range:
/// Low=collection[collection.size()-1], High=collection[0].

I couldn't really understand what does collection actually represent.

If collections represents the collection of numbers used with the switchr
instruction, what does it being wrapped mean?

Honestly I should check code. But something I remember.

"Flat", well, I think that just means single-dimensional. So it's not flat, it's rather.. single dimensional :-) Basically I meant just an ordered sequence of numbers.

Wrapped range Low=collection[collection.size()-1], High=collection[0] perhaps means this:

[ [collection.size()-1], +inf ) U (-inf, collection[0] )

So it's universum with the rest of ranges excluded.

And the last statement better to check in code. I can do it only at monday though.

@lattner
Copy link
Collaborator Author

lattner commented Sep 5, 2019

Don't repeat my mistakes! Don't ever try to enhance existing SwitchInst.

I don't speak for Stepan, but this advice isn't obviously correct to me. I understand and agree with his point that "updating all the passes is a lot of work" but it isn't clear to me that the LLVM community with accept a new "switch with ranges" instruction.

It seems better to define a new global "eliminate switch ranges" operation that takes a SwitchInst, and update all passes to either a) work with ranges, or b) call this helper.

Then file a bug that says "phase out all calls to the 'eliminate switch ranges' helper".

@llvmbot
Copy link
Collaborator

llvmbot commented Sep 5, 2019

By an "eliminate switch ranges" operation, do you mean rewriting the IR, replacing a switch with ranges with (the equivalent of) a bunch of "if" statements and a switch without ranges?

@lattner
Copy link
Collaborator Author

lattner commented Sep 5, 2019

By an "eliminate switch ranges" operation, do you mean rewriting the IR, replacing a switch with ranges with (the equivalent of) a bunch of "if" statements and a switch without ranges?

Yes, exactly. This feature request doesn't add any new expressive capability to switch, it is merely a space/time optimization. I agree that a "Big Bang" patch that updates all the things is impractical, so I'd suggest by enhancing the IR, but keeping all the (nontrivial) things that touch switch simple by lowering the rich representation to the old representation.

As a follow-on, each place where this lowering happens can be eliminated, over time, and by experts on the different parts of the compiler.

@llvmbot
Copy link
Collaborator

llvmbot commented Sep 6, 2019

The current switch instruction uses an operand list with
0 being the value to switch, 1 the default block, followed by
the case values and the block to execute in an alternating fashion.

I am new here, so I might miss something extremely obvious, (and
I apologize for it in advance) but,
I was thinking of what Chris Lattner earlier suggested, taking
both upper and lower values of the ranges along with the block
to execute as operands. Why was this idea dropped?

@llvmbot
Copy link
Collaborator

llvmbot commented Sep 7, 2019

The current switch instruction uses an operand list with
0 being the value to switch, 1 the default block, followed by
the case values and the block to execute in an alternating fashion.

I am new here, so I might miss something extremely obvious, (and
I apologize for it in advance) but,
I was thinking of what Chris Lattner earlier suggested, taking
both upper and lower values of the ranges along with the block
to execute as operands. Why was this idea dropped?

As mentioned in the comments before, the reason why
modifying switch instruction is not advisable is because
it'd lead to a lot of changes in the code, which would be really difficult,
and would also result in a huge patch, which will be hard to review.

If that is the case and a new instruction isn't really an option,
then maybe I can clone the GitHub repo
like Stephan, and work on it and ask people to review my work
as and when I am done with specific parts?
That way the the review process can be broken down.

I understand that it will be a lot of work on
my part too, but if
possible I'd like to work on it, it'll help me gain a lot
of experience and knowledge and get me familiar with LLVM (and compilers for that matter,) as mentioned in the issue description.

This might take a lot of time, but I don't think it is
a feature that is required urgently, or something that
could potentially block a release.

I will apologise once again, in case I've got something wrong (please correct me), and also for bad english, it isn't my first language. :)

@lattner
Copy link
Collaborator Author

lattner commented Sep 8, 2019

I agree that megapatches are very problematic, but I don't see how this requires one.

There are straight-forward ways to phase this in as a series of incremental patches, using the existing SwitchInst.

  1. Start by adding expanding the representation of SwitchInst so it can represent ranges. Introduce asmprinter and asmparser support for it. Keep all the existing methods (which should be eventually revised) and make them abort when a case has more than one value. The entire compiler still works.

  2. Add a new method "EliminateCaseRanges" to SwitchInst (or better yet, a utility method somewhere) that rewrites a switch that uses case ranges into one that doesn't. Go change all the passes (one at at time if you'd prefer) to call it. Now all the passes accept case ranges.

  3. Change clang to start generating case ranges, e.g. for the GNU C case range extension "case 1...4:".

  4. Incrementally remove all uses of EliminateCaseRanges, upgrading each pass to work with them directly instead of eliminating them.

I don't see how this would take a megapatch.

@llvmbot
Copy link
Collaborator

llvmbot commented Sep 8, 2019

I agree that megapatches are very problematic, but I don't see how this
requires one.

There are straight-forward ways to phase this in as a series of incremental
patches, using the existing SwitchInst.

  1. Start by adding expanding the representation of SwitchInst so it can
    represent ranges. Introduce asmprinter and asmparser support for it. Keep
    all the existing methods (which should be eventually revised) and make them
    abort when a case has more than one value. The entire compiler still works.

  2. Add a new method "EliminateCaseRanges" to SwitchInst (or better yet, a
    utility method somewhere) that rewrites a switch that uses case ranges into
    one that doesn't. Go change all the passes (one at at time if you'd prefer)
    to call it. Now all the passes accept case ranges.

  3. Change clang to start generating case ranges, e.g. for the GNU C case
    range extension "case 1...4:".

  4. Incrementally remove all uses of EliminateCaseRanges, upgrading each pass
    to work with them directly instead of eliminating them.

I don't see how this would take a megapatch.

I see, that will definitely help with the
megapatch. I will follow these steps then :)

Just to clarify things for myself EliminateCaseRanges for now will
be replacing switch with ranges with if statements?
And in step 4 while removing the calls to it everywhere, the actual
handling of ranges will be taken care of?

@kaomoneus
Copy link
Contributor

Just to clarify things for myself EliminateCaseRanges for now will
be replacing switch with ranges with if statements?
And in step 4 while removing the calls to it everywhere, the actual
handling of ranges will be taken care of?
I think Chris meant that EliminateCaseRange should turn new version of switch into legacy one. But as long as it's impossible to have a fine conversion of that kind for big ranges, perhaps he also meant switch -> if conversion as well.
For EliminateCaseRanges I would propose to introduce some threshold N, if range size < N, that introduce one more switch case, then in switch default case introduce set of if's for bigger ranges.

@kaomoneus
Copy link
Contributor

I agree that megapatches are very problematic, but I don't see how this
requires one.

There are straight-forward ways to phase this in as a series of incremental
patches, using the existing SwitchInst.

What if to modify your approach as follows:
0. Introduce LLVM_CASE_RANGES_ENABLED build option. And guard new pr1255 code by relevant #define scope.

  1. Introduce ranges fo SwitchInst.
  2. Make EliminateCaseRanges as special pass and make it to be the first in passes queue (let's say on the top).
  3. Add support for case ranges for existing passes.
  4. Whenever it's possible bring EliminateCaseRanges pass down, until it hits bottom.
  5. Remove EliminateCaseRanges.

@llvmbot
Copy link
Collaborator

llvmbot commented Sep 8, 2019

Just to clarify things for myself EliminateCaseRanges for now will
be replacing switch with ranges with if statements?
And in step 4 while removing the calls to it everywhere, the actual
handling of ranges will be taken care of?
I think Chris meant that EliminateCaseRange should turn new version of
switch into legacy one. But as long as it's impossible to have a fine
conversion of that kind for big ranges, perhaps he also meant switch -> if
conversion as well.
For EliminateCaseRanges I would propose to introduce some threshold N, if
range size < N, that introduce one more switch case, then in switch default
case introduce set of if's for bigger ranges.

Thank you for explaining me what the EliminateCaseRanges
method is supposed to do.

What will it be replaced with once we start removing it
from places? Will we still use ifs?

@kaomoneus
Copy link
Contributor

What will it be replaced with once we start removing it
from places? Will we still use ifs?
Not quite understand what you mean. May be rephrase?
If you mean that EliminateCaseRanges is a method of SwitchInst and you remove its calls, then I suppose there should be the only call, or rather the only loop where this call is applied to all SwitchInsts, but correct me if I'm wrong.

The same for pass (when EliminateCaseRanges is a pass).

In any option, I think that it should be removed only, after you added support to all existing passes. Then.. just remove this call from all places, and that's it.

@llvmbot
Copy link
Collaborator

llvmbot commented Sep 8, 2019

What will it be replaced with once we start removing it
from places? Will we still use ifs?
Not quite understand what you mean. May be rephrase?
If you mean that EliminateCaseRanges is a method of SwitchInst and you
remove its calls, then I suppose there should be the only call, or rather
the only loop where this call is applied to all SwitchInsts, but correct me
if I'm wrong.

The same for pass (when EliminateCaseRanges is a pass).

In any option, I think that it should be removed only, after you added
support to all existing passes. Then.. just remove this call from all
places, and that's it.

By adding support to all the passes, do you mean that all the passes will now be able to replace switch with ifs in case the range is greater than the threshold value that you earlier mentioned? Without using a call to EliminateCaseRanges? Or without that pass?

@llvmbot
Copy link
Collaborator

llvmbot commented Sep 9, 2019

What will it be replaced with once we start removing it
from places? Will we still use ifs?
Not quite understand what you mean. May be rephrase?
If you mean that EliminateCaseRanges is a method of SwitchInst and you
remove its calls, then I suppose there should be the only call, or rather
the only loop where this call is applied to all SwitchInsts, but correct me
if I'm wrong.

The same for pass (when EliminateCaseRanges is a pass).

In any option, I think that it should be removed only, after you added
support to all existing passes. Then.. just remove this call from all
places, and that's it.

By adding support to all the passes, do you mean that all the passes will
now be able to replace switch with ifs in case the range is greater than the
threshold value that you earlier mentioned? Without using a call to
EliminateCaseRanges? Or without that pass?

As in every pass will be able to do what EleminateCaseRanges was doing?

If this is the case then I guess keeping EliminateCaseRanges as a
method would help in later modification of
the passes. :)

@lattner
Copy link
Collaborator Author

lattner commented Sep 9, 2019

Please do not introduce #ifdef's in the codebase, there should be no need for them.

I think Chris meant that EliminateCaseRange should turn new version of switch
into legacy one.

Correct.

But as long as it's impossible to have a fine conversion of
that kind for big ranges, perhaps he also meant switch -> if conversion as
well.

I don't see a reason to worry about this unless some existing producer (e.g. clang's IRGen) would produce an if range. The simple way to phase that in is to just not have clang (or anything else) produce a switch range in a case that it would produce something smarter than the equivalent lowered switchinst.

We can "take advantage" of case ranges when the whole compiler is more or less upgraded to deal with them.

@kaomoneus
Copy link
Contributor

I don't see a reason to worry about this unless some existing producer (e.g.
clang's IRGen) would produce an if range. The simple way to phase that in
is to just not have clang (or anything else) produce a switch range in a
case that it would produce something smarter than the equivalent lowered
switchinst.

I think that perhaps we should consider two cases of EliminateCaseRanges use.

  1. When we still have some set of passes which don't know how to deal with case ranges. In this case we somehow should lower switch. And in this case we should turn ranges like [1..2) into "case 1:", and turn ranges like [1..100) into "if (c >= 1 && c < 100)".

  2. When all passes support case ranges, but backend doesn't. Then avoid generating case ranges by clang, and if after all we got some set of case ranges, wait until clang's IRGen turn.

Am I right?

@kaomoneus
Copy link
Contributor

As in every pass will be able to do what EleminateCaseRanges was doing?

Note quite. EliminateCaseRanges supposes that it produce code with NO case ranges.

Whilst, if pass support case ranges, then that pass knows how to interpret it, and perhaps can even get some benefits out of case-ranges representation form. E.g. sparse constant propogation or loop unswitch pass could have some benefits in theory (basically all data flow analyzing passes could).
So, if all passes knows what to do with case ranges, there's probably no need in EliminateCaseRanges at all, until perhaps instruction selection phase.

Some targets could also support case ranges directly though.

@llvmbot
Copy link
Collaborator

llvmbot commented Sep 12, 2019

I have started working on adding case range support to the SwitchInst class.

I am currently taking the approach taken by Stepan in the code he provided,
i.e., moving case ranges out of the operand list, and using the operand list
for storing only the value to switch on and the basic blocks.

I hope the approach is still relevant?

@llvmbot llvmbot transferred this issue from llvm/llvm-bugzilla-archive Dec 3, 2021
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

5 participants