New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
update_test_checks.py is reusing the same TMP name for overlapping values #45296
Comments
This is expected. |
At the very least we need to warn+bailout when there is clash in the use of %tmp* names |
If that’s a prerequisite, couldn’t the script run it? |
BTW, how is this supposed to work if the test case was |
Yes, this has been terrible from the start, and it's probably my fault. Roman's suggestion in comment 1 includes manually doing "s/%tmp/%t/". Here's a one-line fix that would reduce potential conflict practically to zero and might encourage better test writing. The only downside is noise when updating an existing test file, but that can be overcome with an NFC pre-commit. diff --git a/llvm/utils/UpdateTestChecks/common.py b/llvm/utils/UpdateTestChecks/common.py Create a FileCheck variable name based on an IR name.def get_value_name(var):
|
If we want to avoid noise, couldn't we check if the name is already used and if so bump the digit? |
That would still cause some noise and potential confusion because then the FileCheck names could get out-of-sync with the IR names; for example, "[[TMP1]]" corresponds to "%0". Here's a minimal example to show the problem: define i8 @blam(i8 %x) { The root problem is that FileCheck variable names have a naming limitation that is not in LLVM IR - FileCheck names can't start with a digit: So we need to fix that in FileCheck or work around it in the script. A quick look at the script didn't turn up an obvious solution for me. The script is sequentially parsing the file, but anonymous IR variable names don't necessarily occur in sequential order even in relatively "normal" IR files (consider phis and loops): define void @small_tc(float* %A, float* %B) { tmp2: for.end: That also means we can't error out on a name conflict. The script would have to gain knowledge about IR semantics that it doesn't currently have. Even a warning would be too noisy IMO because it would fire on standard patterns like the loop above even if there was no "%tmp2" variable? So I'd go with the hack and live with reduced name collision risk. :) |
To make this slightly less awful, we can add a warning as Simon suggested: diff --git a/llvm/utils/UpdateTestChecks/common.py b/llvm/utils/UpdateTestChecks/common.py spaces, commas, paren, or end of the stringIR_VALUE_RE = re.compile(r'(\s+)%([\w.-]+?)([,\s()]|\Z)') +NAMELESS_PREFIX = "NAMELESS" Create a FileCheck variable name based on an IR name.def get_value_name(var):
into defs, and variables we have seen into uses.def transform_line_vars(match):
|
I almost fell back into the 'tmp' variable trap on a recent commit. I'll post the patch for review for more feedback. |
Avoid the name collision with: |
For the record, we effectively reverted the above patch with: But we kept a warning from the script and updated -instnamer to reduce odds for conflict. See the review for details. |
Extended Description
I found a bug where the update script will reuse the same temporary name for two variables that overlap.
Essentially the test case looks like:
tmp1 = …
%1 = …
= %1
= %tmp1
And update_test_checks.py will assign TMP1 to both %1 and %tmp1.
To reproduce:
update_test_checks.py --opt-binary opt wrong_update.orig.ll
Result:
The generated REGEX will have something like:
; CHECK-NEXT: [[TMP1:%.]] = call float @eggs(<4 x float> getelementptr inbounds ([2 x <4 x float>], [2 x <4 x float>]* @global.5, i32 0, i32 0), <4 x float>* @global)
<...>
; CHECK-NEXT: [[TMP1:%.]] = call i32 @wibble(i16 2, i16 1, i16 -1, i16 0, i16 20, i16 0, i16 0)
; CHECK-NEXT: [[TMP4:%.]] = bitcast i32 [[TMP1]] to <4 x i8>
<...>
; CHECK-NEXT: [[TMP9:%.*]] = insertelement <4 x float> undef, float [[TMP1]], i32 0
This is wrong, since the second TMP1 use (happening on the definition of TMP9) should use the value from the first TMP1.
The text was updated successfully, but these errors were encountered: