LLVM Bugzilla is read-only and represents the historical archive of all LLVM issues filled before November 26, 2021. Use github to submit LLVM bugs

Bug 46199 - [OpenCL] Invalid optimization of subgroup functions with non-uniform control flow
Summary: [OpenCL] Invalid optimization of subgroup functions with non-uniform control ...
Status: CONFIRMED
Alias: None
Product: clang
Classification: Unclassified
Component: OpenCL (show other bugs)
Version: trunk
Hardware: PC Windows NT
: P enhancement
Assignee: Unassigned Clang Bugs
URL:
Keywords:
Depends on:
Blocks:
 
Reported: 2020-06-04 08:11 PDT by Piotr Fusik
Modified: 2021-11-05 04:17 PDT (History)
3 users (show)

See Also:
Fixed By Commit(s):


Attachments

Note You need to log in before you can comment on or make changes to this bug.
Description Piotr Fusik 2020-06-04 08:11:20 PDT
The following OpenCL code:

kernel void test(global int *data)
{
    uint id = (uint) get_global_id(0);
    if (id < 4)
        data[id] = sub_group_elect();
    else
        data[id] = sub_group_elect();
}

when compiled with the current master (f2c97656644e783622a6e60fe452b41ffe0f1d18) as follows: clang -cl-std=CL2.0 -include opencl-c.h -S -emit-llvm sub_group_elect_opt.cl

has an invalid optimization of combining the branches:

; Function Attrs: convergent norecurse nounwind uwtable
define dso_local spir_kernel void @test(i32* nocapture %data) local_unnamed_addr #0 !kernel_arg_addr_space !4 !kernel_arg_access_qual !5 !kernel_arg_type !6 !kernel_arg_base_type !6 !kernel_arg_type_qual !7 {
entry:
  %call = tail call i64 @"?get_global_id@@$$J0YAKI@Z"(i32 0) #3
  %call2 = tail call i32 @"?sub_group_elect@@$$J0YAHXZ"() #4
  %idxprom = and i64 %call, 4294967295
  %arrayidx = getelementptr inbounds i32, i32* %data, i64 %idxprom
  store i32 %call2, i32* %arrayidx, align 4, !tbaa !8
  ret void
}

sub_group_elect is one of the many functions added in https://reviews.llvm.org/D79781 that perform implicit communication within a subgroup, i.e. threads implemented as SIMD on GPU. All the added functions are affected.

https://reviews.llvm.org/D68994 is a proposal of addressing this problem with the `convergent` attribute.
Comment 1 Anastasia Stulova 2020-06-04 08:29:16 PDT
Marking subgroups operations that are to be called in non-uniform control flow with convergent attribute doesn't help because convergent attribute was added for functions called within uniform CF (by all work items):
https://clang.llvm.org/docs/AttributeReference.html#convergent