========================== Vector Predication Roadmap ========================== .. contents:: Table of Contents :depth: 3 :local: Motivation ========== This proposal defines a roadmap towards native vector predication in LLVM, specifically for vector instructions with a mask and/or an explicit vector length. LLVM currently has no target-independent means to model predicated vector instructions for modern SIMD ISAs such as AVX512, ARM SVE, the RISC-V V extension and NEC SX-Aurora. Only some predicated vector operations, such as masked loads and stores, are available through intrinsics [MaskedIR]_. The Vector Predication (VP) extensions is a concrete RFC and prototype implementation to achieve native vector predication in LLVM. The VP prototype and all related discussions can be found in the VP patch on Phabricator [VPRFC]_. Roadmap ======= 1. IR-level VP intrinsics ------------------------- - There is a consensus on the semantics/instruction set of VP. - VP intrinsics and attributes are available on IR level. - TTI has capability flags for VP (``supportsVP()``?, ``haveActiveVectorLength()``?). Result: VP usable for IR-level vectorizers (LV, VPlan, RegionVectorizer), potential integration in Clang with builtins. 2. CodeGen support ------------------ - VP intrinsics translate to first-class SDNodes (eg ``llvm.vp.fdiv.* -> vp_fdiv``). - VP legalization (legalize explicit vector length to mask (AVX512), legalize VP SDNodes to pre-existing ones (SSE, NEON)). Result: Backend development based on VP SDNodes. 3. Lift InstSimplify/InstCombine/DAGCombiner to VP -------------------------------------------------- - Introduce PredicatedInstruction, PredicatedBinaryOperator, .. helper classes that match standard vector IR and VP intrinsics. - Add a matcher context to PatternMatch and context-aware IR Builder APIs. - Incrementally lift DAGCombiner to work on VP SDNodes as well as on regular vector instructions. - Incrementally lift InstCombine/InstSimplify to operate on VP as well as regular IR instructions. Result: Optimization of VP intrinsics on par with standard vector instructions. 4. Deprecate llvm.masked.* / llvm.experimental.reduce.* ------------------------------------------------------- - Modernize llvm.masked.* / llvm.experimental.reduce* by translating to VP. - DCE transitional APIs. Result: VP has superseded earlier vector intrinsics. 5. Predicated IR Instructions ----------------------------- - Vector instructions have an optional mask and vector length parameter. These lower to VP SDNodes (from Stage 2). - Phase out VP intrinsics, only keeping those that are not equivalent to vectorized scalar instructions (reduce, shuffles, ..) - InstCombine/InstSimplify expect predication in regular Instructions (Stage (3) has laid the groundwork). Result: Native vector predication in IR. References ========== .. [MaskedIR] `llvm.masked.*` intrinsics, https://llvm.org/docs/LangRef.html#masked-vector-load-and-store-intrinsics .. [VPRFC] RFC: Prototype & Roadmap for vector predication in LLVM, https://reviews.llvm.org/D57504