author: Sjoerd Meijer <sjoerd.meijer@arm.com> 2019-07-25 07:33:13 +0000
committer: Sjoerd Meijer <sjoerd.meijer@arm.com> 2019-07-25 07:33:13 +0000
commit: dfb6cacf9c29f492202b77d55f9c174a2917b190 (patch)
tree: 8683d6e2bf33282f9b8350da39e51e593d2a7d82 /docs
parent: c7a4550a9dec3c6a09f8214a28dcbeee8032a1c1 (diff)
[Clang] New loop pragma vectorize_predicate
This adds a new vectorize predication loop hint:

  #pragma clang loop vectorize_predicate(enable)

that can be used to indicate to the vectoriser that all (load/store)
instructions should be predicated (masked). This allows, for example, folding
of the remainder loop into the main loop.

This patch will be followed up with D64916 and D65197. The former is a
refactoring in the loopvectorizer and the groundwork to make tail loop folding
a more general concept, and in the latter the actual tail loop folding
transformation will be implemented.

Differential Revision: https://reviews.llvm.org/D64744

git-svn-id: https://llvm.org/svn/llvm-project/cfe/trunk@366989 91177308-0d34-0410-b5e6-96231b3b80d8
Diffstat (limited to 'docs')
-rw-r--r-- docs/LanguageExtensions.rst 21
1 files changed, 18 insertions, 3 deletions
diff --git a/docs/LanguageExtensions.rst b/docs/LanguageExtensions.rst
index cb72c459c1..5bd234f2bf 100644
--- a/docs/LanguageExtensions.rst
+++ b/docs/LanguageExtensions.rst
@@ -2946,12 +2946,12 @@ Extensions for loop hint optimizations
The ``#pragma clang loop`` directive is used to specify hints for optimizing the
subsequent for, while, do-while, or c++11 range-based for loop. The directive
-provides options for vectorization, interleaving, unrolling and
+provides options for vectorization, interleaving, predication, unrolling and
distribution. Loop hints can be specified before any loop and will be ignored if
the optimization is not safe to apply.
-Vectorization and Interleaving
-------------------------------
+Vectorization, Interleaving, and Predication
+--------------------------------------------
A vectorized loop performs multiple iterations of the original loop
in parallel using vector instructions. The instruction set of the target
@@ -2994,6 +2994,21 @@ width/count of the set of target architectures supported by your application.
Specifying a width/count of 1 disables the optimization, and is equivalent to
``vectorize(disable)`` or ``interleave(disable)``.
+Vector predication is enabled by ``vectorize_predicate(enable)``, for example:
+
+.. code-block:: c++
+
+ #pragma clang loop vectorize(enable)
+ #pragma clang loop vectorize_predicate(enable)
+ for(...) {
+ ...
+ }
+
+This predicates (masks) all instructions in the loop, which allows the scalar
+remainder loop (the tail) to be folded into the main vectorized loop. This
+might be more efficient when vector predication is efficiently supported by the
+target platform.
+
Loop Unrolling
--------------