Documentation for #pragma clang loop directive and options vectorize and interleave.

author Tyler Nowicki <tnowicki@apple.com>

Wed, 18 Jun 2014 00:51:32 +0000 (00:51 +0000)

committer Tyler Nowicki <tnowicki@apple.com>

Wed, 18 Jun 2014 00:51:32 +0000 (00:51 +0000)
diff --git a/clang/docs/LanguageExtensions.rst b/clang/docs/LanguageExtensions.rst

index 2de18ce..a9ba907 100644 (file)
--- a/clang/docs/LanguageExtensions.rst
+++ b/clang/docs/LanguageExtensions.rst
@@ -1764,3 +1764,68 @@ The ``container`` function is also in the region and will not be optimized, but
  it causes the instantiation of ``twice`` and ``thrice`` with an ``int`` type; of
  these two instantiations, ``twice`` will be optimized (because its definition
  was outside the region) and ``thrice`` will not be optimized.
+
+Extensions for loop hint optimizations
+======================================
+
+The ``#pragma clang loop`` directive is used to specify hints for optimizing the
+subsequent for, while, do-while, or C++11 range-based for loop. The directive
+provides options for vectorization and interleaving. Loop hints can be specified
+before any loop and will be ignored if the optimization is not safe to apply.
+
+A vectorized loop performs multiple iterations of the original loop
+in parallel using vector instructions. The instruction set of the target
+processor determines which vector instructions are available and their vector
+widths. This restricts the types of loops that can be vectorized. The vectorizer
+automatically determines if the loop is safe and profitable to vectorize. A
+vector instruction cost model is used to select the vector width.
+
+Interleaving multiple loop iterations allows modern processors to further
+improve instruction-level parallelism (ILP) using advanced hardware features,
+such as multiple execution units and out-of-order execution. The vectorizer uses
+a cost model that depends on the register pressure and generated code size to
+select the interleaving count.
+
+Vectorization is enabled by ``vectorize(enable)`` and interleaving is enabled
+by ``interleave(enable)``. This is useful when compiling with ``-Os`` to
+manually enable vectorization or interleaving.
+
+.. code-block:: c++
+
+  #pragma clang loop vectorize(enable)
+  #pragma clang loop interleave(enable)
+  for(...) {
+    ...
+  }
+
+The vector width is specified by ``vectorize_width(_value_)`` and the interleave
+count is specified by ``interleave_count(_value_)``, where
+_value_ is a positive integer. This is useful for specifying the optimal
+width/count for the set of target architectures supported by your application.
+
+.. code-block:: c++
+
+
+  #pragma clang loop vectorize_width(2)
+  #pragma clang loop interleave_count(2)
+  for(...) {
+    ...
+  }
+
+Specifying a width/count of 1 disables the optimization, and is equivalent to
+``vectorize(disable)`` or ``interleave(disable)``.
+
+For convenience, multiple loop hints can be specified on a single line.
+
+.. code-block:: c++
+
+  #pragma clang loop vectorize_width(4) interleave_count(8)
+  for(...) {
+    ...
+  }
+
+If an optimization cannot be applied, any hints that apply to it will be
+ignored. For example, the hint ``vectorize_width(4)`` is ignored if the loop is
+not proven safe to vectorize. To identify and diagnose optimization issues, use
+the ``-Rpass``, ``-Rpass-missed``, and ``-Rpass-analysis`` command line options.
+See the user guide for details.
diff --git a/clang/docs/ReleaseNotes.rst b/clang/docs/ReleaseNotes.rst

index 1311e19..a7bbbb5 100644 (file)
--- a/clang/docs/ReleaseNotes.rst
+++ b/clang/docs/ReleaseNotes.rst
@@ -97,6 +97,14 @@ passes via three new flags: `-Rpass`, `-Rpass-missed` and `-Rpass-analysis`.
  These flags take a POSIX regular expression which indicates the name
  of the pass (or passes) that should emit optimization remarks.
  
+New Pragmas in Clang
+-----------------------
+
+Loop optimization hints can be specified using the new ``#pragma clang loop``
+directive just prior to the desired loop. The directive allows vectorization
+and interleaving to be enabled or disabled, and the vector width and interleave
+count to be manually specified. See language extensions for details.
+
  C Language Changes in Clang
  ---------------------------
  
diff --git a/clang/include/clang/Basic/Attr.td b/clang/include/clang/Basic/Attr.td

index ab83db2..df4b38d 100644 (file)
--- a/clang/include/clang/Basic/Attr.td
+++ b/clang/include/clang/Basic/Attr.td
@@ -1812,5 +1812,5 @@ def LoopHint : Attr {
    }
    }];
  
-  let Documentation = [Undocumented];
+  let Documentation = [LoopHintDocs];
  }
diff --git a/clang/include/clang/Basic/AttrDocs.td b/clang/include/clang/Basic/AttrDocs.td

index 7441fe5..6c8c9a3 100644 (file)
--- a/clang/include/clang/Basic/AttrDocs.td
+++ b/clang/include/clang/Basic/AttrDocs.td
@@ -1012,3 +1012,14 @@ This attribute is incompatible with the ``always_inline`` attribute.
    }];
  }
  
+def LoopHintDocs : Documentation {
+  let Category = DocCatStmt;
+  let Content = [{
+The ``#pragma clang loop`` directive allows loop optimization hints to be
+specified for the subsequent loop. The directive allows vectorization
+and interleaving to be enabled or disabled, and the vector width and interleave
+count to be manually specified. See `language extensions
+<http://clang.llvm.org/docs/LanguageExtensions.html#extensions-for-loop-hint-optimizations>`_
+for details.
+  }];
+}
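
Illustration (not part of the commit): the documentation added above shows the pragma only on elided ``for(...)`` loops. The following is a minimal sketch of how the hints might look on concrete loops. The function names ``add_arrays`` and ``scale`` are hypothetical and chosen only for this example, and the ``__clang__`` guard is an assumption added so that other compilers do not warn about an unknown pragma. As the documentation states, the hints are requests and are ignored if the optimization is not safe to apply.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper: element-wise sum of two equally sized arrays.
// The hint asks Clang to vectorize and interleave the loop.
std::vector<float> add_arrays(const std::vector<float>& a,
                              const std::vector<float>& b) {
  assert(a.size() == b.size());
  std::vector<float> out(a.size());
  const std::size_t n = a.size();
#if defined(__clang__)
#pragma clang loop vectorize(enable) interleave(enable)
#endif
  for (std::size_t i = 0; i < n; ++i) {
    out[i] = a[i] + b[i];
  }
  return out;
}

// Hypothetical helper: scales every element in place.
// Both hints share one pragma line and request an explicit vector width
// and interleave count; the values 4 and 2 are arbitrary examples.
void scale(std::vector<float>& v, float k) {
#if defined(__clang__)
#pragma clang loop vectorize_width(4) interleave_count(2)
#endif
  for (float& x : v) {
    x *= k;
  }
}
```

To check whether the hints took effect, one option is to compile with something like ``clang++ -O2 -Rpass=loop-vectorize -Rpass-missed=loop-vectorize``. The pass name ``loop-vectorize`` is an assumption here, since the flags take a regular expression matched against pass names.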