Upstream version 9.38.198.0

diff --git a/src/native_client_sdk/src/doc/reference/pnacl-c-cpp-language-support.rst b/src/native_client_sdk/src/doc/reference/pnacl-c-cpp-language-support.rst

index 5bb5ac8..6fb8751 100644 (file)
--- a/src/native_client_sdk/src/doc/reference/pnacl-c-cpp-language-support.rst
+++ b/src/native_client_sdk/src/doc/reference/pnacl-c-cpp-language-support.rst
@@ -59,7 +59,7 @@ locations to each other as the C11/C++11 standards do.
  Non-atomic memory accesses may be reordered, separated, elided or fused
  according to C and C++'s memory model before the pexe is created as well
  as after its creation. Accessing atomic memory location through
-non-atomic primitives is `Undefined Behavior <undefined_behavior>`.
+non-atomic primitives is :ref:`Undefined Behavior <undefined_behavior>`.
  
  As in C11/C++11 some atomic accesses may be implemented with locks on
  certain platforms. The ``ATOMIC_*_LOCK_FREE`` macros will always be
@@ -170,6 +170,8 @@ in `Memory Model and Atomics`_.
  PNaCl and NaCl support ``setjmp`` and ``longjmp`` without any
  restrictions beyond C's.
  
+.. _exception_handling:
+
  C++ Exception Handling
  ======================
  
@@ -194,15 +196,200 @@ PNaCl this barrier is only guaranteed to order ``volatile`` and atomic
  memory accesses, though in practice the implementation attempts to also
  prevent reordering of memory accesses to objects which may escape.
  
+PNaCl supports :ref:`Portable SIMD Vectors <portable_simd_vectors>` for
+SIMD operations, which are traditionally expressed through
+target-specific intrinsics or inline assembly.
+
  NaCl supports a fairly wide subset of inline assembly through GCC's
  inline assembly syntax, with the restriction that the sandboxing model
  for the target architecture has to be respected.
  
+.. _portable_simd_vectors:
+
+Portable SIMD Vectors
+=====================
+
+SIMD vectors aren't part of the C/C++ standards and are traditionally
+very hardware-specific. Portable Native Client offers a portable version
+of SIMD vector datatypes and operations which map well to modern
+architectures and offer performance which matches or approaches
+hardware-specific uses.
+
+SIMD vector support was added to Portable Native Client for version 37 of Chrome
+and more features, including performance enhancements, have been added in
+subsequent releases; see the :ref:`Release Notes <sdk-release-notes>` for more
+details.
+
+Hand-Coding Vector Extensions
+-----------------------------
+
+The initial vector support in Portable Native Client adds `LLVM vectors
+<http://clang.llvm.org/docs/LanguageExtensions.html#vectors-and-extended-vectors>`_
+and `GCC vectors
+<http://gcc.gnu.org/onlinedocs/gcc/Vector-Extensions.html>`_ since these
+are well supported by different hardware platforms and don't require any
+new compiler intrinsics.
+
+Vector types can be used through the ``vector_size`` attribute:
+
+.. naclcode::
+
+  #define VECTOR_BYTES 16
+  typedef int v4s __attribute__((vector_size(VECTOR_BYTES)));
+  v4s a = {1,2,3,4};
+  v4s b = {5,6,7,8};
+  v4s c, d;
+  c = a + b;  /* c = {6,8,10,12} */
+  d = b >> a; /* d = {2,1,0,0} */
+
+Each element of a vector comparison result is a bitmask as wide as the
+compared elements, set to all ``0`` (false) or all ``1`` (true):
+
+.. naclcode::
+
+  typedef int v4s __attribute__((vector_size(16)));
+  v4s snip(v4s in) {
+    v4s limit = {32,64,128,256};
+    v4s mask = in > limit;
+    v4s ret = in & mask;
+    return ret;
+  }
+
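+A comparison mask can also select between two vectors lane by lane. The
+following is a minimal sketch of that idiom; ``vector_max`` is a
+hypothetical helper name, and the code assumes a compiler that implements
+the GCC vector extensions described above:
+

```cpp
#include <cstdint>

typedef int32_t v4s __attribute__((vector_size(16)));

// Element-wise maximum built from a comparison mask (hypothetical helper).
// Each lane of `mask` is all ones where a > b and all zeros elsewhere,
// so `a & mask` keeps the lanes where a wins and `b & ~mask` keeps the rest.
v4s vector_max(v4s a, v4s b) {
  v4s mask = a > b;
  return (a & mask) | (b & ~mask);
}
```

+
+Because the mask is all ones or all zeros in each lane, the two ``&``
+terms never overlap and ``|`` simply merges them.
+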
+Vector datatypes are currently expected to be 128 bits wide with one of the
+following element types, and they're expected to be aligned to the underlying
+element's bit width (loads and stores will otherwise be broken up into scalar
+accesses to prevent faults):
+
+============  ============  ================ ======================
+Type          Num Elements  Vector Bit Width Expected Bit Alignment
+============  ============  ================ ======================
+``uint8_t``   16            128              8
+``int8_t``    16            128              8
+``uint16_t``  8             128              16
+``int16_t``   8             128              16
+``uint32_t``  4             128              32
+``int32_t``   4             128              32
+``float``     4             128              32
+============  ============  ================ ======================
+
+64-bit integers and double-precision floating point will be supported in
+a future release, as will 256-bit and 512-bit vectors.
+
+Vector element bit width alignment can be stated explicitly (this is assumed by
+PNaCl, but not necessarily by other compilers), and smaller alignments can also
+be specified:
+
+.. naclcode::
+
+  typedef int v4s_element   __attribute__((vector_size(16), aligned(4)));
+  typedef int v4s_unaligned __attribute__((vector_size(16), aligned(1)));
+
+
+The following operators are supported on vectors:
+
++----------------------------------------------+
+| unary ``+``, ``-``                           |
++----------------------------------------------+
+| ``++``, ``--``                               |
++----------------------------------------------+
+| ``+``, ``-``, ``*``, ``/``, ``%``            |
++----------------------------------------------+
+| ``&``, ``|``, ``^``, ``~``                   |
++----------------------------------------------+
+| ``>>``, ``<<``                               |
++----------------------------------------------+
+| ``!``, ``&&``, ``||``                        |
++----------------------------------------------+
+| ``==``, ``!=``, ``>``, ``<``, ``>=``, ``<=`` |
++----------------------------------------------+
+| ``=``                                        |
++----------------------------------------------+
+
+C-style casts can be used to convert one vector type to another without
+modifying the underlying bits. ``__builtin_convertvector`` can be used
+to convert from one type to another provided both types have the same
+number of elements, truncating when converting from floating-point to
+integer.
+
+.. naclcode::
+
+  typedef unsigned v4u __attribute__((vector_size(16)));
+  typedef float v4f __attribute__((vector_size(16)));
+  v4u a = {0x3f19999a,0x40000000,0x40490fdb,0x66ff0c30};
+  v4f b = (v4f) a; /* b = {0.6,2,3.14159,6.02214e+23}  */
+  v4u c = __builtin_convertvector(b, v4u); /* c = {0,2,3,0} */
+
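+The difference between the two conversions can be sketched with a pair
+of small wrappers. ``bits_of`` and ``truncate`` are hypothetical names
+chosen for illustration, and the sketch assumes a compiler that provides
+``__builtin_convertvector`` (Clang, or a recent GCC):
+

```cpp
#include <cstdint>

typedef uint32_t v4u __attribute__((vector_size(16)));
typedef float v4f __attribute__((vector_size(16)));

// Reinterpret the 128 bits of a float vector as four unsigned integers.
// No lane value is converted; only the type changes.
v4u bits_of(v4f v) { return (v4u)v; }

// Convert each float lane to an unsigned integer, truncating toward zero.
// Assumption: every lane is non-negative and fits in 32 bits.
v4u truncate(v4f v) { return __builtin_convertvector(v, v4u); }
```

+
+For the lane value ``1.0f``, ``bits_of`` yields the IEEE 754 encoding
+``0x3f800000`` while ``truncate`` yields ``1``.
+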
+It is also possible to use array-style indexing into vectors to extract
+individual elements using ``[]``.
+
+.. naclcode::
+
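+  #include <cstddef>
+  #include <iostream>
+
+  typedef unsigned v4u __attribute__((vector_size(16)));
+  // Print every element of a vector of type T, separated by spaces.
+  template<typename T>
+  void print(const T v) {
+    for (std::size_t i = 0; i != sizeof(v) / sizeof(v[0]); ++i)
+      std::cout << v[i] << ' ';
+    std::cout << std::endl;
+  }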
+
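+Element indexing also makes simple reductions easy to write. As a
+minimal sketch, the hypothetical helper ``horizontal_sum`` below adds up
+the lanes of a vector one element at a time; it illustrates ``[]`` and
+is not meant as a tuned reduction:
+

```cpp
#include <cstddef>

typedef int v4s __attribute__((vector_size(16)));

// Sum all lanes of a vector by reading each element with [].
// The lane count is derived from the vector and element sizes.
int horizontal_sum(v4s v) {
  int total = 0;
  for (std::size_t i = 0; i != sizeof(v) / sizeof(v[0]); ++i)
    total += v[i];
  return total;
}
```

+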
+Vector shuffle (often called permutation or swizzle) operations are
+supported through ``__builtin_shufflevector``. The builtin has two
+vector arguments of the same element type, followed by a list of
+constant integers that specify the element indices of the first two
+vectors that should be extracted and returned in a new vector. These
+element indices are numbered sequentially starting with the first
+vector, continuing into the second vector. Thus, if ``vec1`` is a
+4-element vector, index ``5`` would refer to the second element of
+``vec2``. An index of ``-1`` can be used to indicate that the
+corresponding element in the returned vector is a don't-care value that
+the backend is free to optimize.
+
+The result of ``__builtin_shufflevector`` is a vector with the same
+element type as ``vec1`` / ``vec2`` but that has an element count equal
+to the number of indices specified.
+
+.. naclcode::
+
+  // identity operation - return 4-element vector v1.
+  __builtin_shufflevector(v1, v1, 0, 1, 2, 3)
+
+  // "Splat" element 0 of v1 into a 4-element result.
+  __builtin_shufflevector(v1, v1, 0, 0, 0, 0)
+
+  // Reverse 4-element vector v1.
+  __builtin_shufflevector(v1, v1, 3, 2, 1, 0)
+
+  // Concatenate every other element of 4-element vectors v1 and v2.
+  __builtin_shufflevector(v1, v2, 0, 2, 4, 6)
+
+  // Concatenate every other element of 8-element vectors v1 and v2.
+  __builtin_shufflevector(v1, v2, 0, 2, 4, 6, 8, 10, 12, 14)
+
+  // Shuffle v1 with some elements being undefined
+  __builtin_shufflevector(v1, v1, 3, -1, 1, -1)
+
+One common use of ``__builtin_shufflevector`` is to perform
+vector-scalar operations:
+
+.. naclcode::
+
+  typedef int v4s __attribute__((vector_size(16)));
+  v4s shift_right_by(v4s shift_me, int shift_amount) {
+    v4s tmp = {shift_amount};
+    return shift_me >> __builtin_shufflevector(tmp, tmp, 0, 0, 0, 0);
+  }
+
+Auto-Vectorization
+------------------
+
+Auto-vectorization is currently not enabled for Portable Native Client,
+but will be in a future release.
+
  Undefined Behavior
  ==================
  
  The C and C++ languages expose some undefined behavior which is
-discussed in `PNaCl Undefined Behavior <undefined_behavior>`.
+discussed in :ref:`PNaCl Undefined Behavior <undefined_behavior>`.
  
  Floating-Point
  ==============
@@ -250,14 +437,6 @@ NaCl supports computed ``goto`` without any transformation.
  Future Directions
  =================
  
-SIMD
-----
-
-PNaCl currently doesn't support SIMD. We plan to add SIMD support in the
-very near future.
-
-NaCl supports SIMD.
-
  Inter-Process Communication
  ---------------------------