8 files changed, 717 insertions, 29 deletions
diff --git a/gcc/doc/cpp.texi b/gcc/doc/cpp.texi
index c605b3bcf50..7ff04cde765 100644
--- a/gcc/doc/cpp.texi
+++ b/gcc/doc/cpp.texi
@@ -1926,11 +1926,11 @@ facilities of the standard C library available.
 This macro is defined when the C++ compiler is in use.  You can use
 @code{__cplusplus} to test whether a header is compiled by a C compiler
 or a C++ compiler.  This macro is similar to @code{__STDC_VERSION__}, in
-that it expands to a version number.  A fully conforming implementation
-of the 1998 C++ standard will define this macro to @code{199711L}.  The
-GNU C++ compiler is not yet fully conforming, so it uses @code{1}
-instead.  It is hoped to complete the implementation of standard C++
-in the near future.
+that it expands to a version number.  Depending on the language standard
+selected, the value of the macro is @code{199711L}, as mandated by the
+1998 C++ standard; @code{201103L}, per the 2011 C++ standard; or an
+unspecified value strictly larger than @code{201103L} for the experimental
+languages enabled by @option{-std=c++1y} and @option{-std=gnu++1y}.
 
 @item __OBJC__
 This macro is defined, with value 1, when the Objective-C compiler is in
diff --git a/gcc/doc/extend.texi b/gcc/doc/extend.texi
index 76a90ecb31f..1c85a3e7382 100644
--- a/gcc/doc/extend.texi
+++ b/gcc/doc/extend.texi
@@ -82,6 +82,7 @@ extensions, accepted by GCC in C90 mode and in C++.
 * x86 specific memory model extensions for transactional memory:: x86 memory models.
 * Object Size Checking:: Built-in functions for limited buffer overflow
                         checking.
+* Cilk Plus Builtins::  Built-in functions for the Cilk Plus language extension.
 * Other Builtins::      Other built-in functions.
 * Target Builtins::     Built-in functions specific to particular targets.
 * Target Format Checks:: Format checks specific to particular targets.
@@ -7438,6 +7439,8 @@ This built-in function performs an atomic test-and-set operation on
 the byte at @code{*@var{ptr}}.  The byte is set to some implementation
 defined nonzero ``set'' value and the return value is @code{true} if and only
 if the previous contents were ``set''.
+It should only be used for operands of type @code{bool} or @code{char}.  For
+other types only part of the value may be set.
 
 All memory models are valid.
 
@@ -7447,6 +7450,10 @@ All memory models are valid.
 
 This built-in function performs an atomic clear operation on
 @code{*@var{ptr}}.  After the operation, @code{*@var{ptr}} contains 0.
+It should only be used for operands of type @code{bool} or @code{char} and
+in conjunction with @code{__atomic_test_and_set}.
+For other types it may clear the value only partially.  If the type is not
+@code{bool}, prefer using @code{__atomic_store}.
 
 The valid memory model variants are
 @code{__ATOMIC_RELAXED}, @code{__ATOMIC_SEQ_CST}, and
@@ -7518,18 +7525,20 @@ End lock elision on a lock variable.
 Memory model must be @code{__ATOMIC_RELEASE} or stronger.
 @end table
 
-When a lock acquire fails it's required for good performance to abort
+When a lock acquire fails it is required for good performance to abort
 the transaction quickly. This can be done with a @code{_mm_pause}
 
 @smallexample
 #include <immintrin.h> // For _mm_pause
 
+int lockvar;
+
 /* Acquire lock with lock elision */
 while (__atomic_exchange_n(&lockvar, 1, __ATOMIC_ACQUIRE|__ATOMIC_HLE_ACQUIRE))
     _mm_pause(); /* Abort failed transaction */
 ...
 /* Free lock with lock elision */
-__atomic_clear(&lockvar, __ATOMIC_RELEASE|__ATOMIC_HLE_RELEASE);
+__atomic_store_n(&lockvar, 0, __ATOMIC_RELEASE|__ATOMIC_HLE_RELEASE);
 @end smallexample
 
 @node Object Size Checking
@@ -8788,6 +8797,32 @@ Similar to @code{__builtin_bswap32}, except the argument and return types
 are 64 bit.
 @end deftypefn
 
+@node Cilk Plus Builtins
+@section Cilk Plus C/C++ Language Extension Built-in Functions
+
+GCC provides support for the following built-in reduction functions if Cilk Plus
+is enabled.  Cilk Plus can be enabled using the @option{-fcilkplus} flag.
+
+@itemize @bullet
+@item __sec_implicit_index
+@item __sec_reduce
+@item __sec_reduce_add
+@item __sec_reduce_all_nonzero
+@item __sec_reduce_all_zero
+@item __sec_reduce_any_nonzero
+@item __sec_reduce_any_zero
+@item __sec_reduce_max
+@item __sec_reduce_min
+@item __sec_reduce_max_ind
+@item __sec_reduce_min_ind
+@item __sec_reduce_mul
+@item __sec_reduce_mutating
+@end itemize
+
+Further details and examples about these built-in functions are described
+in the Cilk Plus language manual, which can be found at
+@uref{http://www.cilkplus.org}.
+
 @node Target Builtins
 @section Built-in Functions Specific to Particular Target Machines
 
@@ -8812,6 +8847,7 @@ instructions, but allow the compiler to schedule those calls.
 * PowerPC Built-in Functions::
 * PowerPC AltiVec/VSX Built-in Functions::
 * RX Built-in Functions::
+* S/390 System z Built-in Functions::
 * SH Built-in Functions::
 * SPARC VIS Built-in Functions::
 * SPU Built-in Functions::
@@ -13937,6 +13973,341 @@ if the VSX instruction set is available.  The @samp{vec_vsx_ld} and
 @samp{vec_vsx_st} built-in functions always generate the VSX @samp{LXVD2X},
 @samp{LXVW4X}, @samp{STXVD2X}, and @samp{STXVW4X} instructions.
 
+If the ISA 2.07 additions to the vector/scalar (power8-vector)
+instruction set are available, the following additional functions are
+available for both 32-bit and 64-bit targets.  For 64-bit targets, you
+can use @var{vector long} instead of @var{vector long long},
+@var{vector bool long} instead of @var{vector bool long long}, and
+@var{vector unsigned long} instead of @var{vector unsigned long long}.
+
+@smallexample
+vector long long vec_abs (vector long long);
+
+vector long long vec_add (vector long long, vector long long);
+vector unsigned long long vec_add (vector unsigned long long,
+                                   vector unsigned long long);
+
+int vec_all_eq (vector long long, vector long long);
+int vec_all_ge (vector long long, vector long long);
+int vec_all_gt (vector long long, vector long long);
+int vec_all_le (vector long long, vector long long);
+int vec_all_lt (vector long long, vector long long);
+int vec_all_ne (vector long long, vector long long);
+int vec_any_eq (vector long long, vector long long);
+int vec_any_ge (vector long long, vector long long);
+int vec_any_gt (vector long long, vector long long);
+int vec_any_le (vector long long, vector long long);
+int vec_any_lt (vector long long, vector long long);
+int vec_any_ne (vector long long, vector long long);
+
+vector long long vec_eqv (vector long long, vector long long);
+vector long long vec_eqv (vector bool long long, vector long long);
+vector long long vec_eqv (vector long long, vector bool long long);
+vector unsigned long long vec_eqv (vector unsigned long long,
+                                   vector unsigned long long);
+vector unsigned long long vec_eqv (vector bool long long,
+                                   vector unsigned long long);
+vector unsigned long long vec_eqv (vector unsigned long long,
+                                   vector bool long long);
+vector int vec_eqv (vector int, vector int);
+vector int vec_eqv (vector bool int, vector int);
+vector int vec_eqv (vector int, vector bool int);
+vector unsigned int vec_eqv (vector unsigned int, vector unsigned int);
+vector unsigned int vec_eqv (vector bool unsigned int,
+                             vector unsigned int);
+vector unsigned int vec_eqv (vector unsigned int,
+                             vector bool unsigned int);
+vector short vec_eqv (vector short, vector short);
+vector short vec_eqv (vector bool short, vector short);
+vector short vec_eqv (vector short, vector bool short);
+vector unsigned short vec_eqv (vector unsigned short, vector unsigned short);
+vector unsigned short vec_eqv (vector bool unsigned short,
+                               vector unsigned short);
+vector unsigned short vec_eqv (vector unsigned short,
+                               vector bool unsigned short);
+vector signed char vec_eqv (vector signed char, vector signed char);
+vector signed char vec_eqv (vector bool signed char, vector signed char);
+vector signed char vec_eqv (vector signed char, vector bool signed char);
+vector unsigned char vec_eqv (vector unsigned char, vector unsigned char);
+vector unsigned char vec_eqv (vector bool unsigned char, vector unsigned char);
+vector unsigned char vec_eqv (vector unsigned char, vector bool unsigned char);
+
+vector long long vec_max (vector long long, vector long long);
+vector unsigned long long vec_max (vector unsigned long long,
+                                   vector unsigned long long);
+
+vector long long vec_min (vector long long, vector long long);
+vector unsigned long long vec_min (vector unsigned long long,
+                                   vector unsigned long long);
+
+vector long long vec_nand (vector long long, vector long long);
+vector long long vec_nand (vector bool long long, vector long long);
+vector long long vec_nand (vector long long, vector bool long long);
+vector unsigned long long vec_nand (vector unsigned long long,
+                                    vector unsigned long long);
+vector unsigned long long vec_nand (vector bool long long,
+                                    vector unsigned long long);
+vector unsigned long long vec_nand (vector unsigned long long,
+                                    vector bool long long);
+vector int vec_nand (vector int, vector int);
+vector int vec_nand (vector bool int, vector int);
+vector int vec_nand (vector int, vector bool int);
+vector unsigned int vec_nand (vector unsigned int, vector unsigned int);
+vector unsigned int vec_nand (vector bool unsigned int,
+                              vector unsigned int);
+vector unsigned int vec_nand (vector unsigned int,
+                              vector bool unsigned int);
+vector short vec_nand (vector short, vector short);
+vector short vec_nand (vector bool short, vector short);
+vector short vec_nand (vector short, vector bool short);
+vector unsigned short vec_nand (vector unsigned short, vector unsigned short);
+vector unsigned short vec_nand (vector bool unsigned short,
+                                vector unsigned short);
+vector unsigned short vec_nand (vector unsigned short,
+                                vector bool unsigned short);
+vector signed char vec_nand (vector signed char, vector signed char);
+vector signed char vec_nand (vector bool signed char, vector signed char);
+vector signed char vec_nand (vector signed char, vector bool signed char);
+vector unsigned char vec_nand (vector unsigned char, vector unsigned char);
+vector unsigned char vec_nand (vector bool unsigned char, vector unsigned char);
+vector unsigned char vec_nand (vector unsigned char, vector bool unsigned char);
+
+vector long long vec_orc (vector long long, vector long long);
+vector long long vec_orc (vector bool long long, vector long long);
+vector long long vec_orc (vector long long, vector bool long long);
+vector unsigned long long vec_orc (vector unsigned long long,
+                                   vector unsigned long long);
+vector unsigned long long vec_orc (vector bool long long,
+                                   vector unsigned long long);
+vector unsigned long long vec_orc (vector unsigned long long,
+                                   vector bool long long);
+vector int vec_orc (vector int, vector int);
+vector int vec_orc (vector bool int, vector int);
+vector int vec_orc (vector int, vector bool int);
+vector unsigned int vec_orc (vector unsigned int, vector unsigned int);
+vector unsigned int vec_orc (vector bool unsigned int,
+                             vector unsigned int);
+vector unsigned int vec_orc (vector unsigned int,
+                             vector bool unsigned int);
+vector short vec_orc (vector short, vector short);
+vector short vec_orc (vector bool short, vector short);
+vector short vec_orc (vector short, vector bool short);
+vector unsigned short vec_orc (vector unsigned short, vector unsigned short);
+vector unsigned short vec_orc (vector bool unsigned short,
+                               vector unsigned short);
+vector unsigned short vec_orc (vector unsigned short,
+                               vector bool unsigned short);
+vector signed char vec_orc (vector signed char, vector signed char);
+vector signed char vec_orc (vector bool signed char, vector signed char);
+vector signed char vec_orc (vector signed char, vector bool signed char);
+vector unsigned char vec_orc (vector unsigned char, vector unsigned char);
+vector unsigned char vec_orc (vector bool unsigned char, vector unsigned char);
+vector unsigned char vec_orc (vector unsigned char, vector bool unsigned char);
+
+vector int vec_pack (vector long long, vector long long);
+vector unsigned int vec_pack (vector unsigned long long,
+                              vector unsigned long long);
+vector bool int vec_pack (vector bool long long, vector bool long long);
+
+vector int vec_packs (vector long long, vector long long);
+vector unsigned int vec_packs (vector unsigned long long,
+                               vector unsigned long long);
+
+vector unsigned int vec_packsu (vector long long, vector long long);
+
+vector long long vec_rl (vector long long,
+                         vector unsigned long long);
+vector unsigned long long vec_rl (vector unsigned long long,
+                                  vector unsigned long long);
+
+vector long long vec_sl (vector long long, vector unsigned long long);
+vector unsigned long long vec_sl (vector unsigned long long,
+                                  vector unsigned long long);
+
+vector long long vec_sr (vector long long, vector unsigned long long);
+vector unsigned long long vec_sr (vector unsigned long long,
+                                  vector unsigned long long);
+
+vector long long vec_sra (vector long long, vector unsigned long long);
+vector unsigned long long vec_sra (vector unsigned long long,
+                                   vector unsigned long long);
+
+vector long long vec_sub (vector long long, vector long long);
+vector unsigned long long vec_sub (vector unsigned long long,
+                                   vector unsigned long long);
+
+vector long long vec_unpackh (vector int);
+vector unsigned long long vec_unpackh (vector unsigned int);
+
+vector long long vec_unpackl (vector int);
+vector unsigned long long vec_unpackl (vector unsigned int);
+
+vector long long vec_vaddudm (vector long long, vector long long);
+vector long long vec_vaddudm (vector bool long long, vector long long);
+vector long long vec_vaddudm (vector long long, vector bool long long);
+vector unsigned long long vec_vaddudm (vector unsigned long long,
+                                       vector unsigned long long);
+vector unsigned long long vec_vaddudm (vector bool unsigned long long,
+                                       vector unsigned long long);
+vector unsigned long long vec_vaddudm (vector unsigned long long,
+                                       vector bool unsigned long long);
+
+vector long long vec_vclz (vector long long);
+vector unsigned long long vec_vclz (vector unsigned long long);
+vector int vec_vclz (vector int);
+vector unsigned int vec_vclz (vector unsigned int);
+vector short vec_vclz (vector short);
+vector unsigned short vec_vclz (vector unsigned short);
+vector signed char vec_vclz (vector signed char);
+vector unsigned char vec_vclz (vector unsigned char);
+
+vector signed char vec_vclzb (vector signed char);
+vector unsigned char vec_vclzb (vector unsigned char);
+
+vector long long vec_vclzd (vector long long);
+vector unsigned long long vec_vclzd (vector unsigned long long);
+
+vector short vec_vclzh (vector short);
+vector unsigned short vec_vclzh (vector unsigned short);
+
+vector int vec_vclzw (vector int);
+vector unsigned int vec_vclzw (vector unsigned int);
+
+vector long long vec_vmaxsd (vector long long, vector long long);
+
+vector unsigned long long vec_vmaxud (vector unsigned long long,
+                                      vector unsigned long long);
+
+vector long long vec_vminsd (vector long long, vector long long);
+
+vector unsigned long long vec_vminud (vector unsigned long long,
+                                      vector unsigned long long);
+
+vector int vec_vpksdss (vector long long, vector long long);
+vector unsigned int vec_vpksdss (vector long long, vector long long);
+
+vector unsigned int vec_vpkudus (vector unsigned long long,
+                                 vector unsigned long long);
+
+vector int vec_vpkudum (vector long long, vector long long);
+vector unsigned int vec_vpkudum (vector unsigned long long,
+                                 vector unsigned long long);
+vector bool int vec_vpkudum (vector bool long long, vector bool long long);
+
+vector long long vec_vpopcnt (vector long long);
+vector unsigned long long vec_vpopcnt (vector unsigned long long);
+vector int vec_vpopcnt (vector int);
+vector unsigned int vec_vpopcnt (vector unsigned int);
+vector short vec_vpopcnt (vector short);
+vector unsigned short vec_vpopcnt (vector unsigned short);
+vector signed char vec_vpopcnt (vector signed char);
+vector unsigned char vec_vpopcnt (vector unsigned char);
+
+vector signed char vec_vpopcntb (vector signed char);
+vector unsigned char vec_vpopcntb (vector unsigned char);
+
+vector long long vec_vpopcntd (vector long long);
+vector unsigned long long vec_vpopcntd (vector unsigned long long);
+
+vector short vec_vpopcnth (vector short);
+vector unsigned short vec_vpopcnth (vector unsigned short);
+
+vector int vec_vpopcntw (vector int);
+vector unsigned int vec_vpopcntw (vector unsigned int);
+
+vector long long vec_vrld (vector long long, vector unsigned long long);
+vector unsigned long long vec_vrld (vector unsigned long long,
+                                    vector unsigned long long);
+
+vector long long vec_vsld (vector long long, vector unsigned long long);
+vector unsigned long long vec_vsld (vector unsigned long long,
+                                    vector unsigned long long);
+
+vector long long vec_vsrad (vector long long, vector unsigned long long);
+vector unsigned long long vec_vsrad (vector unsigned long long,
+                                     vector unsigned long long);
+
+vector long long vec_vsrd (vector long long, vector unsigned long long);
+vector unsigned long long vec_vsrd (vector unsigned long long,
+                                    vector unsigned long long);
+
+vector long long vec_vsubudm (vector long long, vector long long);
+vector long long vec_vsubudm (vector bool long long, vector long long);
+vector long long vec_vsubudm (vector long long, vector bool long long);
+vector unsigned long long vec_vsubudm (vector unsigned long long,
+                                       vector unsigned long long);
+vector unsigned long long vec_vsubudm (vector bool long long,
+                                       vector unsigned long long);
+vector unsigned long long vec_vsubudm (vector unsigned long long,
+                                       vector bool long long);
+
+vector long long vec_vupkhsw (vector int);
+vector unsigned long long vec_vupkhsw (vector unsigned int);
+
+vector long long vec_vupklsw (vector int);
+vector unsigned long long vec_vupklsw (vector unsigned int);
+@end smallexample
+
+If the cryptographic instructions are enabled (@option{-mcrypto} or
+@option{-mcpu=power8}), the following builtins are enabled.
+
+@smallexample
+vector unsigned long long __builtin_crypto_vsbox (vector unsigned long long);
+
+vector unsigned long long __builtin_crypto_vcipher (vector unsigned long long,
+                                                    vector unsigned long long);
+
+vector unsigned long long __builtin_crypto_vcipherlast
+                                     (vector unsigned long long,
+                                      vector unsigned long long);
+
+vector unsigned long long __builtin_crypto_vncipher (vector unsigned long long,
+                                                     vector unsigned long long);
+
+vector unsigned long long __builtin_crypto_vncipherlast
+                                     (vector unsigned long long,
+                                      vector unsigned long long);
+
+vector unsigned char __builtin_crypto_vpermxor (vector unsigned char,
+                                                vector unsigned char,
+                                                vector unsigned char);
+
+vector unsigned short __builtin_crypto_vpermxor (vector unsigned short,
+                                                 vector unsigned short,
+                                                 vector unsigned short);
+
+vector unsigned int __builtin_crypto_vpermxor (vector unsigned int,
+                                               vector unsigned int,
+                                               vector unsigned int);
+
+vector unsigned long long __builtin_crypto_vpermxor (vector unsigned long long,
+                                                     vector unsigned long long,
+                                                     vector unsigned long long);
+
+vector unsigned char __builtin_crypto_vpmsumb (vector unsigned char,
+                                               vector unsigned char);
+
+vector unsigned short __builtin_crypto_vpmsumb (vector unsigned short,
+                                                vector unsigned short);
+
+vector unsigned int __builtin_crypto_vpmsumb (vector unsigned int,
+                                              vector unsigned int);
+
+vector unsigned long long __builtin_crypto_vpmsumb (vector unsigned long long,
+                                                    vector unsigned long long);
+
+vector unsigned long long __builtin_crypto_vshasigmad
+                               (vector unsigned long long, int, int);
+
+vector unsigned int __builtin_crypto_vshasigmaw (vector unsigned int,
+                                                 int, int);
+@end smallexample
+
+The second argument to the @code{__builtin_crypto_vshasigmad} and
+@code{__builtin_crypto_vshasigmaw} builtin functions must be a constant
+integer that is 0 or 1.  The third argument to these builtin functions
+must be a constant integer in the range of 0 to 15.
+
 @node RX Built-in Functions
 @subsection RX Built-in Functions
 GCC supports some of the RX instructions which cannot be expressed in
@@ -14052,6 +14423,120 @@ bit in the processor status word.
 Generates the @code{wait} machine instruction.
 @end deftypefn
 
+@node S/390 System z Built-in Functions
+@subsection S/390 System z Built-in Functions
+@deftypefn {Built-in Function} int __builtin_tbegin (void*)
+Generates the @code{tbegin} machine instruction starting a
+non-constrained hardware transaction.  If the parameter is non-NULL the
+memory area is used to store the transaction diagnostic buffer and
+will be passed as first operand to @code{tbegin}.  This buffer can be
+defined using the @code{struct __htm_tdb} C struct defined in
+@code{htmintrin.h} and must reside on a double-word boundary.  The
+second tbegin operand is set to @code{0xff0c}.  This enables
+save/restore of all GPRs and disables aborts for FPR and AR
+manipulations inside the transaction body.  The condition code set by
+the tbegin instruction is returned as integer value.  The tbegin
+instruction by definition overwrites the content of all FPRs.  The
+compiler will generate code which saves and restores the FPRs.  For
+soft-float code it is recommended to use the @code{*_nofloat}
+variant.  In order to prevent a TDB from being written it is required
+to pass a constant zero value as parameter.  Passing the zero value
+through a variable is not sufficient.  Although modifications of
+access registers inside the transaction will not trigger a
+transaction abort it is not supported to actually modify them.  Access
+registers do not get saved when entering a transaction.  They will have
+undefined state when reaching the abort code.
+@end deftypefn
+
+Macros for the possible return codes of tbegin are defined in the
+@code{htmintrin.h} header file:
+
+@table @code
+@item _HTM_TBEGIN_STARTED
+@code{tbegin} has been executed as part of normal processing.  The
+transaction body is supposed to be executed.
+@item _HTM_TBEGIN_INDETERMINATE
+The transaction was aborted due to an indeterminate condition which
+might be persistent.
+@item _HTM_TBEGIN_TRANSIENT
+The transaction aborted due to a transient failure.  The transaction
+should be re-executed in that case.
+@item _HTM_TBEGIN_PERSISTENT
+The transaction aborted due to a persistent failure.  Re-execution
+under the same circumstances will not be productive.
+@end table
+
+@defmac _HTM_FIRST_USER_ABORT_CODE
+The @code{_HTM_FIRST_USER_ABORT_CODE} defined in @code{htmintrin.h}
+specifies the first abort code which can be used for
+@code{__builtin_tabort}.  Values below this threshold are reserved for
+machine use.
+@end defmac
+
+@deftp {Data type} {struct __htm_tdb}
+The @code{struct __htm_tdb} defined in @code{htmintrin.h} describes
+the structure of the transaction diagnostic block as specified in the
+Principles of Operation manual chapter 5-91.
+@end deftp
+
+@deftypefn {Built-in Function} int __builtin_tbegin_nofloat (void*)
+Same as @code{__builtin_tbegin} but without FPR saves and restores.
+Using this variant in code making use of FPRs will leave the FPRs in
+undefined state when entering the transaction abort handler code.
+@end deftypefn
+
+@deftypefn {Built-in Function} int __builtin_tbegin_retry (void*, int)
+In addition to @code{__builtin_tbegin} a loop for transient failures
+is generated.  If tbegin returns a condition code of 2 the transaction
+will be retried as often as specified in the second argument.  The
+perform processor assist instruction is used to tell the CPU about the
+number of failures so far.
+@end deftypefn
+
+@deftypefn {Built-in Function} int __builtin_tbegin_retry_nofloat (void*, int)
+Same as @code{__builtin_tbegin_retry} but without FPR saves and
+restores.  Using this variant in code making use of FPRs will leave
+the FPRs in undefined state when entering the transaction abort
+handler code.
+@end deftypefn
+
+@deftypefn {Built-in Function} void __builtin_tbeginc (void)
+Generates the @code{tbeginc} machine instruction starting a constrained
+hardware transaction.  The second operand is set to @code{0xff08}.
+@end deftypefn
+
+@deftypefn {Built-in Function} int __builtin_tend (void)
+Generates the @code{tend} machine instruction finishing a transaction
+and making the changes visible to other threads.  The condition code
+generated by tend is returned as integer value.
+@end deftypefn
+
+@deftypefn {Built-in Function} void __builtin_tabort (int)
+Generates the @code{tabort} machine instruction with the specified
+abort code.  Abort codes from 0 through 255 are reserved and will
+result in an error message.
+@end deftypefn
+
+@deftypefn {Built-in Function} void __builtin_tx_assist (int)
+Generates the @code{ppa rX,rY,1} machine instruction, where the
+integer parameter is loaded into rX and a value of zero is loaded into
+rY.  The integer parameter specifies the number of times the
+transaction has aborted so far.
+@end deftypefn
+
+@deftypefn {Built-in Function} int __builtin_tx_nesting_depth (void)
+Generates the @code{etnd} machine instruction.  The current nesting
+depth is returned as integer value.  For a nesting depth of 0 the code
+is not executed as part of a transaction.
+@end deftypefn
+
+@deftypefn {Built-in Function} void __builtin_non_tx_store (unsigned long long *, unsigned long long)
+
+Generates the @code{ntstg} machine instruction.  The second argument
+is written to the first argument's location.  The store operation will
+not be rolled back in case of a transaction abort.
+@end deftypefn
+
 @node SH Built-in Functions
 @subsection SH Built-in Functions
 The following built-in functions are supported on the SH1, SH2, SH3 and SH4
diff --git a/gcc/doc/gcov.texi b/gcc/doc/gcov.texi
index 9db8e8fd8eb..2b675277790 100644
--- a/gcc/doc/gcov.texi
+++ b/gcc/doc/gcov.texi
@@ -122,15 +122,17 @@ gcov [@option{-v}|@option{--version}] [@option{-h}|@option{--help}]
      [@option{-a}|@option{--all-blocks}]
      [@option{-b}|@option{--branch-probabilities}]
      [@option{-c}|@option{--branch-counts}]
-     [@option{-u}|@option{--unconditional-branches}]
-     [@option{-n}|@option{--no-output}]
+     [@option{-d}|@option{--display-progress}]
+     [@option{-f}|@option{--function-summaries}]
+     [@option{-i}|@option{--intermediate-format}]
      [@option{-l}|@option{--long-file-names}]
+     [@option{-m}|@option{--demangled-names}]
+     [@option{-n}|@option{--no-output}]
+     [@option{-o}|@option{--object-directory} @var{directory|file}]
      [@option{-p}|@option{--preserve-paths}]
      [@option{-r}|@option{--relative-only}]
-     [@option{-f}|@option{--function-summaries}]
-     [@option{-o}|@option{--object-directory} @var{directory|file}]
      [@option{-s}|@option{--source-prefix} @var{directory}]
-     [@option{-d}|@option{--display-progress}]
+     [@option{-u}|@option{--unconditional-branches}]
      @var{files}
 @c man end
 @c man begin SEEALSO
@@ -232,6 +234,50 @@ Unconditional branches are normally not interesting.
 @itemx --display-progress
 Display the progress on the standard output.
 
+@item -i
+@itemx --intermediate-format
+Output gcov file in an easy-to-parse intermediate text format that can
+be used by @command{lcov} or other tools. The output is a single
+@file{.gcov} file per @file{.gcda} file. No source code is required.
+
+The format of the intermediate @file{.gcov} file is plain text with
+one entry per line
+
+@smallexample
+file:@var{source_file_name}
+function:@var{line_number},@var{execution_count},@var{function_name}
+lcount:@var{line_number},@var{execution_count}
+branch:@var{line_number},@var{branch_coverage_type}
+
+Where the @var{branch_coverage_type} is
+   notexec (Branch not executed)
+   taken (Branch executed and taken)
+   nottaken (Branch executed, but not taken)
+
+There can be multiple @var{file} entries in an intermediate gcov
+file. All entries following a @var{file} pertain to that source file
+until the next @var{file} entry.
+@end smallexample
+
+Here is a sample when @option{-i} is used in conjunction with the @option{-b} option:
+
+@smallexample
+file:array.cc
+function:11,1,_Z3sumRKSt6vectorIPiSaIS0_EE
+function:22,1,main
+lcount:11,1
+lcount:12,1
+lcount:14,1
+branch:14,taken
+lcount:26,1
+branch:28,nottaken
+@end smallexample
+
+@item -m
+@itemx --demangled-names
+Display demangled function names in output. The default is to show
+mangled function names.
+
 @end table
 
 @command{gcov} should be run with the current directory the same as that
diff --git a/gcc/doc/invoke.texi b/gcc/doc/invoke.texi
index f02c226e5a9..1496d3042af 100644
--- a/gcc/doc/invoke.texi
+++ b/gcc/doc/invoke.texi
@@ -512,7 +512,8 @@ Objective-C and Objective-C++ Dialects}.
 -mword-relocations @gol
 -mfix-cortex-m3-ldrd @gol
 -munaligned-access @gol
--mneon-for-64bits}
+-mneon-for-64bits @gol
+-mrestrict-it}
 
 @emph{AVR Options}
 @gccoptlist{-mmcu=@var{mcu} -maccumulate-args -mbranch-cost=@var{cost} @gol
@@ -752,6 +753,7 @@ Objective-C and Objective-C++ Dialects}.
 -mno-float -msingle-float  -mdouble-float  @gol
 -mdsp  -mno-dsp  -mdspr2  -mno-dspr2 @gol
 -mmcu -mmno-mcu @gol
+-meva -mno-eva @gol
 -mmicromips -mno-micromips @gol
 -mfpu=@var{fpu-type} @gol
 -msmartmips  -mno-smartmips @gol
@@ -860,7 +862,10 @@ See RS/6000 and PowerPC Options.
 -mno-recip-precision @gol
 -mveclibabi=@var{type} -mfriz -mno-friz @gol
 -mpointers-to-nested-functions -mno-pointers-to-nested-functions @gol
--msave-toc-indirect -mno-save-toc-indirect}
+-msave-toc-indirect -mno-save-toc-indirect @gol
+-mpower8-fusion -mno-power8-fusion -mpower8-vector -mno-power8-vector @gol
+-mcrypto -mno-crypto -mdirect-move -mno-direct-move @gol
+-mquad-memory -mno-quad-memory}
 
 @emph{RX Options}
 @gccoptlist{-m64bit-doubles  -m32bit-doubles  -fpu  -nofpu@gol
@@ -933,7 +938,7 @@ See RS/6000 and PowerPC Options.
 -mvis2  -mno-vis2  -mvis3  -mno-vis3 @gol
 -mcbcond -mno-cbcond @gol
 -mfmaf  -mno-fmaf  -mpopc  -mno-popc @gol
--mfix-at697f}
+-mfix-at697f -mfix-ut699}
 
 @emph{SPU Options}
 @gccoptlist{-mwarn-reloc -merror-reloc @gol
@@ -1796,6 +1801,17 @@ Program Interface v3.0 @w{@uref{http://www.openmp.org/}}.  This option
 implies @option{-pthread}, and thus is only supported on targets that
 have support for @option{-pthread}.
 
+@item -fcilkplus
+@opindex fcilkplus
+@cindex Enable Cilk Plus
+Enable the use of the Cilk Plus language extension features for C/C++.
+When the option @option{-fcilkplus} is specified, all the Cilk Plus
+components are converted to the appropriate C/C++ code.  The present
+implementation follows ABI version 0.9.  There are four major parts to
+the Cilk Plus language extension: Array Notations, Cilk Keywords, SIMD
+annotations and elemental functions.  Detailed information about Cilk
+Plus can be found at @w{@uref{http://www.cilkplus.org}}.
+
 @item -fgnu-tm
 @opindex fgnu-tm
 When the option @option{-fgnu-tm} is specified, the compiler
@@ -6158,7 +6174,7 @@ Controls optimization dumps from various optimization passes. If the
 @samp{-@var{options}} form is used, @var{options} is a list of
 @samp{-} separated options to select the dump details and
 optimizations.  If @var{options} is not specified, it defaults to
-@option{all} for details and @option{optall} for optimization
+@option{optimized} for details and @option{optall} for optimization
 groups. If the @var{filename} is not specified, it defaults to
 @file{stderr}. Note that the output @var{filename} will be overwritten
 in case of multiple translation units. If a combined output from
@@ -11618,6 +11634,12 @@ defined.
 Enables using Neon to handle scalar 64-bits operations. This is
 disabled by default since the cost of moving data from core registers
 to Neon is high.
+
+@item -mrestrict-it
+@opindex mrestrict-it
+Restricts generation of IT blocks to conform to the rules of ARMv8.
+IT blocks can only contain a single 16-bit instruction from a select
+set of instructions. This option is on by default for ARMv8 Thumb mode.
 @end table
 
 @node AVR Options
@@ -13811,10 +13833,19 @@ Intel Core CPU with 64-bit extensions, MMX, SSE, SSE2, SSE3, SSSE3,
 SSE4.1, SSE4.2, AVX, AES, PCLMUL, FSGSBASE, RDRND and F16C instruction
 set support.
 
+@item core-avx2
+Intel Core CPU with 64-bit extensions, MOVBE, MMX, SSE, SSE2, SSE3, SSSE3,
+SSE4.1, SSE4.2, AVX, AVX2, AES, PCLMUL, FSGSBASE, RDRND, FMA, BMI, BMI2
+and F16C instruction set support.
+
 @item atom
-Intel Atom CPU with 64-bit extensions, MMX, SSE, SSE2, SSE3 and SSSE3
+Intel Atom CPU with 64-bit extensions, MOVBE, MMX, SSE, SSE2, SSE3 and SSSE3
 instruction set support.
 
+@item slm
+Intel Silvermont CPU with 64-bit extensions, MOVBE, MMX, SSE, SSE2, SSE3, SSSE3,
+SSE4.1 and SSE4.2 instruction set support.
+
 @item k6
 AMD K6 CPU with MMX instruction set support.
 
@@ -15993,7 +16024,7 @@ The processor names are:
 @samp{1004kc}, @samp{1004kf2_1}, @samp{1004kf1_1},
 @samp{loongson2e}, @samp{loongson2f}, @samp{loongson3a},
 @samp{m4k},
-@samp{m14k}, @samp{m14ke}, @samp{m14kec},
+@samp{m14k}, @samp{m14kc}, @samp{m14ke}, @samp{m14kec},
 @samp{octeon}, @samp{octeon+}, @samp{octeon2},
 @samp{orion},
 @samp{r2000}, @samp{r3000}, @samp{r3900}, @samp{r4000}, @samp{r4400},
@@ -16362,6 +16393,12 @@ Use (do not use) MT Multithreading instructions.
 @opindex mno-mcu
 Use (do not use) the MIPS MCU ASE instructions.
 
+@item -meva
+@itemx -mno-eva
+@opindex meva
+@opindex mno-eva
+Use (do not use) the MIPS Enhanced Virtual Addressing instructions.
+
 @item -mlong64
 @opindex mlong64
 Force @code{long} types to be 64 bits wide.  See @option{-mlong32} for
@@ -17341,7 +17378,8 @@ following options:
 @gccoptlist{-maltivec  -mfprnd  -mhard-float  -mmfcrf  -mmultiple @gol
 -mpopcntb -mpopcntd  -mpowerpc64 @gol
 -mpowerpc-gpopt  -mpowerpc-gfxopt  -msingle-float -mdouble-float @gol
--msimple-fpu -mstring  -mmulhw  -mdlmzb  -mmfpgpr -mvsx}
+-msimple-fpu -mstring  -mmulhw  -mdlmzb  -mmfpgpr -mvsx @gol
+-mcrypto -mdirect-move -mpower8-fusion -mpower8-vector -mquad-memory}
 
 The particular options set for any particular CPU varies between
 compiler versions, depending on what setting seems to produce optimal
@@ -17459,6 +17497,47 @@ Generate code that uses (does not use) vector/scalar (VSX)
 instructions, and also enable the use of built-in functions that allow
 more direct access to the VSX instruction set.
 
+@item -mcrypto
+@itemx -mno-crypto
+@opindex mcrypto
+@opindex mno-crypto
+Enable (disable) the use of the built-in functions that allow direct
+access to the cryptographic instructions that were added in version
+2.07 of the PowerPC ISA.
+
+@item -mdirect-move
+@itemx -mno-direct-move
+@opindex mdirect-move
+@opindex mno-direct-move
+Generate code that uses (does not use) the instructions to move data
+between the general purpose registers and the vector/scalar (VSX)
+registers that were added in version 2.07 of the PowerPC ISA.
+
+@item -mpower8-fusion
+@itemx -mno-power8-fusion
+@opindex mpower8-fusion
+@opindex mno-power8-fusion
+Generate code that keeps (does not keep) some integer operations
+adjacent so that the instructions can be fused together on power8 and
+later processors.
+
+@item -mpower8-vector
+@itemx -mno-power8-vector
+@opindex mpower8-vector
+@opindex mno-power8-vector
+Generate code that uses (does not use) the vector and scalar
+instructions that were added in version 2.07 of the PowerPC ISA.  Also
+enable the use of built-in functions that allow more direct access to
+the vector instructions.
+
+@item -mquad-memory
+@itemx -mno-quad-memory
+@opindex mquad-memory
+@opindex mno-quad-memory
+Generate code that uses (does not use) the quad word memory
+instructions.  The @option{-mquad-memory} option requires use of
+64-bit mode.
+
 @item -mfloat-gprs=@var{yes/single/double/no}
 @itemx -mfloat-gprs
 @opindex mfloat-gprs
@@ -19404,6 +19483,11 @@ later.
 @opindex mfix-at697f
 Enable the documented workaround for the single erratum of the Atmel AT697F
 processor (which corresponds to erratum #13 of the AT697E processor).
+
+@item -mfix-ut699
+@opindex mfix-ut699
+Enable the documented workarounds for the floating-point errata of the UT699
+processor.
 @end table
 
 These @samp{-m} options are supported in addition to the above
diff --git a/gcc/doc/md.texi b/gcc/doc/md.texi
index f5dd5478338..3b20991af52 100644
--- a/gcc/doc/md.texi
+++ b/gcc/doc/md.texi
@@ -1711,9 +1711,6 @@ Floating point constant zero
 @item Z
 Integer constant zero
 
-@item Usa
-An absolute symbolic address
-
 @item Ush
 The high part (bits 12 and upwards) of the pc-relative address of a symbol
 within 4GB of the instruction
@@ -2055,7 +2052,7 @@ Any constant whose absolute value is no greater than 4-bits.
 
 @end table
 
-@item PowerPC and IBM RS6000---@file{config/rs6000/rs6000.h}
+@item PowerPC and IBM RS6000---@file{config/rs6000/constraints.md}
 @table @code
 @item b
 Address base register
@@ -2069,6 +2066,9 @@ Floating point register (containing 32-bit value)
 @item v
 Altivec vector register
 
+@item wa
+Any VSX register
+
 @item wd
 VSX vector register to hold vector double data
 
@@ -2081,6 +2081,15 @@ If @option{-mmfpgpr} was used, a floating point register
 @item wl
 If the LFIWAX instruction is enabled, a floating point register
 
+@item wm
+If direct moves are enabled, a VSX register.
+
+@item wn
+No register.
+
+@item wr
+General purpose register if 64-bit mode is used
+
 @item ws
 VSX vector register to hold scalar float data
 
@@ -2093,8 +2102,9 @@ If the STFIWX instruction is enabled, a floating point register
 @item wz
 If the LFIWZX instruction is enabled, a floating point register
 
-@item wa
-Any VSX register
+@item wQ
+A memory address that will work with the @code{lq} and @code{stq}
+instructions.
 
 @item h
 @samp{MQ}, @samp{CTR}, or @samp{LINK} register
@@ -8856,7 +8866,8 @@ can be quite tedious to describe these forms directly in the
 (define_cond_exec
   [@var{predicate-pattern}]
   "@var{condition}"
-  "@var{output-template}")
+  "@var{output-template}"
+  "@var{optional-insn-attributes}")
 @end smallexample
 
 @var{predicate-pattern} is the condition that must be true for the
@@ -8877,6 +8888,13 @@ In order to handle the general case, there is a global variable
 @code{current_insn_predicate} that will contain the entire predicate
 if the current insn is predicated, and will otherwise be @code{NULL}.
 
+@var{optional-insn-attributes} is an optional vector of attributes that gets
+appended to the insn attributes of the produced cond_exec rtx. It can
+be used to add some distinguishing attribute to cond_exec rtxs produced
+that way. An example usage would be to use this attribute in conjunction
+with attributes on the main pattern to disable particular alternatives under
+certain conditions.
+
 When @code{define_cond_exec} is used, an implicit reference to
 the @code{predicable} instruction attribute is made.
 @xref{Insn Attributes}.  This attribute must be a boolean (i.e.@: have
diff --git a/gcc/doc/passes.texi b/gcc/doc/passes.texi
index 654f2295e39..045f964a939 100644
--- a/gcc/doc/passes.texi
+++ b/gcc/doc/passes.texi
@@ -17,6 +17,7 @@ where near complete.
 
 @menu
 * Parsing pass::         The language front end turns text into bits.
+* Cilk Plus Transformation:: Transform Cilk Plus Code to equivalent C/C++.
 * Gimplification pass::  The bits are turned into something we can optimize.
 * Pass manager::         Sequencing the optimization passes.
 * Tree SSA passes::      Optimizations on a high-level representation.
@@ -101,6 +102,36 @@ that is more descriptive than "rest_of".
 The middle-end will, at its option, emit the function and data
 definitions immediately or queue them for later processing.
 
+@node Cilk Plus Transformation
+@section Cilk Plus Transformation
+@cindex CILK_PLUS
+
+If Cilk Plus generation (flag @option{-fcilkplus}) is enabled, all the Cilk
+Plus code is transformed into equivalent C and C++ functions.  The majority
+of this transformation occurs toward the end of parsing, right before the
+gimplification pass.
+
+These are the major components to the Cilk Plus language extension:
+@itemize @bullet
+@item Array Notations:
+During the parsing phase, all the array notation specific information is
+stored in an @code{ARRAY_NOTATION_REF} tree using the function
+@code{c_parser_array_notation}.  At the end of parsing, we check the entire
+function to see if there is any array notation specific code (using the
+function @code{contains_array_notation_expr}).  If this function returns
+true, then we expand it using either @code{expand_array_notation_exprs}
+or @code{build_array_notation_expr}.  For the cases where array notations are
+inside conditions, they are transformed using the function
+@code{fix_conditional_array_notations}.  The C language-specific routines are
+located in @file{c/c-array-notation.c} and the equivalent C++ routines are in
+@file{cp/cp-array-notation.c}.  Common routines such as functions to
+initialize builtin functions are stored in @file{array-notation-common.c}.
+@end itemize
+
+Detailed information about Cilk Plus and its language specification is
+provided at @w{@uref{http://www.cilkplus.org/}}.  It is worth mentioning
+that the current implementation follows ABI 0.9.
+
 @node Gimplification pass
 @section Gimplification pass
 
diff --git a/gcc/doc/tm.texi b/gcc/doc/tm.texi
index 2482eb484b0..f030b56ef6d 100644
--- a/gcc/doc/tm.texi
+++ b/gcc/doc/tm.texi
@@ -1078,6 +1078,15 @@ arrays to be word-aligned so that @code{strcpy} calls that copy
 constants to character arrays can be done inline.
 @end defmac
 
+@defmac DATA_ABI_ALIGNMENT (@var{type}, @var{basic-align})
+Similar to @code{DATA_ALIGNMENT}, but for the cases where the ABI mandates
+some alignment increase, rather than for optimization purposes only.  For
+example, the AMD x86-64 psABI says that variables with array type larger
+than 15 bytes must be aligned to 16-byte boundaries.
+
+If this macro is not defined, then @var{basic-align} is used.
+@end defmac
+
 @defmac CONSTANT_ALIGNMENT (@var{constant}, @var{basic-align})
 If defined, a C expression to compute the alignment given to a constant
 that is being placed in memory.  @var{constant} is the constant and
@@ -2898,6 +2907,10 @@ A target hook which returns true if we use LRA instead of reload pass.  It means
 A target hook which returns the register priority number to which the  register @var{hard_regno} belongs to.  The bigger the number, the  more preferable the hard register usage (when all other conditions are  the same).  This hook can be used to prefer some hard register over  others in LRA.  For example, some x86-64 register usage needs  additional prefix which makes instructions longer.  The hook can  return lower priority number for such registers make them less favorable  and as result making the generated code smaller.    The default version of this target hook returns always zero.
 @end deftypefn
 
+@deftypefn {Target Hook} bool TARGET_REGISTER_USAGE_LEVELING_P (void)
+A target hook which returns true if we need register usage leveling.  That means if a few hard registers are equally good for the  assignment, we choose the least used hard register.  The register  usage leveling may be profitable for some targets.  Don't use the  usage leveling for targets with conditional execution or targets  with big register files as it hurts if-conversion and cross-jumping  optimizations.    The default version of this target hook returns always false.
+@end deftypefn
+
 @deftypefn {Target Hook} bool TARGET_DIFFERENT_ADDR_DISPLACEMENT_P (void)
 A target hook which returns true if an address with the same structure  can have different maximal legitimate displacement.  For example, the  displacement can depend on memory mode or on operand combinations in  the insn.    The default version of this target hook returns always false.
 @end deftypefn
@@ -6765,7 +6778,7 @@ Deallocate internal data in target scheduling context pointed to by @var{tc}.
 Deallocate a store for target scheduling context pointed to by @var{tc}.
 @end deftypefn
 
-@deftypefn {Target Hook} int TARGET_SCHED_SPECULATE_INSN (rtx @var{insn}, int @var{request}, rtx *@var{new_pat})
+@deftypefn {Target Hook} int TARGET_SCHED_SPECULATE_INSN (rtx @var{insn}, unsigned int @var{dep_status}, rtx *@var{new_pat})
 This hook is called by the insn scheduler when @var{insn} has only
 speculative dependencies and therefore can be scheduled speculatively.
 The hook is used to check if the pattern of @var{insn} has a speculative
@@ -6776,13 +6789,13 @@ speculation.  If the return value equals 1 then @var{new_pat} is assigned
 the generated speculative pattern.
 @end deftypefn
 
-@deftypefn {Target Hook} bool TARGET_SCHED_NEEDS_BLOCK_P (int @var{dep_status})
+@deftypefn {Target Hook} bool TARGET_SCHED_NEEDS_BLOCK_P (unsigned int @var{dep_status})
 This hook is called by the insn scheduler during generation of recovery code
 for @var{insn}.  It should return @code{true}, if the corresponding check
 instruction should branch to recovery code, or @code{false} otherwise.
 @end deftypefn
 
-@deftypefn {Target Hook} rtx TARGET_SCHED_GEN_SPEC_CHECK (rtx @var{insn}, rtx @var{label}, int @var{mutate_p})
+@deftypefn {Target Hook} rtx TARGET_SCHED_GEN_SPEC_CHECK (rtx @var{insn}, rtx @var{label}, unsigned int @var{ds})
 This hook is called by the insn scheduler to generate a pattern for recovery
 check instruction.  If @var{mutate_p} is zero, then @var{insn} is a
 speculative instruction for which the check should be generated.
diff --git a/gcc/doc/tm.texi.in b/gcc/doc/tm.texi.in
index 611d6813a56..cc25fec495e 100644
--- a/gcc/doc/tm.texi.in
+++ b/gcc/doc/tm.texi.in
@@ -1062,6 +1062,15 @@ arrays to be word-aligned so that @code{strcpy} calls that copy
 constants to character arrays can be done inline.
 @end defmac
 
+@defmac DATA_ABI_ALIGNMENT (@var{type}, @var{basic-align})
+Similar to @code{DATA_ALIGNMENT}, but for the cases where the ABI mandates
+some alignment increase, rather than for optimization purposes only.  For
+example, the AMD x86-64 psABI says that variables with array type larger
+than 15 bytes must be aligned to 16-byte boundaries.
+
+If this macro is not defined, then @var{basic-align} is used.
+@end defmac
+
 @defmac CONSTANT_ALIGNMENT (@var{constant}, @var{basic-align})
 If defined, a C expression to compute the alignment given to a constant
 that is being placed in memory.  @var{constant} is the constant and
@@ -2870,6 +2879,8 @@ as below:
 
 @hook TARGET_REGISTER_PRIORITY
 
+@hook TARGET_REGISTER_USAGE_LEVELING_P
+
 @hook TARGET_DIFFERENT_ADDR_DISPLACEMENT_P
 
 @hook TARGET_SPILL_CLASS