author:    bstarynk <bstarynk@138bc75d-0d04-0410-961f-82ee72b054a4>  2016-04-15 13:13:48 +0000
committer: bstarynk <bstarynk@138bc75d-0d04-0410-961f-82ee72b054a4>  2016-04-15 13:13:48 +0000
commit:    df168526dd4d08c5faa014d585874f978bf73d80 (patch)
tree:      baab9f6705e45f350fc6dbdd45e2924ae75d2d1b
parent:    ffbf47a37f7d2d4aa647f4bf0f231a8f2399049b (diff)
download:  gcc-df168526dd4d08c5faa014d585874f978bf73d80.tar.gz
2016-04-15 Basile Starynkevitch <basile@starynkevitch.net>
{{merging with even more of GCC 6, using subversion 1.9
svn merge -r230101:230160 ^/trunk
}}
git-svn-id: svn+ssh://gcc.gnu.org/svn/gcc/branches/melt-branch@235026 138bc75d-0d04-0410-961f-82ee72b054a4
86 files changed, 2946 insertions, 525 deletions
diff --git a/ChangeLog.MELT b/ChangeLog.MELT index e7ffae7f403..2705c783c3e 100644 --- a/ChangeLog.MELT +++ b/ChangeLog.MELT @@ -1,5 +1,10 @@ 2016-04-15 Basile Starynkevitch <basile@starynkevitch.net> + {{merging with even more of GCC 6, using subversion 1.9 + svn merge -r230101:230160 ^/trunk + }} + +2016-04-15 Basile Starynkevitch <basile@starynkevitch.net> {{trouble merging with GCC 6 svn rev 230222, should investigate}} 2016-04-15 Basile Starynkevitch <basile@starynkevitch.net> diff --git a/gcc/ChangeLog b/gcc/ChangeLog index 2100306a011..c29980930a9 100644 --- a/gcc/ChangeLog +++ b/gcc/ChangeLog @@ -1,3 +1,296 @@ +2015-11-11 Simon Dardis <simon.dardis@imgtec.com> + + * config/mips/mips.c (mips_breakable_sequence_p): New function. + (mips_break_sequence): New function. + (mips_reorg_process_insns): Use them. Use compact branches in selected + situations. + +2015-11-11 Alan Lawrence <alan.lawrence@arm.com> + + * fold-const.c (get_array_ctor_element_at_index): Fix whitespace, typo. + +2015-11-11 Jiong Wang <jiong.wang@arm.com> + Jim Wilson <wilson@gcc.gnu.org> + + PR target/67305 + * config/arm/arm.md (neon_vector_mem_operand): Return FALSE if strict + be true and eliminable registers mentioned. + +2015-11-11 Claudiu Zissulescu <claziss@synopsys.com> + + * common/config/arc/arc-common.c (arc_handle_option): Handle ARCv2 + options. + * config/arc/arc-opts.h: Add ARCv2 CPUs. + * config/arc/arc-protos.h (arc_secondary_reload_conv): Prototype. + * config/arc/arc.c (arc_secondary_reload): Handle subreg (reg) + situation, and store instructions with large offsets. + (arc_secondary_reload_conv): New function. + (arc_init): Add ARCv2 options. + (arc_conditional_register_usage): Select the proper register usage + for ARCv2 processors. + (arc_handle_interrupt_attribute): ILINK2 is only valid for ARCv1 + architecture. + (arc_compute_function_type): Likewise. + (arc_print_operand): Handle new ARCv2 punctuation characters. 
+ (arc_return_in_memory): ARCv2 ABI returns in registers up to 16 + bytes. + (workaround_arc_anomaly, arc_asm_insn_p, arc_loop_hazard): New + function. + (arc_reorg, arc_hazard): Use it. + * config/arc/arc.h (TARGET_CPU_CPP_BUILTINS): Define __HS__ and + __EM__. + (ASM_SPEC): Add ARCv2 options. + (TARGET_NORM): ARC HS has norm instructions by default. + (TARGET_OPTFPE): Use optimized floating point emulation for ARC + HS. + (TARGET_AT_DBR_CONDEXEC): Only for ARC600 family. + (TARGET_EM, TARGET_HS, TARGET_V2, TARGET_MPYW, TARGET_MULTI): + Define. + (SIGNED_INT16, TARGET_MPY, TARGET_ARC700_MPY, TARGET_ANY_MPY): + Likewise. + (TARGET_ARC600_FAMILY, TARGET_ARCOMPACT_FAMILY): Likewise. + (TARGET_LP_WR_INTERLOCK): Likewise. + * config/arc/arc.md + (commutative_binary_mult_comparison_result_used, movsicc_insn) + (mulsi3, mulsi3_600_lib, mulsidi3, mulsidi3_700, mulsi3_highpart) + (umulsi3_highpart_i, umulsi3_highpart_int, umulsi3_highpart) + (umulsidi3, umulsidi3_700, cstoresi4, simple_return, p_return_i): + Use it for ARCv2. + (mulhisi3, mulhisi3_imm, mulhisi3_reg, umulhisi3, umulhisi3_imm) + (umulhisi3_reg, umulhisi3_reg, mulsi3_v2, nopv, bswapsi2) + (prefetch, divsi3, udivsi3 modsi3, umodsi3, arcset, arcsetltu) + (arcsetgeu, arcsethi, arcsetls, reload_*_load, reload_*_store) + (extzvsi): New pattern. + * config/arc/arc.opt: New ARCv2 options. + * config/arc/arcEM.md: New file. + * config/arc/arcHS.md: Likewise. + * config/arc/constraints.md (C3p): New constraint, accepts 1 and 2 + values. + (Cm2): A signed 9-bit integer constant constraint. + (C62): An unsigned 6-bit integer constant constraint. + (C16): A signed 16-bit integer constant constraint. + * config/arc/predicates.md (mult_operator): Add ARCv2 processort. + (short_const_int_operand): New predicate. + * config/arc/t-arc-newlib: Add ARCv2 multilib options. + * doc/invoke.texi: Add documentation for -mcpu=<archs/arcem> + -mcode-density and -mdiv-rem. 
+ +2015-11-11 Julia Koval <julia.koval@intel.com> + + * config/i386/i386.c (m_SKYLAKE_AVX512): Fix typo. + +2015-11-11 Julia Koval <julia.koval@intel.com> + + * config/i386/i386.c: Handle "skylake" and + "skylake-avx512". + +2015-11-11 Martin Liska <mliska@suse.cz> + + * gimple-ssa-strength-reduction.c (create_phi_basis): + Use auto_vec. + * passes.c (release_dump_file_name): New function. + (pass_init_dump_file): Used from this function. + (pass_fini_dump_file): Likewise. + * tree-sra.c (convert_callers_for_node): Use xstrdup_for_dump. + * var-tracking.c (vt_initialize): Use pool_allocator. + +2015-11-11 Richard Biener <rguenth@gcc.gnu.org> + Jiong Wang <jiong.wang@arm.com> + + PR tree-optimization/68234 + * tree-vrp.c (vrp_visit_phi_node): Extend SCEV check to those loop PHI + node which estimiated to be VR_VARYING initially. + +2015-11-11 Robert Suchanek <robert.suchanek@imgtec.com> + + * regname.c (scan_rtx_reg): Check the matching number of consecutive + registers when tying chains. + (build_def_use): Move terminated_this_insn earlier in the function. + +2015-11-10 Mike Frysinger <vapier@gentoo.org> + + * configure.ac: Use = with test and not ==. + * configure: Regenerated. + +2015-11-11 David Edelsohn <dje.gcc@gmail.com> + + * config/rs6000/aix.h (TARGET_OS_AIX_CPP_BUILTINS): Add cpu and + machine asserts. Update defines for 64 bit. + +2015-11-11 Charles Baylis <charles.baylis@linaro.org> + + PR target/63870 + * config/arm/neon.md (neon_vld1_lane<mode>): Remove error for invalid + lane number. + (neon_vst1_lane<mode>): Likewise. + (neon_vld2_lane<mode>): Likewise. + (neon_vst2_lane<mode>): Likewise. + (neon_vld3_lane<mode>): Likewise. + (neon_vst3_lane<mode>): Likewise. + (neon_vld4_lane<mode>): Likewise. + (neon_vst4_lane<mode>): Likewise. + +2015-11-11 Charles Baylis <charles.baylis@linaro.org> + + PR target/63870 + * config/arm/arm-builtins.c: (arm_load1_qualifiers) Use + qualifier_struct_load_store_lane_index. + (arm_storestruct_lane_qualifiers) Likewise. 
+ * config/arm/neon.md: (neon_vld1_lane<mode>) Reverse lane numbers for + big-endian. + (neon_vst1_lane<mode>) Likewise. + (neon_vld2_lane<mode>) Likewise. + (neon_vst2_lane<mode>) Likewise. + (neon_vld3_lane<mode>) Likewise. + (neon_vst3_lane<mode>) Likewise. + (neon_vld4_lane<mode>) Likewise. + (neon_vst4_lane<mode>) Likewise. + +2015-11-11 Charles Baylis <charles.baylis@linaro.org> + + PR target/63870 + * config/arm/arm-builtins.c (enum arm_type_qualifiers): New enumerator + qualifier_struct_load_store_lane_index. + (builtin_arg): New enumerator NEON_ARG_STRUCT_LOAD_STORE_LANE_INDEX. + (arm_expand_neon_args): New parameter. Remove ellipsis. Handle NEON + argument qualifiers. + (arm_expand_neon_builtin): Handle new NEON argument qualifier. + * config/arm/arm.h (NEON_ENDIAN_LANE_N): New macro. + +2015-11-10 Nathan Sidwell <nathan@codesourcery.com> + + * config/nvptx/nvptx.opt (moptimize): New flag. + * config/nvptx/nvptx.c (nvptx_option_override): Set nvptx_optimize + default. + (nvptx_optimize_inner): New. + (nvptx_process_pars): Call it when optimizing. + * doc/invoke.texi (Nvidia PTX Options): Document -moptimize. + +2015-11-10 Bill Schmidt <wschmidt@linux.vnet.ibm.com> + + * config/rs6000/rs6000.c (rs6000_secondary_reload_direct_move): + Remove redundant code. + +2015-11-10 Jeff Law <law@redhat.com> + + * config/ft32/ft32.c (ft32_print_operand): Supply mode to + call to output_address. + * config/moxie/moxie.c (moxie_print_operand_address): Similarly. + Add unnamed machine_mode argument. + +2015-11-10 Michael Meissner <meissner@linux.vnet.ibm.com> + + * config.gcc (powerpc*-*-*, rs6000*-*-*): Add power9 to hosts that + default to 64-bit. + +2015-11-10 Uros Bizjak <ubizjak@gmail.com> + + * config/i386/i386.md (*movabs<mode>_1): Add explicit + size directives for -masm=intel. + (*movabs<mode>_2): Ditto. + +2015-11-10 Uros Bizjak <ubizjak@gmail.com> + + * config/i386/i386.c (ix86_print_operand): Remove dead code that + tried to avoid (%rip) for call operands. 
+ +2015-11-10 Uros Bizjak <ubizjak@gmail.com> + + * config/i386/i386.c (ix86_print_operand_address_as): Add no_rip + argument. Do not use RIP relative addressing when no_rip is set. + (ix86_print_operand): Update call to ix86_print_operand_address_as. + (ix86_print_operand_address): Ditto. + * config/i386/i386.md (*movabs<mode>_1): Use %P modifier for + absolute movabs operand 0. Add square braces for -masm=intel. + (*movabs<mode>_2): Ditto for operand 1. + +2015-11-10 Kyrylo Tkachov <kyrylo.tkachov@arm.com> + + * config/arm/arm.c (arm_new_rtx_costs, FIX case): Handle + combine_vcvtf2i pattern. + +2015-11-10 Kyrylo Tkachov <kyrylo.tkachov@arm.com> + + * config/arm/arm.c (neon_valid_immediate): Remove integer + CONST_DOUBLE handling. It should never occur. + +2015-11-10 Matthew Wahab <matthew.wahab@arm.com> + + * config/aarch64/atomics.md (unspecv): Move to iterators.md. + (ATOMIC_LDOP): Likewise. + (atomic_ldop): Likewise. + * config/aarch64/iterators.md (unspecv): Moved from atomics.md. + (ATOMIC_LDOP): Likewise. + (atomic_ldop): Likewise. + +2015-11-10 Martin Liska <mliska@suse.cz> + + * alloc-pool.h (allocate_raw): New function. + (operator new (size_t, object_allocator<T> &a)): Use the + function instead of object_allocator::allocate). + +2015-11-10 Ilya Enkovich <enkovich.gnu@gmail.com> + + * config/i386/sse.md (HALFMASKMODE): New attribute. + (DOUBLEMASKMODE): New attribute. + (vec_pack_trunc_qi): New. + (vec_pack_trunc_<mode>): New. + (vec_unpacks_lo_hi): New. + (vec_unpacks_lo_si): New. + (vec_unpacks_lo_di): New. + (vec_unpacks_hi_hi): New. + (vec_unpacks_hi_<mode>): New. + +2015-11-10 Ilya Enkovich <enkovich.gnu@gmail.com> + + * optabs.c (expand_binop_directly): Allow scalar mode for + vec_pack_trunc_optab. + * tree-vect-loop.c (vect_determine_vectorization_factor): Skip + boolean vector producers from pattern sequence when computing VF. + * tree-vect-patterns.c (vect_vect_recog_func_ptrs) Add + vect_recog_mask_conversion_pattern. 
+ (search_type_for_mask): Choose the smallest + type if different size types are mixed. + (build_mask_conversion): New. + (vect_recog_mask_conversion_pattern): New. + (vect_pattern_recog_1): Allow scalar mode for boolean vectype. + * tree-vect-stmts.c (vectorizable_mask_load_store): Support masked + load with pattern. + (vectorizable_conversion): Support boolean vectors. + (free_stmt_vec_info): Allow patterns for statements with no lhs. + * tree-vectorizer.h (NUM_PATTERNS): Increase to 14. + +2015-11-10 Ilya Enkovich <enkovich.gnu@gmail.com> + + * config/i386/i386-protos.h (ix86_expand_sse_movcc): New. + * config/i386/i386.c (ix86_expand_sse_movcc): Make public. + Cast mask to FP mode if required. + * config/i386/sse.md (vcond_mask_<mode><avx512fmaskmodelower>): New. + (vcond_mask_<mode><avx512fmaskmodelower>): New. + (vcond_mask_<mode><sseintvecmodelower>): New. + (vcond_mask_<mode><sseintvecmodelower>): New. + (vcond_mask_v2div2di): New. + (vcond_mask_<mode><sseintvecmodelower>): New. + (vcond_mask_<mode><sseintvecmodelower>): New. + +2015-11-10 Ilya Enkovich <enkovich.gnu@gmail.com> + + * optabs-query.h (get_vcond_mask_icode): New. + * optabs-tree.c (expand_vec_cond_expr_p): Use + get_vcond_mask_icode for VEC_COND_EXPR with mask. + * optabs.c (expand_vec_cond_mask_expr): New. + (expand_vec_cond_expr): Use get_vcond_mask_icode + when possible. + * optabs.def (vcond_mask_optab): New. + * tree-vect-patterns.c (vect_recog_bool_pattern): Don't + generate redundant comparison for COND_EXPR. + * tree-vect-stmts.c (vect_is_simple_cond): Allow SSA_NAME + as a condition. + (vectorizable_condition): Likewise. + * tree-vect-slp.c (vect_get_and_check_slp_defs): Allow + cond_exp with no embedded comparison. + (vect_build_slp_tree_1): Likewise. + 2015-11-10 Ilya Enkovich <enkovich.gnu@gmail.com> * config/i386/sse.md (maskload<mode>): Rename to ... @@ -302,6 +595,7 @@ Fix comment typo. 
2015-11-09 Michael Meissner <meissner@linux.vnet.ibm.com> + Peter Bergner <bergner@vnet.ibm.com> * config/rs6000/rs6000.opt (-mpower9-fusion): Add new switches for ISA 3.0 (power9). diff --git a/gcc/DATESTAMP b/gcc/DATESTAMP index 7ed3ab068e1..ef86fadfebb 100644 --- a/gcc/DATESTAMP +++ b/gcc/DATESTAMP @@ -1 +1 @@ -20151110 +20151111 diff --git a/gcc/alloc-pool.h b/gcc/alloc-pool.h index bf9b0ebd6ee..38aff284997 100644 --- a/gcc/alloc-pool.h +++ b/gcc/alloc-pool.h @@ -477,12 +477,25 @@ public: m_allocator.release_if_empty (); } + + /* Allocate memory for instance of type T and call a default constructor. */ + inline T * allocate () ATTRIBUTE_MALLOC { return ::new (m_allocator.allocate ()) T; } + /* Allocate memory for instance of type T and return void * that + could be used in situations where a default constructor is not provided + by the class T. */ + + inline void * + allocate_raw () ATTRIBUTE_MALLOC + { + return m_allocator.allocate (); + } + inline void remove (T *object) { @@ -528,7 +541,7 @@ template <typename T> inline void * operator new (size_t, object_allocator<T> &a) { - return a.allocate (); + return a.allocate_raw (); } /* Hashtable mapping alloc_pool names to descriptors. */ diff --git a/gcc/common/config/arc/arc-common.c b/gcc/common/config/arc/arc-common.c index 489bdb22533..c06f488d285 100644 --- a/gcc/common/config/arc/arc-common.c +++ b/gcc/common/config/arc/arc-common.c @@ -33,7 +33,7 @@ arc_option_init_struct (struct gcc_options *opts) { opts->x_flag_no_common = 255; /* Mark as not user-initialized. */ - /* Which cpu we're compiling for (ARC600, ARC601, ARC700). */ + /* Which cpu we're compiling for (ARC600, ARC601, ARC700, ARCv2). 
*/ arc_cpu = PROCESSOR_NONE; } @@ -68,6 +68,7 @@ arc_handle_option (struct gcc_options *opts, struct gcc_options *opts_set, { size_t code = decoded->opt_index; int value = decoded->value; + const char *arg = decoded->arg; switch (code) { @@ -91,9 +92,40 @@ arc_handle_option (struct gcc_options *opts, struct gcc_options *opts_set, if (! (opts_set->x_target_flags & MASK_BARREL_SHIFTER) ) opts->x_target_flags &= ~MASK_BARREL_SHIFTER; break; + case PROCESSOR_ARCHS: + if ( !(opts_set->x_target_flags & MASK_BARREL_SHIFTER)) + opts->x_target_flags |= MASK_BARREL_SHIFTER; /* Default: on. */ + if ( !(opts_set->x_target_flags & MASK_CODE_DENSITY)) + opts->x_target_flags |= MASK_CODE_DENSITY; /* Default: on. */ + if ( !(opts_set->x_target_flags & MASK_NORM_SET)) + opts->x_target_flags |= MASK_NORM_SET; /* Default: on. */ + if ( !(opts_set->x_target_flags & MASK_SWAP_SET)) + opts->x_target_flags |= MASK_SWAP_SET; /* Default: on. */ + if ( !(opts_set->x_target_flags & MASK_DIVREM)) + opts->x_target_flags |= MASK_DIVREM; /* Default: on. */ + break; + + case PROCESSOR_ARCEM: + if ( !(opts_set->x_target_flags & MASK_BARREL_SHIFTER)) + opts->x_target_flags |= MASK_BARREL_SHIFTER; /* Default: on. */ + if ( !(opts_set->x_target_flags & MASK_CODE_DENSITY)) + opts->x_target_flags &= ~MASK_CODE_DENSITY; /* Default: off. */ + if ( !(opts_set->x_target_flags & MASK_NORM_SET)) + opts->x_target_flags &= ~MASK_NORM_SET; /* Default: off. */ + if ( !(opts_set->x_target_flags & MASK_SWAP_SET)) + opts->x_target_flags &= ~MASK_SWAP_SET; /* Default: off. */ + if ( !(opts_set->x_target_flags & MASK_DIVREM)) + opts->x_target_flags &= ~MASK_DIVREM; /* Default: off. 
*/ + break; default: gcc_unreachable (); } + break; + + case OPT_mmpy_option_: + if (value < 0 || value > 9) + error_at (loc, "bad value %qs for -mmpy-option switch", arg); + break; } return true; diff --git a/gcc/config.gcc b/gcc/config.gcc index 9cc765e2bc1..59aee2cfdcd 100644 --- a/gcc/config.gcc +++ b/gcc/config.gcc @@ -439,7 +439,7 @@ powerpc*-*-*) cpu_type=rs6000 extra_headers="ppc-asm.h altivec.h spe.h ppu_intrinsics.h paired.h spu2vmx.h vec_types.h si2vmx.h htmintrin.h htmxlintrin.h" case x$with_cpu in - xpowerpc64|xdefault64|x6[23]0|x970|xG5|xpower[345678]|xpower6x|xrs64a|xcell|xa2|xe500mc64|xe5500|xe6500) + xpowerpc64|xdefault64|x6[23]0|x970|xG5|xpower[3456789]|xpower6x|xrs64a|xcell|xa2|xe500mc64|xe5500|xe6500) cpu_is_64bit=yes ;; esac @@ -4131,7 +4131,7 @@ case "${target}" in eval "with_$which=405" ;; "" | common | native \ - | power | power[2345678] | power6x | powerpc | powerpc64 \ + | power | power[23456789] | power6x | powerpc | powerpc64 \ | rios | rios1 | rios2 | rsc | rsc1 | rs64a \ | 401 | 403 | 405 | 405fp | 440 | 440fp | 464 | 464fp \ | 476 | 476fp | 505 | 601 | 602 | 603 | 603e | ec603e \ diff --git a/gcc/config/aarch64/atomics.md b/gcc/config/aarch64/atomics.md index e7ac5f6fc1c..3c034fb4376 100644 --- a/gcc/config/aarch64/atomics.md +++ b/gcc/config/aarch64/atomics.md @@ -18,34 +18,6 @@ ;; along with GCC; see the file COPYING3. If not see ;; <http://www.gnu.org/licenses/>. -(define_c_enum "unspecv" - [ - UNSPECV_LX ; Represent a load-exclusive. - UNSPECV_SX ; Represent a store-exclusive. - UNSPECV_LDA ; Represent an atomic load or load-acquire. - UNSPECV_STL ; Represent an atomic store or store-release. - UNSPECV_ATOMIC_CMPSW ; Represent an atomic compare swap. - UNSPECV_ATOMIC_EXCHG ; Represent an atomic exchange. - UNSPECV_ATOMIC_CAS ; Represent an atomic CAS. - UNSPECV_ATOMIC_SWP ; Represent an atomic SWP. - UNSPECV_ATOMIC_OP ; Represent an atomic operation. 
- UNSPECV_ATOMIC_LDOP ; Represent an atomic load-operation - UNSPECV_ATOMIC_LDOP_OR ; Represent an atomic load-or - UNSPECV_ATOMIC_LDOP_BIC ; Represent an atomic load-bic - UNSPECV_ATOMIC_LDOP_XOR ; Represent an atomic load-xor - UNSPECV_ATOMIC_LDOP_PLUS ; Represent an atomic load-add -]) - -;; Iterators for load-operate instructions. - -(define_int_iterator ATOMIC_LDOP - [UNSPECV_ATOMIC_LDOP_OR UNSPECV_ATOMIC_LDOP_BIC - UNSPECV_ATOMIC_LDOP_XOR UNSPECV_ATOMIC_LDOP_PLUS]) - -(define_int_attr atomic_ldop - [(UNSPECV_ATOMIC_LDOP_OR "set") (UNSPECV_ATOMIC_LDOP_BIC "clr") - (UNSPECV_ATOMIC_LDOP_XOR "eor") (UNSPECV_ATOMIC_LDOP_PLUS "add")]) - ;; Instruction patterns. (define_expand "atomic_compare_and_swap<mode>" diff --git a/gcc/config/aarch64/iterators.md b/gcc/config/aarch64/iterators.md index c4a1c9888ea..c2eb7dec99d 100644 --- a/gcc/config/aarch64/iterators.md +++ b/gcc/config/aarch64/iterators.md @@ -306,6 +306,29 @@ UNSPEC_VEC_SHR ; Used in aarch64-simd.md. ]) +;; ------------------------------------------------------------------ +;; Unspec enumerations for Atomics. They are here so that they can be +;; used in the int_iterators for atomic operations. +;; ------------------------------------------------------------------ + +(define_c_enum "unspecv" + [ + UNSPECV_LX ; Represent a load-exclusive. + UNSPECV_SX ; Represent a store-exclusive. + UNSPECV_LDA ; Represent an atomic load or load-acquire. + UNSPECV_STL ; Represent an atomic store or store-release. + UNSPECV_ATOMIC_CMPSW ; Represent an atomic compare swap. + UNSPECV_ATOMIC_EXCHG ; Represent an atomic exchange. + UNSPECV_ATOMIC_CAS ; Represent an atomic CAS. + UNSPECV_ATOMIC_SWP ; Represent an atomic SWP. + UNSPECV_ATOMIC_OP ; Represent an atomic operation. 
+ UNSPECV_ATOMIC_LDOP ; Represent an atomic load-operation + UNSPECV_ATOMIC_LDOP_OR ; Represent an atomic load-or + UNSPECV_ATOMIC_LDOP_BIC ; Represent an atomic load-bic + UNSPECV_ATOMIC_LDOP_XOR ; Represent an atomic load-xor + UNSPECV_ATOMIC_LDOP_PLUS ; Represent an atomic load-add +]) + ;; ------------------------------------------------------------------- ;; Mode attributes ;; ------------------------------------------------------------------- @@ -965,6 +988,16 @@ (define_int_iterator CRYPTO_SHA256 [UNSPEC_SHA256H UNSPEC_SHA256H2]) +;; Iterators for atomic operations. + +(define_int_iterator ATOMIC_LDOP + [UNSPECV_ATOMIC_LDOP_OR UNSPECV_ATOMIC_LDOP_BIC + UNSPECV_ATOMIC_LDOP_XOR UNSPECV_ATOMIC_LDOP_PLUS]) + +(define_int_attr atomic_ldop + [(UNSPECV_ATOMIC_LDOP_OR "set") (UNSPECV_ATOMIC_LDOP_BIC "clr") + (UNSPECV_ATOMIC_LDOP_XOR "eor") (UNSPECV_ATOMIC_LDOP_PLUS "add")]) + ;; ------------------------------------------------------------------- ;; Int Iterators Attributes. ;; ------------------------------------------------------------------- diff --git a/gcc/config/arc/arc-opts.h b/gcc/config/arc/arc-opts.h index cca1f035636..a33f4b77521 100644 --- a/gcc/config/arc/arc-opts.h +++ b/gcc/config/arc/arc-opts.h @@ -23,5 +23,7 @@ enum processor_type PROCESSOR_NONE, PROCESSOR_ARC600, PROCESSOR_ARC601, - PROCESSOR_ARC700 + PROCESSOR_ARC700, + PROCESSOR_ARCEM, + PROCESSOR_ARCHS }; diff --git a/gcc/config/arc/arc-protos.h b/gcc/config/arc/arc-protos.h index ff82ecf63dd..6e04351159b 100644 --- a/gcc/config/arc/arc-protos.h +++ b/gcc/config/arc/arc-protos.h @@ -118,3 +118,4 @@ extern bool arc_epilogue_uses (int regno); extern int regno_clobbered_p (unsigned int, rtx_insn *, machine_mode, int); extern int arc_return_slot_offset (void); extern bool arc_legitimize_reload_address (rtx *, machine_mode, int, int); +extern void arc_secondary_reload_conv (rtx, rtx, rtx, bool); diff --git a/gcc/config/arc/arc.c b/gcc/config/arc/arc.c index 01261bc702a..85d53e4d2e3 100644 --- 
a/gcc/config/arc/arc.c +++ b/gcc/config/arc/arc.c @@ -590,10 +590,26 @@ arc_sched_adjust_priority (rtx_insn *insn, int priority) return priority; } +/* For ARC base register + offset addressing, the validity of the + address is mode-dependent for most of the offset range, as the + offset can be scaled by the access size. + We don't expose these as mode-dependent addresses in the + mode_dependent_address_p target hook, because that would disable + lots of optimizations, and most uses of these addresses are for 32 + or 64 bit accesses anyways, which are fine. + However, that leaves some addresses for 8 / 16 bit values not + properly reloaded by the generic code, which is why we have to + schedule secondary reloads for these. */ + static reg_class_t -arc_secondary_reload (bool in_p, rtx x, reg_class_t cl, machine_mode, - secondary_reload_info *) +arc_secondary_reload (bool in_p, + rtx x, + reg_class_t cl, + machine_mode mode, + secondary_reload_info *sri) { + enum rtx_code code = GET_CODE (x); + if (cl == DOUBLE_REGS) return GENERAL_REGS; @@ -601,9 +617,86 @@ arc_secondary_reload (bool in_p, rtx x, reg_class_t cl, machine_mode, if ((cl == LPCOUNT_REG || cl == WRITABLE_CORE_REGS) && in_p && MEM_P (x)) return GENERAL_REGS; + + /* If we have a subreg (reg), where reg is a pseudo (that will end in + a memory location), then we may need a scratch register to handle + the fp/sp+largeoffset address. */ + if (code == SUBREG) + { + rtx addr = NULL_RTX; + x = SUBREG_REG (x); + + if (REG_P (x)) + { + int regno = REGNO (x); + if (regno >= FIRST_PSEUDO_REGISTER) + regno = reg_renumber[regno]; + + if (regno != -1) + return NO_REGS; + + /* It is a pseudo that ends in a stack location. */ + if (reg_equiv_mem (REGNO (x))) + { + /* Get the equivalent address and check the range of the + offset. 
*/ + rtx mem = reg_equiv_mem (REGNO (x)); + addr = find_replacement (&XEXP (mem, 0)); + } + } + else + { + gcc_assert (MEM_P (x)); + addr = XEXP (x, 0); + addr = simplify_rtx (addr); + } + if (addr && GET_CODE (addr) == PLUS + && CONST_INT_P (XEXP (addr, 1)) + && (!RTX_OK_FOR_OFFSET_P (mode, XEXP (addr, 1)))) + { + switch (mode) + { + case QImode: + sri->icode = + in_p ? CODE_FOR_reload_qi_load : CODE_FOR_reload_qi_store; + break; + case HImode: + sri->icode = + in_p ? CODE_FOR_reload_hi_load : CODE_FOR_reload_hi_store; + break; + default: + break; + } + } + } return NO_REGS; } +/* Convert reloads using offsets that are too large to use indirect + addressing. */ + +void +arc_secondary_reload_conv (rtx reg, rtx mem, rtx scratch, bool store_p) +{ + rtx addr; + + gcc_assert (GET_CODE (mem) == MEM); + addr = XEXP (mem, 0); + + /* Large offset: use a move. FIXME: ld ops accepts limms as + offsets. Hence, the following move insn is not required. */ + emit_move_insn (scratch, addr); + mem = replace_equiv_address_nv (mem, scratch); + + /* Now create the move. */ + if (store_p) + emit_insn (gen_rtx_SET (mem, reg)); + else + emit_insn (gen_rtx_SET (reg, mem)); + + return; +} + static unsigned arc_ifcvt (void); namespace { @@ -687,23 +780,35 @@ arc_init (void) { enum attr_tune tune_dflt = TUNE_NONE; - if (TARGET_ARC600) + switch (arc_cpu) { + case PROCESSOR_ARC600: arc_cpu_string = "ARC600"; tune_dflt = TUNE_ARC600; - } - else if (TARGET_ARC601) - { + break; + + case PROCESSOR_ARC601: arc_cpu_string = "ARC601"; tune_dflt = TUNE_ARC600; - } - else if (TARGET_ARC700) - { + break; + + case PROCESSOR_ARC700: arc_cpu_string = "ARC700"; tune_dflt = TUNE_ARC700_4_2_STD; + break; + + case PROCESSOR_ARCEM: + arc_cpu_string = "EM"; + break; + + case PROCESSOR_ARCHS: + arc_cpu_string = "HS"; + break; + + default: + gcc_unreachable (); } - else - gcc_unreachable (); + if (arc_tune == TUNE_NONE) arc_tune = tune_dflt; /* Note: arc_multcost is only used in rtx_cost if speed is true. 
*/ @@ -737,15 +842,15 @@ arc_init (void) } /* Support mul64 generation only for ARC600. */ - if (TARGET_MUL64_SET && TARGET_ARC700) - error ("-mmul64 not supported for ARC700"); + if (TARGET_MUL64_SET && (!TARGET_ARC600_FAMILY)) + error ("-mmul64 not supported for ARC700 or ARCv2"); - /* MPY instructions valid only for ARC700. */ - if (TARGET_NOMPY_SET && !TARGET_ARC700) - error ("-mno-mpy supported only for ARC700"); + /* MPY instructions valid only for ARC700 or ARCv2. */ + if (TARGET_NOMPY_SET && TARGET_ARC600_FAMILY) + error ("-mno-mpy supported only for ARC700 or ARCv2"); /* mul/mac instructions only for ARC600. */ - if (TARGET_MULMAC_32BY16_SET && !(TARGET_ARC600 || TARGET_ARC601)) + if (TARGET_MULMAC_32BY16_SET && (!TARGET_ARC600_FAMILY)) error ("-mmul32x16 supported only for ARC600 or ARC601"); if (!TARGET_DPFP && TARGET_DPFP_DISABLE_LRSR) @@ -757,18 +862,25 @@ arc_init (void) error ("FPX fast and compact options cannot be specified together"); /* FPX-2. No fast-spfp for arc600 or arc601. */ - if (TARGET_SPFP_FAST_SET && (TARGET_ARC600 || TARGET_ARC601)) + if (TARGET_SPFP_FAST_SET && TARGET_ARC600_FAMILY) error ("-mspfp_fast not available on ARC600 or ARC601"); /* FPX-3. No FPX extensions on pre-ARC600 cores. */ if ((TARGET_DPFP || TARGET_SPFP) - && !(TARGET_ARC600 || TARGET_ARC601 || TARGET_ARC700)) + && !TARGET_ARCOMPACT_FAMILY) error ("FPX extensions not available on pre-ARC600 cores"); + /* Only selected multiplier configurations are available for HS. */ + if (TARGET_HS && ((arc_mpy_option > 2 && arc_mpy_option < 7) + || (arc_mpy_option == 1))) + error ("This multiplier configuration is not available for HS cores"); + /* Warn for unimplemented PIC in pre-ARC700 cores, and disable flag_pic. */ - if (flag_pic && !TARGET_ARC700) + if (flag_pic && TARGET_ARC600_FAMILY) { - warning (DK_WARNING, "PIC is not supported for %s. Generating non-PIC code only..", arc_cpu_string); + warning (DK_WARNING, + "PIC is not supported for %s. 
Generating non-PIC code only..", + arc_cpu_string); flag_pic = 0; } @@ -782,6 +894,8 @@ arc_init (void) arc_punct_chars['!'] = 1; arc_punct_chars['^'] = 1; arc_punct_chars['&'] = 1; + arc_punct_chars['+'] = 1; + arc_punct_chars['_'] = 1; if (optimize > 1 && !TARGET_NO_COND_EXEC) { @@ -825,7 +939,7 @@ arc_override_options (void) if (flag_no_common == 255) flag_no_common = !TARGET_NO_SDATA_SET; - /* TARGET_COMPACT_CASESI needs the "q" register class. */ \ + /* TARGET_COMPACT_CASESI needs the "q" register class. */ if (TARGET_MIXED_CODE) TARGET_Q_CLASS = 1; if (!TARGET_Q_CLASS) @@ -1198,6 +1312,8 @@ arc_init_reg_tables (void) char rname57[5] = "r57"; char rname58[5] = "r58"; char rname59[5] = "r59"; + char rname29[7] = "ilink1"; + char rname30[7] = "ilink2"; static void arc_conditional_register_usage (void) @@ -1206,6 +1322,14 @@ arc_conditional_register_usage (void) int i; int fix_start = 60, fix_end = 55; + if (TARGET_V2) + { + /* For ARCv2 the core register set is changed. */ + strcpy (rname29, "ilink"); + strcpy (rname30, "r30"); + fixed_regs[30] = call_used_regs[30] = 1; + } + if (TARGET_MUL64_SET) { fix_start = 57; @@ -1271,7 +1395,7 @@ arc_conditional_register_usage (void) machine_dependent_reorg. */ if (TARGET_ARC600) CLEAR_HARD_REG_BIT (reg_class_contents[SIBCALL_REGS], LP_COUNT); - else if (!TARGET_ARC700) + else if (!TARGET_LP_WR_INTERLOCK) fixed_regs[LP_COUNT] = 1; for (regno = 0; regno < FIRST_PSEUDO_REGISTER; regno++) if (!call_used_regs[regno]) @@ -1279,7 +1403,7 @@ arc_conditional_register_usage (void) for (regno = 32; regno < 60; regno++) if (!fixed_regs[regno]) SET_HARD_REG_BIT (reg_class_contents[WRITABLE_CORE_REGS], regno); - if (TARGET_ARC700) + if (!TARGET_ARC600_FAMILY) { for (regno = 32; regno <= 60; regno++) CLEAR_HARD_REG_BIT (reg_class_contents[CHEAP_CORE_REGS], regno); @@ -1313,7 +1437,7 @@ arc_conditional_register_usage (void) = (fixed_regs[i] ? (TEST_HARD_REG_BIT (reg_class_contents[CHEAP_CORE_REGS], i) ? 
CHEAP_CORE_REGS : ALL_CORE_REGS) - : ((TARGET_ARC700 + : (((!TARGET_ARC600_FAMILY) && TEST_HARD_REG_BIT (reg_class_contents[CHEAP_CORE_REGS], i)) ? CHEAP_CORE_REGS : WRITABLE_CORE_REGS)); else @@ -1331,7 +1455,8 @@ arc_conditional_register_usage (void) /* Handle Special Registers. */ arc_regno_reg_class[29] = LINK_REGS; /* ilink1 register. */ - arc_regno_reg_class[30] = LINK_REGS; /* ilink2 register. */ + if (!TARGET_V2) + arc_regno_reg_class[30] = LINK_REGS; /* ilink2 register. */ arc_regno_reg_class[31] = LINK_REGS; /* blink register. */ arc_regno_reg_class[60] = LPCOUNT_REG; arc_regno_reg_class[61] = NO_REGS; /* CC_REG: must be NO_REGS. */ @@ -1413,13 +1538,23 @@ arc_handle_interrupt_attribute (tree *, tree name, tree args, int, *no_add_attrs = true; } else if (strcmp (TREE_STRING_POINTER (value), "ilink1") - && strcmp (TREE_STRING_POINTER (value), "ilink2")) + && strcmp (TREE_STRING_POINTER (value), "ilink2") + && !TARGET_V2) { warning (OPT_Wattributes, "argument of %qE attribute is not \"ilink1\" or \"ilink2\"", name); *no_add_attrs = true; } + else if (TARGET_V2 + && strcmp (TREE_STRING_POINTER (value), "ilink")) + { + warning (OPT_Wattributes, + "argument of %qE attribute is not \"ilink\"", + name); + *no_add_attrs = true; + } + return NULL_TREE; } @@ -1931,7 +2066,8 @@ arc_compute_function_type (struct function *fun) { tree value = TREE_VALUE (args); - if (!strcmp (TREE_STRING_POINTER (value), "ilink1")) + if (!strcmp (TREE_STRING_POINTER (value), "ilink1") + || !strcmp (TREE_STRING_POINTER (value), "ilink")) fn_type = ARC_FUNCTION_ILINK1; else if (!strcmp (TREE_STRING_POINTER (value), "ilink2")) fn_type = ARC_FUNCTION_ILINK2; @@ -3115,6 +3251,18 @@ arc_print_operand (FILE *file, rtx x, int code) if (TARGET_ANNOTATE_ALIGN && cfun->machine->size_reason) fprintf (file, "; unalign: %d", cfun->machine->unalign); return; + case '+': + if (TARGET_V2) + fputs ("m", file); + else + fputs ("h", file); + return; + case '_': + if (TARGET_V2) + fputs ("h", file); + 
else + fputs ("w", file); + return; default : /* Unknown flag. */ output_operand_lossage ("invalid operand output code"); @@ -4224,7 +4372,7 @@ arc_rtx_costs (rtx x, machine_mode mode, int outer_code, *total= arc_multcost; /* We do not want synth_mult sequences when optimizing for size. */ - else if (TARGET_MUL64_SET || (TARGET_ARC700 && !TARGET_NOMPY_SET)) + else if (TARGET_MUL64_SET || TARGET_ARC700_MPY) *total = COSTS_N_INSNS (1); else *total = COSTS_N_INSNS (2); @@ -5639,7 +5787,7 @@ arc_return_in_memory (const_tree type, const_tree fntype ATTRIBUTE_UNUSED) else { HOST_WIDE_INT size = int_size_in_bytes (type); - return (size == -1 || size > 8); + return (size == -1 || size > (TARGET_V2 ? 16 : 8)); } } @@ -5737,6 +5885,26 @@ arc_invalid_within_doloop (const rtx_insn *insn) return NULL; } +/* The same functionality as arc_hazard. It is called in machine + reorg before any other optimization. Hence, the NOP size is taken + into account when doing branch shortening. */ + +static void +workaround_arc_anomaly (void) +{ + rtx_insn *insn, *succ0; + + /* For any architecture: call arc_hazard here. */ + for (insn = get_insns (); insn; insn = NEXT_INSN (insn)) + { + succ0 = next_real_insn (insn); + if (arc_hazard (insn, succ0)) + { + emit_insn_before (gen_nopv (), succ0); + } + } +} + static int arc_reorg_in_progress = 0; /* ARC's machince specific reorg function. */ @@ -5750,6 +5918,8 @@ arc_reorg (void) long offset; int changed; + workaround_arc_anomaly (); + cfun->machine->arc_reorg_started = 1; arc_reorg_in_progress = 1; @@ -7758,6 +7928,109 @@ arc600_corereg_hazard (rtx_insn *pred, rtx_insn *succ) return 0; } +/* Given a rtx, check if it is an assembly instruction or not. 
*/ + +static int +arc_asm_insn_p (rtx x) +{ + int i, j; + + if (x == 0) + return 0; + + switch (GET_CODE (x)) + { + case ASM_OPERANDS: + case ASM_INPUT: + return 1; + + case SET: + return arc_asm_insn_p (SET_SRC (x)); + + case PARALLEL: + j = 0; + for (i = XVECLEN (x, 0) - 1; i >= 0; i--) + j += arc_asm_insn_p (XVECEXP (x, 0, i)); + if (j > 0) + return 1; + break; + + default: + break; + } + + return 0; +} + +/* We might have a CALL to a non-returning function before a loop end. + ??? Although the manual says that's OK (the target is outside the + loop, and the loop counter unused there), the assembler barfs on + this for ARC600, so we must insert a nop before such a call too. + For ARC700 and ARCv2, the last ZOL instruction must not be a jump + to a location where lp_count is modified. */ + +static bool +arc_loop_hazard (rtx_insn *pred, rtx_insn *succ) +{ + rtx_insn *jump = NULL; + rtx_insn *label = NULL; + basic_block succ_bb; + + if (recog_memoized (succ) != CODE_FOR_doloop_end_i) + return false; + + /* Phase 1: ARC600 and ARCv2HS don't allow any control instruction + (i.e., jump/call) as the last instruction of a ZOL. */ + if (TARGET_ARC600 || TARGET_HS) + if (JUMP_P (pred) || CALL_P (pred) + || arc_asm_insn_p (PATTERN (pred)) + || GET_CODE (PATTERN (pred)) == SEQUENCE) + return true; + + /* Phase 2: On any architecture, the last ZOL instruction must not + be a jump to a location where lp_count is modified. */ + + /* Phase 2a: Dig for the jump instruction. */ + if (JUMP_P (pred)) + jump = pred; + else if (GET_CODE (PATTERN (pred)) == SEQUENCE + && JUMP_P (XVECEXP (PATTERN (pred), 0, 0))) + jump = as_a <rtx_insn *> XVECEXP (PATTERN (pred), 0, 0); + else + return false; + + label = JUMP_LABEL_AS_INSN (jump); + if (!label) + return false; + + /* Phase 2b: Make sure it is not a millicode jump. 
*/ + if ((GET_CODE (PATTERN (jump)) == PARALLEL) + && (XVECEXP (PATTERN (jump), 0, 0) == ret_rtx)) + return false; + + /* Phase 2c: Make sure it is not a simple_return. */ + if ((GET_CODE (PATTERN (jump)) == SIMPLE_RETURN) + || (GET_CODE (label) == SIMPLE_RETURN)) + return false; + + /* Phase 2d: Go to the target of the jump and check for liveness of + the LP_COUNT register. */ + succ_bb = BLOCK_FOR_INSN (label); + if (!succ_bb) + { + gcc_assert (NEXT_INSN (label)); + if (NOTE_INSN_BASIC_BLOCK_P (NEXT_INSN (label))) + succ_bb = NOTE_BASIC_BLOCK (NEXT_INSN (label)); + else + succ_bb = BLOCK_FOR_INSN (NEXT_INSN (label)); + } + + if (succ_bb && REGNO_REG_SET_P (df_get_live_out (succ_bb), LP_COUNT)) + return true; + + return false; +} + /* For ARC600: A write to a core reg greater or equal to 32 must not be immediately followed by a use. Anticipate the length requirement to insert a nop @@ -7766,19 +8039,16 @@ arc600_corereg_hazard (rtx_insn *pred, rtx_insn *succ) int arc_hazard (rtx_insn *pred, rtx_insn *succ) { - if (!TARGET_ARC600) - return 0; if (!pred || !INSN_P (pred) || !succ || !INSN_P (succ)) return 0; - /* We might have a CALL to a non-returning function before a loop end. - ??? Although the manual says that's OK (the target is outside the loop, - and the loop counter unused there), the assembler barfs on this, so we - must instert a nop before such a call too. */ - if (recog_memoized (succ) == CODE_FOR_doloop_end_i - && (JUMP_P (pred) || CALL_P (pred) - || GET_CODE (PATTERN (pred)) == SEQUENCE)) + + if (arc_loop_hazard (pred, succ)) return 4; - return arc600_corereg_hazard (pred, succ); + + if (TARGET_ARC600) + return arc600_corereg_hazard (pred, succ); + + return 0; } /* Return length adjustment for INSN. */ diff --git a/gcc/config/arc/arc.h b/gcc/config/arc/arc.h index e8baf5b8d79..d312f9f14a7 100644 --- a/gcc/config/arc/arc.h +++ b/gcc/config/arc/arc.h @@ -80,6 +80,14 @@ along with GCC; see the file COPYING3. 
If not see builtin_define ("__A7__"); \ builtin_define ("__ARC700__"); \ } \ + else if (TARGET_EM) \ + { \ + builtin_define ("__EM__"); \ + } \ + else if (TARGET_HS) \ + { \ + builtin_define ("__HS__"); \ + } \ if (TARGET_NORM) \ { \ builtin_define ("__ARC_NORM__");\ @@ -143,6 +151,8 @@ along with GCC; see the file COPYING3. If not see %{mcpu=ARC700|!mcpu=*:%{mlock}} \ %{mcpu=ARC700|!mcpu=*:%{mswape}} \ %{mcpu=ARC700|!mcpu=*:%{mrtsc}} \ +%{mcpu=ARCHS:-mHS} \ +%{mcpu=ARCEM:-mEM} \ " #if DEFAULT_LIBC == LIBC_UCLIBC @@ -246,12 +256,13 @@ along with GCC; see the file COPYING3. If not see /* Non-zero means the cpu supports norm instruction. This flag is set by default for A7, and only for pre A7 cores when -mnorm is given. */ -#define TARGET_NORM (TARGET_ARC700 || TARGET_NORM_SET) +#define TARGET_NORM (TARGET_ARC700 || TARGET_NORM_SET || TARGET_HS) /* Indicate if an optimized floating point emulation library is available. */ #define TARGET_OPTFPE \ (TARGET_ARC700 \ /* We need a barrel shifter and NORM. */ \ - || (TARGET_ARC600 && TARGET_NORM_SET)) + || (TARGET_ARC600 && TARGET_NORM_SET) \ + || TARGET_HS) /* Non-zero means the cpu supports swap instruction. This flag is set by default for A7, and only for pre A7 cores when -mswap is given. */ @@ -271,11 +282,15 @@ along with GCC; see the file COPYING3. If not see /* For an anulled-true delay slot insn for a delayed branch, should we only use conditional execution? */ -#define TARGET_AT_DBR_CONDEXEC (!TARGET_ARC700) +#define TARGET_AT_DBR_CONDEXEC (!TARGET_ARC700 && !TARGET_V2) #define TARGET_ARC600 (arc_cpu == PROCESSOR_ARC600) #define TARGET_ARC601 (arc_cpu == PROCESSOR_ARC601) #define TARGET_ARC700 (arc_cpu == PROCESSOR_ARC700) +#define TARGET_EM (arc_cpu == PROCESSOR_ARCEM) +#define TARGET_HS (arc_cpu == PROCESSOR_ARCHS) +#define TARGET_V2 \ + ((arc_cpu == PROCESSOR_ARCHS) || (arc_cpu == PROCESSOR_ARCEM)) /* Recast the cpu class to be the cpu attribute. 
*/ #define arc_cpu_attr ((enum attr_cpu)arc_cpu) @@ -744,6 +759,7 @@ extern enum reg_class arc_regno_reg_class[]; ((unsigned) (((X) >> (SHIFT)) + 0x100) \ < 0x200 - ((unsigned) (OFFSET) >> (SHIFT))) #define SIGNED_INT12(X) ((unsigned) ((X) + 0x800) < 0x1000) +#define SIGNED_INT16(X) ((unsigned) ((X) + 0x8000) < 0x10000) #define LARGE_INT(X) \ (((X) < 0) \ ? (X) >= (-(HOST_WIDE_INT) 0x7fffffff - 1) \ @@ -1305,6 +1321,7 @@ do { \ #endif #define SET_ASM_OP "\t.set\t" +extern char rname29[], rname30[]; extern char rname56[], rname57[], rname58[], rname59[]; /* How to refer to registers in assembler output. This sequence is indexed by compiler's hard-register-number (see above). */ @@ -1312,7 +1329,7 @@ extern char rname56[], rname57[], rname58[], rname59[]; { "r0", "r1", "r2", "r3", "r4", "r5", "r6", "r7", \ "r8", "r9", "r10", "r11", "r12", "r13", "r14", "r15", \ "r16", "r17", "r18", "r19", "r20", "r21", "r22", "r23", \ - "r24", "r25", "gp", "fp", "sp", "ilink1", "ilink2", "blink", \ + "r24", "r25", "gp", "fp", "sp", rname29, rname30, "blink", \ "r32", "r33", "r34", "r35", "r36", "r37", "r38", "r39", \ "d1", "d1", "d2", "d2", "r44", "r45", "r46", "r47", \ "r48", "r49", "r50", "r51", "r52", "r53", "r54", "r55", \ @@ -1678,4 +1695,25 @@ enum #define SFUNC_CHECK_PREDICABLE \ (GET_CODE (PATTERN (insn)) != COND_EXEC || !flag_pic || !TARGET_MEDIUM_CALLS) +/* MPYW feature macro. Only valid for ARCHS and ARCEM cores. */ +#define TARGET_MPYW ((arc_mpy_option > 0) && TARGET_V2) +/* Full ARCv2 multiplication feature macro. */ +#define TARGET_MULTI ((arc_mpy_option > 1) && TARGET_V2) +/* General MPY feature macro. */ +#define TARGET_MPY ((TARGET_ARC700 && (!TARGET_NOMPY_SET)) || TARGET_MULTI) +/* ARC700 MPY feature macro. */ +#define TARGET_ARC700_MPY (TARGET_ARC700 && (!TARGET_NOMPY_SET)) +/* Any multiplication feature macro. */ +#define TARGET_ANY_MPY \ + (TARGET_MPY || TARGET_MUL64_SET || TARGET_MULMAC_32BY16_SET) + +/* ARC600 and ARC601 feature macro. 
*/ +#define TARGET_ARC600_FAMILY (TARGET_ARC600 || TARGET_ARC601) +/* ARC600, ARC601 and ARC700 feature macro. */ +#define TARGET_ARCOMPACT_FAMILY \ + (TARGET_ARC600 || TARGET_ARC601 || TARGET_ARC700) +/* The loop count register can be read in the very next instruction + after it has been written by an ordinary instruction. */ +#define TARGET_LP_WR_INTERLOCK (!TARGET_ARC600_FAMILY) + #endif /* GCC_ARC_H */ diff --git a/gcc/config/arc/arc.md b/gcc/config/arc/arc.md index e1da4d70085..1d070a30d82 100644 --- a/gcc/config/arc/arc.md +++ b/gcc/config/arc/arc.md @@ -84,6 +84,8 @@ ;; Include DFA scheduluers (include ("arc600.md")) (include ("arc700.md")) +(include ("arcEM.md")) +(include ("arcHS.md")) ;; Predicates @@ -124,6 +126,7 @@ (VUNSPEC_SR 26) ; blockage insn for writing to an auxiliary register (VUNSPEC_TRAP_S 27) ; blockage insn for trap_s generation (VUNSPEC_UNIMP_S 28) ; blockage insn for unimp_s generation + (VUNSPEC_NOP 29) ; volatile NOP (R0_REG 0) (R1_REG 1) @@ -165,7 +168,7 @@ simd_varith_with_acc, simd_vlogic, simd_vlogic_with_acc, simd_vcompare, simd_vpermute, simd_vpack, simd_vpack_with_acc, simd_valign, simd_valign_with_acc, simd_vcontrol, - simd_vspecial_3cycle, simd_vspecial_4cycle, simd_dma" + simd_vspecial_3cycle, simd_vspecial_4cycle, simd_dma, mul16_em, div_rem" (cond [(eq_attr "is_sfunc" "yes") (cond [(match_test "!TARGET_LONG_CALLS_SET && (!TARGET_MEDIUM_CALLS || GET_CODE (PATTERN (insn)) != COND_EXEC)") (const_string "call") (match_test "flag_pic") (const_string "sfunc")] @@ -188,7 +191,7 @@ ;; Attribute describing the processor -(define_attr "cpu" "none,ARC600,ARC700" +(define_attr "cpu" "none,ARC600,ARC700,ARCEM,ARCHS" (const (symbol_ref "arc_cpu_attr"))) ;; true for compact instructions (those with _s suffix) @@ -226,8 +229,21 @@ (symbol_ref "get_attr_length (NEXT_INSN (PREV_INSN (insn))) - get_attr_length (insn)"))) +; for ARCv2 we need to disable/enable different instruction alternatives +(define_attr "cpu_facility" "std,av1,av2" + (const_string 
"std")) -(define_attr "enabled" "no,yes" (const_string "yes")) +; We should consider all the instructions enabled until otherwise +(define_attr "enabled" "no,yes" + (cond [(and (eq_attr "cpu_facility" "av1") + (match_test "TARGET_V2")) + (const_string "no") + + (and (eq_attr "cpu_facility" "av2") + (not (match_test "TARGET_V2"))) + (const_string "no") + ] + (const_string "yes"))) (define_attr "predicable" "no,yes" (const_string "no")) ;; if 'predicable' were not so brain-dead, we would specify: @@ -580,7 +596,8 @@ stb%U0%V0 %1,%0" [(set_attr "type" "move,move,move,move,move,move,move,load,store,load,load,store,store") (set_attr "iscompact" "maybe,maybe,maybe,false,false,false,false,true,true,true,false,false,false") - (set_attr "predicable" "yes,no,yes,yes,no,yes,yes,no,no,no,no,no,no")]) + (set_attr "predicable" "yes,no,yes,yes,no,yes,yes,no,no,no,no,no,no") + (set_attr "cpu_facility" "*,*,av1,*,*,*,*,*,*,*,*,*,*")]) (define_expand "movhi" [(set (match_operand:HI 0 "move_dest_operand" "") @@ -607,15 +624,16 @@ mov%? %0,%1 mov%? %0,%S1%& mov%? %0,%S1 - ldw%? %0,%1%& - stw%? %1,%0%& - ldw%U1%V1 %0,%1 - stw%U0%V0 %1,%0 - stw%U0%V0 %1,%0 - stw%U0%V0 %S1,%0" + ld%_%? %0,%1%& + st%_%? %1,%0%& + ld%_%U1%V1 %0,%1 + st%_%U0%V0 %1,%0 + st%_%U0%V0 %1,%0 + st%_%U0%V0 %S1,%0" [(set_attr "type" "move,move,move,move,move,move,move,move,load,store,load,store,store,store") (set_attr "iscompact" "maybe,maybe,maybe,false,false,false,maybe_limm,false,true,true,false,false,false,false") - (set_attr "predicable" "yes,no,yes,yes,no,yes,yes,yes,no,no,no,no,no,no")]) + (set_attr "predicable" "yes,no,yes,yes,no,yes,yes,yes,no,no,no,no,no,no") + (set_attr "cpu_facility" "*,*,av1,*,*,*,*,*,*,*,*,*,*,*")]) (define_expand "movsi" [(set (match_operand:SI 0 "move_dest_operand" "") @@ -669,7 +687,8 @@ ; Use default length for iscompact to allow for COND_EXEC. But set length ; of Crr to 4. 
(set_attr "length" "*,*,*,4,4,4,4,8,8,*,8,*,*,*,*,*,*,*,*,8") - (set_attr "predicable" "yes,no,yes,yes,no,no,yes,no,no,yes,yes,no,no,no,no,no,no,no,no,no")]) + (set_attr "predicable" "yes,no,yes,yes,no,no,yes,no,no,yes,yes,no,no,no,no,no,no,no,no,no") + (set_attr "cpu_facility" "*,*,av1,*,*,*,*,*,*,*,*,*,*,*,*,*,*,*,*,*")]) ;; Sometimes generated by the epilogue code. We don't want to ;; recognize these addresses in general, because the limm is costly, @@ -698,7 +717,7 @@ (define_insn_and_split "*movsi_set_cc_insn" [(set (match_operand:CC_ZN 2 "cc_set_register" "") - (match_operator 3 "zn_compare_operator" + (match_operator:CC_ZN 3 "zn_compare_operator" [(match_operand:SI 1 "nonmemory_operand" "cI,cL,Cal") (const_int 0)])) (set (match_operand:SI 0 "register_operand" "=w,w,w") (match_dup 1))] @@ -715,7 +734,7 @@ (define_insn "unary_comparison" [(set (match_operand:CC_ZN 0 "cc_set_register" "") - (match_operator 3 "zn_compare_operator" + (match_operator:CC_ZN 3 "zn_compare_operator" [(match_operator:SI 2 "unary_operator" [(match_operand:SI 1 "register_operand" "c")]) (const_int 0)]))] @@ -779,7 +798,7 @@ (define_insn "*commutative_binary_comparison" [(set (match_operand:CC_ZN 0 "cc_set_register" "") - (match_operator 5 "zn_compare_operator" + (match_operator:CC_ZN 5 "zn_compare_operator" [(match_operator:SI 4 "commutative_operator" [(match_operand:SI 1 "register_operand" "%c,c,c") (match_operand:SI 2 "nonmemory_operand" "cL,I,?Cal")]) @@ -857,7 +876,7 @@ ; Make sure to use the W class to not touch LP_COUNT. 
(set (match_operand:SI 0 "register_operand" "=W,W,W") (match_dup 4))] - "TARGET_ARC700" + "!TARGET_ARC600_FAMILY" "%O4.f %0,%1,%2 ; mult commutative" [(set_attr "type" "compare,compare,compare") (set_attr "cond" "set_zn,set_zn,set_zn") @@ -881,7 +900,7 @@ (define_insn "*noncommutative_binary_comparison" [(set (match_operand:CC_ZN 0 "cc_set_register" "") - (match_operator 5 "zn_compare_operator" + (match_operator:CC_ZN 5 "zn_compare_operator" [(match_operator:SI 4 "noncommutative_operator" [(match_operand:SI 1 "register_operand" "c,c,c") (match_operand:SI 2 "nonmemory_operand" "cL,I,?Cal")]) @@ -1145,7 +1164,7 @@ (set (match_operand:SI 0 "dest_reg_operand" "=w,w") (plus:SI (match_dup 1) (match_dup 2)))] "" - "ldw.a%V4 %3,[%0,%S2]" + "ld%_.a%V4 %3,[%0,%S2]" [(set_attr "type" "load,load") (set_attr "length" "4,8")]) @@ -1157,7 +1176,7 @@ (set (match_operand:SI 0 "dest_reg_operand" "=r,r") (plus:SI (match_dup 1) (match_dup 2)))] "" - "ldw.a%V4 %3,[%0,%S2]" + "ld%_.a%V4 %3,[%0,%S2]" [(set_attr "type" "load,load") (set_attr "length" "4,8")]) @@ -1170,7 +1189,7 @@ (set (match_operand:SI 0 "dest_reg_operand" "=w,w") (plus:SI (match_dup 1) (match_dup 2)))] "" - "ldw.x.a%V4 %3,[%0,%S2]" + "ld%_.x.a%V4 %3,[%0,%S2]" [(set_attr "type" "load,load") (set_attr "length" "4,8")]) @@ -1182,7 +1201,7 @@ (set (match_operand:SI 0 "dest_reg_operand" "=w") (plus:SI (match_dup 1) (match_dup 2)))] "" - "stw.a%V4 %3,[%0,%2]" + "st%_.a%V4 %3,[%0,%2]" [(set_attr "type" "store") (set_attr "length" "4")]) @@ -1283,7 +1302,7 @@ && satisfies_constraint_Rcq (operands[0])) return "sub%?.ne %0,%0,%0"; /* ??? might be good for speed on ARC600 too, *if* properly scheduled. */ - if ((TARGET_ARC700 || optimize_size) + if ((optimize_size && (!TARGET_ARC600_FAMILY)) && rtx_equal_p (operands[1], constm1_rtx) && GET_CODE (operands[3]) == LTU) return "sbc.cs %0,%0,%0"; @@ -1435,13 +1454,13 @@ (zero_extend:SI (match_operand:HI 1 "nonvol_nonimm_operand" "0,q,0,c,Usd,Usd,m")))] "" "@ - extw%? %0,%1%& - extw%? 
%0,%1%& + ext%_%? %0,%1%& + ext%_%? %0,%1%& bmsk%? %0,%1,15 - extw %0,%1 - ldw%? %0,%1%& - ldw%U1 %0,%1 - ldw%U1%V1 %0,%1" + ext%_ %0,%1 + ld%_%? %0,%1%& + ld%_%U1 %0,%1 + ld%_%U1%V1 %0,%1" [(set_attr "type" "unary,unary,unary,unary,load,load,load") (set_attr "iscompact" "maybe,true,false,false,true,false,false") (set_attr "predicable" "no,no,yes,no,no,no,no")]) @@ -1498,9 +1517,9 @@ (sign_extend:SI (match_operand:HI 1 "nonvol_nonimm_operand" "Rcqq,c,m")))] "" "@ - sexw%? %0,%1%& - sexw %0,%1 - ldw.x%U1%V1 %0,%1" + sex%_%? %0,%1%& + sex%_ %0,%1 + ld%_.x%U1%V1 %0,%1" [(set_attr "type" "unary,unary,load") (set_attr "iscompact" "true,false,false")]) @@ -1604,7 +1623,88 @@ (set_attr "cond" "canuse,canuse,canuse,canuse,canuse,canuse,nocond,canuse,nocond,nocond,nocond,nocond,canuse_limm,canuse_limm,canuse,canuse,nocond") ]) -;; ARC700/ARC600 multiply +;; ARCv2 MPYW and MPYUW +(define_expand "mulhisi3" + [(set (match_operand:SI 0 "register_operand" "") + (mult:SI (sign_extend:SI (match_operand:HI 1 "register_operand" "")) + (sign_extend:SI (match_operand:HI 2 "nonmemory_operand" ""))))] + "TARGET_MPYW" + "{ + if (CONSTANT_P (operands[2])) + { + emit_insn (gen_mulhisi3_imm (operands[0], operands[1], operands[2])); + DONE; + } + }" +) + +(define_insn "mulhisi3_imm" + [(set (match_operand:SI 0 "register_operand" "=r,r,r, r, r") + (mult:SI (sign_extend:SI (match_operand:HI 1 "register_operand" "0,r,0, 0, r")) + (match_operand:HI 2 "short_const_int_operand" "L,L,I,C16,C16")))] + "TARGET_MPYW" + "mpyw%? 
%0,%1,%2" + [(set_attr "length" "4,4,4,8,8") + (set_attr "iscompact" "false") + (set_attr "type" "mul16_em") + (set_attr "predicable" "yes,no,no,yes,no") + (set_attr "cond" "canuse,nocond,nocond,canuse_limm,nocond") + ]) + +(define_insn "mulhisi3_reg" + [(set (match_operand:SI 0 "register_operand" "=Rcqq,r,r") + (mult:SI (sign_extend:SI (match_operand:HI 1 "register_operand" " 0,0,r")) + (sign_extend:SI (match_operand:HI 2 "nonmemory_operand" "Rcqq,r,r"))))] + "TARGET_MPYW" + "mpyw%? %0,%1,%2" + [(set_attr "length" "*,4,4") + (set_attr "iscompact" "maybe,false,false") + (set_attr "type" "mul16_em") + (set_attr "predicable" "yes,yes,no") + (set_attr "cond" "canuse,canuse,nocond") + ]) + +(define_expand "umulhisi3" + [(set (match_operand:SI 0 "register_operand" "") + (mult:SI (zero_extend:SI (match_operand:HI 1 "register_operand" "")) + (zero_extend:SI (match_operand:HI 2 "nonmemory_operand" ""))))] + "TARGET_MPYW" + "{ + if (CONSTANT_P (operands[2])) + { + emit_insn (gen_umulhisi3_imm (operands[0], operands[1], operands[2])); + DONE; + } + }" +) + +(define_insn "umulhisi3_imm" + [(set (match_operand:SI 0 "register_operand" "=r, r,r, r, r") + (mult:SI (zero_extend:SI (match_operand:HI 1 "register_operand" " 0, r,0, 0, r")) + (match_operand:HI 2 "short_const_int_operand" " L, L,I,C16,C16")))] + "TARGET_MPYW" + "mpyuw%? %0,%1,%2" + [(set_attr "length" "4,4,4,8,8") + (set_attr "iscompact" "false") + (set_attr "type" "mul16_em") + (set_attr "predicable" "yes,no,no,yes,no") + (set_attr "cond" "canuse,nocond,nocond,canuse_limm,nocond") + ]) + +(define_insn "umulhisi3_reg" + [(set (match_operand:SI 0 "register_operand" "=Rcqq, r, r") + (mult:SI (zero_extend:SI (match_operand:HI 1 "register_operand" " 0, 0, r")) + (zero_extend:SI (match_operand:HI 2 "register_operand" " Rcqq, r, r"))))] + "TARGET_MPYW" + "mpyuw%? 
%0,%1,%2" + [(set_attr "length" "*,4,4") + (set_attr "iscompact" "maybe,false,false") + (set_attr "type" "mul16_em") + (set_attr "predicable" "yes,yes,no") + (set_attr "cond" "canuse,canuse,nocond") + ]) + +;; ARC700/ARC600/V2 multiply ;; SI <- SI * SI (define_expand "mulsi3" @@ -1613,7 +1713,7 @@ (match_operand:SI 2 "nonmemory_operand" "")))] "" { - if (TARGET_ARC700 && !TARGET_NOMPY_SET) + if (TARGET_MPY) { if (!register_operand (operands[0], SImode)) { @@ -1743,8 +1843,7 @@ (clobber (reg:SI LP_START)) (clobber (reg:SI LP_END)) (clobber (reg:CC CC_REG))] - "!TARGET_MUL64_SET && !TARGET_MULMAC_32BY16_SET - && (!TARGET_ARC700 || TARGET_NOMPY_SET) + "!TARGET_ANY_MPY && SFUNC_CHECK_PREDICABLE" "*return arc_output_libcall (\"__mulsi3\");" [(set_attr "is_sfunc" "yes") @@ -1794,23 +1893,35 @@ [(set (match_operand:SI 0 "mpy_dest_reg_operand" "=Rcr,r,r,Rcr,r") (mult:SI (match_operand:SI 1 "register_operand" " 0,c,0,0,c") (match_operand:SI 2 "nonmemory_operand" "cL,cL,I,Cal,Cal")))] -"TARGET_ARC700 && !TARGET_NOMPY_SET" + "TARGET_ARC700_MPY" "mpyu%? %0,%1,%2" [(set_attr "length" "4,4,4,8,8") (set_attr "type" "umulti") (set_attr "predicable" "yes,no,no,yes,no") (set_attr "cond" "canuse,nocond,canuse_limm,canuse,nocond")]) +; ARCv2 has no penalties between mpy and mpyu. So, we use mpy because of its +; short variant. LP_COUNT constraints are still valid. +(define_insn "mulsi3_v2" + [(set (match_operand:SI 0 "mpy_dest_reg_operand" "=Rcqq,Rcr, r,r,Rcr, r") + (mult:SI (match_operand:SI 1 "register_operand" "%0, 0, c,0, 0, c") + (match_operand:SI 2 "nonmemory_operand" " Rcqq, cL,cL,I,Cal,Cal")))] + "TARGET_MULTI" + "mpy%? 
%0,%1,%2" + [(set_attr "length" "*,4,4,4,8,8") + (set_attr "iscompact" "maybe,false,false,false,false,false") + (set_attr "type" "umulti") + (set_attr "predicable" "no,yes,no,no,yes,no") + (set_attr "cond" "nocond,canuse,nocond,canuse_limm,canuse,nocond")]) + (define_expand "mulsidi3" [(set (match_operand:DI 0 "nonimmediate_operand" "") (mult:DI (sign_extend:DI(match_operand:SI 1 "register_operand" "")) (sign_extend:DI(match_operand:SI 2 "nonmemory_operand" ""))))] - "(TARGET_ARC700 && !TARGET_NOMPY_SET) - || TARGET_MUL64_SET - || TARGET_MULMAC_32BY16_SET" + "TARGET_ANY_MPY" " { - if (TARGET_ARC700 && !TARGET_NOMPY_SET) + if (TARGET_MPY) { operands[2] = force_reg (SImode, operands[2]); if (!register_operand (operands[0], DImode)) @@ -1892,7 +2003,7 @@ [(set (match_operand:DI 0 "register_operand" "=&r") (mult:DI (sign_extend:DI (match_operand:SI 1 "register_operand" "%c")) (sign_extend:DI (match_operand:SI 2 "extend_operand" "cL"))))] - "TARGET_ARC700 && !TARGET_NOMPY_SET" + "TARGET_MPY" "#" "&& reload_completed" [(const_int 0)] @@ -1902,7 +2013,7 @@ rtx l0 = simplify_gen_subreg (word_mode, operands[0], DImode, lo); rtx h0 = simplify_gen_subreg (word_mode, operands[0], DImode, hi); emit_insn (gen_mulsi3_highpart (h0, operands[1], operands[2])); - emit_insn (gen_mulsi3_700 (l0, operands[1], operands[2])); + emit_insn (gen_mulsi3 (l0, operands[1], operands[2])); DONE; } [(set_attr "type" "multi") @@ -1916,8 +2027,8 @@ (sign_extend:DI (match_operand:SI 1 "register_operand" "%0,c, 0,c")) (sign_extend:DI (match_operand:SI 2 "extend_operand" "c,c, i,i"))) (const_int 32))))] - "TARGET_ARC700 && !TARGET_NOMPY_SET" - "mpyh%? %0,%1,%2" + "TARGET_MPY" + "mpy%+%? 
%0,%1,%2" [(set_attr "length" "4,4,8,8") (set_attr "type" "multi") (set_attr "predicable" "yes,no,yes,no") @@ -1933,8 +2044,8 @@ (zero_extend:DI (match_operand:SI 1 "register_operand" "%0,c, 0,c")) (zero_extend:DI (match_operand:SI 2 "extend_operand" "c,c, i,i"))) (const_int 32))))] - "TARGET_ARC700 && !TARGET_NOMPY_SET" - "mpyhu%? %0,%1,%2" + "TARGET_MPY" + "mpy%+u%? %0,%1,%2" [(set_attr "length" "4,4,8,8") (set_attr "type" "multi") (set_attr "predicable" "yes,no,yes,no") @@ -1956,8 +2067,7 @@ (clobber (reg:DI MUL64_OUT_REG)) (clobber (reg:CC CC_REG))] "!TARGET_BIG_ENDIAN - && !TARGET_MUL64_SET && !TARGET_MULMAC_32BY16_SET - && (!TARGET_ARC700 || TARGET_NOMPY_SET) + && !TARGET_ANY_MPY && SFUNC_CHECK_PREDICABLE" "*return arc_output_libcall (\"__umulsi3_highpart\");" [(set_attr "is_sfunc" "yes") @@ -1977,8 +2087,7 @@ (clobber (reg:DI MUL64_OUT_REG)) (clobber (reg:CC CC_REG))] "TARGET_BIG_ENDIAN - && !TARGET_MUL64_SET && !TARGET_MULMAC_32BY16_SET - && (!TARGET_ARC700 || TARGET_NOMPY_SET) + && !TARGET_ANY_MPY && SFUNC_CHECK_PREDICABLE" "*return arc_output_libcall (\"__umulsi3_highpart\");" [(set_attr "is_sfunc" "yes") @@ -1995,8 +2104,8 @@ (zero_extend:DI (match_operand:SI 1 "register_operand" " 0, c, 0, 0, c")) (match_operand:DI 2 "immediate_usidi_operand" "L, L, I, Cal, Cal")) (const_int 32))))] - "TARGET_ARC700 && !TARGET_NOMPY_SET" - "mpyhu%? %0,%1,%2" + "TARGET_MPY" + "mpy%+u%? 
%0,%1,%2" [(set_attr "length" "4,4,4,8,8") (set_attr "type" "multi") (set_attr "predicable" "yes,no,no,yes,no") @@ -2010,12 +2119,12 @@ (zero_extend:DI (match_operand:SI 1 "register_operand" "")) (zero_extend:DI (match_operand:SI 2 "nonmemory_operand" ""))) (const_int 32))))] - "TARGET_ARC700 || (!TARGET_MUL64_SET && !TARGET_MULMAC_32BY16_SET)" + "!TARGET_MUL64_SET && !TARGET_MULMAC_32BY16_SET" " { rtx target = operands[0]; - if (!TARGET_ARC700 || TARGET_NOMPY_SET) + if (!TARGET_MPY) { emit_move_insn (gen_rtx_REG (SImode, 0), operands[1]); emit_move_insn (gen_rtx_REG (SImode, 1), operands[2]); @@ -2047,7 +2156,7 @@ (zero_extend:DI(match_operand:SI 2 "nonmemory_operand" ""))))] "" { - if (TARGET_ARC700 && !TARGET_NOMPY_SET) + if (TARGET_MPY) { operands[2] = force_reg (SImode, operands[2]); if (!register_operand (operands[0], DImode)) @@ -2141,7 +2250,7 @@ [(set (match_operand:DI 0 "dest_reg_operand" "=&r") (mult:DI (zero_extend:DI (match_operand:SI 1 "register_operand" "%c")) (zero_extend:DI (match_operand:SI 2 "extend_operand" "cL"))))] - "TARGET_ARC700 && !TARGET_NOMPY_SET" + "TARGET_MPY" "#" "reload_completed" [(const_int 0)] @@ -2151,7 +2260,7 @@ rtx l0 = operand_subword (operands[0], lo, 0, DImode); rtx h0 = operand_subword (operands[0], hi, 0, DImode); emit_insn (gen_umulsi3_highpart (h0, operands[1], operands[2])); - emit_insn (gen_mulsi3_700 (l0, operands[1], operands[2])); + emit_insn (gen_mulsi3 (l0, operands[1], operands[2])); DONE; } [(set_attr "type" "umulti") @@ -2166,8 +2275,7 @@ (clobber (reg:SI R12_REG)) (clobber (reg:DI MUL64_OUT_REG)) (clobber (reg:CC CC_REG))] - "!TARGET_MUL64_SET && !TARGET_MULMAC_32BY16_SET - && (!TARGET_ARC700 || TARGET_NOMPY_SET) + "!TARGET_ANY_MPY && SFUNC_CHECK_PREDICABLE" "*return arc_output_libcall (\"__umulsidi3\");" [(set_attr "is_sfunc" "yes") @@ -2183,8 +2291,7 @@ (clobber (reg:SI R12_REG)) (clobber (reg:DI MUL64_OUT_REG)) (clobber (reg:CC CC_REG))])] - "!TARGET_MUL64_SET && !TARGET_MULMAC_32BY16_SET - && 
(!TARGET_ARC700 || TARGET_NOMPY_SET) + "!TARGET_ANY_MPY && peep2_regno_dead_p (1, TARGET_BIG_ENDIAN ? R1_REG : R0_REG)" [(pc)] { @@ -2350,7 +2457,7 @@ adc %0,%1,%2" ; if we have a bad schedule after sched2, split. "reload_completed - && !optimize_size && TARGET_ARC700 + && !optimize_size && (!TARGET_ARC600_FAMILY) && arc_scheduling_not_expected () && arc_sets_cc_p (prev_nonnote_insn (insn)) /* If next comes a return or other insn that needs a delay slot, @@ -2564,7 +2671,7 @@ sbc %0,%1,%2" ; if we have a bad schedule after sched2, split. "reload_completed - && !optimize_size && TARGET_ARC700 + && !optimize_size && (!TARGET_ARC600_FAMILY) && arc_scheduling_not_expected () && arc_sets_cc_p (prev_nonnote_insn (insn)) /* If next comes a return or other insn that needs a delay slot, @@ -2802,7 +2909,7 @@ return \"bclr%? %0,%1,%M2%&\"; case 4: return (INTVAL (operands[2]) == 0xff - ? \"extb%? %0,%1%&\" : \"extw%? %0,%1%&\"); + ? \"extb%? %0,%1%&\" : \"ext%_%? %0,%1%&\"); case 9: case 14: return \"bic%? %0,%1,%n2-1\"; case 18: if (TARGET_BIG_ENDIAN) @@ -2813,11 +2920,11 @@ xop[1] = adjust_address (operands[1], QImode, INTVAL (operands[2]) == 0xff ? 3 : 2); output_asm_insn (INTVAL (operands[2]) == 0xff - ? \"ldb %0,%1\" : \"ldw %0,%1\", + ? \"ldb %0,%1\" : \"ld%_ %0,%1\", xop); return \"\"; } - return INTVAL (operands[2]) == 0xff ? \"ldb %0,%1\" : \"ldw %0,%1\"; + return INTVAL (operands[2]) == 0xff ? \"ldb %0,%1\" : \"ld%_ %0,%1\"; default: gcc_unreachable (); } @@ -3196,19 +3303,19 @@ ;; Next come the scc insns. 
(define_expand "cstoresi4" - [(set (reg:CC CC_REG) - (compare:CC (match_operand:SI 2 "nonmemory_operand" "") - (match_operand:SI 3 "nonmemory_operand" ""))) - (set (match_operand:SI 0 "dest_reg_operand" "") - (match_operator:SI 1 "ordered_comparison_operator" [(reg CC_REG) - (const_int 0)]))] + [(set (match_operand:SI 0 "dest_reg_operand" "") + (match_operator:SI 1 "ordered_comparison_operator" [(match_operand:SI 2 "nonmemory_operand" "") + (match_operand:SI 3 "nonmemory_operand" "")]))] "" { - gcc_assert (XEXP (operands[1], 0) == operands[2]); - gcc_assert (XEXP (operands[1], 1) == operands[3]); - operands[1] = gen_compare_reg (operands[1], SImode); - emit_insn (gen_scc_insn (operands[0], operands[1])); - DONE; + if (!TARGET_CODE_DENSITY) + { + gcc_assert (XEXP (operands[1], 0) == operands[2]); + gcc_assert (XEXP (operands[1], 1) == operands[3]); + operands[1] = gen_compare_reg (operands[1], SImode); + emit_insn (gen_scc_insn (operands[0], operands[1])); + DONE; + } }) (define_mode_iterator SDF [SF DF]) @@ -3590,8 +3697,8 @@ return \"ld.as %0,[%1,%2]%&\"; case HImode: if (ADDR_DIFF_VEC_FLAGS (diff_vec).offset_unsigned) - return \"ldw.as %0,[%1,%2]\"; - return \"ldw.x.as %0,[%1,%2]\"; + return \"ld%_.as %0,[%1,%2]\"; + return \"ld%_.x.as %0,[%1,%2]\"; case QImode: if (ADDR_DIFF_VEC_FLAGS (diff_vec).offset_unsigned) return \"ldb%? %0,[%1,%2]%&\"; @@ -3658,7 +3765,7 @@ 2 of these are for alignment, and are anticipated in the length of the ADDR_DIFF_VEC. 
*/ if (unalign && !satisfies_constraint_Rcq (xop[0])) - s = \"add2 %2,pcl,%0\n\tld_s%2,[%2,12]\"; + s = \"add2 %2,pcl,%0\n\tld_s %2,[%2,12]\"; else if (unalign) s = \"add_s %2,%0,2\n\tld.as %2,[pcl,%2]\"; else @@ -3670,12 +3777,12 @@ { if (satisfies_constraint_Rcq (xop[0])) { - s = \"add_s %2,%0,%1\n\tldw.as %2,[pcl,%2]\"; + s = \"add_s %2,%0,%1\n\tld%_.as %2,[pcl,%2]\"; xop[1] = GEN_INT ((10 - unalign) / 2U); } else { - s = \"add1 %2,pcl,%0\n\tldw_s %2,[%2,%1]\"; + s = \"add1 %2,pcl,%0\n\tld%__s %2,[%2,%1]\"; xop[1] = GEN_INT (10 + unalign); } } @@ -3683,12 +3790,12 @@ { if (satisfies_constraint_Rcq (xop[0])) { - s = \"add_s %2,%0,%1\n\tldw.x.as %2,[pcl,%2]\"; + s = \"add_s %2,%0,%1\n\tld%_.x.as %2,[pcl,%2]\"; xop[1] = GEN_INT ((10 - unalign) / 2U); } else { - s = \"add1 %2,pcl,%0\n\tldw_s.x %2,[%2,%1]\"; + s = \"add1 %2,pcl,%0\n\tld%__s.x %2,[%2,%1]\"; xop[1] = GEN_INT (10 + unalign); } } @@ -3886,6 +3993,14 @@ (set_attr "cond" "canuse") (set_attr "length" "2")]) +(define_insn "nopv" + [(unspec_volatile [(const_int 0)] VUNSPEC_NOP)] + "" + "nop%?" + [(set_attr "type" "misc") + (set_attr "iscompact" "true") + (set_attr "length" "2")]) + ;; Special pattern to flush the icache. ;; ??? Not sure what to do here. Some ARC's are known to support this. 
@@ -3985,7 +4100,7 @@ (set (match_operand:SI 4 "register_operand" "") (mult:SI (match_operand:SI 2 "register_operand") (match_operand:SI 3 "nonmemory_operand" "")))] - "TARGET_ARC700 && !TARGET_NOMPY_SET + "TARGET_ARC700_MPY && (rtx_equal_p (operands[0], operands[2]) || rtx_equal_p (operands[0], operands[3])) && peep2_regno_dead_p (0, CC_REG) @@ -4015,7 +4130,7 @@ (set (match_operand:SI 4 "register_operand" "") (mult:SI (match_operand:SI 2 "register_operand") (match_operand:SI 3 "nonmemory_operand" "")))] - "TARGET_ARC700 && !TARGET_NOMPY_SET + "TARGET_ARC700_MPY && (rtx_equal_p (operands[0], operands[2]) || rtx_equal_p (operands[0], operands[3])) && peep2_regno_dead_p (2, CC_REG)" @@ -4068,8 +4183,8 @@ (clrsb:HI (match_operand:HI 1 "general_operand" "cL,Cal"))))] "TARGET_NORM" "@ - normw \t%0, %1 - normw \t%0, %S1" + norm%_ \t%0, %1 + norm%_ \t%0, %S1" [(set_attr "length" "4,8") (set_attr "type" "two_cycle_core,two_cycle_core")]) @@ -4479,6 +4594,11 @@ = gen_rtx_REG (Pmode, arc_return_address_regs[arc_compute_function_type (cfun)]); + if (arc_compute_function_type (cfun) == ARC_FUNCTION_ILINK1 + && TARGET_V2) + { + return \"rtie\"; + } if (TARGET_PAD_RETURN) arc_pad_return (); output_asm_insn (\"j%!%* [%0]%&\", ®); @@ -4487,8 +4607,13 @@ [(set_attr "type" "return") ; predicable won't help here since the canonical rtl looks different ; for branches. 
- (set_attr "cond" "canuse") - (set (attr "iscompact") + (set (attr "cond") + (cond [(and (eq (symbol_ref "arc_compute_function_type (cfun)") + (symbol_ref "ARC_FUNCTION_ILINK1")) + (match_test "TARGET_V2")) + (const_string "nocond")] + (const_string "canuse"))) + (set (attr "iscompact") (cond [(eq (symbol_ref "arc_compute_function_type (cfun)") (symbol_ref "ARC_FUNCTION_NORMAL")) (const_string "maybe")] @@ -4504,7 +4629,9 @@ (if_then_else (match_operator 0 "proper_comparison_operator" [(reg CC_REG) (const_int 0)]) (simple_return) (pc)))] - "reload_completed" + "reload_completed + && !(TARGET_V2 + && arc_compute_function_type (cfun) == ARC_FUNCTION_ILINK1)" { rtx xop[2]; xop[0] = operands[0]; @@ -4909,7 +5036,7 @@ (define_expand "doloop_end" [(use (match_operand 0 "register_operand" "")) (use (label_ref (match_operand 1 "" "")))] - "TARGET_ARC600 || TARGET_ARC700" + "!TARGET_ARC601" { /* We could do smaller bivs with biv widening, and wider bivs by having a high-word counter in an outer loop - but punt on this for now. */ @@ -5158,6 +5285,247 @@ ;; this would not work right for -0. OTOH optabs.c has already code ;; to synthesyze negate by flipping the sign bit. 
+;;V2 instructions +(define_insn "bswapsi2" + [(set (match_operand:SI 0 "register_operand" "= r,r") + (bswap:SI (match_operand:SI 1 "nonmemory_operand" "rL,Cal")))] + "TARGET_V2 && TARGET_SWAP" + "swape %0, %1" + [(set_attr "length" "4,8") + (set_attr "type" "two_cycle_core")]) + +(define_expand "prefetch" + [(prefetch (match_operand:SI 0 "address_operand" "") + (match_operand:SI 1 "const_int_operand" "") + (match_operand:SI 2 "const_int_operand" ""))] + "TARGET_HS" + "") + +(define_insn "prefetch_1" + [(prefetch (match_operand:SI 0 "register_operand" "r") + (match_operand:SI 1 "const_int_operand" "n") + (match_operand:SI 2 "const_int_operand" "n"))] + "TARGET_HS" + { + if (INTVAL (operands[1])) + return "prefetchw [%0]"; + else + return "prefetch [%0]"; + } + [(set_attr "type" "load") + (set_attr "length" "4")]) + +(define_insn "prefetch_2" + [(prefetch (plus:SI (match_operand:SI 0 "register_operand" "r,r,r") + (match_operand:SI 1 "nonmemory_operand" "r,Cm2,Cal")) + (match_operand:SI 2 "const_int_operand" "n,n,n") + (match_operand:SI 3 "const_int_operand" "n,n,n"))] + "TARGET_HS" + { + if (INTVAL (operands[2])) + return "prefetchw [%0, %1]"; + else + return "prefetch [%0, %1]"; + } + [(set_attr "type" "load") + (set_attr "length" "4,4,8")]) + +(define_insn "prefetch_3" + [(prefetch (match_operand:SI 0 "address_operand" "p") + (match_operand:SI 1 "const_int_operand" "n") + (match_operand:SI 2 "const_int_operand" "n"))] + "TARGET_HS" + { + operands[0] = gen_rtx_MEM (SImode, operands[0]); + if (INTVAL (operands[1])) + return "prefetchw%U0 %0"; + else + return "prefetch%U0 %0"; + } + [(set_attr "type" "load") + (set_attr "length" "8")]) + +(define_insn "divsi3" + [(set (match_operand:SI 0 "register_operand" "=r,r, r,r,r,r, r, r") + (div:SI (match_operand:SI 1 "nonmemory_operand" "0,r,Cal,0,r,0, 0, r") + (match_operand:SI 2 "nonmemory_operand" "r,r, r,L,L,I,Cal,Cal")))] + "TARGET_DIVREM" + "div%? 
%0, %1, %2" + [(set_attr "length" "4,4,8,4,4,4,8,8") + (set_attr "iscompact" "false") + (set_attr "type" "div_rem") + (set_attr "predicable" "yes,no,no,yes,no,no,yes,no") + (set_attr "cond" "canuse,nocond,nocond,canuse,nocond,nocond,canuse,nocond") + ]) + +(define_insn "udivsi3" + [(set (match_operand:SI 0 "register_operand" "=r,r, r,r,r,r, r, r") + (udiv:SI (match_operand:SI 1 "nonmemory_operand" "0,r,Cal,0,r,0, 0, r") + (match_operand:SI 2 "nonmemory_operand" "r,r, r,L,L,I,Cal,Cal")))] + "TARGET_DIVREM" + "divu%? %0, %1, %2" + [(set_attr "length" "4,4,8,4,4,4,8,8") + (set_attr "iscompact" "false") + (set_attr "type" "div_rem") + (set_attr "predicable" "yes,no,no,yes,no,no,yes,no") + (set_attr "cond" "canuse,nocond,nocond,canuse,nocond,nocond,canuse,nocond") + ]) + +(define_insn "modsi3" + [(set (match_operand:SI 0 "register_operand" "=r,r, r,r,r,r, r, r") + (mod:SI (match_operand:SI 1 "nonmemory_operand" "0,r,Cal,0,r,0, 0, r") + (match_operand:SI 2 "nonmemory_operand" "r,r, r,L,L,I,Cal,Cal")))] + "TARGET_DIVREM" + "rem%? %0, %1, %2" + [(set_attr "length" "4,4,8,4,4,4,8,8") + (set_attr "iscompact" "false") + (set_attr "type" "div_rem") + (set_attr "predicable" "yes,no,no,yes,no,no,yes,no") + (set_attr "cond" "canuse,nocond,nocond,canuse,nocond,nocond,canuse,nocond") + ]) + +(define_insn "umodsi3" + [(set (match_operand:SI 0 "register_operand" "=r,r, r,r,r,r, r, r") + (umod:SI (match_operand:SI 1 "nonmemory_operand" "0,r,Cal,0,r,0, 0, r") + (match_operand:SI 2 "nonmemory_operand" "r,r, r,L,L,I,Cal,Cal")))] + "TARGET_DIVREM" + "remu%? 
%0, %1, %2" + [(set_attr "length" "4,4,8,4,4,4,8,8") + (set_attr "iscompact" "false") + (set_attr "type" "div_rem") + (set_attr "predicable" "yes,no,no,yes,no,no,yes,no") + (set_attr "cond" "canuse,nocond,nocond,canuse,nocond,nocond,canuse,nocond") + ]) + +;; SETcc instructions +(define_code_iterator arcCC_cond [eq ne gt lt ge le]) + +(define_insn "arcset<code>" + [(set (match_operand:SI 0 "register_operand" "=r,r,r,r,r,r,r") + (arcCC_cond:SI (match_operand:SI 1 "nonmemory_operand" "0,r,0,r,0,0,r") + (match_operand:SI 2 "nonmemory_operand" "r,r,L,L,I,n,n")))] + "TARGET_V2 && TARGET_CODE_DENSITY" + "set<code>%? %0, %1, %2" + [(set_attr "length" "4,4,4,4,4,8,8") + (set_attr "iscompact" "false") + (set_attr "type" "compare") + (set_attr "predicable" "yes,no,yes,no,no,yes,no") + (set_attr "cond" "canuse,nocond,canuse,nocond,nocond,canuse,nocond") + ]) + +(define_insn "arcsetltu" + [(set (match_operand:SI 0 "register_operand" "=r,r,r,r,r, r, r") + (ltu:SI (match_operand:SI 1 "nonmemory_operand" "0,r,0,r,0, 0, r") + (match_operand:SI 2 "nonmemory_operand" "r,r,L,L,I, n, n")))] + "TARGET_V2 && TARGET_CODE_DENSITY" + "setlo%? %0, %1, %2" + [(set_attr "length" "4,4,4,4,4,8,8") + (set_attr "iscompact" "false") + (set_attr "type" "compare") + (set_attr "predicable" "yes,no,yes,no,no,yes,no") + (set_attr "cond" "canuse,nocond,canuse,nocond,nocond,canuse,nocond") + ]) + +(define_insn "arcsetgeu" + [(set (match_operand:SI 0 "register_operand" "=r,r,r,r,r, r, r") + (geu:SI (match_operand:SI 1 "nonmemory_operand" "0,r,0,r,0, 0, r") + (match_operand:SI 2 "nonmemory_operand" "r,r,L,L,I, n, n")))] + "TARGET_V2 && TARGET_CODE_DENSITY" + "seths%? 
%0, %1, %2" + [(set_attr "length" "4,4,4,4,4,8,8") + (set_attr "iscompact" "false") + (set_attr "type" "compare") + (set_attr "predicable" "yes,no,yes,no,no,yes,no") + (set_attr "cond" "canuse,nocond,canuse,nocond,nocond,canuse,nocond") + ]) + +;; Special cases of SETCC +(define_insn_and_split "arcsethi" + [(set (match_operand:SI 0 "register_operand" "=r,r, r,r") + (gtu:SI (match_operand:SI 1 "nonmemory_operand" "r,r, r,r") + (match_operand:SI 2 "nonmemory_operand" "0,r,C62,n")))] + "TARGET_V2 && TARGET_CODE_DENSITY" + "setlo%? %0, %2, %1" + "reload_completed + && CONST_INT_P (operands[2]) + && satisfies_constraint_C62 (operands[2])" + [(const_int 0)] + "{ + /* sethi a,b,u6 => seths a,b,u6 + 1. */ + operands[2] = GEN_INT (INTVAL (operands[2]) + 1); + emit_insn (gen_arcsetgeu (operands[0], operands[1], operands[2])); + DONE; + }" + [(set_attr "length" "4,4,4,8") + (set_attr "iscompact" "false") + (set_attr "type" "compare") + (set_attr "predicable" "yes,no,no,no") + (set_attr "cond" "canuse,nocond,nocond,nocond")] +) + +(define_insn_and_split "arcsetls" + [(set (match_operand:SI 0 "register_operand" "=r,r, r,r") + (leu:SI (match_operand:SI 1 "nonmemory_operand" "r,r, r,r") + (match_operand:SI 2 "nonmemory_operand" "0,r,C62,n")))] + "TARGET_V2 && TARGET_CODE_DENSITY" + "seths%? %0, %2, %1" + "reload_completed + && CONST_INT_P (operands[2]) + && satisfies_constraint_C62 (operands[2])" + [(const_int 0)] + "{ + /* setls a,b,u6 => setlo a,b,u6 + 1. 
*/ + operands[2] = GEN_INT (INTVAL (operands[2]) + 1); + emit_insn (gen_arcsetltu (operands[0], operands[1], operands[2])); + DONE; + }" + [(set_attr "length" "4,4,4,8") + (set_attr "iscompact" "false") + (set_attr "type" "compare") + (set_attr "predicable" "yes,no,no,no") + (set_attr "cond" "canuse,nocond,nocond,nocond")] +) + +; Any mode that needs to be solved by secondary reload +(define_mode_iterator SRI [QI HI]) + +(define_expand "reload_<mode>_load" + [(parallel [(match_operand:SRI 0 "register_operand" "=r") + (match_operand:SRI 1 "memory_operand" "m") + (match_operand:SI 2 "register_operand" "=&r")])] + "" +{ + arc_secondary_reload_conv (operands[0], operands[1], operands[2], false); + DONE; +}) + +(define_expand "reload_<mode>_store" + [(parallel [(match_operand:SRI 0 "memory_operand" "=m") + (match_operand:SRI 1 "register_operand" "r") + (match_operand:SI 2 "register_operand" "=&r")])] + "" +{ + arc_secondary_reload_conv (operands[1], operands[0], operands[2], true); + DONE; +}) + + +(define_insn "extzvsi" + [(set (match_operand:SI 0 "register_operand" "=r , r , r, r, r") + (zero_extract:SI (match_operand:SI 1 "register_operand" "0 , r , 0, 0, r") + (match_operand:SI 2 "const_int_operand" "C3p, C3p, i, i, i") + (match_operand:SI 3 "const_int_operand" "i , i , i, i, i")))] + "TARGET_HS && TARGET_BARREL_SHIFTER" + { + int assemble_op2 = (((INTVAL (operands[2]) - 1) & 0x1f) << 5) | (INTVAL (operands[3]) & 0x1f); + operands[2] = GEN_INT (assemble_op2); + return "xbfu%? %0,%1,%2"; + } + [(set_attr "type" "shift") + (set_attr "iscompact" "false") + (set_attr "length" "4,4,4,8,8") + (set_attr "predicable" "yes,no,no,yes,no") + (set_attr "cond" "canuse,nocond,nocond,canuse,nocond")]) ;; include the arc-FPX instructions (include "fpx.md") diff --git a/gcc/config/arc/arc.opt b/gcc/config/arc/arc.opt index 29e89f93d15..0c10c67c4e7 100644 --- a/gcc/config/arc/arc.opt +++ b/gcc/config/arc/arc.opt @@ -53,6 +53,18 @@ mARC700 Target Report Same as -mA7. 
+mmpy-option= +Target RejectNegative Joined UInteger Var(arc_mpy_option) Init(2) +-mmpy-option={0,1,2,3,4,5,6,7,8,9} Compile ARCv2 code with a multiplier design option. Option 2 is default on. + +mdiv-rem +Target Report Mask(DIVREM) +Enable DIV-REM instructions for ARCv2 + +mcode-density +Target Report Mask(CODE_DENSITY) +Enable code density instructions for ARCv2 + mmixed-code Target Report Mask(MIXED_CODE_SET) Tweak register allocation to help 16-bit instruction generation. @@ -162,11 +174,32 @@ EnumValue Enum(processor_type) String(ARC600) Value(PROCESSOR_ARC600) EnumValue +Enum(processor_type) String(arc600) Value(PROCESSOR_ARC600) + +EnumValue Enum(processor_type) String(ARC601) Value(PROCESSOR_ARC601) EnumValue +Enum(processor_type) String(arc601) Value(PROCESSOR_ARC601) + +EnumValue Enum(processor_type) String(ARC700) Value(PROCESSOR_ARC700) +EnumValue +Enum(processor_type) String(arc700) Value(PROCESSOR_ARC700) + +EnumValue +Enum(processor_type) String(ARCEM) Value(PROCESSOR_ARCEM) + +EnumValue +Enum(processor_type) String(arcem) Value(PROCESSOR_ARCEM) + +EnumValue +Enum(processor_type) String(ARCHS) Value(PROCESSOR_ARCHS) + +EnumValue +Enum(processor_type) String(archs) Value(PROCESSOR_ARCHS) + msize-level= Target RejectNegative Joined UInteger Var(arc_size_opt_level) Init(-1) size optimization level: 0:none 1:opportunistic 2: regalloc 3:drop align, -Os. diff --git a/gcc/config/arc/arcEM.md b/gcc/config/arc/arcEM.md new file mode 100644 index 00000000000..a72d2504e52 --- /dev/null +++ b/gcc/config/arc/arcEM.md @@ -0,0 +1,93 @@ +;; DFA scheduling description of the Synopsys DesignWare ARC EM cpu +;; for GNU C compiler +;; Copyright (C) 2007-2015 Free Software Foundation, Inc. +;; Contributor: Claudiu Zissulescu <claudiu.zissulescu@synopsys.com> + +;; This file is part of GCC. 
+ +;; GCC is free software; you can redistribute it and/or modify +;; it under the terms of the GNU General Public License as published by +;; the Free Software Foundation; either version 3, or (at your option) +;; any later version. + +;; GCC is distributed in the hope that it will be useful, +;; but WITHOUT ANY WARRANTY; without even the implied warranty of +;; MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the +;; GNU General Public License for more details. + +;; You should have received a copy of the GNU General Public License +;; along with GCC; see the file COPYING3. If not see +;; <http://www.gnu.org/licenses/>. + +(define_automaton "ARCEM") + +(define_cpu_unit "em_issue, ld_st, mul_em, divrem_em" "ARCEM") + +(define_insn_reservation "em_data_load" 2 + (and (match_test "TARGET_EM") + (eq_attr "type" "load")) + "em_issue+ld_st,nothing") + +(define_insn_reservation "em_data_store" 1 + (and (match_test "TARGET_EM") + (eq_attr "type" "store")) + "em_issue+ld_st") + +;; Multipliers options +(define_insn_reservation "mul_em_mpyw_1" 1 + (and (match_test "TARGET_EM") + (match_test "arc_mpy_option > 0") + (match_test "arc_mpy_option <= 2") + (eq_attr "type" "mul16_em")) + "em_issue+mul_em") + +(define_insn_reservation "mul_em_mpyw_2" 2 + (and (match_test "TARGET_EM") + (match_test "arc_mpy_option > 2") + (match_test "arc_mpy_option <= 5") + (eq_attr "type" "mul16_em")) + "em_issue+mul_em, nothing") + +(define_insn_reservation "mul_em_mpyw_4" 4 + (and (match_test "TARGET_EM") + (match_test "arc_mpy_option == 6") + (eq_attr "type" "mul16_em")) + "em_issue+mul_em, mul_em*3") + +(define_insn_reservation "mul_em_multi_wlh1" 1 + (and (match_test "TARGET_EM") + (match_test "arc_mpy_option == 2") + (eq_attr "type" "multi,umulti")) + "em_issue+mul_em") + +(define_insn_reservation "mul_em_multi_wlh2" 2 + (and (match_test "TARGET_EM") + (match_test "arc_mpy_option == 3") + (eq_attr "type" "multi,umulti")) + "em_issue+mul_em, nothing") + +(define_insn_reservation 
"mul_em_multi_wlh3" 3 + (and (match_test "TARGET_EM") + (match_test "arc_mpy_option == 4") + (eq_attr "type" "multi,umulti")) + "em_issue+mul_em, mul_em*2") + +;; FIXME! Make the difference between MPY and MPYM for WLH4 +(define_insn_reservation "mul_em_multi_wlh4" 4 + (and (match_test "TARGET_EM") + (match_test "arc_mpy_option == 5") + (eq_attr "type" "multi,umulti")) + "em_issue+mul_em, mul_em*4") + +(define_insn_reservation "mul_em_multi_wlh5" 9 + (and (match_test "TARGET_EM") + (match_test "arc_mpy_option == 6") + (eq_attr "type" "multi,umulti")) + "em_issue+mul_em, mul_em*8") + +;; Radix-4 divider timing +(define_insn_reservation "em_divrem" 3 + (and (match_test "TARGET_EM") + (match_test "TARGET_DIVREM") + (eq_attr "type" "div_rem")) + "em_issue+mul_em+divrem_em, (mul_em+divrem_em)*2") diff --git a/gcc/config/arc/arcHS.md b/gcc/config/arc/arcHS.md new file mode 100644 index 00000000000..06937445a47 --- /dev/null +++ b/gcc/config/arc/arcHS.md @@ -0,0 +1,76 @@ +;; DFA scheduling description of the Synopsys DesignWare ARC HS cpu +;; for GNU C compiler +;; Copyright (C) 2007-2015 Free Software Foundation, Inc. +;; Contributor: Claudiu Zissulescu <claudiu.zissulescu@synopsys.com> + +;; This file is part of GCC. + +;; GCC is free software; you can redistribute it and/or modify +;; it under the terms of the GNU General Public License as published by +;; the Free Software Foundation; either version 3, or (at your option) +;; any later version. + +;; GCC is distributed in the hope that it will be useful, +;; but WITHOUT ANY WARRANTY; without even the implied warranty of +;; MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the +;; GNU General Public License for more details. + +;; You should have received a copy of the GNU General Public License +;; along with GCC; see the file COPYING3. If not see +;; <http://www.gnu.org/licenses/>. 
+ +(define_automaton "ARCHS") + +(define_cpu_unit "hs_issue, hs_ld_st, divrem_hs, mul_hs, x1, x2" "ARCHS") + +(define_insn_reservation "hs_data_load" 4 + (and (match_test "TARGET_HS") + (eq_attr "type" "load")) + "hs_issue+hs_ld_st,hs_ld_st,nothing*2") + +(define_insn_reservation "hs_data_store" 1 + (and (match_test "TARGET_HS") + (eq_attr "type" "store")) + "hs_issue+hs_ld_st") + +(define_insn_reservation "hs_alu0" 2 + (and (match_test "TARGET_HS") + (eq_attr "type" "cc_arith, two_cycle_core, shift, lr, sr")) + "hs_issue+x1,x2") + +(define_insn_reservation "hs_alu1" 4 + (and (match_test "TARGET_HS") + (eq_attr "type" "move, cmove, unary, binary, compare, misc")) + "hs_issue+x1, nothing*3") + +(define_insn_reservation "hs_divrem" 13 + (and (match_test "TARGET_HS") + (match_test "TARGET_DIVREM") + (eq_attr "type" "div_rem")) + "hs_issue+divrem_hs, (divrem_hs)*12") + +(define_insn_reservation "hs_mul" 3 + (and (match_test "TARGET_HS") + (eq_attr "type" "mul16_em, multi, umulti")) + "hs_issue+mul_hs, nothing*3") + +;; BYPASS EALU -> +(define_bypass 1 "hs_alu0" "hs_divrem") +(define_bypass 1 "hs_alu0" "hs_mul") + +;; BYPASS BALU -> +(define_bypass 1 "hs_alu1" "hs_alu1") +(define_bypass 1 "hs_alu1" "hs_data_store" "store_data_bypass_p") + +;; BYPASS LD -> +(define_bypass 1 "hs_data_load" "hs_alu1") +(define_bypass 3 "hs_data_load" "hs_divrem") +(define_bypass 3 "hs_data_load" "hs_data_load") +(define_bypass 3 "hs_data_load" "hs_mul") +(define_bypass 1 "hs_data_load" "hs_data_store" "store_data_bypass_p") + +;; BYPASS MPY -> +;;(define_bypass 3 "hs_mul" "hs_mul") +(define_bypass 1 "hs_mul" "hs_alu1") +(define_bypass 3 "hs_mul" "hs_divrem") +(define_bypass 1 "hs_mul" "hs_data_store" "store_data_bypass_p") diff --git a/gcc/config/arc/constraints.md b/gcc/config/arc/constraints.md index 3d0db360557..65ea44a9f13 100644 --- a/gcc/config/arc/constraints.md +++ b/gcc/config/arc/constraints.md @@ -127,6 +127,12 @@ (and (match_code "const_int") (match_test "UNSIGNED_INT6 
(-ival)"))) +(define_constraint "C16" + "@internal + A 16-bit signed integer constant" + (and (match_code "const_int") + (match_test "SIGNED_INT16 (ival)"))) + (define_constraint "M" "@internal A 5-bit unsigned integer constant" @@ -212,6 +218,12 @@ (and (match_code "const_int") (match_test "ival && IS_POWEROF2_P (ival + 1)"))) +(define_constraint "C3p" + "@internal + constant int used to select xbfu a,b,u6 instruction. The values accepted are 1 and 2." + (and (match_code "const_int") + (match_test "((ival == 1) || (ival == 2))"))) + (define_constraint "Ccp" "@internal constant such that ~x (one's Complement) is a power of two" @@ -397,3 +409,15 @@ Integer constant zero" (and (match_code "const_int") (match_test "IS_ZERO (ival)"))) + +(define_constraint "Cm2" + "@internal + A signed 9-bit integer constant." + (and (match_code "const_int") + (match_test "(ival >= -256) && (ival <=255)"))) + +(define_constraint "C62" + "@internal + An unsigned 6-bit integer constant, up to 62." + (and (match_code "const_int") + (match_test "UNSIGNED_INT6 (ival - 1)"))) diff --git a/gcc/config/arc/predicates.md b/gcc/config/arc/predicates.md index d72f097eb71..43f9474c691 100644 --- a/gcc/config/arc/predicates.md +++ b/gcc/config/arc/predicates.md @@ -664,7 +664,7 @@ (match_operand 0 "shiftr4_operator"))) (define_predicate "mult_operator" - (and (match_code "mult") (match_test "TARGET_ARC700 && !TARGET_NOMPY_SET")) + (and (match_code "mult") (match_test "TARGET_MPY")) ) (define_predicate "commutative_operator" @@ -809,3 +809,7 @@ (match_test "INTVAL (op) >= 0") (and (match_test "const_double_operand (op, mode)") (match_test "CONST_DOUBLE_HIGH (op) == 0")))) + +(define_predicate "short_const_int_operand" + (and (match_operand 0 "const_int_operand") + (match_test "satisfies_constraint_C16 (op)"))) diff --git a/gcc/config/arc/t-arc-newlib b/gcc/config/arc/t-arc-newlib index 8823805b8aa..ea43a52cdc0 100644 --- a/gcc/config/arc/t-arc-newlib +++ b/gcc/config/arc/t-arc-newlib @@ -17,8 +17,8 
@@ # with GCC; see the file COPYING3. If not see # <http://www.gnu.org/licenses/>. -MULTILIB_OPTIONS=mcpu=ARC600/mcpu=ARC601 mmul64/mmul32x16 mnorm -MULTILIB_DIRNAMES=arc600 arc601 mul64 mul32x16 norm +MULTILIB_OPTIONS=mcpu=ARC600/mcpu=ARC601/mcpu=ARC700/mcpu=ARCEM/mcpu=ARCHS mmul64/mmul32x16 mnorm +MULTILIB_DIRNAMES=arc600 arc601 arc700 em hs mul64 mul32x16 norm # # Aliases: MULTILIB_MATCHES = mcpu?ARC600=mcpu?arc600 @@ -26,10 +26,21 @@ MULTILIB_MATCHES += mcpu?ARC600=mARC600 MULTILIB_MATCHES += mcpu?ARC600=mA6 MULTILIB_MATCHES += mcpu?ARC600=mno-mpy MULTILIB_MATCHES += mcpu?ARC601=mcpu?arc601 +MULTILIB_MATCHES += mcpu?ARC700=mA7 +MULTILIB_MATCHES += mcpu?ARC700=mARC700 +MULTILIB_MATCHES += mcpu?ARC700=mcpu?arc700 +MULTILIB_MATCHES += mcpu?ARCEM=mcpu?arcem +MULTILIB_MATCHES += mcpu?ARCHS=mcpu?archs MULTILIB_MATCHES += EL=mlittle-endian MULTILIB_MATCHES += EB=mbig-endian # # These don't make sense for the ARC700 default target: -MULTILIB_EXCEPTIONS=mmul64* mmul32x16* mnorm* +MULTILIB_EXCEPTIONS=mmul64* mmul32x16* norm* # And neither of the -mmul* options make sense without -mnorm: MULTILIB_EXCLUSIONS=mARC600/mmul64/!mnorm mcpu=ARC601/mmul64/!mnorm mARC600/mmul32x16/!mnorm +# Exclusions for ARC700 +MULTILIB_EXCEPTIONS += mcpu=ARC700/mnorm* mcpu=ARC700/mmul64* mcpu=ARC700/mmul32x16* +# Exclusions for ARCv2EM +MULTILIB_EXCEPTIONS += mcpu=ARCEM/mmul64* mcpu=ARCEM/mmul32x16* +# Exclusions for ARCv2HS +MULTILIB_EXCEPTIONS += mcpu=ARCHS/mmul64* mcpu=ARCHS/mmul32x16* mcpu=ARCHS/mnorm* diff --git a/gcc/config/arm/arm-builtins.c b/gcc/config/arm/arm-builtins.c index bad3dc381a1..f73afc269c3 100644 --- a/gcc/config/arm/arm-builtins.c +++ b/gcc/config/arm/arm-builtins.c @@ -67,7 +67,9 @@ enum arm_type_qualifiers /* Polynomial types. */ qualifier_poly = 0x100, /* Lane indices - must be within range of previous argument = a vector. */ - qualifier_lane_index = 0x200 + qualifier_lane_index = 0x200, + /* Lane indices for single lane structure loads and stores. 
*/ + qualifier_struct_load_store_lane_index = 0x400 }; /* The qualifier_internal allows generation of a unary builtin from @@ -150,7 +152,7 @@ arm_load1_qualifiers[SIMD_MAX_BUILTIN_ARGS] static enum arm_type_qualifiers arm_load1_lane_qualifiers[SIMD_MAX_BUILTIN_ARGS] = { qualifier_none, qualifier_const_pointer_map_mode, - qualifier_none, qualifier_immediate }; + qualifier_none, qualifier_struct_load_store_lane_index }; #define LOAD1LANE_QUALIFIERS (arm_load1_lane_qualifiers) /* The first argument (return type) of a store should be void type, @@ -169,7 +171,7 @@ arm_store1_qualifiers[SIMD_MAX_BUILTIN_ARGS] static enum arm_type_qualifiers arm_storestruct_lane_qualifiers[SIMD_MAX_BUILTIN_ARGS] = { qualifier_void, qualifier_pointer_map_mode, - qualifier_none, qualifier_immediate }; + qualifier_none, qualifier_struct_load_store_lane_index }; #define STORE1LANE_QUALIFIERS (arm_storestruct_lane_qualifiers) #define v8qi_UP V8QImode @@ -1963,6 +1965,7 @@ typedef enum { NEON_ARG_COPY_TO_REG, NEON_ARG_CONSTANT, NEON_ARG_LANE_INDEX, + NEON_ARG_STRUCT_LOAD_STORE_LANE_INDEX, NEON_ARG_MEMORY, NEON_ARG_STOP } builtin_arg; @@ -2020,9 +2023,9 @@ neon_dereference_pointer (tree exp, tree type, machine_mode mem_mode, /* Expand a Neon builtin. */ static rtx arm_expand_neon_args (rtx target, machine_mode map_mode, int fcode, - int icode, int have_retval, tree exp, ...) 
+ int icode, int have_retval, tree exp, + builtin_arg *args) { - va_list ap; rtx pat; tree arg[SIMD_MAX_BUILTIN_ARGS]; rtx op[SIMD_MAX_BUILTIN_ARGS]; @@ -2037,13 +2040,11 @@ arm_expand_neon_args (rtx target, machine_mode map_mode, int fcode, || !(*insn_data[icode].operand[0].predicate) (target, tmode))) target = gen_reg_rtx (tmode); - va_start (ap, exp); - formals = TYPE_ARG_TYPES (TREE_TYPE (arm_builtin_decls[fcode])); for (;;) { - builtin_arg thisarg = (builtin_arg) va_arg (ap, int); + builtin_arg thisarg = args[argc]; if (thisarg == NEON_ARG_STOP) break; @@ -2079,6 +2080,18 @@ arm_expand_neon_args (rtx target, machine_mode map_mode, int fcode, op[argc] = copy_to_mode_reg (mode[argc], op[argc]); break; + case NEON_ARG_STRUCT_LOAD_STORE_LANE_INDEX: + gcc_assert (argc > 1); + if (CONST_INT_P (op[argc])) + { + neon_lane_bounds (op[argc], 0, + GET_MODE_NUNITS (map_mode), exp); + /* Keep to GCC-vector-extension lane indices in the RTL. */ + op[argc] = + GEN_INT (NEON_ENDIAN_LANE_N (map_mode, INTVAL (op[argc]))); + } + goto constant_arg; + case NEON_ARG_LANE_INDEX: /* Previous argument must be a vector, which this indexes. */ gcc_assert (argc > 0); @@ -2089,19 +2102,22 @@ arm_expand_neon_args (rtx target, machine_mode map_mode, int fcode, } /* Fall through - if the lane index isn't a constant then the next case will error. */ + case NEON_ARG_CONSTANT: +constant_arg: if (!(*insn_data[icode].operand[opno].predicate) (op[argc], mode[argc])) - error_at (EXPR_LOCATION (exp), "incompatible type for argument %d, " - "expected %<const int%>", argc + 1); + { + error ("%Kargument %d must be a constant immediate", + exp, argc + 1); + return const0_rtx; + } break; + case NEON_ARG_MEMORY: /* Check if expand failed. */ if (op[argc] == const0_rtx) - { - va_end (ap); return 0; - } gcc_assert (MEM_P (op[argc])); PUT_MODE (op[argc], mode[argc]); /* ??? 
arm_neon.h uses the same built-in functions for signed @@ -2122,8 +2138,6 @@ arm_expand_neon_args (rtx target, machine_mode map_mode, int fcode, } } - va_end (ap); - if (have_retval) switch (argc) { @@ -2235,6 +2249,8 @@ arm_expand_neon_builtin (int fcode, tree exp, rtx target) if (d->qualifiers[qualifiers_k] & qualifier_lane_index) args[k] = NEON_ARG_LANE_INDEX; + else if (d->qualifiers[qualifiers_k] & qualifier_struct_load_store_lane_index) + args[k] = NEON_ARG_STRUCT_LOAD_STORE_LANE_INDEX; else if (d->qualifiers[qualifiers_k] & qualifier_immediate) args[k] = NEON_ARG_CONSTANT; else if (d->qualifiers[qualifiers_k] & qualifier_maybe_immediate) @@ -2260,11 +2276,7 @@ arm_expand_neon_builtin (int fcode, tree exp, rtx target) the function is void, and a 1 if it is not. */ return arm_expand_neon_args (target, d->mode, fcode, icode, !is_void, exp, - args[1], - args[2], - args[3], - args[4], - NEON_ARG_STOP); + &args[1]); } /* Expand an expression EXP that calls a built-in function, diff --git a/gcc/config/arm/arm.c b/gcc/config/arm/arm.c index f4ebbc80f16..709369441d0 100644 --- a/gcc/config/arm/arm.c +++ b/gcc/config/arm/arm.c @@ -11049,6 +11049,23 @@ arm_new_rtx_costs (rtx x, enum rtx_code code, enum rtx_code outer_code, case UNSIGNED_FIX: if (TARGET_HARD_FLOAT) { + /* The *combine_vcvtf2i reduces a vmul+vcvt into + a vcvt fixed-point conversion. */ + if (code == FIX && mode == SImode + && GET_CODE (XEXP (x, 0)) == FIX + && GET_MODE (XEXP (x, 0)) == SFmode + && GET_CODE (XEXP (XEXP (x, 0), 0)) == MULT + && vfp3_const_double_for_bits (XEXP (XEXP (XEXP (x, 0), 0), 1)) + > 0) + { + if (speed_p) + *cost += extra_cost->fp[0].toint; + + *cost += rtx_cost (XEXP (XEXP (XEXP (x, 0), 0), 0), mode, + code, 0, speed_p); + return true; + } + if (GET_MODE_CLASS (mode) == MODE_INT) { mode = GET_MODE (XEXP (x, 0)); @@ -12339,32 +12356,15 @@ neon_valid_immediate (rtx op, machine_mode mode, int inverse, { rtx el = vector ? 
CONST_VECTOR_ELT (op, i) : op; unsigned HOST_WIDE_INT elpart; - unsigned int part, parts; - if (CONST_INT_P (el)) - { - elpart = INTVAL (el); - parts = 1; - } - else if (CONST_DOUBLE_P (el)) - { - elpart = CONST_DOUBLE_LOW (el); - parts = 2; - } - else - gcc_unreachable (); + gcc_assert (CONST_INT_P (el)); + elpart = INTVAL (el); - for (part = 0; part < parts; part++) - { - unsigned int byte; - for (byte = 0; byte < innersize; byte++) - { - bytes[idx++] = (elpart & 0xff) ^ invmask; - elpart >>= BITS_PER_UNIT; - } - if (CONST_DOUBLE_P (el)) - elpart = CONST_DOUBLE_HIGH (el); - } + for (unsigned int byte = 0; byte < innersize; byte++) + { + bytes[idx++] = (elpart & 0xff) ^ invmask; + elpart >>= BITS_PER_UNIT; + } } /* Sanity check. */ @@ -12960,14 +12960,14 @@ neon_vector_mem_operand (rtx op, int type, bool strict) rtx ind; /* Reject eliminable registers. */ - if (! (reload_in_progress || reload_completed) - && ( reg_mentioned_p (frame_pointer_rtx, op) + if (strict && ! (reload_in_progress || reload_completed) + && (reg_mentioned_p (frame_pointer_rtx, op) || reg_mentioned_p (arg_pointer_rtx, op) || reg_mentioned_p (virtual_incoming_args_rtx, op) || reg_mentioned_p (virtual_outgoing_args_rtx, op) || reg_mentioned_p (virtual_stack_dynamic_rtx, op) || reg_mentioned_p (virtual_stack_vars_rtx, op))) - return !strict; + return FALSE; /* Constants are converted into offsets from labels. */ if (!MEM_P (op)) @@ -30103,4 +30103,5 @@ arm_sched_fusion_priority (rtx_insn *insn, int max_pri, *pri = tmp; return; } + #include "gt-arm.h" diff --git a/gcc/config/arm/arm.h b/gcc/config/arm/arm.h index a1a04a94ef2..313fed5b450 100644 --- a/gcc/config/arm/arm.h +++ b/gcc/config/arm/arm.h @@ -284,6 +284,12 @@ extern void (*arm_lang_output_object_attributes_hook)(void); #define TARGET_BPABI false #endif +/* Transform lane numbers on big endian targets. 
This is used to allow for the + endianness difference between NEON architectural lane numbers and those + used in RTL */ +#define NEON_ENDIAN_LANE_N(mode, n) \ + (BYTES_BIG_ENDIAN ? GET_MODE_NUNITS (mode) - 1 - n : n) + /* Support for a compile-time default CPU, et cetera. The rules are: --with-arch is ignored if -march or -mcpu are specified. --with-cpu is ignored if -march or -mcpu are specified, and is overridden diff --git a/gcc/config/arm/neon.md b/gcc/config/arm/neon.md index e5a2b0f1c9a..119550c4baa 100644 --- a/gcc/config/arm/neon.md +++ b/gcc/config/arm/neon.md @@ -4253,6 +4253,9 @@ if (BYTES_BIG_ENDIAN) [(set_attr "type" "neon_load1_1reg<q>")] ) +;; The lane numbers in the RTL are in GCC lane order, having been flipped +;; in arm_expand_neon_args. The lane numbers are restored to architectural +;; lane order here. (define_insn "neon_vld1_lane<mode>" [(set (match_operand:VDX 0 "s_register_operand" "=w") (unspec:VDX [(match_operand:<V_elem> 1 "neon_struct_operand" "Um") @@ -4261,10 +4264,9 @@ if (BYTES_BIG_ENDIAN) UNSPEC_VLD1_LANE))] "TARGET_NEON" { - HOST_WIDE_INT lane = INTVAL (operands[3]); + HOST_WIDE_INT lane = NEON_ENDIAN_LANE_N(<MODE>mode, INTVAL (operands[3])); HOST_WIDE_INT max = GET_MODE_NUNITS (<MODE>mode); - if (lane < 0 || lane >= max) - error ("lane out of range"); + operands[3] = GEN_INT (lane); if (max == 1) return "vld1.<V_sz_elem>\t%P0, %A1"; else @@ -4273,6 +4275,8 @@ if (BYTES_BIG_ENDIAN) [(set_attr "type" "neon_load1_one_lane<q>")] ) +;; see comment on neon_vld1_lane for reason why the lane numbers are reversed +;; here on big endian targets. 
(define_insn "neon_vld1_lane<mode>" [(set (match_operand:VQX 0 "s_register_operand" "=w") (unspec:VQX [(match_operand:<V_elem> 1 "neon_struct_operand" "Um") @@ -4281,12 +4285,11 @@ if (BYTES_BIG_ENDIAN) UNSPEC_VLD1_LANE))] "TARGET_NEON" { - HOST_WIDE_INT lane = INTVAL (operands[3]); + HOST_WIDE_INT lane = NEON_ENDIAN_LANE_N(<MODE>mode, INTVAL (operands[3])); HOST_WIDE_INT max = GET_MODE_NUNITS (<MODE>mode); + operands[3] = GEN_INT (lane); int regno = REGNO (operands[0]); - if (lane < 0 || lane >= max) - error ("lane out of range"); - else if (lane >= max / 2) + if (lane >= max / 2) { lane -= max / 2; regno += 2; @@ -4359,6 +4362,8 @@ if (BYTES_BIG_ENDIAN) "vst1.<V_sz_elem>\t%h1, %A0" [(set_attr "type" "neon_store1_1reg<q>")]) +;; see comment on neon_vld1_lane for reason why the lane numbers are reversed +;; here on big endian targets. (define_insn "neon_vst1_lane<mode>" [(set (match_operand:<V_elem> 0 "neon_struct_operand" "=Um") (unspec:<V_elem> @@ -4367,10 +4372,9 @@ if (BYTES_BIG_ENDIAN) UNSPEC_VST1_LANE))] "TARGET_NEON" { - HOST_WIDE_INT lane = INTVAL (operands[2]); + HOST_WIDE_INT lane = NEON_ENDIAN_LANE_N(<MODE>mode, INTVAL (operands[2])); HOST_WIDE_INT max = GET_MODE_NUNITS (<MODE>mode); - if (lane < 0 || lane >= max) - error ("lane out of range"); + operands[2] = GEN_INT (lane); if (max == 1) return "vst1.<V_sz_elem>\t{%P1}, %A0"; else @@ -4379,6 +4383,8 @@ if (BYTES_BIG_ENDIAN) [(set_attr "type" "neon_store1_one_lane<q>")] ) +;; see comment on neon_vld1_lane for reason why the lane numbers are reversed +;; here on big endian targets. 
(define_insn "neon_vst1_lane<mode>" [(set (match_operand:<V_elem> 0 "neon_struct_operand" "=Um") (unspec:<V_elem> @@ -4387,17 +4393,15 @@ if (BYTES_BIG_ENDIAN) UNSPEC_VST1_LANE))] "TARGET_NEON" { - HOST_WIDE_INT lane = INTVAL (operands[2]); + HOST_WIDE_INT lane = NEON_ENDIAN_LANE_N(<MODE>mode, INTVAL (operands[2])); HOST_WIDE_INT max = GET_MODE_NUNITS (<MODE>mode); int regno = REGNO (operands[1]); - if (lane < 0 || lane >= max) - error ("lane out of range"); - else if (lane >= max / 2) + if (lane >= max / 2) { lane -= max / 2; regno += 2; - operands[2] = GEN_INT (lane); } + operands[2] = GEN_INT (lane); operands[1] = gen_rtx_REG (<V_HALF>mode, regno); if (max == 2) return "vst1.<V_sz_elem>\t{%P1}, %A0"; @@ -4448,6 +4452,8 @@ if (BYTES_BIG_ENDIAN) "vld2.<V_sz_elem>\t%h0, %A1" [(set_attr "type" "neon_load2_2reg_q")]) +;; see comment on neon_vld1_lane for reason why the lane numbers are reversed +;; here on big endian targets. (define_insn "neon_vld2_lane<mode>" [(set (match_operand:TI 0 "s_register_operand" "=w") (unspec:TI [(match_operand:<V_two_elem> 1 "neon_struct_operand" "Um") @@ -4457,22 +4463,22 @@ if (BYTES_BIG_ENDIAN) UNSPEC_VLD2_LANE))] "TARGET_NEON" { - HOST_WIDE_INT lane = INTVAL (operands[3]); + HOST_WIDE_INT lane = NEON_ENDIAN_LANE_N(<MODE>mode, INTVAL (operands[3])); HOST_WIDE_INT max = GET_MODE_NUNITS (<MODE>mode); int regno = REGNO (operands[0]); rtx ops[4]; - if (lane < 0 || lane >= max) - error ("lane out of range"); ops[0] = gen_rtx_REG (DImode, regno); ops[1] = gen_rtx_REG (DImode, regno + 2); ops[2] = operands[1]; - ops[3] = operands[3]; + ops[3] = GEN_INT (lane); output_asm_insn ("vld2.<V_sz_elem>\t{%P0[%c3], %P1[%c3]}, %A2", ops); return ""; } [(set_attr "type" "neon_load2_one_lane<q>")] ) +;; see comment on neon_vld1_lane for reason why the lane numbers are reversed +;; here on big endian targets. 
(define_insn "neon_vld2_lane<mode>" [(set (match_operand:OI 0 "s_register_operand" "=w") (unspec:OI [(match_operand:<V_two_elem> 1 "neon_struct_operand" "Um") @@ -4482,13 +4488,11 @@ if (BYTES_BIG_ENDIAN) UNSPEC_VLD2_LANE))] "TARGET_NEON" { - HOST_WIDE_INT lane = INTVAL (operands[3]); + HOST_WIDE_INT lane = NEON_ENDIAN_LANE_N(<MODE>mode, INTVAL (operands[3])); HOST_WIDE_INT max = GET_MODE_NUNITS (<MODE>mode); int regno = REGNO (operands[0]); rtx ops[4]; - if (lane < 0 || lane >= max) - error ("lane out of range"); - else if (lane >= max / 2) + if (lane >= max / 2) { lane -= max / 2; regno += 2; @@ -4563,6 +4567,8 @@ if (BYTES_BIG_ENDIAN) [(set_attr "type" "neon_store2_4reg<q>")] ) +;; see comment on neon_vld1_lane for reason why the lane numbers are reversed +;; here on big endian targets. (define_insn "neon_vst2_lane<mode>" [(set (match_operand:<V_two_elem> 0 "neon_struct_operand" "=Um") (unspec:<V_two_elem> @@ -4572,22 +4578,22 @@ if (BYTES_BIG_ENDIAN) UNSPEC_VST2_LANE))] "TARGET_NEON" { - HOST_WIDE_INT lane = INTVAL (operands[2]); + HOST_WIDE_INT lane = NEON_ENDIAN_LANE_N(<MODE>mode, INTVAL (operands[2])); HOST_WIDE_INT max = GET_MODE_NUNITS (<MODE>mode); int regno = REGNO (operands[1]); rtx ops[4]; - if (lane < 0 || lane >= max) - error ("lane out of range"); ops[0] = operands[0]; ops[1] = gen_rtx_REG (DImode, regno); ops[2] = gen_rtx_REG (DImode, regno + 2); - ops[3] = operands[2]; + ops[3] = GEN_INT (lane); output_asm_insn ("vst2.<V_sz_elem>\t{%P1[%c3], %P2[%c3]}, %A0", ops); return ""; } [(set_attr "type" "neon_store2_one_lane<q>")] ) +;; see comment on neon_vld1_lane for reason why the lane numbers are reversed +;; here on big endian targets. 
(define_insn "neon_vst2_lane<mode>" [(set (match_operand:<V_two_elem> 0 "neon_struct_operand" "=Um") (unspec:<V_two_elem> @@ -4597,13 +4603,11 @@ if (BYTES_BIG_ENDIAN) UNSPEC_VST2_LANE))] "TARGET_NEON" { - HOST_WIDE_INT lane = INTVAL (operands[2]); + HOST_WIDE_INT lane = NEON_ENDIAN_LANE_N(<MODE>mode, INTVAL (operands[2])); HOST_WIDE_INT max = GET_MODE_NUNITS (<MODE>mode); int regno = REGNO (operands[1]); rtx ops[4]; - if (lane < 0 || lane >= max) - error ("lane out of range"); - else if (lane >= max / 2) + if (lane >= max / 2) { lane -= max / 2; regno += 2; @@ -4707,6 +4711,8 @@ if (BYTES_BIG_ENDIAN) [(set_attr "type" "neon_load3_3reg<q>")] ) +;; see comment on neon_vld1_lane for reason why the lane numbers are reversed +;; here on big endian targets. (define_insn "neon_vld3_lane<mode>" [(set (match_operand:EI 0 "s_register_operand" "=w") (unspec:EI [(match_operand:<V_three_elem> 1 "neon_struct_operand" "Um") @@ -4716,17 +4722,15 @@ if (BYTES_BIG_ENDIAN) UNSPEC_VLD3_LANE))] "TARGET_NEON" { - HOST_WIDE_INT lane = INTVAL (operands[3]); + HOST_WIDE_INT lane = NEON_ENDIAN_LANE_N (<MODE>mode, INTVAL (operands[3])); HOST_WIDE_INT max = GET_MODE_NUNITS (<MODE>mode); int regno = REGNO (operands[0]); rtx ops[5]; - if (lane < 0 || lane >= max) - error ("lane out of range"); ops[0] = gen_rtx_REG (DImode, regno); ops[1] = gen_rtx_REG (DImode, regno + 2); ops[2] = gen_rtx_REG (DImode, regno + 4); ops[3] = operands[1]; - ops[4] = operands[3]; + ops[4] = GEN_INT (lane); output_asm_insn ("vld3.<V_sz_elem>\t{%P0[%c4], %P1[%c4], %P2[%c4]}, %3", ops); return ""; @@ -4734,6 +4738,8 @@ if (BYTES_BIG_ENDIAN) [(set_attr "type" "neon_load3_one_lane<q>")] ) +;; see comment on neon_vld1_lane for reason why the lane numbers are reversed +;; here on big endian targets. 
(define_insn "neon_vld3_lane<mode>" [(set (match_operand:CI 0 "s_register_operand" "=w") (unspec:CI [(match_operand:<V_three_elem> 1 "neon_struct_operand" "Um") @@ -4743,13 +4749,11 @@ if (BYTES_BIG_ENDIAN) UNSPEC_VLD3_LANE))] "TARGET_NEON" { - HOST_WIDE_INT lane = INTVAL (operands[3]); + HOST_WIDE_INT lane = NEON_ENDIAN_LANE_N(<MODE>mode, INTVAL (operands[3])); HOST_WIDE_INT max = GET_MODE_NUNITS (<MODE>mode); int regno = REGNO (operands[0]); rtx ops[5]; - if (lane < 0 || lane >= max) - error ("lane out of range"); - else if (lane >= max / 2) + if (lane >= max / 2) { lane -= max / 2; regno += 2; @@ -4879,6 +4883,8 @@ if (BYTES_BIG_ENDIAN) [(set_attr "type" "neon_store3_3reg<q>")] ) +;; see comment on neon_vld1_lane for reason why the lane numbers are reversed +;; here on big endian targets. (define_insn "neon_vst3_lane<mode>" [(set (match_operand:<V_three_elem> 0 "neon_struct_operand" "=Um") (unspec:<V_three_elem> @@ -4888,17 +4894,15 @@ if (BYTES_BIG_ENDIAN) UNSPEC_VST3_LANE))] "TARGET_NEON" { - HOST_WIDE_INT lane = INTVAL (operands[2]); + HOST_WIDE_INT lane = NEON_ENDIAN_LANE_N(<MODE>mode, INTVAL (operands[2])); HOST_WIDE_INT max = GET_MODE_NUNITS (<MODE>mode); int regno = REGNO (operands[1]); rtx ops[5]; - if (lane < 0 || lane >= max) - error ("lane out of range"); ops[0] = operands[0]; ops[1] = gen_rtx_REG (DImode, regno); ops[2] = gen_rtx_REG (DImode, regno + 2); ops[3] = gen_rtx_REG (DImode, regno + 4); - ops[4] = operands[2]; + ops[4] = GEN_INT (lane); output_asm_insn ("vst3.<V_sz_elem>\t{%P1[%c4], %P2[%c4], %P3[%c4]}, %0", ops); return ""; @@ -4906,6 +4910,8 @@ if (BYTES_BIG_ENDIAN) [(set_attr "type" "neon_store3_one_lane<q>")] ) +;; see comment on neon_vld1_lane for reason why the lane numbers are reversed +;; here on big endian targets. 
(define_insn "neon_vst3_lane<mode>" [(set (match_operand:<V_three_elem> 0 "neon_struct_operand" "=Um") (unspec:<V_three_elem> @@ -4915,13 +4921,11 @@ if (BYTES_BIG_ENDIAN) UNSPEC_VST3_LANE))] "TARGET_NEON" { - HOST_WIDE_INT lane = INTVAL (operands[2]); + HOST_WIDE_INT lane = NEON_ENDIAN_LANE_N(<MODE>mode, INTVAL (operands[2])); HOST_WIDE_INT max = GET_MODE_NUNITS (<MODE>mode); int regno = REGNO (operands[1]); rtx ops[5]; - if (lane < 0 || lane >= max) - error ("lane out of range"); - else if (lane >= max / 2) + if (lane >= max / 2) { lane -= max / 2; regno += 2; @@ -5029,6 +5033,8 @@ if (BYTES_BIG_ENDIAN) [(set_attr "type" "neon_load4_4reg<q>")] ) +;; see comment on neon_vld1_lane for reason why the lane numbers are reversed +;; here on big endian targets. (define_insn "neon_vld4_lane<mode>" [(set (match_operand:OI 0 "s_register_operand" "=w") (unspec:OI [(match_operand:<V_four_elem> 1 "neon_struct_operand" "Um") @@ -5038,18 +5044,16 @@ if (BYTES_BIG_ENDIAN) UNSPEC_VLD4_LANE))] "TARGET_NEON" { - HOST_WIDE_INT lane = INTVAL (operands[3]); + HOST_WIDE_INT lane = NEON_ENDIAN_LANE_N(<MODE>mode, INTVAL (operands[3])); HOST_WIDE_INT max = GET_MODE_NUNITS (<MODE>mode); int regno = REGNO (operands[0]); rtx ops[6]; - if (lane < 0 || lane >= max) - error ("lane out of range"); ops[0] = gen_rtx_REG (DImode, regno); ops[1] = gen_rtx_REG (DImode, regno + 2); ops[2] = gen_rtx_REG (DImode, regno + 4); ops[3] = gen_rtx_REG (DImode, regno + 6); ops[4] = operands[1]; - ops[5] = operands[3]; + ops[5] = GEN_INT (lane); output_asm_insn ("vld4.<V_sz_elem>\t{%P0[%c5], %P1[%c5], %P2[%c5], %P3[%c5]}, %A4", ops); return ""; @@ -5057,6 +5061,8 @@ if (BYTES_BIG_ENDIAN) [(set_attr "type" "neon_load4_one_lane<q>")] ) +;; see comment on neon_vld1_lane for reason why the lane numbers are reversed +;; here on big endian targets. 
(define_insn "neon_vld4_lane<mode>" [(set (match_operand:XI 0 "s_register_operand" "=w") (unspec:XI [(match_operand:<V_four_elem> 1 "neon_struct_operand" "Um") @@ -5066,13 +5072,11 @@ if (BYTES_BIG_ENDIAN) UNSPEC_VLD4_LANE))] "TARGET_NEON" { - HOST_WIDE_INT lane = INTVAL (operands[3]); + HOST_WIDE_INT lane = NEON_ENDIAN_LANE_N(<MODE>mode, INTVAL (operands[3])); HOST_WIDE_INT max = GET_MODE_NUNITS (<MODE>mode); int regno = REGNO (operands[0]); rtx ops[6]; - if (lane < 0 || lane >= max) - error ("lane out of range"); - else if (lane >= max / 2) + if (lane >= max / 2) { lane -= max / 2; regno += 2; @@ -5209,6 +5213,8 @@ if (BYTES_BIG_ENDIAN) [(set_attr "type" "neon_store4_4reg<q>")] ) +;; see comment on neon_vld1_lane for reason why the lane numbers are reversed +;; here on big endian targets. (define_insn "neon_vst4_lane<mode>" [(set (match_operand:<V_four_elem> 0 "neon_struct_operand" "=Um") (unspec:<V_four_elem> @@ -5218,18 +5224,16 @@ if (BYTES_BIG_ENDIAN) UNSPEC_VST4_LANE))] "TARGET_NEON" { - HOST_WIDE_INT lane = INTVAL (operands[2]); + HOST_WIDE_INT lane = NEON_ENDIAN_LANE_N(<MODE>mode, INTVAL (operands[2])); HOST_WIDE_INT max = GET_MODE_NUNITS (<MODE>mode); int regno = REGNO (operands[1]); rtx ops[6]; - if (lane < 0 || lane >= max) - error ("lane out of range"); ops[0] = operands[0]; ops[1] = gen_rtx_REG (DImode, regno); ops[2] = gen_rtx_REG (DImode, regno + 2); ops[3] = gen_rtx_REG (DImode, regno + 4); ops[4] = gen_rtx_REG (DImode, regno + 6); - ops[5] = operands[2]; + ops[5] = GEN_INT (lane); output_asm_insn ("vst4.<V_sz_elem>\t{%P1[%c5], %P2[%c5], %P3[%c5], %P4[%c5]}, %A0", ops); return ""; @@ -5237,6 +5241,8 @@ if (BYTES_BIG_ENDIAN) [(set_attr "type" "neon_store4_one_lane<q>")] ) +;; see comment on neon_vld1_lane for reason why the lane numbers are reversed +;; here on big endian targets. 
(define_insn "neon_vst4_lane<mode>" [(set (match_operand:<V_four_elem> 0 "neon_struct_operand" "=Um") (unspec:<V_four_elem> @@ -5246,13 +5252,11 @@ if (BYTES_BIG_ENDIAN) UNSPEC_VST4_LANE))] "TARGET_NEON" { - HOST_WIDE_INT lane = INTVAL (operands[2]); + HOST_WIDE_INT lane = NEON_ENDIAN_LANE_N(<MODE>mode, INTVAL (operands[2])); HOST_WIDE_INT max = GET_MODE_NUNITS (<MODE>mode); int regno = REGNO (operands[1]); rtx ops[6]; - if (lane < 0 || lane >= max) - error ("lane out of range"); - else if (lane >= max / 2) + if (lane >= max / 2) { lane -= max / 2; regno += 2; diff --git a/gcc/config/ft32/ft32.c b/gcc/config/ft32/ft32.c index 85e5ba3bbe5..ab620617bf7 100644 --- a/gcc/config/ft32/ft32.c +++ b/gcc/config/ft32/ft32.c @@ -238,7 +238,7 @@ ft32_print_operand (FILE * file, rtx x, int code) return; case MEM: - output_address (XEXP (operand, 0)); + output_address (GET_MODE (XEXP (operand, 0)), XEXP (operand, 0)); return; default: diff --git a/gcc/config/i386/i386-protos.h b/gcc/config/i386/i386-protos.h index 9e20714099d..bd084dc9714 100644 --- a/gcc/config/i386/i386-protos.h +++ b/gcc/config/i386/i386-protos.h @@ -132,6 +132,7 @@ extern bool ix86_expand_vec_perm_const (rtx[]); extern bool ix86_expand_mask_vec_cmp (rtx[]); extern bool ix86_expand_int_vec_cmp (rtx[]); extern bool ix86_expand_fp_vec_cmp (rtx[]); +extern void ix86_expand_sse_movcc (rtx, rtx, rtx, rtx); extern void ix86_expand_sse_unpack (rtx, rtx, bool, bool); extern bool ix86_expand_int_addcc (rtx[]); extern rtx ix86_expand_call (rtx, rtx, rtx, rtx, rtx, bool); diff --git a/gcc/config/i386/i386.c b/gcc/config/i386/i386.c index f6c17dfd405..571f7d7b5ec 100644 --- a/gcc/config/i386/i386.c +++ b/gcc/config/i386/i386.c @@ -80,7 +80,7 @@ along with GCC; see the file COPYING3. 
If not see static rtx legitimize_dllimport_symbol (rtx, bool); static rtx legitimize_pe_coff_extern_decl (rtx, bool); static rtx legitimize_pe_coff_symbol (rtx, bool); -static void ix86_print_operand_address_as (FILE *file, rtx addr, addr_space_t); +static void ix86_print_operand_address_as (FILE *, rtx, addr_space_t, bool); #ifndef CHECK_STACK_LIMIT #define CHECK_STACK_LIMIT (-1) @@ -2175,7 +2175,7 @@ const struct processor_costs *ix86_cost = &pentium_cost; #define m_BONNELL (1<<PROCESSOR_BONNELL) #define m_SILVERMONT (1<<PROCESSOR_SILVERMONT) #define m_KNL (1<<PROCESSOR_KNL) -#define m_SKYLAKE_AVX512 (1<<PROCESSOT_SKYLAKE_AVX512) +#define m_SKYLAKE_AVX512 (1<<PROCESSOR_SKYLAKE_AVX512) #define m_INTEL (1<<PROCESSOR_INTEL) #define m_GEODE (1<<PROCESSOR_GEODE) @@ -17131,13 +17131,6 @@ ix86_print_operand (FILE *file, rtx x, int code) { rtx addr = XEXP (x, 0); - /* Avoid (%rip) for call operands. */ - if (code == 'P' && CONSTANT_ADDRESS_P (x) && !CONST_INT_P (x)) - { - output_addr_const (file, addr); - return; - } - /* No `byte ptr' prefix for call instructions ... */ if (ASSEMBLER_DIALECT == ASM_INTEL && code != 'X' && code != 'P') { @@ -17187,7 +17180,8 @@ ix86_print_operand (FILE *file, rtx x, int code) if (this_is_asm_operands && ! address_operand (addr, VOIDmode)) output_operand_lossage ("invalid constraints for operand"); else - ix86_print_operand_address_as (file, addr, MEM_ADDR_SPACE (x)); + ix86_print_operand_address_as + (file, addr, MEM_ADDR_SPACE (x), code == 'p' || code == 'P'); } else if (CONST_DOUBLE_P (x) && GET_MODE (x) == SFmode) @@ -17272,7 +17266,8 @@ ix86_print_operand_punct_valid_p (unsigned char code) /* Print a memory operand whose address is ADDR. 
*/ static void -ix86_print_operand_address_as (FILE *file, rtx addr, addr_space_t as) +ix86_print_operand_address_as (FILE *file, rtx addr, + addr_space_t as, bool no_rip) { struct ix86_address parts; rtx base, index, disp; @@ -17346,7 +17341,7 @@ ix86_print_operand_address_as (FILE *file, rtx addr, addr_space_t as) } /* Use one byte shorter RIP relative addressing for 64bit mode. */ - if (TARGET_64BIT && !base && !index) + if (TARGET_64BIT && !base && !index && !no_rip) { rtx symbol = disp; @@ -17360,10 +17355,10 @@ ix86_print_operand_address_as (FILE *file, rtx addr, addr_space_t as) && SYMBOL_REF_TLS_MODEL (symbol) == 0)) base = pc_rtx; } + if (!base && !index) { /* Displacement only requires special attention. */ - if (CONST_INT_P (disp)) { if (ASSEMBLER_DIALECT == ASM_INTEL && parts.seg == ADDR_SPACE_GENERIC) @@ -17505,7 +17500,7 @@ ix86_print_operand_address_as (FILE *file, rtx addr, addr_space_t as) static void ix86_print_operand_address (FILE *file, machine_mode /*mode*/, rtx addr) { - ix86_print_operand_address_as (file, addr, ADDR_SPACE_GENERIC); + ix86_print_operand_address_as (file, addr, ADDR_SPACE_GENERIC, false); } /* Implementation of TARGET_ASM_OUTPUT_ADDR_CONST_EXTRA. */ @@ -22633,7 +22628,7 @@ ix86_expand_sse_cmp (rtx dest, enum rtx_code code, rtx cmp_op0, rtx cmp_op1, /* Expand DEST = CMP ? OP_TRUE : OP_FALSE into a sequence of logical operations. This is used for both scalar and vector conditional moves. 
*/ -static void +void ix86_expand_sse_movcc (rtx dest, rtx cmp, rtx op_true, rtx op_false) { machine_mode mode = GET_MODE (dest); @@ -36113,7 +36108,11 @@ get_builtin_code_for_version (tree decl, tree *predicate_list) priority = P_PROC_AVX; break; case PROCESSOR_HASWELL: - if (new_target->x_ix86_isa_flags & OPTION_MASK_ISA_ADX) + if (new_target->x_ix86_isa_flags & OPTION_MASK_ISA_AVX512VL) + arg_str = "skylake-avx512"; + else if (new_target->x_ix86_isa_flags & OPTION_MASK_ISA_XSAVES) + arg_str = "skylake"; + else if (new_target->x_ix86_isa_flags & OPTION_MASK_ISA_ADX) arg_str = "broadwell"; else arg_str = "haswell"; diff --git a/gcc/config/i386/i386.md b/gcc/config/i386/i386.md index 52dd03717b4..34a6d3f4d82 100644 --- a/gcc/config/i386/i386.md +++ b/gcc/config/i386/i386.md @@ -2601,7 +2601,7 @@ switch (which_alternative) { case 0: - return "movabs{<imodesuffix>}\t{%1, %0|%0, %1}"; + return "movabs{<imodesuffix>}\t{%1, %P0|<iptrsize> PTR [%P0], %1}"; case 1: return "mov{<imodesuffix>}\t{%1, %0|%0, %1}"; default: @@ -2625,7 +2625,7 @@ switch (which_alternative) { case 0: - return "movabs{<imodesuffix>}\t{%1, %0|%0, %1}"; + return "movabs{<imodesuffix>}\t{%P1, %0|%0, <iptrsize> PTR [%P1]}"; case 1: return "mov{<imodesuffix>}\t{%1, %0|%0, %1}"; default: diff --git a/gcc/config/i386/sse.md b/gcc/config/i386/sse.md index f804255aedf..aad6a0ddd98 100644 --- a/gcc/config/i386/sse.md +++ b/gcc/config/i386/sse.md @@ -799,6 +799,14 @@ [(V32QI "t") (V16HI "t") (V8SI "t") (V4DI "t") (V8SF "t") (V4DF "t") (V64QI "g") (V32HI "g") (V16SI "g") (V8DI "g") (V16SF "g") (V8DF "g")]) +;; Half mask mode for unpacks +(define_mode_attr HALFMASKMODE + [(DI "SI") (SI "HI")]) + +;; Double mask mode for packs +(define_mode_attr DOUBLEMASKMODE + [(HI "SI") (SI "DI")]) + ;; Include define_subst patterns for instructions with mask (include "subst.md") @@ -3015,6 +3023,87 @@ DONE; }) +(define_expand "vcond_mask_<mode><avx512fmaskmodelower>" + [(set (match_operand:V48_AVX512VL 0 
"register_operand") + (vec_merge:V48_AVX512VL + (match_operand:V48_AVX512VL 1 "nonimmediate_operand") + (match_operand:V48_AVX512VL 2 "vector_move_operand") + (match_operand:<avx512fmaskmode> 3 "register_operand")))] + "TARGET_AVX512F") + +(define_expand "vcond_mask_<mode><avx512fmaskmodelower>" + [(set (match_operand:VI12_AVX512VL 0 "register_operand") + (vec_merge:VI12_AVX512VL + (match_operand:VI12_AVX512VL 1 "nonimmediate_operand") + (match_operand:VI12_AVX512VL 2 "vector_move_operand") + (match_operand:<avx512fmaskmode> 3 "register_operand")))] + "TARGET_AVX512BW") + +(define_expand "vcond_mask_<mode><sseintvecmodelower>" + [(set (match_operand:VI_256 0 "register_operand") + (vec_merge:VI_256 + (match_operand:VI_256 1 "nonimmediate_operand") + (match_operand:VI_256 2 "vector_move_operand") + (match_operand:<sseintvecmode> 3 "register_operand")))] + "TARGET_AVX2" +{ + ix86_expand_sse_movcc (operands[0], operands[3], + operands[1], operands[2]); + DONE; +}) + +(define_expand "vcond_mask_<mode><sseintvecmodelower>" + [(set (match_operand:VI124_128 0 "register_operand") + (vec_merge:VI124_128 + (match_operand:VI124_128 1 "nonimmediate_operand") + (match_operand:VI124_128 2 "vector_move_operand") + (match_operand:<sseintvecmode> 3 "register_operand")))] + "TARGET_SSE2" +{ + ix86_expand_sse_movcc (operands[0], operands[3], + operands[1], operands[2]); + DONE; +}) + +(define_expand "vcond_mask_v2div2di" + [(set (match_operand:V2DI 0 "register_operand") + (vec_merge:V2DI + (match_operand:V2DI 1 "nonimmediate_operand") + (match_operand:V2DI 2 "vector_move_operand") + (match_operand:V2DI 3 "register_operand")))] + "TARGET_SSE4_2" +{ + ix86_expand_sse_movcc (operands[0], operands[3], + operands[1], operands[2]); + DONE; +}) + +(define_expand "vcond_mask_<mode><sseintvecmodelower>" + [(set (match_operand:VF_256 0 "register_operand") + (vec_merge:VF_256 + (match_operand:VF_256 1 "nonimmediate_operand") + (match_operand:VF_256 2 "vector_move_operand") + 
(match_operand:<sseintvecmode> 3 "register_operand")))] + "TARGET_AVX" +{ + ix86_expand_sse_movcc (operands[0], operands[3], + operands[1], operands[2]); + DONE; +}) + +(define_expand "vcond_mask_<mode><sseintvecmodelower>" + [(set (match_operand:VF_128 0 "register_operand") + (vec_merge:VF_128 + (match_operand:VF_128 1 "nonimmediate_operand") + (match_operand:VF_128 2 "vector_move_operand") + (match_operand:<sseintvecmode> 3 "register_operand")))] + "TARGET_SSE" +{ + ix86_expand_sse_movcc (operands[0], operands[3], + operands[1], operands[2]); + DONE; +}) + ;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;; ;; ;; Parallel floating point logical operations @@ -11497,6 +11586,23 @@ DONE; }) +(define_expand "vec_pack_trunc_qi" + [(set (match_operand:HI 0 ("register_operand")) + (ior:HI (ashift:HI (zero_extend:HI (match_operand:QI 1 ("register_operand"))) + (const_int 8)) + (zero_extend:HI (match_operand:QI 2 ("register_operand")))))] + "TARGET_AVX512F") + +(define_expand "vec_pack_trunc_<mode>" + [(set (match_operand:<DOUBLEMASKMODE> 0 ("register_operand")) + (ior:<DOUBLEMASKMODE> (ashift:<DOUBLEMASKMODE> (zero_extend:<DOUBLEMASKMODE> (match_operand:SWI24 1 ("register_operand"))) + (match_dup 3)) + (zero_extend:<DOUBLEMASKMODE> (match_operand:SWI24 2 ("register_operand")))))] + "TARGET_AVX512BW" +{ + operands[3] = GEN_INT (GET_MODE_BITSIZE (<MODE>mode)); +}) + (define_insn "<sse2_avx2>_packsswb<mask_name>" [(set (match_operand:VI1_AVX512 0 "register_operand" "=x,x") (vec_concat:VI1_AVX512 @@ -13393,12 +13499,42 @@ "TARGET_SSE2" "ix86_expand_sse_unpack (operands[0], operands[1], true, false); DONE;") +(define_expand "vec_unpacks_lo_hi" + [(set (match_operand:QI 0 "register_operand") + (subreg:QI (match_operand:HI 1 "register_operand") 0))] + "TARGET_AVX512DQ") + +(define_expand "vec_unpacks_lo_si" + [(set (match_operand:HI 0 "register_operand") + (subreg:HI (match_operand:SI 1 "register_operand") 0))] + "TARGET_AVX512F") + +(define_expand 
"vec_unpacks_lo_di" + [(set (match_operand:SI 0 "register_operand") + (subreg:SI (match_operand:DI 1 "register_operand") 0))] + "TARGET_AVX512BW") + (define_expand "vec_unpacku_hi_<mode>" [(match_operand:<sseunpackmode> 0 "register_operand") (match_operand:VI124_AVX2_24_AVX512F_1_AVX512BW 1 "register_operand")] "TARGET_SSE2" "ix86_expand_sse_unpack (operands[0], operands[1], true, true); DONE;") +(define_expand "vec_unpacks_hi_hi" + [(set (subreg:HI (match_operand:QI 0 "register_operand") 0) + (lshiftrt:HI (match_operand:HI 1 "register_operand") + (const_int 8)))] + "TARGET_AVX512F") + +(define_expand "vec_unpacks_hi_<mode>" + [(set (subreg:SWI48x (match_operand:<HALFMASKMODE> 0 "register_operand") 0) + (lshiftrt:SWI48x (match_operand:SWI48x 1 "register_operand") + (match_dup 2)))] + "TARGET_AVX512BW" +{ + operands[2] = GEN_INT (GET_MODE_BITSIZE (<HALFMASKMODE>mode)); +}) + ;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;; ;; ;; Miscellaneous diff --git a/gcc/config/mips/mips.c b/gcc/config/mips/mips.c index 9880b236d6d..d3b7730486d 100644 --- a/gcc/config/mips/mips.c +++ b/gcc/config/mips/mips.c @@ -16824,6 +16824,34 @@ mips_avoid_hazard (rtx_insn *after, rtx_insn *insn, int *hilo_delay, } } +/* A SEQUENCE is breakable iff the branch inside it has a compact form + and the target has compact branches. */ + +static bool +mips_breakable_sequence_p (rtx_insn *insn) +{ + return (insn && GET_CODE (PATTERN (insn)) == SEQUENCE + && TARGET_CB_MAYBE + && get_attr_compact_form (SEQ_BEGIN (insn)) != COMPACT_FORM_NEVER); +} + +/* Remove a SEQUENCE and replace it with the delay slot instruction + followed by the branch and return the instruction in the delay slot. + Return the first of the two new instructions. + Subroutine of mips_reorg_process_insns. 
*/ + +static rtx_insn * +mips_break_sequence (rtx_insn *insn) +{ + rtx_insn *before = PREV_INSN (insn); + rtx_insn *branch = SEQ_BEGIN (insn); + rtx_insn *ds = SEQ_END (insn); + remove_insn (insn); + add_insn_after (ds, before, NULL); + add_insn_after (branch, ds, NULL); + return ds; +} + /* Go through the instruction stream and insert nops where necessary. Also delete any high-part relocations whose partnering low parts are now all dead. See if the whole function can then be put into @@ -16916,6 +16944,68 @@ mips_reorg_process_insns (void) { if (GET_CODE (PATTERN (insn)) == SEQUENCE) { + rtx_insn *next_active = next_active_insn (insn); + /* Undo delay slots to avoid bubbles if the next instruction can + be placed in a forbidden slot or the cost of adding an + explicit NOP in a forbidden slot is OK and if the SEQUENCE is + safely breakable. */ + if (TARGET_CB_MAYBE + && mips_breakable_sequence_p (insn) + && INSN_P (SEQ_BEGIN (insn)) + && INSN_P (SEQ_END (insn)) + && ((next_active + && INSN_P (next_active) + && GET_CODE (PATTERN (next_active)) != SEQUENCE + && get_attr_can_delay (next_active) == CAN_DELAY_YES) + || !optimize_size)) + { + /* To hide a potential pipeline bubble, if we scan backwards + from the current SEQUENCE and find that there is a load + of a value that is used in the CTI and there are no + dependencies between the CTI and instruction in the delay + slot, break the sequence so the load delay is hidden. */ + HARD_REG_SET uses; + CLEAR_HARD_REG_SET (uses); + note_uses (&PATTERN (SEQ_BEGIN (insn)), record_hard_reg_uses, + &uses); + HARD_REG_SET delay_sets; + CLEAR_HARD_REG_SET (delay_sets); + note_stores (PATTERN (SEQ_END (insn)), record_hard_reg_sets, + &delay_sets); + + rtx_insn *prev = prev_active_insn (insn); + if (prev + && GET_CODE (PATTERN (prev)) == SET + && MEM_P (SET_SRC (PATTERN (prev)))) + { + HARD_REG_SET sets; + CLEAR_HARD_REG_SET (sets); + note_stores (PATTERN (prev), record_hard_reg_sets, + &sets); + + /* Re-order if safe. 
*/ + if (!hard_reg_set_intersect_p (delay_sets, uses) + && hard_reg_set_intersect_p (uses, sets)) + { + next_insn = mips_break_sequence (insn); + /* Need to process the hazards of the newly + introduced instructions. */ + continue; + } + } + + /* If we find an orphaned high-part relocation in a delay + slot then we can convert to a compact branch and get + the orphaned high part deleted. */ + if (mips_orphaned_high_part_p (&htab, SEQ_END (insn))) + { + next_insn = mips_break_sequence (insn); + /* Need to process the hazards of the newly + introduced instructions. */ + continue; + } + } + /* If we find an orphaned high-part relocation in a delay slot, it's easier to turn that instruction into a NOP than to delete it. The delay slot will be a NOP either way. */ @@ -16950,6 +17040,33 @@ mips_reorg_process_insns (void) { mips_avoid_hazard (last_insn, insn, &hilo_delay, &delayed_reg, lo_reg, &fs_delay); + /* When a compact branch introduces a forbidden slot hazard + and the next useful instruction is a SEQUENCE of a jump + and a non-nop instruction in the delay slot, remove the + sequence and replace it with the delay slot instruction + then the jump to clear the forbidden slot hazard. */ + + if (fs_delay) + { + /* Search onwards from the current position looking for + a SEQUENCE. We are looking for pipeline hazards here + and do not need to worry about labels or barriers as + the optimization only undoes delay slot filling which + only affects the order of the branch and its delay + slot. */ + rtx_insn *next = next_active_insn (insn); + if (next + && USEFUL_INSN_P (next) + && GET_CODE (PATTERN (next)) == SEQUENCE + && mips_breakable_sequence_p (next)) + { + last_insn = insn; + next_insn = mips_break_sequence (next); + /* Need to process the hazards of the newly + introduced instructions. 
*/ + continue; + } + } last_insn = insn; } } diff --git a/gcc/config/moxie/moxie.c b/gcc/config/moxie/moxie.c index a45b825ced0..756e2f74e2d 100644 --- a/gcc/config/moxie/moxie.c +++ b/gcc/config/moxie/moxie.c @@ -106,7 +106,7 @@ moxie_operand_lossage (const char *msgid, rtx op) /* The PRINT_OPERAND_ADDRESS worker. */ static void -moxie_print_operand_address (FILE *file, rtx x) +moxie_print_operand_address (FILE *file, machine_mode, rtx x) { switch (GET_CODE (x)) { @@ -183,7 +183,7 @@ moxie_print_operand (FILE *file, rtx x, int code) return; case MEM: - output_address (XEXP (operand, 0)); + output_address (GET_MODE (XEXP (operand, 0)), XEXP (operand, 0)); return; default: diff --git a/gcc/config/nvptx/nvptx.c b/gcc/config/nvptx/nvptx.c index f1ac307b346..d8673018819 100644 --- a/gcc/config/nvptx/nvptx.c +++ b/gcc/config/nvptx/nvptx.c @@ -137,6 +137,9 @@ nvptx_option_override (void) write_symbols = NO_DEBUG; debug_info_level = DINFO_LEVEL_NONE; + if (nvptx_optimize < 0) + nvptx_optimize = optimize > 0; + declared_fndecls_htab = hash_table<tree_hasher>::create_ggc (17); needed_fndecls_htab = hash_table<tree_hasher>::create_ggc (17); declared_libfuncs_htab @@ -2942,6 +2945,69 @@ nvptx_skip_par (unsigned mask, parallel *par) nvptx_single (mask, par->forked_block, pre_tail); } +/* If PAR has a single inner parallel and PAR itself only contains + empty entry and exit blocks, swallow the inner PAR. */ + +static void +nvptx_optimize_inner (parallel *par) +{ + parallel *inner = par->inner; + + /* We mustn't be the outer dummy par. */ + if (!par->mask) + return; + + /* We must have a single inner par. */ + if (!inner || inner->next) + return; + + /* We must only contain 2 blocks ourselves -- the head and tail of + the inner par. */ + if (par->blocks.length () != 2) + return; + + /* We must be disjoint partitioning. As we only have vector and + worker partitioning, this is sufficient to guarantee the pars + have adjacent partitioning. 
*/ + if ((par->mask & inner->mask) & (GOMP_DIM_MASK (GOMP_DIM_MAX) - 1)) + /* This indicates malformed code generation. */ + return; + + /* The outer forked insn should be immediately followed by the inner + fork insn. */ + rtx_insn *forked = par->forked_insn; + rtx_insn *fork = BB_END (par->forked_block); + + if (NEXT_INSN (forked) != fork) + return; + gcc_checking_assert (recog_memoized (fork) == CODE_FOR_nvptx_fork); + + /* The outer joining insn must immediately follow the inner join + insn. */ + rtx_insn *joining = par->joining_insn; + rtx_insn *join = inner->join_insn; + if (NEXT_INSN (join) != joining) + return; + + /* Preconditions met. Swallow the inner par. */ + if (dump_file) + fprintf (dump_file, "Merging loop %x [%d,%d] into %x [%d,%d]\n", + inner->mask, inner->forked_block->index, + inner->join_block->index, + par->mask, par->forked_block->index, par->join_block->index); + + par->mask |= inner->mask & (GOMP_DIM_MASK (GOMP_DIM_MAX) - 1); + + par->blocks.reserve (inner->blocks.length ()); + while (inner->blocks.length ()) + par->blocks.quick_push (inner->blocks.pop ()); + + par->inner = inner->inner; + inner->inner = NULL; + + delete inner; +} + /* Process the parallel PAR and all its contained parallels. We do everything but the neutering. Return mask of partitioned modes used within this parallel. */ @@ -2949,6 +3015,9 @@ nvptx_skip_par (unsigned mask, parallel *par) static unsigned nvptx_process_pars (parallel *par) { + if (nvptx_optimize) + nvptx_optimize_inner (par); + unsigned inner_mask = par->mask; /* Do the inner parallels first. */ diff --git a/gcc/config/nvptx/nvptx.opt b/gcc/config/nvptx/nvptx.opt index 80170465bea..342915d8095 100644 --- a/gcc/config/nvptx/nvptx.opt +++ b/gcc/config/nvptx/nvptx.opt @@ -28,3 +28,7 @@ Generate code for a 64-bit ABI. mmainkernel Target Report RejectNegative Link in code for a __main kernel. 
+ +moptimize +Target Report Var(nvptx_optimize) Init(-1) +Optimize partition neutering diff --git a/gcc/config/rs6000/aix.h b/gcc/config/rs6000/aix.h index dbcfb9579cb..375a13edb27 100644 --- a/gcc/config/rs6000/aix.h +++ b/gcc/config/rs6000/aix.h @@ -101,8 +101,6 @@ { \ builtin_define ("_IBMR2"); \ builtin_define ("_POWER"); \ - builtin_define ("__powerpc__"); \ - builtin_define ("__PPC__"); \ builtin_define ("__unix__"); \ builtin_define ("_AIX"); \ builtin_define ("_AIX32"); \ @@ -112,6 +110,22 @@ builtin_define ("__LONGDOUBLE128"); \ builtin_assert ("system=unix"); \ builtin_assert ("system=aix"); \ + if (TARGET_64BIT) \ + { \ + builtin_define ("__PPC__"); \ + builtin_define ("__PPC64__"); \ + builtin_define ("__powerpc__"); \ + builtin_define ("__powerpc64__"); \ + builtin_assert ("cpu=powerpc64"); \ + builtin_assert ("machine=powerpc64"); \ + } \ + else \ + { \ + builtin_define ("__PPC__"); \ + builtin_define ("__powerpc__"); \ + builtin_assert ("cpu=powerpc"); \ + builtin_assert ("machine=powerpc"); \ + } \ } \ while (0) diff --git a/gcc/config/rs6000/rs6000.c b/gcc/config/rs6000/rs6000.c index ca93609bb6b..7b6aca9e813 100644 --- a/gcc/config/rs6000/rs6000.c +++ b/gcc/config/rs6000/rs6000.c @@ -18150,28 +18150,7 @@ rs6000_secondary_reload_direct_move (enum rs6000_reg_type to_type, } } - if (TARGET_POWERPC64 && size == 16) - { - /* Handle moving 128-bit values from GPRs to VSX point registers on - power8 when running in 64-bit mode using XXPERMDI to glue the two - 64-bit values back together. */ - if (to_type == VSX_REG_TYPE && from_type == GPR_REG_TYPE) - { - cost = 3; /* 2 mtvsrd's, 1 xxpermdi. */ - icode = reg_addr[mode].reload_vsx_gpr; - } - - /* Handle moving 128-bit values from VSX point registers to GPRs on - power8 when running in 64-bit mode using XXPERMDI to get access to the - bottom 64-bit value. */ - else if (to_type == GPR_REG_TYPE && from_type == VSX_REG_TYPE) - { - cost = 3; /* 2 mfvsrd's, 1 xxpermdi. 
*/
-	  icode = reg_addr[mode].reload_gpr_vsx;
-	}
-    }
-
-  else if (!TARGET_POWERPC64 && size == 8)
+  else if (size == 8)
     {
       /* Handle moving 64-bit values from GPRs to floating point registers on
	 power8 when running in 32-bit mode using FMRGOW to glue the two 32-bit
diff --git a/gcc/configure b/gcc/configure
index 0cd85fb8646..4b4e72457a7 100755
--- a/gcc/configure
+++ b/gcc/configure
@@ -28329,7 +28329,7 @@ else
   enable_default_ssp=no
 fi
-if test x$enable_default_ssp == xyes ; then
+if test x$enable_default_ssp = xyes ; then
 $as_echo "#define ENABLE_DEFAULT_SSP 1" >>confdefs.h
@@ -29181,7 +29181,7 @@ else
   enable_default_pie=no
 fi
-if test x$enable_default_pie == xyes ; then
+if test x$enable_default_pie = xyes ; then
 $as_echo "#define ENABLE_DEFAULT_PIE 1" >>confdefs.h
diff --git a/gcc/configure.ac b/gcc/configure.ac
index ed2e665b40c..42d8f136e9c 100644
--- a/gcc/configure.ac
+++ b/gcc/configure.ac
@@ -5463,7 +5463,7 @@ else
   enable_default_ssp=no
 fi],
 enable_default_ssp=no)
-if test x$enable_default_ssp == xyes ; then
+if test x$enable_default_ssp = xyes ; then
   AC_DEFINE(ENABLE_DEFAULT_SSP, 1,
     [Define if your target supports default stack protector and it is enabled.])
 fi
@@ -6028,7 +6028,7 @@ AC_ARG_ENABLE(default-pie,
 [enable Position Independent Executable as default])],
 enable_default_pie=$enableval,
 enable_default_pie=no)
-if test x$enable_default_pie == xyes ; then
+if test x$enable_default_pie = xyes ; then
   AC_DEFINE(ENABLE_DEFAULT_PIE, 1,
     [Define if your target supports default PIE and it is enabled.])
 fi
diff --git a/gcc/doc/invoke.texi b/gcc/doc/invoke.texi
index 43d58a3475d..1c2fa5826dc 100644
--- a/gcc/doc/invoke.texi
+++ b/gcc/doc/invoke.texi
@@ -549,7 +549,9 @@ Objective-C and Objective-C++ Dialects}.
-mexpand-adddi -mindexed-loads -mlra -mlra-priority-none @gol -mlra-priority-compact mlra-priority-noncompact -mno-millicode @gol -mmixed-code -mq-class -mRcq -mRcw -msize-level=@var{level} @gol --mtune=@var{cpu} -mmultcost=@var{num} -munalign-prob-threshold=@var{probability}} +-mtune=@var{cpu} -mmultcost=@var{num} @gol +-munalign-prob-threshold=@var{probability} -mmpy-option=@var{multo} @gol +-mdiv-rem -mcode-density} @emph{ARM Options} @gccoptlist{-mapcs-frame -mno-apcs-frame @gol @@ -873,7 +875,7 @@ Objective-C and Objective-C++ Dialects}. -march=@var{arch} -mbmx -mno-bmx -mcdx -mno-cdx} @emph{Nvidia PTX Options} -@gccoptlist{-m32 -m64 -mmainkernel} +@gccoptlist{-m32 -m64 -mmainkernel -moptimize} @emph{PDP-11 Options} @gccoptlist{-mfpu -msoft-float -mac0 -mno-ac0 -m40 -m45 -m10 @gol @@ -12846,7 +12848,7 @@ is being compiled: @item -mbarrel-shifter @opindex mbarrel-shifter Generate instructions supported by barrel shifter. This is the default -unless @option{-mcpu=ARC601} is in effect. +unless @option{-mcpu=ARC601} or @samp{-mcpu=ARCEM} is in effect. @item -mcpu=@var{cpu} @opindex mcpu @@ -12859,17 +12861,28 @@ values for @var{cpu} are @opindex mA6 @opindex mARC600 @item ARC600 +@item arc600 Compile for ARC600. Aliases: @option{-mA6}, @option{-mARC600}. @item ARC601 +@item arc601 @opindex mARC601 Compile for ARC601. Alias: @option{-mARC601}. @item ARC700 +@item arc700 @opindex mA7 @opindex mARC700 Compile for ARC700. Aliases: @option{-mA7}, @option{-mARC700}. This is the default when configured with @option{--with-cpu=arc700}@. + +@item ARCEM +@item arcem +Compile for ARC EM. + +@item ARCHS +@item archs +Compile for ARC HS. @end table @item -mdpfp @@ -12940,6 +12953,62 @@ can overridden by FPX options; @samp{mspfp}, @samp{mspfp-compact}, or @opindex mswap Generate swap instructions. +@item -mdiv-rem +@opindex mdiv-rem +Enable DIV/REM instructions for ARCv2 cores. 
+ +@item -mcode-density +@opindex mcode-density +Enable code density instructions for ARC EM, default on for ARC HS. + +@item -mmpy-option=@var{multo} +@opindex mmpy-option +Compile ARCv2 code with a multiplier design option. @samp{wlh1} is +the default value. The recognized values for @var{multo} are: + +@table @samp +@item 0 +No multiplier available. + +@item 1 +@opindex w +The multiply option is set to w: 16x16 multiplier, fully pipelined. +The following instructions are enabled: MPYW and MPYUW. + +@item 2 +@opindex wlh1 +The multiply option is set to wlh1: 32x32 multiplier, fully +pipelined (1 stage). The following instructions are additionally +enabled: MPY, MPYU, MPYM, MPYMU, and MPY_S. + +@item 3 +@opindex wlh2 +The multiply option is set to wlh2: 32x32 multiplier, fully pipelined +(2 stages). The following instructions are additionally enabled: MPY, +MPYU, MPYM, MPYMU, and MPY_S. + +@item 4 +@opindex wlh3 +The multiply option is set to wlh3: Two 16x16 multipliers, blocking, +sequential. The following instructions are additionally enabled: MPY, +MPYU, MPYM, MPYMU, and MPY_S. + +@item 5 +@opindex wlh4 +The multiply option is set to wlh4: One 16x16 multiplier, blocking, +sequential. The following instructions are additionally enabled: MPY, +MPYU, MPYM, MPYMU, and MPY_S. + +@item 6 +@opindex wlh5 +The multiply option is set to wlh5: One 32x4 multiplier, blocking, +sequential. The following instructions are additionally enabled: MPY, +MPYU, MPYM, MPYMU, and MPY_S. + +@end table + +This option is only available for ARCv2 cores@. + @end table The following options are passed through to the assembler, and also @@ -18965,6 +19034,11 @@ Generate code for 32-bit or 64-bit ABI. Link in code for a __main kernel. This is for stand-alone instead of offloading execution. + +@item -moptimize +@opindex moptimize +Apply partitioned execution optimizations. This is the default when any +level of optimization is selected.
+ @end table @node PDP-11 Options diff --git a/gcc/fold-const.c b/gcc/fold-const.c index 8b437ab8f26..eb76117ca1a 100644 --- a/gcc/fold-const.c +++ b/gcc/fold-const.c @@ -11886,16 +11886,16 @@ get_array_ctor_element_at_index (tree ctor, offset_int access_index) offset_int low_bound = 0; if (TREE_CODE (TREE_TYPE (ctor)) == ARRAY_TYPE) - { - tree domain_type = TYPE_DOMAIN (TREE_TYPE (ctor)); - if (domain_type && TYPE_MIN_VALUE (domain_type)) { - /* Static constructors for variably sized objects makes no sense. */ - gcc_assert (TREE_CODE (TYPE_MIN_VALUE (domain_type)) == INTEGER_CST); - index_type = TREE_TYPE (TYPE_MIN_VALUE (domain_type)); - low_bound = wi::to_offset (TYPE_MIN_VALUE (domain_type)); + tree domain_type = TYPE_DOMAIN (TREE_TYPE (ctor)); + if (domain_type && TYPE_MIN_VALUE (domain_type)) + { + /* Static constructors for variably sized objects makes no sense. */ + gcc_assert (TREE_CODE (TYPE_MIN_VALUE (domain_type)) == INTEGER_CST); + index_type = TREE_TYPE (TYPE_MIN_VALUE (domain_type)); + low_bound = wi::to_offset (TYPE_MIN_VALUE (domain_type)); + } } - } if (index_type) access_index = wi::ext (access_index, TYPE_PRECISION (index_type), @@ -11911,29 +11911,29 @@ get_array_ctor_element_at_index (tree ctor, offset_int access_index) tree cfield, cval; FOR_EACH_CONSTRUCTOR_ELT (CONSTRUCTOR_ELTS (ctor), cnt, cfield, cval) - { - /* Array constructor might explicitely set index, or specify range - * or leave index NULL meaning that it is next index after previous - * one. */ - if (cfield) { - if (TREE_CODE (cfield) == INTEGER_CST) - max_index = index = wi::to_offset (cfield); + /* Array constructor might explicitly set index, or specify a range, + or leave index NULL meaning that it is next index after previous + one. 
*/ + if (cfield) + { + if (TREE_CODE (cfield) == INTEGER_CST) + max_index = index = wi::to_offset (cfield); + else + { + gcc_assert (TREE_CODE (cfield) == RANGE_EXPR); + index = wi::to_offset (TREE_OPERAND (cfield, 0)); + max_index = wi::to_offset (TREE_OPERAND (cfield, 1)); + } + } else - { - gcc_assert (TREE_CODE (cfield) == RANGE_EXPR); - index = wi::to_offset (TREE_OPERAND (cfield, 0)); - max_index = wi::to_offset (TREE_OPERAND (cfield, 1)); - } - } - else - { - index += 1; - if (index_type) - index = wi::ext (index, TYPE_PRECISION (index_type), - TYPE_SIGN (index_type)); - max_index = index; - } + { + index += 1; + if (index_type) + index = wi::ext (index, TYPE_PRECISION (index_type), + TYPE_SIGN (index_type)); + max_index = index; + } /* Do we have match? */ if (wi::cmpu (access_index, index) >= 0 diff --git a/gcc/fortran/ChangeLog b/gcc/fortran/ChangeLog index cd4c94e6764..33c541a38d6 100644 --- a/gcc/fortran/ChangeLog +++ b/gcc/fortran/ChangeLog @@ -1,3 +1,8 @@ +2015-11-11 Dominique d'Humieres <dominiq@lps.ens.fr> + + PR fortran/67826 + * openmp.c (gfc_omp_udr_find): Fix typo. + 2015-11-08 Steven G.
Kargl <kargl@gcc.gnu.org> PR fortran/68053 diff --git a/gcc/fortran/openmp.c b/gcc/fortran/openmp.c index a7c7a1927e3..4af139a2a17 100644 --- a/gcc/fortran/openmp.c +++ b/gcc/fortran/openmp.c @@ -1820,7 +1820,7 @@ gfc_omp_udr_find (gfc_symtree *st, gfc_typespec *ts) for (omp_udr = st->n.omp_udr; omp_udr; omp_udr = omp_udr->next) if (omp_udr->ts.type == ts->type || ((omp_udr->ts.type == BT_DERIVED || omp_udr->ts.type == BT_CLASS) - && (ts->type == BT_DERIVED && ts->type == BT_CLASS))) + && (ts->type == BT_DERIVED || ts->type == BT_CLASS))) { if (omp_udr->ts.type == BT_DERIVED || omp_udr->ts.type == BT_CLASS) { diff --git a/gcc/gimple-ssa-strength-reduction.c b/gcc/gimple-ssa-strength-reduction.c index ce32ad33e94..b8078230f34 100644 --- a/gcc/gimple-ssa-strength-reduction.c +++ b/gcc/gimple-ssa-strength-reduction.c @@ -2226,12 +2226,11 @@ create_phi_basis (slsr_cand_t c, gimple *from_phi, tree basis_name, int i; tree name, phi_arg; gphi *phi; - vec<tree> phi_args; slsr_cand_t basis = lookup_cand (c->basis); int nargs = gimple_phi_num_args (from_phi); basic_block phi_bb = gimple_bb (from_phi); slsr_cand_t phi_cand = base_cand_from_table (gimple_phi_result (from_phi)); - phi_args.create (nargs); + auto_vec<tree> phi_args (nargs); /* Process each argument of the existing phi that represents conditionally-executed add candidates. */ diff --git a/gcc/go/gofrontend/MERGE b/gcc/go/gofrontend/MERGE index f325bb33ecb..d23a6cb5f58 100644 --- a/gcc/go/gofrontend/MERGE +++ b/gcc/go/gofrontend/MERGE @@ -1,4 +1,4 @@ -012ab5cb2ef1c26e8023ce90d3a2bba174da7b30 +e3aef41ce0c5be81e2589e60d9cb0db1516e9e2d The first line of this file holds the git revision number of the last merge done from the gofrontend repository. diff --git a/gcc/optabs.c b/gcc/optabs.c index f9fbfde967d..4ffbc0cdefd 100644 --- a/gcc/optabs.c +++ b/gcc/optabs.c @@ -1047,7 +1047,8 @@ expand_binop_directly (machine_mode mode, optab binoptab, /* The mode of the result is different then the mode of the arguments. 
*/ tmp_mode = insn_data[(int) icode].operand[0].mode; - if (GET_MODE_NUNITS (tmp_mode) != 2 * GET_MODE_NUNITS (mode)) + if (VECTOR_MODE_P (mode) + && GET_MODE_NUNITS (tmp_mode) != 2 * GET_MODE_NUNITS (mode)) { delete_insns_since (last); return NULL_RTX; diff --git a/gcc/passes.c b/gcc/passes.c index d3d6e1d76b5..8a283ae8a7a 100644 --- a/gcc/passes.c +++ b/gcc/passes.c @@ -2058,6 +2058,18 @@ verify_curr_properties (function *fn, void *data) gcc_assert ((fn->curr_properties & props) == props); } +/* Release dump file name if set. */ + +static void +release_dump_file_name (void) +{ + if (dump_file_name) + { + free (CONST_CAST (char *, dump_file_name)); + dump_file_name = NULL; + } +} + /* Initialize pass dump file. */ /* This is non-static so that the plugins can use it. */ @@ -2071,6 +2083,7 @@ pass_init_dump_file (opt_pass *pass) gcc::dump_manager *dumps = g->get_dumps (); bool initializing_dump = !dumps->dump_initialized_p (pass->static_pass_number); + release_dump_file_name (); dump_file_name = dumps->get_dump_file_name (pass->static_pass_number); dumps->dump_start (pass->static_pass_number, &dump_flags); if (dump_file && current_function_decl) @@ -2098,11 +2111,7 @@ pass_fini_dump_file (opt_pass *pass) timevar_push (TV_DUMP); /* Flush and close dump file. 
*/ - if (dump_file_name) - { - free (CONST_CAST (char *, dump_file_name)); - dump_file_name = NULL; - } + release_dump_file_name (); g->get_dumps ()->dump_finish (pass->static_pass_number); timevar_pop (TV_DUMP); diff --git a/gcc/regrename.c b/gcc/regrename.c index d727dd9095b..d41410a9348 100644 --- a/gcc/regrename.c +++ b/gcc/regrename.c @@ -1068,7 +1068,9 @@ scan_rtx_reg (rtx_insn *insn, rtx *loc, enum reg_class cl, enum scan_actions act && GET_CODE (pat) == SET && GET_CODE (SET_DEST (pat)) == REG && GET_CODE (SET_SRC (pat)) == REG - && terminated_this_insn) + && terminated_this_insn + && terminated_this_insn->nregs + == REG_NREGS (recog_data.operand[1])) { gcc_assert (terminated_this_insn->regno == REGNO (recog_data.operand[1])); @@ -1593,6 +1595,7 @@ build_def_use (basic_block bb) enum rtx_code set_code = SET; enum rtx_code clobber_code = CLOBBER; insn_rr_info *insn_info = NULL; + terminated_this_insn = NULL; /* Process the insn, determining its effect on the def-use chains and live hard registers. We perform the following @@ -1749,8 +1752,6 @@ build_def_use (basic_block bb) scan_rtx (insn, &XEXP (note, 0), ALL_REGS, mark_read, OP_INOUT); - terminated_this_insn = NULL; - /* Step 4: Close chains for registers that die here, unless the register is mentioned in a REG_UNUSED note. In that case we keep the chain open until step #7 below to ensure diff --git a/gcc/testsuite/ChangeLog b/gcc/testsuite/ChangeLog index 1ac009f2613..4637d5fc6a8 100644 --- a/gcc/testsuite/ChangeLog +++ b/gcc/testsuite/ChangeLog @@ -1,3 +1,25 @@ +2015-11-11 Simon Dardis <simon.dardis@imgtec.com> + + * gcc.target/mips/split-ds-sequence.c: New test. + +2015-11-11 Julia Koval <julia.koval@intel.com> + + * g++.dg/ext/mv16.C: New functions. + +2015-11-11 Richard Biener <rguenth@gcc.gnu.org> + Jiong Wang <jiong.wang@arm.com> + + * gcc.dg/tree-ssa/pr68234.c: New testcase. + +2015-11-10 Nathan Sidwell <nathan@codesourcery.com> + + * gcc.dg/goacc/nvptx-opt-1.c: New test. 
+ +2015-11-10 Ilya Enkovich <enkovich.gnu@gmail.com> + + * gcc.target/i386/mask-pack.c: New test. + * gcc.target/i386/mask-unpack.c: New test. + 2015-11-10 Ilya Enkovich <enkovich.gnu@gmail.com> * gcc.target/i386/avx2-vec-mask-bit-not.c: New test. diff --git a/gcc/testsuite/g++.dg/ext/mv16.C b/gcc/testsuite/g++.dg/ext/mv16.C index 8992bfc6fc1..a3a0fe804fd 100644 --- a/gcc/testsuite/g++.dg/ext/mv16.C +++ b/gcc/testsuite/g++.dg/ext/mv16.C @@ -44,6 +44,18 @@ foo () return 12; } +int __attribute__ ((target("arch=broadwell"))) foo () { + return 13; +} + +int __attribute__ ((target("arch=skylake"))) foo () { + return 14; +} + +int __attribute__ ((target("arch=skylake-avx512"))) foo () { + return 15; +} + int main () { int val = foo (); @@ -58,6 +70,12 @@ int main () assert (val == 9); else if (__builtin_cpu_is ("haswell")) assert (val == 12); + else if (__builtin_cpu_is ("broadwell")) + assert (val == 13); + else if (__builtin_cpu_is ("skylake")) + assert (val == 14); + else if (__builtin_cpu_is ("skylake-avx512")) + assert (val == 15); else assert (val == 0); diff --git a/gcc/testsuite/gcc.dg/tree-ssa/pr68234.c b/gcc/testsuite/gcc.dg/tree-ssa/pr68234.c new file mode 100644 index 00000000000..e7c2a95aa4c --- /dev/null +++ b/gcc/testsuite/gcc.dg/tree-ssa/pr68234.c @@ -0,0 +1,24 @@ +/* { dg-do compile } */ +/* { dg-options "-O2 -fdump-tree-vrp2" } */ + +extern int nc; +void ff (unsigned long long); + +void +f (void) +{ + unsigned char resp[1024]; + int c; + int bl = 0; + unsigned long long *dwords = (unsigned long long *) (resp + 5); + for (c = 0; c < nc; c++) + { + /* PR middle-end/68234, this signed division should be optimized into + a right shift, as the vrp pass should deduce that the range of 'bl' + falls into positive numbers.
*/ + ff (dwords[bl / 64]); + bl++; + } +} + +/* { dg-final { scan-tree-dump ">> 6" "vrp2" } } */ diff --git a/gcc/testsuite/gcc.target/i386/mask-pack.c b/gcc/testsuite/gcc.target/i386/mask-pack.c new file mode 100644 index 00000000000..0b564ef4284 --- /dev/null +++ b/gcc/testsuite/gcc.target/i386/mask-pack.c @@ -0,0 +1,100 @@ +/* { dg-do compile } */ +/* { dg-options "-mavx512bw -O3 -fopenmp-simd -fdump-tree-vect-details" } */ +/* { dg-final { scan-tree-dump-times "vectorized 1 loops" 10 "vect" } } */ +/* { dg-final { scan-assembler-not "maskmov" } } */ + +#define LENGTH 1000 + +long l1[LENGTH], l2[LENGTH]; +int i1[LENGTH], i2[LENGTH]; +short s1[LENGTH], s2[LENGTH]; +char c1[LENGTH], c2[LENGTH]; +double d1[LENGTH], d2[LENGTH]; + +int test1 (int n) +{ + int i; + #pragma omp simd safelen(16) + for (i = 0; i < LENGTH; i++) + if (l1[i] > l2[i]) + i1[i] = 1; +} + +int test2 (int n) +{ + int i; + #pragma omp simd safelen(16) + for (i = 0; i < LENGTH; i++) + if (i1[i] > i2[i]) + s1[i] = 1; +} + +int test3 (int n) +{ + int i; + #pragma omp simd safelen(16) + for (i = 0; i < LENGTH; i++) + if (s1[i] > s2[i]) + c1[i] = 1; +} + +int test4 (int n) +{ + int i; + #pragma omp simd safelen(16) + for (i = 0; i < LENGTH; i++) + if (d1[i] > d2[i]) + c1[i] = 1; +} + +int test5 (int n) +{ + int i; + #pragma omp simd safelen(16) + for (i = 0; i < LENGTH; i++) + i1[i] = l1[i] > l2[i] ? 3 : 4; +} + +int test6 (int n) +{ + int i; + #pragma omp simd safelen(16) + for (i = 0; i < LENGTH; i++) + s1[i] = i1[i] > i2[i] ? 3 : 4; +} + +int test7 (int n) +{ + int i; + #pragma omp simd safelen(16) + for (i = 0; i < LENGTH; i++) + c1[i] = s1[i] > s2[i] ? 3 : 4; +} + +int test8 (int n) +{ + int i; + #pragma omp simd safelen(16) + for (i = 0; i < LENGTH; i++) + c1[i] = d1[i] > d2[i] ? 
3 : 4; +} + +int test9 (int n) +{ + int i; + #pragma omp simd safelen(16) + for (i = 0; i < LENGTH; i++) + if (l1[i] > l2[i] && i1[i] < i2[i]) + c1[i] = 1; +} + +int test10 (int n) +{ + int i; + #pragma omp simd safelen(16) + for (i = 0; i < LENGTH; i++) + if (l1[i] > l2[i] && i1[i] < i2[i]) + c1[i] = 1; + else + c1[i] = 2; +} diff --git a/gcc/testsuite/gcc.target/i386/mask-unpack.c b/gcc/testsuite/gcc.target/i386/mask-unpack.c new file mode 100644 index 00000000000..5905e1cf00f --- /dev/null +++ b/gcc/testsuite/gcc.target/i386/mask-unpack.c @@ -0,0 +1,100 @@ +/* { dg-do compile } */ +/* { dg-options "-mavx512bw -mavx512dq -O3 -fopenmp-simd -fdump-tree-vect-details" } */ +/* { dg-final { scan-tree-dump-times "vectorized 1 loops" 10 "vect" } } */ +/* { dg-final { scan-assembler-not "maskmov" } } */ + +#define LENGTH 1000 + +long l1[LENGTH], l2[LENGTH]; +int i1[LENGTH], i2[LENGTH]; +short s1[LENGTH], s2[LENGTH]; +char c1[LENGTH], c2[LENGTH]; +double d1[LENGTH], d2[LENGTH]; + +int test1 () +{ + int i; + #pragma omp simd safelen(16) + for (i = 0; i < LENGTH; i++) + if (i1[i] > i2[i]) + l1[i] = 1; +} + +int test2 (int n) +{ + int i; + #pragma omp simd safelen(32) + for (i = 0; i < LENGTH; i++) + if (s1[i] > s2[i]) + i1[i] = 1; +} + +int test3 (int n) +{ + int i; + #pragma omp simd safelen(32) + for (i = 0; i < LENGTH; i++) + if (c1[i] > c2[i]) + s1[i] = 1; +} + +int test4 (int n) +{ + int i; + #pragma omp simd safelen(32) + for (i = 0; i < LENGTH; i++) + if (c1[i] > c2[i]) + d1[i] = 1; +} + +int test5 (int n) +{ + int i; + #pragma omp simd safelen(32) + for (i = 0; i < LENGTH; i++) + l1[i] = i1[i] > i2[i] ? 1 : 2; +} + +int test6 (int n) +{ + int i; + #pragma omp simd safelen(32) + for (i = 0; i < LENGTH; i++) + i1[i] = s1[i] > s2[i] ? 1 : 2; +} + +int test7 (int n) +{ + int i; + #pragma omp simd safelen(32) + for (i = 0; i < LENGTH; i++) + s1[i] = c1[i] > c2[i] ? 
1 : 2; +} + +int test8 (int n) +{ + int i; + #pragma omp simd safelen(32) + for (i = 0; i < LENGTH; i++) + d1[i] = c1[i] > c2[i] ? 1 : 2; +} + +int test9 (int n) +{ + int i; + #pragma omp simd safelen(16) + for (i = 0; i < LENGTH; i++) + if (c1[i] > c2[i] && i1[i] < i2[i]) + l1[i] = 1; +} + +int test10 (int n) +{ + int i; + #pragma omp simd safelen(16) + for (i = 0; i < LENGTH; i++) + if (c1[i] > c2[i] && i1[i] < i2[i]) + l1[i] = 1; + else + l1[i] = 2; +} diff --git a/gcc/testsuite/gcc.target/mips/split-ds-sequence.c b/gcc/testsuite/gcc.target/mips/split-ds-sequence.c new file mode 100644 index 00000000000..e60270db304 --- /dev/null +++ b/gcc/testsuite/gcc.target/mips/split-ds-sequence.c @@ -0,0 +1,19 @@ +/* { dg-options "isa_rev>=6" } */ +/* { dg-skip-if "code quality test" { *-*-* } { "-mcompact-branches=never" } { "" } } */ +/* { dg-final { scan-assembler-not "nop" } } */ + +int +testg2 (int a, int c) +{ + + int j = 0; + do + { + j += a; + } + while (j < 56); + + j += c; + return j; + +} diff --git a/gcc/tree-sra.c b/gcc/tree-sra.c index 30aee19aae7..2835c993588 100644 --- a/gcc/tree-sra.c +++ b/gcc/tree-sra.c @@ -4996,9 +4996,9 @@ convert_callers_for_node (struct cgraph_node *node, if (dump_file) fprintf (dump_file, "Adjusting call %s/%i -> %s/%i\n", - xstrdup (cs->caller->name ()), + xstrdup_for_dump (cs->caller->name ()), cs->caller->order, - xstrdup (cs->callee->name ()), + xstrdup_for_dump (cs->callee->name ()), cs->callee->order); ipa_modify_call_arguments (cs, cs->call_stmt, *adjustments); diff --git a/gcc/tree-vect-loop.c b/gcc/tree-vect-loop.c index cbf0073ffcf..55e53093caa 100644 --- a/gcc/tree-vect-loop.c +++ b/gcc/tree-vect-loop.c @@ -492,20 +492,27 @@ vect_determine_vectorization_factor (loop_vec_info loop_vinfo) } } - /* The vectorization factor is according to the smallest - scalar type (or the largest vector size, but we only - support one vector size per loop). 
*/ - if (!bool_result) - scalar_type = vect_get_smallest_scalar_type (stmt, &dummy, - &dummy); - if (dump_enabled_p ()) + /* Don't try to compute VF out of scalar types if the stmt + produces a boolean vector. Use result vectype instead. */ + if (VECTOR_BOOLEAN_TYPE_P (vectype)) + vf_vectype = vectype; + else { - dump_printf_loc (MSG_NOTE, vect_location, - "get vectype for scalar type: "); - dump_generic_expr (MSG_NOTE, TDF_SLIM, scalar_type); - dump_printf (MSG_NOTE, "\n"); + /* The vectorization factor is according to the smallest + scalar type (or the largest vector size, but we only + support one vector size per loop). */ + if (!bool_result) + scalar_type = vect_get_smallest_scalar_type (stmt, &dummy, + &dummy); + if (dump_enabled_p ()) + { + dump_printf_loc (MSG_NOTE, vect_location, + "get vectype for scalar type: "); + dump_generic_expr (MSG_NOTE, TDF_SLIM, scalar_type); + dump_printf (MSG_NOTE, "\n"); + } + vf_vectype = get_vectype_for_scalar_type (scalar_type); } - vf_vectype = get_vectype_for_scalar_type (scalar_type); if (!vf_vectype) { if (dump_enabled_p ()) diff --git a/gcc/tree-vect-patterns.c b/gcc/tree-vect-patterns.c index e91c6e008a0..4e1d2dbe858 100644 --- a/gcc/tree-vect-patterns.c +++ b/gcc/tree-vect-patterns.c @@ -66,6 +66,7 @@ static gimple *vect_recog_mult_pattern (vec<gimple *> *, static gimple *vect_recog_mixed_size_cond_pattern (vec<gimple *> *, tree *, tree *); static gimple *vect_recog_bool_pattern (vec<gimple *> *, tree *, tree *); +static gimple *vect_recog_mask_conversion_pattern (vec<gimple *> *, tree *, tree *); static vect_recog_func_ptr vect_vect_recog_func_ptrs[NUM_PATTERNS] = { vect_recog_widen_mult_pattern, vect_recog_widen_sum_pattern, @@ -79,7 +80,8 @@ static vect_recog_func_ptr vect_vect_recog_func_ptrs[NUM_PATTERNS] = { vect_recog_divmod_pattern, vect_recog_mult_pattern, vect_recog_mixed_size_cond_pattern, - vect_recog_bool_pattern}; + vect_recog_bool_pattern, + vect_recog_mask_conversion_pattern}; static inline void
append_pattern_def_seq (stmt_vec_info stmt_info, gimple *stmt) @@ -3152,7 +3154,7 @@ search_type_for_mask (tree var, vec_info *vinfo) enum vect_def_type dt; tree rhs1; enum tree_code rhs_code; - tree res = NULL_TREE; + tree res = NULL_TREE, res2; if (TREE_CODE (var) != SSA_NAME) return NULL_TREE; @@ -3185,13 +3187,26 @@ search_type_for_mask (tree var, vec_info *vinfo) case BIT_AND_EXPR: case BIT_IOR_EXPR: case BIT_XOR_EXPR: - if (!(res = search_type_for_mask (rhs1, vinfo))) - res = search_type_for_mask (gimple_assign_rhs2 (def_stmt), vinfo); + res = search_type_for_mask (rhs1, vinfo); + res2 = search_type_for_mask (gimple_assign_rhs2 (def_stmt), vinfo); + if (!res || (res2 && TYPE_PRECISION (res) > TYPE_PRECISION (res2))) + res = res2; break; default: if (TREE_CODE_CLASS (rhs_code) == tcc_comparison) { + tree comp_vectype, mask_type; + + comp_vectype = get_vectype_for_scalar_type (TREE_TYPE (rhs1)); + if (comp_vectype == NULL_TREE) + return NULL_TREE; + + mask_type = get_mask_type_for_scalar_type (TREE_TYPE (rhs1)); + if (!mask_type + || !expand_vec_cmp_expr_p (comp_vectype, mask_type)) + return NULL_TREE; + if (TREE_CODE (TREE_TYPE (rhs1)) != INTEGER_TYPE || !TYPE_UNSIGNED (TREE_TYPE (rhs1))) { @@ -3461,6 +3476,255 @@ vect_recog_bool_pattern (vec<gimple *> *stmts, tree *type_in, } +/* A helper for vect_recog_mask_conversion_pattern. Build + conversion of MASK to a type suitable for masking VECTYPE. + Built statement gets required vectype and is appended to + a pattern sequence of STMT_VINFO. + + Return converted mask. 
*/ + +static tree +build_mask_conversion (tree mask, tree vectype, stmt_vec_info stmt_vinfo, + vec_info *vinfo) +{ + gimple *stmt; + tree masktype, tmp; + stmt_vec_info new_stmt_info; + + masktype = build_same_sized_truth_vector_type (vectype); + tmp = vect_recog_temp_ssa_var (TREE_TYPE (masktype), NULL); + stmt = gimple_build_assign (tmp, CONVERT_EXPR, mask); + new_stmt_info = new_stmt_vec_info (stmt, vinfo); + set_vinfo_for_stmt (stmt, new_stmt_info); + STMT_VINFO_VECTYPE (new_stmt_info) = masktype; + append_pattern_def_seq (stmt_vinfo, stmt); + + return tmp; +} + + +/* Function vect_recog_mask_conversion_pattern + + Try to find statements which require boolean type + conversion. Additional conversion statements are + added to handle such cases. For example: + + bool m_1, m_2, m_3; + int i_4, i_5; + double d_6, d_7; + char c_1, c_2, c_3; + + S1 m_1 = i_4 > i_5; + S2 m_2 = d_6 < d_7; + S3 m_3 = m_1 & m_2; + S4 c_1 = m_3 ? c_2 : c_3; + + Will be transformed into: + + S1 m_1 = i_4 > i_5; + S2 m_2 = d_6 < d_7; + S3'' m_2' = (_Bool[bitsize=32])m_2 + S3' m_3' = m_1 & m_2'; + S4'' m_3'' = (_Bool[bitsize=8])m_3' + S4' c_1' = m_3'' ? c_2 : c_3; */ + +static gimple * +vect_recog_mask_conversion_pattern (vec<gimple *> *stmts, tree *type_in, + tree *type_out) +{ + gimple *last_stmt = stmts->pop (); + enum tree_code rhs_code; + tree lhs, rhs1, rhs2, tmp, rhs1_type, rhs2_type, vectype1, vectype2; + stmt_vec_info stmt_vinfo = vinfo_for_stmt (last_stmt); + stmt_vec_info pattern_stmt_info; + vec_info *vinfo = stmt_vinfo->vinfo; + gimple *pattern_stmt; + + /* Check for MASK_LOAD and MASK_STORE calls requiring mask conversion.
*/ + if (is_gimple_call (last_stmt) + && gimple_call_internal_p (last_stmt) + && (gimple_call_internal_fn (last_stmt) == IFN_MASK_STORE + || gimple_call_internal_fn (last_stmt) == IFN_MASK_LOAD)) + { + bool load = (gimple_call_internal_fn (last_stmt) == IFN_MASK_LOAD); + + if (load) + { + lhs = gimple_call_lhs (last_stmt); + vectype1 = get_vectype_for_scalar_type (TREE_TYPE (lhs)); + } + else + { + rhs2 = gimple_call_arg (last_stmt, 3); + vectype1 = get_vectype_for_scalar_type (TREE_TYPE (rhs2)); + } + + rhs1 = gimple_call_arg (last_stmt, 2); + rhs1_type = search_type_for_mask (rhs1, vinfo); + if (!rhs1_type) + return NULL; + vectype2 = get_mask_type_for_scalar_type (rhs1_type); + + if (!vectype1 || !vectype2 + || TYPE_VECTOR_SUBPARTS (vectype1) == TYPE_VECTOR_SUBPARTS (vectype2)) + return NULL; + + tmp = build_mask_conversion (rhs1, vectype1, stmt_vinfo, vinfo); + + if (load) + { + lhs = vect_recog_temp_ssa_var (TREE_TYPE (lhs), NULL); + pattern_stmt + = gimple_build_call_internal (IFN_MASK_LOAD, 3, + gimple_call_arg (last_stmt, 0), + gimple_call_arg (last_stmt, 1), + tmp); + gimple_call_set_lhs (pattern_stmt, lhs); + } + else + pattern_stmt + = gimple_build_call_internal (IFN_MASK_STORE, 4, + gimple_call_arg (last_stmt, 0), + gimple_call_arg (last_stmt, 1), + tmp, + gimple_call_arg (last_stmt, 3)); + + + pattern_stmt_info = new_stmt_vec_info (pattern_stmt, vinfo); + set_vinfo_for_stmt (pattern_stmt, pattern_stmt_info); + STMT_VINFO_DATA_REF (pattern_stmt_info) + = STMT_VINFO_DATA_REF (stmt_vinfo); + STMT_VINFO_DR_BASE_ADDRESS (pattern_stmt_info) + = STMT_VINFO_DR_BASE_ADDRESS (stmt_vinfo); + STMT_VINFO_DR_INIT (pattern_stmt_info) = STMT_VINFO_DR_INIT (stmt_vinfo); + STMT_VINFO_DR_OFFSET (pattern_stmt_info) + = STMT_VINFO_DR_OFFSET (stmt_vinfo); + STMT_VINFO_DR_STEP (pattern_stmt_info) = STMT_VINFO_DR_STEP (stmt_vinfo); + STMT_VINFO_DR_ALIGNED_TO (pattern_stmt_info) + = STMT_VINFO_DR_ALIGNED_TO (stmt_vinfo); + DR_STMT (STMT_VINFO_DATA_REF (stmt_vinfo)) = 
pattern_stmt; + + *type_out = vectype1; + *type_in = vectype1; + stmts->safe_push (last_stmt); + if (dump_enabled_p ()) + dump_printf_loc (MSG_NOTE, vect_location, + "vect_recog_mask_conversion_pattern: detected:\n"); + + return pattern_stmt; + } + + if (!is_gimple_assign (last_stmt)) + return NULL; + + lhs = gimple_assign_lhs (last_stmt); + rhs1 = gimple_assign_rhs1 (last_stmt); + rhs_code = gimple_assign_rhs_code (last_stmt); + + /* Check for cond expression requiring mask conversion. */ + if (rhs_code == COND_EXPR) + { + /* vect_recog_mixed_size_cond_pattern could apply. + Do nothing then. */ + if (STMT_VINFO_IN_PATTERN_P (stmt_vinfo)) + return NULL; + + vectype1 = get_vectype_for_scalar_type (TREE_TYPE (lhs)); + + if (TREE_CODE (rhs1) == SSA_NAME) + { + rhs1_type = search_type_for_mask (rhs1, vinfo); + if (!rhs1_type) + return NULL; + } + else + rhs1_type = TREE_TYPE (TREE_OPERAND (rhs1, 0)); + + vectype2 = get_mask_type_for_scalar_type (rhs1_type); + + if (!vectype1 || !vectype2 + || TYPE_VECTOR_SUBPARTS (vectype1) == TYPE_VECTOR_SUBPARTS (vectype2)) + return NULL; + + /* If rhs1 is a comparison we need to move it into a + separate statement. 
*/ + if (TREE_CODE (rhs1) != SSA_NAME) + { + tmp = vect_recog_temp_ssa_var (TREE_TYPE (rhs1), NULL); + pattern_stmt = gimple_build_assign (tmp, rhs1); + rhs1 = tmp; + + pattern_stmt_info = new_stmt_vec_info (pattern_stmt, vinfo); + set_vinfo_for_stmt (pattern_stmt, pattern_stmt_info); + STMT_VINFO_VECTYPE (pattern_stmt_info) = vectype2; + append_pattern_def_seq (stmt_vinfo, pattern_stmt); + } + + tmp = build_mask_conversion (rhs1, vectype1, stmt_vinfo, vinfo); + + lhs = vect_recog_temp_ssa_var (TREE_TYPE (lhs), NULL); + pattern_stmt = gimple_build_assign (lhs, COND_EXPR, tmp, + gimple_assign_rhs2 (last_stmt), + gimple_assign_rhs3 (last_stmt)); + + *type_out = vectype1; + *type_in = vectype1; + stmts->safe_push (last_stmt); + if (dump_enabled_p ()) + dump_printf_loc (MSG_NOTE, vect_location, + "vect_recog_mask_conversion_pattern: detected:\n"); + + return pattern_stmt; + } + + /* Now check for binary boolean operations requiring conversion for + one of operands. */ + if (TREE_CODE (TREE_TYPE (lhs)) != BOOLEAN_TYPE) + return NULL; + + if (rhs_code != BIT_IOR_EXPR + && rhs_code != BIT_XOR_EXPR + && rhs_code != BIT_AND_EXPR) + return NULL; + + rhs2 = gimple_assign_rhs2 (last_stmt); + + rhs1_type = search_type_for_mask (rhs1, vinfo); + rhs2_type = search_type_for_mask (rhs2, vinfo); + + if (!rhs1_type || !rhs2_type + || TYPE_PRECISION (rhs1_type) == TYPE_PRECISION (rhs2_type)) + return NULL; + + if (TYPE_PRECISION (rhs1_type) < TYPE_PRECISION (rhs2_type)) + { + vectype1 = get_mask_type_for_scalar_type (rhs1_type); + if (!vectype1) + return NULL; + rhs2 = build_mask_conversion (rhs2, vectype1, stmt_vinfo, vinfo); + } + else + { + vectype1 = get_mask_type_for_scalar_type (rhs2_type); + if (!vectype1) + return NULL; + rhs1 = build_mask_conversion (rhs1, vectype1, stmt_vinfo, vinfo); + } + + lhs = vect_recog_temp_ssa_var (TREE_TYPE (lhs), NULL); + pattern_stmt = gimple_build_assign (lhs, rhs_code, rhs1, rhs2); + + *type_out = vectype1; + *type_in = vectype1; + 
 	  stmts->safe_push (last_stmt);
+  if (dump_enabled_p ())
+    dump_printf_loc (MSG_NOTE, vect_location,
+                     "vect_recog_mask_conversion_pattern: detected:\n");
+
+  return pattern_stmt;
+}
+
+
 /* Mark statements that are involved in a pattern. */
 
 static inline void
@@ -3556,7 +3820,8 @@ vect_pattern_recog_1 (vect_recog_func_ptr vect_recog_func,
   stmt_info = vinfo_for_stmt (stmt);
   loop_vinfo = STMT_VINFO_LOOP_VINFO (stmt_info);
 
-  if (VECTOR_MODE_P (TYPE_MODE (type_in)))
+  if (VECTOR_BOOLEAN_TYPE_P (type_in)
+      || VECTOR_MODE_P (TYPE_MODE (type_in)))
     {
       /* No need to check target support (already checked by the
          pattern recognition function). */
diff --git a/gcc/tree-vect-stmts.c b/gcc/tree-vect-stmts.c
index bdf16faff79..e6a320b341e 100644
--- a/gcc/tree-vect-stmts.c
+++ b/gcc/tree-vect-stmts.c
@@ -1974,6 +1974,11 @@ vectorizable_mask_load_store (gimple *stmt, gimple_stmt_iterator *gsi,
       /* Ensure that even with -fno-tree-dce the scalar MASK_LOAD is removed
          from the IL.  */
+      if (STMT_VINFO_RELATED_STMT (stmt_info))
+        {
+          stmt = STMT_VINFO_RELATED_STMT (stmt_info);
+          stmt_info = vinfo_for_stmt (stmt);
+        }
       tree lhs = gimple_call_lhs (stmt);
       new_stmt = gimple_build_assign (lhs, build_zero_cst (TREE_TYPE (lhs)));
       set_vinfo_for_stmt (new_stmt, stmt_info);
@@ -2092,6 +2097,11 @@ vectorizable_mask_load_store (gimple *stmt, gimple_stmt_iterator *gsi,
     {
       /* Ensure that even with -fno-tree-dce the scalar MASK_LOAD is removed
          from the IL.  */
+      if (STMT_VINFO_RELATED_STMT (stmt_info))
+        {
+          stmt = STMT_VINFO_RELATED_STMT (stmt_info);
+          stmt_info = vinfo_for_stmt (stmt);
+        }
       tree lhs = gimple_call_lhs (stmt);
       new_stmt = gimple_build_assign (lhs, build_zero_cst (TREE_TYPE (lhs)));
       set_vinfo_for_stmt (new_stmt, stmt_info);
@@ -3565,12 +3575,13 @@ vectorizable_conversion (gimple *stmt, gimple_stmt_iterator *gsi,
             && SCALAR_FLOAT_TYPE_P (rhs_type))))
     return false;
 
-  if ((INTEGRAL_TYPE_P (lhs_type)
-       && (TYPE_PRECISION (lhs_type)
-           != GET_MODE_PRECISION (TYPE_MODE (lhs_type))))
-      || (INTEGRAL_TYPE_P (rhs_type)
-          && (TYPE_PRECISION (rhs_type)
-              != GET_MODE_PRECISION (TYPE_MODE (rhs_type)))))
+  if (!VECTOR_BOOLEAN_TYPE_P (vectype_out)
+      && ((INTEGRAL_TYPE_P (lhs_type)
+           && (TYPE_PRECISION (lhs_type)
+               != GET_MODE_PRECISION (TYPE_MODE (lhs_type))))
+          || (INTEGRAL_TYPE_P (rhs_type)
+              && (TYPE_PRECISION (rhs_type)
+                  != GET_MODE_PRECISION (TYPE_MODE (rhs_type))))))
     {
       if (dump_enabled_p ())
         dump_printf_loc (MSG_MISSED_OPTIMIZATION, vect_location,
@@ -3628,6 +3639,21 @@ vectorizable_conversion (gimple *stmt, gimple_stmt_iterator *gsi,
       return false;
     }
 
+  if (VECTOR_BOOLEAN_TYPE_P (vectype_out)
+      && !VECTOR_BOOLEAN_TYPE_P (vectype_in))
+    {
+      if (dump_enabled_p ())
+        {
+          dump_printf_loc (MSG_MISSED_OPTIMIZATION, vect_location,
+                           "can't convert between boolean and non "
+                           "boolean vectors");
+          dump_generic_expr (MSG_MISSED_OPTIMIZATION, TDF_SLIM, rhs_type);
+          dump_printf (MSG_MISSED_OPTIMIZATION, "\n");
+        }
+
+      return false;
+    }
+
   nunits_in = TYPE_VECTOR_SUBPARTS (vectype_in);
   nunits_out = TYPE_VECTOR_SUBPARTS (vectype_out);
   if (nunits_in < nunits_out)
@@ -8170,7 +8196,7 @@ free_stmt_vec_info (gimple *stmt)
       gimple *patt_stmt = STMT_VINFO_STMT (patt_info);
       gimple_set_bb (patt_stmt, NULL);
       tree lhs = gimple_get_lhs (patt_stmt);
-      if (TREE_CODE (lhs) == SSA_NAME)
+      if (lhs && TREE_CODE (lhs) == SSA_NAME)
         release_ssa_name (lhs);
       if (seq)
         {
@@ -8180,7 +8206,7 @@ free_stmt_vec_info (gimple *stmt)
           gimple *seq_stmt = gsi_stmt (si);
           gimple_set_bb (seq_stmt, NULL);
           lhs = gimple_get_lhs (seq_stmt);
-          if (TREE_CODE (lhs) == SSA_NAME)
+          if (lhs && TREE_CODE (lhs) == SSA_NAME)
             release_ssa_name (lhs);
           free_stmt_vec_info (seq_stmt);
         }
diff --git a/gcc/tree-vectorizer.h b/gcc/tree-vectorizer.h
index 9bf7949e081..45c2d9bbbe1 100644
--- a/gcc/tree-vectorizer.h
+++ b/gcc/tree-vectorizer.h
@@ -1087,7 +1087,7 @@ extern gimple *vect_find_last_scalar_stmt_in_slp (slp_tree);
    Additional pattern recognition functions can (and will) be added
    in the future. */
 typedef gimple *(* vect_recog_func_ptr) (vec<gimple *> *, tree *, tree *);
-#define NUM_PATTERNS 13
+#define NUM_PATTERNS 14
 void vect_pattern_recog (vec_info *);
 
 /* In tree-vectorizer.c. */
diff --git a/gcc/tree-vrp.c b/gcc/tree-vrp.c
index cb43430ecb1..b0f6c78d734 100644
--- a/gcc/tree-vrp.c
+++ b/gcc/tree-vrp.c
@@ -8810,20 +8810,11 @@ vrp_visit_phi_node (gphi *phi)
       /* If we dropped either bound to +-INF then if this is a loop
          PHI node SCEV may known more about its value-range. */
-      if ((cmp_min > 0 || cmp_min < 0
+      if (cmp_min > 0 || cmp_min < 0
           || cmp_max < 0 || cmp_max > 0)
-          && (l = loop_containing_stmt (phi))
-          && l->header == gimple_bb (phi))
-        adjust_range_with_scev (&vr_result, l, phi, lhs);
-
-      /* If we will end up with a (-INF, +INF) range, set it to
-         VARYING.  Same if the previous max value was invalid for
-         the type and we end up with vr_result.min > vr_result.max. */
-      if ((vrp_val_is_max (vr_result.max)
-           && vrp_val_is_min (vr_result.min))
-          || compare_values (vr_result.min,
-                             vr_result.max) > 0)
-        goto varying;
+        goto scev_check;
+
+      goto infinite_check;
     }
 
   /* If the new range is different than the previous value, keep
@@ -8849,8 +8840,28 @@ update_range:
   /* Nothing changed, don't add outgoing edges. */
   return SSA_PROP_NOT_INTERESTING;
 
-  /* No match found.  Set the LHS to VARYING. */
 varying:
+  set_value_range_to_varying (&vr_result);
+
+scev_check:
+  /* If this is a loop PHI node SCEV may known more about its value-range.
+     scev_check can be reached from two paths, one is a fall through from above
+     "varying" label, the other is direct goto from code block which tries to
+     avoid infinite simulation. */
+  if ((l = loop_containing_stmt (phi))
+      && l->header == gimple_bb (phi))
+    adjust_range_with_scev (&vr_result, l, phi, lhs);
+
+infinite_check:
+  /* If we will end up with a (-INF, +INF) range, set it to
+     VARYING.  Same if the previous max value was invalid for
+     the type and we end up with vr_result.min > vr_result.max. */
+  if ((vr_result.type == VR_RANGE || vr_result.type == VR_ANTI_RANGE)
+      && !((vrp_val_is_max (vr_result.max) && vrp_val_is_min (vr_result.min))
+           || compare_values (vr_result.min, vr_result.max) > 0))
+    goto update_range;
+
+  /* No match found.  Set the LHS to VARYING. */
   set_value_range_to_varying (lhs_vr);
   return SSA_PROP_VARYING;
 }
diff --git a/gcc/var-tracking.c b/gcc/var-tracking.c
index de2674058f5..c8be4e8b722 100644
--- a/gcc/var-tracking.c
+++ b/gcc/var-tracking.c
@@ -9814,7 +9814,7 @@ vt_initialize (void)
 
   alloc_aux_for_blocks (sizeof (variable_tracking_info));
 
-  empty_shared_hash = new shared_hash;
+  empty_shared_hash = shared_hash_pool.allocate ();
  empty_shared_hash->refcount = 1;
   empty_shared_hash->htab = new variable_table_type (1);
   changed_variables = new variable_table_type (10);
diff --git a/libgcc/ChangeLog b/libgcc/ChangeLog
index 13ed8133857..549d0ac3990 100644
--- a/libgcc/ChangeLog
+++ b/libgcc/ChangeLog
@@ -1,3 +1,19 @@
+2015-11-11  Claudiu Zissulescu  <claziss@synopsys.com>
+
+	* config/arc/dp-hack.h: Add support for ARCHS.
+	* config/arc/ieee-754/divdf3.S: Likewise.
+	* config/arc/ieee-754/divsf3-stdmul.S: Likewise.
+	* config/arc/ieee-754/muldf3.S: Likewise.
+	* config/arc/ieee-754/mulsf3.S: Likewise
+	* config/arc/lib1funcs.S: Likewise
+	* config/arc/gmon/dcache_linesz.S: Don't read the build register
+	for ARCv2 cores.
+	* config/arc/gmon/profil.S (__profil, __profil_irq): Don't profile
+	for ARCv2 cores.
+	* config/arc/ieee-754/arc-ieee-754.h (MPYHU, MPYH): Define.
+	* config/arc/t-arc700-uClibc: Remove hard selection for ARC 700
+	cores.
+
 2015-11-09  Rainer Orth  <ro@CeBiTec.Uni-Bielefeld.DE>
 
 	* config/ia64/crtbegin.S: Check HAVE_INITFINI_ARRAY_SUPPORT
diff --git a/libgcc/config/arc/dp-hack.h b/libgcc/config/arc/dp-hack.h
index c1ab9b2294e..a212e3b8b60 100644
--- a/libgcc/config/arc/dp-hack.h
+++ b/libgcc/config/arc/dp-hack.h
@@ -48,7 +48,7 @@ see the files COPYING3 and COPYING.RUNTIME respectively. If not, see
 #define L_mul_df
 #define L_div_df
 #elif (!defined (__ARC700__) && !defined (__ARC_MUL64__) \
-       && !defined(__ARC_MUL32BY16__))
+       && !defined (__ARC_MUL32BY16__) && !defined (__HS__))
 #define L_mul_df
 #define L_div_df
 #undef QUIET_NAN
diff --git a/libgcc/config/arc/gmon/dcache_linesz.S b/libgcc/config/arc/gmon/dcache_linesz.S
index 8cf64426aca..972a5879fed 100644
--- a/libgcc/config/arc/gmon/dcache_linesz.S
+++ b/libgcc/config/arc/gmon/dcache_linesz.S
@@ -38,6 +38,7 @@ see the files COPYING3 and COPYING.RUNTIME respectively.  If not, see
 	.global __dcache_linesz
 	.balign 4
 __dcache_linesz:
+#if !defined (__EM__) && !defined (__HS__)
 	lr r12,[D_CACHE_BUILD]
 	extb_s r0,r12
 	breq_s r0,0,.Lsz_nocache
@@ -51,5 +52,6 @@ __dcache_linesz:
 	asl_s r0,r0,r12
 	j_s [blink]
 .Lsz_nocache:
+#endif /* !__EM__ && !__HS__ */
 	mov_s r0,1
 	j_s [blink]
diff --git a/libgcc/config/arc/gmon/profil.S b/libgcc/config/arc/gmon/profil.S
index 3be2869c924..df10dbd6af7 100644
--- a/libgcc/config/arc/gmon/profil.S
+++ b/libgcc/config/arc/gmon/profil.S
@@ -45,6 +45,7 @@ __profil_offset:
 	.global __dcache_linesz
 	.global __profil
 	FUNC(__profil)
+#if !defined (__EM__) && !defined (__HS__)
 .Lstop_profiling:
 	sr r0,[CONTROL0]
 	j_s [blink]
@@ -107,6 +108,12 @@ nocache:
 	j_s [blink]
 	.balign 4
 1:	j __profil_irq
+#else
+__profil:
+	.balign 4
+	mov_s r0,-1
+	j_s [blink]
+#endif /* !__EM__ && !__HS__ */
 	ENDFUNC(__profil)
 
 	FUNC(__profil_irq)
@@ -114,6 +121,7 @@ nocache:
 	.balign 32,0,12 ; make sure the code spans no more that two cache lines
 	nop_s
__profil_irq:
+#if !defined (__EM__) && !defined (__HS__)
 	push_s r0
 	ld r0,[__profil_offset]
 	push_s r1
@@ -128,6 +136,9 @@ __profil_irq:
 nostore:ld.ab r2,[sp,8]
 	pop_s r0
 	j.f [ilink1]
+#else
+	rtie
+#endif /* !__EM__ && !__HS__ */
 	ENDFUNC(__profil_irq)
 
 ; could save one cycle if the counters were allocated at link time and
diff --git a/libgcc/config/arc/ieee-754/arc-ieee-754.h b/libgcc/config/arc/ieee-754/arc-ieee-754.h
index 08a14a6f429..f1ac98e4278 100644
--- a/libgcc/config/arc/ieee-754/arc-ieee-754.h
+++ b/libgcc/config/arc/ieee-754/arc-ieee-754.h
@@ -54,3 +54,10 @@ see the files COPYING3 and COPYING.RUNTIME respectively.  If not, see
 #define bmsk_l bmsk
 #define bxor_l bxor
 #define bcs_s blo_s
+#if defined (__HS__) || defined (__EM__)
+#define MPYHU mpymu
+#define MPYH mpym
+#else
+#define MPYHU mpyhu
+#define MPYH mpyh
+#endif
diff --git a/libgcc/config/arc/ieee-754/divdf3.S b/libgcc/config/arc/ieee-754/divdf3.S
index 2d000e40a04..27705ed5909 100644
--- a/libgcc/config/arc/ieee-754/divdf3.S
+++ b/libgcc/config/arc/ieee-754/divdf3.S
@@ -118,7 +118,7 @@ __divdf3_support: /* This label makes debugger output saner. */
 	sub r11,r11,11
 	asl DBL1L,DBL1L,r11
 	sub r11,r11,1
-	mpyhu r5,r4,r8
+	MPYHU r5,r4,r8
 	sub r7,r7,r11
 	asl r4,r4,12
 	b.d .Lpast_denorm_dbl1
@@ -189,25 +189,33 @@ __divdf3:
 	asl r8,DBL1H,12
 	lsr r12,DBL1L,20
 	lsr r4,r8,26
+#ifdef __HS__
+	add3 r10,pcl,60 ; (.Ldivtab-.) >> 3
+#else
 	add3 r10,pcl,59 ; (.Ldivtab-.) >> 3
+#endif
 	ld.as r4,[r10,r4]
+#ifdef __HS__
+	ld.as r9,[pcl,182]; [pcl,(-((.-.L7ff00000) >> 2))] ; 0x7ff00000
+#else
 	ld.as r9,[pcl,180]; [pcl,(-((.-.L7ff00000) >> 2))] ; 0x7ff00000
+#endif
 	or r8,r8,r12
-	mpyhu r5,r4,r8
+	MPYHU r5,r4,r8
 	and.f r7,DBL1H,r9
 	asl r4,r4,12 ; having the asl here is a concession to the XMAC pipeline.
 	beq.d .Ldenorm_dbl1
 	and r6,DBL0H,r9
 .Lpast_denorm_dbl1:
 	; wb stall
 	sub r4,r4,r5
-	mpyhu r5,r4,r4
+	MPYHU r5,r4,r4
 	breq.d r6,0,.Ldenorm_dbl0
 	lsr r8,r8,1
 	asl r12,DBL0H,11
 	lsr r10,DBL0L,21
.Lpast_denorm_dbl0:
 	; wb stall
 	bset r8,r8,31
-	mpyhu r11,r5,r8
+	MPYHU r11,r5,r8
 	add_s r12,r12,r10
 	bset r5,r12,31
 	cmp r5,r8
@@ -215,7 +223,7 @@ __divdf3:
 	; wb stall
 	lsr.cc r5,r5,1
 	sub r4,r4,r11 ; u1.31 inverse, about 30 bit
-	mpyhu r11,r5,r4 ; result fraction highpart
+	MPYHU r11,r5,r4 ; result fraction highpart
 	breq r7,r9,.Linf_nan_dbl1
 	lsr r8,r8,2 ; u3.29
 	add r5,r6, /* wait for immediate / XMAC wb stall */ \
@@ -226,7 +234,7 @@ __divdf3:
 	asl_s DBL1L,DBL1L,9 ; u-29.23:9
 	sbc r6,r5,r7 ; resource conflict (not for XMAC)
-	mpyhu r5,r11,DBL1L ; u-28.23:9
+	MPYHU r5,r11,DBL1L ; u-28.23:9
 	add.cs DBL0L,DBL0L,DBL0L
 	asl_s DBL0L,DBL0L,6 ; u-26.25:7
 	asl r10,r11,23
@@ -234,7 +242,7 @@ __divdf3:
 	; wb stall (before 'and' for XMAC)
 	lsr r7,r11,9
 	sub r5,DBL0L,r5 ; rest msw ; u-26.31:0
-	mpyh r12,r5,r4 ; result fraction lowpart
+	MPYH r12,r5,r4 ; result fraction lowpart
 	xor.f 0,DBL0H,DBL1H
 	and DBL0H,r6,r9
 	add_s DBL0H,DBL0H,r7 ; (XMAC wb stall)
@@ -261,7 +269,7 @@ __divdf3:
 	sub.cs DBL0H,DBL0H,1
 	sub.f r12,r12,2 ; resource conflict (not for XMAC)
-	mpyhu r7,r12,DBL1L ; u-51.32
+	MPYHU r7,r12,DBL1L ; u-51.32
 	asl r5,r5,25 ; s-51.7:25
 	lsr r10,r10,7 ; u-51.30:2
 	; resource conflict (not for XMAC)
@@ -291,10 +299,21 @@ __divdf3:
 	rsub r7,r6,5
 	asr r10,r12,28
 	bmsk r4,r12,27
+#ifdef __HS__
+	min r7, r7, 31
+	asr DBL0L, r4, r7
+#else
 	asrs DBL0L,r4,r7
+#endif
 	add DBL1H,r11,r10
+#ifdef __HS__
+	abs.f r10, r4
+	sub.mi r10, r10, 1
+#endif
 	add.f r7,r6,32-5
+#ifdef __ARC700__
 	abss r10,r4
+#endif
 	asl r4,r4,r7
 	mov.mi r4,r10
 	add.f r10,r6,23
@@ -319,7 +338,7 @@ __divdf3:
 	and r9,DBL0L,1 ; tie-breaker: round to even
 	lsr r11,r11,7 ; u-51.30:2
 	; resource conflict (not for XMAC)
-	mpyhu r8,r12,DBL1L ; u-51.32
+	MPYHU r8,r12,DBL1L ; u-51.32
 	sub.mi r11,r11,DBL1L ; signed multiply adjust for r12*DBL1L
 	add_s DBL1H,DBL1H,r11 ; resource conflict (not for XMAC)
diff --git a/libgcc/config/arc/ieee-754/divsf3-stdmul.S b/libgcc/config/arc/ieee-754/divsf3-stdmul.S
index 09861d3318c..f13944ae11a 100644
--- a/libgcc/config/arc/ieee-754/divsf3-stdmul.S
+++ b/libgcc/config/arc/ieee-754/divsf3-stdmul.S
@@ -144,7 +144,7 @@ __divsf3_support: /* This label makes debugger output saner. */
 	ld.as r5,[r3,r5]
 	add r4,r6,r6 ; load latency
-	mpyhu r7,r5,r4
+	MPYHU r7,r5,r4
 	bic.ne.f 0, \
 	0x60000000,r0 ; large number / denorm -> Inf
 	beq_s .Linf_NaN
@@ -152,7 +152,7 @@ __divsf3_support: /* This label makes debugger output saner. */
 	; wb stall
 	; slow track
 	sub r7,r5,r7
-	mpyhu r8,r7,r6
+	MPYHU r8,r7,r6
 	asl_s r12,r12,23
 	and.f r2,r0,r9
 	add r2,r2,r12
@@ -160,7 +160,7 @@ __divsf3_support: /* This label makes debugger output saner. */
 	; wb stall
 	bne.d .Lpast_denorm_fp1
.Ldenorm_fp0:
-	mpyhu r8,r8,r7
+	MPYHU r8,r8,r7
 	bclr r12,r12,31
 	norm.f r3,r12 ; flag for 0/x -> 0 check
 	bic.ne.f 0,0x60000000,r1 ; denorm/large number -> 0
@@ -209,7 +209,7 @@ __divsf3:
 	ld.as r5,[r3,r2]
 	asl r4,r1,9
 	ld.as r9,[pcl,-114]; [pcl,(-((.-.L7f800000) >> 2))] ; 0x7f800000
-	mpyhu r7,r5,r4
+	MPYHU r7,r5,r4
 	asl r6,r1,8
 	and.f r11,r1,r9
 	bset r6,r6,31
@@ -217,14 +217,14 @@ __divsf3:
 	; wb stall
 	beq .Ldenorm_fp1
 	sub r7,r5,r7
-	mpyhu r8,r7,r6
+	MPYHU r8,r7,r6
 	breq.d r11,r9,.Linf_nan_fp1
 	and.f r2,r0,r9
 	beq.d .Ldenorm_fp0
 	asl r12,r0,8
 	; wb stall
 	breq r2,r9,.Linf_nan_fp0
-	mpyhu r8,r8,r7
+	MPYHU r8,r8,r7
.Lpast_denorm_fp1:
 	bset r3,r12,31
.Lpast_denorm_fp0:
@@ -234,7 +234,7 @@ __divsf3:
 	/* wb stall */ \
 	0x3f000000
 	sub r7,r7,r8 ; u1.31 inverse, about 30 bit
-	mpyhu r3,r3,r7
+	MPYHU r3,r3,r7
 	sbc r2,r2,r11
 	xor.f 0,r0,r1
 	and r0,r2,r9
diff --git a/libgcc/config/arc/ieee-754/muldf3.S b/libgcc/config/arc/ieee-754/muldf3.S
index 805db5c8922..5f562e23354 100644
--- a/libgcc/config/arc/ieee-754/muldf3.S
+++ b/libgcc/config/arc/ieee-754/muldf3.S
@@ -132,19 +132,19 @@ __muldf3_support: /* This label makes debugger output saner. */
 	.balign 4
 __muldf3:
 	ld.as r9,[pcl,0x4b] ; ((.L7ff00000-.+2)/4)]
-	mpyhu r4,DBL0L,DBL1L
+	MPYHU r4,DBL0L,DBL1L
 	bmsk r6,DBL0H,19
 	bset r6,r6,20
 	mpyu r7,r6,DBL1L
 	and r11,DBL0H,r9
 	breq r11,0,.Ldenorm_dbl0
-	mpyhu r8,r6,DBL1L
+	MPYHU r8,r6,DBL1L
 	bmsk r10,DBL1H,19
 	bset r10,r10,20
-	mpyhu r5,r10,DBL0L
+	MPYHU r5,r10,DBL0L
 	add.f r4,r4,r7
 	and r12,DBL1H,r9
-	mpyhu r7,r6,r10
+	MPYHU r7,r6,r10
 	breq r12,0,.Ldenorm_dbl1
 	adc.f r5,r5,r8
 	mpyu r8,r10,DBL0L
diff --git a/libgcc/config/arc/ieee-754/mulsf3.S b/libgcc/config/arc/ieee-754/mulsf3.S
index 7a6c7916ddb..df2660a2102 100644
--- a/libgcc/config/arc/ieee-754/mulsf3.S
+++ b/libgcc/config/arc/ieee-754/mulsf3.S
@@ -64,7 +64,7 @@ __mulsf3:
 	bset r2,r0,23
 	asl_s r2,r2,8
 	bset r3,r4,23
-	mpyhu r6,r2,r3
+	MPYHU r6,r2,r3
 	and r11,r0,r9
 	breq r11,0,.Ldenorm_dbl0
 	mpyu r7,r2,r3
@@ -144,7 +144,7 @@ __mulsf3:
 	add_s r2,r2,r2
 	asl r2,r2,r4
 	asl r4,r4,23
-	mpyhu r6,r2,r3
+	MPYHU r6,r2,r3
 	breq r12,r9,.Ldenorm_dbl0_inf_nan_dbl1
 	sub.ne.f r12,r12,r4
 	mpyu r7,r2,r3
@@ -163,7 +163,7 @@ __mulsf3:
 	asl r4,r4,r3
 	sub_s r3,r3,1
 	asl_s r3,r3,23
-	mpyhu r6,r2,r4
+	MPYHU r6,r2,r4
 	sub.ne.f r11,r11,r3
 	bmsk r8,r0,30
 	mpyu r7,r2,r4
diff --git a/libgcc/config/arc/lib1funcs.S b/libgcc/config/arc/lib1funcs.S
index e59340a2242..022a2ea0cbe 100644
--- a/libgcc/config/arc/lib1funcs.S
+++ b/libgcc/config/arc/lib1funcs.S
@@ -79,7 +79,7 @@ SYM(__mulsi3):
 	j_s.d [blink]
 	mov_s r0,mlo
 	ENDFUNC(__mulsi3)
-#elif defined (__ARC700__)
+#elif defined (__ARC700__) || defined (__HS__)
 	HIDDEN_FUNC(__mulsi3)
 	mpyu r0,r0,r1
 	nop_s
@@ -393,7 +393,12 @@ SYM(__udivmodsi4):
 	lsr_s r1,r1
 	cmp_s r0,r1
 	xor.f r2,lp_count,31
+#if !defined (__EM__)
 	mov_s lp_count,r2
+#else
+	mov lp_count,r2
+	nop_s
+#endif /* !__EM__ */
 #endif /* !__ARC_NORM__ */
 	sub.cc r0,r0,r1
 	mov_s r3,3
@@ -1260,7 +1265,7 @@ SYM(__ld_r13_to_r14_ret):
 #endif
 
 #ifdef L_muldf3
-#ifdef __ARC700__
+#if defined (__ARC700__) || defined (__HS__)
 #include "ieee-754/muldf3.S"
 #elif defined (__ARC_NORM__) && defined(__ARC_MUL64__)
 #include "ieee-754/arc600-mul64/muldf3.S"
@@ -1276,7 +1281,7 @@ SYM(__ld_r13_to_r14_ret):
 #endif
 
 #ifdef L_mulsf3
-#ifdef __ARC700__
+#if defined (__ARC700__) || defined (__HS__)
 #include "ieee-754/mulsf3.S"
 #elif defined (__ARC_NORM__) && defined(__ARC_MUL64__)
 #include "ieee-754/arc600-mul64/mulsf3.S"
@@ -1288,7 +1293,7 @@ SYM(__ld_r13_to_r14_ret):
 #endif
 
 #ifdef L_divdf3
-#ifdef __ARC700__
+#if defined (__ARC700__) || defined (__HS__)
 #include "ieee-754/divdf3.S"
 #elif defined (__ARC_NORM__) && defined(__ARC_MUL64__)
 #include "ieee-754/arc600-mul64/divdf3.S"
@@ -1298,7 +1303,7 @@ SYM(__ld_r13_to_r14_ret):
 #endif
 
 #ifdef L_divsf3
-#ifdef __ARC700__
+#if defined (__ARC700__) || defined (__HS__)
 #include "ieee-754/divsf3-stdmul.S"
 #elif defined (__ARC_NORM__) && defined(__ARC_MUL64__)
 #include "ieee-754/arc600-mul64/divsf3.S"
diff --git a/libgcc/config/arc/t-arc700-uClibc b/libgcc/config/arc/t-arc700-uClibc
index 651c3de5260..ff570398d90 100644
--- a/libgcc/config/arc/t-arc700-uClibc
+++ b/libgcc/config/arc/t-arc700-uClibc
@@ -28,10 +28,10 @@ CRTSTUFF_T_CFLAGS += -mno-sdata
 
 # Compile crtbeginS.o and crtendS.o with pic.
-CRTSTUFF_T_CFLAGS_S = $(CRTSTUFF_T_CFLAGS) -mA7 -fPIC
+CRTSTUFF_T_CFLAGS_S = $(CRTSTUFF_T_CFLAGS) -fPIC
 
 # Compile libgcc2.a with pic.
-TARGET_LIBGCC2_CFLAGS = -mA7 -fPIC
+TARGET_LIBGCC2_CFLAGS = -fPIC
 
 PROFILE_OSDEP = prof-freq.o
diff --git a/libgo/configure b/libgo/configure
index 08a197d5a61..eb37e29d2f8 100755
--- a/libgo/configure
+++ b/libgo/configure
@@ -14249,6 +14249,46 @@ fi
 fi
 
 unset ac_cv_func_gethostbyname
+  ac_fn_c_check_func "$LINENO" "sendfile" "ac_cv_func_sendfile"
+if test "x$ac_cv_func_sendfile" = x""yes; then :
+
+else
+  { $as_echo "$as_me:${as_lineno-$LINENO}: checking for main in -lsendfile" >&5
+$as_echo_n "checking for main in -lsendfile... " >&6; }
+if test "${ac_cv_lib_sendfile_main+set}" = set; then :
+  $as_echo_n "(cached) " >&6
+else
+  ac_check_lib_save_LIBS=$LIBS
+LIBS="-lsendfile $LIBS"
+cat confdefs.h - <<_ACEOF >conftest.$ac_ext
+/* end confdefs.h.  */
+
+
+int
+main ()
+{
+return main ();
+  ;
+  return 0;
+}
+_ACEOF
+if ac_fn_c_try_link "$LINENO"; then :
+  ac_cv_lib_sendfile_main=yes
+else
+  ac_cv_lib_sendfile_main=no
+fi
+rm -f core conftest.err conftest.$ac_objext \
+    conftest$ac_exeext conftest.$ac_ext
+LIBS=$ac_check_lib_save_LIBS
+fi
+{ $as_echo "$as_me:${as_lineno-$LINENO}: result: $ac_cv_lib_sendfile_main" >&5
+$as_echo "$ac_cv_lib_sendfile_main" >&6; }
+if test "x$ac_cv_lib_sendfile_main" = x""yes; then :
+  libgo_cv_lib_sockets="$libgo_cv_lib_sockets -lsendfile"
+fi
+
+fi
+
 LIBS=$libgo_old_libs
 fi
diff --git a/libgo/configure.ac b/libgo/configure.ac
index 332e540a302..6e23a85fa6d 100644
--- a/libgo/configure.ac
+++ b/libgo/configure.ac
@@ -473,6 +473,9 @@ AC_CACHE_CHECK([for socket libraries], libgo_cv_lib_sockets,
   [AC_CHECK_LIB(nsl, main,
     [libgo_cv_lib_sockets="$libgo_cv_lib_sockets -lnsl"])])
 unset ac_cv_func_gethostbyname
+  AC_CHECK_FUNC(sendfile, ,
+    [AC_CHECK_LIB(sendfile, main,
+      [libgo_cv_lib_sockets="$libgo_cv_lib_sockets -lsendfile"])])
 LIBS=$libgo_old_libs
 ])
 NET_LIBS="$libgo_cv_lib_sockets"
diff --git a/libgo/go/cmd/go/build.go b/libgo/go/cmd/go/build.go
index 3afac2ee062..865871c5314 100644
--- a/libgo/go/cmd/go/build.go
+++ b/libgo/go/cmd/go/build.go
@@ -2555,17 +2555,9 @@ func (tools gccgoToolchain) ld(b *builder, root *action, out string, allactions
 		}
 	}
 
-	switch ldBuildmode {
-	case "c-archive", "c-shared":
-		ldflags = append(ldflags, "-Wl,--whole-archive")
-	}
-
+	ldflags = append(ldflags, "-Wl,--whole-archive")
 	ldflags = append(ldflags, afiles...)
-
-	switch ldBuildmode {
-	case "c-archive", "c-shared":
-		ldflags = append(ldflags, "-Wl,--no-whole-archive")
-	}
+	ldflags = append(ldflags, "-Wl,--no-whole-archive")
 	ldflags = append(ldflags, cgoldflags...)
 	ldflags = append(ldflags, envList("CGO_LDFLAGS", "")...)
diff --git a/libgo/mksysinfo.sh b/libgo/mksysinfo.sh
index 6d39df96e95..662619f2076 100755
--- a/libgo/mksysinfo.sh
+++ b/libgo/mksysinfo.sh
@@ -1488,4 +1488,24 @@ grep '^type _zone_net_addr_t ' gen-sysinfo.go | \
     sed -e 's/_in6_addr/[16]byte/' \
     >> ${OUT}
 
+# The Solaris 12 _flow_arp_desc_t struct.
+grep '^type _flow_arp_desc_t ' gen-sysinfo.go | \
+    sed -e 's/_in6_addr_t/[16]byte/g' \
+    >> ${OUT}
+
+# The Solaris 12 _flow_l3_desc_t struct.
+grep '^type _flow_l3_desc_t ' gen-sysinfo.go | \
+    sed -e 's/_in6_addr_t/[16]byte/g' \
+    >> ${OUT}
+
+# The Solaris 12 _mac_ipaddr_t struct.
+grep '^type _mac_ipaddr_t ' gen-sysinfo.go | \
+    sed -e 's/_in6_addr_t/[16]byte/g' \
+    >> ${OUT}
+
+# The Solaris 12 _mactun_info_t struct.
+grep '^type _mactun_info_t ' gen-sysinfo.go | \
+    sed -e 's/_in6_addr_t/[16]byte/g' \
+    >> ${OUT}
+
 exit $?
diff --git a/libgomp/ChangeLog b/libgomp/ChangeLog
index 08d467b1055..ed86943bb32 100644
--- a/libgomp/ChangeLog
+++ b/libgomp/ChangeLog
@@ -1,6 +1,10 @@
 2015-11-09  Nathan Sidwell  <nathan@codesourcery.com>
 
-	* testsuite/libgomp.oacc-c-c++-common/firstprivate-1.c: New.
+	* testsuite/libgomp.oacc-c-c++-common/firstprivate-1.c: Remove
+	inadvertent commit.
+
+2015-11-09  Nathan Sidwell  <nathan@codesourcery.com>
+
 	* testsuite/libgomp.oacc-c-c++-common/routine-g-1.c: New.
 	* testsuite/libgomp.oacc-c-c++-common/routine-gwv-1.c: New.
 	* testsuite/libgomp.oacc-c-c++-common/routine-v-1.c: New.
diff --git a/libgomp/testsuite/libgomp.oacc-c-c++-common/firstprivate-1.c b/libgomp/testsuite/libgomp.oacc-c-c++-common/firstprivate-1.c
deleted file mode 100644
index 7f5d3d37617..00000000000
--- a/libgomp/testsuite/libgomp.oacc-c-c++-common/firstprivate-1.c
+++ /dev/null
@@ -1,41 +0,0 @@
-/* { dg-do run } */
-
-#include <openacc.h>
-
-int main ()
-{
-  int ok = 1;
-  int val = 2;
-  int ary[32];
-  int ondev = 0;
-
-  for (int i = 0; i < 32; i++)
-    ary[i] = ~0;
-
-#pragma acc parallel num_gangs (32) copy (ok) firstprivate (val) copy(ary, ondev)
-  {
-    ondev = acc_on_device (acc_device_not_host);
-#pragma acc loop gang(static:1)
-    for (unsigned i = 0; i < 32; i++)
-      {
-        if (val != 2)
-          ok = 0;
-        val += i;
-        ary[i] = val;
-      }
-  }
-
-  if (ondev)
-    {
-      if (!ok)
-        return 1;
-      if (val != 2)
-        return 1;
-
-      for (int i = 0; i < 32; i++)
-        if (ary[i] != 2 + i)
-          return 1;
-    }
-
-  return 0;
-}
diff --git a/libstdc++-v3/ChangeLog b/libstdc++-v3/ChangeLog
index 540041d63dd..960a56ca186 100644
--- a/libstdc++-v3/ChangeLog
+++ b/libstdc++-v3/ChangeLog
@@ -1,5 +1,24 @@
+2015-11-11  Jonathan Wakely  <jwakely@redhat.com>
+
+	PR libstdc++/64651
+	* libsupc++/exception_ptr.h (rethrow_exception): Add using-declaration
+	to __exception_ptr namespace.
+	* testsuite/18_support/exception_ptr/rethrow_exception.cc: Test ADL.
+	Remove unnecessary test variables.
+
 2015-11-10  Jonathan Wakely  <jwakely@redhat.com>
 
+	PR libstdc++/68190
+	* include/bits/stl_multiset.h (multiset::find): Fix return types.
+	* include/bits/stl_set.h (set::find): Likewise.
+	* testsuite/23_containers/map/operations/2.cc: Test find return types.
+	* testsuite/23_containers/multimap/operations/2.cc: Likewise.
+	* testsuite/23_containers/multiset/operations/2.cc: Likewise.
+	* testsuite/23_containers/set/operations/2.cc: Likewise.
+
+	* doc/xml/manual/status_cxx2017.xml: Update.
+	* doc/html/*: Regenerate.
+	* include/bits/functional_hash.h: Fix grammar in comment.
 2015-11-09  François Dumont  <fdumont@gcc.gnu.org>
diff --git a/libstdc++-v3/doc/html/manual/status.html b/libstdc++-v3/doc/html/manual/status.html
index cdbc8b94f2f..91404aace42 100644
--- a/libstdc++-v3/doc/html/manual/status.html
+++ b/libstdc++-v3/doc/html/manual/status.html
@@ -495,11 +495,11 @@ not in any particular release.
 <a class="link" href="http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2014/n4277.html" target="_top">
 	N4277
       </a>
-    </td><td align="left">TriviallyCopyable <code class="code">reference_wrapper</code> </td><td align="left">Y</td><td align="left"> </td></tr><tr bgcolor="#B0B0B0"><td align="left">
+    </td><td align="left">TriviallyCopyable <code class="code">reference_wrapper</code> </td><td align="left">Y</td><td align="left"> </td></tr><tr><td align="left">
 <a class="link" href="http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2014/n4258.pdf" target="_top">
 	N4258
       </a>
-    </td><td align="left">Cleaning-up noexcept in the Library</td><td align="left">Partial</td><td align="left">Changes to basic_string not complete.</td></tr><tr><td align="left">
+    </td><td align="left">Cleaning-up noexcept in the Library</td><td align="left">Y</td><td align="left"> </td></tr><tr><td align="left">
 <a class="link" href="http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2014/n4279.html" target="_top">
 	N4279
       </a>
@@ -507,11 +507,11 @@ not in any particular release.
 <a class="link" href="http://www.open-std.org/JTC1/sc22/WG21/docs/papers/2014/n3911.pdf" target="_top">
 	N3911
       </a>
-    </td><td align="left">Transformation Trait Alias <code class="code">void_t</code></td><td align="left">Y</td><td align="left"> </td></tr><tr bgcolor="#C8B0B0"><td align="left">
+    </td><td align="left">Transformation Trait Alias <code class="code">void_t</code></td><td align="left">Y</td><td align="left"> </td></tr><tr><td align="left">
 <a class="link" href="http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2014/n4169.html" target="_top">
 	N4169
       </a>
-    </td><td align="left">A proposal to add invoke function template</td><td align="left">N</td><td align="left">In progress</td></tr><tr><td align="left">
+    </td><td align="left">A proposal to add invoke function template</td><td align="left">Y</td><td align="left"> </td></tr><tr><td align="left">
 <a class="link" href="http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2014/n4280.pdf" target="_top">
 	N4280
       </a>
diff --git a/libstdc++-v3/doc/xml/manual/status_cxx2017.xml b/libstdc++-v3/doc/xml/manual/status_cxx2017.xml
index fc2ebd2466f..4ea0d1e0293 100644
--- a/libstdc++-v3/doc/xml/manual/status_cxx2017.xml
+++ b/libstdc++-v3/doc/xml/manual/status_cxx2017.xml
@@ -103,15 +103,14 @@ not in any particular release.
     </row>
 
     <row>
-      <?dbhtml bgcolor="#B0B0B0" ?>
       <entry>
 	<link xmlns:xlink="http://www.w3.org/1999/xlink"
	      xlink:href="http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2014/n4258.pdf">
	  N4258
	</link>
       </entry>
       <entry>Cleaning-up noexcept in the Library</entry>
-      <entry>Partial</entry>
-      <entry>Changes to basic_string not complete.</entry>
+      <entry>Y</entry>
+      <entry/>
     </row>
 
     <row>
@@ -137,15 +136,14 @@ not in any particular release.
     </row>
 
     <row>
-      <?dbhtml bgcolor="#C8B0B0" ?>
       <entry>
 	<link xmlns:xlink="http://www.w3.org/1999/xlink"
	      xlink:href="http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2014/n4169.html">
	  N4169
	</link>
       </entry>
       <entry>A proposal to add invoke function template</entry>
-      <entry>N</entry>
-      <entry>In progress</entry>
+      <entry>Y</entry>
+      <entry/>
     </row>
 
     <row>
diff --git a/libstdc++-v3/include/bits/stl_multiset.h b/libstdc++-v3/include/bits/stl_multiset.h
index 5ccc6dd61f7..e6e233772b3 100644
--- a/libstdc++-v3/include/bits/stl_multiset.h
+++ b/libstdc++-v3/include/bits/stl_multiset.h
@@ -680,13 +680,15 @@ _GLIBCXX_BEGIN_NAMESPACE_CONTAINER
 #if __cplusplus > 201103L
       template<typename _Kt>
 	auto
-	find(const _Kt& __x) -> decltype(_M_t._M_find_tr(__x))
-	{ return _M_t._M_find_tr(__x); }
+	find(const _Kt& __x)
+	-> decltype(iterator{_M_t._M_find_tr(__x)})
+	{ return iterator{_M_t._M_find_tr(__x)}; }
 
       template<typename _Kt>
	auto
-	find(const _Kt& __x) const -> decltype(_M_t._M_find_tr(__x))
-	{ return _M_t._M_find_tr(__x); }
+	find(const _Kt& __x) const
+	-> decltype(const_iterator{_M_t._M_find_tr(__x)})
+	{ return const_iterator{_M_t._M_find_tr(__x)}; }
 #endif
       //@}
diff --git a/libstdc++-v3/include/bits/stl_set.h b/libstdc++-v3/include/bits/stl_set.h
index cf74368fa0e..8bea61a3b23 100644
--- a/libstdc++-v3/include/bits/stl_set.h
+++ b/libstdc++-v3/include/bits/stl_set.h
@@ -699,13 +699,15 @@ _GLIBCXX_BEGIN_NAMESPACE_CONTAINER
 #if __cplusplus > 201103L
       template<typename _Kt>
	auto
-	find(const _Kt& __x) -> decltype(_M_t._M_find_tr(__x))
-	{ return _M_t._M_find_tr(__x); }
+	find(const _Kt& __x)
+	-> decltype(iterator{_M_t._M_find_tr(__x)})
+	{ return iterator{_M_t._M_find_tr(__x)}; }
 
       template<typename _Kt>
	auto
-	find(const _Kt& __x) const -> decltype(_M_t._M_find_tr(__x))
-	{ return _M_t._M_find_tr(__x); }
+	find(const _Kt& __x) const
+	-> decltype(const_iterator{_M_t._M_find_tr(__x)})
+	{ return const_iterator{_M_t._M_find_tr(__x)}; }
 #endif
       //@}
diff --git a/libstdc++-v3/libsupc++/exception_ptr.h b/libstdc++-v3/libsupc++/exception_ptr.h
index 8fbad1c86d1..7821c149f0e 100644
--- a/libstdc++-v3/libsupc++/exception_ptr.h
+++ b/libstdc++-v3/libsupc++/exception_ptr.h
@@ -68,6 +68,8 @@ namespace std
 
   namespace __exception_ptr
   {
+    using std::rethrow_exception;
+
     /**
      * @brief An opaque pointer to an arbitrary exception.
      * @ingroup exceptions
diff --git a/libstdc++-v3/testsuite/18_support/exception_ptr/rethrow_exception.cc b/libstdc++-v3/testsuite/18_support/exception_ptr/rethrow_exception.cc
index 31da2ecbe82..7d3989213e3 100644
--- a/libstdc++-v3/testsuite/18_support/exception_ptr/rethrow_exception.cc
+++ b/libstdc++-v3/testsuite/18_support/exception_ptr/rethrow_exception.cc
@@ -30,7 +30,6 @@
 void test01()
 {
-  bool test __attribute__((unused)) = true;
   using namespace std;
 
   try {
@@ -54,7 +53,6 @@ void test02()
 void test03()
 {
-  bool test __attribute__((unused)) = true;
   using namespace std;
 
   exception_ptr ep;
@@ -71,7 +69,6 @@ void test03()
 void test04()
 {
-  bool test __attribute__((unused)) = true;
   using namespace std;
 
   // Weave the exceptions in an attempt to confuse the machinery.
@@ -103,12 +100,23 @@ void test04()
     }
 }
 
+void test05()
+{
+  // libstdc++/64651 std::rethrow_exception not found by ADL
+  // This is not required to work but is a conforming extension.
+  try {
+    rethrow_exception(std::make_exception_ptr(0));
+  } catch(...) {
+  }
+}
+
 int main()
 {
   test01();
   test02();
   test03();
   test04();
+  test05();
   return 0;
 }
diff --git a/libstdc++-v3/testsuite/23_containers/map/operations/2.cc b/libstdc++-v3/testsuite/23_containers/map/operations/2.cc
index 6cc277aedce..ef301ef136c 100644
--- a/libstdc++-v3/testsuite/23_containers/map/operations/2.cc
+++ b/libstdc++-v3/testsuite/23_containers/map/operations/2.cc
@@ -54,6 +54,11 @@ test01()
   VERIFY( cit == cx.end() );
 
   VERIFY( Cmp::count == 0);
+
+  static_assert(std::is_same<decltype(it), test_type::iterator>::value,
+		"find returns iterator");
+  static_assert(std::is_same<decltype(cit), test_type::const_iterator>::value,
+		"const find returns const_iterator");
 }
 
 void
diff --git a/libstdc++-v3/testsuite/23_containers/multimap/operations/2.cc b/libstdc++-v3/testsuite/23_containers/multimap/operations/2.cc
index 67c3bfd60a3..eef6ee4515d 100644
--- a/libstdc++-v3/testsuite/23_containers/multimap/operations/2.cc
+++ b/libstdc++-v3/testsuite/23_containers/multimap/operations/2.cc
@@ -54,6 +54,11 @@ test01()
   VERIFY( cit == cx.end() );
 
   VERIFY( Cmp::count == 0);
+
+  static_assert(std::is_same<decltype(it), test_type::iterator>::value,
+		"find returns iterator");
+  static_assert(std::is_same<decltype(cit), test_type::const_iterator>::value,
+		"const find returns const_iterator");
 }
 
 void
diff --git a/libstdc++-v3/testsuite/23_containers/multiset/operations/2.cc b/libstdc++-v3/testsuite/23_containers/multiset/operations/2.cc
index ff2748f713a..4bea719160f 100644
--- a/libstdc++-v3/testsuite/23_containers/multiset/operations/2.cc
+++ b/libstdc++-v3/testsuite/23_containers/multiset/operations/2.cc
@@ -54,6 +54,11 @@ test01()
   VERIFY( cit == cx.end() );
 
   VERIFY( Cmp::count == 0);
+
+  static_assert(std::is_same<decltype(it), test_type::iterator>::value,
+		"find returns iterator");
+  static_assert(std::is_same<decltype(cit), test_type::const_iterator>::value,
+		"const find returns const_iterator");
 }
 
 void
diff --git a/libstdc++-v3/testsuite/23_containers/set/operations/2.cc b/libstdc++-v3/testsuite/23_containers/set/operations/2.cc
index 84ddd1f1ddc..6a68453ec7b 100644
--- a/libstdc++-v3/testsuite/23_containers/set/operations/2.cc
+++ b/libstdc++-v3/testsuite/23_containers/set/operations/2.cc
@@ -54,6 +54,11 @@ test01()
   VERIFY( cit == cx.end() );
 
   VERIFY( Cmp::count == 0);
+
+  static_assert(std::is_same<decltype(it), test_type::iterator>::value,
+		"find returns iterator");
+  static_assert(std::is_same<decltype(cit), test_type::const_iterator>::value,
+		"const find returns const_iterator");
 }
 
 void